Table 1 Size of each dataset in terms of assembled gigabases and number of sequences, together with the number of predicted attC sites and ORFs

From: A comprehensive survey of integron-associated genes present in metagenomes

|  | Gigabases of assembled data | Number of sequences | Number of predicted attC sites¹ | Number of predicted ORFs |
|---|---|---|---|---|
| Databases |  |  |  |  |
| CAMERA [73] | 66 | 179,126,552 | 354 (0.005) | 360 |
| MG-RAST [74] | 13 | 7,881,749 | 5,377 (0.4) | 6,471 |
| NTenv (GenBank) [75] | 87 | 86,661,686 | 5,094 (0.06) | 6,467 |
| EBI Metagenomics [76] | 3 | 3,886,782 | 1,283 (0.4) | 1,668 |
| Other Datasets |  |  |  |  |
| Tara Oceans [81] | 61 | 57,540,959 | 2,746 (0.05) | 3,507 |
| Aquatic microbiome [82] | 1 | 4,094,883 | 2 (0.002) | 2 |
| Marine biofilm² | 3 | 2,046,453 | 1,440 (0.5) | 1,909 |
| Human gut [83] | 10 | 6,589,348 | 2 (0.0002) | 2 |
| Human gut from diabetic patients [84] | 2 | 891,652 | 2 (0.001) | 2 |
| Human gut from travelers [85] | 18 | 20,555,914 | 14 (0.0008) | 14 |
| Elephant gut [86] | 1 | 311,295 | 29 (0.03) | 41 |
| Corn and prairie crops soil [87] | 2 | 4,944,181 | 29 (0.02) | 30 |
| Microbial fuel cells [88] | 0.15 | 207,982 | 38 (0.3) | 42 |
| Subarctic microbiomes [89] | 0.04 | 169,650 | 2 (0.05) | 2 |
| Total | 267 | 374,739,436 | 16,376 (11,585³) | 20,517 (13,397⁴) |
  1. In parentheses, copies per million bases.
  2. Prepared by the authors.
  3. Non-redundant hits.
  4. Non-redundant hits. Amino acid sequences.
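
The density given in parentheses in the attC column (copies per million bases) can be approximated from the other columns. The following is a minimal sketch, assuming the density is simply the number of predicted attC sites divided by the assembled size in megabases; the function name `copies_per_million_bases` is hypothetical and not taken from the paper. Because the gigabase values in the table are rounded, the results may differ slightly from the published figures.

```python
def copies_per_million_bases(n_sites: int, gigabases: float) -> float:
    """Return predicted sites per million assembled bases.

    Assumption: density = sites / (assembled size in megabases).
    """
    if gigabases <= 0:
        raise ValueError("gigabases must be positive")
    megabases = gigabases * 1_000  # 1 gigabase = 1,000 megabases
    return n_sites / megabases


# (predicted attC sites, gigabases of assembled data), copied from Table 1
rows = {
    "CAMERA": (354, 66),
    "MG-RAST": (5377, 13),
    "Microbial fuel cells": (38, 0.15),
}

densities = {
    name: copies_per_million_bases(n_sites, gb)
    for name, (n_sites, gb) in rows.items()
}

for name, density in densities.items():
    # Two significant digits are enough to compare with the table
    print(f"{name}: {density:.2g} copies per million bases")
```

Rounded to one significant digit, these values agree with the 0.005, 0.4 and 0.3 reported for the same three rows.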