Table 1 Size of each dataset in terms of assembled gigabases and number of sequences, together with the number of predicted attC sites and ORFs

From: A comprehensive survey of integron-associated genes present in metagenomes

 

| Dataset | Gigabases of assembled data | Number of sequences | Number of predicted attC sites¹ | Number of predicted ORFs |
|---|---:|---:|---:|---:|
| **Databases** | | | | |
| CAMERA [73] | 66 | 179,126,552 | 354 (0.005) | 360 |
| MG-RAST [74] | 13 | 7,881,749 | 5,377 (0.4) | 6,471 |
| NTenv (GenBank) [75] | 87 | 86,661,686 | 5,094 (0.06) | 6,467 |
| EBI Metagenomics [76] | 3 | 3,886,782 | 1,283 (0.4) | 1,668 |
| **Other datasets** | | | | |
| Tara Oceans [81] | 61 | 57,540,959 | 2,746 (0.05) | 3,507 |
| Aquatic microbiome [82] | 1 | 4,094,883 | 2 (0.002) | 2 |
| Marine biofilm² | 3 | 2,046,453 | 1,440 (0.5) | 1,909 |
| Human gut [83] | 10 | 6,589,348 | 2 (0.0002) | 2 |
| Human gut from diabetic patients [84] | 2 | 891,652 | 2 (0.001) | 2 |
| Human gut from travelers [85] | 18 | 20,555,914 | 14 (0.0008) | 14 |
| Elephant gut [86] | 1 | 311,295 | 29 (0.03) | 41 |
| Corn and prairie crops soil [87] | 2 | 4,944,181 | 29 (0.02) | 30 |
| Microbial fuel cells [88] | 0.15 | 207,982 | 38 (0.3) | 42 |
| Subarctic microbiomes [89] | 0.04 | 169,650 | 2 (0.05) | 2 |
| **Total** | 267 | 374,739,436 | 16,376 (11,585³) | 20,517 (13,397⁴) |

  1. In parentheses, copies per million bases.
  2. Prepared by the authors.
  3. Non-redundant hits.
  4. Non-redundant hits. Amino acid sequences.
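
The parenthesized values in the attC column (footnote 1) can be approximately reproduced from the other columns. The sketch below assumes the density is the number of predicted attC sites divided by the assembled size in megabases. The function name `copies_per_million_bases` is hypothetical, and because the table reports gigabases as rounded figures, the results only match to about one significant figure.

```python
def copies_per_million_bases(n_sites: int, gigabases: float) -> float:
    """Predicted attC sites per million assembled bases (assumed definition)."""
    megabases = gigabases * 1000  # 1 gigabase = 1,000 megabases
    return n_sites / megabases


# Inputs taken from the CAMERA and MG-RAST rows of Table 1.
camera = copies_per_million_bases(354, 66)
mg_rast = copies_per_million_bases(5377, 13)

# Round to one significant figure, as the table appears to do.
print(f"CAMERA: {camera:.1g}, MG-RAST: {mg_rast:.1g}")
# prints "CAMERA: 0.005, MG-RAST: 0.4"
```

The same calculation applied to the other rows gives values consistent with the table where the rounded gigabase figure is precise enough.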