Table 2 General features of the metagenome datasets

From: Comparative metagenomics of three Dehalococcoides-containing enrichment cultures: the role of the non-dechlorinating community

| Feature | KB-1 | DonnaII | ANAS |
| --- | --- | --- | --- |
| Type of sequencing | Sanger | 454 | 454 & Sanger |
| Total number of bases pre-assembly | 106,515,530 | 930,446,714 | 330,964,688 |
| Number of contigs | 6,361 | 47,030 | 10,807 |
| Total length of contigs (bp) | 14,988,108 | 24,573,718 | 30,615,713 |
| Number of singletons | 18,629 | 105,608 | 15,486 |
| Total length of singletons (bp) | 13,487,233 | 57,708,799 | 10,450,264 |
| Largest contig (bp) | 155,970 | 121,460 | 921,258 |
| Average contig size (bp) | 2,356 | 522 | 2,832 |
| Average G + C content (%) | 52.33 | 52.28 | 51.91 |
| Protein coding genes | 40,766 | 194,527 | 60,992 |
| - with COGs | 21,857 | 116,001 | 39,920 |
| - connected to KEGG pathways | 8,077 | 36,685 | 11,878 |
| rRNA genes (5S/16S/23S) | 18 (7/5/6) | 185 (11/62/112) | 40 (23/8/9) |
| tRNA genes | 330 | 818 | 525 |
| CRISPR count | 48 | 7 | 57 |
| MG-RAST data | | | |
| % Dhc in culture* | 43.7 | 31.3 | 18.2 |
| Metagenome size (bp)* | 106,508,248 | 916,191,214 | 330,396,345 |
| Average read length* | 958 | 477 | 547 |
| Number of sequences* | 111,162 | 1,920,396 | 603,841 |
| Number (%) identified for metabolic analysis | 63,352 (57.0) | 363,424 (18.9) | 222,012 (36.8) |
| Number (%) identified for phylogenetic analysis | 88,888 (80.0) | 540,785 (28.2) | 294,470 (48.8) |
  1. * = post-MG-RAST preprocessing, which removed duplicate reads and nonsense reads from the datasets.
  2. † = maximum e-value of 1 × 10⁻⁵, minimum alignment length ~100.
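
Two of the derived rows in Table 2 can be recomputed from the raw counts, which is a quick way to check a transcription of the table. The sketch below is illustrative only: the dictionary layout and the function names `average_contig_size` and `percent_identified` are assumptions made for this example, not part of the original analysis. It also assumes that the average contig size is the total contig length divided by the number of contigs, rounded down, and that each percentage is taken relative to the post-preprocessing "Number of sequences". Both assumptions reproduce the values shown in the table.

```python
# Raw counts copied from Table 2. The keys and structure are hypothetical,
# chosen only for this illustration.
TABLE = {
    "KB-1": {"contigs": 6361, "contig_bp": 14988108, "sequences": 111162,
             "metabolic": 63352, "phylogenetic": 88888},
    "DonnaII": {"contigs": 47030, "contig_bp": 24573718, "sequences": 1920396,
                "metabolic": 363424, "phylogenetic": 540785},
    "ANAS": {"contigs": 10807, "contig_bp": 30615713, "sequences": 603841,
             "metabolic": 222012, "phylogenetic": 294470},
}


def average_contig_size(total_bp, n_contigs):
    """Mean contig length in bp, rounded down (assumed convention)."""
    return total_bp // n_contigs


def percent_identified(n_identified, n_sequences):
    """Share of sequences identified, as a percentage with one decimal."""
    return round(100 * n_identified / n_sequences, 1)


if __name__ == "__main__":
    for name, row in TABLE.items():
        avg = average_contig_size(row["contig_bp"], row["contigs"])
        met = percent_identified(row["metabolic"], row["sequences"])
        phy = percent_identified(row["phylogenetic"], row["sequences"])
        print(f"{name}: average contig {avg} bp, "
              f"metabolic {met}%, phylogenetic {phy}%")
```

Integer division is used here because ordinary rounding would give 523 bp for DonnaII and 2,833 bp for ANAS, whereas the table reports 522 and 2,832.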