
Table 2 General features of the metagenome datasets

From: Comparative metagenomics of three Dehalococcoides-containing enrichment cultures: the role of the non-dechlorinating community

| Feature | KB-1 | DonnaII | ANAS |
| --- | --- | --- | --- |
| Type of sequencing | Sanger | 454 | 454 & Sanger |
| Total number of bases pre-assembly | 106,515,530 | 930,446,714 | 330,964,688 |
| Number of contigs | 6,361 | 47,030 | 10,807 |
| Total length of contigs (bp) | 14,988,108 | 24,573,718 | 30,615,713 |
| Number of singletons | 18,629 | 105,608 | 15,486 |
| Total length of singletons (bp) | 13,487,233 | 57,708,799 | 10,450,264 |
| Largest contig (bp) | 155,970 | 121,460 | 921,258 |
| Average contig size (bp) | 2,356 | 522 | 2,832 |
| Average G + C content (%) | 52.33 | 52.28 | 51.91 |
| Protein coding genes | 40,766 | 194,527 | 60,992 |
| - with COGs | 21,857 | 116,001 | 39,920 |
| - connected to KEGG pathways | 8,077 | 36,685 | 11,878 |
| rRNA genes (5S/16S/23S) | 18 (7/5/6) | 185 (11/62/112) | 40 (23/8/9) |
| tRNA genes | 330 | 818 | 525 |
| CRISPR count | 48 | 7 | 57 |
| MG-RAST data | | | |
| % Dhc in culture* | 43.7 | 31.3 | 18.2 |
| Metagenome size (bp)* | 106,508,248 | 916,191,214 | 330,396,345 |
| Average read length* | 958 | 477 | 547 |
| Number of sequences* | 111,162 | 1,920,396 | 603,841 |
| Number (%) identified for metabolic analysis† | 63,352 (57.0) | 363,424 (18.9) | 222,012 (36.8) |
| Number (%) identified for phylogenetic analysis† | 88,888 (80.0) | 540,785 (28.2) | 294,470 (48.8) |

  1. * = post-MG-RAST preprocessing, which removed duplicate reads and nonsense reads from the datasets.
  2. † = maximum e-value of 1 × 10⁻⁵, minimum alignment length ~100.
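
Several rows of the table can be cross-checked against its raw counts. The following is a minimal sketch, not the authors' method: it assumes that average contig size is total contig length divided by the number of contigs, that average read length is metagenome size divided by the number of sequences, and that the percentages are relative to the post-preprocessing number of sequences. These relationships are inferred from the numbers rather than stated in the table, and all names in the code (`DATASETS`, `derived_features`, the dictionary keys) are hypothetical. Under these assumptions the reported averages match truncation rather than rounding (for example, DonnaII works out to about 522.5 bp per contig and is listed as 522).

```python
# Raw counts copied from Table 2. The layout and key names are illustrative only.
DATASETS = {
    "KB-1": {
        "contigs": 6_361, "contig_bp": 14_988_108,
        "metagenome_bp": 106_508_248, "sequences": 111_162,
        "metabolic_hits": 63_352, "phylogenetic_hits": 88_888,
    },
    "DonnaII": {
        "contigs": 47_030, "contig_bp": 24_573_718,
        "metagenome_bp": 916_191_214, "sequences": 1_920_396,
        "metabolic_hits": 363_424, "phylogenetic_hits": 540_785,
    },
    "ANAS": {
        "contigs": 10_807, "contig_bp": 30_615_713,
        "metagenome_bp": 330_396_345, "sequences": 603_841,
        "metabolic_hits": 222_012, "phylogenetic_hits": 294_470,
    },
}


def derived_features(d):
    """Recompute the derived rows from raw counts (assumed relationships)."""
    return {
        # Integer division: the published averages appear to be truncated.
        "avg_contig_bp": d["contig_bp"] // d["contigs"],
        "avg_read_len": d["metagenome_bp"] // d["sequences"],
        # Percentages relative to the number of sequences, one decimal place.
        "pct_metabolic": round(100 * d["metabolic_hits"] / d["sequences"], 1),
        "pct_phylogenetic": round(100 * d["phylogenetic_hits"] / d["sequences"], 1),
    }


DERIVED = {name: derived_features(d) for name, d in DATASETS.items()}

if __name__ == "__main__":
    for name, features in DERIVED.items():
        print(name, features)
```

Comparing the printed values with the corresponding table rows is a quick way to catch transcription errors when the table is copied into another document or spreadsheet.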