Table 1 The sequencing datasets used in the experiments.

From: Subset selection of high-depth next generation sequencing reads for de novo genome assembly using MapReduce framework

| Dataset | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Species [1] | E. coli | B. cereus | Grouper |
| Genome size | 4.6 Mbp | 5.2 Mbp | ~1.1 Gbp [2] |
| Read length | 2 × 300 bp | 2 × 300 bp | 2 × 200 bp |
| Mean quality score | 34 | 34 | 35 |
| % Bases with quality score > 30 | 83% | 85% | 92% |
| Depth | 2853x | 2669x | ~110-120x [2] |

  1. The full scientific names of these species are Escherichia coli, Bacillus cereus, and Epinephelus lanceolatus.
  2. These values are estimates from ALLPATHS-LG, because a complete reference genome is not yet available.