Table 1 The sequencing datasets used in the experiments.

From: Subset selection of high-depth next generation sequencing reads for de novo genome assembly using MapReduce framework

| Dataset | 1 | 2 | 3 |
|---|---|---|---|
| Species¹ | E. coli | B. cereus | Grouper |
| Genome size | 4.6 Mbp | 5.2 Mbp | ~1.1 Gbp² |
| Read length | 2 × 300 bp | 2 × 300 bp | 2 × 200 bp |
| Mean quality score | 34 | 34 | 35 |
| % Bases with quality score > 30 | 83% | 85% | 92% |
| Depth | 2853x | 2669x | ~110-120x² |

  1. The full scientific names of these species are Escherichia coli, Bacillus cereus, and Epinephelus lanceolatus.
  2. These values were estimated by ALLPATHS-LG because the complete reference genome is not yet available.