Table 4 Evaluation of assemblies of the simulated dataset (200 × 150 bp, 1% error) and datasets D2 and D3 with CloudBrush, Contrail, and Velvet

From: A de novo next generation genomic sequence assembler based on string graph and MapReduce cloud computing framework

| Dataset | Assembler | # of contigs¹ | N50 | Largest contig size | Precision | Recall | # of valid contigs¹ | # of invalid contigs¹ | Runtime (sec) |
|---|---|---|---|---|---|---|---|---|---|
| 200 × 150 bp, 1% error | CloudBrush | 229 | 112531 | 327245 | 99.20% | 96.00% | 152 | 77 | 10616 |
| | Contrail | 2540 | 7554 | 36335 | 90.12% | 95.92% | 957 | 1583 | 15823 |
| | Velvet | 209 | 78642 | 327101 | 99.63% | 98.10% | 168 | 41 | 1317 |
| D2 dataset | CloudBrush | 361 | 52961 | 156592 | 98.10% | 98.15% | 230 | 131 | 8622 |
| | Contrail | 300 | 43609 | 124089 | 98.47% | 96.98% | 250 | 50 | 7200 |
| | Velvet | 189 | 71764 | 174184 | 93.60% | 92.20% | 164 | 25 | 927 |
| D3 dataset | CloudBrush | 37064 | 8880 | 114585 | 93.65% | 92.41% | 24603 | 10387 | 48603 |
| | Contrail | 31870 | 8274 | 105244 | 96.99% | 90.89% | 25236 | 6116 | 44619 |
| | Velvet | 23565 | 10847 | 106863 | 95.55% | 89.01% | 20187 | 2838 | 13963 |

¹ Contigs with lengths > 200 bp are counted.
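
For readers unfamiliar with the N50 column, the following is a minimal sketch of the standard N50 definition: the length of the contig at which the running sum of contig lengths, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the `min_length` parameter are illustrative and do not come from the article. Applying the footnote's 200 bp cutoff before computing N50 is an assumption; the table states that cutoff only for the contig counts.

```python
def n50(contig_lengths, min_length=200):
    """Return the N50 of the contigs longer than min_length (0 if none qualify).

    min_length=200 mirrors the table footnote; whether the article applied
    the same cutoff to its N50 values is an assumption.
    """
    # Keep only contigs strictly longer than the cutoff, longest first.
    kept = sorted((n for n in contig_lengths if n > min_length), reverse=True)
    half_total = sum(kept) / 2

    running = 0
    for length in kept:
        running += length
        # The first contig that brings the running sum to half the total is the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, not data from the table.
example = [100, 300, 400, 500, 800]
print(n50(example))
```

In this made-up example the 100 bp contig is dropped by the cutoff, so the total is 2000 bp and the running sum reaches the halfway point at the 500 bp contig.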