Table 1 Contig assembly of deeply sequenced bacterial genomes

From: BASE: a practical de novo assembler for large genomes using long NGS reads

| Dataset | Tool | Parameters | Corrected N50 | Mismatch/Indel | Aligned rate | Coverage | Time (sec) |
|---|---|---|---|---|---|---|---|
| S. aureus MW2 (240X, 100 bp HiSeq) | SPAdes | 51,63,85 | 299,305 | 134/6 | 99.79 % | 100.00 % | 1239 |
| | SOAPdenovo2 | 87-95 | 82,495 | 0/0 | 99.84 % | 99.27 % | 25;16 |
| | SGA | 29;91 | 74,584 | 7/0 | 99.81 % | 99.98 % | 1228;1149 |
| | BASE | 4 | 92,706 | 0/0 | 100.00 % | 99.97 % | 161;93 |
| V. para (240X, 250 bp MiSeq) | SPAdes | 33,55,65,75,85,99 | 169,978 | 118/45 | 99.97 % | 99.97 % | 4616 |
| | SOAPdenovo2 | 125 | 88,858 | 23/30 | 99.98 % | 99.98 % | 110;1 |
| | SGA | 29;149 | 95,711 | 58/26 | 99.80 % | 99.97 % | 2478;2884 |
| | BASE | 4 | 159,715 | 29/29 | 100.00 % | 99.75 % | 676;388 |
  1. S. aureus MW2 was evaluated against its own reference genome (2.8 Mb), and V. para against its species' reference (5.1 Mb, two chromosomes). Both bacteria were sequenced to 240X. The GAGE validation pipeline was used to calculate the corrected contig N50, base errors, structural errors, contig aligned rate and reference coverage. All assemblies were performed with 24 threads, except that BASE used a single thread for the contig assembly part. In the time column, the value before the semicolon is for index building and the value after the semicolon is for assembly. For SGA, the indexing time includes the indexing done after error correction and filtering, and the assembly time includes the overlap and assembly steps.
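
The time column packs two measurements into one cell, so a short sketch of how such cells could be split may help when reusing the numbers. The helper names `parse_time` and `total_seconds` are hypothetical. Treating a cell without a semicolon (the SPAdes rows) as a single overall time is an assumption, since the footnote does not say how those values break down.

```python
def parse_time(cell):
    """Split a time cell into (index_seconds, assembly_seconds).

    Assumption: a cell without a semicolon reports one overall time,
    so the index part is returned as None.
    """
    parts = [p.strip() for p in cell.split(";")]
    if len(parts) == 1:
        return None, int(parts[0])
    index_s, assembly_s = parts
    return int(index_s), int(assembly_s)


def total_seconds(cell):
    """Sum the index-building and assembly times of one cell."""
    index_s, assembly_s = parse_time(cell)
    return (index_s or 0) + assembly_s


# Time cells for the S. aureus MW2 rows, copied from the table.
s_aureus_times = {
    "SPAdes": "1239",
    "SOAPdenovo2": "25;16",
    "SGA": "1228;1149",
    "BASE": "161; 93",
}

for tool, cell in s_aureus_times.items():
    print(tool, parse_time(cell), total_seconds(cell))
```

The totals are wall-clock sums only. Because BASE ran its contig assembly on a single thread while the other runs used 24 threads, they are not a like-for-like comparison of compute cost.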