Table 2 Assembly results, single-cell E. coli and S. aureus datasets (only contigs of length ≥ 200 bp are used).

From: BayesHammer: Bayesian clustering for error correction in single-cell sequencing

Single-cell E. coli, reference length 4639675 bp, reference GC content 50.79%

| Statistics | BayesHammer | BayesHammer (scaffold) | Coral | Coral (scaffold) | EulerSR | EulerSR (scaffold) | Hammer, expanded | Hammer, no expansion | Hammer, no expansion (scaffold) | Hammer (scaffold) |
|---|---|---|---|---|---|---|---|---|---|---|
| # contigs (≥ 1000 bp) | 191 | 158 | 276 | 224 | 231 | 150 | 195 | 282 | 242 | 173 |
| # contigs | 521 | 462 | 675 | 592 | 578 | 375 | 529 | 655 | 592 | 477 |
| Largest contig | 269177 | 284968 | 179022 | 179022 | 267676 | 267676 | 268464 | 210850 | 210850 | 268464 |
| Total length | 4952297 | 4989404 | 5064570 | 4817757 | 4817757 | 4902434 | 4977294 | 5097148 | 5340871 | 5005022 |
| N50 | 110539 | 113056 | 45672 | 67849 | 74139 | 95704 | 97639 | 65415 | 84893 | 109826 |
| NG50 | 112065 | 118432 | 55073 | 87317 | 77762 | 108976 | 101871 | 68595 | 96600 | 112161 |
| NA50 | 110539 | 113056 | 45672 | 67765 | 74139 | 95704 | 97639 | 65415 | 84841 | 109826 |
| NGA50 | 112064 | 118432 | 55073 | 87317 | 77762 | 108976 | 101871 | 68594 | 96361 | 112161 |
| # misassemblies | 4 | 6 | 9 | 12 | 6 | 8 | 4 | 4 | 7 | 7 |
| # misassembled contigs | 4 | 6 | 9 | 10 | 6 | 8 | 4 | 4 | 7 | 7 |
| Misassembled contigs length | 42496 | 94172 | 62114 | 150232 | 47372 | 149639 | 43304 | 26872 | 147140 | 130706 |
| Genome covered (%) | 96.320 | 96.315 | 96.623 | 96.646 | 95.337 | 95.231 | 96.287 | 96.247 | 96.228 | 96.281 |
| GC (%) | 49.70 | 49.69 | 49.61 | 49.56 | 49.90 | 49.74 | 49.68 | 49.64 | 49.60 | 49.68 |
| # mismatches per 100 kbp | 11.22 | 11.70 | 8.36 | 9.10 | 5.55 | 5.82 | 12.77 | 54.11 | 52.48 | 13.08 |
| # indels per 100 kbp | 1.07 | 8.26 | 9.17 | 12.76 | 0.52 | 47.80 | 0.91 | 1.17 | 7.96 | 8.69 |
| # genes | 4065 + 124 part | 4079 + 110 part | 3998 + 180 part | 4040 + 143 part | 3992 + 140 part | 4020 + 107 part | 4068 + 123 part | 4034 + 152 part | 4048 + 136 part | 4078 + 111 part |

Single-cell S. aureus, reference length 2872769 bp, reference GC content 32.75%

| Statistics | BayesHammer | BayesHammer (scaffold) | Coral | Coral (scaffold) | EulerSR | EulerSR (scaffold) | Hammer, expanded | Hammer, no expansion | Hammer, no expansion (scaffold) | Hammer (scaffold) |
|---|---|---|---|---|---|---|---|---|---|---|
| # contigs (≥ 1000 bp) | 95 | 85 | 132 | 113 | 82 | 70 | 114 | 272 | 258 | 101 |
| Total length (≥ 1000 bp) | 3019597 | 3309342 | 3055585 | 3066662 | 2972925 | 2993100 | 3033912 | 3389846 | 3405223 | 3509555 |
| # contigs | 260 | 241 | 455 | 423 | 166 | 134 | 312 | 721 | 711 | 292 |
| Largest contig | 282558 | 328686 | 208166 | 208166 | 254085 | 535477 | 282558 | 148002 | 166053 | 328679 |
| Total length | 3081173 | 3368034 | 3160497 | 3166169 | 3008746 | 3020256 | 3111423 | 3575679 | 3594468 | 3584266 |
| N50 | 87684 | 145466 | 62429 | 90701 | 101836 | 145466 | 74715 | 30788 | 34943 | 131272 |
| NG50 | 112566 | 194902 | 87636 | 99341 | 108151 | 159555 | 88292 | 39768 | 45889 | 180022 |
| NA50 | 87684 | 145466 | 62429 | 89365 | 100509 | 145466 | 68711 | 30788 | 34552 | 112801 |
| NGA50 | 88246 | 148064 | 74452 | 90101 | 101836 | 145466 | 88289 | 35998 | 42642 | 148023 |
| # misassemblies | 15 | 17 | 11 | 14 | 4 | 5 | 11 | 14 | 18 | 14 |
| # misassembled contigs | 12 | 14 | 9 | 10 | 4 | 5 | 9 | 14 | 16 | 12 |
| Misassembled contigs length | 340603 | 779785 | 478009 | 523596 | 377133 | 918380 | 402997 | 272677 | 324361 | 940356 |
| Genome covered (%) | 99.522 | 99.483 | 99.449 | 99.447 | 99.213 | 99.254 | 99.204 | 98.820 | 98.888 | 99.221 |
| GC (%) | 32.67 | 32.63 | 32.64 | 32.63 | 32.66 | 32.67 | 32.67 | 32.39 | 32.38 | 32.57 |
| # mismatches per 100 kbp | 3.18 | 8.01 | 12.44 | 12.65 | 9.72 | 10.28 | 17.38 | 54.92 | 55.50 | 15.36 |
| # indels per 100 kbp | 2.17 | 2.30 | 15.50 | 15.67 | 3.80 | 4.08 | 3.57 | 2.64 | 2.72 | 3.04 |
| # genes | 2540 + 36 part | 2547 + 30 part | 2532 + 45 part | 2540 + 37 part | 2547 + 30 part | 2550 + 27 part | 2535 + 41 part | 2477 + 91 part | 2485 + 85 part | 2539 + 38 part |
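
The table reports N50 and NG50 without defining them. As a minimal sketch, assuming the standard definitions (N50 is the largest length L such that contigs of length at least L cover half of the total assembly length; NG50 uses the reference genome length instead of the assembly length), the two statistics could be computed as shown here. The function names and the toy contig lengths are hypothetical and are not taken from the paper. NA50 and NGA50 are usually described as the same calculation applied to aligned blocks rather than whole contigs; that alignment step is not covered by this sketch.

```python
def nx(lengths, total, fraction=0.5):
    """Largest length L such that pieces of length >= L sum to at least
    fraction * total. Returns None if the pieces never reach that threshold."""
    threshold = fraction * total
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return None


def n50(lengths):
    # N50: the threshold is half of the assembly's own total length.
    return nx(lengths, sum(lengths))


def ng50(lengths, reference_length):
    # NG50: the threshold is half of the reference genome length.
    return nx(lengths, reference_length)


# Hypothetical toy contig lengths (not data from the table).
contigs = [80, 70, 50, 40, 30, 20]

print(n50(contigs))         # total 290, half 145: 80 + 70 = 150 -> 70
print(ng50(contigs, 400))   # half 200: 80 + 70 + 50 = 200 -> 50
print(ng50(contigs, 1000))  # half 500 exceeds total 290 -> None
```

Under these definitions, NG50 cannot be smaller than N50 when the assembly is longer than the reference, which is consistent with the table: every Total length exceeds the corresponding reference length, and NG50 is never below N50.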