Table 4 Summary of improvements made to the initial genome assemblies by PAGIT

From: Generation of whole genome sequences of new Cryptosporidium hominis and Cryptosporidium parvum isolates directly from stool samples

| Isolate | Initial No. | Initial N50 (kb) | Initial Av. (kb) | Gaps closed by IMAGE | ICORN SNP corrections | ICORN indel corrections | RATT gene annotations transferred | Final No. | Final N50 (kb) | Final Av. (kb) | Final size (kb) | Paired reads aligning concordantly (%) | Contaminant NC (kb) | Contaminant BCT (kb) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| UKH3 | 1902 | 94.7 | 4.9 | 330 | 210 | 27 | 3649 | 34 | 1000 | 268.7 | 9136 | 87.5 | 41.1 | 0.0 |
| UKH4 | 2375 | 29.1 | 4.0 | 343 | 584 | 99 | 3503 | 110 | 1032 | 84.4 | 9293 | 93.5 | 40.1 | 14.1 |
| UKH5 | 1305 | 29.9 | 7.0 | 437 | 189 | 61 | 3569 | 65 | 1006 | 142.4 | 9257 | 96.2 | 6.9 | 2.4 |
| UKP2 | 3084 | 105.2 | 3.1 | 174 | 182 | 48 | 3600 | 67 | 1011 | 136.7 | 9164 | 85.1 | 89.6 | 1.6 |
| UKP3 | 919 | 57.5 | 10.0 | 259 | 226 | 60 | 3546 | 55 | 1009 | 167.7 | 9224 | 90.2 | 2.0 | 0.0 |
| UKP4 | 1966 | 49.3 | 4.8 | 334 | 170 | 108 | 3553 | 52 | 891 | 177.4 | 9222 | 91.0 | 7.3 | 0.7 |
| UKP5 | 3830 | 9.7 | 2.4 | 1435 | 341 | 80 | 3527 | 47 | 1031 | 198.4 | 9329 | 88.6 | 31.4 | 14.6 |
| UKP6 | 16895 | 40.3 | 0.8 | 274 | 3144 | 394 | 3565 | 417 | 1014 | 22.7 | 9471 | 81.2 | 685.7 | 69.5 |
| UKP7 | 1158 | 30.2 | 7.8 | 321 | 185 | 80 | 3556 | 55 | 1026 | 168.7 | 9278 | 90.4 | 2.5 | 0.0 |
| UKP8 | 1792 | 42.0 | 5.2 | 356 | 152 | 47 | 3545 | 63 | 1015 | 147.4 | 9290 | 93.5 | 37.9 | 24.4 |

  1. The assembly statistics (initial and final) include the number of scaffolds (No.), the scaffold N50 metric, and the mean scaffold length (Av.); the final statistics also include the total size of the final assembly. The RATT column gives the number of gene annotations transferred out of a total of 3805 gene annotations in the reference assembly. The “Contaminant sequence” columns give the total length of sequence assembled separately from non-concordantly mapping read pairs and identified as non-Cryptosporidium sequence (NC) or bacterial sequence (BCT).
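
As an illustration of the statistics named in the footnote, the sketch below computes scaffold count, N50, mean scaffold length, and total size from a list of scaffold lengths, and expresses the RATT transfer count as a percentage of the 3805 reference annotations. It assumes the common definition of N50 (the length of the shortest scaffold among the longest scaffolds that together cover at least half of the assembly); the source does not state which tool produced these figures. The function names `assembly_stats` and `fraction_transferred` are hypothetical, and the toy lengths are not data from the study.

```python
def assembly_stats(scaffold_lengths):
    """Return (count, N50, mean length, total size) for a list of scaffold lengths.

    N50 is taken here as the length of the scaffold at which the running sum
    of lengths, sorted from longest to shortest, first reaches half of the
    total assembly size (an assumed, commonly used definition).
    """
    lengths = sorted(scaffold_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    mean = total / len(lengths) if lengths else 0.0
    return len(lengths), n50, mean, total


def fraction_transferred(transferred, reference_total=3805):
    """Percentage of reference gene annotations transferred (3805 per the footnote)."""
    return 100.0 * transferred / reference_total


if __name__ == "__main__":
    # Toy scaffold lengths (arbitrary units), for illustration only.
    count, n50, mean, total = assembly_stats([2, 2, 2, 3, 3, 4, 8, 8])
    print(count, n50, mean, total)
    # UKH3 row: 3649 annotations transferred out of 3805.
    print(round(fraction_transferred(3649), 1))
```

The same arithmetic can be used to sanity-check a row of the table: for UKH3, the final size of 9136 kb divided by 34 scaffolds gives a mean of about 268.7 kb, matching the Av. column, and 3649 of 3805 annotations is about 95.9%.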