
Table 1. The number of genomes in each database after filtering by SNP distance. The distance was calculated by summing the number of unique SNPs between genomes. ^a To provide a smaller database for benchmarking against slower or more memory-intensive tools, d10small was restricted to 200 genomes. These 200 genomes were randomly selected in proportion to the overall distribution of lineages, with a minimum of five genomes per lineage. d10 was chosen as the source set for the small benchmarking set to ensure the broadest possible strain and distance representation.

From: QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data

| Name | Minimum genomic distance (SNPs) | Number of genomes |
| --- | --- | --- |
| d10 | 10 | 4933 |
| d10small^a | 10^a | 200^a |
| d25 | 25 | 3686 |
| d50 | 50 | 2843 |
| d100 | 100 | 2167 |
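
The footnote describes d10small as a random draw of 200 genomes that follows the overall lineage distribution while keeping at least five genomes per lineage. The following is a minimal sketch of one way such a draw could be done; it is not the authors' code. The function name `stratified_sample`, its parameters, and the rule used to trim or top up quotas are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(lineage_of, total=200, min_per_lineage=5, seed=0):
    """Draw roughly `total` genomes in proportion to lineage sizes.

    lineage_of: dict mapping genome ID -> lineage label (hypothetical input).
    Each lineage gets at least `min_per_lineage` genomes, or all of its
    genomes if it has fewer than that.
    """
    rng = random.Random(seed)

    # Group genome IDs by lineage (sorted for reproducibility).
    by_lineage = defaultdict(list)
    for genome, lineage in sorted(lineage_of.items()):
        by_lineage[lineage].append(genome)

    n_all = len(lineage_of)
    # Proportional quota, raised to the minimum, capped at the lineage size.
    quota = {
        lin: min(len(members),
                 max(min_per_lineage, round(total * len(members) / n_all)))
        for lin, members in by_lineage.items()
    }

    target = min(total, n_all)
    # Too many: trim from the largest quotas, never below the minimum.
    # If every quota is already at its floor, the result may exceed `total`.
    while sum(quota.values()) > target:
        candidates = [l for l in quota if quota[l] > min_per_lineage]
        if not candidates:
            break
        largest = max(candidates, key=lambda l: (quota[l], l))
        quota[largest] -= 1
    # Too few: top up the lineages with the most unused genomes.
    while sum(quota.values()) < target:
        candidates = [l for l in quota if quota[l] < len(by_lineage[l])]
        if not candidates:
            break
        roomiest = max(candidates, key=lambda l: (len(by_lineage[l]) - quota[l], l))
        quota[roomiest] += 1

    # Sample without replacement inside each lineage.
    sample = []
    for lin in sorted(by_lineage):
        sample.extend(rng.sample(by_lineage[lin], quota[lin]))
    return sample
```

In this sketch the minimum-per-lineage rule takes priority over strict proportionality, so rare lineages are over-represented relative to their share of the source set. That matches the stated goal of broad strain representation, though the paper's actual procedure may differ.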