BMC Genomics

Table 1. The synthetic datasets and the number of simulated sequence variations. The Average Sequence Identity (ASI) is estimated as one minus the total number of mismatches divided by the number of nucleobases

From: GSAlign: an efficient sequence alignment tool for intra-species genomes

Dataset	Genome size	SNV	Small indel	Large indel	ASI
simHG-1X	3,088,279,342	58,421,383	1,001,626	285,757	97.93%
simHG-3X	3,088,292,247	175,100,939	962,721	275,584	93.86%
simHG-5X	3,088,289,999	291,714,646	919,762	263,271	89.90%
NA12878	6,070,700,436	3,088,156	531,315	NA	99.84%
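
As a rough cross-check of the ASI column, here is a minimal sketch that assumes identity is taken as one minus the mismatch fraction and counts only SNVs as mismatches. The table lists indels as event counts rather than as affected bases, so this is an SNV-only upper bound and not the authors' exact calculation. The function name `asi_upper_bound` is hypothetical.

```python
def asi_upper_bound(genome_size: int, snv_count: int) -> float:
    """Return 1 - SNVs / genome size, an SNV-only upper bound on ASI.

    Assumption: indel bases are ignored because the table does not
    report them per base, so the result overestimates the true ASI.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    if not 0 <= snv_count <= genome_size:
        raise ValueError("snv_count must lie between 0 and genome_size")
    return 1.0 - snv_count / genome_size


# (genome size, SNV count) copied from the table rows above
datasets = {
    "simHG-1X": (3_088_279_342, 58_421_383),
    "simHG-3X": (3_088_292_247, 175_100_939),
    "simHG-5X": (3_088_289_999, 291_714_646),
    "NA12878": (6_070_700_436, 3_088_156),
}

for name, (size, snv) in datasets.items():
    print(f"{name}: {asi_upper_bound(size, snv):.2%}")
```

Each bound comes out somewhat higher than the ASI reported in the table, which is consistent with indel bases also being counted as mismatches in the published figures.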

ISSN: 1471-2164
