Identification of genomic indels and structural variations using split reads

BMC Genomics

Table 1 Number of sequences in simulated and down-sampled datasets

Sequence type	Coverage	Number of sequences	Number of base pairs
Generated sequences	20×	2,477,629	994,491,814
Mapped sequences	20×	2,476,347	993,977,159
Used sequences	20×	2,476,088	993,873,276
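Used sequences (down-sampled)	15×	1,857,784	745,693,367
Used sequences (down-sampled)	10×	1,236,929	496,489,303
Used sequences (down-sampled)	5×	619,052	248,478,809
Used sequences (down-sampled)	1×	123,633	49,625,857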

Notes:
1. We use 49,691,432 bp as the size of human chromosome 22 for calculating the sequence coverage.
2. For target sequence coverages of 1× to 15×, we sample from the used sequences in the 20×-coverage dataset.
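
The coverage calculation described in note 1 can be sketched as follows. This is a minimal illustration, assuming coverage is simply the total number of base pairs divided by the stated chromosome 22 size; the function and variable names are hypothetical and are not taken from the paper.

```python
# Chromosome 22 size used for the coverage calculation (from note 1).
CHR22_SIZE_BP = 49_691_432

# (sequence type, target coverage, number of base pairs), copied from Table 1.
TABLE_ROWS = [
    ("Generated sequences", 20, 994_491_814),
    ("Mapped sequences", 20, 993_977_159),
    ("Used sequences", 20, 993_873_276),
    ("Used sequences", 15, 745_693_367),
    ("Used sequences", 10, 496_489_303),
    ("Used sequences", 5, 248_478_809),
    ("Used sequences", 1, 49_625_857),
]


def coverage(total_bp, reference_bp=CHR22_SIZE_BP):
    """Return sequence coverage as total base pairs over reference size.

    Assumption: this simple ratio is the definition of coverage intended
    by note 1; the paper does not spell out the formula.
    """
    return total_bp / reference_bp


if __name__ == "__main__":
    for label, target, total_bp in TABLE_ROWS:
        realized = coverage(total_bp)
        print(f"{label:<20} target {target:>2}x  realized {realized:.3f}x")
```

Under this assumption, every row of Table 1 yields a realized coverage within about 0.02× of its nominal target.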

ISSN: 1471-2164