Table 1 Number of sequences in simulated and down-sampled datasets

From: Identification of genomic indels and structural variations using split reads

| Sequence type | Coverage | Number of sequences | Number of base pairs |
|---|---|---|---|
| Generated sequences | 20× | 2,477,629 | 994,491,814 |
| Mapped sequences | 20× | 2,476,347 | 993,977,159 |
| Used sequences | 20× | 2,476,088 | 993,873,276 |
| | 15× | 1,857,784 | 745,693,367 |
| | 10× | 1,236,929 | 496,489,303 |
| | 5× | 619,052 | 248,478,809 |
| | 1× | 123,633 | 49,625,857 |
Notes:
1. We use 49,691,432 bp as the size of human chromosome 22 when calculating sequence coverage.
2. For the 1× to 15× target sequence coverages, we sample from the used sequences in the 20×-coverage dataset.
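
The coverage arithmetic behind the notes can be checked with a short Python sketch. It assumes coverage is the total number of base pairs divided by the chromosome 22 size given in the notes. The `downsample` helper is purely illustrative: the table does not say how the sampling was done, so uniform random sampling without replacement is an assumption here, and the function names are hypothetical.

```python
import random

# Size of human chromosome 22 used for coverage, as stated in the table notes.
CHR22_SIZE_BP = 49_691_432

# Base-pair totals of the "used sequences" rows in Table 1, top to bottom.
USED_BASE_PAIRS = [993_873_276, 745_693_367, 496_489_303, 248_478_809, 49_625_857]


def coverage(total_bp, genome_size=CHR22_SIZE_BP):
    """Sequence coverage = total sequenced base pairs / reference size."""
    return total_bp / genome_size


def downsample(reads, target_coverage, genome_size=CHR22_SIZE_BP, seed=0):
    """Pick reads at random until their total length reaches the target.

    ASSUMPTION: uniform random sampling without replacement; the source
    table does not describe the actual sampling procedure.
    """
    rng = random.Random(seed)
    order = list(reads)
    rng.shuffle(order)
    target_bp = target_coverage * genome_size
    picked, total = [], 0
    for read in order:
        if total >= target_bp:
            break
        picked.append(read)
        total += len(read)
    return picked


if __name__ == "__main__":
    for bp in USED_BASE_PAIRS:
        print(f"{bp:>13,} bp -> {coverage(bp):.3f}x")
```

Under this definition, the five used-sequence rows round to 20×, 15×, 10×, 5× and 1× coverage, which matches the 1× to 15× range of down-sampled datasets mentioned in the notes.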