
Table 1 Number of sequences in simulated and down-sampled datasets

From: Identification of genomic indels and structural variations using split reads

| Sequence type | Coverage | Number of sequences | Number of base pairs |
|---|---|---|---|
| Generated sequences | 20× | 2,477,629 | 994,491,814 |
| Mapped sequences | 20× | 2,476,347 | 993,977,159 |
| Used sequences | 20× | 2,476,088 | 993,873,276 |
| | 15× | 1,857,784 | 745,693,367 |
| | 10× | 1,236,929 | 496,489,303 |
| | 5× | 619,052 | 248,478,809 |
| | 1× | 123,633 | 49,625,857 |

Notes:
  1. We use 49,691,432 bp as the size of human chromosome 22 when calculating sequence coverage.
  2. For target sequence coverages of 1× to 15×, we sample from the used sequences in the 20×-coverage dataset.
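
As a quick check of the coverage figures, the calculation described in the notes can be sketched as total base pairs divided by the chromosome 22 size. The snippet below is an illustration only: the names `CHR22_SIZE_BP`, `TABLE_BP`, and `coverage` are hypothetical, and the base-pair counts are copied from Table 1.

```python
# Size of human chromosome 22 used for coverage, as stated in the table notes.
CHR22_SIZE_BP = 49_691_432

# Base-pair totals copied from Table 1 (hypothetical dictionary name).
TABLE_BP = {
    "Generated, 20x": 994_491_814,
    "Mapped, 20x": 993_977_159,
    "Used, 20x": 993_873_276,
    "Used, 15x": 745_693_367,
    "Used, 10x": 496_489_303,
    "Used, 5x": 248_478_809,
    "Used, 1x": 49_625_857,
}


def coverage(total_bp: int, reference_size_bp: int = CHR22_SIZE_BP) -> float:
    """Return sequence coverage: sequenced bases per reference base."""
    return total_bp / reference_size_bp


if __name__ == "__main__":
    for label, total_bp in TABLE_BP.items():
        print(f"{label}: {coverage(total_bp):.3f}x")
```

With these inputs, every computed value falls within 0.02× of its nominal coverage, which is consistent with the table's labels.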