Table 1. Estimates and exhaustive calculations of the human genome coverage of n-mer space

From: Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples

| n-mer | Mean coverage from stochastic sampling¹ | 95% confidence interval (lower) | 95% confidence interval (upper) | Coverage from exhaustive search |
|---|---|---|---|---|
| 12 | 99.726% | 99.718% | 99.733% | 99.730% |
| 13 | 96.416% | 96.319% | 96.514% | 96.458% |
| 14 | 84.470% | 84.308% | 84.633% | 84.444% |
| 15 | 62.041% | 61.865% | 62.217% | 62.124% |
| 16 | 37.065% | 36.934% | 37.197% | NA² |
| 17 | 16.156% | 16.058% | 16.254% | NA |
| 18 | 5.332% | 5.278% | 5.386% | NA |
| 19 | 1.529% | 1.508% | 1.551% | NA |
| 20 | 0.382% | 0.369% | 0.396% | NA |

  1. 100,000 n-mers were sampled to estimate coverage of the genome; the sampling was repeated 5 times to estimate the 95% confidence interval.
  2. The set of all possible n-mers longer than 15 bp is too large to be searched exhaustively with our current computational resources.
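
The sampling procedure described in footnote 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`kmer_set`, `exact_coverage`, `estimate_coverage`) are hypothetical, the n-mers are drawn uniformly at random from the four bases, only one strand is indexed, and the 95% interval is a Student's t interval over the repeated estimates. The source does not say how its interval was computed, so that choice is an assumption.

```python
import random
import statistics

BASES = "ACGT"

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def kmer_set(sequence, n):
    """Collect every n-mer that occurs in `sequence` (one strand only)."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def exact_coverage(observed, n):
    """Exhaustive coverage: observed n-mers divided by all 4**n possible ones."""
    return len(observed) / 4 ** n


def estimate_coverage(observed, n, sample_size=100_000, repeats=5, seed=0):
    """Estimate coverage by sampling random n-mers and testing membership.

    Returns (mean, lower, upper), where lower/upper bound an assumed
    t-based 95% confidence interval over `repeats` independent estimates.
    """
    df = repeats - 1
    if df not in T_95:
        raise ValueError("repeats must be between 2 and 5 in this sketch")
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        hits = sum(
            "".join(rng.choices(BASES, k=n)) in observed
            for _ in range(sample_size)
        )
        estimates.append(hits / sample_size)
    mean = statistics.mean(estimates)
    # Standard error of the mean across the repeated estimates.
    sem = statistics.stdev(estimates) / repeats ** 0.5
    half_width = T_95[df] * sem
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Toy stand-in for a genome: a short random sequence, not real data.
    toy_rng = random.Random(1)
    toy_genome = "".join(toy_rng.choices(BASES, k=300))
    observed = kmer_set(toy_genome, 4)
    print("exhaustive:", exact_coverage(observed, 4))
    print("sampled:   ", estimate_coverage(observed, 4, sample_size=5000))
```

The exhaustive calculation has to account for all 4^n possible n-mers, which is 4^16 = 4,294,967,296 at n = 16, whereas the sampling estimate needs only a fixed number of membership tests per repeat. That difference is consistent with footnote 2, which reports exhaustive values only up to 15 bp.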