Fig. 8 | BMC Genomics

Fig. 8

From: Context dependency of nucleotide probabilities and variants in human DNA

Substitution model. Model substitution probabilities are shown for the model with context-insensitive α (k=0), the models with α depending on 1, 2, and 3 bases to each side (k=1, 2, 3), and the simple model conditioned on the 3 bases to each side. The model substitution probability for a site is the sum of the probabilities of the three possible substitutions. A: The cumulative distribution of model substitution probabilities for all sites (solid lines) and for SNPs (dashed lines) on Chr1, shown for the five models. Note that for all models there are very few sites with a substitution probability above 0.3. B: The fraction of sites on Chr1 with an observed variant in the 1000 Genomes Project (1KGP), plotted against the model substitution probability p. The y values are SNP counts in small probability intervals (width 10⁻⁴) divided by the total site counts in the same intervals. The curves are smoothed with splines. Estimates are noisy for larger probabilities because of low counts. C: As in B, but for SNPs in 1KGP, ClinVar, and COSMIC, and for the k=1 model and the simple model only. For the latter two data sets, counts are scaled so that they sum to the number of SNPs in the 1KGP set for Chr1. For high mutability values there are few SNPs, so the curves are very noisy, especially for ClinVar.
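The quantities plotted in the panels can be illustrated with a minimal sketch. It is not the authors' code: the function names, the array layout (one row per site, one column per possible substitution), and the use of NumPy are assumptions made for illustration, and the spline smoothing mentioned in the caption is left out.

```python
import numpy as np

def site_probability(sub_probs):
    """Per-site substitution probability: sum over the three possible substitutions.

    sub_probs is assumed to have shape (n_sites, 3).
    """
    return np.asarray(sub_probs, dtype=float).sum(axis=1)

def empirical_cdf(p):
    """Sorted probabilities and their cumulative fraction (as in panel A)."""
    p_sorted = np.sort(np.asarray(p, dtype=float))
    cdf = np.arange(1, p_sorted.size + 1) / p_sorted.size
    return p_sorted, cdf

def snp_fraction_by_bin(p, is_snp, bin_width=1e-4):
    """SNP count divided by total site count in each probability interval (panel B).

    Only intervals containing at least one site are returned.
    """
    p = np.asarray(p, dtype=float)
    is_snp = np.asarray(is_snp, dtype=bool)
    n_bins = int(np.ceil(1.0 / bin_width))
    # Bin index of each site; clip so that p == 1.0 falls in the last bin.
    idx = np.minimum((p / bin_width).astype(int), n_bins - 1)
    total = np.bincount(idx, minlength=n_bins)
    snps = np.bincount(idx[is_snp], minlength=n_bins)
    occupied = total > 0
    centers = (np.arange(n_bins)[occupied] + 0.5) * bin_width
    return centers, snps[occupied] / total[occupied]

def rescale_counts(counts, target_total):
    """Scale counts so they sum to target_total (as done for panel C)."""
    counts = np.asarray(counts, dtype=float)
    return counts * (target_total / counts.sum())
```

With real data, `p` would hold one model probability per Chr1 site and `is_snp` would flag the sites that carry a variant in the chosen data set. The binned fractions could then be smoothed, for instance with a spline, before plotting.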
