Fig. 2 | BMC Genomics

Fig. 2

From: Single genome retrieval of context-dependent variability in mutation rates for human germline

Single genome determination of the context-dependent substitution rate constants. a–d The Trek approach is applicable to a genome containing multiple remnants of retrotransposon subfamilies silenced at different time epochs (a). These subfamilies can be treated as substitution counters that were reset at different ages (b). The full consensus sequence of the most recent subfamily is taken as the reference (a). The remnants are then grouped by age and fully mapped onto the reference sequence (b). For each position i in the reference sequence, the fractions of the four bases are calculated in every time group (c). Comparing the fraction of each base type across time periods enables a linear model fit, which yields the rates of substitution from the consensus (b1) state of the given position into the b2, b3 and b4 bases (d). Steps (c) and (d) are repeated for all positions in the reference sequence, producing core substitution rate constants at single-nucleotide resolution, with sequence-context dependency as sampled in the reference sequence of the mobile element. To ensure the high quality and neutrality of the retrieved rates, we considered only the sites in the reference sequence that had at least 700 mapped occurrences in each time group (b), with the same wild-type variant always being the prevalent one (more than 80%) in each subfamily (c), and that produced a Pearson’s correlation coefficient of at least 0.7 in the time-evolution plots (d).
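
Steps (c) and (d), together with the three filters, can be illustrated for a single reference position with the following minimal Python sketch. It is not the authors' implementation: the function name `fit_position`, the input layout (one row of A, C, G, T counts per time group), the age units, the example numbers, and the use of an ordinary least-squares straight line whose slope is read as the rate are all assumptions made for illustration. Only the thresholds (700 occurrences, 80% prevalence, Pearson's r of at least 0.7) come from the caption.

```python
import numpy as np

BASES = "ACGT"

def fit_position(ages, counts, consensus,
                 min_cov=700, min_major=0.8, min_r=0.7):
    """Estimate substitution rates away from the consensus base at one site.

    ages      : age of each time group (units are an assumption)
    counts    : (n_groups, 4) base counts in A, C, G, T order
    consensus : the b1 base of this reference position
    Returns {base: slope} for non-consensus bases passing the r filter,
    or None if the site fails the coverage or prevalence filter.
    """
    ages = np.asarray(ages, dtype=float)
    counts = np.asarray(counts, dtype=float)
    c = BASES.index(consensus)

    # Filter (b): at least `min_cov` mapped occurrences in every time group.
    if (counts.sum(axis=1) < min_cov).any():
        return None

    # Step (c): fractions of the four bases in each time group.
    frac = counts / counts.sum(axis=1, keepdims=True)

    # Filter (c): the consensus base exceeds `min_major` in every group.
    if (frac[:, c] <= min_major).any():
        return None

    rates = {}
    for j, base in enumerate(BASES):
        if j == c:
            continue
        y = frac[:, j]
        if y.max() == y.min():
            continue  # flat series: Pearson's r is undefined
        # Filter (d): require a sufficiently linear time evolution.
        r = np.corrcoef(ages, y)[0, 1]
        if r < min_r:
            continue
        # Step (d): straight-line fit; the slope is taken as the rate.
        slope, _intercept = np.polyfit(ages, y, 1)
        rates[base] = slope
    return rates

# Hypothetical counts for one site with consensus A, four time groups.
example_ages = [10, 20, 30, 40]
example_counts = [
    [965, 10, 5, 20],
    [957, 20, 5, 18],
    [943, 30, 5, 22],
    [936, 40, 5, 19],
]
example_rates = fit_position(example_ages, example_counts, "A")
```

In this made-up example only the A to C series grows steadily with age, so it is the only substitution that survives the correlation filter; the flat G series and the noisy T series are discarded.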
