Fig. 5 | BMC Genomics

From: Single genome retrieval of context-dependent variability in mutation rates for human germline

Oligomeric composition of the ideal neighbour-invariant sequence. a–d The oligomeric content of the human genome (x-axis) is compared with the content expected by chance (y-axis) in a sequence that has the same single-base composition as the human genome but substitution rates that are purely context-independent. This corresponds to a hypothetical genome simulation with perfectly correct \(r_{i,j}^{sb}\) single-base rate constants but without any \(\delta r_{i,j}^{sr}\) sequence-context dependency. The k-mer length and the Pearson’s correlation coefficients are shown in the bottom-right corner of each plot; the first value excludes the CpG-containing oligomers (red points), and the value in brackets includes them. The correlation coefficients are notably smaller than those obtained for the in silico sequence equilibrated with the full set of context-dependent Trek \(r_{i,j}^{core}\) constants. The dashed lines mark the diagonals, which correspond to an ideal match of the k-mer contents.
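The comparison described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the expected-by-chance content is approximated here as the product of single-base frequencies (an assumption standing in for the equilibrated context-independent simulation), the input is assumed to contain only A, C, G and T, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"


def observed_fractions(seq, k):
    """Fraction of each k-mer among all overlapping windows of length k."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    kmers = ("".join(p) for p in product(BASES, repeat=k))
    return {kmer: counts[kmer] / total for kmer in kmers}


def expected_fractions(seq, k):
    """Assumed chance model: product of the single-base frequencies."""
    base_freq = {b: seq.count(b) / len(seq) for b in BASES}
    expected = {}
    for p in product(BASES, repeat=k):
        f = 1.0
        for b in p:
            f *= base_freq[b]
        expected["".join(p)] = f
    return expected


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def kmer_correlations(seq, k):
    """Return (r without CpG-containing k-mers, r with all k-mers)."""
    obs = observed_fractions(seq, k)
    exp = expected_fractions(seq, k)
    all_kmers = sorted(obs)
    non_cpg = [m for m in all_kmers if "CG" not in m]
    r_without = pearson([obs[m] for m in non_cpg], [exp[m] for m in non_cpg])
    r_with = pearson([obs[m] for m in all_kmers], [exp[m] for m in all_kmers])
    return r_without, r_with
```

Calling `kmer_correlations(seq, k)` on a sequence string returns the two coefficients in the order used in the plots: first without, then with the CpG-containing oligomers.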

