Table 5 Violations of the 2nd Chargaff rule on HG38. Columns give the ratios #T/#A and #G/#C on each chromosome, together with the corresponding Y and Z values. The Z values reflect the statistical significance of each inequality.

From: Inversion symmetry of DNA k-mer counts: validity and deviations

Chromosome T/A G/C Y(T,A) Y(G,C) Z(T,A) Z(G,C)
chr1 1.002593 1.001175 0.001295 0.000587 15 5.76
chr2 1.00274 1.002747 0.001368 0.001372 16.41 13.49
chr3 1.002416 1.002824 0.001207 0.00141 13.19 12.5
chr4 1.001062 1.002595 0.000531 0.001296 5.75 11.04
chr5 1.004679 1.004144 0.002334 0.002068 24.44 17.5
chr6 1.000537 1.001981 0.000268 0.000989 2.72 8.12
chr7 1.003332 1.001884 0.001663 0.000941 16.15 7.57
chr8 0.999241 1.002536 −0.00038 0.001266 −3.53 9.65
chr9 1.001327 1.002823 0.000663 0.001409 5.61 9.99
chr10 1.0039 1.002911 0.001946 0.001454 17.18 10.82
chr11 1.001915 1.002815 0.000956 0.001405 8.48 10.51
chr12 1.003102 1.003317 0.001548 0.001656 13.75 12.2
chr13 1.003831 1.005012 0.001912 0.002499 14.83 15.36
chr14 1.008943 1.007342 0.004451 0.003658 32.58 22.24
chr15 1.001842 1.00411 0.00092 0.002051 6.44 12.23
chr16 1.009601 1.007001 0.004778 0.003488 32.17 21.07
chr17 1.002905 1.006812 0.00145 0.003395 9.77 20.81
chr18 1.005494 1.016917 0.00274 0.008388 19.03 47.34
chr19 1.009276 1.007636 0.004617 0.003803 25.46 20.13
chr20 1.011147 1.012815 0.005542 0.006367 33.22 33.7
chr21 1.003017 1.005026 0.001506 0.002507 7.33 10.15
chr22 0.998893 1.009337 −0.00055 0.004647 −2.52 19.94
chrX 1.003463 1.005699 0.001728 0.002842 16.73 22.23
chrY 1.008873 1.000209 0.004417 0.000105 17.58 0.34
  1. All Z values are highly significant except Z(G,C) on chrY, which corresponds to a p-value of 0.367; all others have inequality p-values < 0.01. On every chromosome we observe #G > #C on the positive strand. The same holds for #T > #A, except on chr8 and chr22, where #T < #A; this reversal is also a significant observation (|Z| > 2.575 corresponds to an inequality p-value < 0.005).
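
The relation between the ratio, Y, Z and p-value columns can be checked with a short Python sketch. The definitions used here are assumptions inferred from the tabulated numbers, not taken from this excerpt: Y = (n1 − n2)/(n1 + n2), Z = (n1 − n2)/sqrt(n1 + n2), and the inequality p-value read as a one-sided standard-normal tail. The function names are hypothetical.

```python
import math


def y_from_ratio(ratio: float) -> float:
    """Assumed definition: Y = (n1 - n2) / (n1 + n2), rewritten in terms of r = n1 / n2."""
    return (ratio - 1.0) / (ratio + 1.0)


def z_from_counts(n1: int, n2: int) -> float:
    """Assumed definition: Z = (n1 - n2) / sqrt(n1 + n2), a binomial-type z-score."""
    return (n1 - n2) / math.sqrt(n1 + n2)


def one_sided_p(z: float) -> float:
    """Upper-tail probability of a standard normal beyond |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Ratios and Z values copied from the chr1 and chr8 rows of the table.
    for name, ratio, z in [("chr1 T/A", 1.002593, 15.0), ("chr8 T/A", 0.999241, -3.53)]:
        print(f"{name}: Y = {y_from_ratio(ratio):.6f}, one-sided p = {one_sided_p(z):.3g}")
    # Z(G,C) on chrY, the single non-significant entry.
    print(f"chrY G/C: one-sided p = {one_sided_p(0.34):.3f}")
```

Under these assumptions, `one_sided_p(0.34)` comes out at about 0.367, in line with the footnote, and `y_from_ratio` agrees with the tabulated Y values to the printed precision. `z_from_counts` needs the raw nucleotide counts, which the table does not list.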