H3K4me1, in contrast to all other active chromatin marks, is positively correlated with DNA methylation within hypomethylated regions at enhancers and promoters
The correlation between specific chromatin marks and DNA methylation has already been studied in promoters and gene coding regions [1, 20], but with insufficient focus on enhancers. Therefore, we compiled a set of 210,048 genomic sites, each 1 kilobase (kb) long, centered over Promoter-TSSs (+/− 500 bp of the TSS) and over the cross-tissue putative enhancers (reported in 19 mouse cell types). We calculated the average DNA methylation of each genomic site in mouse ESCs, and split the list of genomic sites into two groups based on their DNA methylation level: hypermethylated sites (DNA methylation >50%, N = 186,564) and hypomethylated sites (DNA methylation ≤50%, N = 23,484). Hyper- and hypomethylation usually refer to increased or decreased DNA methylation without a specific boundary, and we also use these terms to simplify the presentation of our results. The 50% threshold is not a sharp boundary, and slight changes in its value do not affect our conclusions.
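The grouping step can be illustrated with a minimal sketch. The site names and methylation values below are invented for illustration, and `split_by_methylation` is a hypothetical helper rather than the authors' code; only the 50% threshold and the ≤/> convention are taken from the text.

```python
# Hypothetical input: (site_id, mean DNA methylation in percent) per 1 kb site.
sites = [
    ("prom_1", 3.2),
    ("enh_1", 88.0),
    ("enh_2", 50.0),
    ("enh_3", 50.1),
    ("prom_2", 41.5),
]

THRESHOLD = 50.0  # percent, as stated in the text


def split_by_methylation(sites, threshold=THRESHOLD):
    """Return (hypomethylated, hypermethylated) lists of sites.

    Sites with methylation <= threshold are hypomethylated,
    sites with methylation > threshold are hypermethylated.
    """
    hypo = [s for s in sites if s[1] <= threshold]
    hyper = [s for s in sites if s[1] > threshold]
    return hypo, hyper


hypo, hyper = split_by_methylation(sites)
print(len(hypo), "hypomethylated;", len(hyper), "hypermethylated")
```

Passing a different `threshold` (for example 45 or 55) is one simple way to check how sensitive a downstream result is to the exact boundary.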
Within each DNA methylation group, we analyzed promoters and enhancers separately. While promoters are easy to define because they surround the TSS, enhancers can occur in any genomic region, including repeat-associated regions {Short Interspersed Nuclear Element (SINE), Long Interspersed Nuclear Element (LINE), Simple repeat, Long Terminal Repeat (LTR), DNA Transposon, Low complexity}, Intergenic, Intron, coding regions {Exon, 3’UTR, Transcription Termination Site (TTS)}, Non-coding, CpG island, and Others (merging the classes with fewer than 100 members, see Methods).
For each of the resulting 14 classes (one promoter and 13 enhancer classes), we calculated the correlation of DNA methylation with 9 chromatin marks {H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me3, H3K27ac, H4K20me3, H3K27me3, H3K36me3}, the repressive histone 3 (H3), the gene transcription marker RNA polymerase 2 (Pol2), the enhancer marker histone acetyltransferase P300, and the binding of the insulator marker CTCF in mouse ESCs (Fig. 1, and Table 1, rows 1, 4–7, 12).
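A per-class correlation of this kind could be computed roughly as follows. This is only a sketch: the passage does not state which correlation coefficient underlies Fig. 1, so Pearson's r is assumed here, and the class labels and values in `toy` are invented.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / sqrt(sxx * syy)


def correlation_by_class(records):
    """records: iterable of (genomic_class, methylation, mark_enrichment)."""
    grouped = defaultdict(lambda: ([], []))
    for genomic_class, methylation, enrichment in records:
        grouped[genomic_class][0].append(methylation)
        grouped[genomic_class][1].append(enrichment)
    return {c: pearson(m, e) for c, (m, e) in grouped.items()}


# Invented hypomethylated sites: (class, methylation %, enrichment of one mark).
toy = [
    ("Promoter-TSS", 5, 1.0), ("Promoter-TSS", 20, 2.0), ("Promoter-TSS", 40, 3.0),
    ("Intron", 10, 3.0), ("Intron", 30, 2.0), ("Intron", 45, 1.5),
]
result = correlation_by_class(toy)
for genomic_class, r in result.items():
    print(genomic_class, round(r, 3))
```

Running the same function once on the hypermethylated group and once on the hypomethylated group, for each mark, would give one correlation per class, group and mark.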
The active chromatin marks (H3K4me1, H3K9ac, H3K4me3, H3K27ac) are negatively correlated with DNA methylation in the hypermethylated classes, while some repressive marks, including H3K9me3 and H4K20me3, are positively correlated (Fig. 1a). Among all the genomic regions in our study, this negative correlation within hypermethylated sites is especially strong in Promoter-TSSs. The DNA hypomethylated sites show a similar pattern, particularly for the active chromatin marks H3K9ac, H3K4me3, and H3K27ac (Fig. 1b). H3K4me1, however, exhibits opposite patterns in DNA hyper- and hypomethylated regulatory regions: its correlation with DNA methylation is negative within all hypermethylated classes (Fig. 1a), but positive for DNA hypomethylated classes, especially within Promoter-TSSs, and within putative enhancers in CpG islands, Exons and 3’UTRs (Fig. 1b). The latter result is unexpected: DNA methylation is generally regarded as a repressive epigenetic mark and H3K4me1 is a hallmark of active or poised enhancers [4], so a negative correlation between them would be expected.
Enhancers are usually classified as proximal or distal according to their distance to the TSS. The flanking regions of promoters are usually enriched with H3K4me1, which defines proximal enhancers; however, the study of DNA methylation regulating histone methylation is more relevant in distal regions. Therefore, to focus our analysis on distal enhancers, we excluded from the list of putative enhancers those located inside genes (promoters, exons and introns) or less than 3 kb from the closest TSS. We found a clear anti-correlation between H3K4me1 and DNA methylation over hypermethylated distal enhancers (Additional file 1: Figure S1a), whereas H3K4me1 and DNA methylation are positively correlated over distal enhancers with DNA methylation lower than 50% (Additional file 1: Figure S1b). Notably, the correlation between H3K4me1 and DNA methylation is very high for distal enhancers lying over CpG islands (Additional file 1: Figure S1b). Hence, H3K4me1 is positively correlated with DNA methylation at hypomethylated enhancers in general and at hypomethylated distal enhancers in particular.
Since enhancers are often shorter than 1 kb, H3K4me1 and DNA methylation could localize to the same 1 kb element without necessarily overlapping locally, and this could drive the correlations between DNA methylation and H3K4me1. Therefore, we repeated the above correlation analysis with window sizes of +/− 100 bp (total size 200 bp, Additional file 1: Figure S2) and +/− 200 bp (total size 400 bp, Additional file 1: Figure S3). The smaller window sizes decrease the number of promoters/enhancers, since many of them lack the required number of CpGs in the smaller window to measure the DNA methylation level. Nonetheless, these results confirm the correlation between H3K4me1/H3K4me3 and DNA methylation, independently of the window size.
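The effect of shrinking the window can be sketched as follows. The minimum CpG count `MIN_CPGS` is a placeholder (the actual requirement is specified in Methods, not in this section), and the CpG positions and methylation values are invented.

```python
MIN_CPGS = 3  # hypothetical minimum number of CpGs needed to score a window


def window_methylation(cpgs, center, half_width, min_cpgs=MIN_CPGS):
    """Mean methylation (%) of the CpGs within center +/- half_width.

    cpgs: list of (position, methylation_percent) on a single chromosome.
    Returns None when fewer than min_cpgs CpGs fall inside the window,
    i.e. the site would be dropped from the analysis at this window size.
    """
    inside = [m for pos, m in cpgs
              if center - half_width <= pos <= center + half_width]
    if len(inside) < min_cpgs:
        return None
    return sum(inside) / len(inside)


# Invented CpGs around a site centered at position 1000.
cpgs = [(880, 80.0), (960, 60.0), (1010, 40.0), (1150, 20.0), (1400, 90.0)]
for half_width in (500, 200, 100):
    print(half_width, window_methylation(cpgs, center=1000, half_width=half_width))
```

In this toy case the site is scored at +/− 500 bp and +/− 200 bp but dropped at +/− 100 bp, which mirrors why fewer promoters and enhancers remain at the smaller window sizes.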
H3K4me1, in contrast to all other active chromatin marks, is enriched at intermediate DNA methylation level
To analyze the distinct deposition of H3K4me1 over the DNA methylation landscape, we sorted the list of regulatory regions based on their DNA methylation level, and averaged the enrichments of each chromatin mark (Fig. 2a). We found that repressive chromatin marks such as H3K9me3 and H4K20me3, as well as histone 3 (H3), are significantly overrepresented in hypermethylated regions, while active chromatin marks are enriched at DNA hypomethylated promoters and enhancers (p-value <1e-15). For example, the regulatory regions with DNA methylation >95% are 5-fold more enriched for H3K9me3 and 10-fold less enriched for H3K4me3 than the regions with <5% DNA methylation.
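One simple way to obtain such a profile is to bin the regions by DNA methylation and average the enrichment per bin. The sketch below assumes fixed-width bins of 5 percentage points, which is an illustrative choice and not necessarily the smoothing used for Fig. 2a; the input values are invented.

```python
def mean_enrichment_by_bin(regions, bin_width=5):
    """Average enrichment of one mark per DNA methylation bin.

    regions: iterable of (methylation_percent, enrichment).
    Returns {bin_start: mean enrichment} for bins [start, start + bin_width),
    ordered by increasing methylation.
    """
    sums, counts = {}, {}
    for methylation, enrichment in regions:
        # Regions at exactly 100% go into the last bin.
        start = min(int(methylation // bin_width) * bin_width, 100 - bin_width)
        sums[start] = sums.get(start, 0.0) + enrichment
        counts[start] = counts.get(start, 0) + 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Invented regions: (methylation %, enrichment of one chromatin mark).
toy = [(2, 0.1), (4, 0.3), (30, 0.8), (33, 1.0), (97, 0.2), (100, 0.0)]
profile = mean_enrichment_by_bin(toy)
for start, value in profile.items():
    print(f"{start:3d}-{start + 5:3d}%: {value:.2f}")
```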
H3K4me1 enrichment is clearly distinct from that of all the other active chromatin marks (Fig. 2b). It is highest (0.9) at intermediate DNA methylation levels (25–75%) and diminishes at DNA methylation levels below 25% or above 75%. In contrast, H3K27ac, whose enrichment distinguishes active from primed enhancers, is enriched in the lower part (25–35%) of the same intermediate range and decreases linearly in the higher part (35–75%) (Fig. 2b). Thus, when the DNA methylation of the enhancers decreases, the enhancers switch from a primed to an active state.
We studied the correlation of the signal of the three methylation states of H3K4 {me1, me2, me3} with the DNA methylation level, and found that while H3K4me2 and H3K4me3 signals anticorrelate with DNA methylation level across the whole DNA methylation range, H3K4me1 correlates positively with DNA methylation in the 0–50% range and negatively in the 50–100% range (Fig. 2f-h). We observed that DNA methylation affects RNA expression differently at promoters and enhancers. In promoters, RNA expression was depleted for the middle range of DNA methylation (Fig. 2c), whereas in enhancers RNA expression was less affected at DNA methylation levels above 75%. We searched for non-canonically expressed enhancers, i.e., those that are expressed despite being highly methylated (DNA methylation >75%). Among them we found multiple enzymes, such as three loci of the muscle pyruvate kinase (Pkm), lactate dehydrogenase C (Ldhc), glycogen synthase 2 (Gys2), prolyl 4-hydroxylase subunit β (P4hb), and two loci of the protein phosphatase 4, catalytic subunit (Ppp4c); the epigenetic regulators tet methylcytosine dioxygenase 1 (Tet1) and jade family PHD finger 1 (Jade1); transcriptional regulators such as the transcriptional repressor pro-apoptotic WT1 regulator (Pawr); and the transcriptional and translational initiators basic transcription factor 3 (Btf3) and eukaryotic translation initiation factor 4, gamma 2 (Eif4g2), among others (Fig. 2e).
Next, we validated our finding that, in contrast to the other active chromatin marks (H3K9ac, H3K4me3, H3K4me2, H3K27ac), H3K4me1 is less enriched in both unmethylated and highly methylated regulatory regions, but overrepresented in regions with intermediate levels of DNA methylation (Fig. 2a and b). To do so, we co-localized the DNA methylation level and histone mark signals with the known enhancer coordinates of the pluripotency genes Myc/c-Myc and Sox2 in ESCs [45, 46] (Fig. 2i). In the case of Myc, the three known enhancers co-localize with peaks of high H3K4me1 signal and intermediate DNA methylation level. In the case of Sox2, two enhancers (5 and 6) co-localize with peaks of high H3K4me1 signal and intermediate DNA methylation level, and four enhancers (1, 2, 3 and 4) co-localize with P300 peaks and very low DNA methylation level.
Neither methyl-binding proteins, nor cytosine hydroxymethylation can explain the distinct H3K4me1/3 deposition
To search for possible molecular mechanisms that explain the positive correlation between DNA methylation and H3K4me1 at hypo- to intermediate DNA methylation level at regulatory sites, we examined two conjectures: (i) Proteins with MBDs could be potential mediators of the distinct H3K4me1/3 deposition. (ii) The transition of cytosine methylation towards unmethylation through the cytosine hydroxymethylation transitory state could be associated with the H3K4me1 enrichment at intermediate DNA methylation level.
MBD proteins bind methylated DNA and thereby link DNA methylation to some histone modifications. For example, MBD1 forms a complex with the H3K9 methylase SETDB1, which is suggested to form stable heterochromatin histone marks over methylated DNA [47, 48]. Additionally, MBD3 is enriched at active promoters (with a positive correlation with H3K4me3) and at the enhancers of active genes that are usually H3K4me1 marked [49, 50]. Indirect interactions between MBDs and H3K4 methylation can also be hypothesized: for example, ZIC2, an enhancer-binding factor that co-localizes with H3K4me1 and the other enhancer marks (P300, H3K27ac), has been shown to interact with MBD3/NURD in mouse ESCs [51]. Thus, the MBDs could be effectors of the crosstalk of DNA methylation with H3K4me1 and H3K4me3 observed here.
Therefore, to check the MBD effectors hypothesis, we compared the chromatin immunoprecipitation sequencing (ChIP-seq) profiles of the MBD proteins for which data are available, namely MBD1A/B, MBD2, MBD3, MBD4 and MECP2 (Table 1, row 11), with enrichment sites of H3K4me1 and H3K4me3 in mouse ESCs (Table 1, row 4). In this analysis we included all genomic sites that showed a statistically significant peak of the chromatin marks or of the protein binding, regardless of whether such genomic sites are located within promoter/enhancer regions or not. H3K4me1 peaks occur at intermediate to high DNA methylation levels (median DNA methylation (Med) = 76%), whereas the MBD protein binding loci are very highly DNA methylated (Med > 90%), with the exception of MBD3 (Med = 52%) and MBD2 (Med = 81%). H3K4me3 enrichment occurs at low DNA methylation levels (Med = 24%) (Fig. 3a). These results point to a lack of correlation between H3K4me3 deposition and MBD protein binding over all DNA methylation ranges (low, intermediate and high), and to a less obvious lack of correlation between H3K4me1 deposition and MBD protein binding. To resolve the latter case, we zoomed into the intermediate to high range of DNA methylation (50–100%) to check for a possible correlation between MBD binding and H3K4me1 enrichment. For this purpose, we calculated the fraction of highly methylated peaks (DNA methylation >95%) among all peaks of H3K4me1, H3K4me3 and MBD binding (Fig. 3b). Between 10 and 20% of the MBD binding peaks fall in the >95% DNA methylation range, in contrast to only 2% of the H3K4me1 peaks, which argues against a direct interaction between methyl-binding proteins and H3K4 methylation. We examined this possibility further by computing the number of all possible pairwise overlaps between peaks of two signals (chromatin marks or methyl-binding proteins) (Fig. 3c).
We found that all methyl-binding proteins had more peak overlaps with H3K4me3 than with H3K4me1. The methyl-binding protein with the highest number of overlaps with H3K4me1 is MBD3: 21% of its peaks overlap with H3K4me1 (7592 peaks) and 23% overlap with H3K4me3 (8524 peaks). The other methyl-binding proteins have even fewer overlaps with H3K4me1 peaks (5 to 13%). These results argue against a connection between methyl-binding proteins and H3K4me1 deposition.
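A pairwise overlap fraction of this kind can be sketched as below. The overlap criterion (at least 1 bp shared between half-open intervals) is an assumption, since this section does not define it, and the peak coordinates are invented. The nested loop is quadratic and is only meant to show the logic; genome-scale peak sets would normally be handled with a sorted sweep or an interval tree.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share >= 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_fraction(query_peaks, target_peaks):
    """Fraction of query peaks that overlap at least one target peak."""
    if not query_peaks:
        return 0.0
    hits = sum(any(overlaps(q, t) for t in target_peaks) for q in query_peaks)
    return hits / len(query_peaks)


# Invented peak sets for one methyl-binding protein and one histone mark.
mbd_peaks = [("chr1", 100, 200), ("chr1", 500, 600),
             ("chr2", 100, 200), ("chr2", 900, 1000)]
mark_peaks = [("chr1", 150, 300), ("chr2", 1000, 1100)]

fraction = overlap_fraction(mbd_peaks, mark_peaks)
print(f"{fraction:.0%} of MBD peaks overlap the mark")
```

Note that the fraction is not symmetric: the share of MBD peaks overlapping a mark generally differs from the share of mark peaks overlapping the MBD protein.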
We observed that H3K4me1 is enriched at intermediate DNA methylation levels, which led to the conjecture that such intermediate levels might correspond to bidirectional transitions between high and low DNA methylation. Since DNA cytosine hydroxymethylation (5hmC) is considered an intermediate state in the process of active DNA cytosine demethylation [52], it is conceivable that the intermediate DNA cytosine methylation associated with H3K4me1 enrichment might also correlate with DNA cytosine hydroxymethylation. Therefore, it is worth studying whether DNA cytosine hydroxymethylation correlates with the distinct H3K4me1/3 states while DNA methylation is being reduced through the hydroxymethylated intermediate.
To check the DNA cytosine hydroxymethylation hypothesis, we designed a method to find out which form of DNA cytosine methylation (5mC or 5hmC) has the stronger impact on the level of H3K4 methylation (H3K4me1 and H3K4me3). For this purpose, we compared alterations between a present (+) and an absent (-) state of one form of cytosine methylation (5mC or 5hmC) while the other form remains constant at a background level. Since WGBS (Whole-Genome Bisulfite Sequencing) data cannot discriminate between the 5mC and 5hmC levels of a CpG and report only the sum of both, we inferred 5mC from WGBS and TAB-seq (Tet-assisted bisulfite sequencing) data, see Eq. 1 in the Methods section. We identified two groups of putative enhancers, one for each form of cytosine methylation (5mC or 5hmC). Each group has two subgroups in which one form of cytosine methylation has a similar distribution and serves as background, while the other form differs between two states (present +, or absent -). Thus, the 5hmC alteration group consists of two subgroups {5hmC+, 5hmC-} of enhancers with significantly different 5hmC (present +, or absent -) but equal 5mC distributions, while the 5mC alteration group consists of two subgroups {5mC+, 5mC-} of enhancers with significantly different 5mC (present +, or absent -) but equal 5hmC distributions. Hence we could study the effect of the “altered” (present +, or absent -) form of DNA cytosine methylation independently of the “background” (equal) form of methylation. We calculated the enrichment of H3K4me1 and H3K4me3 for each of the identified groups to study whether the hydroxymethylation of cytosines (5hmC) is the cause of the positive correlation between DNA methylation and H3K4me1 on DNA hypomethylated regulatory sites (Figs. 3d and e, and Table 1, rows 1, 3 and 4).
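The design described above can be sketched under two explicit assumptions. First, Eq. 1 is not reproduced in this section, so the sketch assumes the usual relation that WGBS reports 5mC + 5hmC while TAB-seq reports 5hmC, and takes their difference. Second, the way the authors equalize the background distribution is not described here; the sketch uses a simple per-bin matching as a stand-in. All names, cutoffs and values are hypothetical.

```python
from collections import defaultdict


def infer_5mc(wgbs, tab_seq):
    """Assumed relation: 5mC = WGBS signal (5mC + 5hmC) - TAB-seq signal (5hmC).

    The result is floored at zero to absorb measurement noise.
    """
    return max(wgbs - tab_seq, 0.0)


def matched_subgroups(enhancers, altered, background, cutoff, bin_width=10.0):
    """Split enhancers into (plus, minus) subgroups for the `altered` form.

    Enhancers are binned by their `background` level; within each bin the
    same number of 'present' (>= cutoff) and 'absent' (< cutoff) enhancers
    is kept, so both subgroups have equal counts per background bin.
    """
    bins = defaultdict(lambda: ([], []))
    for enh in enhancers:
        key = int(enh[background] // bin_width)
        bins[key][0 if enh[altered] >= cutoff else 1].append(enh)
    plus, minus = [], []
    for present, absent in bins.values():
        n = min(len(present), len(absent))
        plus.extend(present[:n])
        minus.extend(absent[:n])
    return plus, minus


# Invented enhancers: (id, WGBS %, TAB-seq %).
raw = [("e1", 40.0, 20.0), ("e2", 25.0, 1.0), ("e3", 70.0, 15.0),
       ("e4", 58.0, 0.0), ("e5", 90.0, 2.0)]
enhancers = [{"id": name, "5mC": infer_5mc(w, t), "5hmC": t} for name, w, t in raw]

hmc_plus, hmc_minus = matched_subgroups(
    enhancers, altered="5hmC", background="5mC", cutoff=10.0)
print([e["id"] for e in hmc_plus], [e["id"] for e in hmc_minus])
```

Swapping the `altered` and `background` arguments gives the corresponding {5mC+, 5mC-} subgroups with matched 5hmC.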
We found that an alteration in 5mC levels coincides with a significant change in both H3K4me1 and H3K4me3 enrichment of regulatory sites: the H3K4me1 level increases from the 5mC- to the 5mC+ group of enhancers, whereas the H3K4me3 level decreases from the 5mC- to the 5mC+ group. In contrast, the H3K4me1 and H3K4me3 enrichments of enhancers with similar 5mC but different 5hmC are almost the same. Hence, a role of cytosine hydroxymethylation in H3K4me1/3 regulation is not supported, and the role of cytosine methylation in H3K4me1/3 regulation is reinforced.
DNA methylation regulates H3K4me1 - H3K4me3 seesaw
Since our previous conjectures for explaining the molecular mechanisms ruling the enrichment of H3K4me1 within DNA methylated regulatory sites were rejected, we asked the reverse question: why is H3K4me1 not increased at DNA unmethylated regulatory sites (promoters and putative enhancers), as could be expected for an active mark? We have already observed elevated H3K4me3 and diminished H3K4me1 on DNA unmethylated regulatory sites. In particular, the enrichment of H3K4me3 has the highest fold-change between DNA hypo- and hypermethylated regulatory sites among all active chromatin marks in this study. Hence, we hypothesized the existence of a seesaw between H3K4me1 and H3K4me3 occupancy within regulatory sites, which is controlled by DNA methylation. While both chromatin marks are depleted at DNA hypermethylated regions, this seesaw mechanism operates only at regulatory sites with zero to intermediate levels of DNA methylation.
We checked this hypothesis in mouse pluripotent ESCs (Fig. 4 and Table 1, rows 1 and 4). Regulatory regions with the highest H3K4me3 enrichments were DNA unmethylated and had reduced H3K4me1 (Fig. 4a). In contrast, the regions with elevated H3K4me1 enrichment had higher DNA methylation but less H3K4me3. A similar analysis of cortex and liver cells (Table 1, row 15) confirmed that our finding also holds for differentiated cells (Figs. 4b and c).
The regulation of the H3K4me1 - H3K4me3 seesaw by DNA methylation is mediated through protein CXXC DNA binding domains
The MLL1/2 and SET1A/B protein complexes responsible for deposition of H3K4me3 on nucleosomes [42, 53, 54] share homologous CXXC subunits (CXXC7 in MLL1/2 and CXXC1 in the CFP1 component of the SET1A/B complex). These CXXC subunits are missing in the H3K4me1 depositing histone methyltransferase MLL3/4 complex [31,32,33]. CXXC binding domains are known to bind unmethylated CpG rich genomic regions, particularly CpG islands [39, 55]. To obtain mechanistic insights into the seesaw mechanism proposed here, we studied how the presence or absence of CXXC domains influences the seesaw mechanism through the computational analysis of knockouts (KO) of such domains in pluripotent and differentiated mouse cells.
An expected consequence of the seesaw mechanism would be the elevation of H3K4me1 after the block of H3K4me3 in DNA hypomethylated regions. We counted the number of H3K27me3 and H3K4me1 peaks in wild type (WT) and CXXC7 (MLL1) KO mouse embryonic fibroblasts (MEFs) (Table 1, row 9), and the results confirmed this prediction: the number of H3K4me1 peaks in the MLL1 KO is 31% higher (p-value <1e-15, binomial test) than in the WT cells (Fig. 4d). The number of H3K27me3 peaks differed only slightly (< 4%), which indicated that the analysis was not biased (Fig. 4d).
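A binomial test on two peak counts can be set up as in the sketch below. The exact null model used by the authors is not spelled out in this section; here it is assumed that, under the null, each peak is equally likely to come from the WT or the KO sample. The peak counts are invented (the text reports only the 31% difference), so the resulting p-value does not reproduce the published one.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials under p = 0.5.

    Because the null distribution is symmetric, the two-sided p-value is
    twice the upper-tail probability of the larger of k and n - k.
    """
    k = max(k, n - k)
    tail = sum(comb(n, i) for i in range(k, n + 1))
    # Integer arithmetic keeps the tail exact; only the final ratio is a float.
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical peak counts, chosen so that KO is 31% higher than WT.
wt_peaks, ko_peaks = 1000, 1310
p_value = binomial_two_sided_p(ko_peaks, wt_peaks + ko_peaks)
print(f"KO/WT = {ko_peaks / wt_peaks:.2f}, p = {p_value:.2e}")
```

With real peak counts in the tens of thousands, the same relative difference would give a far smaller p-value, because the test's power grows with the total number of peaks.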
Next, we studied how the influence on the H3K4 methylation exerted by the CXXC1 (CFP1) component of SET1A/B in ESCs is related to DNA methylation. We used the ESC WT and Cfp1 KO H3K4me1/3 ChIP-seq peaks from Clouaire et al. [39] (Table 1, row 8) and we co-localized them with ESC DNA methylation from Stadler et al. [56] (Table 1, row 1). We identified 8409 H3K4me3 peaks specific for WT cells and 13,184 peaks specific for Cfp1 KO, in addition to the 78,847 common peaks between WT and Cfp1 KO cells. The number of distal (i.e. > 5 kb distance from a TSS) H3K4me3 peaks is significantly increased in Cfp1 KO cells (11,352 peaks in Cfp1 KO, and 4663 in WT).
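Counting distal peaks amounts to measuring the distance from each peak to its nearest TSS. The sketch below works on a single chromosome and measures from the peak midpoint, both of which are simplifying assumptions; only the 5 kb cutoff is taken from the text, and the coordinates are invented.

```python
from bisect import bisect_left


def distance_to_nearest_tss(position, sorted_tss):
    """Distance (bp) from a position to the nearest TSS on the same chromosome.

    sorted_tss must be a non-empty, ascending list of TSS coordinates.
    """
    i = bisect_left(sorted_tss, position)
    # Only the TSS just before and just after the position can be nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(position - tss) for tss in candidates)


def count_distal(peak_midpoints, tss_positions, min_distance=5000):
    """Number of peaks lying more than min_distance bp from every TSS."""
    sorted_tss = sorted(tss_positions)
    return sum(distance_to_nearest_tss(p, sorted_tss) > min_distance
               for p in peak_midpoints)


# Invented coordinates on one chromosome.
tss_positions = [50_000, 10_000]
peak_midpoints = [12_000, 30_000, 56_000, 3_000]
print(count_distal(peak_midpoints, tss_positions), "distal peaks")
```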
To study how DNA methylation shapes the effect of losing the unmethylated-CpG-binding CXXC domain (Cfp1 KO) on H3K4me3, we selected the peaks with at least 2% CpG content. We found a significant difference between the DNA methylation of the WT-specific and Cfp1 KO-specific H3K4me3 peaks (median DNA methylation of 13 and 79%, respectively) (Fig. 4e). Since WT peaks are restricted to DNA hypomethylated regions, this finding suggests that the ablation of Cfp1 allows the appearance of H3K4me3 peaks in DNA hypermethylated regions. This suggestion is in agreement with previous studies [39, 55]. We do not exclude, however, the possibility of reduced activity of DNA methyltransferases and global hypomethylation in Cfp1 KO cells, as reported previously [57].
Additionally, we identified 7638 H3K4me1 peaks specific to WT, 8234 specific to Cfp1 KO cells, and 116,373 H3K4me1 peaks in both cell types. Since we hypothesized that there is a seesaw mechanism between H3K4me1 and H3K4me3 within low to intermediate DNA methylation, we focused our analysis on peaks with DNA methylation below 50%. The WT-specific H3K4me1 peaks have significantly higher H3K4me1, but lower H3K4me3 enrichment than the Cfp1 KO-specific peaks, and vice versa (Fig. 4f, g, p-value <1e-15). Notably, H3K4me1 enrichment shows a significant negative correlation (Pearson’s correlation coefficient r = −0.71) with H3K4me3 enrichment (Fig. 4f). That is, within low to intermediate DNA methylation, increased H3K4me1 levels are accompanied by reduced H3K4me3 levels when Cfp1 is knocked out, which further supports the seesaw model: reduced H3K4me3 (due to Cfp1 KO) tilts the seesaw towards H3K4me1.
To illustrate how the co-localization of the H3K4me1 and H3K4me3 signals is influenced by the disruption of CFP1, we studied the genomic region around the master pluripotency transcription factor Pou5f1/Oct4 (Fig. 4h). The unmethylated promoter of Pou5f1 (region I) is depleted of H3K4me1 and enriched for H3K4me3 in WT cells, whereas in Cfp1 KO cells the same locus is enriched for H3K4me1 and depleted of H3K4me3. Similarly, the transcriptional intermediary factor Trim28, the pluripotency-associated Mir290 cluster of microRNAs, and the non-coding RNA gene Gas5 (regions III-IV) show elevated H3K4me1 coinciding with depleted H3K4me3 after Cfp1 KO. These results show how the disruption of CFP1 alters the balance between H3K4me1 and H3K4me3. The promoter shared between Tcf19 (Transcription factor 19) and Cchcr1 (Coiled-coil α-helical rod protein 1), which are transcribed in opposite directions (region II), however, shows nearly identical chromatin patterns in WT and Cfp1 KO cells.
DNA hypomethylation causes H3K4me3 enrichment and aberrant gene expression
We have provided several lines of evidence showing that the seesaw mechanism between H3K4me1 and H3K4me3 is regulated by DNA methylation. However, the biological impact of such regulation still needs to be identified. It is also important to determine whether this regulatory function of DNA methylation is a specific property of pluripotent cells or whether it also exists in differentiated cells. Therefore, we studied MEF cells in the absence (KO) or presence (WT) of Dnmt1, the key maintainer of DNA methylation after cell division (Fig. 5 and Table 1, rows 2, 10 and 14). In addition to 23,859 H3K4me3 peaks common to Dnmt1 WT and KO cells, we found a gain of 8648 (30%) genomic loci that were H3K4me3-enriched specifically in the Dnmt1 KO cells. This is almost twice the number of WT-specific peaks (4515) (Fig. 5a). Furthermore, the number of H3K27me3 peaks had a modest change (3%) between cell types, which confirms that the results were not cell type dependent. Similar to Cfp1 KO cells, there were significantly more distal H3K4me3 peaks specific to Dnmt1 KO cells (N = 5652) than to WT cells (N = 1516) (Additional file 1: Fig. S4).
The DNA methylation pattern is significantly different between the specific peaks for each cell type, WT and Dnmt1 KO (Fig. 5b, Additional file 1: Fig. S1). The WT-specific H3K4me3 peak locations are hypomethylated in both WT and Dnmt1 KO cells (21 and 18% median DNA methylation, respectively). In contrast, Dnmt1 KO-specific H3K4me3 peaks show significant loss of DNA methylation after Dnmt1 KO (28%), while they are hypermethylated in WT (77% median DNA methylation).
The presence of H3K4me3 peaks coincides with a major shift in transcription (Fig. 5c). Among Dnmt1 KO-specific H3K4me3 peaks, transcribed regions that are up-regulated in the same cell type (compared to the WT cells) are 21 times more frequent than down-regulated ones (N = 2931 and 139, respectively; minimum 2-fold change in transcription). Similarly, WT-specific H3K4me3 peaks with more than 2-fold up-regulation in the same cells (compared to Dnmt1 KO) are 7.5 times more frequent than down-regulated regions (N = 1064 and 141, respectively).
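The 2-fold criterion and the reported ratios can be checked with a short sketch. The pseudocount added before taking the ratio is an assumption made here to avoid division by zero and is not taken from the text; the expression values passed to `classify_regulation` are invented, while the four counts are those reported above.

```python
from math import log2


def classify_regulation(expr_a, expr_b, min_fold=2.0, pseudocount=1.0):
    """Classify expression in sample A relative to sample B.

    Returns 'up' or 'down' when the change is at least min_fold in either
    direction (after adding the pseudocount), otherwise 'unchanged'.
    """
    log_fold_change = log2((expr_a + pseudocount) / (expr_b + pseudocount))
    if log_fold_change >= log2(min_fold):
        return "up"
    if log_fold_change <= -log2(min_fold):
        return "down"
    return "unchanged"


print(classify_regulation(7, 1), classify_regulation(1, 7), classify_regulation(3, 3))

# Ratios of up- to down-regulated regions from the counts reported in the text.
ko_specific_ratio = 2931 / 139
wt_specific_ratio = 1064 / 141
print(round(ko_specific_ratio, 1), round(wt_specific_ratio, 1))
```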
We analyzed how the H3K4me3 peaks of each cell type, WT and Dnmt1 KO, split between enhancers and promoters. At the DNA methylation level, the WT-specific H3K4me3 peaks are under-methylated in Dnmt1 KO samples relative to WT samples both in promoters (Fig. 5d) and in enhancers (Fig. 5f). Interestingly, in the case of enhancers, DNA methylation in the Dnmt1 KO samples dips where the DNA methylation level of the WT samples is around 75% (Fig. 5f). The Dnmt1 KO-specific peaks are slightly more under-methylated over promoters (Fig. 5e) than over enhancers (Fig. 5g), with a high dispersion of DNA methylation in Dnmt1 KO samples at the enhancer loci in which the WT samples are highly DNA methylated (Fig. 5g). At the transcription level, the expression behavior in WT-specific H3K4me3 peaks over promoters (Fig. 5h) and over enhancers (Fig. 5j) is very similar. In both cases the transcription in WT samples is up-regulated relative to the transcription in Dnmt1 KO samples. Interestingly, in the case of Dnmt1 KO-specific H3K4me3 peaks there is a strong dichotomy in the transcription behavior of promoters (Fig. 5i) and enhancers (Fig. 5k). In both cases, the expression is very similar in Dnmt1 KO and WT samples for expression levels higher than 4 (in log2 scale). However, for low transcription levels, the transcription is up-regulated in Dnmt1 KO samples relative to WT samples in enhancers (Fig. 5k).
We studied how this genome-wide observation translates into the co-localization of signals at the loci of four cell-type specific genes (Tex19.1, Hspb2, Capn11, En1) and one housekeeping gene, Gapdh (Fig. 5l). Both epigenetic (DNA methylation and H3K4me3) and transcriptional patterns of Gapdh (region I) are similar in WT and Dnmt1 KO cells. In contrast, the pluripotency-associated gene Tex19.1, which is specifically active in ESCs, placenta and germ cells [58], loses DNA methylation (from 100% to 25–75%) in its promoter in Dnmt1 KO cells. This is supported by the fact that the number of CpGs with 75–100% methylation is reduced to almost zero genome-wide in Dnmt1 KO cells [59]. The DNA methylation loss coincides with H3K4me3 enrichment and downstream ectopic expression in MEF cells (region II). The same scenario develops at the Hspb2 gene, which encodes Heat Shock Protein Family B (Small) Member 2 and is normally expressed in muscle and heart (region III). Region IV is an intronic long terminal repeat (LTR) located within Capn11, the gene encoding the spermatogenic-specific Calpain 11. It is silent in WT MEFs but becomes H3K4me3 enriched and transcribed after being hypomethylated in Dnmt1 KO cells, although the Capn11 gene itself is silent in both cell types. An intergenic region upstream of En1, the gene encoding the neural-specific Engrailed Homeobox 1, also undergoes DNA hypomethylation, H3K4me3 enrichment and active transcription in Dnmt1 KO cells (region V).
We compared the genomic location of H3K4me3 peaks specific to WT and Dnmt1 KO cells (Fig. 5m). The Dnmt1 KO-specific H3K4me3 peaks were overrepresented within retroelements including LTR, LINE (long interspersed nuclear elements) and SINE (short interspersed nuclear elements). Exons, promoters and distal CpG-rich regions were elevated for WT peaks. This finding was confirmed by profiling RNA-seq peaks specific to WT and Dnmt1 KO cells. LTR, LINE and SINE elements were significantly overrepresented in KO cells, while WT cells showed transcription enrichment within introns, intergenic regions and LTRs (Fig. 5n).