A distinct epigenetic signature at targets of a leukemia protein

Background Human myelogenous leukemias characterized by either the non-random t(8;21)(q22;q22) or the t(16;21)(q24;q22) chromosome translocation differ in both their biological and clinical features. Some of these features could be a consequence of differential epigenetic transcriptional deregulation at AML1 targets imposed by AML1-MTG8 and AML1-MTG16, the fusion proteins deriving from the two translocations. Preliminary findings showing that these fusion proteins lead to transcriptional downregulation of AML1 targets, marked by repressive chromatin changes, would support this hypothesis. Here we show that combining conventional global gene expression arrays with a bioinformatic genomic survey of AML1-consensus sequences is an effective strategy to identify AML1 targets whose transcription is epigenetically downregulated by the leukemia-associated AML1-MTG16 protein.
Results We interrogated mouse gene expression microarrays with probes generated either from 32D cells infected with a retroviral vector carrying AML1-MTG16, which are unable to undergo granulocyte differentiation and proliferation in response to the granulocyte colony stimulating factor (G-CSF), or from 32D cells infected with the cognate empty vector. From the analysis of differential gene expression alone (using as criteria a p-value < 0.01 and an absolute fold change > 3), we were unable to conclude which of the 37 genes downregulated by AML1-MTG16 were direct AML1 targets. However, when we applied a bioinformatic approach to search for AML1-consensus sequences in the 10 Kb around the gene transcription start sites, we narrowed the list to 17 potential direct AML1 targets. By focusing on the most significantly downregulated genes, we found that both the AML1-consensus and the transcription start site chromatin regions were significantly marked by aberrant repressive histone tail changes. Further, the promoter of one of these genes, containing a CpG island, was aberrantly methylated.
Conclusion This study shows that a leukemia-associated fusion protein can impose a distinct epigenetic repressive signature at specific sites in the genome. These findings strengthen the conclusion that leukemia-specific oncoproteins can induce non-random epigenetic changes.


Background
Nuclear hormone receptors and transcription factors can regulate the transcription of their target genes by inducing chromatin changes. Paradigmatic examples are the retinoic acid receptor alpha (RARα) and the transcription factor core binding factor (CBF), which regulate in this way the transcription of target genes involved in hematopoietic processes [1,2]. Unlike RARα, which epigenetically activates its targets by recruiting coactivator protein complexes with histone acetyl transferase (HAT) activity only when bound to retinoic acid, CBF can directly recruit HAT-containing complexes to activate its targets [3][4][5][6]. One of the two CBF subunits, CBFα or AML1, can bind target genes endowed with the AML1-consensus sequence TG(T/C)GGT via its N-terminal DNA-binding domain [7]. AML1, encoding a master hematopoietic transcription factor, is frequently affected by different chromosome translocations in leukemic cells [8]. Moreover, AML1 haploinsufficiency was found to be associated with familial platelet disorder, a condition predisposing to acute myeloid leukemia [9].
Two leukemia-associated chromosome translocations, the t(8;21)(q22;q22) and the t(16;21)(q24;q22), result in the fusion between the N-terminal region of AML1 and the C-terminal regions of two almost identical chromatin corepressors, MTG8 and MTG16, leading to the formation of AML1-MTG8 and AML1-MTG16, respectively [10][11][12][13]. Upon fusion with either MTG8 or MTG16, AML1 is converted from a transcriptional activator into a transcriptional repressor of AML1-targets. Specific MTG domains in the wild type, as well as in the MTG fusion proteins, can interact, directly or via other corepressors such as NCoR and Sin3A, with histone deacetylases (HDACs), thus creating a repressive chromatin state at AML1 target sites (reviewed in [14,15]). Repression at these sites is further enhanced by the formation of oligomers between the fusion proteins and wild-type MTG proteins [16][17][18].
Myeloid cell differentiation systems, such as the 32D mouse myeloid cell line, ectopically expressing either AML1-MTG8 or AML1-MTG16, have been used as models to simulate some of the effects of these fusion proteins in myelogenesis and leukemogenesis. Both fusion proteins, when exogenously expressed in the 32D background, were shown to affect granulocytic differentiation and produce distinct effects on cell proliferation [19][20][21]. In a preliminary study, we found that AML1-MTG16, when exogenously expressed in 32D cells, can induce aberrant myeloid phenotypes in association with repressive modifications at the chromatin of the Colony stimulating factor 1 receptor (Csf1r), an AML1-target gene encoding the macrophage colony stimulating factor receptor [19]. Based on this finding, we hypothesized that the comparative epigenetic analysis of the changes induced by different AML1-MTG fusion proteins in an identical cell context (e.g. the 32D context) might provide a lead to elucidating the differences observed in leukemic cells carrying either one of the two proteins [8]. The objective of this study was to determine whether AML1-MTG16 induces epigenetic changes at AML1-target genes in the 32D myeloid cell genome. Only by coupling global gene expression array analysis with a bioinformatic genomic survey for the AML1-consensus sequence were we able to home in on AML1-targets downregulated by AML1-MTG16. AML1-MTG16-induced transcriptional downregulation was marked by the acquisition of a distinct repressive chromatin signature.

Global gene expression array analysis of AML1-MTG16-expressing cells
To study the molecular and biological consequences of AML1-MTG16 expression in a myeloid differentiation cell model, we previously infected 32D mouse myeloblasts with retroviral particles carrying either the pLNCX2 vector containing the AML1-MTG16 cDNA or the cognate empty vector. This yielded stable independent clones expressing AML1-MTG16 (hereafter called A16 clones) and stable independent control clones (hereafter called "mock" clones), respectively (Figure 1A). Upon treatment with granulocyte colony stimulating factor (G-CSF), A16 clones do not undergo granulocytic differentiation and proliferate significantly less than mock clones (Figure 1B). Global gene expression analysis (setting the p-value at < 0.05 and the absolute fold change at > 1.5) of a prototypic A16 clone and a prototypic mock clone grown with either interleukin 3 (IL-3) or G-CSF for 16 h was combined with bioinformatic analysis, using the Ingenuity software, of the proteins encoded by all the differentially expressed genes (see Methods). This analysis revealed a network comprising proteins critical for platelet function in A16 cells (see Additional file 1). The identification of this protein network supports the biological data, indicating the occurrence of functional AML1 haploinsufficiency in A16 cells [9].
Further analysis of the gene expression data (setting the p-value at < 0.01 and the absolute fold change at > 3) enabled us to identify 138 differentially expressed genes: 66 in cells grown with IL-3, 67 in cells grown with G-CSF, and 5 under both conditions (Figure 1C, left, and Table 1 and Table 2). According to the Ingenuity software, the differentially expressed genes in A16 cells were mostly implicated in tumorigenesis, cell proliferation, and hematopoiesis (Figure 1C, right). Since from this analysis alone we were unable to conclude whether these genes were direct AML1-MTG16 targets, we devised a bioinformatic approach aimed at identifying the AML1-consensus sequence in the 10 Kb region around the transcription start site of these genes.

Identification of genes containing the AML1-consensus sequence by bioinformatic analysis
Since the AML1-MTG proteins have a transcriptionally repressive function (reviewed in [14]), we focused our bioinformatic analysis on the 37 genes downregulated by AML1-MTG16 (see genes in bold in Table 1 and Table 2). Specifically, we searched the 10 Kb around the transcription start site of each gene for either the AML1-binding consensus sequence TG(T/C)GGT or its reverse complement, ACC(G/A)CA. With the MEME software (see Methods) we identified a conserved motif, hereafter called AML1-consensus motif (Figure 2A), encompassing the AML1-consensus sequence in seventeen out of the 37 genes (Figure 2B and Table 3). We focused on five of these genes, Fcer1a, Tcfec, Ptprcap, F2rl3, and Mgmt (Figure 2B, right), because they were among the most significantly downregulated genes. Fcer1a, Tcfec, Ptprcap, F2rl3, and Mgmt encode known proteins. Specifically, Fcer1a encodes the alpha subunit of the high-affinity receptor for the Fc fragment of IgE and is involved in the immune response [22]; Tcfec is a transcription factor that induces, among other genes, the G-CSF receptor gene [23,24]; Ptprcap is a transmembrane protein associated with CD45, a key regulator of lymphocyte activation [25]; F2rl3 is a member of the G protein-coupled protease-activated receptors (PARs) for coagulation factor II (thrombin) and plays an important role in platelet activation [26]; Mgmt is a DNA repair enzyme that is frequently lost in cancer due to epigenetic silencing [27]. Downregulation of these genes was confirmed by real-time RT-PCR (Figure 2C).

Figure 1 Global gene expression analysis of AML1-MTG16-expressing cells. A. The 32D cell model, comprising clones expressing the AML1-MTG16 protein (A16 clones) and control clones ("mock" clones), which do not express the fusion protein. B. A16 clones, unlike mock clones, do not undergo granulocytic differentiation and display impaired proliferation in the presence of G-CSF. C. Most of the genes whose expression is significantly affected in A16 cells were previously implicated in biological processes. Limits: p-value < 0.01; absolute fold change > 3.
In bold are the AML1-MTG16-downregulated genes searched for AML1-consensus motifs.

Table 2: Selection of genes differentially expressed in AML1-MTG16-positive cells versus AML1-MTG16-negative cells grown in the presence of G-CSF for 16 h.

Fcer1a, Tcfec, Ptprcap, F2rl3, and Mgmt are direct AML1-MTG16 targets
Quantitative chromatin immunoprecipitation (ChIP) with an anti-AML1 specific antibody, but not with an anti-MTG16 antibody (data not shown), showed significant (p < 0.05) enrichment of the region encompassing the AML1-consensus motif (see bars in Figure 3A, left) relative to an arbitrary control region without the AML1-consensus motif in the mock clone chromatin for all five genes, indicating endogenous AML1 binding at these regions (Figure 3B). In contrast, ChIP with an anti-MTG16 antibody showed significant enrichment of exogenous AML1-MTG16 in the same chromatin regions in the A16 clones (Figure 3B). The human homologues of these genes also contain one or more AML1-consensus sequences in the 10 Kb region surrounding the transcription start site, pointing to these five genes as novel, bona fide direct AML1-target genes.

[Figure: AML1-MTG16-induced downregulation of putative AML1-targets]
(1) The Ptprcap AML1-consensus motif is located in an intron of a 5' adjacent gene (Coro1b).
Repressive histone modifications are often associated with aberrant hypermethylation at CpG islands present in the 5' regulatory regions of many genes ([32,33] and references within). By using the CpG island searcher [34], software for identifying CpG islands, we could identify a CpG island only in the Mgmt promoter region [35] (Figure 4B). Bisulfite sequencing analysis of this region detected hypermethylation in AML1-MTG16-positive cells (Figure 4B).
The overall epigenetic analysis indicates that downregulation of AML1-targets by AML1-MTG16 can be achieved, even in the absence of DNA methylation, when there is a critical quantitative level of repressive histone changes.
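As an illustration of the kind of in silico test behind a CpG island call, the sketch below computes the GC fraction and the observed/expected CpG ratio of a candidate region. It is not the CpG island searcher itself, which scans sequences with a sliding window. The function names and the default thresholds (length of at least 500 bp, GC fraction of at least 0.55, observed/expected ratio of at least 0.65) are assumptions chosen for this example; the text does not state which criteria were applied.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (number of CpG * length) / (number of C * number of G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """Hypothetical single-region check; thresholds are illustrative assumptions."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

A region that passes such a check would then be a candidate for bisulfite sequencing, as was done for the Mgmt promoter.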

Discussion
In this study we show the effectiveness of integrating global gene expression array analysis with a bioinformatic approach aimed at detecting AML1-consensus sequences for identifying novel putative direct AML1-targets downregulated by AML1-MTG16 in 32D cells. Downregulation of these genes is marked by a distinct repressive chromatin profile.
When we surveyed the 37 most significantly downregulated genes for the presence of the AML1-consensus motif(s) in the 10 Kb region encompassing the transcription start site, we narrowed the list to seventeen putative direct AML1-MTG16 targets. For five of these genes, Fcer1a, Tcfec, Ptprcap, F2rl3 and Mgmt, which were among the most significantly downregulated, we were able to demonstrate, using ChIP analysis, the binding of both AML1 and AML1-MTG16 to the gene regions containing the AML1-motifs. Thus, our two-tier approach, combining gene expression array analysis with a bioinformatic survey for transcription factor-consensus sequences, seems to be a powerful strategy for identifying transcription factor targets that would otherwise be missed when using conventional gene expression array analysis alone.
The chromatin of the five downregulated genes, Fcer1a, Tcfec, Ptprcap, F2rl3, and Mgmt, was marked not only by significant levels of histone H4 hypoacetylation, but also by significant levels of repressive histone H3-K9 trimethylation, suggesting that AML1-MTG16 might induce the recruitment of both histone deacetylases [28] and histone methyltransferases. Apparently, a critical quantity of repressive histone modifications, even in the absence of CpG methylation, might per se be sufficient to "lock in" a transcriptionally downregulated state. In the case of Mgmt, which has a CpG island, it is instead possible that the accumulation of histone repressive changes preceded CpG hypermethylation ([36] and references within).
It is noteworthy that all the genes for which we demonstrated AML1-MTG16-induced epigenetic downregulation encode functions relevant to hematopoiesis and/or leukemogenesis. In particular, downregulation of two of the genes that we identified might be relevant to AML1-MTG16-induced leukemogenesis. One of these genes is Tcfec, whose human counterpart encodes a transcription factor that induces the granulocyte colony stimulating factor receptor G-CSFR [23,24]. Remarkably, Tcfec downregulation in A16 cells is paralleled by a significant downregulation of G-csfr (data not shown), indicating that AML1-MTG16 might have triggered a coordinated cascade of transcriptional downregulation, as we observed in other differentiation model systems [37,38]. The second gene is Mgmt, encoding the DNA repair enzyme O6-methylguanine-DNA-methyltransferase, which is frequently silenced and hypermethylated in leukemia [39]. MGMT epigenetic silencing is thought to lead to random mutations in cancer [40]. A recent study has shown that expression of different acute myeloid leukemia fusion proteins, including AML1-MTG8, leads to downregulation of several DNA repair genes [41]. Thus, the induction of a "mutator phenotype" might be a common consequence of leukemia fusion protein expression.
A few global gene expression studies on cells expressing exogenous AML1-MTG8 have been recently described [42][43][44]. Given the use of different cell systems, it is difficult to compare the differentially expressed genes in AML1-MTG16-positive 32D cells with the differentially expressed genes reported for AML1-MTG8. Nevertheless, we could identify a few gene families (e.g. S100 calcium-binding proteins) that are similarly affected by both AML1-MTG8 and AML1-MTG16 even in different cell contexts. Extending our study to the comparison of the epigenetic signatures imposed by either exogenous AML1-MTG16 or exogenous AML1-MTG8 in the very same cell context (e.g. 32D cells) might enable us to narrow down additional critical epigenetic signatures consequent to the t(8;21) and t(16;21) translocations.

Figure 4 Repressive epigenetic changes at the AML1-targets. A. ChIP with antibodies against either acetylated histone H4 or trimethylated histone H3 Lysine 9 (tri-Met-H3-K9) followed by quantitative PCR with primers amplifying a region encompassing either the transcription start site (+1) or the AML1-consensus detected a different level of repressive histone changes in all five genes in A16 cells. B. In silico analysis identified a CpG island only in the Mgmt promoter. This CpG island is hypermethylated in A16 cells (bottom, right).

Conclusion
In this study, we show that AML1-MTG16, the leukemia fusion protein associated with the non-random chromosome translocation t(16;21)(q24;q22), can impose transcriptional downregulation marked by a distinct epigenetic signature at specific AML1-target sites in the genome. Thus, our findings further support the hypothesis that non-random genetic abnormalities can lead to non-random epigenetic changes in leukemia cells [19,45].

Cell cultures
Stable clones obtained from mouse myeloid 32D cells infected either with pLNCX2-AML1-MTG16 (A16 clones) or the empty vector pLNCX2 (mock clones) were previously described [19]. Two prototypic A16 clones and two prototypic mock clones were used in this study. Cells were maintained in the presence of 10 ng/ml of murine IL-3 (BD Biosciences, San Jose, CA, USA) in RPMI 1640 medium supplemented with 10% fetal calf serum and 1% antibiotics (penicillin/streptomycin); the cell density was adjusted to 2 × 10^5 cells/ml daily. To induce granulocyte differentiation, cells were washed in RPMI medium, and IL-3 was replaced with 10 ng/ml human G-CSF (Amgen, Thousand Oaks, CA, USA). Differentiation was microscopically evaluated on cytospin preparations stained with May-Grünwald-Giemsa.

RNA extraction and microarray hybridization
Total RNA was extracted with the RNeasy mini kit (Qiagen, Hilden, Germany) and treated with DNase (Qiagen). Double-stranded cDNA was generated from 5 µg RNA using the Superscript ds cDNA synthesis kit (Invitrogen, Carlsbad, CA, USA) and T7-oligo(dT) primers. The cDNA was purified with the GeneChip Sample Cleanup Module (Affymetrix, Santa Clara, CA, USA) and used to synthesize biotin-labeled cRNA with the Enzo RNA transcript Labeling Kit (Enzo Life Science, Farmingdale, NY, USA). Purified cRNA was quantified by spectrophotometric methods and the concentration was adjusted in order to exclude the carryover of unlabeled RNA. 11 µg of cRNA were then fragmented in fragmentation buffer (Affymetrix) at 95°C for 35 minutes and hybridized for 16 h at 45°C onto MOE430A microarrays (Affymetrix). After washing and staining, the chips were scanned in a Hewlett-Packard/Affymetrix scanner at 570 nm. For all the samples the 5'/3' ratios of Gapdh were 0.7-0.9. In comparative experiments the scaling factor, noise and presence calls were similar. Gene expression data represent the average of two independent experiments.

Microarray data analysis
The arrays were normalized by geometric mean intensity for each probe set and scaled using log2 transformation for further analysis. Comparison between the A16 and mock clones grown with either IL-3 or G-CSF was done using Spotfire Decision Site. This comparison generated a p-value from a t-test to statistically extract significant changes in mRNA expression levels between the groups. p-values < 0.05 were considered significant. The null hypothesis is that the samples between the groups are derived from the same population i.e. there is no significant differential expression. The t-test looks at the variance within the groups as well as between them. To be considered significantly differentially expressed the variance had to be greater between than within the groups to a level of p < 0.05. Ratios were generated by dividing the average of the unlogged control data by the average of the unlogged AML1-MTG16 data. Ratios were then portrayed as positive or negative fold change between A16 and mock. To confirm statistical significance of these ratios the differentially expressed genes had to satisfy an arbitrary cut-off ratio as well as having a p-value < 0.05 (see Results section). Analysis of the protein networks was performed by using Ingenuity Pathways Analysis (Ingenuity Systems, Redwood City, CA), software able to identify molecular networks based on known functional or physical interactions among the proteins encoded by the differentially expressed genes.
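The ratio and cut-off steps described above can be sketched as follows. This is a minimal illustration, not the Spotfire workflow the study used: the function names are hypothetical, the p-value is taken as an input rather than computed, and the sign convention (a negative value means lower expression in A16 than in mock) is an assumption. The default thresholds follow the stricter criteria quoted in the Results (p < 0.01, absolute fold change > 3).

```python
from statistics import mean


def signed_fold_change(mock_values, a16_values):
    """Signed fold change of A16 versus mock from unlogged intensities."""
    # Ratio as described in the text: mean of control / mean of AML1-MTG16.
    ratio = mean(mock_values) / mean(a16_values)
    # Assumed convention: ratio >= 1 (lower in A16) is reported as a negative
    # fold change; ratio < 1 (higher in A16) as a positive fold change.
    return -ratio if ratio >= 1 else 1 / ratio


def is_differential(fold_change, p_value, min_abs_fc=3.0, max_p=0.01):
    """Apply both cut-offs: absolute fold change and t-test p-value."""
    return abs(fold_change) > min_abs_fc and p_value < max_p


# Example with made-up intensities for one probe set.
fc = signed_fold_change([800, 1000], [200, 250])
selected = is_differential(fc, p_value=0.005)
```

Keeping the fold-change and p-value cut-offs as parameters makes it easy to reproduce both the exploratory setting (p < 0.05, fold change > 1.5) and the stricter one.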

Search of AML1-consensus sequence in differentially expressed genes
The well-annotated genes differentially expressed in the A16 clone versus the mock clone either in the presence of IL-3 or G-CSF (p < 0.01 and absolute fold change > 3) were searched for the AML1-consensus sequence 5'-TG(T/C)GGT-3' in the 10 kb region surrounding the transcription initiation sites (from -5000 bp to +5000 bp) using an in-house Perl script. A 400 bp sequence flanking the potential AML1-binding sites (200 bp on each side) was extracted and analyzed with MEME, a software package for discovering motifs in groups of related DNA sequences [46], and with multiple sequence alignment, to test whether additional conserved motifs in the surrounding regions could be identified and to assess the sequence conservation extending beyond the potential AML1-binding sites.
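The consensus search can be illustrated with a short Python sketch. The in-house Perl script is not reproduced in the text, so this is an independent illustration rather than the authors' code. The names `find_aml1_sites`, `window`, and `tss_index` are hypothetical, and the sketch assumes the 10 kb window has already been retrieved as a plain string with the transcription start site at a known index. It scans one strand for TG(T/C)GGT and for its reverse complement ACC(G/A)CA, and returns each hit with up to 200 bp of flanking sequence on each side (the site itself is included in the returned sequence).

```python
import re

# Forward consensus TG(T/C)GGT or its reverse complement ACC(G/A)CA.
# The lookahead allows overlapping sites to be reported.
AML1_SITE = re.compile(r"(?=(TG[TC]GGT|ACC[GA]CA))")


def find_aml1_sites(window, tss_index, flank=200):
    """Return (position relative to TSS, strand, site plus flanks) per hit."""
    window = window.upper()
    hits = []
    for match in AML1_SITE.finditer(window):
        start = match.start(1)
        site = match.group(1)
        strand = "+" if site.startswith("TG") else "-"
        lo = max(0, start - flank)                        # clip at window start
        hi = min(len(window), start + len(site) + flank)  # clip at window end
        hits.append((start - tss_index, strand, window[lo:hi]))
    return hits


# Toy example: a 42 bp "window" with the TSS at index 10 and 3 bp flanks.
toy = "A" * 10 + "TGTGGT" + "A" * 10 + "ACCGCA" + "A" * 10
toy_hits = find_aml1_sites(toy, tss_index=10, flank=3)
```

The flanking sequences collected this way are the kind of input that would then be passed to a motif discovery tool such as MEME.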

Real-time RT-PCR
Total RNA was obtained using Trizol (Invitrogen), treated with DNase I (Ambion, Austin, TX, USA), reverse-transcribed with the SuperScript™ First-Strand Synthesis System (Invitrogen) and amplified by real-time RT-PCR on an iCycler (Bio-Rad, Hercules, CA, USA) using iQ SYBR Green Supermix (Bio-Rad) and primers specific for γ-actin, F2rl3, Fcer1a, Ptprcap, Tcfec, and Mgmt (Table 4). Transcript levels of the genes of interest were quantitated by the Delta-

Real time PCR primers
γ-Actin