The rules of gene expression in plants: Organ identity and gene body methylation are key factors for regulation of gene expression in Arabidopsis thaliana
© Aceituno et al; licensee BioMed Central Ltd. 2008
Received: 05 May 2008
Accepted: 23 September 2008
Published: 23 September 2008
Microarray technology is a widely used approach for monitoring genome-wide gene expression. For Arabidopsis, there are over 1,800 microarray hybridizations representing many different experimental conditions on Affymetrix™ ATH1 gene chips alone. This huge amount of data offers a unique opportunity to infer the principles that govern the regulation of gene expression in plants.
We used bioinformatics methods to analyze publicly available data obtained using the ATH1 chip from Affymetrix. A total of 1887 ATH1 hybridizations were normalized and filtered to eliminate low-quality hybridizations. We classified and compared control and treatment hybridizations and determined differential gene expression. The largest differences in gene expression were observed when comparing samples obtained from different organs. On average, ten-fold more genes were differentially expressed between organs as compared to any other experimental variable. We defined "gene responsiveness" as the number of comparisons in which a gene changed its expression significantly. We defined genes with the highest and lowest responsiveness levels as hypervariable and housekeeping genes, respectively. Remarkably, housekeeping genes were best distinguished from hypervariable genes by differences in methylation status in their transcribed regions. Moreover, methylation in the transcribed region was inversely correlated (R² = 0.8) with gene responsiveness on a genome-wide scale. We provide an example of this negative relationship using genes encoding TCA cycle enzymes, by contrasting their regulatory responsiveness to nitrate and methylation status in their transcribed regions.
Our results indicate that the Arabidopsis transcriptome is largely established during development and is comparatively stable when faced with external perturbations. We suggest a novel functional role for DNA methylation in the transcribed region as a key determinant capable of restraining the capacity of a gene to respond to internal/external cues. Our findings suggest a prominent role for epigenetic mechanisms in the regulation of gene expression in plants.
Understanding the regulation of gene expression is essential to understanding the form and function of living systems. Microarray technology has been widely used in many organisms to understand genome-wide changes in gene expression in response to treatments, in different organs, in different cell types and along developmental time series. As a result, a large amount of microarray data representing many different biological conditions has accumulated over recent years. These data have been used successfully to hypothesize on gene function on a global scale in different organisms, such as yeast and C. elegans [5–7], and to suggest shared regulatory mechanisms. Promoters of genes with strongly correlated expression patterns in multiple experiments are likely to be bound by a common transcription factor, and conserved regulatory motifs have been identified based solely on expression data. From a systems view, however, we believe that these data have been underutilized as a resource to understand the basic rules of gene expression.
To learn the general rules that govern gene expression in plants, we took advantage of the large collection of Arabidopsis microarray data available in the NASCArrays database. Using these data, we defined the internal and external cues that regulate the expression of all of the Arabidopsis genes that are represented in the Affymetrix ATH1 gene chips. We quantified the effect of the different experimental conditions on gene expression, which revealed tissue type to be the most influential variable. We also analyzed different structural features and correlated them with the capacity of the genes to respond to the different stimuli. We found evidence for a mechanistic relationship between DNA methylation in the body of the gene (i.e., the transcribed region) and the regulation of gene expression, thus assigning a novel and important role for the methylation of the body of the gene in eukaryotic genomes.
Results and discussion
The Arabidopsis transcriptome is robust to most perturbations but strongly influenced by organ type
Given its impact on global gene expression levels, we next wished to evaluate the importance of organ type in the context of typical experimental factors that are tested in the laboratory. We compared the number of genes responding in shoots or roots for each of the nine treatments in the AtGenExpress abiotic stress series. On average, only 13% of the total genes that responded to a treatment responded in both organs. By contrast, a much higher proportion of genes (88%) were regulated by the treatment in an organ-specific manner (Additional File 11). These data indicate that plant responses to external stimuli are strongly organ-dependent and underscore the need for a more thorough survey of organ-specific and, by extension, cell-specific responses in Arabidopsis and other plants.
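The shared versus organ-specific proportions described above can be illustrated with a small Python sketch. It assumes that "total genes that responded" means the union of the shoot and root response lists for one treatment; the function name and the toy gene identifiers are hypothetical and are not taken from the study.

```python
def organ_overlap(shoot_genes, root_genes):
    """Return the fraction of responding genes shared by both organs
    and the fraction responding in only one organ.

    Assumption: the denominator is the union of both response lists.
    """
    shoot, root = set(shoot_genes), set(root_genes)
    total = shoot | root
    if not total:
        raise ValueError("no responding genes in either organ")
    shared = shoot & root
    return {
        "shared": len(shared) / len(total),
        "organ_specific": len(total - shared) / len(total),
    }


# Toy example with hypothetical gene identifiers.
fractions = organ_overlap({"g1", "g2", "g3", "g4"}, {"g4", "g5"})
print(fractions)
```

Averaging such fractions over all treatments would give summary values of the kind reported in the text.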
Housekeeping and hypervariable genes possess marked structural differences
To identify properties that explain the capacity of a gene to respond to stimuli, we ranked genes based on the number of comparisons in which they are differentially expressed. As shown in Figure 2C, the Arabidopsis genome contains genes that are regulated in a wide range of comparisons, with an average of 14 comparisons, or 3% of the total comparisons in our dataset. The underlying data is provided in Additional File 12. We expect structural differences to be maximized at the extremes of this distribution. We defined housekeeping genes based on three criteria: (1) genes that were not differentially expressed in any of the 474 comparisons, (2) genes with signal intensities higher than the median intensity across the entire dataset and (3) genes with the lowest signal variability (measured with the interquartile range, see Materials and Methods) across the entire dataset. In contrast, we defined hypervariable genes based on the following three criteria: (1) genes that were within the top 1% of the gene responsiveness distribution, (2) genes with the largest signal variability, and (3) genes that show differential expression by stimuli from six of the eight categories described in Figure 1A. These criteria defined 384 housekeeping genes and 123 hypervariable genes (Additional files 13 and 14).
Table 1. Contrasting features of housekeeping and hypervariable genes.
Where three values are listed, the order is: housekeeping genes; hypervariable genes; genome.
CDS length (bp) a: 2624 (s.e. = 89); 1178 (s.e. = 73); 1931 (s.e. = 8)
Gene length (bp) a: 3117 (s.e. = 87); 1493 (s.e. = 78); 2229 (s.e. = 8)
Total exon length (bp) a: 1941 (s.e. = 52); 1169 (s.e. = 50); 1568 (s.e. = 6)
Total intron length (bp) a: 1173 (s.e. = 52); 323 (s.e. = 44); 660 (s.e. = 4)
Number of exons a: 8 (s.e. = 0.31); 3 (s.e. = 0.24); 5 (s.e. = 0.03)
Genes without introns: 6% (p = 5E-16); 33% (p = 0.0007)
Average number of transcription factor binding sites b: 27 ± 1.2 (p < 0.01); 46 ± 1.8 (p < 0.0001); 30 ± 0.1
TATA-containing genes c: 5% (p = 1.3E-6); 45% (p = 6.1E-15)
Genes coding for unstable transcripts d: 8% (p = 9E-11)
Shared among eukaryotes e: 18% (p = 0.002); 34% (p = 2E-10)
Body methylation f: 63% (p = 1.5E-35); 8% (p = 2E-10)
Promoter methylation f
Body methylation g: 36% (p = 9.1E-21); 2% (p = 3.8E-8)
Eukaryotic genes are transcriptionally regulated by the coordinated interaction of multiple protein factors that interact with discrete binding sites and with each other. These binding sites are usually located upstream of the transcribed region they regulate. The promoters of hypervariable genes often have a TATA-box sequence and contain a larger number of predicted transcription factor binding sites as compared to the housekeeping genes or the genome average (Table 1 and Additional File 16). These data suggest that the presence of a TATA box and the number of transcription factor binding sites in the promoter region of some of the most responsive genes in Arabidopsis may explain their capacity to respond to stimuli, as was previously found in an analysis of a smaller expression dataset. However, it is clear that this simple rule does not always apply and that other factors are necessary to explain gene expression responses.
In addition to gene structure, epigenetic mechanisms such as DNA methylation are known to have an impact on gene expression in eukaryotes, particularly in heterochromatic regions [22, 23]. To evaluate the potential role of DNA methylation in the gene expression responses observed for housekeeping and hypervariable genes, we analyzed the methylation patterns of these two groups of genes. We used two recently published genome-wide methylation data sets [24, 25] to analyze methylation in the promoter and transcribed regions of each gene. Using the methylome data produced by Zhang et al., we found that a large proportion of housekeeping genes were methylated in their transcribed regions (a significant enrichment compared to the expected genome frequency; p = 1.5E-35, Table 1). By contrast, only 8% of the hypervariable genes were methylated in their transcribed regions (a significant depletion; p = 2E-10, Table 1). Similar results were obtained with an independently generated methylome data set. These results suggest that the capacity of Arabidopsis housekeeping and hypervariable genes to respond to stimuli not only depends on structural features in their promoter or transcribed regions, such as transcription factor binding sites, but may also have an important epigenetic component.
Transcript region methylation is the most important factor to explain genome-wide responses to internal/external stimuli
To evaluate the importance of these features for gene expression responses on a genomic scale, we performed a regression analysis of the gene responsiveness for all Arabidopsis genes as a function of each of the structural features described above. We used a linear model of the form: Y ~ αX + β, where Y was the observed gene responsiveness of all genes and X was the structural feature under evaluation (e.g. presence of TATA-box, cis-acting binding sites in the promoter or gene body methylation). Thus, the effects detected were free from any bias arising from gene selection, as could be the case when analyzing this relatively small group of housekeeping and hypervariable genes.
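To make the model Y ~ αX + β concrete, the following is a minimal Python sketch of an ordinary least-squares fit that also reports R². It is an illustration only: the analysis described here was carried out in R, and the function name and the toy values (a hypothetical methylated fraction per responsiveness group) are assumptions, not data from the study.

```python
from statistics import mean


def fit_simple_linear(x, y):
    """Least-squares fit of y ~ alpha * x + beta.

    Returns (alpha, beta, r_squared), where r_squared is the share of
    the variance in y explained by the fitted line.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    beta = y_bar - alpha * x_bar
    ss_res = sum((yi - (alpha * xi + beta)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return alpha, beta, r_squared


# Toy data (hypothetical): methylated fraction falls as responsiveness rises.
methylated_fraction = [0.60, 0.50, 0.40, 0.30, 0.20]
responsiveness = [0, 10, 20, 30, 40]
alpha, beta, r2 = fit_simple_linear(methylated_fraction, responsiveness)
print(alpha, beta, r2)
```

A negative slope in such a fit corresponds to the inverse relationship between body methylation and responsiveness discussed in this section.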
Results of the simple and multiple linear regression analyses. [Table values not recovered. Surviving labels: frequency of H3K27me3 target genes; cis-acting elements; methylation + TATA-box; TAIR Genome v6.0.]
We also evaluated the relationship between the presence of modified histones and gene responsiveness. We used a recently published genomic survey of trimethylation in lysine 27 of histone H3 (H3K27me3). We found a weak correlation between the frequency of H3K27me3 gene targets and gene responsiveness, with an R² of 0.12 (Figure 3F and Additional File 19). This finding is consistent with the hypothesis that H3K27me3 mostly acts in a DNA methylation-independent manner, as previously suggested. Other histone modifications, such as H3K4 or H3K9 methylation or combinations thereof, may be related to gene body methylation in Arabidopsis, thus "marking" the corresponding chromatin region for or against the regulation of gene expression.
Gene body methylation and regulation of expression by nitrate in TCA cycle genes
Relationship between the methylation status and nitrate regulation of TCA cycle genes. [Surviving column header: responsiveness to nitrate. Per-gene values not recovered; gene annotations follow.]
- MDH (malate dehydrogenase); malate dehydrogenase
- malate dehydrogenase, cytosolic, putative
- malate dehydrogenase (NAD), mitochondrial
- PMDH1 (PEROXISOMAL NAD-MALATE DEHYDROGENASE 1); malate dehydrogenase
- malate dehydrogenase (NAD), mitochondrial, putative
- PMDH2 (PEROXISOMAL NAD-MALATE DEHYDROGENASE 2), PMDH2 (PEROXISOMAL NAD-MALATE DEHYDROGENASE 2); malate dehydrogenase
- malate dehydrogenase, cytosolic, putative
- malate dehydrogenase (NADP), chloroplast, putative
- malate dehydrogenase, cytosolic, putative
- isocitrate dehydrogenase, putative/NAD+ isocitrate dehydrogenase, putative
- IDH1 (ISOCITRATE DEHYDROGENASE 1); isocitrate dehydrogenase (NAD+)
- isocitrate dehydrogenase, putative/NADP+ isocitrate dehydrogenase, putative
- isocitrate dehydrogenase, putative/NAD+ isocitrate dehydrogenase, putative
- isocitrate dehydrogenase, putative/NAD+ isocitrate dehydrogenase, putative
- isocitrate dehydrogenase, putative/NADP+ isocitrate dehydrogenase, putative
- ICDH (ICDH); isocitrate dehydrogenase (NADP+)
The analysis of the large and heterogeneous whole-genome microarray dataset available in the public domain proved useful to evaluate principles that govern regulation of gene expression in plants. Our global and systematic analysis of the quantitative effect of different experimental factors (e.g., mutations, stress and organ identity) on the plant transcriptome revealed the key role of developmental processes for establishing mRNA levels throughout the plant. This process in turn determines how cells, organs and tissues respond to exogenous cues. Our data indicate that plant responses to external stimuli are strongly organ-dependent and underscore the need for a more thorough survey of organ-specific and, by extension, cell-specific responses in Arabidopsis and other plants.
The second part of our analysis provided a weighted insight into the role of different molecular mechanisms in the global regulation of gene expression in Arabidopsis. The data indicate that DNA methylation within the body of Arabidopsis genes is a key factor that may determine or negatively influence the capacity of genes to respond to internal or external cues. The presence of a TATA-box may favor gene responsiveness but to a lesser extent than the negative effect of DNA methylation. Surprisingly, our data indicate that other gene structural features (e.g., number of cis-acting elements, gene size, presence and number of introns) are less important than DNA methylation and the presence of a TATA-box. These results highlight the importance of epigenetic mechanisms for the global control of gene expression. As a concrete example, we found consistency between regulation by an external stimulus (nitrate) and gene body methylation for a discrete biological process, the TCA cycle, beyond what would be expected by chance. The results presented here suggest a model whereby gene body DNA methylation restrains the ability of a gene to be regulated, regardless of regulatory signals (e.g., binding sites for specific transcription factors in the promoter region). This effect would not be directly dependent on basal gene expression levels. Moreover, our results provide a plausible functional role for the DNA methylation that is found in the body of a large number of Arabidopsis genes. This new role differs from the proposed role for DNA methylation in suppressing spurious transcriptional initiation [25, 39] and reinforces the link between the regulation of gene expression and DNA methylation in eukaryotes.
The CEL data files comprising all ATH1 Affymetrix hybridizations through the end of 2005 were obtained from NASCArrays through the AffyWatch Subscription Service. These data comprised 1887 hybridizations corresponding to 108 different experiments. The entire hybridization set was normalized using the Robust Multi-array Average (RMA) method available from Bioconductor (http://www.bioconductor.org). Once normalized, the hybridizations were quality-controlled using the method devised by Persson et al. Briefly, this method uses a Kolmogorov-Smirnov goodness-of-fit test to evaluate whether the distribution of deleted residuals for an individual hybridization deviates from a "t" distribution. According to Persson et al., a hybridization is considered to deviate when the value of the D statistic from the goodness-of-fit test is more than 0.15. The CEL files with a D statistic over this cut-off value were excluded from the analysis. This step resulted in the exclusion of 186 CEL files.
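The quality filter can be sketched as follows in Python. This is a simplified illustration, not the procedure of Persson et al.: it computes a one-sample Kolmogorov-Smirnov D statistic against a reference distribution and applies the 0.15 cut-off. As a loudly labeled assumption, the standard normal distribution stands in for the t distribution (the Python standard library has no t CDF), and the function names and toy residuals are hypothetical.

```python
from statistics import NormalDist

D_CUTOFF = 0.15  # cut-off stated in the text


def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov D: the largest vertical gap
    between the empirical CDF of `sample` and the reference `cdf`."""
    values = sorted(sample)
    n = len(values)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, v in enumerate(values):
        f = cdf(v)
        # Check the gap just after and just before each step of the ECDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def passes_qc(residuals, cdf=NormalDist().cdf, cutoff=D_CUTOFF):
    """Keep a hybridization when D does not exceed the cut-off.

    Assumption: a standard normal CDF replaces the t distribution.
    """
    return ks_statistic(residuals, cdf) <= cutoff


# Toy residuals (hypothetical): evenly spaced normal quantiles pass,
# a constant offset fails.
good = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
bad = [5.0] * 50
print(passes_qc(good), passes_qc(bad))
```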
For the analysis of differential expression, the remaining 1701 hybridizations were mapped to their corresponding experiments. Controls and biologically meaningful tests were identified and grouped with their replicates. Comparisons in which the control or treatment hybridizations had fewer than two replicates were discarded. This process resulted in a list of 474 biologically meaningful comparisons (control versus test), including 1295 hybridizations. In the case of tissue comparisons, we used rosette leaves as a control, and all other tissues were considered tests. Rosette leaves were chosen as the reference because they are the prototypical organ system. We classified the comparisons according to the experimental variable involved, using the criteria defined by TAIR, and according to the RNA source organ (Figure 1).
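The replicate filter described above amounts to a simple rule, sketched here in Python. The data layout (a dictionary of comparison identifiers mapping to control and test hybridization lists) and all identifiers are hypothetical; only the minimum of two replicates per group comes from the text.

```python
MIN_REPLICATES = 2  # minimum stated in the text


def valid_comparisons(comparisons, min_replicates=MIN_REPLICATES):
    """Keep comparisons whose control and test groups both have
    at least `min_replicates` hybridizations."""
    return {
        comparison_id: groups
        for comparison_id, groups in comparisons.items()
        if len(groups["control"]) >= min_replicates
        and len(groups["test"]) >= min_replicates
    }


# Toy example with hypothetical comparison and hybridization identifiers.
comparisons = {
    "cmp_A": {"control": ["h1", "h2"], "test": ["h3", "h4", "h5"]},
    "cmp_B": {"control": ["h6"], "test": ["h7", "h8"]},
}
kept = valid_comparisons(comparisons)
print(sorted(kept))
```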
Differential expression analysis
The comparisons were analyzed for differential gene expression using the RankProducts method, implemented as a Bioconductor package. In a study comparing ten methods for defining differential expression, this method outperformed the others, particularly in high-noise, low-replicate datasets. Our comparisons have a low number of replicates (average = 2.7) and a high variability (pooled variance of the whole dataset = 4.04). We also evaluated the performance of RankProducts against other popular methods using biological criteria. We defined regulation using RankProducts, average fold change and t-test with different FDR corrections for multiple testing [44, 45]. To evaluate the methods, we randomly chose five test comparisons from different experimental categories (e.g. biotic, abiotic, tissue).
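The core idea of a rank product can be sketched in a few lines of Python. This is not the Bioconductor implementation used in the study: it only computes the statistic for up-regulation (the geometric mean of a gene's rank across replicates, with rank 1 given to the largest log ratio), breaks ties by input order, and omits the significance estimation that the published method adds. The function name and toy log ratios are hypothetical.

```python
import math


def rank_products(log_ratios):
    """Rank product for up-regulation.

    log_ratios: dict mapping gene -> list of log fold changes,
    one value per replicate. A smaller result means the gene ranks
    consistently near the top across replicates.
    """
    genes = list(log_ratios)
    if not genes:
        return {}
    n_reps = len(log_ratios[genes[0]])
    if n_reps == 0 or any(len(v) != n_reps for v in log_ratios.values()):
        raise ValueError("every gene needs the same non-zero number of replicates")
    log_sum = {g: 0.0 for g in genes}
    for r in range(n_reps):
        # Rank 1 goes to the most strongly up-regulated gene in replicate r.
        ordered = sorted(genes, key=lambda g: log_ratios[g][r], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)
    # Geometric mean of the ranks, computed in log space.
    return {g: math.exp(log_sum[g] / n_reps) for g in genes}


# Toy data (hypothetical): three genes, two replicates.
rp = rank_products({
    "geneA": [2.0, 1.8],
    "geneB": [0.1, -0.2],
    "geneC": [-1.5, -1.1],
})
print(rp)
```

Running the same function on negated log ratios would give the corresponding statistic for down-regulation.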
We evaluated the functional coherence of the genes called differentially expressed by each method by identifying enriched gene ontology (GO) terms in the resulting lists. For most of the comparisons tested, visual inspection of the RankProducts lists revealed enriched GO terms that were obviously related to the experimental factor. This was not the case for the other methods. As an example, 245 genes were found to be differentially expressed in the comparison DO.1.1 (Additional File 1). Out of these 245 genes, 217 were previously identified as regulated in these experiments using a different method in a prior study. In addition, the 140 down-regulated genes determined by RankProducts showed an overrepresentation of "transport" and other functional terms previously known to be related to the experimental factor. Similarly, the abscisic acid response evaluated in comparison AQ.4.4 (Additional File 1) identified 241 differentially expressed genes. Among the up-regulated genes, we found that the 'abscisic acid response' functional term was overrepresented.
With the results of the differential expression analysis, a "regulation matrix" was created. This matrix contained the p-value for the down- and up-regulation of all of the ATH1 Affymetrix chip probes across the 474 comparisons. The cut-off for defining a probe as differentially expressed was 0.05. The complete data file with ratios is available from http://virtualplant.bio.puc.cl/cgi-bin/Lab/download.cgi. Additional data files are available upon request.
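Gene responsiveness, defined in this work as the number of comparisons in which a gene is differentially expressed, can be derived from such a matrix as in the Python sketch below. The data layout (one pair of up- and down-regulation p-values per comparison), the strict "less than" test at the 0.05 cut-off, and the probe names are assumptions made for illustration.

```python
P_CUTOFF = 0.05  # cut-off stated in the text


def gene_responsiveness(regulation_matrix, cutoff=P_CUTOFF):
    """Count, per probe, the comparisons with significant regulation.

    regulation_matrix: dict mapping probe -> list of (p_up, p_down)
    tuples, one tuple per comparison. A comparison counts when either
    p-value is below the cut-off (assumed strict inequality).
    """
    return {
        probe: sum(1 for p_up, p_down in p_values if min(p_up, p_down) < cutoff)
        for probe, p_values in regulation_matrix.items()
    }


# Toy matrix (hypothetical probes, three comparisons each).
matrix = {
    "probe_1": [(0.001, 0.90), (0.40, 0.02), (0.30, 0.60)],
    "probe_2": [(0.20, 0.70), (0.50, 0.50), (0.06, 0.80)],
}
responsiveness_by_probe = gene_responsiveness(matrix)
print(responsiveness_by_probe)
```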
Housekeeping and hypervariable gene definition
The least responsive genes (housekeeping genes) were defined as follows: first, we selected genes which did not show differential expression in any comparison (5652 genes). Second, these genes were filtered for expression above the median of the entire NASC dataset (1758 genes). Third, we chose only those having a signal difference between the 1st and 3rd quartile (interquartile range) that was in the bottom 5% of the signal interquartile ranges from the whole dataset. This ensured the selection of 384 expressed Arabidopsis genes that exhibit the lowest expression variability.
For the most responsive genes (hypervariable genes), we first chose genes that were regulated in 86 or more comparisons, corresponding to the top 1% most responsive genes from Figure 2C. Second, we selected genes that were regulated in at least six out of the eight categories defined in Figure 1A to avoid any bias due to large categories (e.g., abiotic stress experiments). We did not use an expression cutoff since, as expected, hypervariable genes were sufficiently expressed, with a median signal of 8.4 across the NASC dataset (the global median is 7.4). From the 185 genes selected by these criteria, we chose those with a signal interquartile range in the upper 5% of the entire dataset. Thus, we defined a group of 123 "hypervariable genes".
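The two selection procedures can be summarized in a short Python sketch. It is a simplified reading of the criteria above, not the scripts used in the study: the record layout, field names and function name are hypothetical, and the 5th and 95th percentiles of the interquartile range are approximated with `statistics.quantiles`.

```python
from statistics import quantiles


def classify_genes(genes, global_median, min_comparisons=86, min_categories=6):
    """Split genes into housekeeping and hypervariable lists.

    genes: dict mapping gene -> dict with keys 'responsiveness',
    'median_signal', 'iqr' and 'n_categories' (hypothetical layout).
    """
    iqrs = [record["iqr"] for record in genes.values()]
    cuts = quantiles(iqrs, n=20)  # 19 cut points in steps of 5%
    low_iqr, high_iqr = cuts[0], cuts[-1]
    housekeeping, hypervariable = [], []
    for name, record in genes.items():
        if (record["responsiveness"] == 0
                and record["median_signal"] > global_median
                and record["iqr"] <= low_iqr):
            housekeeping.append(name)
        elif (record["responsiveness"] >= min_comparisons
                and record["n_categories"] >= min_categories
                and record["iqr"] >= high_iqr):
            hypervariable.append(name)
    return housekeeping, hypervariable
```

The default thresholds (86 comparisons, six categories) are the values quoted in the text; the global median would be supplied from the dataset, for example 7.4 as reported above.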
Structural and phylogenetic analyses and correlation with gene responsiveness
Gene structural features (gene, CDS, exon, intron lengths and numbers) were obtained from the TAIR 6.0 Arabidopsis genome. Phylogenetic classifications of the genes were obtained from the Plant-Specific Database. Methylation status of the different genes (body methylated, body unmethylated and promoter methylated) was obtained from Zhang et al. or Zilberman et al. TATA-box presence or absence in the promoter region of Arabidopsis genes was obtained from Molina and Grotewold. The number of transcription factor binding sites in gene promoters was calculated from the data in the AtCis Database from AGRIS. Unstable transcripts were extracted from the data generated by Gutierrez et al. All data were processed using custom-made scripts in the R (http://www.R-project.org) and Perl languages. Statistical analyses and graphs were done in R, GraphPad Prism 4.0 software or Microsoft Excel.
Statistical and regression analysis
Calculation of significant enrichment or depletion was done in R using the hypergeometric distribution. t-tests were carried out with the GraphPad Prism 4.0 software. Simple and multiple linear regression models used to predict gene responsiveness as a function of various structural parameters were fitted in R. We used simple models of the form: Y ~ αX + β, where Y, the response variable, is the gene responsiveness and X is the value of the structural feature under evaluation. In the case of categorical features, such as methylation or the presence of TATA-box, X represented the frequency of the feature in a group of genes sharing the same responsiveness. For multiple linear regressions, we used models of the form: Y ~ αX + βZ + γW... where Y was the gene responsiveness and X, Z, W, etc. corresponded to the different features to evaluate. Models were fitted using the lm function from the R statistical software. We used the R² parameter to evaluate the quality of the model, since R² represents the extent of data variability explained by the model. As a complementary approach for categorical features, we used one-factor ANOVA models. They have the form Y ~ αX + β, where X was a factor encoding the presence or absence of those features at two different levels. We used the 'aov' function in R to fit the model. We used the F statistic to estimate the significance of the contribution of the factors to the response. To estimate the differences between the levels of the factors, we followed the Tukey procedure, using the 'glht' function from the 'multcomp' package in R. The Bayesian Information Criterion was calculated in R using the 'BIC' function in the package 'nlme'. Graphs were done in R, GraphPad Prism 4.0 software or Microsoft Excel.
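The enrichment and depletion tests based on the hypergeometric distribution can be written out explicitly. The Python sketch below is an illustration under stated assumptions, not the R code used in the study: the function names are hypothetical, and the toy numbers in the example are invented for checking the arithmetic.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without
    replacement from `population` genes, of which `successes`
    carry the feature (e.g. body methylation)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


def hypergeom_depletion_p(population, successes, draws, observed):
    """P(X <= observed) under the same sampling scheme."""
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(0, min(observed, successes, draws) + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 genes, 5 with the feature, 2 sampled.
p_enriched = hypergeom_enrichment_p(10, 5, 2, 2)
p_depleted = hypergeom_depletion_p(10, 5, 2, 0)
print(p_enriched, p_depleted)
```

In the toy case both tails equal 10/45, since drawing two feature-carrying genes or none is equally likely when half the population carries the feature.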
Gene body methylation and regulation by nitrate for TCA cycle genes
We retrieved the genes corresponding to the TCA cycle from AraCyc. We then determined the gene responsiveness of these genes in four previously published microarray data sets [34–37] that were not included in the NASCArrays database and were therefore not used to derive our genome-wide conclusions. We intersected the methylation status [24, 25] and regulation by nitrate of the genes encoding malate dehydrogenases and isocitrate dehydrogenases using the VirtualPlant software platform (http://www.virtualplant.org). Statistical analysis of enrichment was performed as described above.
We thank Xiaoyu Zhang, Dr. Steve Jacobsen and Dr. Joseph Ecker for kindly providing genome-wide DNA methylation data in a custom format, and Juanita Larraín-Linton for her proof-reading. This work was funded by grants from: ICGEB (CRPCHI0501), FONDECYT (1060457), MILLENNIUM NUCLEUS FOR PLANT FUNCTIONAL GENOMICS (P006-09-F), FUNDACION ANDES (C14060/62) and NSF (DBI0445666) to R.A.G. F.F.A. was funded by a Ph.D. fellowship from CONICYT.
- Kilian J, Whitehead D, Horak J, Wanke D, Weinl S, Batistic O, D'Angelo C, Bornberg-Bauer E, Kudla J, Harter K: The AtGenExpress global stress expression data set: protocols, evaluation and model data analysis of UV-B light, drought and cold stress responses. Plant J. 2007, 50 (2): 347-363. 10.1111/j.1365-313X.2007.03052.x.
- Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU: A gene expression map of Arabidopsis thaliana development. Nat Genet. 2005, 37 (5): 501-506. 10.1038/ng1543.
- Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN: A Gene Expression Map of the Arabidopsis Root. Science. 2003, 302 (5652): 1956-1960. 10.1126/science.1090022.
- Spencer MW, Casson SA, Lindsey K: Transcriptional profiling of the Arabidopsis embryo. Plant Physiol. 2007, 143 (2): 924-940. 10.1104/pp.106.087668.
- Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, Kidd MJ, King AM, Meyer MR, Slade D, Lum PY, Stepaniants SB, Shoemaker DD, Gachotte D, Chakraburtty K, Simon J, Bard M, Friend SH: Functional discovery via a compendium of expression profiles. Cell. 2000, 102 (1): 109-126. 10.1016/S0092-8674(00)00015-5.
- Kim SK, Lund J, Kiraly M, Duke K, Jiang M, Stuart JM, Eizinger A, Wylie BN, Davidson GS: A Gene Expression Map for Caenorhabditis elegans. Science. 2001, 293 (5537): 2087. 10.1126/science.1061603.
- Stuart JM, Segal E, Koller D, Kim SK: A gene-coexpression network for global discovery of conserved genetic modules. Science. 2003, 302 (5643): 249-255. 10.1126/science.1087447.
- Allocco D, Kohane I, Butte A: Quantifying the relationship between co-expression, co-regulation and gene function. BMC Bioinformatics. 2004, 5 (1): 18. 10.1186/1471-2105-5-18.
- Harmer SL, Hogenesch JB, Straume M, Chang HS, Han B, Zhu T, Wang X, Kreps JA, Kay SA: Orchestrated transcription of a key pathway in Arabidopsis by the circadian clock. Science. 2000, 290: 2110-2113. 10.1126/science.290.5499.2110.
- Craigon DJ, James N, Okyere J, Higgins J, Jotham J, May S: NASCArrays: a repository for microarray data generated by NASC's transcriptomics service. Nucleic Acids Res. 2004, 32 (Database issue): D575-577. 10.1093/nar/gkh133.
- Breitling R, Armengaud P, Amtmann A, Herzyk P: Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett. 2004, 573 (1–3): 83-92. 10.1016/j.febslet.2004.07.055.
- Jeffery IB, Higgins DG, Culhane AC: Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data. BMC Bioinformatics. 2006, 7: 359. 10.1186/1471-2105-7-359.
- Kunst L, Klenz JE, Martinez-Zapater J, Haughn GW: AP2 Gene Determines the Identity of Perianth Organs in Flowers of Arabidopsis thaliana. Plant Cell. 1989, 1 (12): 1195-1208. 10.1105/tpc.1.12.1195.
- Tajima Y, Imamura A, Kiba T, Amano Y, Yamashino T, Mizuno T: Comparative studies on the type-B response regulators revealing their distinctive properties in the His-to-Asp phosphorelay signal transduction of Arabidopsis thaliana. Plant Cell Physiol. 2004, 45 (1): 28-39. 10.1093/pcp/pcg154.
- Schiefelbein J: Cell-fate specification in the epidermis: a common patterning mechanism in the root and shoot. Curr Opin Plant Biol. 2003, 6 (1): 74-78. 10.1016/S136952660200002X.
- Weigel D, Alvarez J, Smyth DR, Yanofsky MF, Meyerowitz EM: LEAFY controls floral meristem identity in Arabidopsis. Cell. 1992, 69 (5): 843-859. 10.1016/0092-8674(92)90295-N.
- Ren X-Y, Vorst O, Fiers MWEJ, Stiekema WJ, Nap J-P: In plants, highly expressed genes are the least compact. Trends in Genetics. 2006, 22 (10): 528-532. 10.1016/j.tig.2006.08.008.
- Gutierrez RA, Green PJ, Keegstra K, Ohlrogge JB: Phylogenetic profiling of the Arabidopsis thaliana proteome: what proteins distinguish plants from other organisms?. Genome Biol. 2004, 5 (8): R53. 10.1186/gb-2004-5-8-r53.
- Gutierrez RA, Ewing RM, Cherry JM, Green PJ: Identification of unstable transcripts in Arabidopsis by cDNA microarray analysis: rapid decay is associated with a group of touch- and specific clock-controlled genes. Proc Natl Acad Sci USA. 2002, 99 (17): 11513-11518. 10.1073/pnas.152204099.
- Orphanides G, Reinberg D: A Unified Theory of Gene Expression. Cell. 2002, 108 (4): 439-451. 10.1016/S0092-8674(02)00655-4.
- Walther D, Brunnemann R, Selbig J: The regulatory code for transcriptional response diversity and its relation to genome structural properties in Arabidopsis thaliana. PLoS Genetics. 2006, preprint(2006):e11.eor.
- Chan SW, Henderson IR, Jacobsen SE: Gardening the genome: DNA methylation in Arabidopsis thaliana. Nat Rev Genet. 2005, 6 (5): 351-360. 10.1038/nrg1601.
- Gehring M, Henikoff S: DNA methylation dynamics in plant genomes. Biochim Biophys Acta. 2007
- Zhang X, Yazaki J, Sundaresan A, Cokus S, Chan SW, Chen H, Henderson IR, Shinn P, Pellegrini M, Jacobsen SE, Ecker JR: Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. Cell. 2006, 126 (6): 1189-1201. 10.1016/j.cell.2006.08.003.
- Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S: Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007, 39 (1): 61-69. 10.1038/ng1929.
- Molina C, Grotewold E: Genome wide analysis of Arabidopsis core promoters. BMC Genomics. 2005, 6 (1): 25. 10.1186/1471-2164-6-25.
- Shahmuradov IA, Gammerman AJ, Hancock JM, Bramley PM, Solovyev VV: PlantProm: a database of plant promoter sequences. Nucleic Acids Res. 2003, 31 (1): 114-117. 10.1093/nar/gkg041.
- Tukey J: Multiple comparisons. J Am Stat Assoc. 1953, 48: 624-625.
- Schwarz G: Estimating the dimension of a model. Annls Statistics. 1978, 6: 461-464. 10.1214/aos/1176344136.
- Zhang X, Clarenz O, Cokus S, Bernatavichute YV, Pellegrini M, Goodrich J, Jacobsen SE: Whole-genome analysis of histone H3 lysine 27 trimethylation in Arabidopsis. PLoS Biol. 2007, 5 (5): e129. 10.1371/journal.pbio.0050129.
- Shi J, Dawe RK: Partitioning of the Maize Epigenome by the Number of Methyl Groups on Histone H3 Lysines 9 and 27. Genetics. 2006, 173 (3): 1571-1583. 10.1534/genetics.106.056853.
- Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K, Jaenisch R, Wagschal A, Feil R, Schreiber SL, Lander ES: A Bivalent Chromatin Structure Marks Key Developmental Genes in Embryonic Stem Cells. Cell. 2006, 125 (2): 315-326. 10.1016/j.cell.2006.02.041.
- Garcia-Bassets I, Kwon Y-S, Telese F, Prefontaine GG, Hutt KR, Cheng CS, Ju B-G, Ohgi KA, Wang J, Escoubet-Lozach L, Rose DW, Glass CK, Fu X-D, Rosenfeld MG: Histone Methylation-Dependent Mechanisms Impose Ligand Dependency for Gene Activation by Nuclear Receptors. Cell. 2007, 128 (3): 505-518. 10.1016/j.cell.2006.12.038.
- Wang R, Tischner R, Gutierrez RA, Hoffman M, Xing X, Chen M, Coruzzi G, Crawford NM: Genomic analysis of the nitrate response using a nitrate reductase-null mutant of Arabidopsis. Plant Physiol. 2004, 136 (1): 2512-2522. 10.1104/pp.104.044610.
- Scheible WR, Morcuende R, Czechowski T, Fritz C, Osuna D, Palacios-Rojas N, Schindelasch D, Thimm O, Udvardi MK, Stitt M: Genome-wide reprogramming of primary and secondary metabolism, protein synthesis, cellular growth processes, and the regulatory infrastructure of Arabidopsis in response to nitrogen. Plant Physiol. 2004, 136 (1): 2483-2499. 10.1104/pp.104.047019.
- Palenchar P, Kouranov A, Lejay L, Coruzzi G: Genome-wide patterns of carbon and nitrogen regulation of gene expression validate the combined carbon and nitrogen (CN)-signaling hypothesis in plants. Genome Biology. 2004, 5 (11): R91.
- Wang R, Okamoto M, Xing X, Crawford NM: Microarray analysis of the nitrate response in Arabidopsis roots and shoots reveals over 1,000 rapidly responding genes and new linkages to glucose, trehalose-6-phosphate, iron, and sulfate metabolism. Plant Physiology. 2003, 132: 556-567. 10.1104/pp.103.021253.
- Stitt M, Muller C, Matt P, Gibon Y, Carillo P, Morcuende R, Scheible WR, Krapp A: Steps towards an integrated view of nitrogen metabolism. J Exp Bot. 2002, 53 (370): 959-970. 10.1093/jexbot/53.370.959.
- Suzuki MM, Kerr ARW, De Sousa D, Bird A: CpG methylation is targeted to transcription units in an invertebrate genome. Genome Res. 2007, 17 (5): 625-631. 10.1101/gr.6163007.
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostat. 2003, 4 (2): 249-264. 10.1093/biostatistics/4.2.249.
- Persson S, Wei H, Milne J, Page GP, Somerville CR: Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. PNAS. 2005, 102 (24): 8633-8638. 10.1073/pnas.0503392102.
- Rhee SY, Beavis W, Berardini TZ, Chen GH, Dixon D, Doyle A, Garcia-Hernandez M, Huala E, Lander G, Montoya M, Miller N, Mueller LA, Mundodi S, Reiser L, Tacklind J, Weems DC, Wu YH, Xu I, Yoo D, Yoon J, Zhang PF: The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Research. 2003, 31 (1): 224-228. 10.1093/nar/gkg076.PubMedView Article
- Hong F, Breitling R, McEntee CW, Wittner BS, Nemhauser JL, Chory J: RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis. Bioinformatics. 2006, 22 (22): 2825-2827. 10.1093/bioinformatics/btl476.PubMedView Article
- Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B. 1995, 57: 289-300.
- Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003, 100 (16): 9440-9445. 10.1073/pnas.1530509100.PubMedPubMed CentralView Article
- Deeken R, Engelmann JC, Efetova M, Czirjak T, Muller T, Kaiser WM, Tietz O, Krischke M, Mueller MJ, Palme K, Dandekar T, Hedrich R: An integrated view of gene expression and solute profiles of Arabidopsis tumors: a genome-wide approach. Plant Cell. 2006, 18 (12): 3617-3634. 10.1105/tpc.106.044743.PubMedPubMed CentralView Article
- Davuluri R, Sun H, Palaniswamy S, Matthews N, Molina C, Kurtz M, Grotewold E: AGRIS: Arabidopsis Gene Regulatory Information Server, an information resource of Arabidopsis cis-regulatory elements and transcription factors. BMC Bioinformatics. 2003, 4 (1): 25-10.1186/1471-2105-4-25.PubMedPubMed CentralView Article
- Mueller LA, Zhang P, Rhee SY: AraCyc: A Biochemical Pathway Database for Arabidopsis. Plant Physiol. 2003, 132 (2): 453-10.1104/pp.102.017236.PubMedPubMed CentralView Article
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.