- Research article
- Open Access
Differential SAGE analysis in Arabidopsis uncovers increased transcriptome complexity in response to low temperature
BMC Genomics volume 9, Article number: 434 (2008)
Abiotic stress, including low temperature, limits the productivity and geographical distribution of plants, which has led to significant interest in understanding the complex processes that allow plants to adapt to such stresses. The wide range of physiological, biochemical and molecular changes that occur in plants exposed to low temperature requires a robust global approach to studying the response. We have employed Serial Analysis of Gene Expression (SAGE) to uncover changes in the transcriptome of Arabidopsis thaliana over a time course of low temperature stress.
Five SAGE libraries were generated from A. thaliana leaf tissue collected at time points ranging from 30 minutes to one week of low temperature treatment (4°C). Over 240,000 high quality SAGE tags, corresponding to 16,629 annotated genes, provided a comprehensive survey of changes in the transcriptome in response to low temperature, from perception of the stress to acquisition of freezing tolerance. Interpretation of these data was facilitated by representing the SAGE data by gene identifier, allowing more robust statistical analysis, cross-platform comparisons and the identification of genes sharing common expression profiles. Simultaneous statistical calculations across all five libraries identified 920 low temperature responsive genes, only 24% of which overlapped with previous global expression analysis performed using microarrays, although similar functional categories were affected. Clustering of the differentially regulated genes facilitated the identification of novel loci correlated with the development of freezing tolerance. Analysis of their promoter sequences revealed subsets of genes that were independent of CBF and ABA regulation and could provide a mechanism for elucidating complementary signalling pathways. The SAGE data emphasised the complexity of the plant response, with alternate pre-mRNA processing events increasing at low temperatures and antisense transcription being repressed.
Alternate transcript processing appears to play an important role in enhancing the plasticity of the stress induced transcriptome. Novel genes and cis-acting sequences have been identified as compelling targets to allow manipulation of the plant's ability to protect against low temperature stress. The analyses performed provide a contextual framework for the interpretation of quantitative sequence tag based transcriptome analysis which will prevail with the application of next generation sequencing technology.
Abiotic stresses, including temperature, are a major constraint on the distribution of plant species throughout the world and the impact of these stresses has been estimated to reduce yield potential by 69%. Temperate plants have the ability to increase their freezing tolerance in response to a period of low, non-freezing temperatures through a physiological adaptation known as cold acclimation. In the non-acclimated state, exposure to freezing temperatures causes significant damage to most plant species. However, once acclimated, freezing tolerance is significantly increased and the ability to resist this stress varies both within and between species; for example, Brassica napus, wheat and rye are able to withstand temperatures of -16°C, -19°C and -29°C respectively [3–5]. Arabidopsis thaliana (hereafter referred to as Arabidopsis) has been used as a model to study freezing tolerance and is able to exhibit a modest increase of 5°C in freezing tolerance, from -3°C to -8°C, after exposure to acclimating conditions for one week [6, 7]. Cold acclimation is a quantitative trait with a multitude of metabolic, molecular and physiological changes occurring in response to low temperature exposure. An understanding of the molecular components controlling this response has been advanced by the isolation of mutants with a differential response to temperature stress [8–10] and through the identification of genes whose expression is correlated with the development of freezing tolerance [11–14]. However, the individual effects of the majority of these genes are marginal, perhaps reflecting redundancy among stress response networks. Therefore, to identify key regulatory proteins within these networks, a comprehensive strategy is required, which may be achieved through the use of genomics technologies.
Serial analysis of gene expression (SAGE) has been used to profile the transcriptomes of at least twenty species since the technique was developed by Velculescu et al. SAGE is a sequence based technology developed to generate a transcript expression profile in a high throughput, accurate and non-biased manner. Briefly, the method allows the capture of a 14–21 bp cDNA fragment from a defined position within each mRNA molecule. The captured SAGE tags are amplified and concatenated together before being cloned and sequenced, with each sequence read allowing information for approximately 40–60 transcripts to be obtained. The frequency of each tag occurring within a library directly represents the abundance of that mRNA species within the tissue sampled. SAGE is a powerful technology that has the capacity to detect small differences in gene expression and the reproducibility of SAGE has been proven by comparisons between tag abundance profiles generated from the same mRNA pool. One of the advantages of using SAGE is the ability to reveal the expression of novel genes as data capture is independent from prior knowledge of DNA sequence. The specificity of tag based technologies allows the detection of post transcriptional regulation including the identification of alternate transcript and antisense products.
In Arabidopsis, effective tag to gene mapping strategies have been assisted by the completion of the genome sequence, in which approximately 30,000 gene models have been identified. In general, SAGE data are referenced to the identified tags rather than to the gene from which they are derived. It is commonly observed that multiple SAGE tags uniquely match to a single gene, which together provide the overall expression level of the locus. Therefore, representation of SAGE data based on the annotated gene identifiers would facilitate evaluating gene expression levels, reduce the number of statistical comparisons and simplify comparisons of expression data across expression profiling platforms. In addition, this representation method enables the visualisation of putative alternate transcript processing within the SAGE data. The available analysis tools are adequate for the majority of SAGE experiments conducted to date that have assayed differences in tag frequency between two experimental conditions. However, determining differences among multiple libraries, which capture the progression of a physiological phenomenon, requires a statistical test with greater power and precision.
This study applied SAGE technology to assess gene expression changes that occur in Arabidopsis leaf tissue exposed to low temperature over a period of one week. Custom Perl scripts were developed to simultaneously analyse multiple libraries and to optimise the presentation of the generated SAGE data. These data demonstrate the changes in the Arabidopsis transcriptome that occur in response to low temperature exposure. Time points were selected to discover gene expression changes correlated with the acclimation process from early signalling events through to the acquisition of freezing tolerance. In addition to identifying a set of novel low temperature regulated loci, the analysis uncovered a disproportionate amount of post transcriptional regulation in response to low temperature, where an increase in alternate transcript processing and a decrease in antisense transcription were observed.
Determination of freezing tolerance in Arabidopsis
The degree of freezing tolerance in Arabidopsis was assayed at five time points, 0 minutes, 30 minutes, 2 hours, 2 days and 1 week of exposure to 4°C. As anticipated, the level of freezing tolerance was positively correlated with the period of time under acclimating conditions. Freezing tolerance was measured to be -4.2°C with non-acclimated material under these growth conditions. There was no detectable increase in freezing tolerance prior to 48 hours exposure to low temperature and after 1 week the freezing tolerance increased to -8.3°C (Figure 1). These results are in agreement with previous assessments of freezing tolerance for Arabidopsis [6, 10].
Experimental SAGE library analysis
SAGE libraries were developed from tissue collected at each of the five selected time points. Sequencing of these libraries resulted in a total of 242,066 high quality SAGE tags (Table 1). This total comprised 38,179 distinct tags, of which 20,741 (54%) were observed only once (singleton tags). In common with previous analyses, singleton tags were the most abundant class in each library, contributing between 59%–63% of the unique tags [21, 22]. Ninety percent (34,146) of the distinct experimental tags could be matched to the Arabidopsis genome sequence. However, due to the short tag length (14 bp) and the non-random nature of genome sequence, a number of tags were assigned to multiple locations, complicating the interpretation of these expression data. Thus, all further analyses were restricted to the most informative tags that had been unambiguously assigned to a single location. This resulted in 26,456 (69%) tags matched to 16,629 annotated genes, including 74 genes on the chloroplast and 5 genes on the mitochondrial genome. An additional 1,942 (5%) tags were matched to the pseudochromosome sequences and a further 67 tags and 37 tags were assigned to intergenic regions of the plastid and mitochondria genomes respectively. In comparison, 15,184 (40%) of all unique tags were matched to 13,186 genes using the available Arabidopsis Unigene and full length cDNA sequences, underlining the advantages of a fully sequenced and annotated genome (Additional file 1).
Interpretation of SAGE data
The SAGE output was restructured to exploit the advantages of utilising a model organism by referencing the results to annotated gene identifiers rather than individual tag sequences (Figure 2). The abundance and relative position of the assigned tags at each locus are indicated along with the overall expression level as determined by summing the individual tag counts. Statistical analysis was performed on both tag and locus counts. SAGE is able to provide evidence for transcription from regions beyond the annotated Arabidopsis gene identifiers (AGI). These include tags matching intergenic regions, which could represent as yet unannotated genes, transposable elements or small RNAs, and tags that remain unassigned to the available genome sequence. This study detected 6,079 such orphan tags, which were included in the statistical analysis to reveal novel low temperature responsive transcripts. A summary of all these SAGE data is provided as additional data (Additional file 2).
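The locus-level representation described above can be illustrated with a minimal Python sketch. It is not the authors' Perl implementation; the function name, the data layout and the example tags and locus name are hypothetical, and it only assumes that each unambiguously assigned tag maps to exactly one locus.

```python
def sum_tags_by_locus(tag_counts, tag_to_locus):
    """Collapse per-tag counts into per-locus counts, library by library.

    tag_counts:   {tag_sequence: [count in library 1, ..., count in library t]}
    tag_to_locus: {tag_sequence: locus_id}, unambiguously assigned tags only
    """
    locus_counts = {}
    for tag, counts in tag_counts.items():
        locus = tag_to_locus.get(tag)
        if locus is None:
            continue  # orphan or ambiguous tag: not summed into any locus
        if locus not in locus_counts:
            locus_counts[locus] = [0] * len(counts)
        for i, count in enumerate(counts):
            locus_counts[locus][i] += count
    return locus_counts


# Hypothetical tags and locus name, five libraries each.
example_counts = {
    "CATGAAAAAAAAAA": [1, 2, 3, 4, 5],
    "CATGCCCCCCCCCC": [5, 4, 3, 2, 1],
    "CATGGGGGGGGGGG": [1, 1, 1, 1, 1],  # left unassigned below
}
example_map = {"CATGAAAAAAAAAA": "LOCUS_A", "CATGCCCCCCCCCC": "LOCUS_A"}
print(sum_tags_by_locus(example_counts, example_map))
```

Keeping the per-tag table alongside the summed table preserves the information needed to inspect putative alternate transcripts at a locus.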
Low temperature regulation of gene expression detected through SAGE
This is the first report where differences among multiple SAGE libraries are determined simultaneously, as previous SAGE analyses have been restricted to pair-wise library comparisons. Statistical analysis was performed using the 2 × t chi-squared test for homogeneity among the five SAGE libraries on both discrete SAGE tags and the identified AGI loci.
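As an illustration of a 2 × t chi-squared test for homogeneity, the following Python sketch tests one tag (or locus) across t libraries. It is written under stated assumptions rather than taken from the authors' scripts: the two rows of the table are taken to be "this tag" and "all other tags", the library sizes in the example are hypothetical, and the p-value uses the closed-form upper tail that holds for an even number of degrees of freedom (t = 5 libraries gives 4).

```python
import math


def chi2_homogeneity(counts, library_sizes):
    """Return (statistic, degrees of freedom) for a 2 x t table.

    Row 1 holds the tag count in each library; row 2 holds the remaining
    tags in that library (library size minus tag count).
    """
    total_count = sum(counts)
    total_size = sum(library_sizes)
    stat = 0.0
    for count, size in zip(counts, library_sizes):
        exp_present = total_count * size / total_size
        exp_absent = size - exp_present
        if exp_present > 0:
            stat += (count - exp_present) ** 2 / exp_present
        if exp_absent > 0:
            stat += ((size - count) - exp_absent) ** 2 / exp_absent
    return stat, len(counts) - 1


def chi2_sf_even_df(stat, df):
    """Upper-tail probability of a chi-squared variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hypothetical counts for one locus in five libraries of equal size.
stat, df = chi2_homogeneity([2, 3, 40, 60, 55], [50000] * 5)
print(stat, df, chi2_sf_even_df(stat, df))
```

Testing all libraries at once avoids the ten separate pair-wise comparisons that five libraries would otherwise require for each tag.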
The SAGE tag analysis revealed 956 tags exhibiting differential expression (p < 0.01), including 874 tags assigned to AGI loci, 54 tags assigned to pseudochromosome sequence, 8 tags assigned to plastid sequence, 5 tags matched to mitochondrial sequence and 15 tags that were not matched to the available genome sequence. Analysis of the total detected expression for each AGI locus, as represented by the sum of the individual tag counts, revealed 920 genes exhibiting a significant difference in gene expression (p < 0.01). A summary of the expression of the 25 genes exhibiting the greatest change in transcript levels in response to low temperature is presented in Table 2 and all differentially expressed genes are available in Additional file 3. It is possible to account for inflation in the frequency of type I errors due to multiple comparisons through the application of the Benjamini-Hochberg or Bonferroni correction. This provides greater confidence for classifying 440 or 300 genes as low temperature responsive, respectively (Additional file 3). However, the application of such corrections assumes independence of loci, which may not be appropriate for gene expression data and might increase the occurrence of type II errors to unacceptable levels.
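The two corrections mentioned above can be sketched in a few lines of Python. This is a generic illustration of the standard Bonferroni and Benjamini-Hochberg step-up procedures, not the authors' code; the significance level applied to the corrected tests is not stated in the text, so the `alpha` default and the example p-values are assumptions.

```python
def bonferroni_reject(p_values, alpha=0.01):
    """Reject a test when its p-value is at most alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.01):
    """Step-up procedure controlling the false discovery rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every test ranked at or below that cut-off.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values for ten loci.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(bonferroni_reject(p_vals, alpha=0.05)))
print(sum(benjamini_hochberg_reject(p_vals, alpha=0.05)))
```

Bonferroni controls the family-wise error rate and is therefore the stricter of the two, which is consistent with it retaining fewer genes than Benjamini-Hochberg in the analysis above.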
Assessment of known low temperature responsive genes
The fidelity of the SAGE data was confirmed by studying the expression of 30 previously identified low temperature induced genes including the LTI/COR/ERD genes [11, 14] and the CBF/DREB1 transcription factor gene family (hereafter referred to as CBF)  (Table 3). The anticipated increase in expression level was observed for 28 of the selected genes (p < 0.01). Interestingly, SAGE was able to discriminate between transcripts derived from duplicate members of the CBF gene family due to unique tags present in the 3' UTR sequence. Each member of this gene family exhibited a similar expression profile upon exposure to low temperature with maximum expression observed after 2 hours. However, subtle differences were observed, with CBF2 expression being detected prior to CBF1 or CBF3 expression and the majority of the transcripts were derived from CBF3. CBF3 was the only homologue transcribed after prolonged exposure to low temperature. Additionally, no significant difference was detected in the expression of ten abiotic stress related genes previously demonstrated to be unaffected by low temperature, which included the DREB2 family, the SOS genes and drought inducible genes (data not shown).
SAGE profiles during the development of freezing tolerance
The generation of SAGE data from multiple libraries under cold acclimating conditions allowed gene expression profiles to be explored and genes with similar profiles to be clustered. The profiles of the 920 low temperature responsive genes were clustered using a post-hoc pair-wise analysis for the chi-square test (Figure 3).
Genes involved in the perception of low temperature and those activating cold specific regulons would be expected to exhibit changes in expression at early time points. A comparison of the control library to each low temperature treated library identified those genes only exhibiting a significant difference (p < 0.01) at a single time point. This analysis identified 126 and 47 genes that were up-regulated and 28 and 5 genes which were repressed at the 30 and 120 minute time points, respectively. As anticipated, the major functional class represented among the up-regulated genes was transcriptional activators (25%), which included two members of the well characterised CBF gene family. The third copy of CBF, CBF2, was significantly up-regulated at both of these time points and nineteen additional genes were found to share a similar profile.
CBF has been demonstrated to control the expression of a suite of low temperature responsive genes exemplified by COR genes whose expression is positively correlated with the development of freezing tolerance. A cluster of 63 genes was identified that were significantly induced to high levels in response to low temperature after 2 days and remained up-regulated after one week; this profile was in common with that exhibited by the majority of COR genes. However, it was noted that not all COR genes would be observed in this cluster since there are subtle differences among the expression profiles of the previously classified COR genes (Table 3). This cluster was enriched for ribosomal proteins (11%, χ² = 72, 1 d.f.; p < 0.001) and genes involved in oxidative stress protection (8%, χ² = 32, 1 d.f.; p < 0.001) in addition to previously annotated cold regulated genes (11%, χ² = 17.72, 1 d.f.; p < 0.001).
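A one degree of freedom chi-squared enrichment statistic of the kind quoted above could be computed as in the following Python sketch. The text does not say how the expected frequencies were derived, so the goodness-of-fit form used here, the function name and the background fraction in the example are all assumptions made for illustration.

```python
import math


def enrichment_chi2(n_in_category, n_genes, background_fraction):
    """Goodness-of-fit chi-squared (1 d.f.) for one functional category.

    Compares the number of cluster genes in the category with the number
    expected if the cluster followed the background fraction.
    """
    if not 0.0 < background_fraction < 1.0:
        raise ValueError("background_fraction must lie strictly between 0 and 1")
    expected_in = n_genes * background_fraction
    expected_out = n_genes - expected_in
    observed_out = n_genes - n_in_category
    stat = ((n_in_category - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # Upper tail of chi-squared with 1 d.f. equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical example: 7 of 63 cluster genes in a category that makes up
# 1% of the background gene set.
print(enrichment_chi2(7, 63, 0.01))
```

With small expected counts, as in this example, an exact test may be preferable to the chi-squared approximation.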
An analysis of the promoter sequences (within 1500 bp 5' of the ATG) for the 63 genes in the COR gene-like profile (Figure 3C) revealed that 24 possessed the CRT/DRE cis-acting element (CCGAC) recognised by CBF (Figure 4). No correlation was found between the number of CRT/DRE elements in each promoter and the magnitude of gene expression. The promoter sequences of the remaining 39 genes did not possess this element. A comprehensive analysis of 6-mer motifs found to be significantly over-represented within the promoter elements of this cluster identified two additional known cis-acting regulatory elements, 'AUX' and 'ABRE', that are found in the promoters of auxin and abscisic acid (ABA) regulated genes respectively. It has been shown that COR gene expression responds to the application of ABA, which is reflected by the presence of the ABRE sequence in a high proportion (83%) of those genes whose promoters also contained the CRT/DRE element. However, there were a number of predicted ABA responsive genes that were independent of the CBF regulon (Figure 4). In addition, two further groups of genes appeared to be independent of either CBF or ABA regulation and were characterised by the preponderance of unknown motifs 'GGCCCA' and 'ATAACC'. A similar promoter analysis of the 126 genes found to be up-regulated at 30 minutes only (Figure 3A) identified 193 6-mers which were significantly more abundant in the selected genes compared to the random set (Additional file 4). The majority of the elements were uninformative due to sequence ambiguity. However, those which could be assigned to a known motif were predominantly annotated as light responsive or under circadian control, which may be expected for this time point.
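Scanning a promoter for the CRT/DRE core element can be sketched as follows. This Python example is illustrative only: it assumes the promoter is supplied as a plain DNA string (for instance the 1500 bp upstream of the ATG), it searches both strands, and the function names and test sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(promoter, motif="CCGAC"):
    """List (position, strand) for every motif occurrence on either strand.

    Positions are 0-based offsets on the forward strand. A palindromic
    motif is scanned once so that its sites are not reported twice.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


# Hypothetical promoter fragment with one site on each strand.
print(find_motif("AAACCGACTTTGTCGGAA"))
```

Counting hits per promoter in this way gives the per-gene element numbers that can then be compared with expression magnitude.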
Identification of novel transcripts through SAGE
Alternate transcript processing can be detected due to the precision of SAGE tag assignment. The SAGE method captures a tag from the 3' most anchoring enzyme (Nla III) recognition site within each transcript, which will bias the detection of processing events towards those occurring at the 3' end. Although potential differential pre-mRNA processing products have been well described, the information provided by the tag to gene matching is insufficient to distinguish among certain classes [28–30]. SAGE cannot identify mutually exclusive exons, and absent exons (exon skip) can only be inferred if the canonical SAGE tag site resides in an internal exon. Thus, using SAGE it is possible to define three general classes of alternate transcript processing: retained introns, modified exon structure (alternative acceptor/donor splice sites) and polymorphic UTR sequences.
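The notion of a canonical tag can be made concrete with a short Python sketch. It assumes the 14 bp tag consists of the Nla III recognition sequence (CATG) plus the 10 bases immediately downstream, and that the transcript is given 5' to 3' without its poly(A) tail; the function names and the example sequence are hypothetical, and a site too close to the 3' end to yield a full tag is simply ignored.

```python
ANCHOR_SITE = "CATG"  # Nla III recognition sequence
TAG_LENGTH = 14       # assumed: anchor site plus 10 downstream bases


def canonical_tag(transcript):
    """Return the tag at the 3'-most anchor site, or None if unavailable."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR_SITE)
    if pos == -1 or pos + TAG_LENGTH > len(seq):
        return None  # no site, or too few bases downstream of the last site
    return seq[pos:pos + TAG_LENGTH]


def all_site_tags(transcript):
    """Return (position, tag) for every anchor site, in 5' to 3' order.

    Tags from upstream sites are what incomplete digestion or an
    alternate 3' end would be expected to produce.
    """
    seq = transcript.upper()
    tags = []
    pos = seq.find(ANCHOR_SITE)
    while pos != -1:
        if pos + TAG_LENGTH <= len(seq):
            tags.append((pos, seq[pos:pos + TAG_LENGTH]))
        pos = seq.find(ANCHOR_SITE, pos + 1)
    return tags


# Hypothetical transcript with two anchor sites.
example = "GGCATGAAAAAAAAAACCCATGTTTTTTTTTTGG"
print(canonical_tag(example))
print(all_site_tags(example))
```

A transcript for which `canonical_tag` returns None corresponds to the class of genes that SAGE cannot detect for lack of an anchoring enzyme site.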
Analysis of the SAGE data has revealed that 4,982 (32%) Arabidopsis loci may be subject to differential processing events; this frequency is in accordance with previous estimates based on EST analysis. However, incomplete digestion by the anchoring enzyme can generate SAGE tag artefacts. These potentially misleading tags were eliminated from subsequent analysis as described in Robinson et al. (2004), resulting in a more conservative estimate, with alternative transcript processing events affecting 1,275 loci (8%) (Additional file 5). To further corroborate this phenomenon, these data were compared with loci encoding multiple Unigene and/or full length cDNA sequences possessing unique canonical tags (Additional file 1). For the 993 loci in common between these data, tags matched specific alternative canonical sites in 402 instances among the full length cDNA/EST data. A breakdown of the different classes of pre-mRNA processing events is summarised in Table 4. Similar to previous analyses based on EST sequences, the most prevalent type of processing event was intron retention (55%) [29, 31]. The vast majority (41%) of the retained intron events were identified in the 3'-UTR sequences.
Alternate transcript processing was observed for 138 (15%) of the low temperature responsive genes (p < 0.01), which indicates a two-fold increase of this phenomenon at low temperature. Forty-four of these loci (32%) were functionally assigned to the photosynthetic light harvesting process, electron transport or RuBisCO activase. As anticipated, the majority of loci involved in photosynthesis displayed a decrease in expression after one week at low temperature. However, for 13 of these loci an alternate truncated transcript was observed, which was transiently induced (for example, At2g34420 in Figure 2). Potential alternate transcripts were detected for four low temperature induced chloroplast encoded genes and in each instance tags derived from all possible Nla III sites were observed. Chloroplast transcripts are subject to polyadenylation dependent degradation, suggesting that these tags represent different cleavage events rather than the products of alternate transcript processing. In light of this, the relative abundance of organelle encoded transcripts cannot be assessed through expression analysis techniques that capture RNA molecules by poly-T priming.
The 1,952 (5%) tags that were unambiguously matched to intergenic sequence could indicate the presence of non-coding RNAs. Since SAGE tags are captured from polyadenylated transcripts, only a subset of these molecules can be assayed, including microRNAs. Presently, 86 microRNA families have been identified in Arabidopsis and 5 of the intergenic SAGE tags were mapped to these microRNA loci. In addition, we tentatively matched 14 SAGE tags to available small RNA databases (Additional file 2). However, these tags were found at levels insufficient to detect any significant expression response.
The orientation of each SAGE tag is known, which facilitates the identification of antisense transcripts [18, 35]. The level of antisense expression detected among the SAGE data was estimated to be 21%, where tags were matched unambiguously in the antisense orientation to 3,556 genes (Additional file 6). This level is comparable to that previously detected in Arabidopsis. Differential antisense expression (p < 0.01) was detected for 50 genes subsequent to low temperature exposure; 13 of these were in common with the 920 low temperature responsive genes described above. These data indicate that the degree of antisense transcription was reduced four-fold in response to low temperature.
Evolutionary functional conservation
The evolutionary origin of the low temperature responsive genes was inferred based on the classifications described by Gutierrez et al., who determined that at least 14% (3,848) of Arabidopsis proteins are plant specific and 9% (2,436) are evolutionarily conserved with the Eukaryota, Bacteria and Archaea domains. The SAGE analysis identified 162 plant specific genes and 127 conserved genes that were differentially regulated by low temperature (p < 0.01). Functional characterisation of these subsets revealed that low temperature treatment induced changes in the frequency distribution among several Gene Ontology (GO) slim categories (Figure 5).
Among the plant specific genes the greatest disparity was observed in the 'Other binding' category (χ² = 26, 1 d.f.; p < 0.001), where the majority of observed genes encoded lipid binding proteins (Figure 5). Interestingly, although the proportion of genes annotated with 'Transcription factor activity' was similar, there was a bias toward the members of two specific gene families, namely ERF/AP2 and AUX/IAA.
The percentage of genes identified as possessing 'Transporter activity' was significantly increased (χ² = 22, 1 d.f.; p < 0.001) within the conserved genes subset, whereas genes annotated with 'Kinase activity' were under-represented (χ² = 15, 1 d.f.; p < 0.001) among these data (Figure 5). The majority of the low temperature responsive genes contributing to the 'Transporter activity' category were functionally characterized as either water channel proteins or hexose transporters. In addition, the heat shock protein gene family was over-represented within the 'Nucleotide binding' category, although no overall proportional change was observed.
Five SAGE libraries were constructed from tissue harvested throughout the adaptive cold acclimation process exhibiting incremental levels of freezing tolerance, which was positively correlated with exposure to low temperature (Figure 1). Analysing these libraries allows changes in gene expression induced by the initial perception of low temperature, the activation of signalling networks and the perturbation of metabolic pathways to be observed. In total, 242,066 high sequence quality tags were generated to provide a comprehensive assessment of the Arabidopsis leaf transcriptome using SAGE.
The biological interpretation and simultaneous statistical analysis of these complex data necessitated the development of new representation and analysis tools, which was achieved using custom Perl scripts [18, 20]. The interpretation of the SAGE data was enhanced by summarising the output according to the accepted gene models. This reduced the complexity of the transcript population, facilitated the identification of differential transcript processing, allowed clustering of similar expression profiles and simplified comparisons with expression data generated using alternate technology platforms. A total of 16,629 annotated Arabidopsis genes were detected with further evidence for 2,150 putative transcriptional units.
Differential gene expression in response to low temperature
In total, SAGE was able to detect 920 genes responding to low temperature (p < 0.01). Additional low temperature responsive tags were matched to 68 pseudochromosome locations and 15 tags remained unassigned. This suggests that the expression of 6% of the annotated genes detected was affected by low temperature regulation. A set of 30 diagnostic marker genes, known to be regulated by low temperature, was assessed to validate the experimental design. SAGE was able to detect only 28 of these genes due to the lack of an anchoring enzyme site in the KIN1 and KIN2 genes. In each case, the anticipated expression profile was observed (Table 3). It was determined that SAGE would be unable to detect approximately 2% of the Arabidopsis transcripts due to the absence of the Nla III anchoring enzyme site.
The effects of low temperature on the Arabidopsis transcriptome have been studied previously using microarray platforms [38–43]. The two most recent analyses which used a comparable duration of low temperature exposure designated 514 and 5,924 genes (p < 0.01) as cold responsive. Compared with these studies, 40% of the differentially regulated loci as determined by SAGE were also classified as low temperature regulated and only 71 of these loci were identified by all three analyses. It is perhaps not surprising that this discrepancy exists, since many factors influence the classification of genes as being responsive to low temperature. Variations in environmental conditions will affect differential gene expression, increasing the number of experimental parameters and making direct comparisons difficult to interpret. Technically, SAGE is able to characterise the expression of genes absent from the microarray; as such, the profiles of an additional 2,040 genes were determined. Furthermore, SAGE has the ability to detect differences among tag frequencies over a large dynamic range, dependent on the number of tags obtained in each library. This is highlighted by the lack of correspondence between cold responsive genes that are highly expressed based on SAGE analysis, where the signal intensity level detected by microarrays would be saturating and thus prevent identification of differential gene expression. Conversely, a bias exists in SAGE against genes of low abundance, where the small number of tags is an impediment to the statistical differentiation of low temperature response. Due to these factors, functional annotation of the identified cold responsive genes was used to compare the data generated from different platforms.
This analysis revealed that near identical processes were affected by low temperature, although the response of 'Structural proteins' appeared to be under estimated in the microarray data which could be a function of their relatively high expression levels. The use of complementary technologies allows the complex transcriptional changes taking place during cold acclimation to be fully realised.
Defining expression profiles
Patterns of co-regulation were assayed among the genes identified as low temperature responsive. This analysis refines the data to assist with candidate gene selection, defines gene regulons to facilitate the capture of regulatory proteins and associates genes of unknown function with well characterised genes. Since the discovery of the COR genes the majority of low temperature research has focused on factors controlling the regulation of these genes, exemplified by the identification of the transcriptional activators CBF and ICE [46, 47]. The SAGE data can be mined to find genes with analogous profiles by defining expression patterns representative of the COR and CBF genes.
COR gene-like expression was observed for 63 genes whose transcripts accumulate after 2 days and remain induced after 7 days of low temperature treatment (Figure 3C). Among these genes were the complement of COR genes and almost 30% of the genes identified were annotated as stress responsive. The CBF binding element was found in the promoter region of 24 of the 63 COR-like genes of which four were common to the previously designated CBF regulon . The remaining 39 genes are likely to be under alternate transcriptional control acting in parallel to and independent of CBF, which appeared to be reflected by the presence of known and new motifs within the regulatory regions of these genes. Motifs which characterise ABA responsive genes were prevalent among genes putatively activated by CBF. A distinct group of genes appeared to respond only to ABA and a number of genes appeared to be independent of either CBF or ABA. The uncharacterised motifs that were significantly over-represented within this category of genes could be used to further dissect low temperature signalling pathways.
A similar approach was applied to identify genes sharing an expression profile comparable to the CBF transcriptional activators. The individual members of the CBF gene family exhibited overlapping expression profiles, which were compared to the 920 low temperature responsive genes (Table 3). This resulted in the identification of 47, 19 and 13 genes that mimicked the pattern of CBF1, CBF2 and CBF3, respectively. The predominant functional categories observed for these genes were 'Response to stress' (23%) and 'Transcription factor activity' (22%). The novel low temperature inducible transcription factors may allow the detection of further pathways controlling the acquisition of freezing tolerance.
The largest observed cluster of genes represented a rapid transient response observed after 30 minutes of low temperature exposure, where 126 genes were up-regulated and 28 were repressed. The selection of the early time point was designed to capture the immediate transcriptional changes resulting from the plant's perception of the temperature reduction but not to be influenced by the consequent physiological adjustment to the stress. This was indicated by the equivalent distribution of the 126 up-regulated genes across the functional categories compared to those observed from non-stressed tissue, suggesting no significant disruption of cellular homeostasis. Among these data, 16 genes were induced only at this time point and annotated with transcription factor activity and warrant further investigation. A significant proportion of the down-regulated genes were functionally annotated as chloroplast proteins (21%, two-fold increase over non-stressed tissue). The reduction in expression level could be the result of cold induced photoinhibition where the reduction in chlorophyll content, photosystem antenna size and non-photochemical quenching reduce the deleterious effects of reactive oxygen species.
Evolutionary conservation of low temperature induced genes
Sensing and responding to a changing environment is fundamental to all species. Cold adaptation is found in a diverse range of poikilotherm species and different strategies are utilized to survive the environmental stress. The similarities among evolutionarily distant species could illuminate the cellular processes of low temperature perception and signal transduction. Although many adaptations influencing freezing tolerance may be shared among these species, the perception of the temperature change and adaptation to the stress is likely unique in photoautotrophic species as it probably involves perturbations in primary metabolism, adjustments to redox homeostasis in the cytosol, chloroplast and mitochondria in conjunction with changes in cell wall-membrane-cytoskeleton conformations [50–52].
Among the low temperature regulated proteins that were found to be conserved across all domains (Eukaryota, Bacteria and Archaea) there was a preponderance of membrane transporters, consisting of aquaporins, hexose transporters and a subfamily of the ABC transporters, the soluble GCN type proteins. Although genes annotated with 'Kinase activity' were under-represented in the low temperature data, 80% of those found were conserved across all three domains. Similarly, the genes which fall under the GO slim classifications 'Other enzyme activity' and 'Nucleotide binding' were over-represented among the low temperature responsive genes. On closer examination, these categories largely contained proteins involved in photorespiration, reactive oxygen species metabolism and molecular chaperones. Together, the expression of these proteins, which are essential to protect and restore cellular homeostasis, suggests a conservation of the metabolic response to low temperature stress across a diverse range of organisms.
In contrast, elucidating those responses specific to plants could identify key requirements of the low temperature stress tolerance strategy. This was exemplified by the identification of the COR genes and CBF among the identified plant-specific proteins. Other plant specific genes regulated by low temperature included those encoding lipid transfer proteins (LTP), additional AP2 binding proteins and auxin regulated proteins. The LTP genes exhibited two distinct expression profiles, where transcript accumulation occurs after 30 minutes or 7 days of treatment. The latter expression pattern is reminiscent of some COR genes; indeed, the family shares physical properties with COR proteins and one such protein was found to be involved in thylakoid membrane stabilisation [53, 54]. It could be inferred from their plant specific origin that their role is to specifically protect plastid or thylakoid membranes. The uncharacterised AP2 binding proteins may act to control CBF independent signalling pathways and represent new targets for manipulating the plant's response to low temperature stress.
Detection of low temperature induced novel transcripts
The use of SAGE analysis allowed the level of antisense transcription and alternative transcript processing to be estimated throughout the low temperature treatment. Opposing responses to low temperature were observed for antisense and alternative transcript processing. The frequency of antisense transcripts detected was reduced four-fold. It has been proposed that antisense molecules are processed in a similar manner to siRNAs in order to become active, thus the reduction in antisense SAGE tags could result from an increase in siRNA production. However, it has been demonstrated that penetrance of RNA mediated gene silencing in tobacco was impaired upon lowering growth temperatures due to a reduction in DICER activity. Therefore, it appears that both siRNA production and antisense transcription decrease in response to low temperatures. This phenomenon may impact the use of RNA interference as a mechanism for controlling gene expression under adverse environmental conditions.
In contrast, the frequency of alternative transcript processing events doubled in response to low temperature, but the relative levels of the different classes of alternative transcripts were unaffected. These data add to the emerging body of evidence indicating that alternate transcript processing is required for plants to adapt to unfavourable conditions. The level of alternate transcript processing was assessed using available EST sequence data for Arabidopsis and it was found that genes with retained introns were biased towards those annotated as being involved in photosynthesis and stress responses. Precise transcript processing is essential for normal plant development, and its relevance to abiotic stress tolerance was exemplified by the identification of the STA1 Arabidopsis mutant, since plants possessing defective sta1 alleles exhibited mis-splicing of COR15 transcripts under cold stress. It has also been observed that critical components of the spliceosome, the serine/arginine rich proteins that are involved in the regulation of pre-mRNA splicing, are themselves subject to alternate processing in response to different stress treatments. This might point to a mechanism for generating the increased transcriptome complexity that was observed among the SAGE data in response to low temperature.
The majority of the identified alternate transcript processing events were found to affect genes involved in photosynthesis, light harvesting and electron transport. Notably, the data revealed that 30% of these genes produced a truncated transcript which was transiently induced after two hours of low temperature treatment. The impact of low temperature on photosynthesis leads to an energy imbalance caused by a reduction in the rates of metabolic and photosynthetic enzymes relative to the temperature-independent reaction rates of electron transfer to light harvesting complexes. Adjustments are necessary to prevent oxidative damage; these are known to include alterations in antennal size, heat dissipation through non-photochemical quenching and the diversion of energy from Photosystem II (PSII) to Photosystem I (PSI). Alterations in antennal size and number could result from truncated transcripts. In mammalian systems, short isoforms of intracellular receptors have been shown to interfere with the formation of functional protein complexes. The alternatively processed photosynthetic transcripts which have lost the conserved chlorophyll binding domain may still function as a structural component of the antennae, but would effectively reduce the efficiency of energy capture, limiting production of reactive oxygen species. This appears to be a short term response to low temperature that acts in concert with an overall reduction in the level of full length transcript observed after one week. This would correlate with the observation that there is no sustained repression of photosynthetic capacity in herbaceous plants after long periods of low temperature treatment.
This study has utilised SAGE to identify differences in gene expression due to low temperature exposure among multiple libraries. It provides a methodology to visualise, interpret and maximise the amount of information which can be obtained from sequence tag data in high-throughput gene expression analysis. Such analyses will become more prevalent with the use of next generation sequencing technologies, which will facilitate the adoption of digital expression analysis. This study, while providing a global view of the low temperature response in Arabidopsis, has identified novel stress regulated genes and potential cis-acting regulatory elements, which will provide avenues for functional characterisation of the plant's response.
Plant Materials and SAGE library construction
The A. thaliana ecotype Columbia (Col-4) was used throughout this study. The plant material for SAGE analysis was grown and harvested as described in Robinson et al. Control seedlings were grown for 14 days at 22°C and 125 μE light with a 16 hr photoperiod. Low temperature treated plants were grown as described for control plants but were exposed to 4°C for 30 minutes, 2 hours, 2 days or 7 days prior to tissue harvest and RNA extraction. Tissue harvest was synchronised to the virtual dawn except where the duration of low temperature exposure was a fraction of 24 hours. The SAGE libraries were generated and the SAGE tags were sequenced and extracted as described in Robinson et al. For each library, all extracted SAGE tags were submitted to the NCBI Gene Expression Omnibus (GEO) database (Accession No. GSE11461; http://www.ncbi.nlm.nih.gov/geo/).
Determination of freezing tolerance
The plant material for the freezing tolerance assays was sown in soil and grown at 22°C under 125 μE light with a 16 hr photoperiod until the 6 leaf stage. Low temperature treatment of these plants was performed at 4°C with the light intensity remaining at 125 μE. The plants were sampled at time zero, 30 minutes, 2 hours, 2 days and 1 week after transfer to the cold acclimating conditions. The degree of freezing tolerance exhibited by these plants was determined by assaying ion leakage, measured as changes in the level of electrical conductivity as described by Sharma et al.
SAGE tag analysis
The SAGE tag to gene assignment was performed using the cSAGE algorithm, with the modification that tags found within 250 bp of an annotated gene's co-ordinates were matched to that gene. This modification was made because such tags are more likely to be derived from improperly annotated UTR sequences than to represent novel transcriptional units. Custom Perl scripts were used to organise and count the number of tags matched to each Arabidopsis gene identifier (AGI), in both sense and antisense orientation. Where indicated, to facilitate comparison among libraries, the tag values have been normalised to 50,000 tags per library based on the number of tags matching to AGIs. All statistical analyses were performed using 2 × t contingency tables and employed the chi-square homogeneity test to detect differences in tag frequencies among the libraries, where the null hypothesis states that each sample is from the same distribution, i.e. H0: π1 = π2 = π3 = π4 = π5. Genes were clustered based on expression profiles determined through the use of a post-hoc chi-square test statistic in pair-wise analyses; this was achieved using custom Perl scripts, which are available upon request.
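The normalisation and the chi-square homogeneity test described above can be illustrated with a minimal Python sketch. This is not the authors' Perl implementation; the function names, the example tag counts and the library totals are hypothetical, and the closed-form p-value used here is valid only for an even number of degrees of freedom (as is the case for five libraries, where df = 4).

```python
import math


def normalise(count, library_total, scale=50_000):
    """Scale a raw tag count to tags per `scale` AGI-matched tags."""
    return count * scale / library_total


def chi_square_homogeneity(gene_counts, library_totals):
    """Chi-square homogeneity test on a 2 x t table for a single gene.

    Row 1 holds the tags matching the gene in each library and row 2
    holds all remaining tags. Returns (statistic, degrees_of_freedom).
    """
    if len(gene_counts) != len(library_totals):
        raise ValueError("one count and one total are needed per library")
    pooled = sum(gene_counts) / sum(library_totals)  # proportion under H0
    df = len(gene_counts) - 1
    if pooled <= 0 or pooled >= 1:
        return 0.0, df  # no variation to test
    statistic = 0.0
    for observed, total in zip(gene_counts, library_totals):
        expected_gene = total * pooled
        expected_other = total * (1 - pooled)
        statistic += (observed - expected_gene) ** 2 / expected_gene
        statistic += ((total - observed) - expected_other) ** 2 / expected_other
    return statistic, df


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = statistic / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


# Hypothetical tag counts for one gene across five libraries
example_counts = [10, 40, 35, 12, 8]
example_totals = [50_000] * 5
example_stat, example_df = chi_square_homogeneity(example_counts, example_totals)
example_p = chi_square_p_value_even_df(example_stat, example_df)
```

In practice such a test would be repeated for every gene, so a correction for multiple testing would also be needed; that step is not shown in this sketch.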
Promoter sequence analysis
Custom Perl scripts were developed to analyse the putative promoter sequences from a subset of selected genes. The frequency of individual sequence motifs present within a query set of promoter gene sequences was compared to the frequency observed within the promoters of 6,000 randomly selected genes. The confidence of these frequency distributions was assigned using the binomial distribution. The promoter sequences were defined as the 1500 bp sequence immediately 5' of the start codon of the specified AGI except where there was insufficient sequence between AGI codes, when the promoter sequence was truncated at the boundary of the upstream annotated gene. Promoter sequences were analysed for the presence of sequence motifs (n-mers) of a user-defined length. The frequency of each motif was calculated from both strands and palindromic sequences were only counted once.
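As an illustration of the motif counting described above, the following Python sketch counts n-mers on both strands, counts palindromic sites only once, and evaluates over-representation with a binomial upper tail. It is a sketch under stated assumptions rather than the Perl scripts used in this study: the function names are hypothetical, and applying the binomial test to the number of promoters containing a motif (with the background proportion taken from the random gene set) is an assumption about how the comparison could be framed.

```python
import math
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motifs(promoter, n):
    """Count every n-mer of a promoter on both strands.

    Each window contributes its forward n-mer and its reverse complement;
    a palindromic n-mer equals its own reverse complement and is therefore
    counted only once per site.
    """
    promoter = promoter.upper()
    counts = Counter()
    for i in range(len(promoter) - n + 1):
        kmer = promoter[i:i + n]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases
        counts[kmer] += 1
        rc = reverse_complement(kmer)
        if rc != kmer:
            counts[rc] += 1
    return counts


def promoters_with_motif(promoters, motif):
    """Number of promoters containing the motif on either strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    return sum(1 for p in promoters if motif in p.upper() or rc in p.upper())


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )
```

For example, if a motif were present in 24 of 63 query promoters and in a proportion p of the background promoters, `binomial_upper_tail(24, 63, p)` would give the probability of observing at least 24 occurrences by chance under this simplified model.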
Additional Database resources
Arabidopsis Unigene (build #67) and full length cDNA sequences were obtained from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). The microRNA sequences and small RNA data were obtained from http://asrp.cgrb.oregonstate.edu/ and http://mpss.udel.edu/at/, respectively. The gene ontology (GO) slim plant database from The Arabidopsis Information Resource (TAIR; http://www.arabidopsis.org) was used for functional classification (ftp://ftp.arabidopsis.org/home/tair/Ontologies/Gene_Ontology/). The plant specific database used for evolutionary comparisons was obtained from http://genomics.msu.edu/plant_specific/introduction.html. The PLACE database was used to identify known cis-acting regulatory elements (PLACE: http://www.dna.affrc.go.jp/htdocs/PLACE/).
Boyer JS: Plant Productivity and Environment. Science. 1982, 218: 443-448. 10.1126/science.218.4571.443.
Levitt J: Chilling, Freezing and High Temperature Stress. Responses of Plants to Environmental Stress. 1980, New York, NY: Academic Press, 1: 137-141.
Huner NP, Williams JP, Maissan EE, Myscich EG, Krol M, Laroche A, Singh J: Low Temperature-Induced Decrease in trans-Delta-Hexadecenoic Acid Content Is Correlated with Freezing Tolerance in Cereals. Plant Physiol. 1989, 89: 144-150.
Perras M, Sarhan F: Synthesis of Freezing Tolerance Proteins in Leaves, Crown, and Roots during Cold Acclimation of Wheat. Plant Physiol. 1989, 89: 577-585.
Sharma N, Cram D, Huebert T, Zhou N, Parkin IA: Exploiting the wild crucifer Thlaspi arvense to identify conserved and novel genes expressed during a plant's response to cold stress. Plant Mol Biol. 2007, 63: 171-184. 10.1007/s11103-006-9080-4.
Gilmour SJ, Hajela RK, Thomashow MF: Cold Acclimation in Arabidopsis thaliana. Plant Physiol. 1988, 87: 745-750.
Hannah MA, Wiese D, Freund S, Fiehn O, Heyer AG, Hincha DK: Natural genetic variation of freezing tolerance in Arabidopsis. Plant Physiol. 2006, 142: 98-112. 10.1104/pp.106.081141.
Warren G, McKown R, Marin AL, Teutonico R: Isolation of mutations affecting the development of freezing tolerance in Arabidopsis thaliana (L.) Heynh. Plant Physiol. 1996, 111: 1011-1019. 10.1104/pp.111.4.1011.
Ishitani M, Xiong L, Stevenson B, Zhu JK: Genetic analysis of osmotic and cold stress signal transduction in Arabidopsis: interactions and convergence of abscisic acid-dependent and abscisic acid-independent pathways. Plant Cell. 1997, 9: 1935-1949. 10.1105/tpc.9.11.1935.
Xin Z, Browse J: Eskimo1 mutants of Arabidopsis are constitutively freezing-tolerant. Proc Natl Acad Sci USA. 1998, 95: 7799-7804. 10.1073/pnas.95.13.7799.
Thomashow MF: PLANT COLD ACCLIMATION: Freezing Tolerance Genes and Regulatory Mechanisms. Annu Rev Plant Physiol Plant Mol Biol. 1999, 50: 571-599. 10.1146/annurev.arplant.50.1.571.
Thomashow MF: So what's new in the field of plant cold acclimation? Lots!. Plant Physiol. 2001, 125: 89-93. 10.1104/pp.125.1.89.
Thomashow MF: Role of cold-responsive genes in plant freezing tolerance. Plant Physiol. 1998, 118: 1-8. 10.1104/pp.118.1.1.
Hughes M, Dunn M: The molecular biology of plant acclimation to low temperature. J Exp Bot. 1996, 47: 291-305. 10.1093/jxb/47.3.291.
Lash AE, Tolstoshev CM, Wagner L, Schuler GD, Strausberg RL, Riggins GJ, Altschul SF: SAGEmap: a public gene expression resource. Genome Res. 2000, 10: 1051-1060. 10.1101/gr.10.7.1051.
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science. 1995, 270: 484-487. 10.1126/science.270.5235.484.
Trendelenburg G, Prass K, Priller J, Kapinya K, Polley A, Muselmann C, Ruscher K, Kannbley U, Schmitt AO, Castell S: Serial analysis of gene expression identifies metallothionein-II as major neuroprotective gene in mouse focal cerebral ischemia. J Neurosci. 2002, 22: 5879-5888.
Robinson SJ, Cram DJ, Lewis CT, Parkin IA: Maximizing the efficacy of SAGE analysis identifies novel transcripts in Arabidopsis. Plant Physiol. 2004, 136: 3223-3233. 10.1104/pp.104.043406.
AGI: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
Robinson SJ, Guenther JD, Lewis CT, Links MG, Parkin IA: Reaping the Benefits of SAGE. Methods Mol Biol. 2007, 406: 365-386.
Matsumura H, Nirasawa S, Terauchi R: Technical advance: transcript profiling in rice (Oryza sativa L.) seedlings using serial analysis of gene expression (SAGE). Plant J. 1999, 20: 719-726. 10.1046/j.1365-313X.1999.00640.x.
Song S, Qu H, Chen C, Hu S, Yu J: Differential gene expression in an elite hybrid rice cultivar (Oryza sativa, L) and its parental lines based on SAGE data. BMC Plant Biol. 2007, 7: 49. 10.1186/1471-2229-7-49.
Benjamini Y, Hochberg Y: Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal Royal Statistical Society. 1995, 57: 289-300.
Miller RG: Simultaneous statistical inference. 1981, Springer Verlag, 2
Gilmour SJ, Fowler SG, Thomashow MF: Arabidopsis transcriptional activators CBF1, CBF2, and CBF3 have matching functional activities. Plant Mol Biol. 2004, 54: 767-781. 10.1023/B:PLAN.0000040902.06881.d4.
Seaman MA, Hill CC: Pairwise comparisons for proportions: A note on Cox and Key. Educational and Psychological Measurement. 1996, 56: 452-459. 10.1177/0013164496056003007.
Higo K, Ugawa Y, Iwamoto M, Korenaga T: Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucl Acids Res. 1999, 27: 297-300. 10.1093/nar/27.1.297.
Iida K, Seki M, Sakurai T, Satou M, Akiyama K, Toyoda T, Konagaya A, Shinozaki K: Genome-wide analysis of alternative pre-mRNA splicing in Arabidopsis thaliana based on full-length cDNA sequences. Nucleic Acids Res. 2004, 32: 5096-5103. 10.1093/nar/gkh845.
Ner-Gaon H, Halachmi R, Savaldi-Goldstein S, Rubin E, Ophir R, Fluhr R: Intron retention is a major phenomenon in alternative splicing in Arabidopsis. Plant J. 2004, 39: 877-885. 10.1111/j.1365-313X.2004.02172.x.
Reddy AS: Alternative splicing of pre-messenger RNAs in plants in the genomic era. Annu Rev Plant Biol. 2007, 58: 267-294. 10.1146/annurev.arplant.58.032806.103754.
Wang BB, Brendel V: Genomewide comparative analysis of alternative splicing in plants. Proc Natl Acad Sci USA. 2006, 103: 7175-7180. 10.1073/pnas.0602039103.
Gulick PJ, Drouin S, Yu Z, Danyluk J, Poisson G, Monroy AF, Sarhan F: Transcriptome comparison of winter and spring wheat responding to low temperature. Genome. 2005, 48: 913-923.
Slomovic S, Portnoy V, Liveanu V, Schuster G: RNA Polyadenylation in Prokaryotes and Organelles; Different Tails Tell Different Tales. Critical Reviews in Plant Sciences. 2006, 25: 65-77. 10.1080/07352680500391337.
Backman TW, Sullivan CM, Cumbie JS, Miller ZA, Chapman EJ, Fahlgren N, Givan SA, Carrington JC, Kasschau KD: Update of ASRP: the Arabidopsis Small RNA Project database. Nucleic Acids Res. 2007
Pleasance ED, Marra MA, Jones SJ: Assessment of SAGE in transcript identification. Genome Res. 2003, 13: 1203-1215. 10.1101/gr.873003.
Yamada K, Lim J, Dale JM, Chen H, Shinn P, Palm CJ, Southwick AM, Wu HC, Kim C, Nguyen M: Empirical analysis of transcriptional activity in the Arabidopsis genome. Science. 2003, 302: 842-846. 10.1126/science.1088305.
Gutierrez RA, Larson MD, Wilkerson C: The plant-specific database. Classification of Arabidopsis proteins based on their phylogenetic profile. Plant Physiol. 2004, 135: 1888-1892. 10.1104/pp.104.043687.
Vogel JT, Zarka DG, Van Buskirk HA, Fowler SG, Thomashow MF: Roles of the CBF2 and ZAT12 transcription factors in configuring the low temperature transcriptome of Arabidopsis. The Plant Journal. 2005, 41: 195-211. 10.1111/j.1365-313X.2004.02288.x.
Kreps JA, Wu Y, Chang H-S, Zhu T, Wang X, Harper JF: Transcriptome Changes for Arabidopsis in Response to Salt, Osmotic, and Cold Stress. Plant Physiol. 2002, 130: 2129-2141. 10.1104/pp.008532.
Lee B-h, Henderson DA, Zhu J-K: The Arabidopsis Cold-Responsive Transcriptome and Its Regulation by ICE1. Plant Cell. 2005, 17: 3155-3175. 10.1105/tpc.105.035568.
Fowler S, Thomashow MF: Arabidopsis Transcriptome Profiling Indicates That Multiple Regulatory Pathways Are Activated during Cold Acclimation in Addition to the CBF Cold Response Pathway. Plant Cell. 2002, 14: 1675-1690. 10.1105/tpc.003483.
Kaplan F, Kopka J, Sung DY, Zhao W, Popp M, Porat R, Guy CL: Transcript and metabolite profiling during cold acclimation of Arabidopsis reveals an intricate relationship of cold-regulated gene expression with modifications in metabolite content. Plant J. 2007, 50: 967-981. 10.1111/j.1365-313X.2007.03100.x.
Hannah MA, Heyer AG, Hincha DK: A global survey of gene regulation during cold acclimation in Arabidopsis thaliana. PLoS Genet. 2005, 1: e26. 10.1371/journal.pgen.0010026.
Bieniawska Z, Espinoza C, Schlereth A, Sulpice R, Hincha DK, Hannah MA: Disruption of the Arabidopsis circadian clock is responsible for extensive variation in the cold-responsive transcriptome. Plant Physiol. 2008, 147: 263-279. 10.1104/pp.108.118059.
Chen J, Agrawal V, Rattray M, West MA, St Clair DA, Michelmore RW, Coughlan SJ, Meyers BC: A comparison of microarray and MPSS technology platforms for expression analysis of Arabidopsis. BMC Genomics. 2007, 8: 414. 10.1186/1471-2164-8-414.
Chinnusamy V, Ohta M, Kanrar S, Lee BH, Hong X, Agarwal M, Zhu JK: ICE1: a regulator of cold-induced transcriptome and freezing tolerance in Arabidopsis. Genes Dev. 2003, 17: 1043-1054. 10.1101/gad.1077503.
Jaglo-Ottosen KR, Gilmour SJ, Zarka DG, Schabenberger O, Thomashow MF: Arabidopsis CBF1 overexpression induces COR genes and enhances freezing tolerance. Science. 1998, 280: 104-106. 10.1126/science.280.5360.104.
Oquist G, Huner NP: Photosynthesis of overwintering evergreen plants. Annu Rev Plant Biol. 2003, 54: 329-355. 10.1146/annurev.arplant.54.072402.115741.
Stitt M, Hurry V: A plant for all seasons: alterations in photosynthetic carbon metabolism during cold acclimation in Arabidopsis. Curr Opin Plant Biol. 2002, 5: 199-206. 10.1016/S1369-5266(02)00258-3.
Sane PV, Ivanov AG, Hurry V, Huner NP, Oquist G: Changes in the redox potential of primary and secondary electron-accepting quinones in photosystem II confer increased resistance to photoinhibition in low-temperature-acclimated Arabidopsis. Plant Physiol. 2003, 132: 2144-2151. 10.1104/pp.103.022939.
Orvar BL, Sangwan V, Omann F, Dhindsa RS: Early steps in cold sensing by plant cells: the role of actin cytoskeleton and membrane fluidity. Plant J. 2000, 23: 785-794. 10.1046/j.1365-313x.2000.00845.x.
Abdrakhamanova A, Wang QY, Khokhlova L, Nick P: Is microtubule disassembly a trigger for cold acclimation?. Plant Cell Physiol. 2003, 44: 676-686. 10.1093/pcp/pcg097.
Hincha DK: Cryoprotectin: a plant lipid-transfer protein homologue that stabilizes membranes during freezing. Philos Trans R Soc Lond B Biol Sci. 2002, 357: 909-916. 10.1098/rstb.2002.1079.
Sror HA, Tischendorf G, Sieg F, Schmitt JM, Hincha DK: Cryoprotectin protects thylakoids during a freeze-thaw cycle by a mechanism involving stable membrane binding. Cryobiology. 2003, 47: 191-203. 10.1016/j.cryobiol.2003.09.005.
Vaucheret H: Post-transcriptional small RNA pathways in plants: mechanisms and regulations. Genes Dev. 2006, 20: 759-771. 10.1101/gad.1410506.
Szittya G, Silhavy D, Molnar A, Havelda Z, Lovas A, Lakatos L, Banfalvi Z, Burgyan J: Low temperature inhibits RNA silencing-mediated defence by the control of siRNA generation. Embo J. 2003, 22: 633-640. 10.1093/emboj/cdg74.
Lee BH, Kapoor A, Zhu J, Zhu JK: STABILIZED1, a stress-upregulated nuclear protein, is required for pre-mRNA splicing, mRNA turnover, and stress tolerance in Arabidopsis. Plant Cell. 2006, 18: 1736-1749. 10.1105/tpc.106.042184.
Palusa SG, Ali GS, Reddy AS: Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses. Plant J. 2007, 49: 1091-1107.
Horton P, Ruban AV, Walters RG: Regulation of Light Harvesting in Green Plants. Annu Rev Plant Physiol Plant Mol Biol. 1996, 47: 655-684. 10.1146/annurev.arplant.47.1.655.
Rosenstiel P, Huse K, Till A, Hampe J, Hellmig S, Sina C, Billmann S, von Kampen O, Waetzig GH, Platzer M: A short isoform of NOD2/CARD15, NOD2-S, is an endogenous inhibitor of NOD2/receptor-interacting protein kinase 2-induced signaling pathways. Proceedings of the National Academy of Sciences. 2006, 103: 3280-3285. 10.1073/pnas.0505423103.
Santner TJ, Duffy DE: Univariate Discrete Data with Covariates. The Statistical Analysis of Discrete Data. 1989, New York: Springer-Verlag, 204-286.
This work was supported by funding from the AAFC Canadian Crop Genomics Initiative and the Genome Prairie project 'Functional Genomics of Abiotic Stress'. The authors would like to thank Dr. John Nixon and Wayne Clarke for their assistance with statistical analysis and the development of Perl scripts and Dr. Dwayne Hegedus and Matthew Links for their critical reading of this manuscript. All of the above are at the Saskatoon Research Centre.
SJR carried out the molecular work, developed targeted Perl scripts, analysed the data, and wrote the manuscript. IAPP conceived of the study, participated in its design and coordination, the analysis of results and helped to write the manuscript. Both authors have read and approved the final manuscript.