Transcriptional activity of transposable elements in maize
© Vicient. 2010
Received: 26 March 2010
Accepted: 25 October 2010
Published: 25 October 2010
Mobile genetic elements make up a high proportion of eukaryotic genomes. In maize, 85% of the genome is composed of transposable elements of several families. The first step in the transposable element life cycle is the synthesis of an RNA, but little is known about the regulation of transcription for most maize transposable element families. Maize is the plant species with the most sequenced ESTs (more than two million) and ranks third among all species, after human and mouse. This allowed us to analyze the transcriptional activity of maize transposable elements using EST databases.
We investigated the transcriptional activity of 56 families of transposable elements in different maize organs through a systematic search of more than two million expressed sequence tags. At least 1.5% of maize ESTs show sequence similarity to transposable elements. According to these data, the expression pattern of each transposable element family is variable, even within the same class of elements. In general, the transcriptional activity of gypsy-like retrotransposons is higher than that of other classes. The transcriptional activity of several transposable elements is especially high in the shoot apical meristem and in sperm cells. Sequence comparisons between genomic and transcribed sequences suggest that only a few copies are transcriptionally active.
The use of powerful high-throughput sequencing methodologies allowed us to elucidate the extent and character of repetitive element transcription in maize cells. The finding that some families of transposable elements have considerable transcriptional activity in some tissues suggests that either transposition is more frequent than previously expected or cells can control transposition at a post-transcriptional level.
Transposable elements (TEs) are DNA sequences that move from one location to another within the genome or can produce copies of themselves. Eukaryotic TEs are divided into two classes, according to whether their transposition intermediate is RNA (class I) or DNA (class II). Each class contains elements that encode the functional products required for transposition (autonomous) and elements that only retain the cis sequences necessary for recognition by the transposition machinery (non-autonomous). Class I elements can be divided into several subclasses: SINEs, LINEs, long terminal repeat (LTR) retrotransposons and TRIMs (Terminal-repeat Retrotransposons In Miniature), which are non-autonomous LTR elements. Class II elements comprise autonomous and non-autonomous transposons, including the MITEs (Miniature Inverted-repeat Transposable Elements).
TEs are major components of most eukaryotic genomes and are particularly abundant in plants. TEs represent 80% of the maize genome and 90% of the wheat genome. All the classes of TEs found in eukaryotes are also present in plant genomes, but LTR retrotransposons are the most abundant in terms of both copy number and percentage of the genome. In maize, 95% of TEs are LTR retrotransposons.
TEs play an important role in genome and gene evolution. TE insertion can disrupt genes and mediate chromosome rearrangements, and can provide alternative promoters, exons, terminators and splice junctions. Several rice genes contain TE-derived sequences. However, the influence of TEs on gene expression is not restricted to physical modification of chromosomes. TEs were first characterized in maize as gene "controlling elements". Maize "controlling elements" change the expression of some genes through the transcription of non-coding RNAs (ncRNAs) from the transposon promoters, which contribute to the epigenetic regulation of neighbouring genes through mechanisms such as RNAi, transcriptional interference and anti-silencing. In Arabidopsis thaliana, the methylation of a SINE element close to FWA, an imprinted gene, allows its proper epigenetic control. TEs also produce short double-stranded RNAs (dsRNAs), which contribute to epigenetic gene regulation. Analyses in maize, tobacco, wheat and rice have shown that transcriptional readout from retrotransposon LTRs may generate sense and antisense transcripts of adjacent genes, altering their expression. Given the large number of retrotransposon copies in plant genomes and their frequent location near genes, TE transcription clearly has a high potential impact on the expression of nearby genes [12, 13]. For this reason, TE transcription was believed to be severely repressed in plants. This view was supported by the fact that, for a long time, transcriptional activity had been demonstrated for only a few plant TEs, and only under certain precise circumstances such as pathogen infection, physical injury or different abiotic stresses [14, 15]. Inactivity of TEs may be due to the accumulation of mutations that have altered their structure.
However, although transpositionally inactive due to insertions, deletions, rearrangements or mutations, some TE copies may retain the capacity to direct transcription from their own promoters. In addition to this direct inactivation, cells have also developed mechanisms for TE control, including silencing by DNA methylation or the small RNA pathways. TEs producing double-stranded or aberrant RNAs are silenced by post-transcriptional gene silencing (PTGS), and active TEs are inactivated by transcriptional gene silencing (TGS).
Despite mutations and cellular control, TEs manage to be transcriptionally and transpositionally active. Phylogenetic analysis of TE families in maize revealed recent events of extreme TE proliferation, and recent transposition activity has also been demonstrated in rice. The use of sensitive techniques for gene expression analysis, such as deep transcriptome sequencing, provides increasing data on the presence of TE transcripts in several plants and cell types [20–23].
Maize is the plant species with the most sequenced ESTs and ranks third among all species, after human and mouse. More than two million maize ESTs have been sequenced from many libraries corresponding to several maize organs, developmental stages and conditions. This provides a strong basis for the development of computer-based procedures for the in silico analysis of expression profiles. The present work aimed to produce a body map of TE transcription in maize plants. We show that the fraction of TE-related transcripts varies greatly among TE classes and among organs.
The number of ESTs in a large transcript database can be used to estimate relative transcriptional rates. More than two million maize EST sequences are deposited in the NCBI EST database (Zm-dbEST). Such a large number of sequences provides an opportunity to perform virtual analysis of gene expression in this species. We used a representative sequence from each of 56 well-characterized TE families (available in the repeats and retrotransposon databases) as BLASTN queries against the maize EST database (Zm-dbEST). In total, 1.5% of the maize ESTs (25,282 sequences) showed significant sequence similarity (e-value < 1E-20) to one of the 56 analysed TE families (Additional file 1).
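The counting step described above can be illustrated with a minimal sketch. It assumes the BLASTN results are available in the standard 12-column tabular format, with the TE representative as the query (column 1), the EST as the subject (column 2) and the e-value in column 11. The function name, the rule of assigning each EST to its lowest e-value family, and the example rows (TE_A, TE_B, EST001 and so on) are illustrative assumptions, not the procedure or data of this study.

```python
import csv
import io
from collections import Counter

E_VALUE_CUTOFF = 1e-20  # similarity threshold stated in the text


def count_te_ests(blast_tabular, total_ests):
    """Count ESTs matching each TE family in BLAST tabular output.

    Assumption: TE representatives are the queries (column 1) and ESTs
    the subjects (column 2). Each EST is counted once and assigned to
    the family of its lowest e-value hit.
    """
    best = {}  # EST id -> (e-value, TE family)
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        family, est_id, evalue = row[0], row[1], float(row[10])
        if evalue >= E_VALUE_CUTOFF:
            continue  # keep only hits with e-value < cutoff
        if est_id not in best or evalue < best[est_id][0]:
            best[est_id] = (evalue, family)
    per_family = Counter(family for _, family in best.values())
    fraction = len(best) / total_ests
    return per_family, fraction


# Invented example rows (hypothetical identifiers and values).
example = io.StringIO(
    "TE_A\tEST001\t95.0\t400\t20\t0\t1\t400\t1\t400\t1e-150\t500\n"
    "TE_B\tEST001\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-30\t120\n"
    "TE_B\tEST002\t90.0\t300\t30\t0\t1\t300\t1\t300\t1e-80\t300\n"
    "TE_B\tEST003\t75.0\t60\t15\t0\t1\t60\t1\t60\t1e-5\t40\n"
)
per_family, fraction = count_te_ests(example, total_ests=200)
print(dict(per_family), fraction)
```

In this toy input EST001 matches both families but is counted only for TE_A, its stronger hit, and EST003 is discarded because its e-value does not pass the cutoff.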
Table: TE families analysed in this study (classes listed include LTR retrotransposon (copia) and LTR retrotransposon (gypsy)).
The distribution of EST matches along each TE sequence was examined (Additional file 2). The distribution varied with the TE family, but broadly similar behaviour was observed within classes. For example, in LINEs most of the ESTs showed similarity to the 3' end, whereas in LTR retrotransposons and TRIMs most of the ESTs are similar to the LTR regions. This non-random distribution is probably a consequence of the different transcription mechanisms characteristic of each TE family.
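One way to quantify such a positional distribution is to compute the per-base depth of EST matches along the TE and compare the mean depth of different regions. The sketch below assumes 1-based, inclusive hit coordinates on the TE sequence; the function names, the 20-bp toy element and the hit coordinates are invented for illustration and do not come from Additional file 2.

```python
def coverage_profile(te_length, hits):
    """Return per-base EST depth along a TE.

    hits: iterable of (start, end) coordinates on the TE, 1-based and
    inclusive; reversed pairs (minus-strand hits) are normalised.
    """
    diff = [0] * (te_length + 2)  # difference array, index 0 unused
    for start, end in hits:
        lo, hi = min(start, end), max(start, end)
        lo, hi = max(lo, 1), min(hi, te_length)  # clip to the element
        if lo > hi:
            continue
        diff[lo] += 1
        diff[hi + 1] -= 1
    depth, running = [], 0
    for pos in range(1, te_length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


def mean_depth(depth, start, end):
    """Mean depth over the 1-based inclusive region [start, end]."""
    window = depth[start - 1:end]
    return sum(window) / len(window)


# Toy element of 20 bp with hypothetical terminal regions at 1-5 and 16-20.
hits = [(1, 5), (3, 5), (18, 20), (20, 16)]
depth = coverage_profile(20, hits)
print(mean_depth(depth, 1, 5), mean_depth(depth, 6, 15), mean_depth(depth, 16, 20))
```

In this toy case all matches fall in the two terminal regions and none in the internal region, which is the kind of pattern reported above for LTR retrotransposons.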
"Virtual northern" analysis provides an easy and cheap alternative to the study of transcriptional profiling. An advantage of EST profiling compared with other methods is that it does not require prior knowledge of the gene sequences. The accuracy of "virtual northern" analyses will depend on the diversity of biological samples and in the number of sequence tags to provide sufficient depth to identify low-abundant transcripts. An additional problem will be the possibility to distinguish between closely related genes in the basis of partial sequences. EST profiling have been used for the identification of reference genes for quantitative RT-PCR normalization in wheat  and barley , expression profiling of storage-protein gene families in wheat , identification of differentially expressed transcripts from sugarcane maturing stem , or the identification of cancer gene-markers in humans . The application of EST profiling to maize TEs is particularly appropriate. First, the analysis of TE families, and not single genes, virtually eliminates the problem of distinguishing between closely related sequences. Second, maize is the third organism in number of ESTs, and, finally, several of the cDNA libraries were constructed from precise, well-defined, dissected organs. The applicability of EST profiling in maize is demonstrated by the expected results for some marker genes (figure 4). One possible problem is the presence of sequences originated from contaminant genomic DNA in the EST collections. This problem is especially serious in the case of TEs because some of them are present in high copy numbers in the genome. Although we cannot totally exclude the presence of some genomic contamination, our results indicate that, if any, it may be considered anecdotic (Figure 2).
Once integrated in the genome, TEs accumulate mutations and become transpositionally inactive. However, even partial or rearranged TE copies may retain their capacity to initiate transcription. Cells have active mechanisms to protect their genome integrity against TE activity, including transcriptional silencing and short-interfering RNAs (siRNAs). Under certain circumstances some TEs can escape this cellular control and be transcribed and, sometimes, transpose. For example, different TE families are transcribed in response to biotic or abiotic stresses or in cell culture [33–37]. In addition to this "stress response" transcription, increasing data demonstrate that some TEs may have at least low transcriptional activity under normal circumstances in plant life. For example, transcription in leaves has been demonstrated for the barley BARE, maize Grande and tomato Rider retrotransposons, and for different sorghum TEs [38–41]. Different EST-based analyses, including the data presented here, demonstrate the presence of TE transcripts in several organs and cell types [20–22, 42, 43]. According to our data, at least 1.5% of the ESTs correspond to TEs. This is an underestimate, because only well-characterized maize TEs were considered in our analysis and because EST libraries only contain data on polyadenylated mRNAs, and it is not clear what percentage of TE transcripts contain a poly(A) tract. For example, it has been estimated that only 15% of the transcripts of the barley retrotransposon BARE1 are polyadenylated. In any case, the percentage differs according to the organ analysed, ranging from 7.7% in SAM and 6.2% in cultured cells to only 0.2% in female flowers and 0.1% in embryo.
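The per-organ percentages quoted above are, in essence, TE-related EST counts normalised by the total number of ESTs available for each organ. A minimal sketch of that normalisation follows, assuming that libraries from the same organ are simply pooled before dividing. The function name, the organ labels and all counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the values from this study.

```python
def te_percentage_by_organ(library_counts):
    """Pool cDNA libraries by organ and return TE-related ESTs as a
    percentage of all ESTs for that organ.

    library_counts: iterable of (organ, te_est_count, total_est_count),
    one tuple per library. Pooling by simple summation is an assumption.
    """
    pooled = {}  # organ -> (TE-related ESTs, all ESTs)
    for organ, te_ests, total_ests in library_counts:
        te_sum, total_sum = pooled.get(organ, (0, 0))
        pooled[organ] = (te_sum + te_ests, total_sum + total_ests)
    return {
        organ: 100.0 * te / total
        for organ, (te, total) in pooled.items()
        if total > 0  # organs without ESTs cannot be normalised
    }


# Hypothetical libraries: two from organ_A, one from organ_B.
libraries = [
    ("organ_A", 5, 100),
    ("organ_A", 15, 300),
    ("organ_B", 1, 200),
]
percentages = te_percentage_by_organ(libraries)
print(percentages)
```

Pooling before dividing weights each library by its size, so a small library with an unusual TE content has little influence on the organ-level percentage.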
TE transcripts are especially abundant in SAM, cultured cells (Figure 4) and sperm cells (Figure 5; [31, 45–47]). A common feature of SAM, pollen and cell cultures is that they contain pluripotent cells. Animal totipotent cells such as oocytes and two-cell mouse embryos also exhibit high levels of TE transcription. The acquisition of totipotency depends, among other things, on epigenetic reprogramming, and activation of TEs has also been associated with reductions in DNA methylation. For example, DNA in cultured plant cells becomes hypomethylated and these cells show transcriptional activation of specific TEs. The tobacco Tnt1 retrotransposon is silenced when introduced into Arabidopsis, but Tnt1 silencing is reversed when the number of Tnt1 elements is reduced to two by genetic segregation. Microarray expression profiling of mature Arabidopsis pollen revealed that many of the genes involved in siRNA biogenesis and silencing are not expressed in pollen or are expressed at low levels. Although epigenetic changes may explain the activation of certain TEs in some tissues, not all TE families accumulate equally in SAM or sperm cells, suggesting that the phenomenon requires some family-specific mechanisms rather than simply being the result of a genome-wide activation of retrotransposons. One possible explanation may be the presence of specific cis signals in the TE promoter that may enhance their expression in certain cells. For example, pollen-specific promoter signals have been detected in the LTR of Grande (unpublished data).
The use of powerful high-throughput sequencing methodologies allowed us to elucidate the extent and character of repetitive element transcription in maize cells. Next-generation sequencing of transcriptomes and genomes will enable further studies on TE transcription and its consequences.
Table: Organ/condition maize EST databases used in this analysis (columns: organ/condition, number of libraries, number of ESTs; rows include shoot apical meristem).
Sequence alignments were performed using CLUSTALW, and phylogenetic trees were constructed using the neighbour-joining method. Graphic representations of phylogenetic trees were prepared using Dendroscope v.2.7.4.
EST: with sequence similarity to a transposable element
LTR: long terminal repeat
LINE: long interspersed transposable elements
MITE: Miniature Inverted-repeat Transposable Element
TRIM: Terminal-repeat Retrotransposons In Miniature.
I am grateful to Josep M Casacuberta for critical reading of the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.