Template switching can create complex LTR retrotransposon insertions in Triticeae genomes
© Sabot and Schulman; licensee BioMed Central Ltd. 2007
Received: 03 April 2007
Accepted: 24 July 2007
Published: 24 July 2007
The LTR (long terminal repeat) retrotransposons of higher plants are replicated by a mutagenic life cycle containing transcription and reverse transcription steps. The DNA copies are often subject to recombination once integrated into the genome. Complex elements, where two elements share an LTR, are not uncommon. They are thought to result from heterologous recombination between two adjacent elements that occurs following their integration.
Here, we present evidence for another potential mechanism for the creation of complex elements, involving abnormal template switching during reverse transcription. The template switching creates a large, complex daughter element, formed by the fusion of two parent sequences, which is then inserted into the genome.
These complex elements are part of the genome structure of plants in the Poaceae, especially in the Triticeae, but not of Arabidopsis. Hence, the retrotransposon dynamics that shape the genome are lineage-specific.
Long Terminal Repeat (LTR) retrotransposons are Class I transposable elements that replicate by a "Copy-and-Paste" mechanism, called retrotransposition, which closely resembles the replication of lentiviruses such as HIV. Higher plant genomes, especially those of the grasses (such as maize, wheat and barley), harbor a large number of these elements, which form the vast majority of the nuclear DNA. Retrotransposition involves a reverse transcription step, in which cDNA is synthesized from an RNA template. Reverse transcription is catalyzed by reverse transcriptase, which is generally encoded by the retrotransposon being copied, and the cDNA is inserted into a new genomic location by the integrase, which is also self-encoded. A canonical retrotransposon insertion comprises two LTRs and an internal domain containing the coding domains for integrase, reverse transcriptase, a proteinase and the structural protein GAG, as well as the signals for reverse transcription.
Results and discussion
In the genome of Arabidopsis thaliana, after thorough analyses, Devos et al. identified no complex elements other than those originating from recombination between two retroelements. For these, the two outermost LTRs differ from each other because they do not derive from the same reverse transcription and integration event. This has two structural consequences. First, a recombination between the 3' LTR of one element and the 5' LTR of another, closely related one on the same strand (Figure 2) gives rise to a third, internal LTR. This LTR is a chimera of the two LTRs involved in the recombination. Second, because the two elements involved come from two independent insertion events that generated two different target-site duplications (TSDs), the resulting complex does not harbor flanking TSDs. By these measures, the vast majority of the complex elements already identified arose from unequal and heterologous recombination between adjacent and independent insertions [2, 8].
Based on these observations, we checked other Poaceae sequences for the occurrence of such complex structures. We carried out an ab initio scan of the rice pseudomolecules and all the available genomic sequences from maize, using the LTR_STRUC software for detection of complete LTR retrotransposons. This software detects only complete elements, based on the presence of both LTRs and of the TSD motifs flanking them. Out of 4704 identified potential LTR retrotransposons, we were able to clearly identify two new complex structures harboring the diagnostic features: an internal LTR, two complete core sequences, flanking TSDs and similarity between the outermost LTRs. The first element is located on chromosome 5 of rice, at position 14011139–14022766 (TIGR pseudomolecule), in the forward orientation. This element is a member of the Squiq subfamily, with CAAAC as the TSD sequence. The second detected complex is a member of the Opie family in the maize BAC AY078063, at position 57992–74088, in the reverse orientation, with GCATG as the TSD sequence (the detailed alignments of the LTRs as well as the Dotter images for those complexes are provided in additional file 2).
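Two of the diagnostic features named above, flanking TSDs and similarity between the outermost LTRs, lend themselves to a simple programmatic check. The following Python sketch is only an illustration under stated assumptions, not the procedure used in this study: the function names, the 5-bp TSD length (chosen to match the CAAAC and GCATG duplications reported above) and the 0.95 identity threshold are all hypothetical, and the two LTRs are assumed to have been aligned beforehand.

```python
def flanking_tsd(sequence, start, end, tsd_len=5):
    """Return the target-site duplication flanking an element, or None.

    `start` and `end` are 0-based, end-exclusive coordinates of the element,
    from the first base of the 5' LTR to the last base of the 3' LTR.
    `tsd_len` is an assumed TSD length, not a value taken from the study.
    """
    if start < tsd_len or end + tsd_len > len(sequence):
        return None  # not enough flanking sequence to compare
    left = sequence[start - tsd_len:start]
    right = sequence[end:end + tsd_len]
    return left if left == right else None


def identity(a, b):
    """Fraction of matching positions between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def looks_like_single_insertion(sequence, start, end, ltr5, ltr3,
                                min_identity=0.95):
    """True if the element has matching flanking TSDs and similar outer LTRs.

    `min_identity` is an illustrative threshold (an assumption).
    """
    tsd = flanking_tsd(sequence, start, end)
    return tsd is not None and identity(ltr5, ltr3) >= min_identity
```

A complex that passes both checks is compatible with a single integration event; one that fails them matches the expectation for recombination between independent insertions. The presence of an internal LTR and of two core sequences would still need to be verified separately.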
Errors in template choice during reverse transcription can occur anywhere along the sequence. The growing cDNA can jump to the other packaged template instead of to the other end of the template it is already on. Generally, because the two packaged templates are almost identical (derived from the same retrotransposon or retrovirus RNA), the phenomenon is undetectable because there are no major modifications to the resulting cDNA. However, if two different RNAs are packaged in the same virus-like particle, a jump to the other template during reverse transcription leads to abnormal or new elements, opening a new mode for LTR retrotransposon evolution. The Veju L and BARE 2 elements appear to have been formed in this way.
If RNAs from two slightly different individual LTR retrotransposons are co-packaged, the strand switch could also occur between the two R regions. This would lead to formation of a heterodimer (Figure 5B) rather than a normal monomer (Figure 5A). The resulting cDNA would constitute a chimeric complex between the two elements, and possess chimeric LTRs. The process of reverse transcription described above renders the external LTRs identical. Their 3' ends would therefore also be identical and could serve as substrates for the same type of integrase. Thus, a chimeric complex element would nevertheless be integrated via standard integrase catalysis, leading to a new genomic insertion harboring TSDs on either side (Figure 5B). The dimerization could occur between two packaged RNAs from highly similar elements, such as closely related members of the same retrotransposon family, leading to a complex harboring three identical LTRs interspersed between two similar internal regions. Moreover, because the LTRs would be complete and not compromised by heteroduplex formation, each of them would be able to promote the expression of its corresponding downstream element. Thus, the two original elements could be expressed as normal, individual copies and could even propagate through the genome as separate elements.
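The structural outcome of this model can be summarized with a toy Python sketch. It is purely illustrative: segments are represented as labels rather than sequences, the function names are hypothetical, and the code only encodes the expected order of segments in the monomer and in the dimer, not the biochemistry of strand transfer.

```python
def cdna_structure(templates):
    """Return the ordered segments of the cDNA produced in one particle.

    `templates` lists the packaged RNAs that contribute an internal domain
    to the cDNA. One name models the normal monomer; two names model a
    strand switch between the R regions of two co-packaged RNAs, which
    yields a dimer whose two internal domains share a middle LTR.
    """
    if not 1 <= len(templates) <= 2:
        raise ValueError("this toy model handles one or two templates only")
    segments = ["LTR"]
    for name in templates:
        segments += [f"internal({name})", "LTR"]
    return segments


def integrate(segments, tsd):
    """Model integration: one target-site duplication on each side."""
    return [f"TSD({tsd})"] + segments + [f"TSD({tsd})"]


monomer = integrate(cdna_structure(["A"]), "CAAAC")
dimer = integrate(cdna_structure(["A", "B"]), "CAAAC")
```

In this representation the dimer carries three LTRs, two internal domains and a single pair of flanking TSDs, whereas two independent insertions would each carry their own TSD pair.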
Table: Number of insertions in ~7 Mb of Triticeae large-insert sequences
Type of event | Number of events
LTR retrotransposon insertions |
Other recombination events |
Template Switching Complexes |
DNA Transposon insertions |
The model we propose is consistent both with the available data and with the established details of the retrotransposon life cycle. A direct demonstration of the mechanism would entail isolating virus-like particles containing two paired RNAs (Figure 5B) and demonstrating the RNA structure. This, however, awaits both an efficient system for production of packaged complexes (perhaps by over-expression of a retrotransposon with a tendency to form complexes) and a means of determining the number of RNAs present within a single particle.
All currently publicly available Triticeae (wheat and barley) BACs were re-analyzed as in . The updated annotations were used to analyze the insertion complexes. The original analyses of AF497474 from Aegilops tauschii, AF368673 from Triticum turgidum and AY078063 from Zea mays were performed respectively by [3, 4], and . The sequences of the rice pseudomolecules (~367 Mb) were downloaded from the TIGR website. The scanned maize sequences comprise all the large sequences available for maize in the public database, i.e., excluding the trace files and the gene-only sequences. They were downloaded from the NCBI website and represent ~1,650 Mb.
The ab initio identification of LTR retrotransposons within the rice and maize sequences was performed by the LTR_STRUC software using standard specifications. All of the 4072 potential complex elements output by this program were first screened by size using an in-house Python script, and the 1416 candidates meeting the criterion of >10 kb length were then manually checked using Dotter for the presence of an internal LTR. The LTR vs. LTR analyses were also performed using Dotter, and the target-site duplications were verified manually. The LTR alignments were verified using ClustalX, after manual editing as necessary (see Supplemental data).
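The size screen described above can be sketched in a few lines of Python. This is not the script used in the study, which is not reproduced here: the `Candidate` record, its field names and the `filter_by_length` function are hypothetical, and parsing of the LTR_STRUC output is deliberately left out. Only the >10 kb criterion comes from the text.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one predicted element."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self):
        return self.end - self.start + 1


def filter_by_length(candidates, min_length=10_000):
    """Keep only the candidates strictly longer than `min_length` bp."""
    return [c for c in candidates if c.length > min_length]


# Coordinates of the two complexes reported in the text, plus a short
# element (made up for illustration) that the screen should discard.
candidates = [
    Candidate("rice_chr5_complex", 14011139, 14022766),
    Candidate("maize_AY078063_complex", 57992, 74088),
    Candidate("short_example", 1, 5000),
]
kept = filter_by_length(candidates)
```

Only the two long candidates remain in `kept`; in the workflow described above, each retained candidate would then be inspected manually for an internal LTR.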
LTR: long terminal repeat
The authors thank Jaakko Tanskanen for his help with the Python scripts. FS was supported by a fellowship from CIMO and by a University of Helsinki Postdoctoral Fellowship. Experiments described here were carried out under a grant from the Academy of Finland, Project 106949.
- Sabot F, Schulman AH: Parasitism and the retrotransposon life cycle in plants: A hitchhiker's guide to the genome. Heredity. 2006, 97: 381-388. 10.1038/sj.hdy.6800903.
- Devos KM, Brown JKM, Bennetzen JL: Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis. Genome Res. 2002, 12: 1075-1079. 10.1101/gr.132102.
- Anderson OD, Rausch C, Moullet O, Lagudah ES: The wheat D-genome HMW-glutenin locus: BAC sequencing, gene distribution, and retrotransposon clusters. Funct Int Genomics. 2003, 3: 56-68.
- Kong X-Y, Gu YQ, You FM, Dubcovsky J, Anderson OD: Dynamics of the evolution of orthologous and paralogous portions of a complex locus region in two genomes of allopolyploid wheat. Plant Mol Biol. 2004, 54: 55-69. 10.1023/B:PLAN.0000028768.21587.dc.
- Chantret N, Cenci A, Sabot F, Anderson OD, Dubcovsky J: Sequencing of the Triticum monococcum Hardness locus reveals good microcolinearity with rice. Mol Genet Genomics. 2004, 271: 377-386. 10.1007/s00438-004-0991-y.
- Chantret N, Salse J, Sabot F, Rahman S, Bellec A, Laubin B, Dubois I, Dossat C, Sourdille P, Joudrier P, Gautier MF, Cattolico L, Beckert M, Aubourg S, Weissenbach J, Caboche M, Bernard M, Leroy P, Chalhoub B: Molecular basis of evolutionary events that shaped the Hardness, Ha locus in diploid and polyploidy wheat species, Triticum and Aegilops. The Plant Cell. 2005, 17: 1033-1045. 10.1105/tpc.104.029181.
- Sabot F, Guyot R, Wicker T, Chantret N, Salse J, Laubin B, Leroy P, Sourdille P, Chalhoub B, Bernard M: Updating transposable element annotations from large wheat genomic sequences reveals diverse activities and gene association of elements. Mol Genet Genomics. 2005, 274: 119-132. 10.1007/s00438-005-0012-9.
- Vicient CM, Kalendar R, Schulman AH: Variability, recombination and mosaic evolution of the barley BARE-1 retrotransposon. J Mol Evol. 2005, 61: 275-291. 10.1007/s00239-004-0168-7.
- McCarthy EM, McDonald JF: LTR_STRUC: a novel search and identification program for LTR retrotransposons. Bioinformatics. 2003, 19: 362-367. 10.1093/bioinformatics/btf878.
- Stam M, Belele C, Ramakrishna W, Dorweiler JE, Bennetzen JL, Chandler VL: The regulatory regions required for B' paramutation and expression are located far upstream of the maize b1 transcribed sequences. Genetics. 2002, 162: 917-930.
- Sabot F, Sourdille P, Bernard M: Advent of a new retrotransposon structure: the long form of the Veju elements. Genetica. 2005, 125: 325-335. 10.1007/s10709-005-7926-3.
- TIGR homepage. [http://www.tigr.org]
- NCBI homepage. [http://www.ncbi.nlm.nih.gov]
- Sonnhammer EL, Durbin R: A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene. 1995, 167: GC1-10. 10.1016/0378-1119(95)00714-8.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
- Artemis at Sanger Institute. [http://www.sanger.ac.uk/Software/Artemis/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.