
LTR retroelements are intrinsic components of transcriptional networks in frogs



LTR retroelements (LTR REs) constitute a major group of transposable elements widely distributed in eukaryotic genomes. Through their own mechanism of retrotranscription, LTR REs enrich the genomic landscape by providing genetic variability, thus contributing to genome structure and organization. Nonetheless, the transcriptomic activity of LTR REs remains an obscure domain within cell, developmental, and organismal biology.


Here we present a first comparative analysis of LTR REs for anuran amphibians based on a full depth coverage transcriptome of the European pool frog, Pelophylax lessonae, the genome of the western clawed frog, Silurana tropicalis (release v7.1), and additional transcriptomes of S. tropicalis and Cyclorana alboguttata. We identified over 1000 copies of LTR REs from all four families (Bel/Pao, Ty1/Copia, Ty3/Gypsy, Retroviridae) in the genome of S. tropicalis and discovered transcripts of several of these elements in all RNA-seq datasets analyzed. Elements of the Ty3/Gypsy family were most active, especially Amn-san elements, which accounted for approximately 0.27% of the genome in Silurana. Some elements exhibited tissue-specific expression patterns, for example Hydra1.1 and MuERV-like elements in Pelophylax. In S. tropicalis considerable transcription of LTR REs was observed during embryogenesis as soon as the embryonic genome became activated, i.e. at the midblastula transition. In the course of embryonic development the spectrum of transcribed LTR REs changed; during gastrulation and neurulation MuERV-like and SnRV-like retroviruses were abundantly transcribed, while during organogenesis transcripts of the XEN1 retroviruses became much more abundant.


The differential expression of LTR REs during embryogenesis, in concert with their tissue-specificity and the protein domains they encode, is evidence for the functional roles these elements play as integrative parts of complex regulatory networks. Our results support the now widely accepted concept that retroelements are not simple “junk DNA” or “harmful genomic parasites” but essential components of the transcriptomic machinery in vertebrates.


Transposable elements (TEs) are mobile genetic elements that constitute large portions of the genome in eukaryotes [1, 2]. In primates including humans, for example, about 50% of the genome consists of TEs [3]. Vast genome size differences among species are directly related to the TE content [1, 2, 4, 5]; thus TE abundance and diversity are characteristic features of plant and animal genomes [6].

Transposable elements play an important role in genome organization and evolution as substantial providers of large scale mutation events, creating genetic variability that natural selection can act upon [1]. They can affect both single genes and entire genomes [7, 8] by chromosomal rearrangements including insertions, duplications, deletions, and recombination events [9, 10]. Although most TE-caused mutations are expected to be deleterious, some are neutral or even adaptive. TE-derived sequences such as promoters [11-15], polyadenylation signals and termination sites [16-18], and smRNAs [19] are involved in regulation of gene expression at both the transcriptional and post-transcriptional level [2, 9, 20]. In addition, TE proliferation is thought to create new regulatory networks and to participate in the rewiring of pre-established regulatory networks [2].

Little is known about the regulation of TE activity. Large scale elimination and suppression of retroelements have both been documented for the genome of the pufferfish [21]. Several factors have been shown to be responsible for TE silencing, such as RNAi [22, 23], especially by piRNAs [24, 25], and DNA methylation [26].

In some cases activation of TEs seems to be environmentally mediated. There is evidence, for example, that retrotransposition activates the expression of stress response genes, thus providing positive feedback under stressful conditions that promotes survival-related genes [27].

Transposable elements are generally classified into Class I elements (called retrotransposons or retroelements), which use an RNA intermediate for transposition, and Class II elements, which replicate without an RNA intermediate, either by a cut-and-paste mechanism (DNA transposons), by rolling circle DNA replication (helitrons), or by so far unknown mechanisms (politrons/mavericks). Among the Class I elements two major subclasses are recognized: (1) retroelements (REs) with long terminal repeats (LTRs) and (2) elements without LTRs (non-LTR REs) [20, 28]. In this study we focus on LTR REs, which can be classified into four major families, namely Bel/Pao, Ty1/Copia, Ty3/Gypsy, and retroviruses [29, 30]. A common LTR retrotransposon typically encodes two polyproteins, termed GAG and POL. The group-specific antigen (GAG) usually contains matrix, capsid, and nucleocapsid domains; POL consists of aspartic proteinase (AP), reverse transcriptase (RT), ribonuclease (RN), and integrase (INT) domains. The latter three (RT, RN, INT) are responsible for retrotranscribing cDNA from RNA intermediates and inserting it into the host genome.

Endogenous retroviruses (ERVs) constitute a specific class of LTR REs that additionally contain an open reading frame (ORF) for an envelope protein (ENV), which enables ERVs to move from one cell to another. In contrast, all other LTR REs either lack an ENV gene or contain only a remnant of one and can only reinsert into their own host genome [1, 31, 32]. There are, however, ERVs that have secondarily lost their ENV gene and thus their infectious ability. Such ERVs retrotranspose instead of infecting other cells as typical retroviruses do [33].

As a precondition for understanding the role of LTR REs in shaping genomes the diversity of these elements has to be systematized [34-36]. For this purpose several computer programs have been developed to automatically detect LTR REs [37]. Some of these computing methods have made it possible to detect and identify previously unknown elements [38]; however, only a few comprehensive studies on LTR RE diversity have been carried out on non-model organisms. Furthermore, many genomes still host remnants of inactive retrotransposons corresponding to ancient retrotransposition events. These “genomic fossils” have accumulated mutations through time; many of them are difficult to identify because they have lost some of their characteristic features, thus making them imperceptible to automatic searches.

In this study we analyze the abundance and diversity of LTR retrotransposons found in the genome of the western clawed frog Silurana (Xenopus) tropicalis and compare it to a full depth coverage transcriptome of an advanced frog species, the European pool frog Pelophylax (Rana) lessonae. Amphibians are a very important evolutionary link between lunged and gilled vertebrates; they are also amongst the animals with the largest genomes [39]. The sequencing of the Silurana genome revealed a high diversity of TEs, even higher than in many other eukaryotes and vertebrates studied, including all four major families of LTR REs [40], thus making the frog genomic and transcriptional landscapes excellent environments to study the variability and dynamics of LTR REs. We estimated the abundance of the LTR RE families and clades within the Silurana genome and systematized the elements into clades on the basis of phylogenetic analyses. We then used this classification to analyze the diversity and expression patterns of LTR REs in the transcriptional landscapes of different tissues obtained from P. lessonae and S. tropicalis, and of eight individuals of Cyclorana alboguttata.

Based on RNA-seq data we show that certain elements are expressed in a tissue-specific manner and, for the first time, that the expression patterns of ERVs change during embryonic development of Silurana. Finally, we discuss factors that may affect the transcription of LTR REs in the context of tissue- and genome-specificity.


Transcriptome assembly

Four transcriptomes were assembled. The largest transcriptome comprised the libraries of Silurana developmental stages [41] and spanned 148 million bp in 247 thousand sequences with an N50 of 791 bp. The largest assembled sequence originated from the P. lessonae transcriptome and consisted of 94519 bp; it included an ORF of 93336 bp coding for 31122 amino acids (aa), a full length frog ortholog of titin (Gr. titan = giant), the largest known vertebrate gene/protein. The presence of this unusually long transcript indicates the good assembly quality of the P. lessonae transcriptome.
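The N50 reported above is a standard assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy lengths are illustrative assumptions, not values from the assemblies described here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2            # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

In the toy example the two longest sequences (500 and 400 bp) already cover more than half of the 1500 bp total, so the N50 is 400.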

LTR RE diversity and abundance in the Silurana genome

Phylogenetic reconstructions based on RT domains (Figure 1, Additional file 1: Figures S1-S5) revealed the presence of LTR REs of all four families (Bel/Pao, Ty1/Copia, Ty3/Gypsy, Retroviridae) in the genome and transcriptomes of S. tropicalis and the transcriptomes of P. lessonae and C. alboguttata (Table 1). We were able to identify at least eleven types of LTR REs (Figure 1, Table 1), some of them either unknown or else previously neglected in the Silurana genome.

Figure 1

Classification and structure of LTR retroelements in the frog genome and transcriptomes. Maximum-likelihood (ML) trees calculated on the basis of 256 known RT domains of eukaryotic LTR REs including amino acid sequences obtained from the Silurana tropicalis genome (a) and the transcriptomes of Pelophylax lessonae (b). Diagrammatic presentation of LTR REs (c) found in the Silurana genome (blue) and in the transcriptome of P. lessonae (red). The thin lines represent the overall length of the retroelement including the LTRs, while thick bars depict open reading frames for aspartic proteinase (AP), chromo domain (CHR), envelope protein (ENV), group-specific antigen (GAG), integrase (INT), RNase (RN), and reverse transcriptase (RT). Frameshifts are indicated by asterisks (*).

Table 1 LTR retroelements detected in the genome of Silurana tropicalis

Two types of Bel/Pao elements (Kobel and Hydra3.1) were found in the Silurana genome (Table 1). A Kobel-like element was present in multiple copies (135) in the Silurana genome; it was transcriptionally active in Silurana, Pelophylax, and Cyclorana (Figure 1, Table 2). Hydra3.1-like elements were present in 2 copies in the Silurana genome but absent from the frog transcriptomes analyzed.

Table 2 LTR retroelements discovered in the genome of Silurana tropicalis (SIL-G) and different transcriptomes of S. tropicalis (SIL-T: adult tissues; SIL-D: developmental stages), Cyclorana alboguttata (CYC-T: adult individuals), and Pelophylax lessonae (PEL-T: adult tissues) with remarks on the occurrence and distribution of these elements among animals, plants, and fungi

Three types of Ty1/Copia elements (Hydra1.1, Mtanga, Zeco) were found in the frog genome and transcriptomes (Figure 1, Tables 1 and 2). Hydra1.1 and Mtanga-like elements were detected in the Silurana genome with 6 and 8 copies, respectively. Zeco-like elements, however, were found only in the transcriptome of P. lessonae together with transcripts of Hydra1.1- and Mtanga-like elements.

We found four types of Ty3/Gypsy elements (Amn-san, Cer, Gmr1, Mag) in the Silurana genome (Table 1). In total we identified over 700 copies of Amn-san elements, about 30 copies of Cer-like elements, ca. 200 copies of Gmr1-like elements, and approximately 80 copies of Mag-like elements. Multiple transcripts of these elements were also found in Pelophylax, Silurana, and Cyclorana tissues (Table 2).

Among the Retroviridae elements, three types (Murine Endogenous Retrovirus-like element, MuERV; Snakehead fish retrovirus, SnRV; and Xenopus laevis endogenous retrovirus, XEN1) were found in the Silurana genome and the frog transcriptomes analyzed (Figure 1, Tables 1 and 2; Additional file 1: Tables S1 and S2). A MuERV-L was present in 1-2 copies in the Silurana genome and in the P. lessonae transcriptome. Moreover, we were able to locate about 9 copies of SnRV-like elements within the Silurana genome and recovered a complete ENV-less element of this virus in the P. lessonae transcriptome. A XEN1 was present in the Silurana genome with ca. 10 copies and several transcripts were present in the transcriptomes of Pelophylax, Silurana, and Cyclorana (Table 2).

Genome colonization and proliferation of LTR elements

The diversity of LTR REs is largely the same in Silurana and Pelophylax (Figure 2a). There is evidence, however, that at least two elements (Zeco and Hydra3.1) have been acquired or lost since their last common ancestor. Our results clearly demonstrate that Ty3/Gypsy and Bel/Pao are the most prolific LTR RE families within the Silurana genome (Figure 2b), while elements of Ty1/Copia and Retroviridae show less success in fixation. Among all frog LTR REs, Amn-san elements are the most abundant, with multiple genomic copies (>700) followed by Gmr1 and Kobel (Table 2); some of the copies show very low sequence divergence as indicated by the average relatedness values calculated on the basis of the nucleotide and aa sequences of the RT domain (Figure 2c).

Figure 2

Diversity and expression patterns of LTR retroelements in the frog genome and transcriptomes. (a) Diversity of LTR REs in the genome of Silurana and in the frog transcriptomes analyzed. (b) Number of LTR RE copies in the Silurana genome. (c) Proliferation patterns based on average relatedness of LTR REs in the Silurana genome. The average relatedness was calculated on the basis of amino acids as LOG (∑ (Alignment coverage * Alignment score)), in which a higher relatedness score indicates that the elements within that group are more closely related to one another. (d) Arithmetic means of relative NRC values calculated for brain (B), heart (H), liver (L), and muscle (M) of S. tropicalis (left points) and P. lessonae (right points). (e) Relative amount of LTR REs in different frog transcriptomes.
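The relatedness score in panel (c) can be sketched in a few lines. The caption gives the formula but not the base of the logarithm or the units of coverage and score, so the sketch below assumes base 10, coverage as a fraction between 0 and 1, and an arbitrary alignment score; the function name and the toy values are hypothetical.

```python
import math


def average_relatedness(alignments):
    """Compute LOG(sum(coverage * score)) for one group of elements.

    alignments: iterable of (coverage, score) pairs, one per pairwise
    alignment within the group. Base-10 logarithm is an assumption.
    """
    total = sum(coverage * score for coverage, score in alignments)
    if total <= 0:
        raise ValueError("sum of coverage * score must be positive")
    return math.log10(total)


# Hypothetical alignments: (coverage fraction, alignment score).
toy_alignments = [(1.0, 600.0), (0.5, 800.0)]
toy_relatedness = average_relatedness(toy_alignments)
print(toy_relatedness)
```

Because the products are summed before the logarithm is taken, groups with many long, high-scoring alignments receive higher values, which matches the caption's reading that a higher score means more closely related copies.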

Transcript abundance and differential expression

Our results clearly show that LTR REs from all four families (Bel/Pao, Ty1/Copia, Ty3/Gypsy, Retroviridae) are differentially transcribed. Ty3/Gypsy appears to be the most active LTR RE family as indicated by both the number of copies and NRC (Normalized Read Count) values (Table 2, Additional file 1: Tables S1 and S2).

In adult individuals of Silurana and Pelophylax, the expression of some elements exhibits tissue-specific patterns (Figure 2d); significant differences in expression were observed for three elements (Amn-san, Gmr1, Mag) in Silurana and for two elements (Hydra1.1, MuERV) in Pelophylax (Additional file 1: Figures S6 and S7; Tables S1, S2, and S4). Hydra1.1, for example, exhibited the highest relative NRC values in brain and the lowest in muscle transcriptomes in both Pelophylax and Silurana (Figure 2d, Additional file 1: Figure S7). It is also noticeable that SnRV is over-expressed in the tongue tissue of P. lessonae, showing an approximately five times higher relative NRC value than in the other tissues investigated (Additional file 1: Figure S6). In muscle of both Silurana and Pelophylax most elements were on average less expressed than in other tissues (Additional file 1: Figure S7). Muscle tissues of eight C. alboguttata individuals, however, showed only little similarity in both the relative amount and diversity of transcribed LTR REs (Figure 2e).
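To make the tissue comparison concrete, the following sketch shows one way a normalized and a relative read count could be derived. This section does not define the NRC normalization, so both formulas are assumptions: reads per million library reads for the NRC, and the share of an element's summed NRC for the relative value. All names and numbers are hypothetical.

```python
def normalized_read_count(mapped_reads, library_size, per=1_000_000):
    """Reads mapped to an element, scaled to a common library size (assumed scheme)."""
    return mapped_reads * per / library_size


def relative_nrc(nrc_by_tissue):
    """Express each tissue's NRC as a share of the element's total NRC."""
    total = sum(nrc_by_tissue.values())
    if total == 0:
        return {tissue: 0.0 for tissue in nrc_by_tissue}
    return {tissue: value / total for tissue, value in nrc_by_tissue.items()}


# Hypothetical counts: (reads mapped to one element, total reads in library).
toy_counts = {"brain": (400, 40_000_000), "muscle": (100, 50_000_000)}
toy_nrc = {t: normalized_read_count(m, n) for t, (m, n) in toy_counts.items()}
toy_relative = relative_nrc(toy_nrc)
print(toy_nrc, toy_relative)
```

Scaling by library size before comparing tissues keeps a deeply sequenced library from appearing more active simply because it contains more reads.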

In the embryonic development of S. tropicalis, transcription of LTR REs begins as soon as the embryonic genome is activated, ca. 6-8 hours after insemination of eggs at developmental stage 8.5, i.e. at the midblastula transition (MBT) [55] (Figure 3; here stage 8.5 is included in stage 9). While Ty3/Gypsy, Bel/Pao, and Ty1/Copia elements did not show clear differential expression patterns during embryonic development, retroviral elements, particularly MuERV and SnRV, were most actively transcribed during gastrulation and neurulation, and XEN1 during organogenesis.

Figure 3

Normalized read counts and relative amount of expression of LTR retroelement (LTR RE) transcripts throughout the developmental progression of S. tropicalis. The presence of each type of LTR RE found within the transcriptome of S. tropicalis throughout 23 distinct developmental stages is summarized.

LTR RE annotations

Predicted LTR REs from the Silurana genome and LTR RE transcripts from all frog transcriptomes exhibited many ORFs which contained protein domains normally associated with retrotranscription of LTR REs and their reinsertion into the genome. The preliminary annotation of these genomic elements further revealed specific domains for each type of LTR RE that are linked with cell regulation in animals (Table 3).

Table 3 Examples of protein domains found in LTR REs predicted from the genome of Silurana tropicalis which might play a role in gene regulation and transcriptional networking


LTR retroelement diversity in the genomic and transcriptomic landscapes of frogs

Based on two different genomic search methods we found between 1200 and 1300 LTR REs of four distinct families within the Silurana genome, each containing at least LTRs and a retrotranscriptase ORF. LTR elements, however, constitute only a small fraction of total nuclear Silurana DNA compared to non-LTR REs and DNA transposons, which comprise up to one third of the Silurana genome [40]. Multiplying the average length of each element type by its copy number (Table 2), we estimate that around 0.49% (ca. 7.18 Mbp) of the Silurana genome assembly 7.1 (in total 1.45 billion bp) is composed of LTR REs. This estimate is consistent with the 7.43 Mbp calculated by Smit et al. [67] using the Repeat Masker Silurana genomic dataset (available at but differs from the value (9%) published by Hellsten et al. [40]; this discrepancy probably reflects a lower threshold used by Hellsten et al. to identify LTR REs.
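The genome fraction quoted above follows from simple arithmetic, sketched here as a check. The genome size, the 7.18 Mbp total, and the 0.27% Amn-san share are taken from the text; the helper names and the per-element toy table are hypothetical, since Table 2 is not reproduced in this section.

```python
GENOME_SIZE_BP = 1.45e9   # Silurana genome assembly 7.1 (from the text)
LTR_RE_TOTAL_BP = 7.18e6  # estimated LTR RE content (from the text)


def total_element_bp(elements):
    """Sum copy number * mean length over (copies, mean_length_bp) pairs."""
    return sum(copies * mean_len for copies, mean_len in elements)


def genome_fraction_percent(element_bp, genome_bp):
    """Share of the genome occupied by the given number of base pairs."""
    return 100.0 * element_bp / genome_bp


ltr_percent = genome_fraction_percent(LTR_RE_TOTAL_BP, GENOME_SIZE_BP)
amn_san_bp = 0.27 / 100 * GENOME_SIZE_BP  # bp implied by the 0.27% share

# Hypothetical element table, only to show how the total would be built.
toy_total = total_element_bp([(10, 5000), (2, 4000)])

print(round(ltr_percent, 3), round(amn_san_bp / 1e6, 2), toy_total)
```

The computed share is just under 0.5%, in line with the reported value of around 0.49%, and the 0.27% attributed to Amn-san elements corresponds to about 3.9 Mbp.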

Besides elements typical for vertebrate genomes such as Amn-san, Gmr1, and retroviruses, we have identified LTR REs of the Ty1/Copia and Bel/Pao clades, which have so far only been found in the genomes of phylogenetically distant aquatic animals. The Hydra3.1 element, for example, was first described from the genome of a freshwater animal Hydra magnipapillata; Kobel-like elements are known from the genomes of basal protostomes and deuterostomes [32].

Amn-san elements were most abundant in all of our data sets. They account for about 0.27% of the genome size in Silurana and can be considered the most successful LTR REs in the Silurana genome. This assumption is supported by the coexistence of multiple copies with high sequence similarity, which suggests relatively recent bursts in activity of one or even several active master elements or recurrent genomic invasions. Besides closely related Amn-san elements, we found copies with higher sequence divergence that may trace back to older and now inactive elements. Large numbers of LTR REs have also been found in the giant genomes of salamanders, primarily Ty3/Gypsy elements [68], which supports our results that these LTR REs, particularly Amn-san, are the most numerous elements and account for nearly half of the LTR RE content in the Silurana genome. Moreover, our Silurana genomic dataset contained twice as many Bel/Pao elements as had been previously reported by de la Chaux and Wagner [35], who used a more selective pipeline and different reference sequences to identify LTR REs.

Colonization of the amphibian genome by LTR REs

Very little is known about genome colonization by LTR REs or about their evolutionary dynamics, which are thought to encompass both gradual and vertical processes, as well as distinct modular, saltatory, and reticular events [32]. As indicated by the similar LTR RE spectrum in the genomes of Silurana and Pelophylax, most of the REs were already present in the genome of their last common ancestor, which presumably lived ca. 230 million years before present [69]. It can be assumed, however, that genome colonization by LTR REs predates the split between Rhinophrynidae + Pipidae and Neobatrachians because members of all RE families except Retroviridae are widely distributed among the genomes of plants, fungi, and animals [32].

LTR REs are usually inherited vertically from generation to generation, but there is also evidence for horizontal transfer of such elements between species [70-74]. A successful spread of LTR REs assumes a stable integration into the germline of the host, which can be achieved when eggs or early embryonic stages are infected. The underlying transfer requires a vector; it was speculated that parasites may transmit nuclear DNA including TEs [74, 75]. The mechanisms of the transmission process, however, remain obscure. In this context it should be noted that Cer elements found in the genome of Silurana and the transcriptome of Pelophylax showed closest relationships to elements described from the genome of the nematode Caenorhabditis elegans [45]. We do not know whether these Cer elements originated directly from frog genomes or from the genomes of putative parasites. The latter possibility is more parsimonious because highest expression of Cer elements was observed in muscle and testis; both tissues are known to be colonized by parasitic flatworms [76, 77].

Differential expression of LTR REs

The expression of LTR REs in vertebrates is thought to depend on a variety of genetic and epigenetic factors as indicated by specific spatiotemporal expression patterns, i.e. differences in the expression profiles of distinct elements (families) between tissues, sexes, ontogenetic and age stages, individuals, and species [78-82]. Tissue-specific expression patterns of single LTR REs, especially Hydra1.1 and MuERV, have been observed in the frog transcriptomes analyzed. The most enigmatic example of tissue-specific expression is the Snakehead retrovirus (SnRV), which was highly expressed in the tongue of P. lessonae but at very low levels in the other tissues investigated. The significance of this pattern is not yet understood, and this ERV itself is not well studied.

Similar patterns of cell type specific expression have been reported for the ZFERV virus of the zebrafish; for this ERV the thymus appears to be a major tissue for retroviral activity [78]. Pervasive, tissue-specific RE transcription is likely to have functional consequences on the protein-coding transcriptome [80] and is thought to be directly linked to the role these elements may play in physiology of organs [78, 79].

Evidence for individual differences of LTR RE expression comes from the Cyclorana dataset; here a small number of Kobel-like elements were transcribed in muscle tissue of only some individuals. This suggests that expression of LTR REs may play a role in the process of individual adaptation and may affect phenotypic variability. Because the Silurana transcriptomic datasets are pooled from several specimens [41], individual effects should be minimized as indicated by similar expression profiles of LTR RE transcripts in S. tropicalis eggs and embryos obtained from two different clutches (Figure 3). Moreover, there is evidence for species-specific expression of LTR REs. For example, XEN1-like elements exhibited only minor transcription in Pelophylax and Silurana, but were relatively highly expressed in the muscle tissue of Cyclorana compared to the other elements.

Our analyses clearly demonstrate that LTR REs are differentially expressed during ontogenetic development of S. tropicalis; there are clear transitions between three LTR RE communities at particular stages of development. Transcription starts abruptly at the MBT (stage 8.5, Figure 3). Before the MBT Silurana embryos undergo 12 rapid synchronous cleavages; this phase is also characterized by the absence of cell motility. At the MBT the blastomeres become motile and the cell cycle becomes more complex. While low levels of transcription are known to occur before the MBT, especially of genes associated with phosphorylation, the cell cycle, signal transduction, and apoptosis [41, 83-85], we did not find significant expression of viral-related transcripts before stage 8.5. The significant change of LTR RE transcription profiles during embryogenesis indicates that LTR REs are probably involved in cell differentiation and organogenesis in S. tropicalis, as has already been demonstrated by Sinzelle et al. [81] for the ERV XTERV1.

For mammals there is increasing evidence that LTR REs are involved in gene regulation and developmental processes. In mouse oocytes and preimplantation embryos, for example, retroviruses contributed substantially to the maternal mRNA pool and different LTR REs had specific, developmentally regulated expression patterns [86]. In a 2-cell (2C) stage embryo cDNA library prepared by Peaston et al. [87], the bulk of interspersed repeat ESTs were MuERV, similar to the situation observed in gastrulation and neurulation stages of Silurana. In mice the 2C stage is the critical phase when the embryo switches from a maternal to a zygotic transcriptome [88], comparable to the MBT in Silurana [89]. In mouse 2C-like embryonic stem cells (ESCs) the expression pattern of murine ERV elements with leucine tRNA primer (MuERVL) overlapped with more than 100 2C-specific genes that have co-opted regulatory elements from these retroviruses to initiate their transcription [90]. More than 25% of the nearly 700 MuERVL copies were activated, and 307 genes generated chimeric transcripts with junctions to MuERVL elements. Similar observations were obtained from human ESCs in which HERV-H was highly expressed but became silenced on differentiation into embryoid bodies [91]. Based on these results it can be suggested that ERVs may have an important gene regulatory role already in early mammalian development by contributing to the specification of cell types.

In contrast to the mouse genome, only 1-2 MuERV copies were found in the genome of Silurana, where they were most highly expressed from stage 13-14 (mid gastrulation) to stage 22-23 (end of neurulation). One of these copies carried an ORF of unknown function and an ENV protein.

During embryonic development LTR REs operate as alternative promoters, enhancers [13-15, 92], first exons for a subset of host genes [87], and as targets of transcription factors [93]. Retroelements are even able to serve host functions for genes over longer distances, as the example of the human ERV-9 demonstrates [94]. The LTR/POL II complex of this ERV appears to mediate the long range transfer of proteins from the LTR to the β-globin gene. Moreover, RE derived mRNAs are important sources for small RNAs, which are known to be necessary for regulation of gene expression [95].

Based on the fact that LTR REs are apparently involved in key and early stages of embryonic development in Silurana, we hypothesize that LTR REs including ERVs, were already exapted as regulators of embryonic development in lower vertebrates, i.e. long before the earliest mammalian genomes evolved.

LTR REs as evolvability toolboxes

There is increasing evidence that LTR REs have greatly contributed to generating the adaptive genetic diversity observed in living organisms [96, 97]. Besides the fact that LTR REs are common components of transcriptional networks, the protein domains they carry are known to be essential for genome maintenance and dynamics such as transcription regulation, mRNA trafficking, intracellular signaling, cell survival, and differentiation [15]. LTR REs typically include highly specific RNA binding domains (Zinc fingers, Zinc knuckles, SCAN domains) [61, 64, 65]; domains for catalysis of DNA integration into the genome (integrase domain); peptide cleavage (pepsin-like aspartases and protease domains); RNA and DNA cleavage (RNase domain, endonuclease domain) [62, 63]; and reverse transcription (retrotranscriptase domain). In addition some carry group specific antigens (GAG domains) [98], chromatin organization modifiers (chromo domains) [56, 57], and trans-membrane glycoproteins (ENV domain). The domain composition is element-specific (Table 3): for example, chromo domains were only found within Amn-san elements; more than half of the Gmr1 elements exclusively contained a SCAN domain, while Zinc finger and Zinc knuckle domains were only identified in retroviruses. Moreover, LTR RE derived glycoproteins, in particular from ERVs, are thought to act as blocking receptors against exogenous infective viruses (a phenomenon called retroviral interference or super-infection resistance) [99, 100].

A putative function not yet discussed concerns the ENV domain of ERVs, which is responsible for cell entry [101] and also has an immunosuppressive function [102]. We found that the ENV domain of MuERV was only expressed during embryogenesis but not in adult tissues of Silurana. This finding suggests that MuERV still possesses the capability to cross cell membranes during embryogenesis, and it leads us to propose that ERVs might play a general role in signal transduction pathways and thus in the coordination and regulation of ontogenetic processes in frogs and probably also in other vertebrates. Because of the relatively low copy number of ERVs in the Silurana tropicalis genome (<25), this species could serve as a suitable model to study the effects ERVs have on ontogenesis and cell differentiation.

Taking the known and putative functions of ERVs and remnant LTR elements into consideration, the common view that they have to be considered as fossil representatives of retroviruses extant at the time of their insertion into the germline [15, 103] has to be questioned. Because complex phenomena such as molecular orchestration of embryonic development, placentation, and immunity are closely accompanied by ERVs and their derivatives we are more inclined to believe that LTR REs and in general TEs significantly contributed to the rise and diversification of vertebrate animals.


We here present the first comprehensive study on the diversity of LTR REs in frog genomes. We found LTR REs of all four families (Bel/Pao, Ty1/Copia, Ty3/Gypsy, Retroviridae) in the genome of Silurana and in the transcriptional landscapes of Silurana and Pelophylax. Ty3/Gypsy and Bel/Pao are the most abundant LTR RE classes within the frog genome and transcriptome. Amn-san elements from the Ty3/Gypsy class are the most prolific with over 700 full-length genomic copies. It has been shown that LTR REs are differentially transcribed not only across different tissues of the same frog, but also across different species of frogs and across different individuals of the same species. Differential expression of LTR REs occurred also during the embryonic development of Silurana, where transcription of LTR REs begins as soon as the embryonic genome is activated, followed by clear transitions between three LTR RE communities at particular stages of development. Their involvement in key and early stages of development suggests that LTR REs, especially ERVs, were already exapted as regulators of embryonic development in lower vertebrates, i.e. before the earliest mammalian genomes evolved.

Given the huge number and variability of LTR REs, little is known about their specific genomic functions. Therefore, experimental approaches are urgently needed to better understand the roles LTR REs play in cell function, gene regulation, and organismic development, separately and in concert with other genes and genetic factors. Future efforts should also include studies focused on the functions of the protein domains encoded within each LTR RE type, and particularly the ENV domain of ERVs.

Besides the fact that LTR REs are transcriptionally active, their cell type-specificity and differential expression during ontogenetic development emphasize once again their importance for organismic development in vertebrates as intrinsic components of regulatory networks.


Tissue preparation, RNA isolation, and de novo sequencing

Organs (brain, heart, eye, intestine, liver, lung, muscle, skin, stomach, testes, tongue) for tissue samples were taken from two Pelophylax lessonae males (PL68-2012, PL74-2012) collected near Melzow, Germany (53°11'00"N, 13°54'00"E), snap frozen in liquid nitrogen, and stored at -80°C. RNA and DNA were isolated simultaneously from each of these tissues using the AllPrep DNA/RNA Mini Kit (Qiagen, Cat.No. 80204). Frozen tissue pieces were disrupted using mortar and pestle and homogenized in RLT buffer in a TissueLyser for 2 min at 20 Hz. RNA quantity and integrity were determined using a Qubit® 2.0 Fluorometer (Life Technologies, Cat.No. Q32866) and a 2100 BioAnalyser (Agilent Technologies, Cat.No. G2940CA), respectively, according to the manufacturers' instructions.

mRNA-seq libraries were prepared from 2000 ng of total RNA using the TruSeq RNA Sample Prep Kit v2 (Illumina, Cat.No. RS-122-2001) with a modification of the protocol that preserves directional information about the transcripts [104]. First, mRNA was isolated from the pool of total RNA and chemically fragmented. Then double-stranded (ds) cDNA synthesis was performed with the incorporation of dUTP in the second strand. The ds cDNA fragments were further processed following a standard Illumina sequencing library preparation scheme: end polishing, A-tailing, adapter ligation, and size selection. Prior to final library amplification, the dUTP-marked strand was selectively degraded by Uracil-DNA-Glycosylase (UDG). The remaining strand was amplified to generate a cDNA library suitable for sequencing. Paired-end 2x50 bp sequencing was carried out on the Illumina HiSeq2000 platform, generating on average 50 million paired-end reads or 2.5 GB per sample.

Genome data sources and de novo assembly of transcriptome data

The genome assembly (release v7.1) of Silurana tropicalis was downloaded from [105] (date last accessed 29 July 2014).

To study the transcriptional diversity and dynamics of LTR REs in frogs we assembled transcriptomes of P. lessonae and S. tropicalis from several tissues. P. lessonae transcriptomes of brain, eye, intestine, liver, lung, skin, stomach, testis, and tongue originated from individual PL74-2012, and transcriptomes of heart and muscle from individual PL68-2012. Transcriptomes for brain, liver, kidney, heart, and skeletal muscle of S. tropicalis are based on publicly available RNA-seq datasets (Accession No. SRX191164-68, 5 runs, 39 Gbases). Additionally, we assembled a transcriptome from a dataset of 23 distinct developmental stages of S. tropicalis from two egg clutches ([41]; Accession No. SRA051954, 40 runs comprising 92 Gbases) to study the dynamics of LTR REs through embryonic development. Finally, eight RNA-seq libraries from muscle tissue samples of eight individuals of the Australian green-striped burrowing frog Cyclorana alboguttata (Accession No. SRA059487, 8 runs, 42 Gbases) were analyzed to determine whether the expression of LTR elements is individual-specific.

All SRA files were converted to fastq format using the fastq-dump utility of the SRA Toolkit (available from NCBI), and transcriptome data were assembled with SOAPdenovo-trans [106]. We assembled the transcriptomes of Cyclorana and of the developmental stages of Silurana using different k-mer lengths (k = 23, 31, 51), merged the contig files, and constructed a non-redundant file using the program CD-HIT [107, 108].

Pelophylax deep transcriptome assembly

Prior to de novo sequence assembly, an in-house Python script was used to remove adapter sequences from raw Illumina reads (on average 1-3%) and to discard low quality reads (Phred score below 11). Reads containing Ns were also excluded. On average, about 10% of the sequences were removed by this procedure. A total of 1,119,579,890 reads were assembled simultaneously using SOAPdenovo-trans; the non-default settings were -K 31 -M 3 -F -G 200 (by default, up to five transcripts per locus are allowed).
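The read-cleaning step can be illustrated with a minimal sketch. The in-house script itself is not part of this article, so everything below is an assumption for illustration: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, and "Phred score below 11" is interpreted as the mean quality of a read. Adapter removal is not shown.

```python
from statistics import mean


def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def keep_read(seq, quality, min_mean_q=11, offset=33):
    """Return True if the read has no N and its mean Phred score is >= min_mean_q.

    Using the mean quality is an assumption; the article only states
    that reads with a Phred score below 11 were discarded.
    """
    if "N" in seq.upper():
        return False
    scores = phred_scores(quality, offset)
    return bool(scores) and mean(scores) >= min_mean_q


def filter_reads(records, min_mean_q=11):
    """Filter an iterable of (name, sequence, quality) tuples."""
    return [(n, s, q) for n, s, q in records if keep_read(s, q, min_mean_q)]
```

Calling `filter_reads` on a list of parsed FASTQ records returns only the reads that pass both criteria; the thresholds can be adjusted through `min_mean_q`.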

LTR retroelement identification

We created several datasets to gain independent overviews of LTR REs in each frog transcriptome and in the Silurana genome (Figure 4). In all searches we relied heavily on a reference collection of retroelement domains and alignments obtained from the publicly available Gypsy Database 2.0 (GyDB) [36]. For the detection of LTR REs we used the reverse transcriptase (RT) domain because it is the best conserved through evolutionary time [109]. In order to obtain a custom representation of the LTR RE diversity, including all four LTR RE families occurring within the frog genome and transcriptomes, the following methods were applied:

Figure 4

Work flow diagram summarizing data flow from sequencing to statistical analysis. Abbreviations used in the Figure: DB, database; LTR REs, long terminal repeat retroelements.

Genome LTR RE search method 1: We used tblastn to query the complete RT domains of GyDB against the entire Silurana genome, reporting matches with an e-value of 1e-40 or lower and alignments for the 10,000 best matches.

Genome LTR RE search method 2: We applied the program suffixerator, which is part of GenomeTools, with default parameters and created an enhanced suffix array, which was then scanned with LTRharvest [110], a de novo detector of LTR REs, using relaxed parameters (-seed 20, -minlenltr 30, -maxlenltr 2000, -similar 70) to predict more LTR REs. To exclude false LTR RE predictions, we then searched each sequence predicted by LTRharvest against a database of RT domains of GyDB using blastx. Only matches with an e-value of 1e-40 or lower were retained, and only the alignment for the best match was reported.

Transcriptome LTR RE search method: For the identification of LTR REs in the transcriptomes we used blastx to query each transcriptome sequence against the RT domains of GyDB. All sequences with matches at e-values of 1e-30 or lower were considered to belong to LTR REs.
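The e-value filtering used in the three search methods can be sketched as follows. This is an illustrative example only: it assumes that BLAST results were written in the standard 12-column tabular format (e-value in column 11, bit score in column 12), which the article does not state, and the function name is hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-30):
    """Keep, per query, the best-scoring hit whose e-value is <= max_evalue.

    `lines` are rows of BLAST tabular output (12 standard columns assumed).
    Returns a dict: query id -> (subject id, e-value, bit score).
    """
    best_hits = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        current = best_hits.get(query)
        if current is None or bitscore > current[2]:
            best_hits[query] = (subject, evalue, bitscore)
    return best_hits
```

With `max_evalue=1e-40` the same function mirrors the cutoff described for the genome searches; the keys of the returned dictionary are the sequences treated as LTR RE candidates.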

Systematic classification

Because the results from both genome search methods yielded thousands of RT alignments, we separately clustered each genome LTR RE dataset using the program CD-HIT with an identity threshold of 80%, and discarded sequences shorter than 120 aa to reduce the high number of similar and identical copies of each retroelement.
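The length filter mentioned above is simple enough to sketch directly. The example below is a minimal illustration with hypothetical function names; it assumes protein sequences in FASTA format and does not reproduce the CD-HIT clustering itself.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def drop_short_sequences(records, min_len=120):
    """Discard sequences shorter than min_len residues (120 aa in the text)."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]
```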

Databases resulting from single frog transcriptomes and from the S. tropicalis genome were analyzed separately. We merged each dataset with the complete RT domains of GyDB, aligned the sequences, and inferred a Maximum-Likelihood (ML) tree in order to place the retroelements accurately in a phylogenetic context. All alignments were conducted with the program Mafft [111] using local alignment and a BLOSUM30 aa substitution matrix as parameters. Final alignment files were prepared by removing columns with more than 70% gaps (Additional file 2). ML trees were calculated with the program PhyML 3.0 [112] using 4 rate categories and a nearest neighbor interchange (NNI) tree search. Branch support was estimated with an approximate likelihood ratio test (aLRT) as implemented in PhyML.
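The removal of gap-rich alignment columns can be illustrated with a short sketch. It assumes that the alignment is available as a list of equally long strings with "-" as the gap character; the function name is hypothetical and the tool actually used for this step is not named in the text.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.7):
    """Remove alignment columns in which more than max_gap_fraction are gaps."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(alignment)
    keep = [
        col
        for col in range(length)
        if sum(seq[col] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]
```

For four aligned sequences, a column with three gaps (75%) is removed, whereas a column with two gaps (50%) is kept.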

The ML trees based on the different genome search methods (1 and 2) were largely the same. We selected the tree resulting from the LTRharvest predictions (method 2). To check the integrity, i.e. the completeness, of the LTR REs, we used NCBI’s Conserved Domain Database [113] and a custom query database derived from the GyDB. Candidate sequences and regions were extracted and queried against a reference database containing the GAG, POL, dUTPASE, and CHR domains of each class of LTR REs.

LTR retroelement quantification

In order to estimate the number of LTR RE copies of each type that coexist within the Silurana genome we applied two different counting procedures: (1) all ORFs with a minimal length of 450 bp were translated into aa between the START and STOP codons using EMBOSS getorf [114]; the resulting protein predictions were blasted against a custom database containing only RT domains of LTR REs previously distinguished in the phylogenetic analysis. (2) The second method was based on the results of LTRharvest (genome search method 2). Here, too, we searched against a selected database of RT domains and counted the number of hits accumulated for each element.
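The first counting procedure starts from ORFs of at least 450 bp delimited by START and STOP codons. The sketch below illustrates that idea in plain Python; it is not a reimplementation of EMBOSS getorf. It assumes the standard stop codons, ATG as the only start codon, a length measured from the start codon up to (not including) the stop codon, and it reports only ORFs that are closed by a stop codon. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len=450):
    """Find ATG..stop ORFs in the three frames of one strand.

    Returns (start, end) pairs; `end` is the index of the stop codon,
    so seq[start:end] is the coding region without the stop.
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon after the previous stop
            elif start is not None and codon in STOP_CODONS:
                if i - start >= min_len:
                    found.append((start, i))
                start = None
    return found


def find_orfs(seq, min_len=450):
    """Collect ORFs on both strands; minus-strand coordinates refer to the reverse complement."""
    plus = [(s, e, "+") for s, e in orfs_in_strand(seq, min_len)]
    minus = [(s, e, "-") for s, e in orfs_in_strand(reverse_complement(seq), min_len)]
    return plus + minus
```

The nucleotide ORFs returned by `find_orfs` would then be translated and compared against the RT domain database, as described in the text.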

Proliferation analysis

In order to determine which of the elements have been more efficient in copying and inserting themselves within the Silurana genome, we used the inner regions (regions without LTRs) that resulted from LTRharvest (genome search method 2), separated the LTR RE predictions by element type based on the previous analyses, and queried each group of elements against itself using BLAST. We blasted the aa region of the RT domain (Blastp) as well as the whole inner regions of the LTR RE predictions (Blastn) using default parameters.

After processing the Blast reports we were able to estimate the relatedness of each element within its group by extracting the alignment score and coverage. For each element we normalized the relatedness value using the formula: element relative relatedness = Ln ∑ (alignment coverage × alignment score).
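The normalization formula can be written out as a small function. This is a sketch under stated assumptions: each alignment is represented as a (coverage, score) pair, coverage is expressed as a fraction between 0 and 1 (the article does not specify the unit), and the function name is hypothetical.

```python
import math


def relative_relatedness(alignments):
    """Compute ln(sum(coverage * score)) over all alignments of one element.

    `alignments` is an iterable of (alignment coverage, alignment score) pairs.
    """
    total = sum(coverage * score for coverage, score in alignments)
    if total <= 0:
        raise ValueError("the sum of coverage * score must be positive")
    return math.log(total)
```

For example, an element with two alignments, (0.5, 200) and (1.0, 100), has a summed value of 200 and therefore a relative relatedness of ln(200).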

LTR annotation

To predict putative functions of LTR REs we annotated the genomic copies as well as the transcripts from all frog transcriptomes analyzed. We translated all ORFs using EMBOSS getorf [114] with default parameters and the option '-find 1', which translates only regions between the start and stop codons. The resulting protein predictions were then classified by their domains using Hmmer [115] and the Pfam-A reference databases [116]. Domain hits with e-values of 1e-10 or lower were parsed out (Additional files 3 and 4).

Transcript abundance and tests for differential expression

As a first step we treated the assembled transcriptomes as a reference genome and mapped the read library of each tissue against the transcriptome using Bowtie 2.1.0 [117] with default options, except that up to 20 alignments were reported for every read (-k 20). Raw count data were obtained through a custom Python script and analyzed with DESeq [118] to normalize count data across tissues (Additional file 1: Figures S8-S11). Based on these normalized read counts (NRC), expression patterns of different LTR REs were analyzed for each transcriptome (Additional file 1: Tables S1 and S2). For tissue-specific transcriptomes we also calculated relative values of normalized read counts (NRCrel) by dividing the single NRC values by their arithmetic mean (Additional file 1: Table S3). Based on these NRCrel values we compared tissue-specific expression of all LTR REs detected (Additional file 1: Figures S6 and S7, Table S4). Because NRC and NRCrel values were not normally distributed, a LOG transformation or POWER transformation based on the method of Box and Cox [119] was applied (Additional file 1). Transformed data were tested for normality and variance homogeneity using the test statistics of Shapiro-Wilk [120] and Levene [121], respectively. NRC and NRCrel values were analyzed with the one-way ANOVA procedure and/or the Kruskal-Wallis test [122] to determine significant differences in the expression patterns (Additional file 1: Tables S1, S2, and S4). Statistical calculations were done with the program Statgraphics Centurion Version 15.2.14 (Statpoint Technologies, Inc., Warrenton, Virginia, USA).
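The calculation of NRCrel values can be sketched in a few lines. The example assumes that the normalized read counts of one LTR RE are held in a dictionary keyed by tissue; the function name and the numbers in the usage note are illustrative and not taken from the study.

```python
def relative_counts(nrc_by_tissue):
    """Divide each normalized read count (NRC) by the arithmetic mean of all NRCs.

    `nrc_by_tissue` maps tissue name -> NRC of one element.
    Returns a dict with the corresponding NRCrel values.
    """
    if not nrc_by_tissue:
        raise ValueError("at least one tissue is required")
    mean_nrc = sum(nrc_by_tissue.values()) / len(nrc_by_tissue)
    if mean_nrc == 0:
        raise ValueError("mean NRC is zero; NRCrel is undefined")
    return {tissue: nrc / mean_nrc for tissue, nrc in nrc_by_tissue.items()}
```

With hypothetical counts of 10, 30, and 20 for brain, liver, and skin, the mean is 20 and the NRCrel values are 0.5, 1.5, and 1.0, so values above 1 indicate tissues in which the element is expressed above its own average.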

Data access

RNA-seq libraries for the eleven tissues of the Pelophylax lessonae deep transcriptome study are available from the SRA sequence database under accession number SRP036849.


Our animal use protocols follow the Animal Welfare Act of the Federal Republic of Germany and the recommendations contained in "Guidelines for Use of Live Amphibians and Reptiles in Field Research" compiled by the American Society of Ichthyologists and Herpetologists (ASIH), the Herpetologists League (HL), and the Society for the Study of Amphibians and Reptiles (SSAR). All experiments in this study were performed under the ethical permits RS7/SPN176 and LUGV_RO7-4610/73#81213/2012 which were approved by Landesumweltamt Brandenburg, Regionalabteilung Süd, Referat RS7 Naturschutz and Landesamt für Umwelt, Gesundheit und Verbraucherschutz, Brandenburg, Regionalabteilung Ost, respectively.


  1. Kazazian HH: Mobile elements: drivers of genome evolution. Science. 2004, 303: 1626-1632.

  2. Feschotte C: Transposable elements and the evolution of regulatory networks. Nat Rev Genet. 2008, 9: 397-405.

  3. Gibbs RA, Rogers J, Katze MG, Bumgarner R, Weinstock GM, Mardis ER, Remington KA, Strausberg RL, Venter JC, Wilson RK, Batzer MA, Bustamante CD, Eichler EE, Hahn MW, Hardison RC, Makova KD, Miller W, Milosavljevic A, Palermo RE, Siepel A, Sikela JM, Attaway T, Bell S, Bernard KE, Buhay CJ, Chandrabose MN, Dao M, Davis C, Delehaunty KD, Ding Y, et al: Evolutionary and biomedical insights from the rhesus macaque genome. Science. 2007, 316: 222-234.

  4. Petrov DA: Evolution of genome size: new approaches to an old problem. Trends Genet TIG. 2001, 17: 23-28.

  5. Kidwell MG: Transposable elements and the evolution of genome size in eukaryotes. Genetica. 2002, 115: 49-63.

  6. Böhne A, Brunet F, Galiana-Arnoux D, Schultheis C, Volff J-N: Transposable elements as drivers of genomic and biological diversity in vertebrates. Chromosome Res Int J Mol Supramol Evol Asp Chromosome Biol. 2008, 16: 203-215.

  7. Biémont C: A brief history of the status of transposable elements: from junk DNA to major players in evolution. Genetics. 2010, 186: 1085-1093.

  8. Bire S, Rouleux-Bonnin F: Transposable elements as tools for reshaping the genome: it is a huge world after all! Methods Mol Biol Clifton NJ. 2012, 859: 1-28.

  9. Medstrand P, van de Lagemaat LN, Dunn CA, Landry J-R, Svenback D, Mager DL: Impact of transposable elements on the evolution of mammalian gene regulation. Cytogenet Genome Res. 2005, 110: 342-352.

  10. Sela N, Kim E, Ast G: The role of transposable elements in the evolution of non-mammalian vertebrates and invertebrates. Genome Biol. 2010, 11: R59.

  11. Van de Lagemaat LN, Landry J-R, Mager DL, Medstrand P: Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions. Trends Genet TIG. 2003, 19: 530-536.

  12. Mariño-Ramírez L, Lewis KC, Landsman D, Jordan IK: Transposable elements donate lineage-specific regulatory sequences to host genomes. Cytogenet Genome Res. 2005, 110: 333-341.

  13. Cohen CJ, Lock WM, Mager DL: Endogenous retroviral LTRs as promoters for human genes: a critical assessment. Gene. 2009, 448: 105-114.

  14. Conley AB, Piriyapongsa J, Jordan IK: Retroviral promoters in the human genome. Bioinforma Oxf Engl. 2008, 24: 1563-1567.

  15. Jern P, Coffin JM: Effects of retroviruses on host genome function. Annu Rev Genet. 2008, 42: 709-732.

  16. Roy-Engel AM, El-Sawy M, Farooq L, Odom GL, Perepelitsa-Belancio V, Bruch H, Oyeniran OO, Deininger PL: Human retroelements may introduce intragenic polyadenylation signals. Cytogenet Genome Res. 2005, 110: 365-371.

  17. Lee JY, Ji Z, Tian B: Phylogenetic analysis of mRNA polyadenylation sites reveals a role of transposable elements in evolution of the 3’-end of genes. Nucleic Acids Res. 2008, 36: 5581-5590.

  18. Conley AB, Jordan IK: Cell type-specific termination of transcription by transposable element sequences. Mob DNA. 2012, 3: 15.

  19. Smalheiser NR, Torvik VI: Mammalian microRNAs derived from genomic repeats. Trends Genet TIG. 2005, 21: 322-326.

  20. Rebollo R, Romanish MT, Mager DL: Transposable elements: an abundant and natural source of regulatory sequences for host genes. Annu Rev Genet. 2012, 46: 21-42.

  21. Volff J-N, Bouneau L, Ozouf-Costaz C, Fischer C: Diversity of retrotransposable elements in compact pufferfish genomes. Trends Genet. 2003, 19: 674-678.

  22. Vastenhouw NL, Plasterk RHA: RNAi protects the Caenorhabditis elegans germline against transposition. Trends Genet TIG. 2004, 20: 314-319.

  23. Obbard DJ, Gordon KHJ, Buck AH, Jiggins FM: The evolution of RNAi as a defence against viruses and transposable elements. Philos Trans R Soc B Biol Sci. 2009, 364: 99-115.

  24. Malone CD, Hannon GJ: Small RNAs as guardians of the genome. Cell. 2009, 136: 656-668.

  25. Ishizu H, Siomi H, Siomi MC: Biology of PIWI-interacting RNAs: new insights into biogenesis and function inside and outside of germlines. Genes Dev. 2012, 26: 2361-2373.

  26. Yoder JA, Walsh CP, Bestor TH: Cytosine methylation and the ecology of intragenomic parasites. Trends Genet TIG. 1997, 13: 335-340.

  27. Feng G, Leem Y-E, Levin HL: Transposon integration enhances expression of stress response genes. Nucleic Acids Res. 2013, 41: 775-789.

  28. Deininger PL, Batzer MA: Mammalian retroelements. Genome Res. 2002, 12: 1455-1465.

  29. Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, Paux E, SanMiguel P, Schulman AH: A unified classification system for eukaryotic transposable elements. Nat Rev Genet. 2007, 8: 973-982.

  30. Kapitonov VV, Jurka J: A universal classification of eukaryotic transposable elements implemented in Repbase. Nat Rev Genet. 2008, 9: 411-412; author reply 414.

  31. Eickbush TH, Jamburuthugoda VK: The diversity of retrotransposons and the properties of their reverse transcriptases. Virus Res. 2008, 134: 221-234.

  32. Llorens C, Muñoz-Pomer A, Bernad L, Botella H, Moya A: Network dynamics of eukaryotic LTR retroelements beyond phylogenetic trees. Biol Direct. 2009, 4: 41.

  33. Magiorkinis G, Gifford RJ, Katzourakis A, De Ranter J, Belshaw R: Env-less endogenous retroviruses are genomic superspreaders. Proc Natl Acad Sci U S A. 2012, 109: 7385-7390.

  34. Havecker ER, Gao X, Voytas DF: The diversity of LTR retrotransposons. Genome Biol. 2004, 5: 225.

  35. De la Chaux N, Wagner A: BEL/Pao retrotransposons in metazoan genomes. BMC Evol Biol. 2011, 11: 154.

  36. Llorens C, Futami R, Covelli L, Domínguez-Escribá L, Viu JM, Tamarit D, Aguilar-Rodríguez J, Vicente-Ripolles M, Fuster G, Bernet GP, Maumus F, Munoz-Pomer A, Sempere JM, Latorre A, Moya A: The Gypsy Database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res. 2011, 39 (Database issue): D70-74.

  37. Lerat E: Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs. Heredity. 2010, 104: 520-533.

  38. Rho M, Schaack S, Gao X, Kim S, Lynch M, Tang H: LTR retroelements in the genome of Daphnia pulex. BMC Genomics. 2010, 11: 425.

  39. Animal Genome Size Database. Home []

  40. Hellsten U, Harland RM, Gilchrist MJ, Hendrix D, Jurka J, Kapitonov V, Ovcharenko I, Putnam NH, Shu S, Taher L, Blitz IL, Blumberg B, Dichmann DS, Dubchak I, Amaya E, Detter JC, Fletcher R, Gerhard DS, Goodstein D, Graves T, Grigoriev IV, Grimwood J, Kawashima T, Lindquist E, Lucas SM, Mead PE, Mitros T, Ogino H, Ohta Y, Poliakov AV, et al: The genome of the Western clawed frog Xenopus tropicalis. Science. 2010, 328: 633-636.

  41. Tan MH, Au KF, Yablonovitch AL, Wills AE, Chuang J, Baker JC, Wong WH, Li JB: RNA sequencing reveals a diverse and dynamic repertoire of the Xenopus tropicalis transcriptome over development. Genome Res. 2013, 23: 201-216.

  42. Copeland CS, Mann VH, Morales ME, Kalinna BH, Brindley PJ: The Sinbad retrotransposon from the genome of the human blood fluke, Schistosoma mansoni, and the distribution of related Pao-like elements. BMC Evol Biol. 2005, 5: 20.

  43. Rohr CJB, Ranson H, Wang X, Besansky NJ: Structure and evolution of mtanga, a retrotransposon actively expressed on the Y chromosome of the African malaria vector Anopheles gambiae. Mol Biol Evol. 2002, 19: 149-162.

  44. Terrat Y, Bonnivard E, Higuet D: GalEa retrotransposons from galatheid squat lobsters (Decapoda, Anomura) define a new clade of Ty1/copia-like elements restricted to aquatic species. Mol Genet Genomics MGG. 2008, 279: 63-73.

  45. Bowen NJ, McDonald JF: Genomic analysis of Caenorhabditis elegans reveals ancient families of retroviral-like elements. Genome Res. 1999, 9: 924-935.

  46. Bae YA, Moon SY, Kong Y, Cho SY, Rhyu MG: CsRn1, a novel active retrotransposon in a parasitic trematode, Clonorchis sinensis, discloses a new phylogenetic clade of Ty3/gypsy-like LTR retrotransposons. Mol Biol Evol. 2001, 18: 1474-1483.

  47. Butler M, Goodwin T, Poulter R: An unusual vertebrate LTR retrotransposon from the cod Gadus morhua. Mol Biol Evol. 2001, 18: 443-447.

  48. Goodwin TJD, Poulter RTM: A group of deuterostome Ty3/gypsy-like retrotransposons with Ty1/copia-like pol-domain orders. Mol Genet Genomics MGG. 2002, 267: 481-491.

  49. Michaille JJ, Mathavan S, Gaillard J, Garel A: The complete sequence of mag, a new retrotransposon in Bombyx mori. Nucleic Acids Res. 1990, 18: 674.

  50. Tubío JMC, Naveira H, Costas J: Structural and evolutionary analyses of the Ty3/gypsy group of LTR retrotransposons in the genome of Anopheles gambiae. Mol Biol Evol. 2005, 22: 29-39.

  51. Bénit L, De Parseval N, Casella JF, Callebaut I, Cordonnier A, Heidmann T: Cloning of a new murine endogenous retrovirus, MuERV-L, with strong similarity to the human HERV-L element and with a gag coding sequence closely related to the Fv1 restriction gene. J Virol. 1997, 71: 5652-5657.

  52. Bénit L, Lallemand JB, Casella JF, Philippe H, Heidmann T: ERV-L elements: a family of endogenous retrovirus-like elements active throughout the evolution of mammals. J Virol. 1999, 73: 3301-3308.

  53. Hart D, Frerichs GN, Rambaut A, Onions DE: Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus. J Virol. 1996, 70: 3606-3616.

  54. Kambol R, Kabat P, Tristem M: Complete nucleotide sequence of an endogenous retrovirus from the amphibian, Xenopus laevis. Virology. 2003, 311: 1-6.

  55. Newport J, Kirschner M: A major developmental transition in early Xenopus embryos: I. characterization and timing of cellular changes at the midblastula stage. Cell. 1982, 30: 675-686.

  56. Brehm A, Tufteland KR, Aasland R, Becker PB: The many colours of chromodomains. BioEssays News Rev Mol Cell Dev Biol. 2004, 26: 133-140.

  57. Kordis D: A genomic perspective on the chromodomain-containing retrotransposons: Chromoviruses. Gene. 2005, 347: 161-173.

  58. Cavalli G, Paro R: Chromo-domain proteins: linking chromatin structure to epigenetic regulation. Curr Opin Cell Biol. 1998, 10: 354-360.

  59. Schüller M, Jenne D, Voltz R: The human PNMA family: novel neuronal proteins implicated in paraneoplastic neurological disease. J Neuroimmunol. 2005, 169: 172-176.

  60. Iwasaki S, Suzuki S, Pelekanos M, Clark H, Ono R, Shaw G, Renfree MB, Kaneko-Ishino T, Ishino F: Identification of a novel PNMA-MS1 gene in marsupials suggests the LTR retrotransposon-derived PNMA genes evolved differently in marsupials and eutherians. DNA Res Int J Rapid Publ Rep Genes Genomes. 2013, 20: 425-436.

  61. Sander TL, Stringer KF, Maki JL, Szauter P, Stone JR, Collins T: The SCAN domain defines a large family of zinc finger transcription factors. Gene. 2003, 310: 29-38.

  62. Dlakić M: Functionally unrelated signalling proteins contain a fold similar to Mg2+-dependent endonucleases. Trends Biochem Sci. 2000, 25: 272-273.

  63. Sitbon E, Pietrokovski S: New types of conserved sequence domains in DNA-binding regions of homing endonucleases. Trends Biochem Sci. 2003, 28: 473-477.

  64. Brown RS: Zinc finger proteins: getting a grip on RNA. Curr Opin Struct Biol. 2005, 15: 94-98.

  65. Laity JH, Lee BM, Wright PE: Zinc finger proteins: new insights into structural and functional diversity. Curr Opin Struct Biol. 2001, 11: 39-46.

  66. Banumathy G, Somaiah N, Zhang R, Tang Y, Hoffmann J, Andrake M, Ceulemans H, Schultz D, Marmorstein R, Adams PD: Human UBN1 is an ortholog of yeast Hpc2p and has an essential role in the HIRA/ASF1a chromatin-remodeling pathway in senescent cells. Mol Cell Biol. 2009, 29: 758-770.

  67. RepeatMasker Open-3.0 - frog [ xenTro ] Genomic Dataset. []

  68. Sun C, Shepard DB, Chong RA, López Arriaza J, Hall K, Castoe TA, Feschotte C, Pollock DD, Mueller RL: LTR retrotransposons contribute to genomic gigantism in plethodontid salamanders. Genome Biol Evol. 2012, 4: 168-183.

  69. Roelants K, Gower DJ, Wilkinson M, Loader SP, Biju SD, Guillaume K, Moriau L, Bossuyt F: Global patterns of diversification in the history of modern amphibians. Proc Natl Acad Sci U S A. 2007, 104: 887-892.

  70. Jordan IK, Matyunina LV, McDonald JF: Evidence for the recent horizontal transfer of long terminal repeat retrotransposon. Proc Natl Acad Sci U S A. 1999, 96: 12621-12625.

  71. Gonzalez P, Lessios HA: Evolution of sea urchin retroviral-like (SURL) elements: evidence from 40 echinoid species. Mol Biol Evol. 1999, 16: 938-952.

  72. Terzian C, Ferraz C, Demaille J, Bucheton A: Evolution of the Gypsy endogenous retrovirus in the Drosophila melanogaster subgroup. Mol Biol Evol. 2000, 17: 908-914.

  73. Vázquez-Manrique RP, Hernández M, Martínez-Sebastián MJ, de Frutos R: Evolution of gypsy endogenous retrovirus in the Drosophila obscura species group. Mol Biol Evol. 2000, 17: 1185-1193.

  74. Schaack S, Gilbert C, Feschotte C: Promiscuous DNA: horizontal transfer of transposable elements and why it matters for eukaryotic evolution. Trends Ecol Evol. 2010, 25: 537-546.

  75. Silva JC, Loreto EL, Clark JB: Factors that affect the horizontal transfer of transposable elements. Curr Issues Mol Biol. 2004, 6: 57-71.

  76. Düşen S, Öz M: Helminths of the marsh frog, Rana ridibunda Pallas, 1771 (Anura: Ranidae), from Antalya Province, Southwestern Turkey. Comp Parasitol. 2006, 73: 121-129.

  77. Düşen S, Öz M: Helminth fauna of the Eurasian marsh frog, Pelophylax ridibundus (Pallas, 1771) (Anura: Ranidae), collected from Denizli Province, Inner-West Anatolia Region, Turkey. Helminthologia. 2013, 50: 57-66.

  78. Shen C-H, Steiner LA: Genome structure and thymic expression of an endogenous retrovirus in zebrafish. J Virol. 2004, 78: 899-911.

  79. Carré-Eusèbe D, Coudouel N, Magre S: OVEX1, a novel chicken endogenous retrovirus with sex-specific and left-right asymmetrical expression in gonads. Retrovirology. 2009, 6: 59.

  80. Faulkner GJ, Kimura Y, Daub CO, Wani S, Plessy C, Irvine KM, Schroder K, Cloonan N, Steptoe AL, Lassmann T, Waki K, Hornig N, Arakawa T, Takahashi H, Kawai J, Forrest ARR, Suzuki H, Hayashizaki Y, Hume DA, Orlando V, Grimmond SM, Carninci P: The regulated retrotransposon transcriptome of mammalian cells. Nat Genet. 2009, 41: 563-571.

  81. Sinzelle L, Carradec Q, Paillard E, Bronchain OJ, Pollet N: Characterization of a Xenopus tropicalis endogenous retrovirus with developmental and stress-dependent expression. J Virol. 2011, 85: 2167-2179.

  82. Dennis S, Sheth U, Feldman JL, English KA, Priess JR: C. elegans germ cells show temperature and age-dependent expression of Cer1, a Gypsy/Ty3-related retrotransposon. PLoS Pathog. 2012, 8: e1002591.

  83. Kimelman D, Kirschner M, Scherson T: The events of the midblastula transition in Xenopus are regulated by changes in the cell cycle. Cell. 1987, 48: 399-407.

  84. Yang J, Tan C, Darken RS, Wilson PA, Klein PS: Beta-catenin/Tcf-regulated transcription prior to the midblastula transition. Dev Camb Engl. 2002, 129: 5743-5752.

  85. Skirkanich J, Luxardi G, Yang J, Kodjabachian L, Klein PS: An essential role for transcription before the MBT in Xenopus laevis. Dev Biol. 2011, 357: 478-491.

  86. Kigami D, Minami N, Takayama H, Imai H: MuERV-L is one of the earliest transcribed genes in mouse one-cell embryos. Biol Reprod. 2003, 68: 651-654.

  87. Peaston AE, Evsikov AV, Graber JH, de Vries WN, Holbrook AE, Solter D, Knowles BB: Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Dev Cell. 2004, 7: 597-606.

  88. Schultz RM: Regulation of zygotic gene activation in the mouse. BioEssays News Rev Mol Cell Dev Biol. 1993, 15: 531-538.

  89. 89.

    Tadros W, Lipshitz HD: The maternal-to-zygotic transition: a play in two acts. Dev Camb Engl. 2009, 136: 3033-3042.

  90.

    Macfarlan TS, Gifford WD, Driscoll S, Lettieri K, Rowe HM, Bonanomi D, Firth A, Singer O, Trono D, Pfaff SL: Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature. 2012, 487: 57-63.

  91.

    Wolf D, Goff SP: Embryonic stem cells use ZFP809 to silence retroviral DNAs. Nature. 2009, 458: 1201-1204.

  92.

    Dunn CA, Romanish MT, Gutierrez LE, van de Lagemaat LN, Mager DL: Transcription of two human genes from a bidirectional endogenous retrovirus promoter. Gene. 2006, 366: 335-342.

  93.

    Wang T, Zeng J, Lowe CB, Sellers RG, Salama SR, Yang M, Burgess SM, Brachmann RK, Haussler D: Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. Proc Natl Acad Sci U S A. 2007, 104: 18613-18618.

  94.

    Pi W, Zhu X, Wu M, Wang Y, Fulzele S, Eroglu A, Ling J, Tuan D: Long-range function of an intergenic retrotransposon. Proc Natl Acad Sci U S A. 2010, 107: 12992-12997.

  95.

    McCue AD, Slotkin RK: Transposable element small RNAs as regulators of gene expression. Trends Genet TIG. 2012, 28: 616-623.

  96.

    Oliver KR, Greene WK: Mobile DNA and the TE-Thrust hypothesis: supporting evidence from the primates. Mob DNA. 2011, 2: 8.

  97.

    Oliver KR, Greene WK: Transposable elements and viruses as factors in adaptation and evolution: an expansion and strengthening of the TE-Thrust hypothesis. Ecol Evol. 2012, 2: 2912-2933.

  98.

    Sandmeyer SB, Clemens KA: Function of a retrotransposon nucleocapsid protein. RNA Biol. 2010, 7: 642-654.

  99.

    Nethe M, Berkhout B, van der Kuyl AC: Retroviral superinfection resistance. Retrovirology. 2005, 2: 52.

  100.

    Weiss RA: On the concept and elucidation of endogenous retroviruses. Philos Trans R Soc Lond B Biol Sci. 2013, 368: 20120494.

  101.

    White JM, Delos SE, Brecher M, Schornberg K: Structures and Mechanisms of Viral Membrane Fusion Proteins. Crit Rev Biochem Mol Biol. 2008, 43: 189-219.

  102.

    Bénit L, Dessen P, Heidmann T: Identification, phylogeny, and evolution of retroviral elements based on their envelope genes. J Virol. 2001, 75: 11709-11719.

  103.

    Varela M, Spencer TE, Palmarini M, Arnaud F: Friendly viruses: the special relationship between endogenous retroviruses and their host. Ann N Y Acad Sci. 2009, 1178: 157-172.

  104.

    Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A: Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 2009, 37: e123.

  105.

    Bowes JB, Snyder KA, Segerdell E, Gibb R, Jarabek C, Noumen E, Pollet N, Vize PD: Xenbase: a Xenopus biology and genomics resource. Nucleic Acids Res. 2008, 36 (Database issue): D761-767.

  106.

    Xie Y, Wu G, Tang J, Luo R, Patterson J, Liu S, Huang W, He G, Gu S, Li S, Zhou X, Lam T-W, Li Y, Xu X, Wong GK-S, Wang J: SOAPdenovo-Trans: De novo transcriptome assembly with short RNA-Seq reads. Bioinformatics. 2014, 30: 1660-1666.

  107.

    Li W, Godzik A: CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinforma Oxf Engl. 2006, 22: 1658-1659.

  108.

    Fu L, Niu B, Zhu Z, Wu S, Li W: CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinforma Oxf Engl. 2012, 28: 3150-3152.

  109.

    McClure MA, Johnson MS, Feng DF, Doolittle RF: Sequence comparisons of retroviral proteins: relative rates of change and general phylogeny. Proc Natl Acad Sci U S A. 1988, 85: 2469-2473.

  110.

    Ellinghaus D, Kurtz S, Willhoeft U: LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 2008, 9: 18.

  111.

    Katoh K, Standley DM: MAFFT Multiple Sequence Alignment Software Version 7: Improvements in performance and usability. Mol Biol Evol. 2013, 30: 772-780.

  112.

    Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O: New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010, 59: 307-321.

  113.

    Marchler-Bauer A, Zheng C, Chitsaz F, Derbyshire MK, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Lu S, Marchler GH, Song JS, Thanki N, Yamashita RA, Zhang D, Bryant SH: CDD: conserved domains and protein three-dimensional structure. Nucleic Acids Res. 2013, 41 (Database issue): D348-352.

  114.

    Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet TIG. 2000, 16: 276-277.

  115.

    Eddy SR: Accelerated Profile HMM Searches. PLoS Comput Biol. 2011, 7: e1002195.

  116.

    Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz H-R, Ceric G, Forslund K, Eddy SR, Sonnhammer ELL, Bateman A: The Pfam protein families database. Nucleic Acids Res. 2008, 36 (Database issue): D281-D288.

  117.

    Langmead B, Salzberg SL: Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012, 9: 357-359.

  118.

    Anders S, Huber W: Differential expression analysis for sequence count data. Genome Biol. 2010, 11: R106.

  119.

    Box GE, Cox DR: An analysis of transformations. J R Stat Soc Ser B Methodol. 1964, 26: 211-252.

  120.

    Shapiro SS, Wilk MB: An analysis of variance test for normality (complete samples). Biometrika. 1965, 52: 591-611.

  121.

    Levene H: Robust tests for equality of variances. Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. 1960, Stanford, CA: Stanford University Press, 278-292.

  122.

    Kruskal WH, Wallis WA: Use of ranks in one-criterion variance analysis. J Am Stat Assoc. 1952, 47: 583-



Acknowledgements

We thank Thomas Uzzell (Philadelphia), Gaston-Denis Guex (Dätwil), and two anonymous reviewers for their constructive criticism of an earlier draft of this paper. This research was supported by the Deutsche Forschungsgemeinschaft (grants PL 213/9-1 and PO 1431/1-1).

Author information



Corresponding author

Correspondence to José Horacio Grau.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

JHG conceived and designed the study. MM, AJP and JP carried out the laboratory procedures. AJP performed mRNA preparation and sequencing. JHG and AJP assembled and analyzed the transcriptome data. JP performed statistical tests for expression patterns, supervised the work, and contributed substantially to drafting the manuscript. All authors participated in preparing the manuscript, and all read and approved the final version.

Electronic supplementary material

Additional file 1: Supplementary methods, figures, and tables. (PDF 3 MB)

Aligned amino acid datasets and phylogenetic trees.

Additional file 2: Amino acid sequence alignments and tree files for the different frog transcriptomes and the Silurana genome. (ZIP 193 KB)

Annotation of LTR RE predicted from the genome of Silurana tropicalis.

Additional file 3: Table listing Pfam-A hits of ORFs contained in LTR RE predictions of the Silurana tropicalis genome. (CSV 365 KB)

Annotation of LTR RE transcripts identified in the frog transcriptomes used in this study.

Additional file 4: Table listing Pfam-A hits of ORFs contained in LTR RE transcripts of several frog transcriptomes. (CSV 22 KB)


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Grau, J.H., Poustka, A.J., Meixner, M. et al. LTR retroelements are intrinsic components of transcriptional networks in frogs. BMC Genomics 15, 626 (2014).



Keywords

  • LTR retroelements
  • Silurana
  • Pelophylax
  • Anura
  • RNAseq
  • Transcriptome
  • Embryogenesis