Genome-wide determination of poly(A) sites in Medicago truncatula: evolutionary conservation of alternative poly(A) site choice
© Wu et al.; licensee BioMed Central Ltd. 2014
Received: 16 April 2014
Accepted: 15 July 2014
Published: 21 July 2014
Alternative polyadenylation (APA) plays an important role in the post-transcriptional regulation of gene expression. Little is known about how APA sites may evolve in homologous genes in different plant species. To this end, comparative studies of APA sites in different organisms are needed. In this study, a collection of poly(A) sites in Medicago truncatula, a model system for legume plants, has been generated and compared with APA sites in Arabidopsis thaliana.
The poly(A) tags from a deep-sequencing protocol were mapped to the annotated M. truncatula genome, and the identified poly(A) sites used to update the annotations of 14,203 genes. The results show that 64% of M. truncatula genes possess more than one poly(A) site, comparable to the percentages reported for Arabidopsis and rice. In addition, the poly(A) signals associated with M. truncatula genes were similar to those seen in Arabidopsis and other plants. The 3′-UTR lengths are correlated in pairs of orthologous genes between M. truncatula and Arabidopsis. Very little conservation of intronic poly(A) sites was found between Arabidopsis and M. truncatula, which suggests that such sites are likely to be species-specific in plants. In contrast, there is a greater conservation of CDS-localized poly(A) sites in these two species. A sizeable number of M. truncatula antisense poly(A) sites were found. A high percentage of the associated target genes possess Arabidopsis orthologs that are also associated with antisense sites. This is suggestive of important roles for antisense regulation of these target genes.
Our results reveal some distinct patterns of sense and antisense poly(A) sites in Arabidopsis and M. truncatula. In so doing, this study lends insight into general evolutionary trends of alternative polyadenylation in plants.
KeywordsAlternative polyadenylation RNA processing Antisense Evolutionary conservation Legume Medicago truncatula
Polyadenylation is the cleavage in a specific location of the 3′-end of pre-mRNA and the addition of a poly(A) tail to form a mature mRNA. Polyadenylation is a key process during eukaryotic gene expression, playing an important role in mRNA stability, translation and transport . If a gene possesses more than one poly(A) site, then it undergoes alternative polyadenylation (APA). APA leads to the formation of mature mRNAs with different natures. Thus, the selection of poly(A) sites located in protein coding regions or introns may result in different protein products. Even within 3′-UTR, different APA sites may regulate mRNA stability and translatability by altering the ability of the mRNA to be regulated by RNA-binding proteins or microRNAs.
Recent studies have shown extensive networks of potential APA in different species and have linked APA to epigenetic regulation and many biological processes [1, 2]. As many as 33% of the 4057 genes in Chlamydomonas reinhardtii have at least two unique poly(A) sites . In higher plants (Arabidopsis and rice), more than 70% of expressed genes possess more than one poly(A) site [4–7]. In animals, APA affects transcripts from 55% of zebrafish genes , 43% of genes are annotated with more than one 3′-UTR isoform in Caenorhabditis elegans, and almost 70% of known human genes have multiple poly(A) sites .
Many studies have shown the evolution patterns of APA in various organisms, especially mammals. Based on 3′-ESTs with poly(A) tails, Yan et al.  found four distinct classes of patterns of APA in the human, mouse, and rat genomes: tandem poly(A) sites, composite exons, hidden exons, and truncated exons. Also using ESTs, Ara et al.  studied the poly(A) site evolution in mammalian genes in humans and mouse and identified about 4800 conserved poly(A) sites. Galante et al.  generated a catalog of conserved sense-antisense pairs occurring in the human and mouse genomes using ESTs and massively parallel signature sequencing data and suggested that these might be involved in several cellular phenomena. Lee et al.  established a conserved pattern for APA in several vertebrate species, and found that the 3′-ends of mRNAs could be dynamically modified by transposable elements through evolution. Derti et al.  applied a polyA-seq method for high-throughput sequencing of 3′-ends of polyadenylated transcripts to identify genome-wide poly(A) sites in the human, rhesus, dog, mouse, and rat genomes.
All of the above computational analyses mainly focused on the conservation study of APA in mammals. The evolution patterns of polyadenylation in plants are still largely unknown. Medicago truncatula is a model plant for the study of legume biology, the draft sequence of which has been completed . However, the annotation of the genome of M. truncatula remains relatively incomplete, especially when it comes to transcript models. Specifically, there is no collection of poly(A) sites available in M. truncatula. Consequently, the majority of the 3′-UTRs are not fully annotated . Here, using a high-throughput sequencing protocol, we generated a comprehensive and high-resolution map of poly(A) sites utilized in leaf and root tissues of M. truncatula. Our results show that poly(A) signals in M. truncatula are similar to those seen in other plants, and that the scope of possible APA (reflected in the number of M. truncatula genes possessing more than one site) is also similar to that reported for other plants. In addition, our results reveal a rather low extent of evolutionary conservation of APA involving intronic poly(A) sites, and greater conservation of sites situated within protein-coding regions, as well as sites associated with antisense transcripts.
High throughput determination of poly(A) sites in M. truncatula
To determine poly(A) sites encoded in the M. truncatula genome, the poly(A)-tag-seq (PAT-seq) approach described by Wu et al.  was used. This entails the generation of Illumina-compatible short cDNA tags that include the mRNA-poly(A) junction. For this, cDNA was synthesized with RNA isolated from pooled leaves and roots from M. truncatula plants. Reverse transcription reactions used a primer containing (from the 5′ to 3′ direction) sequences compatible with the Illumina high throughput sequencing process, a sample-specific bar code, an oligo-dT tract, and a 3′-terminal two nt anchor to promote priming at the 3′UTR-poly(A) junction. Reverse transcription reactions also included a so-called SMART adapter intended to promote reverse transcriptase-mediated template switching at the 3′ ends of completed cDNAs . The template switching serves to “add” two restriction enzyme sites and a suitable sequence for subsequent PCR amplification of the resulting cDNAs. Subsequently, the cDNA was amplified, digested with one of two restriction enzymes, and the appropriate Illumina-based sequencing adapter appended to the digested molecules by ligation. Tags so prepared were recovered, amplified, and submitted for paired-end sequencing on an Illumina GAIIx instrument.
Statistics of the sequencing data from M. truncatula
Mapped Poly(T) tags2
Paired poly(T) tags3 (internal priming excluded)
Distribution of sense and antisense poly(A) sites
The curated poly(A) tags (PATs) were mapped to the latest annotation of the M. truncatula genome (release Mt4.0v1) and the positions of PATs within annotated regions determined. To do this, the M. truncatula genome annotation was modified so as to better estimate the occurrence of poly(A) sites within putative 3′-UTRs; this was done because the current annotation is largely based on protein-coding predictions and thus likely to lack 3′-UTR information. Thus, genes that had no sequences downstream from the end of the protein-coding region were extended by 400 bp; this value reflects the typical average length of 3′-UTRs in other plant species (289 nts in rice; ) and the observation that an extension of annotated 3′-UTRs by 120 bp improves the representation of poly(A) sites in the Arabidopsis genome . M. truncatula genes that had some possible 3′-UTR sequence were extended by 200 bp, again to improve the “recovery” of PATs that fall within authentic 3′-UTRs. Subsequently, PATs that clustered together (separated by fewer than 24 nts) were grouped into so-called Poly(A) site Clusters, or PACs  to reduce the impact of microheterogeneity on subsequent analyses.
Genomic distribution of the sense PACs in M. truncatula and Arabidopsis
Number of M. truncatula genes with different number of PACs
> = 5
To assess the reliability of these PACs, publically available ESTs were used. Only 2% (5529 of 259,740) of the M. truncatula ESTs in the public collection have a poly(A) tail, thus defining 5529 poly(A) sites. A total of 4302 (78%) of these sites are within 50 nt of one or more PACs that are defined by the PATs. Since there is extensive 3′ end heterogeneity in plant transcription units, and given that the EST-derived poly(A) sites define a single site for each corresponding gene, this spatial correspondence between ESTs and PACs indicate that PACs effectively recover the 3′ ends of cloned cDNAs.
To gauge the similarity with Arabidopsis, the previously described PAT collection [7, 18] was re-analyzed using the same criteria as were used to map the M. truncatula tags. As shown in Table 2, the genomic distributions of sense-oriented PATs and PACs were similar in the two organisms. 10% more PATs mapped to 3′-UTRs in Arabidopsis than in M. truncatula; however, this difference could be traced to a larger number of PACs in M. truncatula that mapped to ambiguous regions and to intergenic regions. This difference probably reflects the state of annotations of the two genomes.
A large percentage of the PACs defined in these analyses are derived from single PATs (Additional file 1: Table S2). For all four classes of sites (3′-UTR, intronic, CDS, and intergenic), the single-nucleotide profiles associated with these sites (Additional file 2: Figure S1) are similar to the profiles obtained when studying all PACs (Figure 2) as well as the profiles obtained when studying PACs defined by more than one PAT (Additional file 2: Figure S2). These results indicate that PACs defined by single PATs are authentic poly(A) sites.
Using this information, the M. truncatula genome annotation was updated to reflect the new information regarding poly(A) sites (and corresponding 3′-UTRs); in so doing, the annotations of some 14,203 genes were updated (Additional file 3).
Distribution of antisense PACs in M. truncatula and Arabidopsis
In Arabidopsis, many more antisense PACs (49% of the total) could be attributed to overlapping genes than was seen in M. truncatula (32%; Table 4). In addition, many fewer “orphan” antisense PACs that could not be attributed to nearby genes were seen in Arabidopsis (14%) than in M. truncatula (31%; Table 4). More generally, there were only 30% as many antisense PACs in M. truncatula as in Arabidopsis. In contrast, the numbers of sense PACs identified in M. truncatula were 80% of those found in Arabidopsis (Table 2).
Evolutionary comparisons of poly(A) sites
Amongst the sets of Arabidopsis and M. truncatula genes with PACs, 10,687 gene pairs consisting of clearly-identifiable orthologs could be identified. Of the 1629 M. truncatula genes with intron-situated PACs, 768 had Arabidopsis orthologs. However, only 221 of the corresponding Arabidopsis orthologs also possessed intron-localized PACs. When 1000 trials were run, each of which consisted of a random selection of 768 genes from the complete set of 10,687 Arabidopsis orthologs, an average of 77 were found to possess intronic PACs. Therefore, the coincidence of M. truncatula and Arabidopsis genes with intronic PACs is significantly different than expected based on the assumption of a random distribution of such genes amongst all genes (χ2 test, P = 2.78e-20).
In the set of 221 intronic PAC-containing orthologous gene pairs, only 57 genes were such that both orthologs had an intronic PAC in the same (orthologous) intron. This result indicates that the location of intronic PACs within transcription units is, for the most part, not conserved in the two species.
For the 2683 M. truncatula genes that possess CDS-localized poly(A) sites, 1810 had identifiable Arabidopsis orthologs, and 661 of these also possessed CDS-localized PACs. When 1000 trials, each consisting of a random selection of 1810 genes from the complete set of 10,687 Arabidopsis orthologs, were performed, an average of 313 genes were found to possess CDS-localized PACs. Therefore, the coincidence of M. truncatula and Arabidopsis genes with CDS PACs is significantly different than expected based on the assumption of a random distribution of such genes amongst all genes (χ2 test, P = 1.15e-38).
2278 (41%) of the 5531 antisense PACs in the M. truncatula database could be attributed to convergent transcription; as suggested by Sherstnev et al. , their assignment as antisense transcripts are probably computational artifacts and not representative of transcripts that might function as antisense RNAs. However, 3253 (59%) of the antisense PACs in M. truncatula are not obviously attributable to overlapping transcription (Cases 3 and 4 in Table 4). These PACs fall within 2397 genes, or 17% of the set of M. truncatula genes defined by the complete collection of PACs. Amongst these 2397 genes, 1715 orthologs can be found in Arabidopsis. For these 1715 orthologs, 1603 (or 93%) were associated with antisense PACs. In contrast, when 1000 trials were run, each consisting of a random selection of 1715 genes from the complete set of expressed Arabidopsis genes, an average of 56% of were associated with antisense PACs. This difference between the random collection and Arabidopsis orthologs of M. truncatula genes subject to antisense transcription is significant (χ2 test, P = 1.98e-138), suggestive of potential significance of these conservation antisense PACs in plants.
An updated picture of the Medicago truncatula genome
As of version Mt4.0v1 of the M. truncatula genome, there are more than 50,000 predicted genes. However, only about 18,000 3′-UTRs are annotated in this version. The data presented in this study permits the addition of some 42,500 poly(A) sites to this annotation, as well as the precise defining of the 3′-UTRs of more than 14,000 M. truncatula genes. In addition, these results imply the existence of as many as 5500 as yet unidentified M. truncatula genes (these being associated with the PACs that fall within putative intergenic regions). Whether or not most or all of these are protein-coding, or instead if a sizeable fraction encodes non-coding RNAs, is a matter for future study.
In most respects, the characteristics of poly(A) sites and their distributions in M. truncatula are similar to those reported for other plants (primarily Arabidopsis and rice). Thus, the percentage of M. truncatula genes that possess multiple poly(A) sites is considerable (64%) and comparable to the percentages reported for Arabidopsis (between 60% and 70%; [5–7]) and rice (between 47% and 82%; ). For the most part, the genomic distributions (falling within 3′-UTRs, introns, etc.) of PACs in M. truncatula are similar to those seen in Arabidopsis (Table 2); the exception to this is the paucity of PACs that map to 5′-UTRs in M. truncatula. This probably reflects the relatively incomplete annotation of the M. truncatula genome, especially when it comes to the identification of confirmed 5′-UTRs.
As shown in Figure 2, poly(A) sites in M. truncatula are associated with the same trends in nucleotide compositions that are seen in other plants [4, 7, 19–21]. Importantly, sites that lie within introns and intergenic regions possess the same signature as sites situated in 3′-UTRs. This indicates that the former sites have a similar tri-partite structure as canonical plant poly(A) sites, a structure that consists of a linear array of (respectively) U-rich region (that may include better-defined submotifs [22, 23]), A-rich element, and U-rich region surrounding the poly(A) site itself [19, 24–26]. These results confirm a general conservation in the poly(A) signal in plants.
As was noted previously in Arabidopsis, M. truncatula poly(A) sites that fall within protein-coding regions have a different sequence profile (Figures 2D, Additional file 2: Figure S1D and S2D). While the nucleotide composition profiles of CDS-localized sites differ substantially from other sites, they are similar to the profile reported for such sites in Arabidopsis. The means by which polyadenylation at these sites is accomplished are not known, nor is the significance of this class of poly(A) site understood. However, the conservation in sequence context and in CDS polyadenylation in orthologous genes in M. truncatula and Arabidopsis, as well as the association of these sites with genes that play roles in stress responses in Arabidopsis (Additional file 1: Table S3; also, ) are collectively suggestive of distinctive modes of action and function for these sites.
There is a sizeable number of M. truncatula PACs that are in an antisense orientation with respect to an annotated gene. A smaller percentage of the M. truncatula antisense PACs occur in convergently-transcribed gene pairs than is seen in Arabidopsis (41% in M. truncatula, vs. 60% in Arabidopsis). However, in terms of total numbers of PATs, the overwhelming majority of antisense polyadenylation occurs in convergently-transcribed gene pairs in both organisms (Table 4). This indicates that, as has been noted before for Arabidopsis[6, 7], most apparent antisense-oriented poly(A) sites are actually associated with nearby genes, and probably not with dedicated, antisense-oriented non-coding RNAs.
On the evolution of poly(A) sites in plants
While these generalities demonstrate a broad conservation in the poly(A) signal and thus mechanism of 3′ end formation in plants, they also raise interesting questions. For example, little is known about how poly(A) sites may evolve in orthologous genes in different plant species. Most PACs (and PATs) in both Arabidopsis and M. truncatula map to established or probable 3′-UTRs (Table 2). The origins of these sites may be via one of two mechanisms – the poly(A) site(s) found in an ancestral gene may be conserved in various lineages, or poly(A) sites may appear and be lost in different lineages. While the latter possibility seems unlikely, the highly degenerate nature of plant poly(A) signals [19, 26] leaves the possibility open. (For example, it has been noted that little more than a general A + U richness may suffice for function as a plant polyadenylation signal [27, 28]. Thus, it is conceivable that modest sequence variation that alters A + U content may be sufficient for the origination of a new polyadenylation signal.) The results presented in this study favor the first model more than the second. Therefore, were poly(A) signals in homologous genes to evolve by random appearance and disappearance, there should be no correlation of the lengths of 3′-UTRs in homologous genes. On the other hand, if an ancestral poly(A) signal is usually retained in various lineages, there should be a correlation in 3′-UTR lengths in homologous genes. The latter is the result that is seen in this study.
Besides 3′-UTRs, poly(A) sites may also occur in other parts of a primary transcript. Intronic poly(A) sites are known to play key roles in Arabidopsis (e.g., [29, 30]) and other plants [31–35]. The results of the M. truncatula – Arabidopsis comparison are curious and contradictory at first glance. On the one hand, there is a modest conservation in terms of the impact of intronic APA on orthologous genes in the two species. However, while being subject to intronic APA is conserved, the position of the intron in affected genes is not. This suggests that the process of intronic APA itself may play roles in gene expression, but that the specific outcomes (which tend to vary for orthologous genes in the two species) are not as important. A similar proposal has been made for alternative splicing outcomes in primates . Thus, although the alternative splicing of transcripts from the DNA polymerase beta locus in primates shows little conservation in terms of specific events, there is nonetheless extensive alternative splicing leading to unproductive transcript isoforms throughout the primate lineage. Moreover, there are indications that these unproductive events have adaptive significance. A similar situation may hold for APA in plants.
While there is a significantly non-random nature to the extent of conservation of intronic APA between M. truncatula and Arabidopsis, a majority of these events in this study are specific for one or the other species. Thus, as is the case in animals [37, 38], this mode of APA may contribute to a proliferation of species-specific variability in gene expression (through the production of altered mRNA and protein isoforms).
Compared with intronic APA, there is a somewhat greater conservation of CDS-localized poly(A) sites between Arabidopsis and M. truncatula, as reflected in the greater numbers of orthologous genes that are affected by CDS APA. The association of CDS APA with Arabidopsis genes involved in stress responses and post-embryonic development (Additional file 1: Table S3; also, ), along with the conservation that is seen in this report, suggests an important role for CDS APA in plants. Because CDS APA removes the translation termination codon from the affected primary transcript, these RNAs and their presumed translation products are likely to be unstable, being rapidly degraded by enzymes associated with non-stop mRNA surveillance [39, 40]. Thus, the most likely function for these mRNAs is a negative one, perhaps serving as part of circuits that fine-tune expression at the mRNA and protein level. The conservation noted in this report suggests that this mode of regulation may be important for gene function.
The bulk of apparent antisense transcription (based on PAT numbers) in M. truncatula derives from convergently-transcribed genes, as is the case in Arabidopsis[6, 7]. However, 59% of the antisense poly(A) sites (as opposed to individual poly(A) tags) in M. truncatula can be dissociated from possible adjacent, convergently-transcribed genes. Interestingly, a high percentage of the associated target genes possess Arabidopsis orthologs that are also associated with antisense PACs. This remarkable extent of evolutionary conservation is suggestive of important roles for antisense regulation of these target genes. While such regulation could involve the induction of siRNAs that act to negatively regulate gene expression (e.g., ), the observation that there is little correlation between the possible targeting of genes by transcripts defined by antisense PACs and actual target mRNA stability  raises the possibility of other modes of regulation. Such modes might include alteration of the chromatin environment surrounding the target. The resolution of these issues awaits further experimental examination.
To summarize, the results presented here reveal the evolutionary conservation in the plant alternative polyadenylation. They indicate a general conservation in the poly(A) signal in plants, and that the conservation of CDS-localized poly(A) sites in these two species is greater than the conservation of intronic sites. Together, our results suggest some distinct sense and antisense evolution patterns of APA in plants.
Plant materials and PAT-seq library generation
Medicago truncatula (Jemalong A17) plants were grown in soil in the greenhouse near the autumn equinox under natural lighting and temperatures between 22-25°C. M. truncatula poly(A) tags were generated using total RNA isolated from the combined leaves and washed roots of 3–4 week-old nodule-free plants, following the PAT-seq protocol described in Wu et al. . There is no ethical issue to state for these plants used in the experiment. The production and dataset of the Arabidopsis poly(A) tags analyzed here has been described elsewhere [7, 18].
Sequencing data retrieval and processing
A previous iterative mapping pipeline for paired end sequences was used to determine poly(A) sites . For Arabidopsis, the latest genome annotation of TAIR10 was used (ftp://ftp.Arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes_transposons.gff). The genome sequences and annotation for M. truncatula annotation were downloaded from ftp://ftp.jcvi.org/pub/data/m_truncatula/Mt4.0/Assembly/. To prevent double counting of poly(A) sites corresponding to multiple transcripts from the same gene, multiple transcripts from the same gene were merged into one unique gene model. Regions that have different annotations (for example, coding region in one transcript, intron in another, due to alternative splicing) in different transcripts encoded by the same gene were denoted as AMB (AMBiguous).
To facilitate the assignments of PATs and PACs to annotated genes, the annotated genes and 3′-UTRs for Arabidopsis and M. truncatula were extended using the following reasoning. As described in a previous report , for Arabidopsis, the annotated 3′-UTRs were extended for 120 nt and genes without annotated 3′-UTR were extended by 338 nt. For M. truncatula, about 75% of PACs located in intergenic regions are situated within 400 nt of an adjacent, properly-oriented protein-coding region. In addition, the median 3′-UTR length for M. truncatula was found to be 180 nts (Figure 4A). Thus, genes with annotated 3′-UTRs were extended for 200 nt and genes without annotated 3′-UTR were extended by 400 nts; these revised annotations were used to assign PATs and PACs to individual genes.
To study tags that map to more than one gene (Additional file 1: Table S1), tags starting with poly(T) were remapped to the genome using Bowtie  with option “-n 1 -l 25 -e 70 -a -m 20”, allowing for multiple occurrences of particular sequences.
Confirmation of poly(A) sites in M. truncatula
EST data was downloaded from the PlantGDB website (http://www.plantgdb.org/download/Download/xGDB/MtGDB/). There are a total of 259,740 ESTs, but only 5497 of them have a poly(A) tail. ESTs with poly(A) tails were extracted and mapped to the M. truncatula genome using GMAP . The cDNA-to-genome mapping results were analyzed to get poly(A) sites as described previously . The correspondence of EST- and PAT- derived poly(A) sites was then assessed, making the assumption that EST-derived sites that fall within 50 nts of a given PAC may serve as authentication of the PAC. This was done because of the low numbers of EST-derived poly(A) sites as well as the realization that most plant genes possess more than one site, and that the occurrence of a nearby EST poly(A) site probably reflects the existence of a cluster of nearby sites.
Analysis of orthologous genes
Orthologous gene pairs between M. truncatula and Arabidopsis were downloaded from the PLAZA website (ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_02_5/GeneFamilies/genefamily_data.orth.csv.gz). A total of 21,976 orthologous gene pairs were obtained, involving 10275 Arabidopsis genes and 6447 M. truncatula genes, respectively. 3410 of the Arabidopsis genes that are in orthologous groups are such that their M. truncatula genes orthologs do not “possess” a PAC.
Availability of supporting data
The M. truncatula dataset supporting the results of this article is available in the NCBI repository with accession number SRA157756 [http://www.ncbi.nlm.nih.gov/sra]. The data compiled and used for the analyses in this article are summarized in the Additional file 4.
This work was supported by a grant from US National Science Foundation [IOS-0817818; to QQL and AGH], and a grant from Ohio Plant Biotech Consortium to QQL. Additional funding supports were also from Xiamen University and The 100-Talent Plan of Fujian Province, China. This project was also funded by the National Natural Science Foundation of China (Nos. 61201358, 61174161), the Natural Science Foundation of Fujian Province of China (No. 2012J01154), and the specialized Research Fund for the Doctoral Program of Higher Education of China (Nos. 20100121120022 and 20120121120038), and the Fundamental Research Funds for the Central Universities in China (Xiamen University: No. 2013121025).
- Xing D, Li QQ: Alternative polyadenylation and gene expression regulation in plants. Wiley Interdiscip Rev: RNA. 2010, 2 (3): 445-458.PubMedView ArticleGoogle Scholar
- Ma L, Guo C, Li QQ: Role of alternative polyadenylation in epigenetic silencing and antisilencing. Proc Natl Acad Sci U S A. 2014, 111 (1): 9-10.PubMed CentralPubMedView ArticleGoogle Scholar
- Shen Y, Liu Y, Liu L, Liang C, Li QQ: Unique features of nuclear mRNA poly(A) signals and alternative polvadenylation in Chlamydomonas reinhardtii. Genetics. 2008, 179 (1): 167-176.PubMed CentralPubMedView ArticleGoogle Scholar
- Shen Y, Ji G, Haas BJ, Wu X, Zheng J, Reese GJ, Li QQ: Genome level analysis of rice mRNA 3′-end processing signals and alternative polyadenylation. Nucleic Acids Res. 2008, 36 (9): 3150-3161.PubMed CentralPubMedView ArticleGoogle Scholar
- Shen Y, Venu RC, Nobuta K, Wu X, Notibala V, Demirci C, Meyers BC, Wang G-L, Ji G, Li QQ: Transcriptome dynamics through alternative polyadenylation in developmental and environmental responses in plants revealed by deep sequencing. Genome Res. 2011, 21 (9): 1478-1486.PubMed CentralPubMedView ArticleGoogle Scholar
- Sherstnev A, Duc C, Cole C, Zacharaki V, Hornyik C, Ozsolak F, Milos PM, Barton GJ, Simpson GG: Direct sequencing of Arabidopsis thaliana RNA reveals patterns of cleavage and polyadenylation. Nat Struct Mol Biol. 2012, 19 (8): 845-852.PubMed CentralPubMedView ArticleGoogle Scholar
- Wu X, Liu M, Downie B, Liang C, Ji G, Li QQ, Hunt AG: Genome-wide landscape of polyadenylation in Arabidopsis provides evidence for extensive alternative polyadenylation. Proc Natl Acad Sci U S A. 2011, 108 (30): 12533-12538.PubMed CentralPubMedView ArticleGoogle Scholar
- Ulitsky I, Shkumatava A, Jan CH, Subtelny AO, Koppstein D, Bell GW, Sive H, Bartel DP: Extensive alternative polyadenylation during zebrafish development. Genome Res. 2012, 22 (10): 2054-2066.PubMed CentralPubMedView ArticleGoogle Scholar
- Mangone M, Manoharan AP, Thierry-Mieg D, Thierry-Mieg J, Han T, Mackowiak SD, Mis E, Zegar C, Gutwein MR, Khivansara V, Attie O, Chen K, Salehi-Ashtiani K, Vidal M, Harkins TT, Bouffard P, Suzuki Y, Sugano S, Kohara Y, Rajewsky N, Piano F, Gunsalus KC, Kim JK: The landscape of C. elegans 3′UTRs. Science. 2010, 329 (5990): 432-435.PubMed CentralPubMedView ArticleGoogle Scholar
- Derti A, Garrett-Engele P, MacIsaac KD, Stevens RC, Sriram S, Chen R, Rohl CA, Johnson JM, Babak T: A quantitative atlas of polyadenylation in five mammals. Genome Res. 2012, 22 (6): 1173-1183.PubMed CentralPubMedView ArticleGoogle Scholar
- Yan J, Marr TG: Computational analysis of 3′-ends of ESTs shows four classes of alternative polyadenylation in human, mouse, and rat. Genome Res. 2005, 15 (3): 369-375.PubMed CentralPubMedView ArticleGoogle Scholar
- Ara T, Lopez F, Ritchie W, Benech P, Gautheret D: Conservation of alternative polyadenylation patterns in mammalian genes. BMC Genomics. 2006, 7 (1): 189-PubMed CentralPubMedView ArticleGoogle Scholar
- Galante PAF, Vidal DO, de Souza JE, Camargo AA, de Souza SJ: Sense-antisense pairs in mammals: functional and evolutionary considerations. Genome Biol. 2007, 8 (3): R40-PubMed CentralPubMedView ArticleGoogle Scholar
- Lee JY, Ji Z, Tian B: Phylogenetic analysis of mRNA polyadenylation sites reveals a role of transposable elements in evolution of the 3′-end of genes. Nucleic Acids Res. 2008, 36 (17): 5581-5590.PubMed CentralPubMedView ArticleGoogle Scholar
- Young ND, Debelle F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KAT, Tang H, Rombauts S, Zhao PX, Zhou P, et al: The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature. 2011, 480 (7378): 520-524.PubMed CentralPubMedGoogle Scholar
- Faghihi MA, Wahlestedt C: Regulatory roles of natural antisense transcripts. Nat Rev Mol Cell Biol. 2009, 10 (9): 637-643.PubMed CentralPubMedView ArticleGoogle Scholar
- Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD: Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques. 2001, 30 (4): 892-897.PubMedGoogle Scholar
- Thomas PE, Wu X, Liu M, Gaffney B, Ji G, Li QQ, Hunt AG: Genome-Wide control of Polyadenylation site choice by CPSF30 in Arabidopsis. Plant Cell. 2012, 24 (11): 4376-4388.PubMed CentralPubMedView ArticleGoogle Scholar
- Graber JH, Cantor CR, Mohr SC, Smith TF: In silico detection of control signals: mRNA 3′-end-processing sequences in diverse species. Proc Natl Acad Sci U S A. 1999, 96 (24): 14055-14060.PubMed CentralPubMedView ArticleGoogle Scholar
- Shepard PJ, Choi E-A, Lu J, Flanagan LA, Hertel KJ, Shi Y: Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA. 2011, 17 (4): 761-772.PubMed CentralPubMedView ArticleGoogle Scholar
- Tian B, Graber JH: Signals for pre-mRNA cleavage and polyadenylation. Wiley Interdiscip Rev: RNA. 2012, 3 (3): 385-396.PubMed CentralPubMedView ArticleGoogle Scholar
- Rothnie HM, Reid J, Hohn T: The contribution of AAUAAA and the upstream element UUUGUA to the efficiency of mRNA 3′-end formation in plants. EMBO J. 1994, 13 (9): 2200-2210.PubMed CentralPubMedGoogle Scholar
- Sanfacon H, Brodmann P, Hohn T: A dissection of the cauliflower mosaic virus polyadenylation signal. Genes Dev. 1991, 5 (1): 141-149.PubMedView ArticleGoogle Scholar
- Hunt AG: Messenger RNA 3′ end formation in plants. Curr Top Microbiol Immunol. 2008, 326: 151-177.PubMedGoogle Scholar
- Hunt AG: RNA regulatory elements and polyadenylation in plants. Front Plant Sci. 2011, 2: 109-PubMed CentralPubMedGoogle Scholar
- Loke JC, Stahlberg EA, Strenski DG, Haas BJ, Wood PC, Li QQ: Compilation of mRNA polyadenylation signals in Arabidopsis revealed a new signal element and potential secondary structures. Plant Physiol. 2005, 138 (3): 1457-1468.PubMed CentralPubMedView ArticleGoogle Scholar
- Diehn SH, Chiu WL, De Rocher EJ, Green PJ: Premature polyadenylation at multiple sites within a Bacillus thuringiensis toxin gene-coding region. Plant Physiol. 1998, 117 (4): 1433-1443.PubMed CentralPubMedView ArticleGoogle Scholar
- Luehrsen KR, Walbot V: Intron creation and polyadenylation in maize are directed by AU-rich RNA. Genes Dev. 1994, 8 (9): 1117-1130.PubMedView ArticleGoogle Scholar
- Hornyik C, Terzi LC, Simpson GG: The spen family protein FPA controls alternative cleavage and polyadenylation of RNA. Dev Cell. 2010, 18 (2): 203-213.PubMedView ArticleGoogle Scholar
- Simpson GG, Dijkwel PP, Quesada V, Henderson I, Dean C: FY is an RNA 3′ end-processing factor that interacts with FCA to control the Arabidopsis floral transition. Cell. 2003, 113 (6): 777-787.PubMedView ArticleGoogle Scholar
- Giranton JL, Ariza MJ, Dumas C, Cock JM, Gaude T: The S locus receptor kinase gene encodes a soluble glycoprotein corresponding to the SKR extracellular domain in Brassica oleracea. Plant J. 1995, 8 (6): 827-834.PubMedView ArticleGoogle Scholar
- Tantikanjana T, Nasrallah ME, Stein JC, Chen CH, Nasrallah JB: An alternative transcript of the S locus glycoprotein gene in a class II pollen-recessive self-incompatibility haplotype of Brassica oleracea encodes a membrane-anchored protein. Plant Cell. 1993, 5 (6): 657-666.PubMed CentralPubMedView ArticleGoogle Scholar
- Bassett CL, Artlip TS, Callahan AM: Characterization of the peach homologue of the ethylene receptor, PpETR1, reveals some unusual features regarding transcript processing. Planta. 2002, 215 (4): 679-688.PubMedView ArticleGoogle Scholar
- Ishikawa T, Yoshimura K, Tamoi M, Takeda T, Shigeoka S: Alternative mRNA splicing of 3′-terminal exons generates ascorbate peroxidase isoenzymes in spinach (Spinacia oleracea) chloroplasts. Biochem J. 1997, 328 (Pt 3): 795-800.PubMed CentralPubMedView ArticleGoogle Scholar
- Tang G, Zhu X, Gakiere B, Levanony H, Kahana A, Galili G: The bifunctional LKR/SDH locus of plants also encodes a highly active monofunctional lysine-ketoglutarate reductase using a polyadenylation signal located within an intron. Plant Physiol. 2002, 130 (1): 147-154.PubMed CentralPubMedView ArticleGoogle Scholar
- Skandalis A, Frampton M, Seger J, Richards MH: The adaptive significance of unproductive alternative splicing in primates. RNA. 2010, 16 (10): 2014-2022.PubMed CentralPubMedView ArticleGoogle Scholar
- Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, Slobodeniuc V, Kutter C, Watt S, Colak R, Kim T, Misquitta-Ali CM, Wilson MD, Kim PM, Odom DT, Frey BJ, Blencowe BJ: The evolutionary landscape of alternative splicing in vertebrate species. Science. 2012, 338 (6114): 1587-1593.PubMedView ArticleGoogle Scholar
- Merkin J, Russell C, Chen P, Burge CB: Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science. 2012, 338 (6114): 1593-1599.PubMed CentralPubMedView ArticleGoogle Scholar
- Graille M, Seraphin B: Surveillance pathways rescuing eukaryotic ribosomes lost in translation. Nat Rev Mol Cell Biol. 2012, 13 (11): 727-735.PubMedView ArticleGoogle Scholar
- Vasudevan S, Peltz SW, Wilusz CJ: Non-stop decay–a new mRNA surveillance pathway. Bioessays. 2002, 24 (9): 785-788.PubMedView ArticleGoogle Scholar
- Zhang X, Lii Y, Wu Z, Polishko A, Zhang H, Chinnusamy V, Lonardi S, Zhu JK, Liu R, Jin H: Mechanisms of small RNA generation from cis-NATs in response to environmental and developmental cues. Mol Plant. 2013, 6 (3): 704-715.PubMed CentralPubMedView ArticleGoogle Scholar
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10 (3): R25-PubMed CentralPubMedView ArticleGoogle Scholar
- Wu TD, Watanabe CK: GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics. 2005, 21 (9): 1859-1875.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.