Genome-wide analysis of alternative splicing of pre-mRNA under salt stress in Arabidopsis
© Ding et al.; licensee BioMed Central Ltd. 2014
Received: 23 March 2014
Accepted: 29 May 2014
Published: 4 June 2014
Skip to main content
© Ding et al.; licensee BioMed Central Ltd. 2014
Received: 23 March 2014
Accepted: 29 May 2014
Published: 4 June 2014
Alternative splicing (AS) of precursor mRNA (pre-mRNA) is an important gene regulation process that potentially regulates many physiological processes in plants, including the response to abiotic stresses such as salt stress.
To analyze global changes in AS under salt stress, we obtained high-coverage (~200 times) RNA sequencing data from Arabidopsis thaliana seedlings that were treated with different concentrations of NaCl. We detected that ~49% of all intron-containing genes were alternatively spliced under salt stress, 10% of which experienced significant differential alternative splicing (DAS). Furthermore, AS increased significantly under salt stress compared with under unstressed conditions. We demonstrated that most DAS genes were not differentially regulated by salt stress, suggesting that AS may represent an independent layer of gene regulation in response to stress. Our analysis of functional categories suggested that DAS genes were associated with specific functional pathways, such as the pathways for the responses to stresses and RNA splicing. We revealed that serine/arginine-rich (SR) splicing factors were frequently and specifically regulated in AS under salt stresses, suggesting a complex loop in AS regulation for stress adaptation. We also showed that alternative splicing site selection (SS) occurred most frequently at 4 nucleotides upstream or downstream of the dominant sites and that exon skipping tended to link with alternative SS.
Our study provided a comprehensive view of AS under salt stress and revealed novel insights into the potential roles of AS in plant response to salt stress.
High salinity in soil is a major environmental condition that adversely affects crop production worldwide. Today, roughly 20% of the world’s cultivated land and nearly half of all irrigated lands are affected by salinity . High concentrations of salt in soil lead to ion imbalances and hyperosmotic stress in plants. Understanding the mechanisms of plant responses to salt stress is fundamentally important to the study of plant biology and also vital to continued development of rational breeding and genetic engineering strategies to improve salt tolerance in crop plants. Plant’s cellular and molecular responses to salt stress have been studied intensively [2, 3]. Among these responses is the dramatic change in the expression of a large number of plant genes, which are regulated at the transcriptional as well as the post-transcriptional levels.
Alternative pre-mRNA splicing (AS) is an important mechanism for regulating gene expression and for increasing transcriptome plasticity and proteome diversity in eukaryotes . AS is involved in many physiological processes in plants, including the response to biotic and abiotic stresses [5–7]. Although AS of some stress-responsive genes has been reported, large-scale or genome-wide studies of AS dynamics under salt stress conditions are still relatively scarce. Based on the data from Sanger sequencing of full-length cDNA libraries from Arabidopsis plants exposed to different stresses such as cold, heat, and salt stress, it was found that the number of AS events under stress conditions (particularly low temperature) was significantly higher than the number under normal conditions . Another study using a whole-genome tiling array in Arabidopsis with various stress treatments identified a group of AS events that were associated with stress-responsive genes and some essential regulatory genes . These and other studies revealed the involvement of AS in response to abiotic stress . The methods used in these studies, Sanger sequencing and tiling array, however suffer from relatively low resolution when compared with the recently developed high-throughput RNA sequencing (RNA-seq) methods. As a result, some AS events, particularly those with lower abundance, may escape detection. A recent study using high-throughput RNA sequencing was conducted with Arabidopsis plants that were exposed to various stresses or were at different developmental stages and time points in the diurnal cycle . That study mainly focused on the complexity of AS rather than on a detailed description of the global changes from AS under salt stress conditions .
To investigate the global dynamics of AS under salt stress, in this study, we used the Illumina HiSeq platform to perform pair-end RNA sequencing with Arabidopsis plants that were exposed to different concentrations of salt and generated ~110 million pair-end reads (101 bp in length). In what follows, we first describe the features of AS under salt stress based on comparative AS analysis. We then report on how the genes with differential AS are well associated with specific functional categories, such as the responses to stresses and RNA splicing. We also suggest that AS could represent a regulatory mechanism independent of the regulation of gene transcriptional activation. Finally, we discuss the change in pre-mRNA splicing patterns of serine/arginine-rich (SR) splicing factors under salt stress.
To identify AS events, we first predicted splice junctions using the software TopHat, which was designed to identify exon-exon splice junctions. We initially obtained 433,475 junctions from the four RNA-seq libraries (Additional file 3). After filtering the splice junctions by two criteria (for details, see Materials and Methods) - an overhang size of more than 20 bp and at least two reads to support the splice junctions - a junction data set of 397,321 confident junctions that we believe to be true splice junctions was obtained. Comparison of the junctions in this junction data set to the gene annotation (TAIR 10) revealed that about 363,383 (91.5%) junctions were previously annotated, and that the remaining 33,938 (8.5%) were novel junctions that had not been annotated in the TAIR10 Database (Additional file 3).
Intron retention was the most prevalent AS event under salt stress, although most intron retentions had relatively low read coverage compared to the read coverage of exons (Figure 2B). This is consistent with the intron-retention background in Arabidopsis that was recently reported . Following intron retention, the alternative 5' and 3' splice sites were relatively prevalent compared with the other types of AS events. Sequence analysis of alternative 5' splice sites (5'SSs) and alternative 3' splice sites (3'SSs) revealed that these activated splice sites were still associated with GU and AG dinucleotides (Figure 2C). Moreover, we found that the occurrence of these alternative 5'SSs and 3'SSs was enriched in the downstream and upstream 4 bp region of the dominant 5'SSs and 3'SSs (Figure 2D), respectively. These features of alternative 5'SSs and 3'SSs are consistent with those found in the human genome . It is noteworthy that when correlating exon skipping events to alternative 5'SSs and 3'SSs, we found that about ~17% of the skipped exons simultaneously had alternative 5'SSs or 3'SSs. This percentage of occurrence was significantly higher than that expected for random sampling of all annotated exons (the probability of random occurrence is 0.02%, Fisher Exact Test, p < 0.001). This result suggests a coordinated occurrence of exon-skipping and alternative splice site selection.
Sequence analysis of these intron-retained transcripts suggested that all of these intron retentions could generate pre-mature stop codons. Therefore, the decrease or increase in intron retention was predicted to increase or decrease the abundance of the functional transcripts, respectively. Since these genes and many other genes with significant intron retentions (Additional files 8 and 12) have been suggested to play roles in stress responses and they are induced by abiotic stresses, an increase in their functional transcript levels (e.g., ATCPK32 and ERD10) is likely to have positive effects on salt tolerance, whereas a decrease in the functional transcript levels of other genes (e.g., ERD14, RD22 and ATGSTF10) could have negative effects on salt tolerance in plants. These results suggested that alteration of AS in stress-responsive genes might impact a plant’s tolerance to salt stress.
We further compared the functional categories of DAS genes with the functional categories of genes without DAS. This comparison clearly revealed that different functional categories are over-represented in both populations (Additional files 13 and 16). Generally, among genes that produce alternative transcripts, several Gene Ontology (GO) categories related to stress, such as ‘response to metal ion’ , ‘response to abiotic stimulus’ , ‘response to cadmium ion’ , ‘RNA splicing’ and ‘RNA processing’ , are over-represented. On the other hand, among genes that are not alternatively spliced, several functions related to house-keeping, such as ‘protein transport’ , ‘DNA repair’ and ‘cell wall organization’ , are over-represented. The presence of genes that undergo AS in the stress-related and RNA processing categories is much higher than the presence of genes not alternatively spliced in these categories, further supporting the notion that stress-related genes are more predisposed to pre-mRNA processing than are genes involved in basic cellular functions (Figure 4E).
Nonetheless, these co-regulated genes account only for a relatively small portion of all DAS or DE genes in Arabidopsis. This finding suggested that AS and gene activation could be separately regulated in response to salt stress. Indeed, analysis of the DAS and DE genes confirmed that the over-represented functional categories differed largely between the two groups, revealing separate regulation of gene expression and AS in response to salt stress (Additional files 19 and 20). For example, some functional categories, e.g., ‘RNA splicing’ and ‘RNA processing’ , were over-represented only among DAS genes, while other categories, such as ‘transcription’ and ‘response to hormone stimulus’ , were found among the DE genes (Figure 5C).
Whereas genes involved in RNA splicing are mostly not regulated by salt stress at the expression level, these genes are frequently alternatively spliced under salt stress. AS of these splicing-related genes could therefore represent an independent means of regulation of genes in response to salt stress. Strikingly, we identified 15 splicing factors with changes in AS under salt stress (Additional file 21). Ten of these splicing factors encoded SR (serine/arginine-rich) proteins. SR splicing factors play key roles in the execution and regulation of pre-mRNA splicing in plants. In Arabidopsis, there are a total of 18 SR proteins [21, 22]. Previous studies suggested that pre-mRNAs of SR protein genes were frequently alternatively spliced under environmental stress, which is thought to alter the splicing of their targets and result in adaptive transcriptome changes in response to environmental conditions [23, 24].
Alternative 3'SS and 5'SS were found in the third intron of AT-SCL30A (AT3G13570) under the 150 mM NaCl treatment, while they were weakly present in the control. This observation was validated by RT-PCR using a forward primer in the third/fourth exon and a reverse one covering the splice junction (Figure 6). Further detailed analysis revealed that both alternative 3'SS and 5'SS actually introduced a novel exon (not annotated in the TAIR10 Arabidopsis genome) that was inserted into the region between the third and fourth exon and generated a novel isoform. Sequence analysis for this isoform suggested that this exon insertion would generate PTCs and thus could encode a truncated protein that was composed of 120 amino acids. Therefore, this exon insertion under salt stress could lead to a decrease in the functional transcripts. Nonetheless, it is unclear whether this novel isoform has any function.
Finally, we identified an alternative 3'SS in the second intron in AT-RSP31, with an increased level under the 300 mM NaCl treatment. This observation was validated by RT-PCR using a forward primer covering the splice junction and a reverse one in the second exon (Figure 6). This alternative 3'SS (not annotated in the TAIR10 Arabidopsis genome) extends the third exon into the next intron and thus generates a larger exon. Sequence analysis for this isoform suggested that this alternative 3'SS would introduce PTCs and thus could encode a truncated protein. It is also unclear whether this truncated protein has any function.
Through comprehensive transcriptome analysis of high-throughput RNA-seq data, in this study, we disclosed features of genome-wide AS in Arabidopsis under salt stress. Our analysis suggests that 49% of the intron-containing genes in Arabidopsis genome are alternatively spliced under salt-stress conditions. Moreover, we found that AS is increased by salt stress and that 10% of the intron-containing genes showed significantly differential AS under salt-stress conditions. The analysis of functional categories demonstrated that genes with differential AS are associated with responses to stress and RNA splicing. Finally, we observed that genes encoding the splicing factors, i.e., SR proteins, are subject to frequent and specific AS under salt stress.
Recent studies using massively parallel RNA sequencing revealed that a large percentage of genes in Arabidopsis undergo AS [10, 11], which potentially could significantly increase the plasticity of the transcriptome and proteome diversity. In this study, we conducted systematic analysis of the transcriptome of salt-treated Arabidopsis plants. Our data revealed that, under salt stress, 49% of the intron-containing genes are alternatively spliced. This number is higher than that reported by Filichkin et al. (42%) , but very close to that reported by Li et al. (48%) , and lower than a recent report that 61% of multi-exonic genes were alternatively spliced, as determined by a normalized cDNA library that facilitated the detection of AS events in low-abundance transcripts . This marked AS under salt stress could provide molecular plasticity for the plants to adapt to stress conditions.
In this study, we found that intron retention and alternative 5’SSs/3’SSs are much more prominent than exon skipping and other types of AS. These observations are consistent with the general view of AS in Arabidopsis reported previously [10, 11]. Importantly, we uncovered two novel features of AS in Arabidopsis. First, alternative 5’SSs/3’SSs tend to occur around the downstream or upstream 4 bp region of the dominant (conical) 5'SS and 3'SS (Figure 2D). A similar AS pattern was also reported in the human genome , suggesting the conservation of this AS pattern in eukaryotes. Second, we found a coordinated occurrence of exon skipping and alternative splice site selection. We thus proposed a model where exon skipping and alternative splice site selection are coupled. We suggest that all the splice sites surrounding the dominant ones have the potential to be used as alternative splice sites. These include the splice sites located at the next or last exon, which would thus cause exon skipping. Previously, exon skipping and alternative splice site selection were usually considered as two independent AS events. Few links between them were previously reported. The discovery of the linkage between these two AS events provides a novel perspective of AS and its regulation.
We found that AS events were obviously increased in Arabidopsis under salt stress. This finding is consistent with some previous studies on AS under environmental stresses . For example, cDNA sequencing results indicated that the number of AS events was significantly higher in Arabidopsis plants exposed to different stresses, particularly low temperature, than in control plants . This increased AS under stress conditions raises an important question on whether the increase is an acclimation response or merely a consequence of splicing errors caused by stress damage. We tend to believe that the increase comes from splicing errors based on the following reasons.
First, in another study on the effects of the depletion and overexpression of one core component of the splicing machinery (SAD1, a Sm-like protein 5) on pre-mRNA splicing and stress tolerance , we found that the increase or decrease of AS in many stress-related genes can be dynamically controlled by the dosage of SAD1; moreover, the increase and decrease in AS are closely linked to the sensitivity and tolerance of the plants to stress, respectively. Therefore, we considered that increased AS could be a result of inaccurate splicing, which could weaken the function of the corresponding genes by decreasing the functional transcripts. In contrast, decreased AS could be an acclimation to stress resistance. Secondly, we did observe a stress-induced deregulation of splicing machinery. In our study, we noticed the down-regulation of U6 snRNAs under salt stress in quantitative RT-PCR assays (Additional file 22). The U6 snRNA is a core component of the spliceosome and is required for its assembly and catalytic activity during pre-mRNA splicing [27, 28]. A decrease in the level of this snRNA would likely compromise the assembly of the spliceosome and its catalytic activity . Thirdly, most stress-induced splicing variants may not be translated into functional proteins. Similarly, some important genes (such as from ERD14, RD22 and ATGSTF10) that are involved in abscisic acid (ABA) or salt stress responses show increased intron retention under salt stress conditions. These transcripts were predicted to generate a pre-mature stop codon that would lead to non-functional mRNAs or proteins, although we currently could not rule out the possibility that some of these truncated proteins may still have certain functions in plant salt tolerance. Thus, we suggest that stress-induced increase in AS could be ascribed to splicing errors or inaccuracies caused by stress.
Nevertheless, if the increase in AS is merely a non-specific consequence of stress damage, a random distribution would be expected among genes. However, our data, along with previous reports, demonstrated that the genes associated with stress response tend to be alternatively spliced under stress conditions (Figure 4B). It is known that salt stress or other abiotic stresses can activate the expression of a large number of plant stress-responsive genes that are not expressed or are expressed at lower levels under normal non-stressful conditions [30, 31]. With the simultaneous production of a large amount of these stress-inducible pre-mRNAs, cells would need to immediately recruit a significant amount of splicing factors and other factors for their co-transcriptional or post-transcriptional processing. This imposes a huge burden on the splicing machinery and, as a result, a significant portion of these transcripts fails to be processed adequately when the splicing machinery is compromised.
The discussion so far covers only to the global changes in AS under salt stress conditions. It should be noted that there are indeed specific cases in which AS plays a functional role in regulating the response and tolerance of plants to stress. Such cases have been described in the last few years [review in . This functional role can also be seen in the splicing of several SR proteins as discussed below.
The AS pattern of several SR proteins has been shown to change obviously under various abiotic stress conditions, including temperature stress, high salinity and high light irradiation [21, 32, 33]. In this study, we identified one- third of the SR genes (six SR genes from four SR families) that showed clear changes in AS under salt stress. This number is higher than that reported before and is probably attributable to the increased sensitivity of the sequencing technology used in the current study.
Interestingly, we clearly identified four SR genes (AT-RS2Z33, AT-RSP40, AT-SCL33 and AT-RSP41) that showed decreased intron retention under salt stress (Figure 6). Sequence analysis revealed that all the splice variants with reduced abundance under salt stress were aberrant transcripts with premature stop codons that may not produce functional proteins. A decrease in these aberrant transcripts and a simultaneous increase in the functional transcripts in these SR genes could be an acclimation response to stress that may subsequently help to sustain a positive feedback loop to increase the splicing efficiency and the production of functional proteins to combat the stress. Consistently, our recent study demonstrated that the mutations of AT-RSP40 or At-RSP41 led to sensitivity to salt stress, which implied the positive role of At-Rsp40 or At-Rsp41 in salt stress tolerance, probably via regulating the pre-mRNA splicing of certain stress tolerance genes . We predict that regulating the expression of some of these SR genes or other splicing factors may increase plant tolerance to salt stress by enhancing the correct splicing of salt tolerance genes. Our recent study  and a few other studies showed that over-expression of certain splicing factors indeed could increase plant tolerance to salt and other stresses [21, 32, 33].
Through analyzing global changes in AS under salt stress, we firstly identified ~49% of all intron-containing genes that were alternatively spliced under salt stress, 10% of which experienced significant differential alternative splicing (DAS). We found that most DAS genes were not differentially regulated by salt stress, suggesting that AS may represent an independent layer of gene regulation in response to stress; DAS genes were associated with specific functional pathways, such as the pathways for the responses to stresses and RNA splicing. Finally, we revealed that serine/arginine-rich (SR) splicing factors were frequently and specifically regulated in AS under salt stresses, suggesting a complex loop in AS regulation for stress adaptation. Therefore, our study provided a comprehensive view of AS under salt stress and revealed novel insights into the potential roles of AS in plant response to salt stress.
Seeds of the Arabidopsis (ecotype C24) plants were surface-sterilized with 50% bleach in 0.01% Triton X-100 and planted on ½ Murashige and Skoog (MS) medium agar plates supplemented with 3% sucrose. After 4-day stratification at 4°C, the plates were placed in a chamber (Model CU36-L5, Percival Scientific, Perry, IA, USA) under 16 h-white light (~75 μmol m−2 s−1) and 8 h-dark conditions at 21 ± 1°C for germination and seedling growth. Twelve days after being incubated at 21 ± 1°C in the chamber, twenty whole seedlings were transferred from the agar plate onto filter paper (Catalog No. 05-714-4, Fisher Scientific, Pittsburgh, PA, USA) saturated with 20 ml of 0, 50, 150, or 300 mM NaCl solution in a 150 × 15 mm petri dish and incubated in the same chamber for 3 h before being harvested and frozen in liquid nitrogen for total RNA extraction .
Using the TRIzol Reagent (Catalog No. 15596–026, Invitrogen), total RNAs were extracted from seedlings without or with salt stress treatment. Polyadenylated RNAs were isolated using the Oligotex mRNA Midi Kit (Catalog No. 70042, Qiagen). The RNA-seq libraries were constructed using the Illumina Whole Transcriptome Analysis Kit following the standard protocol (Illumina, HiSeq system) and sequenced by the Bioscience Core Facility at KAUST on the HiSeq platform to generate high-quality pair-end reads.
TopHat  was used to align the reads against the Arabidopsis genome sequences and annotated gene models downloaded from TAIR10 (http://www.arabidopsis.org/) with default parameters. Meanwhile, TopHat was also used to predict the splice junctions. Based on the gene annotation information, the splice junctions were classified into the known and novel splice junctions. In addition, the expressed gene or transcripts were identified by the Cufflinks software .
In the initial prediction, there were a great number of novel junctions that had short overhangs (i.e., fewer than 20 bp) with the corresponding exons, while most of the annotated junctions had larger overhangs, with the enrichment at ~90 bp (Additional file 23A). Moreover, the novel junctions had relatively low coverage compared to the annotated junctions (Additional file 23B). In general, the junctions with short overhang size and lower coverage were considered as false positives, which are often caused by non-specific or error alignment. Therefore, to distinguish between true splice junctions and false positives, we assessed the criteria based on simulating data on a set of randomly-constituted junctions. To do this, we firstly generated a set of 80,000 splice junctions in which annotated exons from different chromosomes were randomly selected and spliced together in silico, and 119,618 annotated junctions from the gene annotation. Since the length of our sequencing reads was 101 bp, the splice junction sequences were determined to be 180 bp long (90 nt on either side of the splice junction) to ensure a 11 bp overhang of the read mapping from one side of the junction onto the other. Alignments to the random splice junctions were considered to be false positives, because such junctions are thought to rarely exist when compared to annotated junctions. The alignments of the raw RNA-seq reads to the random junctions revealed that 99.9% of false positive junctions had overhang sizes with fewer than 20 bp. In sharp contrast, the alignments of the annotated junctions indicated that most (98.6%) of annotated junctions had larger overhang sizes (Additional file 24A). In addition, we estimated that 56.9% of false positive junctions had only one read spanning the junction, while the annotated junctions had higher read coverage (Additional file 24B). To minimize the false positive rate, we required that the overhang size had to be greater than 20 bp with at least two reads spanning the junctions. Using these criteria, we filtered out almost all false positive junctions (Additional file 24C).
JuncBASE  was used for annotating all AS events, including cassette exons, alternative 5'SSs, alternative 3'SSs, mutually exclusive exons, coordinate cassette exons, alternative first exons, alternative last exons and intron retention, which takes as input the genome coordinates of all annotated exons and all confidently identified splice junctions.
The global comparison of AS among the control (0), 50, 150 or 300 mM NaCl treatments was started by equally and randomly re-sampling uniquely mapped reads to make sure that the comparison was at the same level. The comparison refers to the two facets: the absolute number of each type of AS event and the number of junction reads that was assigned to each type of AS event, because both of them can be used to measure the global changes of AS. Meanwhile, Fisher’s Exact Tests in R (http://www.r-project.org/) were used to identify differential representation of each type of AS event, performed on the number of junction reads that were assigned to each type of AS event.
Fisher’s Exact Tests were also used to identify the differential representation of each AS event. For alternative 5'SSs and 3'SSs and exon skipping events, Fisher’s Exact Tests were performed on the comparison of the junction-read counts and the corresponding exon-read counts between the control and the 50, 150 or 300 mM NaCl treatments. The events with p values less than 0.05 were identified as significantly differential events. In addition, we considered those AS events that were uniquely identified in the control or the 50, 150 or 300 mM NaCl treatments significant if there were at least five junction reads to support and if the p value of these events was assigned to equal zero. Similarly, for intron retention, Fisher’s Exact Tests were performed on the intron-read counts and the corresponding exon-read counts between the control and the 50, 150 or 300 mM NaCl treatments. The events with p value less than 0.001 were identified as significant differential events. In addition, we considered the intron retention events uniquely identified in the control or the 50, 150 or 300 mM NaCl treatments significant if there were at least 5 × sequence coverage to support and if more than 80% of intron regions were covered by intron-reads. The p value of these events was assigned to equal zero.
The selected AS and intron retention events were validated by RT-PCR using a set of primers (Additional file 25) that were designed based on each AS event. Total RNAs from the control, 50, 150 or 300 mM NaCl treated seedlings were extracted as described above using the TRI solution, treated with DNAase I, and reverse-transcribed to cDNA (random priming) by using a standard protocol (SuperScript II reverse transcriptase, Invitrogen).
We deposited the RNA-seq data in the Sequence Read Archive (SRA) database with an accession number SRP035234.
We would like to thank members in the Bioscience Core Facility of KAUST for their help with next generation sequencing. This work was supported by King Abdullah University of Science and Technology (KAUST), Kingdom of Saudi Arabia.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.