Genome-wide analysis of alternative splicing in Volvox carteri

Background Alternative splicing is an essential mechanism for increasing transcriptome and proteome diversity in eukaryotes. Particularly in multicellular eukaryotes, this mechanism is involved in the regulation of developmental and physiological processes such as growth, differentiation and signal transduction.
Results Here we report the genome-wide analysis of alternative splicing in the multicellular green alga Volvox carteri. The bioinformatic analysis of 132,038 expressed sequence tags (ESTs) identified 580 alternative splicing events in a total of 426 genes. The predominant type of alternative splicing in Volvox is intron retention (46.5%), followed by alternative 5′ (17.9%) and 3′ (21.9%) splice sites and exon skipping (9.5%). Our analysis shows that in Volvox at least ~2.9% of the intron-containing genes are subject to alternative splicing. Considering the total number of sequenced ESTs, the Volvox genome seems to provide more favorable conditions (e.g., regarding length and GC content of introns) for the occurrence of alternative splicing than the genome of its close unicellular relative Chlamydomonas. Moreover, many randomly chosen alternatively spliced genes of Volvox do not show alternative splicing in Chlamydomonas. Since the Volvox genome contains about the same number of protein-coding genes as the Chlamydomonas genome (~14,500 protein-coding genes), we assumed that alternative splicing may play a key role in the generation of genomic diversity, which is required to evolve from a simple one-cell ancestor to a multicellular organism with differentiated cell types (Mol Biol Evol 31:1402-1413, 2014). To confirm the alternative splicing events identified by bioinformatic analysis, we selected several genes with different types of alternative splicing and verified the predicted splice variants experimentally by RT-PCR.
Conclusions The results show that our approach for prediction of alternative splicing events in Volvox was accurate and reliable. Moreover, quantitative real-time RT-PCR appears to be useful in Volvox for analyses of relationships between the appearance of specific alternative splicing variants and different kinds of physiological, metabolic and developmental processes as well as responses to environmental changes.
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-1117) contains supplementary material, which is available to authorized users.
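
The summary figures reported in the abstract can be cross-checked with simple arithmetic. The following is only an illustrative sketch: the function and variable names are hypothetical, and the per-type event counts are approximations back-calculated from the rounded percentages, not values taken from the study's data.

```python
# Figures as reported in the abstract.
TOTAL_EVENTS = 580   # alternative splicing events
TOTAL_GENES = 426    # genes showing alternative splicing
EVENT_SHARE_PERCENT = {
    "intron retention": 46.5,
    "alternative 5' splice site": 17.9,
    "alternative 3' splice site": 21.9,
    "exon skipping": 9.5,
}


def summarize(total_events, total_genes, shares):
    """Derive approximate counts and totals from the reported percentages."""
    # Approximate counts only: the percentages are rounded in the text.
    approx_counts = {
        name: round(total_events * pct / 100) for name, pct in shares.items()
    }
    listed = round(sum(shares.values()), 1)
    return {
        "approx_counts": approx_counts,
        "listed_percent": listed,
        # Share of events not covered by the four listed types.
        "unlisted_percent": round(100 - listed, 1),
        "events_per_gene": round(total_events / total_genes, 2),
    }


if __name__ == "__main__":
    for key, value in summarize(
        TOTAL_EVENTS, TOTAL_GENES, EVENT_SHARE_PERCENT
    ).items():
        print(key, value)
```

The four listed event types account for 95.8% of the events, so the remaining share belongs to event types not itemized in the abstract.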


Supplemental Figures
Supplemental Figures S1 to S10 show protein sequence alignments (part A) and schematic representations (part B) of alternative splice variants. The deduced amino acid sequences were aligned and coloured using GeneDoc 2.6 [3]. The upper sequence represents the amino acid sequence of the first alternative splice variant (V1), while the lower sequence represents the amino acid sequence of the second alternative splice variant (V2). Identical amino acids are shown by white letters on a black background. Amino acids whose similarity is conserved at more than 80% are shown by white letters on a grey background. The positions of the introns are marked by red arrowheads. In (A), the numbers indicate the amino acid position. In (B), the scale bar indicates the length in amino acid residues.
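
The shading scheme described above can be illustrated with a minimal sketch for a two-sequence alignment. This is not GeneDoc's implementation: the similarity groups below are hypothetical and chosen only for illustration, and for two sequences the ">80% similarity" rule is assumed to reduce to both residues falling into the same group.

```python
# Hypothetical similarity groups, for illustration only; GeneDoc's
# actual grouping and scoring may differ.
SIMILARITY_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def classify_column(a, b):
    """Classify one alignment column of the V1/V2 pair."""
    if a == "-" or b == "-":
        return "gap"
    if a == b:
        return "identical"   # white letters on black background
    if any(a in group and b in group for group in SIMILARITY_GROUPS):
        return "similar"     # white letters on grey background
    return "different"       # no shading


def shade_alignment(v1, v2):
    """Return one shading class per column of two aligned sequences."""
    if len(v1) != len(v2):
        raise ValueError("aligned sequences must have equal length")
    return [classify_column(a, b) for a, b in zip(v1.upper(), v2.upper())]


if __name__ == "__main__":
    # Toy sequences, not taken from the figures.
    print(shade_alignment("MKVAD", "MRI-W"))
```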

Figure S1: Comparison of protein isoforms encoded by clpr2
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (clpr2V1 and clpr2V2) of clpr2. The clp protease domain of clpr2V1, which is 181 amino acid residues in length, is shortened to 136 amino acid residues in clpr2V2. The conserved residues F100, N111, Y119 and L120, which are involved in the α/β-type fold of the protein [6], are highlighted by red circles. A thick orange line and an orange box indicate the clp protease domain in (A) and (B), respectively.

Figure S2: Comparison of protein isoforms encoded by efg8
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (efg8V1 and efg8V2) of efg8. efg8V1 possesses three elongation factor domains, namely EF-TU, EFTU-II and EFTU-III. The EF-TU and EFTU-II domains are shortened in efg8V2. Furthermore, two consensus sequences, namely Asn-Lys-x-Asp (residues 192 to 195 of efg8V1) and Ser-Ala-Leu/Lys (residues 230 to 232 of efg8V1), which are important for guanine nucleotide binding [7], are absent from efg8V2. The EF-TU, EFTU-II and EFTU-III domains are highlighted in orange, purple and red, respectively, in both (A) and (B).

Figure S3: Comparison of protein isoforms encoded by hyd2
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (hyd2V1 and hyd2V2) of hyd2. Alternative splicing results in an N-terminally truncated Fe-only hydrogenase domain in hyd2V2. This second protein variant is 169 amino acid residues shorter than hyd2V1. The amino acid residues 90 to 97 and 130 to 135, which have been shown to form the β sheets around the active site [9], are highlighted by red boxes. Three motifs of the Fe-only hydrogenase active site, namely motif 1 (PMFTSCCPxW, amino acid residues 169 to 178), motif 2 (MPCxxKxxExxR, amino acid residues 228 to 239) and motif 3 (FxExMACxGxCV, amino acid residues 415 to 426) [10], are highlighted by yellow boxes. A thick orange line and an orange box highlight the Fe-only hydrogenase domain in (A) and (B), respectively.

Figure S4: Comparison of protein isoforms encoded by lsg2
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (lsg2V1 and lsg2V2) of lsg2. The peptidase M11 domain of lsg2V2 is 103 amino acid residues shorter than that of lsg2V1. The zinc and calcium binding sites of the peptidase domain are highlighted by yellow boxes [11]. The peptidase M11 domain is shown by a thick orange line and an orange box in (A) and (B), respectively.

Figure S5: Comparison of protein isoforms encoded by mgmt
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (mgmtV1 and mgmtV2) of mgmt. The DNA binding domain is shortened in mgmtV2. Two amino acid residues (Y44 and R57), which are important for DNA binding [12,13], are highlighted by red circles. A thick orange line and an orange box represent the DNA binding domain in (A) and (B), respectively.

Figure S10: Comparison of protein isoforms encoded by ppi1
Protein sequence alignment (A) and schematic representation (B) of the deduced amino acid sequences of two alternative splice variants (ppi1V1 and ppi1V2) of ppi1. The C-terminally truncated ppi1V2 lacks 23 amino acid residues of the PPI-Ypil (Protein Phosphatase Type 1, heat-stable inhibitor of type 1 protein phosphatase) domain [14]. The PPI-Ypil domain is shown by a thick orange line and an orange box in (A) and (B), respectively.