The transcriptome of Leishmania major in the axenic promastigote stage: transcript annotation and relative expression levels by RNA-seq
© Rastrojo et al.; licensee BioMed Central Ltd. 2013
Received: 4 September 2012
Accepted: 25 February 2013
Published: 4 April 2013
Although the genome sequence of the protozoan parasite Leishmania major was determined several years ago, the knowledge of its transcriptome was incomplete, both regarding the real number of genes and their primary structure.
Here, we describe the first comprehensive transcriptome analysis of a parasite from the genus Leishmania. Using high-throughput RNA sequencing (RNA-seq), a total of 10285 transcripts were identified, of which 1884 were considered novel, as they did not match previously annotated genes. In addition, our data indicate that current annotations should be modified for many of the genes. The detailed analysis of the transcript processing sites revealed extensive heterogeneity in the spliced leader (SL) and polyadenylation addition sites. As a result, around 50% of the genes presented multiple transcripts differing in the length of the UTRs, sometimes in the order of hundreds of nucleotides. This transcript heterogeneity could provide an additional source for regulation as the different sizes of UTRs could modify RNA stability and/or influence the efficiency of RNA translation. In addition, for the first time for the Leishmania major promastigote stage, we are providing relative expression transcript levels.
This study provides a concise view of the global transcriptome of the L. major promastigote stage and a basis for future comparative analyses with other developmental stages or other Leishmania species.
Species of the genus Leishmania are protozoan parasites and aetiological agents of a spectrum of clinical diseases, known as leishmaniases, ranging from disfiguring skin lesions to life-threatening visceral infection. The World Health Organization (WHO) estimates that 350 million people worldwide are at risk of infection, and this disease is considered a major public health problem. Two million new cases of leishmaniasis (1.5 million for cutaneous forms and 500 000 for visceral leishmaniasis) occur annually. The genus Leishmania belongs to the order Trypanosomatida, which also includes, among others, Trypanosoma brucei and Trypanosoma cruzi, causative agents of two other important human infectious diseases: sleeping sickness and Chagas disease, respectively. The evolutionary origin of these organisms is found in the deepest roots of the eukaryotic tree, and they are characterized by markedly original molecular features.
In 1999, the complete sequence of chromosome 1 of Leishmania major was published and showed a remarkable feature of the gene organization in Leishmania, i.e. genes are grouped in large clusters sharing the same transcriptional direction. Thus, from the left end of chromosome 1, the first 29 genes are all located on the same DNA strand, whereas the remaining 50 genes are located on the other strand. When transcriptional activity was examined by nuclear run-on analyses using single-stranded DNA probes, the protein-coding strand was found to be more strongly transcribed than the non-coding strand in the majority of the chromosome 1 genes. Furthermore, it was found that the RNA polymerase initiates transcription within the strand-switch region of chromosome 1. Similarly, in chromosome 3, which contains two convergent clusters of 67 and 30 genes, nuclear run-on analyses indicated that transcription initiates upstream of the most-5’ gene of the two long polycistronic clusters. After whole genome sequences for Leishmania and other trypanosomatids (i.e. T. brucei and T. cruzi) were completed, it was confirmed that in these organisms most genes are organized into large clusters on the same DNA strand.
Another remarkable molecular feature found in trypanosomatids is that transcription initiation by RNA polymerase II (RNAP II) is not regulated on a per gene basis; instead, most genes are transcribed polycistronically. Genome-wide chromatin immunoprecipitation analysis of L. major promastigotes showed acetylated histone H3 peaks at the 5' ends of all polycistronic protein-coding gene clusters, indicating that global regulation of transcription initiation may be achieved by epigenetic regulation of H3 acetylation at the origins of polycistronic transcription units. In a recent publication, the J base (a modification of thymine, which is introduced with some frequency in the DNA of trypanosomatids) was shown to define the RNAP II transcription termination sites in L. major and L. tarentolae.
In contrast to operons in bacteria, polycistronic units in trypanosomatids require processing before translation, and the mature mRNAs are processed from primary transcripts by coupled trans-splicing and polyadenylation. During trans-splicing, a conserved spliced leader RNA (SL RNA or mini-exon) is added to the 5’ end of all mRNAs, providing the cap structure for translation. The differential expression of mature mRNAs from a single polycistronic unit is thought to be achieved by post-transcriptional control, i.e. mRNA levels are regulated by RNA stability and/or differential translation [10–12].
In 2005, the sequence for the 36 chromosomes of the L. major genome (32.8 Mb) was published, and provided a framework for future comparative genomic studies. Using bioinformatic analyses, 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes were predicted. Within the latter group, only 36% can be assigned to a putative function based on sequence conservation with proteins characterized in other eukaryotic organisms. Most L. major genes have orthologs in the T. brucei and T. cruzi genomes. However, more than 60% of the predicted genes remain annotated as hypothetical. A major challenge lies ahead to discover whether or not these genes are expressed at any moment in the life cycle and, therefore, may be catalogued as functional genes. On the other hand, both known and putative genes lack annotated 5’ and 3’ untranslated regions (UTRs), and these regions have been experimentally determined for only a few genes. In Leishmania and related trypanosomatids, these flanking regions (largely the 3’-UTRs) have been implicated in regulating the steady-state level and translational status of specific mRNAs along the cell cycle and in the different life cycle stages [10–12].
Recent advances in sequencing technologies, known as deep sequencing or next-generation sequencing (NGS), are becoming invaluable tools for, among other applications, reconstructing the entire transcriptome of a given organism [16, 17]. In this study, we employed the power of NGS on RNA analysis (RNA-seq) to provide a comprehensive characterization of the poly-A transcriptome for the promastigote stage of Leishmania major. A total of 10285 transcripts were identified, of which 1884 did not match previously annotated genes and therefore were categorized as novel genes. In addition, the RNA-seq analysis generated valuable information on both the relative abundance of the RNAs and the structures of their corresponding genes (i.e. ORFs, and 5’- and 3’-UTRs).
Leishmania culture and RNA isolation
Promastigotes of L. major Friedlin strain (MHOM/IL/80/Friedlin; clone V1) were cultured at 26°C in RPMI medium supplemented with 10% fetal bovine serum, 100 U/mL penicillin G and 0.1 mg/mL streptomycin sulphate. Cultures were seeded at 1 × 10⁶ cells/mL, and promastigotes were collected for RNA isolation when the culture density reached 6.1 × 10⁶ cells/mL (mid-logarithmic phase of growth). Total RNA was isolated using the Aurum™ Total RNA Mini Kit (Biorad), and treated with RNase-free DNase I. RNA samples were quantified by absorbance at 260 nm using the Nanodrop ND-1000 (Thermo Scientific); all samples showed an A260/A280 ratio higher than 2.0. In addition, RNA integrity was checked in a bioanalyzer (Agilent 2100).
RNA-seq and data processing
RNA-seq was performed at the Massive Sequencing Platform of Cantoblanco (CSIC-PCM, Madrid, Spain). Standard libraries for massive sequencing were generated using the TruSeq RNA Sample Prep Kit (Illumina). Briefly, poly-A+ RNA was selected by oligo-dT chromatography, and RNA fragmentation was achieved using divalent cations under elevated temperature. Afterwards, these fragments were used to generate a cDNA library, and cDNA fragments of about 300-400 bp were selected on an agarose gel. Two cDNA libraries were constructed: first strand synthesis of one of them was initiated with only random hexadeoxynucleotide primers (Illumina standard protocol); for the first strand synthesis of the second library, however, we added the 5’-T15VN-3’ oligonucleotide to the random hexamer primers present in the kit. Afterwards, the second strand of the cDNA was synthesized. The cDNA ends were repaired and adenylated, and adapters were subsequently added at both ends. Finally, the library was enriched in ligated fragments by limited PCR amplification. Sequencing was carried out in a GAIIx Illumina system. Each library was sequenced in two separate lanes. Single reads of 75 nucleotides were obtained, and raw reads were subjected to quality filtering using the standard Illumina process and analyzed using the FASTQC tool. Reads were mapped to the latest assembled version of the L. major genome, obtained from the Sanger Institute (ftp://ftp.sanger.ac.uk/pub/pathogens/Leishmania/major/V6_211210/), using Bowtie. In the alignment of reads, a maximum of three mismatches was allowed within the whole read (the aligner's -v mode). In order to select the best alignment in terms of number of mismatches, the option “--best” was used. Also, the option “-k 1” was selected, i.e. if in the course of the search Bowtie found 2 (or more) possible alignments for a given read, the program selected one of the alignments at random.
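The alignment settings described above can be gathered into a single Bowtie 1 invocation. The following is a minimal sketch, not the command actually used in this study: only the -v 3, --best and -k 1 options come from the text, whereas the FASTQ input flag, the SAM output flag, the --un option for retaining unaligned reads, and all file names are assumptions made for illustration.

```python
from typing import List


def build_bowtie_command(index_prefix: str, reads_fastq: str,
                         sam_out: str, unaligned_fastq: str) -> List[str]:
    """Assemble a Bowtie 1 argument list mirroring the options named in the text."""
    return [
        "bowtie",
        "-q",                     # reads are in FASTQ format (assumption)
        "-v", "3",                # at most three mismatches across the whole read
        "--best",                 # prefer the alignment with the fewest mismatches
        "-k", "1",                # report a single alignment per read
        "--un", unaligned_fastq,  # keep unaligned reads for later searches (assumption)
        "-S",                     # write SAM output (assumption)
        index_prefix,             # hypothetical prebuilt genome index
        reads_fastq,
        sam_out,
    ]


if __name__ == "__main__":
    # All file names below are hypothetical.
    cmd = build_bowtie_command("LmajorV6", "library1.fastq",
                               "library1.sam", "library1.unaligned.fastq")
    print(" ".join(cmd))
    # To run it (requires Bowtie 1 and a prebuilt index):
    # import subprocess; subprocess.run(cmd, check=True)
```

Keeping the unaligned reads in a separate file is convenient here because the searches for spliced leader and poly(A) sequences start from the non-aligned reads.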
We analyzed different alignment conditions in terms of multi-hits in order to obtain the most accurate results from our data. When up to 10 multi-hits were allowed for a single read, the main differences with the transcripts assembled with no multi-hit restriction were found at tandemly repeated gene regions. In those regions the assembled transcripts were reduced to the UTRs, losing the coding regions. Therefore, no restriction in the number of multi-hits was introduced, except for SL-containing reads, for which reads mapping to more than 10 sites were excluded from further analysis. Finally, mapped reads were assembled into transcripts using Cufflinks.
Identification of trans-splicing and polyadenylation sites
Among the non-aligned reads, a search for reads containing 8 (or more) nucleotides identical to the 3’-end of the SL sequence (AACTAACGCT ATATAAGTAT CAGTTTCTGT ACTTTATTG) was performed using a custom Perl script. No mismatches were allowed. Afterwards, the SL-matching nucleotides were stripped from the reads and the remaining sequence was used to map the position of the trans-splicing site in the reference genome. Similarly, reads spanning potential polyadenylation sites were extracted from the non-aligned sequences by an in-house Perl script, which finds reads with A-runs (longer than 5 nucleotides) located at an end of the read sequence. These reads were mapped back to the reference genome.
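The two searches described above can be sketched as follows. This is an illustrative Python reimplementation under stated assumptions, not the Perl scripts used in the study: the function names are hypothetical, the SL match is assumed to sit at the 5’ end of the read, and only A-runs at the 3’ end of the read are considered (reads from the opposite strand, which would begin with a T-run, are not handled).

```python
from typing import Optional, Tuple

# 39-nt spliced leader (mini-exon) sequence quoted in the text, spaces removed.
SL = "AACTAACGCTATATAAGTATCAGTTTCTGTACTTTATTG"


def trim_spliced_leader(read: str, min_match: int = 8) -> Optional[Tuple[int, str]]:
    """Return (matched_length, remainder) if the read starts with the last
    `min_match` or more nucleotides of the SL; otherwise return None.
    No mismatches are allowed, and the longest match is preferred."""
    for k in range(min(len(SL), len(read)), min_match - 1, -1):
        if read.startswith(SL[-k:]):
            return k, read[k:]
    return None


def trim_poly_a(read: str, min_run: int = 6) -> Optional[Tuple[str, int]]:
    """Return (remainder, run_length) if the read ends with a run of at least
    `min_run` adenines (i.e. longer than 5); otherwise return None."""
    stripped = read.rstrip("A")
    run = len(read) - len(stripped)
    if run >= min_run:
        return stripped, run
    return None
```

In both cases the returned remainder is the portion of the read that would then be aligned to the reference genome to locate the processing site.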
Additional sequencing analysis tools
Samtools software was used to interconvert alignment formats, and a local version of the Blastx program was used to assign the annotated genes to the transcripts generated from the sequencing data. The IGV browser was used for visualization of mapped reads and of the assembled transcripts in their genomic context. Consensus sequences were analyzed using a local version of the WebLogo tool. BLAST searches for sequence homologies were performed in the following databases: GeneDB, TritrypDB and GenBank at the NCBI.
Results and discussion
RNA-seq data and delineation of transcripts
In order to further delineate Leishmania transcripts, we took advantage of the expected addition of the 39-nucleotide long mini-exon sequence at the 5’-end of all Leishmania mRNAs [32, 33]. Thus, we searched among the non-aligned reads (628 765; 4.29% of total reads) for sequences containing at the 5’-end eight (or more) nucleotides identical to the 3’-end of the mini-exon sequence. A total of 188 398 sequence reads met these criteria. After trimming the mini-exon sequences, these reads were aligned to the L. major genome (Figure 1B) and, as a result, 22 592 different mini-exon addition sites were defined.
Interestingly, only 44 of the 188 398 reads containing SL sequences were mapped in antisense orientation (relative to the coding strand), suggesting that trans-splicing occurs almost exclusively in sense transcripts and that antisense transcripts (if produced at meaningful levels) are not processed by the addition of mini-exon sequences. In a recently published work, the authors describe the role that base J plays in termination of RNAP II transcription in L. tarentolae, mentioning that the vast majority of SL-containing reads were restricted to the coding strand. Proper transcription termination and avoidance of readthrough of transcriptional stops seem to be vital for Leishmania.
As illustrated in Figure 1, most of the SL-containing reads mapped at expected locations, i.e. upstream of annotated genes, and a significant number of reads were found for each putative splicing acceptor site (considering both main and alternative sites). However, exceptions to this rule were also found. Occasionally, single reads containing SL sequences were mapped at unexpected positions, such as coding sequences or 3’-UTRs. Furthermore, the position of those reads was not accompanied by a drop in read density, as occurs for the rest of SL addition sites. A plausible interpretation for these findings is that the trans-splicing machinery generates a low, but detectable, number of events in which the mini-exon is misplaced. With this in mind, we excluded from the transcript-defining process those mini-exon addition sites that were defined by a single read and located at unexpected positions.
Table 1. Transcriptome of Leishmania major promastigotes (table body not preserved; surviving labels: Number of transcripts; Mis-annotated genes (*)).
With the sole exception of genes LmjF.02.0400, LmjF.09.0690, LmjF27.0280, LmjF33.1760, LmjF35.2600 and LmjF35.2610, transcripts were found for all the genes currently annotated in the GeneDB database. These six genes code for hypothetical proteins, but at least gene LmjF35.2610 seems to encode a protein, since the predicted amino acid sequence contains a region with similarity to ubiquitin and also an AT hook, a DNA-binding motif; furthermore, the gene is present in other Leishmania species. Thus, the lack of expression of these genes, and in particular of LmjF35.2610, in L. major promastigotes is a finding that merits further study.
Interestingly, 1884 new transcripts were found spanning genomic regions lacking annotated genes; hence, they were categorized as non-annotated genes (Table 1). These findings suggest that the gene content of L. major would be approximately 20% higher than previously believed. Similar results have been reported after determining the T. brucei transcriptome by RNA-seq. Nevertheless, it is likely that many of these new transcripts may have roles other than protein-coding function; some may even be merely processing products resulting from the unusual polycistronic gene organization and processing of the Leishmania genome. In this regard, non-coding transcripts, derived from intercoding regions of T. brucei VSG genes, were found to be trans-spliced, polyadenylated and present in polyribosomes. Therefore, the new transcripts described in this work might be considered non-coding (nc) RNAs until shown to be otherwise.
On the other hand, for 94 annotated genes, alternative splice addition sites were mapped into the ORF, suggesting that different proteins might be generated from a single gene. In this regard, there is a documented case of alternative trans-splicing in the T. cruzi LYT1 gene, in which the different maturation of the mRNA leads to the expression of protein isoforms showing different compartmental and functional properties. Overall, our transcriptomic study has uncovered that the current annotation of the L. major genome had clear limitations that are corrected by the data reported in this work.
Determination of RNA levels from RNA-seq data
RNA-seq is an accurate method for quantifying transcript levels. The strength of this method is that it produces digital counts of transcript abundance, in contrast to the analog-style signals obtained from fluorescent dye-based microarrays. This technique has been validated by several studies and found to be highly reproducible, with very little technical variability, and it can measure mRNA levels over several orders of magnitude [37, 38]. A useful parameter is FPKM (fragments, or reads, per kilobase of transcript per million mapped reads), which reflects the abundance of a transcript in the sample by normalizing for RNA length and for the total read number in the measurement. Thus, the presence and abundance of a given RNA can be calculated and subsequently compared with the amount in any other sequenced sample, now or in the future.
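The normalization behind FPKM can be illustrated with a short calculation. This is a minimal sketch of the plain definition given above (reads per kilobase of transcript per million mapped reads); transcript assemblers such as Cufflinks estimate abundances with a more elaborate statistical model, so the function below, whose name and arguments are hypothetical, should not be read as the procedure used to obtain the values reported here.

```python
def fpkm(read_count: float, transcript_length_bp: float, total_mapped_reads: float) -> float:
    """Reads (or fragments) per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("transcript length and total mapped reads must be positive")
    per_kilobase = read_count / (transcript_length_bp / 1e3)  # normalize for RNA length
    return per_kilobase / (total_mapped_reads / 1e6)          # normalize for sequencing depth


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2 kb transcript, 10 million mapped reads.
    print(fpkm(500, 2000, 10_000_000))
```

Because both the transcript length and the sequencing depth are divided out, a long transcript and a short one with the same FPKM are present at comparable molar abundance, which is what makes values from different samples comparable.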
Table 2. The 50 most abundant transcripts in L. major promastigotes
FPKM | Annotation
1357.39 ± 5.12 | heat-shock protein (HSP70; gene HSP70-II)
- | ribosomal protein L30
987.24 ± 4.45 | heat-shock protein hsp70 (HSP70; gene HSP70-I)
952.68 ± 4.37 | inosine-guanosine transporter (NT2)
809.07 ± 7.12 | hypothetical protein, conserved
792.33 ± 8.16 | ribosomal protein S29
780.12 ± 5.99 | kinetoplastid membrane protein-11 (KMP11)
674.85 ± 5.26 | -
672.00 ± 5.81 | ribosomal protein L18a
666.85 ± 8.31 | -
617.33 ± 7.36 | ribosomal protein L23
616.48 ± 5.07 | hypothetical protein, conserved
603.26 ± 4.69 | kinetoplastid membrane protein-11 (KMP11)
596.61 ± 6.27 | ribosomal protein S29
560.05 ± 5.48 | -
539.44 ± 5.07 | -
524.19 ± 10.51 | -
514.13 ± 5.67 | ribosomal protein L31
496.01 ± 5.02 | ribosomal protein S12
493.33 ± 8.18 | ribosomal protein L23
490.94 ± 5.15 | -
483.03 ± 7.04 | ribosomal protein L27A/L29
482.87 ± 5.89 | ribosomal protein L9
464.76 ± 5.1 | ribosomal protein L32
452.92 ± 3.18 | -
451.18 ± 3.65 | calpain-like cysteine peptidase
448.56 ± 5.89 | ribosomal protein L15
446.58 ± 5.12 | ribosomal protein S3A
446.57 ± 9.14 | ribosomal protein L36
436.47 ± 3.61 | -
427.63 ± 4.95 | ribosomal protein L27A/L29
426.82 ± 4.82 | activated protein kinase c receptor
425.12 ± 4.7 | hypothetical protein, conserved
424.16 ± 3.22 | small myristoylated protein 4
419.44 ± 8.52 | hypothetical protein, conserved
414.41 ± 4.15 | activated protein kinase c receptor
414.01 ± 3.8 | -
411.95 ± 4.52 | ribosomal protein S3A
410.71 ± 3.38 | nucleoside transporter 1
409.67 ± 3.11 | hypothetical predicted multi-pass transmembrane protein
406.24 ± 5.44 | ribosomal protein L31
403.61 ± 4.43 | ribosomal protein S3A
403.46 ± 3.63 | amastin-like surface protein
403.05 ± 3.84 | -
403.05 ± 6.46 | ribosomal protein L26
403.04 ± 3.84 | -
399.2 ± 3.82 | -
399.06 ± 3.82 | -
396.93 ± 7.5 | ribosomal protein L44
396.86 ± 3.81 | -
On the other hand, five transcripts (LmjF.19.T0983, LmjF.20.T1285, LmjF.31.T0964, LmjF.31.T0895, and LmjF.35.T4191), among the most abundant in L. major promastigotes (Table 2), do not contain previously annotated genes. The sequence of transcript LmjF.19.T0983 was found to be conserved in the genomes of different Leishmania species (L. braziliensis, L. donovani, L. mexicana and L. infantum) but conserved sequences were not detected in the genomes of related trypanosomatids (i.e. T. brucei and T. cruzi). Interestingly, a cDNA (named DRS-2) derived from this transcript was previously described in L. major as an mRNA whose expression increases during metacyclogenesis. Similarly, sequences homologous to transcript LmjF.31.T0964 were found in the genomes of all Leishmania species sequenced to date, but were absent in the genus Trypanosoma. The sequences of transcripts LmjF.20.T1285, LmjF.31.T0895 and LmjF.35.T4191 were found to be well conserved in the genomes of L. donovani, L. mexicana and L. infantum, but seemed to be absent from the L. braziliensis genome. It is clear that a challenge for the future will be to understand the nature (coding or not) of these transcripts, and certainly of the additional 1879 new transcripts that have been described in this work (Table 1).
On the other hand, our analysis revealed a negligible level of single nucleotide polymorphisms (SNPs) in the assembled transcripts relative to the reference genome; this is a surprising finding, taking into account that Leishmania is an aneuploid organism in which disomic and trisomic chromosomes are more frequently observed than monosomic ones [47, 48]. Similarly, this very low rate of heterozygosity was noted when sequencing the L. major genome and, more recently, when Rogers and co-workers re-sequenced the L. major genome using the Illumina methodology.
Heterogeneity of trans-splicing and polyadenylation sites
Heterogeneity in the polyadenylation sites in the Leishmania transcripts was also observed; however, the number of reads denoting polyadenylation events (7894 reads) was lower than the number mapped at the 5’ end (see above), despite the fact that we prepared a second library in which an oligo-dT primer was included in the cDNA reaction (see Methods section). Difficulties in the identification of polyadenylation sites were also experienced by other authors. Recently, P.J. Myler and coworkers have deposited in the TriTrypDB database a large number of SL and polyadenylation sites for L. major; these new data further illustrate the complexity of trans-splicing and polyadenylation site selection in Leishmania. A comparative study between our data and those from Myler’s laboratory is underway. Nevertheless, some conclusions may be drawn from the analysis of those reads mapping at the polyadenylation sites derived from our data. Polyadenylation sites were categorized as main (3178 different sites) and alternative (1238 sites). A compositional analysis of the regions surrounding the polyadenylation sites for both categories is shown in Figure 4. Searching for a possible consensus sequence, we followed the consensus criteria defined by Cavener: a consensus status is assigned to a single base when the frequency of a nucleotide at a certain position is greater than 50% and greater than twice the relative frequency of the second most frequent nucleotide; a pair of bases is assigned co-consensus status if the sum of the relative frequencies of the two nucleotides exceeds 75%.
The application of this rule leads to a very short consensus for the polyadenylation addition site, which may be defined as (C/G)AA; the noteworthy differences between main and alternative sites were: i) C is more frequent in the main sites (40.62%) than in the alternative ones (38.77%); ii) the frequencies for A residues at positions 2 and 3 are higher in the main sites (88.92 and 59.1%, respectively) than those found in the alternative sites (83.84 and 51.93%, respectively). An unresolved question related to the polyadenylation consensus sequence is whether polyadenylation occurs before or after the AA dinucleotide. Although our data cannot elucidate this question, it is likely that the adenosines of the consensus sequence are encoded residues, as it is well known that poly(A) polymerases prefer an initial adenosine residue for attachment of the poly(A) tail, and therefore the selection of the polyadenylation site would be strengthened by the presence of adenosine residues.
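The Cavener criteria applied above can be written as a small decision rule. The following is a minimal sketch with a hypothetical function name; returning "N" when neither criterion is met is a convention chosen here for illustration, and in the usage example only the values 0.4062, 0.8892 and 0.591 come from the text, while all other frequencies are invented to complete the distributions.

```python
from typing import Dict


def cavener_consensus(freqs: Dict[str, float]) -> str:
    """Apply the Cavener criteria to the nucleotide frequencies at one position.

    Single-base consensus: top frequency > 50% and > twice the second one.
    Co-consensus: the two most frequent bases sum to more than 75%.
    Otherwise "N" (no consensus; a convention assumed for this sketch).
    """
    ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    (base1, f1), (base2, f2) = ranked[0], ranked[1]
    if f1 > 0.5 and f1 > 2 * f2:
        return base1
    if f1 + f2 > 0.75:
        return "(" + "/".join(sorted((base1, base2))) + ")"
    return "N"


if __name__ == "__main__":
    # Hypothetical per-position frequencies for main polyadenylation sites.
    positions = [
        {"C": 0.4062, "G": 0.36, "A": 0.13, "T": 0.1038},
        {"A": 0.8892, "C": 0.05, "G": 0.04, "T": 0.0208},
        {"A": 0.591, "T": 0.2, "C": 0.109, "G": 0.1},
    ]
    print("".join(cavener_consensus(p) for p in positions))
```

Note that the single-base test is evaluated first, so a position where one nucleotide clearly dominates is never reported as a pair.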
Sequencing and annotation of the genomes for some Leishmania species [13, 54] have constituted an important milestone for the study of many biological aspects of this group of parasites. The availability of these genome sequences [25, 26] now enables database mining and identification of different protein sets in Leishmania. This information provided new approaches to study the pattern of gene expression during differentiation and development by the use of DNA microarrays. In current genome databases, the Leishmania genes lack defined 5’ and 3’ UTRs; however, it should be noted that P.J. Myler and coworkers have recently incorporated SL and polyA sites for most genes of L. major in the TriTrypDB database. The RNA-seq study described here represents the first annotation of the L. major transcriptome in which the genes have been delimited in their translated and untranslated regions. As a result, we have uncovered many cases of mis-annotated genes and, more importantly, we have found 1884 new genes (previously non-annotated) in the promastigote stage. In addition, we have determined relative expression levels for each one of the 10285 transcripts detected in L. major promastigotes. In summary, the data generated by this study constitute a framework for future analyses aimed at determining differential gene expression either along the life cycle or among different Leishmania species.
We are extremely grateful to Dr Julie Sheldon for English style corrections and critical reading of the manuscript. This work was funded by Ministerio de Ciencia y Tecnología [BFU2009-08986 to J.M.R., BFU 2008-03126 to B.A.], Comunidad Autonoma de Madrid (S2010/BMD-2361 to J.M.R.), and the Fondo de Investigaciones Sanitarias [ISCIII-RETIC RD06/0021/0008-FEDER to J.M.R. and R.M.R]. A.R. held a postgraduate fellowship (FPU) from the Ministerio de Educación y Ciencia. Also, an institutional grant from Fundación Ramón Areces is acknowledged.
- Desjeux P: Leishmaniasis: current situation and new perspectives. Comp Immunol Microbiol Infect Dis. 2004, 27: 305-318. 10.1016/j.cimid.2004.03.004.View ArticlePubMedGoogle Scholar
- Moreira D, Lopez-Garcia P, Vickerman K: An updated view of kinetoplastid phylogeny using environmental sequences and a closer outgroup: proposal for a new classification of the class Kinetoplastea. Int J Syst Evol Microbiol. 2004, 54: 1861-1875. 10.1099/ijs.0.63081-0.View ArticlePubMedGoogle Scholar
- Baldauf SL: The deep roots of eukaryotes. Science. 2003, 300: 1703-1706. 10.1126/science.1085544.View ArticlePubMedGoogle Scholar
- Myler PJ, Audleman L, DeVos T, Hixson G, Kiser P, Lemley C, Magness C, Rickel E, Sisk E, Sunkin S: Leishmania major Friedlin chromosome 1 has an unusual distribution of protein-coding genes. Proc Natl Acad Sci U S A. 1999, 96: 2902-2906. 10.1073/pnas.96.6.2902.PubMed CentralView ArticlePubMedGoogle Scholar
- Martinez-Calvillo S, Yan S, Nguyen D, Fox M, Stuart K, Myler PJ: Transcription of Leishmania major Friedlin chromosome 1 initiates in both directions within a single region. Mol Cell. 2003, 11: 1291-1299. 10.1016/S1097-2765(03)00143-6.View ArticlePubMedGoogle Scholar
- Martinez-Calvillo S, Nguyen D, Stuart K, Myler PJ: Transcription initiation and termination on Leishmania major chromosome 3. Eukaryot Cell. 2004, 3: 506-517. 10.1128/EC.3.2.506-517.2004.PubMed CentralView ArticlePubMedGoogle Scholar
- Thomas S, Green A, Sturm NR, Campbell DA, Myler PJ: Histone acetylations mark origins of polycistronic transcription in Leishmania major. BMC Genomics. 2009, 10: 152-10.1186/1471-2164-10-152.PubMed CentralView ArticlePubMedGoogle Scholar
- van Luenen HGAM, Farris C, Jan S, Genest PA, Tripathi P, Velds A, Kerkhoven RM, Nieuwland M, Haydock A, Ramasamy G: Glucosylated hydroxymethyluracil, DNA base j, prevents transcriptional readthrough in Leishmania. Cell. 2012, 150: 909-921. 10.1016/j.cell.2012.07.030.PubMed CentralView ArticlePubMedGoogle Scholar
- LeBowitz JH, Smith HQ, Rusche L, Beverley SM: Coupling of poly(A) site selection and trans-splicing in Leishmania. Genes Dev. 1993, 7: 996-1007. 10.1101/gad.7.6.996.View ArticlePubMedGoogle Scholar
- Fernandez-Moya SM, Estevez AM: Posttranscriptional control and the role of RNA-binding proteins in gene regulation in trypanosomatid protozoan parasites. Wiley Interdiscip Rev RNA. 2010, 1: 34-46.PubMedGoogle Scholar
- Requena JM: Lights and shadows on gene organization and regulation of gene expression in Leishmania. Front Biosci. 2011, 17: 2069-2085.View ArticleGoogle Scholar
- Kramer S: Developmental regulation of gene expression in the absence of transcriptional control: the case of kinetoplastids. Mol Biochem Parasitol. 2012, 181: 61-72. 10.1016/j.molbiopara.2011.10.002.View ArticlePubMedGoogle Scholar
- Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M, Sisk E, Rajandream MA, Adlem E, Aert R: The Genome of the Kinetoplastid Parasite, Leishmania major. Science. 2005, 309: 436-442. 10.1126/science.1112680.PubMed CentralView ArticlePubMedGoogle Scholar
- El-Sayed NM, Myler PJ, Blandin G, Berriman M, Crabtree J, Aggarwal G, Caler E, Renauld H, Worthey EA, Hertz-Fowler C: Comparative genomics of trypanosomatid parasitic protozoa. Science. 2005, 309: 404-409. 10.1126/science.1112181.View ArticlePubMedGoogle Scholar
- Coulson RM, Connor V, Ajioka JW: Using 3' untranslated sequences to identify differentially expressed genes in Leishmania. Gene. 1997, 196: 159-164. 10.1016/S0378-1119(97)00221-7.View ArticlePubMedGoogle Scholar
- Martin JA, Wang Z: Next-generation transcriptome assembly. Nat Rev Genet. 2011, 12: 671-682. 10.1038/nrg3068.View ArticlePubMedGoogle Scholar
- Siegel TN, Gunasekera K, Cross GAM, Ochsenreiter T: Gene expression in Trypanosoma brucei: lessons from high-throughput RNA sequencing. Trends Parasitol. 2011, 27: 434-441. 10.1016/j.pt.2011.05.006.PubMed CentralView ArticlePubMedGoogle Scholar
- FASTQC: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/,
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10: R25-10.1186/gb-2009-10-3-r25.PubMed CentralView ArticlePubMedGoogle Scholar
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L: Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010, 28: 511-515. 10.1038/nbt.1621.PubMed CentralView ArticlePubMedGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009, 25: 2078-2079. 10.1093/bioinformatics/btp352.PubMed CentralView ArticlePubMedGoogle Scholar
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinformatics. 2009, 10: 421-10.1186/1471-2105-10-421.PubMed CentralView ArticlePubMedGoogle Scholar
- Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP: Integrative genomics viewer. Nat Biotechnol. 2011, 29: 24-26. 10.1038/nbt.1754.PubMed CentralView ArticlePubMedGoogle Scholar
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14: 1188-1190. 10.1101/gr.849004.PubMed CentralView ArticlePubMedGoogle Scholar
- GeneDB: http://www.genedb.org,
- TritrypDB: http://www.tritrypdb.org,
- NCBI: http://www.ncbi.nlm.nih.gov,
- Jager AV, De Gaudenzi JG, Cassola A, D'Orso I, Frasch AC: mRNA maturation by two-step trans-splicing/polyadenylation processing in trypanosomes. Proc Natl Acad Sci U S A. 2007, 104: 2035-2042. 10.1073/pnas.0611125104.
- Kapler GM, Beverley SM: Transcriptional mapping of the amplified region encoding the dihydrofolate reductase-thymidylate synthase of Leishmania major reveals a high density of transcripts, including overlapping and antisense RNAs. Mol Cell Biol. 1989, 9: 3959-3972.
- Soto M, Requena JM, Alonso C: Isolation, characterization and analysis of the expression of the Leishmania ribosomal PO protein genes. Mol Biochem Parasitol. 1993, 61: 265-274. 10.1016/0166-6851(93)90072-6.
- Monnerat S, Martinez-Calvillo S, Worthey E, Myler PJ, Stuart KD, Fasel N: Genomic organization and gene expression in a chromosomal region of Leishmania major. Mol Biochem Parasitol. 2004, 134: 233-243. 10.1016/j.molbiopara.2003.12.004.
- Agabian N: Trans splicing of nuclear pre-mRNAs. Cell. 1990, 61: 1157-1160. 10.1016/0092-8674(90)90674-4.
- Agami R, Shapira M: Nucleotide sequence of the spliced leader RNA gene from Leishmania mexicana amazonensis. Nucleic Acids Res. 1992, 20: 1804.
- Kolev NG, Franklin JB, Carmi S, Shi H, Michaeli S, Tschudi C: The transcriptome of the human pathogen Trypanosoma brucei at single-nucleotide resolution. PLoS Pathog. 2010, 6: e1001090. 10.1371/journal.ppat.1001090.
- Aline RF, Scholler JK, Stuart K: Transcripts from the co-transposed segment of variant surface glycoprotein genes are in Trypanosoma brucei polyribosomes. Mol Biochem Parasitol. 1989, 32: 169-178. 10.1016/0166-6851(89)90068-6.
- Benabdellah K, Gonzalez-Rey E, Gonzalez A: Alternative trans-splicing of the Trypanosoma cruzi LYT1 gene transcript results in compartmental and functional switch for the encoded protein. Mol Microbiol. 2007, 65: 1559-1567. 10.1111/j.1365-2958.2007.05892.x.
- Agarwal A, Koppstein D, Rozowsky J, Sboner A, Habegger L, Hillier LW, Sasidharan R, Reinke V, Waterston RH, Gerstein M: Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays. BMC Genomics. 2010, 11: 383. 10.1186/1471-2164-11-383.
- Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008, 18: 1509-1517. 10.1101/gr.079558.108.
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008, 5: 621-628. 10.1038/nmeth.1226.
- Brandau S, Dresel A, Clos J: High constitutive levels of heat-shock proteins in human-pathogenic parasites of the genus Leishmania. Biochem J. 1995, 310: 225-232.
- Jardim A, Funk V, Caprioli RM, Olafson RW: Isolation and structural characterization of the Leishmania donovani kinetoplastid membrane protein-11, a major immunoreactive membrane glycoprotein. Biochem J. 1995, 305: 307-313.
- Leifso K, Cohen-Freue G, Dogra N, Murray A, McMaster WR: Genomic and proteomic expression analysis of Leishmania promastigote and amastigote life stages: the Leishmania genome is constitutively expressed. Mol Biochem Parasitol. 2007, 152: 35-46. 10.1016/j.molbiopara.2006.11.009.
- Guerfali FZ, Laouini D, Guizani-Tabbane L, Ottones F, Ben-Aissa K, Benkahla A, Manchon L, Piquemal D, Smandi S, Mghirbi O: Simultaneous gene expression profiling in human macrophages infected with Leishmania major parasites using SAGE. BMC Genomics. 2008, 9: 238. 10.1186/1471-2164-9-238.
- Mougneau E, Altare F, Wakil AE, Zheng S, Coppola T, Wang ZE, Waldmann R, Locksley RM, Glaichenhaus N: Expression cloning of a protective Leishmania antigen. Science. 1995, 268: 563-566. 10.1126/science.7725103.
- Folgueira C, Cañavate C, Chicharro C, Requena JM: Genomic organization and expression of the HSP70 locus in New and Old World Leishmania species. Parasitology. 2007, 134: 369-377. 10.1017/S0031182006001570.
- Quijada L, Soto M, Alonso C, Requena JM: Analysis of post-transcriptional regulation operating on transcription products of the tandemly linked Leishmania infantum hsp70 genes. J Biol Chem. 1997, 272: 4493-4499. 10.1074/jbc.272.7.4493.
- Sterkers Y, Lachaud L, Crobu L, Bastien P, Pages M: FISH analysis reveals aneuploidy and continual generation of chromosomal mosaicism in Leishmania major. Cell Microbiol. 2011, 13: 274-283. 10.1111/j.1462-5822.2010.01534.x.
- Rogers MB, Hilley JD, Dickens NJ, Wilkes J, Bates PA, Depledge DP, Harris D, Her Y, Herzyk P, Imamura H: Chromosome and gene copy number variation allow major structural change between species and strains of Leishmania. Genome Res. 2011, 21: 2129-2142. 10.1101/gr.122945.111.
- Siegel TN, Hekstra DR, Wang X, Dewell S, Cross GAM: Genome-wide analysis of mRNA abundance in two life-cycle stages of Trypanosoma brucei and identification of splicing and polyadenylation sites. Nucleic Acids Res. 2010, 38: 4946-4957. 10.1093/nar/gkq237.
- Nilsson D, Gunasekera K, Mani J, Osteras M, Farinelli L, Baerlocher L, Roditi I, Ochsenreiter T: Spliced leader trapping reveals widespread alternative splicing patterns in the highly dynamic transcriptome of Trypanosoma brucei. PLoS Pathog. 2010, 6: e1001037. 10.1371/journal.ppat.1001037.
- Requena JM, Quijada L, Soto M, Alonso C: Conserved nucleotides surrounding the trans-splicing acceptor site and the translation initiation codon in Leishmania genes. Exp Parasitol. 2003, 103: 78-81. 10.1016/S0014-4894(03)00061-4.
- Cavener DR: Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates. Nucleic Acids Res. 1987, 15: 1353-1361. 10.1093/nar/15.4.1353.
- Wahle E, Keller W: The biochemistry of 3'-end cleavage and polyadenylation of messenger RNA precursors. Annu Rev Biochem. 1992, 61: 419-440. 10.1146/annurev.bi.61.070192.002223.
- Peacock CS, Seeger K, Harris D, Murphy L, Ruiz JC, Quail MA, Peters N, Adlem E, Tivey A, Aslett M: Comparative genomic analysis of three Leishmania species that cause diverse human disease. Nat Genet. 2007, 39: 839-847. 10.1038/ng2053.
- Cohen-Freue G, Holzer TR, Forney JD, McMaster WR: Global gene expression in Leishmania. Int J Parasitol. 2007, 37: 1077-1086. 10.1016/j.ijpara.2007.04.011.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.