De novo transcriptome assembly reveals sex-specific selection acting on evolving neo-sex chromosomes in Drosophila miranda
© Kaiser and Bachtrog; licensee BioMed Central Ltd. 2014
Received: 11 December 2013
Accepted: 11 March 2014
Published: 27 March 2014
The Drosophila miranda neo-sex chromosome system is a useful resource for studying recently evolved sex chromosomes. However, the neo-Y genomic assembly is fragmented due to the accumulation of repetitive sequence. Furthermore, the separate assembly of the neo-X and neo-Y chromosomes into genomic scaffolds has proven to be difficult, due to their low level of sequence divergence, which in coding regions is about 1.5%. Here, we de novo assemble the transcriptome of D. miranda using RNA-seq data from several male and female tissues, and develop a bioinformatic pipeline to separately reconstruct neo-X and neo-Y transcripts.
We obtain 2,141 transcripts from the neo-X and 1,863 from the neo-Y. Neo-Y transcripts are generally shorter than their homologous neo-X transcripts (N50 of 2,048-bp vs. 2,775-bp) and expressed at lower levels. We find that 24% of expressed neo-Y transcripts harbor nonsense mutation within their open reading frames, yet most non-functional neo-Y genes are expressed throughout all of their length. We find evidence of gene loss of male-specific genes on the neo-X chromosome, and transcriptional silencing of testis-specific genes from the neo-X.
Nonsense mediated decay (NMD) has been implicated to degrade transcripts containing pre-mature termination codons (PTC) in Drosophila, but rampant description of neo-Y genes with pre-mature stop codons suggests that it does not play a major role in down-regulating transcripts from the neo-Y. Loss or transcriptional down-regulation of genes from the neo-X with male-biased function provides evidence for beginning demasculinization of the neo-X. Thus, evolving sex chromosomes can rapidly shift their gene content or patterns of gene expression in response to their sex-biased transmission, supporting the idea that sex-specific or sexually antagonistic selection plays a major role in the evolution of heteromorphic sex chromosomes.
KeywordsSex chromosomes Drosophila Transcriptome Gene loss Nonsense mediated decay
The differentiation of an evolving pair of sex chromosomes involves many evolutionary steps, including changes in gene content on the X and Y [1, 2]; gene loss and the fixation of deleterious mutations on the Y ; epigenetic modifications related to silencing of Y-linked genes  and the evolution of dosage compensation [5, 6]. The Drosophila miranda neo-sex chromosome system is emerging as a useful resource to study the evolution of young sex chromosomes [7–9]. D. miranda split from its sister species D. pseudoobscura only about 2 my (million years) ago, and has sex chromosomes of different ages: Muller A (referred to as XL), the ancestral sex chromosome shared among all Drosophila species including D. melanogaster, is at least 60 my old [10, 11]; Muller D (referred to as XR), a neo-sex chromosome system of intermediate age (about 15 my) and shared with D. pseudoobscura; Muller C (referred to as neo-X/neo-Y; about 1–2 my old), specific to the D. miranda lineage.
The genome sequence of D. miranda has recently been published, including a draft assembly of the neo-Y chromosome . A major challenge in the assembly of this chromosome was the fact that the neo-Y has already accumulated many repetitive sequences, which are difficult to assemble – similar to the ancestral Y chromosome of Drosophila which has not been assembled at all. In addition, the high level of sequence identity between the neo-X and neo-Y especially in coding regions further complicates assemblies using next-generation sequencing data, since many short sequencing reads are identical from the neo-X and neo-Y. To annotate the neo-Y and study expression changes, diagnostic SNPs between the neo-X and neo-Y were used to estimate the relative abundance of neo-sex linked transcripts . This, however, does not allow studying the structure of the neo-Y transcripts per se. In particular, it is unknown how complete neo-Y protein-coding genes are, especially with regards to the transcripts that contain premature stop codons or frameshift mutations. Furthermore, the use of diagnostic SNPs for expression analysis will lead to an inherent mapping bias towards higher neo-X expression, since RNA-Seq reads can overlap with parts of transcripts that differ structurally between the neo-X and neo-Y. Using the neo-X sequence as a baseline for comparison, one can only study neo-Y transcripts that are also present on the neo-X and show no major structural changes or indels. Thus, there is considerable interest to study the neo-sex transcriptome directly, which we do here.
In addition to providing insights into sex chromosome evolution, the neo-sex chromosomes of D. miranda also allow us to study other cellular phenomena, such as Nonsense-mediated mRNA decay (NMD). NMD is a cellular surveillance mechanism that degrades transcripts containing premature termination codons (PTCs) [12–14]. Destruction of transcripts containing PTCs by NMD prevents production of truncated, potentially harmful proteins that may interfere with normal cellular processes. The exact mechanism through which NMD recognized PTCs in Drosophila is unclear, though the length of the 3′ UTR has been implicated , whereas the presence of an exon-exon junction downstream of a stop codon is an initiator of NMD in mammals . Given that a large number of neo-Y transcripts contain PTCs, the D. miranda system makes an excellent case for testing if the length of the 3′UTR leads to NMD in flies.
A major challenge in constructing the neo-sex transcriptome is that coding regions on the neo-X and neo-Y are very similar; based on the genome sequence, the estimated divergence at coding regions is only about 1.5%. Accordingly, using Illumina RNA-Seq reads for transcriptome assembly, not all reads can be assigned to the neo-X or neo-Y unambiguously. This problem is for the most part irrelevant for the neo-X transcriptome, which can be assembled directly in females (but note that this approach will miss male-specific neo-X transcripts). In males, however, transcriptome assemblers fail to assemble the neo-X and neo-Y into separate transcripts. Instead, chimeric transcripts are produced that contain partial neo-X and partial neo-Y sequence, connected by regions that are not differentiated between the neo-sex chromosomes. To obtain an assembly of the neo-Y transcriptome, we developed a bioinformatic pipeline, making use of genomic read mapping against the neo-sex transcripts in males.
We used Illumina Genome Analyzer II paired-end 100-bp RNA-seq reads of the inbred line MSH22 to assemble the D. miranda transcriptome with Trinity . First, low-quality sections of RNA-Seq reads and remaining adaptor sequence were discarded, and the minimum contig size of the Trinity assembly was set to 200 bp. For the neo-Y assembly, 157 million RNA-Seq reads from four different samples (virgin male whole body, virgin male gonadectomized carcass, virgin testis and virgin accessory gland; NCBI accession numbers SRR364803, SRR364802, SRR364798, SRR364799) were combined in a single assembly; similarly, to obtain neo-X transcripts, 163 million RNA-Seq reads from virgin ovary, virgin female whole body and virgin female gonadectomized carcass were combined and assembled together (NCBI accession numbers SRR364804, SRR364801, SRR364800). For the autosomes and old sex chromosomes (XL and XR), both male and female reads were combined for a single assembly. The D. pseudoobscura transcriptome (strain MV25) was assembled de novo using combined virgin male and virgin female whole body samples (87 million reads that had been randomly sampled from the Illumina Hiseq 2000 paired-end reads with NCBI accession numbers SRR357403 and SRR357405).
For this means, male genomic reads were compared to the potential Muller C transcripts and SNP positions were identified. Note that, in contrast to RNA-Seq reads, male genomic reads contain equal proportions of neo-X and neo-Y variants in males. To avoid introducing erroneous variants at intron-exon junctions, male Trinity transcripts were first divided up into exons. This was done by mapping male genomic reads against the male Muller C transcripts using Mosaik (http://code.google.com/p/mosaik-aligner/), allowing for zero mismatches between the reads and transcripts. Any parts of the transcripts for which mapped reads overlapped by less than 6 bp were considered hypothetical exon-exon junctions, and transcripts were split into “exons” at these positions. Using D. pseudoobscura gene models, it was confirmed that exon-exon junctions were correctly identified in a random test sample of transcripts. Next, both male and female genomic reads were mapped against the exons of the male Muller C transcripts using Mosaik, this time allowing for mismatches between the genomic reads and transcripts (in Mosaik, the penalty for gap opening, gap extension and the mismatch score were set to 10, 6.6 and −5 respectively). Freebayes (arXiv:1207.3907) was used to call SNPs in the mapping of male and female genomic reads versus Muller C exons (median coverage of 45 and 62 respectively). SNPs were called when 3 or more reads for the alternative alleles with a base quality of >13 were present at a given site, or if the coverage was 9 or less, the minimum number of reads supporting the alternative alleles was set to two.
At a given position, male reads were polymorphic, and all female reads showed a variant that was different compared to the reference “exon” sequence (0.25% of all sites). These sites were assumed to already contain the neo-Y variant in the transcript sequence.
At a given position, both male and female reads showed a SNP or indel; this could be caused by polymorphisms on the neo-X, sequencing error or by reads from paralogous locations being wrongly mapped to that particular position in both sexes; these positions were left unchanged (0.21% of all sites).
At a given position, male reads were polymorphic, whereas female reads showed no SNP or indel (1.1% of all sites). Such a site was considered to contain the neo-X variant in the transcript sequence and was modified to contain the neo-Y variant, using a perl script.
Genomic neo-Y BAC sequences  were used to heuristically fine-tune the Mosaik mapping parameters as well as the freebayes SNP calling parameters, so as to minimize the neo-Y versus BAC sequence divergence. Our pipeline makes the implicit assumption that neo-X genomic reads can be mapped back to neo-Y transcripts, which allows the identification of neo-X/neo-Y diverged sites. This assumption may be violated in regions where divergence is high enough for neo-X reads not to map, even if the homologous sequence is still present on the neo-X chromosome. However, such regions are unlikely to affect the neo-Y transcripts in the final assembly, since for regions of high divergence, Trinity will assemble neo-X and neo-Y transcripts separately. Our pipeline only modifies parts of a transcript where the neo-X and neo-Y are similar enough for neo-X reads to align to the neo-Y, and highly diverged regions will not be modified by our bioinformatics pipeline.
To test how this bioinformatic pipeline performs, we also mapped male and female reads against autosomal transcripts, and categorized variants in the same manner as for the neo-sex linked genes. We find that 0.03% of autosomal sites would fall into category 1, and 0.15% in category 3, i.e. the proportions of such sites relative to those for neo-sex-linked transcripts are 12% and 13%, respectively. This suggests that about 12-13% of the sites on Muller element C that appear as “male polymorphic and female homomorphic” may be caused by factors other than true neo-X/neo-Y divergence. We find that 0.24% of autosomal sites would fall into category 2 (compared to 0.21% for the neo-sex linked transcripts). Since the proportions of category 2 SNPs are very similar between neo-sex chromosomes and autosomes, this is consistent with SNPs in this category being caused by mapping and sequencing errors, or remaining heterozygosity in the line, rather than true neo-X/neo-Y divergence.
After modification of the male Muller C transcripts, the male RNA-Seq reads were re-mapped against the modified transcripts, allowing for zero mismatches between reads and transcripts. Based on this mapping and the freebayes SNP/Indel output, transcript regions were kept for further analysis if they were supported by male RNA-Seq reads (i.e. actually present in the male transcriptome) and if they contained at least one variant distinguishing the neo-Y from the neo-X. Requiring at least one neo-Y-specific variant per contiguous transcript region has the consequence that genes that were deleted from the neo-Y were removed from the neo-Y assembly, as were neo-X/neo-Y homomorphic genes and silenced neo-Y genes.
Using this approach alone, we missed male-specific neo-X transcripts because they were turned into neo-Y transcripts; however, male-specific neo-X transcripts were assembled in an analogous way to the neo-Y transcripts (Figure 1), except that this time positions were modified to carry the variant that was also found in females (the neo-X variant).
All transcripts were merged using Cap3 , and then scaffolded based on Drosophila pseudoobscura known proteins using the STM method . All transcripts were assigned to Flybase polypeptide IDs, using a reciprocal best hit between the assembled transcripts and D. pseudoobscura Flybase transcripts as a criterion for ID assignment.
Transcript level calculation
To calculate transcript abundance for the neo-X and neo-Y separately, RNA-Seq reads from male carcass, whole body, testis and accessory gland were aligned to the D. miranda transcriptome assembly using Mosaik and allowing for zero mismatches between the reads and the reference transcripts. eXpress  probabilistically assigns reads to different alleles of the same gene – in this case, the neo-X and neo-Y homologs – and was used to calculate transcript abundance levels (measured as FPKM - the fragments per kilobase of transcript per million fragments mapped) for the neo-sex chromosomes separately, as well as for genes on other Muller elements.
Sequence analysis of transcripts
Neo-X and neo-Y genes, as well as their autosomal D. pseudoobscura homologs were aligned using muscle . Gblocks  was used to remove poorly aligned segments of alignments, and the minimum length of a block was set to 100, and the maximum number of consecutive non-conserved positions to four. Next, only alignments that contained all three sequences (neo-X, neo-Y and D. pseudoobscura) were considered (1,646 alignments). For these transcripts, distmat (http://emboss.bioinformatics.nl/cgi-bin/emboss/distmat) was used to calculate the uncorrected divergence between the neo-X, neo-Y and D. pseudoobscura.
To assess the functional properties of neo-Y genes, the neo-Y transcripts were aligned to the protein sequences of their D. pseudoobscura homologs using Genewise . For the alignments, genewise allows for frame-shift mutations and pre-mature stop codons in the transcript sequences, and a perl program was used to extract the positions of these mutations within the neo-Y transcripts.
For each neo-Y transcript, the length of the 3′ UTR was calculated by the distance between the first stop codon to the end of the transcript. Since neo-Y expression is generally low, we also used the length of the 3′UTR of the neo-X homolog as a proxy for UTR length on the neo-Y, assuming that the neo-Y transcript might not be fully assembled if the read number supporting that part of the transcript was too low.
Availability of supporting data
The data sets supporting the results of this article are available in the ncbi Sequence Read Archive (SRA) repository, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364803, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364802, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364798, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364799, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364804, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364801, http://www.ncbi.nlm.nih.gov/sra/?term=SRR364800, http://www.ncbi.nlm.nih.gov/sra/?term=SRR357403, http://www.ncbi.nlm.nih.gov/sra/?term=SRR357405.
Results and discussion
The neo-sex transcriptome
The raw Trinity assembly of the male transcriptome contained 67,872 transcripts, 17,220 of which showed sequence similarity (blat Score >200) to Muller element C of the D. miranda genomic assembly . It was manually confirmed that neo-sex transcripts in males were produced as chimeric transcripts by Trinity, by comparing several transcripts to neo-Y and neo-X BAC clone sequences of ref. . The female Trinity assembly contained 43,730 transcripts, 7,046 of which potentially neo-X linked. 1,568 of the male Muller C matching transcripts had exact matches with the neo-X assembly, suggesting that Trinity assembled some neo-X transcripts correctly, in addition to chimeric and correctly assembled neo-Y transcripts. The 17,220 potential Muller C transcripts from males were divided into 31,368 exonic sequences. By mapping of genomic reads followed by SNP calling, about 1.1% of all sites in Muller C transcripts from males were transformed into the corresponding neo-Y variant.
Using genewise, neo-Y transcripts were divided into those with putative functional ORFs and those that are non-functional. Of the 1,410 neo-Y transcripts with no detectable frame-shift mutations or premature stop codons, the median length was 1,249 – somewhat shorter than for transcripts annotated as containing disrupted ORFs by frameshift mutations (356 transcripts) or pre-mature stop codons (164 transcripts) – combined, 453 putative non-functional neo-Y genes, with a median length of 1,649 bp (Wilcoxon test comparing the lengths of putative functional versus non-functional neo-Y transcripts: W = 234222, p < 10−15). Genes containing nonsense or frameshift mutations might simply be longer, on average, because they present larger mutational targets, but notably, they are still transcribed throughout most of their length. On the other hand, some knock-out mutations might have been missed if they were not part of the assembled transcript sequence. Indeed, compared to the genome assembly , we detect fewer loss-of-function mutations among neo-Y-linked genes (Additional file 1); however, we also find mutations in 170 genes that were annotated as “functional” in the genomic assembly (Additional file 1). Partly, this may be due to the relatively high chance of missing a pseudogene in the genome assembly - which was estimated as 10% . In addition, we followed a different strategy of assigning flybase polypeptide IDs to neo-Y transcripts, and the two assemblies may differ with regards to which exact transcript is assigned to which FBpp ID (in the genome assembly, co-linearity and scaffold information was taken into account, whereas the transcripts were assigned based on a reciprocal best hit with D. pseudoobscura genes only). However, since all loss-of function mutations in the transcriptome assembly are supported by genomic as well as RNA-Seq reads, they are presumably real.
Nonsense mediated decay
Thus, our results suggest that NMD does not play a major role in down-regulating neo-Y transcript levels in D. miranda if PTCs are present. This is consistent with possibly abundant read-through of stop codons in Drosophila, as indicated by protein-coding conservation 3′ of stop codons . In order to infer if translation proceeds past the pre-mature stop codons of neo-Y-linked genes, direct measures of translation are necessary, such as ribosomal profiling. Interestingly, neo-X genes in D. miranda show significantly higher transcript levels if they contain UTR lengths larger than the median (female whole body FPKM = 11.24 for genes with 3′UTRs < 424.5-bp, compared to an FPKM = 23.5 for neo-X genes with 3'UTRs > 425-bp, Wilcoxon test: W = 470328, p < 10−4). If longer 3′UTRs are indeed associated with higher levels of expression (and not due to less complete assemblies of lowly expressed genes), this may counteract any effects of NMD.
Gene loss and silencing on the neo-X chromosome
The availability of neo-Y transcripts allowed us to identify genes that were lost from the neo-X since its split from the neo-Y about 1.5 my ago. When mapping female genomic reads back to male transcripts, four Muller C transcripts had zero female genomic read coverage, indicating that these genes are not present in the female genome. There was also no polymorphism among male genomic reads mapped to these transcripts, further supporting the idea that only a single, neo-Y-linked transcript was present in the genome. A fifth gene (FBpp0276435/FBpp0276436) was identified as having an unusually high “divergence” from BAC clone neo-Y sequences of ref. . This gene turned out to be present in two copies in the neo-Y genomic assembly and also in two copies on Muller C in D. pseudoobscura, but only in one copy on the neo-X (Additional file 2).
In contrast, the 387 genes that were recovered from the neo-X but not the neo-Y (either because of neo-Y deletion or gene silencing) had generally low expression in D. miranda females and in both sexes of D. pseudoobscura (Figure 6b and c), supporting the idea that lowly expressed genes are lost at an increased rate from the neo-Y . Mapping of male genomic reads against neo-X transcripts suggests that most silent genes on the neo-Y are indeed still present in the genome, with only their transcription being suppressed, i.e. their genomic coverage in males is only partially reduced (Additional file 3). Indeed only 30 of the 387 genes with silent neo-Y but expressed neo-X copies were annotated as deleted from the neo-Y .
Evolved sex chromosomes have a highly diverged gene content [1, 5, 27]. Ancient Y chromosomes have usually lost most of their ancestral genes, and the few remaining genes often have testis-specific functions [1, 5]. In contrast, old X chromosomes in Drosophila have become demasculinized, containing a deficiency of genes with male-specific expression . The evolutionary events leading to such a difference in gene content, and their temporal dynamics, are little understood. Here, we show that both gene loss and gene silencing contribute to divergence in gene content between evolving sex chromosomes. Genes with male-specific function are preferentially silenced and lost on an evolving neo-X chromosome. On the other hand, lowly expressed genes are lost quickly from the neo-Y, while male-specific genes are more likely to be retained on the degenerating neo-Y chromosome. Thus, masculinization of the Y and demasculinization of the X can proceed very quickly after a new sex chromosome emerges, supporting the idea that sex-specific or sexually antagonistic selection plays a major role in the evolution of heteromorphic sex chromosomes [1, 5, 28].
We thank Chris Ellison for the whole genome alignments of Drosophila miranda, D. pseudoobscura and D. affinis. The study was funded by grants from the National Institutes of Health (GM076007 and GM093182) to DB.
- Bachtrog D: Y-chromosome evolution: emerging insights into processes of Y-chromosome degeneration. Nat Rev Genet. 2013, 14 (2): 113-124.PubMed CentralPubMedView ArticleGoogle Scholar
- Kaiser VB, Zhou Q, Bachtrog D: Nonrandom gene loss from the Drosophila miranda neo-Y chromosome. Genome Biol Evol. 2011, 3: 1329-1337. 10.1093/gbe/evr103.PubMed CentralPubMedView ArticleGoogle Scholar
- Charlesworth B, Charlesworth D: The degeneration of Y chromosomes. Philos Trans R Soc Lond B Biol Sci. 2000, 355 (1403): 1563-1572. 10.1098/rstb.2000.0717.PubMed CentralPubMedView ArticleGoogle Scholar
- Zhou Q, Ellison CE, Kaiser VB, Alekseyenko AA, Gorchakov AA, Bachtrog D: The epigenome of evolving Drosophila neo-sex chromosomes: dosage compensation and heterochromatin formation. PLoS Biol. 2013, 11 (11): e1001711-10.1371/journal.pbio.1001711.PubMed CentralPubMedView ArticleGoogle Scholar
- Charlesworth B: The evolution of chromosomal sex determination and dosage compensation. Curr Biol. 1996, 6 (2): 149-162. 10.1016/S0960-9822(02)00448-7.PubMedView ArticleGoogle Scholar
- Vicoso B, Charlesworth B: The deficit of male-biased genes on the D. melanogaster X chromosome is expression-dependent: a consequence of dosage compensation?. J Mol Evol. 2009, 68 (5): 576-583. 10.1007/s00239-009-9235-4.PubMedView ArticleGoogle Scholar
- Bachtrog D, Charlesworth B: Reduced adaptation of a non-recombining neo-Y chromosome. Nature. 2002, 416 (6878): 323-326. 10.1038/416323a.PubMedView ArticleGoogle Scholar
- Steinemann M, Steinemann S: Enigma of Y chromosome degeneration: neo-Y and neo-X chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica. 1998, 102–103 (1–6): 409-420.PubMedView ArticleGoogle Scholar
- Zhou Q, Bachtrog D: Sex-specific adaptation drives early sex chromosome evolution in Drosophila. Science. 2012, 337 (6092): 341-345. 10.1126/science.1225385.PubMed CentralPubMedView ArticleGoogle Scholar
- Carvalho AB, Clark AG: Y chromosome of D. pseudoobscura is not homologous to the ancestral Drosophila Y. Science. 2005, 307 (5706): 108-110. 10.1126/science.1101675.PubMedView ArticleGoogle Scholar
- Vicoso B, Bachtrog D: Reversal of an ancient sex chromosome to an autosome in Drosophila. Nature. 2013, 499 (7458): 332-335. 10.1038/nature12235.PubMed CentralPubMedView ArticleGoogle Scholar
- Kervestin S, Jacobson A: NMD: a multifaceted response to premature translational termination. Nat Rev Mol Cell Biol. 2012, 13 (11): 700-712. 10.1038/nrm3454.PubMed CentralPubMedView ArticleGoogle Scholar
- Isken O, Maquat LE: Quality control of eukaryotic mRNA: safeguarding cells from abnormal mRNA function. Genes Dev. 2007, 21 (15): 1833-1856. 10.1101/gad.1566807.PubMedView ArticleGoogle Scholar
- Rebbapragada I, Lykke-Andersen J: Execution of nonsense-mediated mRNA decay: what defines a substrate?. Curr Opin Cell Biol. 2009, 21 (3): 394-402. 10.1016/j.ceb.2009.02.007.PubMedView ArticleGoogle Scholar
- Chang YF, Imam JS, Wilkinson MF: The nonsense-mediated decay RNA surveillance pathway. Annu Rev Biochem. 2007, 76: 51-74. 10.1146/annurev.biochem.76.050106.093909.PubMedView ArticleGoogle Scholar
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011, 29 (7): 644-652. 10.1038/nbt.1883.PubMed CentralPubMedView ArticleGoogle Scholar
- Bachtrog D, Hom E, Wong KM, Maside X, de Jong P: Genomic degradation of a young Y chromosome in Drosophila miranda. Genome Biol. 2008, 9 (2): R30-10.1186/gb-2008-9-2-r30.PubMed CentralPubMedView ArticleGoogle Scholar
- Huang X, Madan A: CAP3: A DNA sequence assembly program. Genome Res. 1999, 9 (9): 868-877. 10.1101/gr.9.9.868.PubMed CentralPubMedView ArticleGoogle Scholar
- Surget-Groba Y, Montoya-Burgos JI: Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res. 2010, 20 (10): 1432-1440. 10.1101/gr.103846.109.PubMed CentralPubMedView ArticleGoogle Scholar
- Roberts A, Pachter L: Streaming fragment assignment for real-time analysis of sequencing experiments. Nat Methods. 2013, 10 (1): 71-73.PubMedView ArticleGoogle Scholar
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5): 1792-1797. 10.1093/nar/gkh340.PubMed CentralPubMedView ArticleGoogle Scholar
- Castresana J: Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000, 17 (4): 540-552. 10.1093/oxfordjournals.molbev.a026334.PubMedView ArticleGoogle Scholar
- Birney E, Clamp M, Durbin R: GeneWise and Genomewise. Genome Res. 2004, 14 (5): 988-995. 10.1101/gr.1865504.PubMed CentralPubMedView ArticleGoogle Scholar
- Palmieri N, Nolte V, Suvorov A, Kosiol C, Schlotterer C: Evaluation of different reference based annotation strategies using RNA-Seq - a case study in Drososphila pseudoobscura. PLoS One. 2012, 7 (10): e46415-10.1371/journal.pone.0046415.PubMed CentralPubMedView ArticleGoogle Scholar
- Jungreis I, Lin MF, Spokony R, Chan CS, Negre N, Victorsen A, White KP, Kellis M: Evidence of abundant stop codon readthrough in Drosophila and other metazoa. Genome Res. 2011, 21 (12): 2096-2113. 10.1101/gr.119974.110.PubMed CentralPubMedView ArticleGoogle Scholar
- Ellison CE, Bachtrog D: Dosage compensation via transposable element mediated rewiring of a regulatory network. Science. 2013, 342 (6160): 846-850. 10.1126/science.1239552.PubMed CentralPubMedView ArticleGoogle Scholar
- Sturgill D, Zhang Y, Parisi M, Oliver B: Demasculinization of X chromosomes in the Drosophila genus. Nature. 2007, 450 (7167): 238-241. 10.1038/nature06330.PubMed CentralPubMedView ArticleGoogle Scholar
- Vicoso B, Charlesworth B: Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet. 2006, 7 (8): 645-653. 10.1038/nrg1914.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.