Ae. variabilis accession No. 1 is a valuable resource for developing CCN resistance in wheat breeding [12, 13]. However, it is difficult to screen for genes associated with CCN resistance when genomic information is not available. Transcriptomic profiling provides abundant information for a wide range of biological studies and gives fundamental insights into biological processes. It can reveal gene expression profiles after experimental treatments or infection, and analyses of conserved orthologous genes can be used for phylogenomic purposes. Here, we profiled the root transcriptome of Ae. variabilis by high-throughput deep sequencing on the Illumina HiSeq™ 2000 platform. To the best of our knowledge, this is the first report of a root transcriptome for Ae. variabilis. The cDNA library was constructed from pooled RNA samples of CCN-infected and non-infected plants at three time points. Pooling maximized the number of expressed transcripts included in the analysis, especially those related to CCN resistance.
Accurate sequencing and reliable read assembly are essential for downstream applications of transcriptome data. In this study, we used two popular assemblers, SOAPdenovo and Trinity, for de novo assembly of the transcriptomic data of Ae. variabilis. SOAPdenovo has been widely used in many studies [25, 50], whereas Trinity is a more recently developed tool. Trinity was reported to recover more full-length transcripts across a broad range of expression levels, and to provide a unified, sensitive solution for transcriptome reconstruction in species without a reference genome, with performance similar to that of methods that rely on genome alignments. The two methods showed similar average read-depth coverage values. SOAPdenovo produced more unigenes than Trinity (130,487 versus 118,064); however, 37,828 of the SOAPdenovo sequences (about 29%) were shorter than 200 bp. In contrast, the Trinity unigenes did not contain gaps, and their mean length was about 1.7 times that of the SOAPdenovo unigenes (599 bp versus 351 bp). Therefore, Trinity was the better approach for assembly in this research.
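The comparison above rests on a few summary statistics of the assembled sequence lengths. The following is a minimal sketch of how such statistics could be computed from a list of unigene lengths; the function name `assembly_stats`, the 200 bp cutoff parameter, and the toy lengths are illustrative assumptions, not the pipeline used in this study. Only the counts 37,828 and 130,487 are taken from the text.

```python
from statistics import mean


def assembly_stats(lengths, short_cutoff=200):
    """Summarize unigene lengths (bp): count, mean length, N50, short fraction.

    Hypothetical helper for illustration; not part of SOAPdenovo or Trinity.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    # N50: length of the sequence at which the running sum of lengths
    # (longest first) first reaches half of the total assembled bases.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    n_short = sum(1 for x in ordered if x < short_cutoff)
    return {
        "count": len(ordered),
        "mean_length": mean(ordered),
        "n50": n50,
        "fraction_short": n_short / len(ordered),
    }


# Toy lengths (made up) to show the call; real input would be parsed
# from the assembler's FASTA output.
toy_lengths = [150, 180, 250, 400, 900, 1500]
stats = assembly_stats(toy_lengths)

# Share of SOAPdenovo sequences shorter than 200 bp, from the counts in the text.
reported_short_fraction = 37_828 / 130_487
```

Applied to each assembler's output, a summary of this kind makes the trade-off explicit: a larger unigene count is not an advantage if a substantial share of the sequences falls below a useful length.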
The Roche 454 GS FLX platform produces long reads (>400 bp), whereas the Illumina sequencer generates a larger number of shorter reads (90 bp). In this study, however, most of the assembled unigenes (130,487 from SOAPdenovo (≥150 bp) or 118,064 from Trinity (≥200 bp)) reached a high coverage of ~33×. This indicates that short-read sequencing, combined with deep sequencing and an effective assembly tool, is an appropriate strategy for analyzing transcriptome profiles.
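To make the relationship between read length, read number, and coverage concrete, the sketch below uses the common definition of average depth as sequenced bases divided by assembly length. It is a rough illustration under stated assumptions: the total assembly size is approximated as the Trinity unigene count times the mean unigene length, every read is assumed to map to the assembly, and the helper names are hypothetical. The coverage definition used in this study may differ, and the resulting read count is an implied order of magnitude, not a reported result.

```python
def mean_depth(n_reads, read_length_bp, assembly_bp):
    """Average depth = total sequenced bases / total assembly length."""
    if assembly_bp <= 0:
        raise ValueError("assembly_bp must be positive")
    return n_reads * read_length_bp / assembly_bp


def reads_for_depth(target_depth, read_length_bp, assembly_bp):
    """Number of reads needed to reach target_depth (inverse of mean_depth)."""
    if read_length_bp <= 0:
        raise ValueError("read_length_bp must be positive")
    return target_depth * assembly_bp / read_length_bp


# Assumption: assembly size ~ unigene count x mean unigene length (Trinity).
trinity_assembly_bp = 118_064 * 599

# Reads of 90 bp implied by ~33x depth, if all reads mapped to the assembly.
implied_reads = reads_for_depth(33, 90, trinity_assembly_bp)
```

Under these assumptions, roughly 26 million mapped 90 bp reads would correspond to ~33× depth over an assembly of about 71 Mb, which illustrates why a large number of short reads can compensate for their length.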
Compared with other transcriptome studies, the length distribution of the 130,487 and 118,064 unigenes generated in this work was skewed towards shorter sequences. There are several possible explanations for this. First, Ae. variabilis (2n = 4x = 28, UUSvSv) is an allotetraploid species of the tribe Triticeae with a greatly expanded, repeat-rich genome. This may present a substantial barrier to assembling short unigenes into longer ones using current and upcoming sequencing technology [51, 52]. Second, the total RNA for sequencing in our work was pooled from six samples, which may negatively affect read assembly. The high dynamic range of mRNA expression is a problem for comprehensive de novo mRNA sequencing and assembly. Third, high frequencies of alternative splicing and fusion events may have restricted the assembly of short sequences into longer ones [54, 55]. Fourth, more than 80% of the unigenes in this study were expressed at low levels, so fewer reads corresponding to these unigenes were available for sequence assembly. Even so, the de novo transcriptome of Ae. variabilis provided abundant unigene information without gaps in the sequences. These genetic data enrich the genomic resources for the tribe Triticeae.
A total of 7,408 individual unigenes (6.27% of 118,064) were associated with plant defense and resistance (Additional file 7). These unigenes could be classified into five GO sub-categories, three pathways, and one COG functional group. The three pathways related to plant defense, which together included 3,712 unigenes, deserve particular attention. In the “plant-pathogen interaction” pathway, unigenes were mainly involved in the hypersensitive response, cell wall reinforcement, stomatal closure, and defense-related gene induction (Additional file 8). In the “phosphatidylinositol signaling system” pathway, unigenes were mainly related to reactions involving phosphatidylinositol and its derivatives (Additional file 9). In the “ABC transporters” pathway, unigenes were related to eukaryotic-type transporters only, such as the ABCA, ABCB, ABCC, and ABCG subfamilies and other putative ABC transporters (Additional file 10). These pathways provide a starting point for exploring the genes related to CCN resistance and for understanding its molecular mechanism.
Interestingly, 839 unigenes showed high homology to genes from nematode species (Additional file 6), probably because the roots had been invaded by CCNs. As no genomic information is available for CCN, we cannot thoroughly filter H. avenae gene sequences from the transcriptome database. However, the detection of CCN unigenes confirmed that the method used for CCN inoculation was successful. More importantly, these unigenes represent genes expressed during the interaction with a resistant host. Therefore, this experimental system and the unigene dataset obtained from it provide a platform for combining genetic, genomic, and expression information on the interaction between CCN and its host in future studies.