Skip to main content
  • Research article
  • Open access
  • Published:

Characterisation of the wheat (triticum aestivum L.) transcriptome by de novo assembly for the discovery of phosphate starvation-responsive genes: gene expression in Pi-stressed wheat



Phosphorus (P) is an essential macronutrient for plant growth and development. To modulate their P homeostasis, plants must balance P uptake, mobilisation, and partitioning to various organs. Despite the worldwide importance of wheat as a cultivated food crop, molecular mechanisms associated with phosphate (Pi) starvation in wheat remain unclear. To elucidate these mechanisms, we used RNA-Seq methods to generate transcriptome profiles of the wheat variety ‘Chinese Spring’ responding to 10 days of Pi starvation.


We carried out de novo assembly on 73.8 million high-quality reads generated from RNA-Seq libraries. We then constructed a transcript dataset containing 29,617 non-redundant wheat transcripts, comprising 15,047 contigs and 14,570 non-redundant full-length cDNAs from the TriFLDB database. When compared with barley full-length cDNAs, 10,656 of the 15,047 contigs were unalignable, suggesting that many might be distinct from barley transcripts. The average expression level of the contigs was lower than that of the known cDNAs, implying that these contigs included transcripts that were rarely represented in the full-length cDNA library. Within the non-redundant transcript set, we identified 892–2,833 responsive transcripts in roots and shoots, corresponding on average to 23.4% of the contigs not covered by cDNAs in TriFLDB under Pi starvation. The relative expression level of the wheat IPS1 (Induced by Phosphate Starvation 1) homologue, TaIPS1, was 341-fold higher in roots and 13-fold higher in shoots; this finding was further confirmed by qRT-PCR analysis. A comparative analysis of the wheat- and rice-responsive transcripts for orthologous genes under Pi-starvation revealed commonly upregulated transcripts, most of which appeared to be involved in a general response to Pi starvation, namely, an IPS1-mediated signalling cascade and its downstream functions such as Pi remobilisation, Pi uptake, and changes in Pi metabolism.


Our transcriptome profiles demonstrated the impact of Pi starvation on global gene expression in wheat. This study revealed that enhancement of the Pi-mediated signalling cascade using IPS1 is a potent adaptation mechanism to Pi starvation that is conserved in both wheat and rice and validated the effectiveness of using short-read next-generation sequencing data for wheat transcriptome analysis in the absence of reference genome information.


As a key component of plant cell molecules, phosphorus (P) is an essential macronutrient for plant growth. Large quantities are used in fertiliser, but worldwide P resources will be exhausted by the end of this century [1]. Phosphate (Pi) starvation can generally be observed throughout an afflicted field. Visual symptoms of Pi starvation (−P) are the development of dark-green leaf colour and a reduction in shoot elongation and leaf size. As −P progresses in wheat (Triticum aestivum L.), the oldest leaves become chlorotic and show signs of desiccation [2].

Wheat is a major staple food crop in many parts of the world in terms of both cultivation area and prevalence as a food source. To meet the increasing global demand for wheat, this crop’s exploitation of nutrients must be made more efficient and its requirement for nutritional fertilisers reduced. Because wheat is primarily grown on substrates with low P levels, such as the acidic soils of tropical and subtropical regions and the calcareous soils of temperate regions, an important constraint to wheat production is its lack of tolerance to −P.

Various genetic approaches have been used to understand genetic control of −P tolerance in wheat; these include aneuploid analyses of the nulli-tetrasomic series and wheat alien chromosome addition lines of the cultivar ‘Chinese Spring’ (CS) and quantitative trait locus (QTL) mapping [36]. QTL analyses using –P-sensitive CS and the tolerant variety ‘Lovrin 10’ indicated that CS possesses positive alleles of the major QTLs for P use efficiency on chromosomes 3B, 4B, and 5A [4]. In another study, seven and six QTLs were repeatedly detected controlling P uptake and use efficiency [5]. A large number of QTLs for agronomic trait changes under low or high P concentrations have been detected on all chromosomes in the hexaploid wheat genome, implying that −P tolerance is controlled by polygenes [5]. However, the studies are few in number; a reverse genetic approach could help characterise genes that potentially contribute to complex multilocus traits and their global transcriptional networks in Pi-starved wheat.

Several technologies, including massively parallel sequencing and microarray analysis, have recently been used to simultaneously catalogue the effects of −P on the expressions of thousands of genes in model species [710]. Transcriptome sequencing using next generation sequencing (NGS) technology provides high-resolution data and is a powerful tool for studying global transcriptional networks. The evaluation of sequence-based expression profiles can identify responsive genes and provide functional annotation for genes underlying complex and multilocus traits under −P in wheat.

In model species, transcriptome profiling and the quantification of gene expression levels are generally performed by mapping reads from the NGS analysis to a reference genome sequence and annotating genes. The strategies for model species are not feasible in wheat, as its reference genome sequence and gene annotation are still incomplete; an international project to achieve these goals is currently making progress (IWGSC: International Wheat Genome Sequencing Consortium, This project may take considerable time, because of the difficulties involved in sequencing the huge (40 times larger than rice), highly-repetitive hexaploid genome of wheat.

The database of putative full-length cDNAs for wheat, TriFLDB, has released approximately 16,000 full-length cDNAs ( [11]. Although this dataset is a useful reference for transcript mapping, it is incomplete, because 36,000 to 50,000 genes have been estimated per diploid genome based on the 3B chromosome of hexaploid wheat [12, 13]. Recently, de novo transcript assembly analysis has made possible comprehensive analyses of transcriptomes, and several studies have detailed the transcriptome sequencing of various non-model species, including wheat, using massively parallel sequencing technology [14, 15]. De novo assembly of short sequences of transcripts enables researchers to reconstruct the sequences of entire transcriptomes, identify and catalogue all of the expressed genes, separate isoforms, and capture transcript expression levels.

Although computer-based de novo assembly tools (e.g., Trans-ABySS, Velvet-Oases, and Trinity) [1618] have been developed in conjunction with massively parallel sequencing, their usefulness in transcriptome assembly is not yet well demonstrated, and improvements can still be achieved using recent advances in bioinformatics. Some studies have used short-read sequence data obtained with the Illumina sequencer for de novo assembly; others have used the relatively long-read sequence data obtained with the Roche 454 pyrosequencing system or have adopted a hybrid approach of both short and long reads. In addition, contig construction is greatly affected by sequence read quality (i.e., length) and quantity. Furthermore, the cDNA library construction methods, sequencing technologies, and data pre-treatment techniques chosen influence the quality of the assembled transcriptomes [19]. Consequently, a comparison of several assembly programs is needed to determine the best combination of parameters, which can then serve as a guideline for sequence assembly performance.

In this study, we verified the de novo assembly approach by comparing analyses from several programs using short-read sequence data obtained from wheat cultivar CS seedlings under –P. We constructed a wheat transcript dataset for de novo assembly and quantified gene expression. As a reference in the gene expression analysis, we used a non-redundant set of transcripts generated from the de novo assembly and full-length cDNAs. This dataset was also used to assess transcripts, to investigate sequence similarity, for conservation analyses among several plant species, and for comparison with our previous report on rice transcript profiling under −P conditions [10]. We demonstrated that an overall mechanism regulating gene expression of −P-responsive genes in wheat could be effectively characterised using short-read NGS data. A comparison of gene expression profiles in wheat and rice revealed the presence of conserved gene expression systems, which appear to be essential to adaptation to –P conditions. Finally, we described an effective method for assembly of short transcript sequences to discover novel functional genes in the absence of a reference genome. The transcript assembly generated in this study should serve as a useful resource for wheat genomics and genetics.

Results and discussion

Construction of the wheat transcript dataset

A set of transcripts consisting of cDNAs from TriFLDB and contigs from RNA-Seq reads was constructed as shown in Figure 1 (steps 1–5). A total of 12 libraries (two tissues, two treatments, and three replicates) from seedlings subjected to –P (0 mM NaH2PO4; 0 mg L−1 P) and control (0.323 mM NaH2PO4; 10 mg L−1 P) treatments were used in the RNA-Seq analysis. Overall, 115 million paired-end short-read sequences were produced by an Illumina HighSeq 2000 system (Illumina, San Diego, CA, USA) and used for de novo assembly after the removal of low-quality segments. The choice of data pre-treatment technique, in which data are pre-processed to remove sequencing errors and other artefacts, influences the accuracy and precision of the gene expression analyses [19]. Trans-ABySS, Velvet-Oases (henceforth Oases), and Trinity were used for sequence assembly in this experiment (see Methods for details). Each contig included various transcript isoforms, which cause redundancy. Because their inclusion could present difficulties in a quantitative analysis, the redundant contigs were removed. The removal of these contigs generated a non-redundant set of contigs, which were assessed using various output parameters—such as number of contigs and mean contig length as a function of k-mer length—to select the appropriate assembly program. The largest assembly, comprising 555,287 contigs at least 101 bp long, was produced by Trinity, while the smallest, consisting of 337,969 contigs of at least 100 bp in length, was generated by Oases. However, Oases yielded the best results of the three assembly programs, with a maximum contig length of 22,610 bp, a mean contig length of 1,113 bp, and a median contig length of 698 bp (Additional file 1 and Additional file 2).

Figure 1
figure 1

Strategy used in the wheat RNA-Seq analysis. A set of 29,617 wheat transcripts was constructed from 15,047 non-redundant contigs and 14,570 full-length cDNAs in TriFLDB. To identify the responsive transcripts under –P conditions, all reads were aligned to the transcripts using Bowtie, and a G-test with an FDR value cut-off < 0.01 was used.

Contig quantity and mean length are often used in de novo assembly assessment, but these values may not reflect contig accuracy. We therefore assessed accuracy using the three assembly programs discussed above. We counted the number of contigs that could be aligned with 90% or greater coverage to their corresponding full-length cDNAs in TriFLDB (Additional file 3). Of the three assembly programs, Oases assembled the longest transcripts and had the highest percentage of alignments to the wheat full-length cDNAs. Although it generated the smallest number of contigs, Oases was able to reconstruct 87% of the TriFLDB cDNAs with identity ≥90% and coverage ≥50%, or 75% of the TriFLDB cDNAs with identity ≥90% and coverage ≥90% (Additional file 3). Conversely, Trinity produced the largest number of contigs, but could only reconstruct 16% of the TriFLDB cDNAs (identity ≥90%, coverage ≥90%). Trans-ABySS could reconstruct just 6% of the TriFLDB cDNAs (identity ≥90%, coverage ≥90%). Thus, the reconstruction percentages of the latter two programs were considerably worse than those of Oases, and we concluded that Oases was the most appropriate program for our analyses. Of the 337,969 contigs generated from Oases, 169,024 could not be aligned to the cDNAs using BLAT. When the longest contigs per locus were selected, 67,616 of them could not be aligned with the cDNAs.

Because the 67,616 non-redundant contigs could indicate homology to either plants or non-plants, we treated them as contaminants and removed them from the analysis, except when a BLASTX hit matched a land plant. The contigs were analysed for similarity/sequence conservation against nr (non-redundancy datasets of various species) using a BLASTX search (E-value ≤1E-03). A total of 15,047 contigs were found in land plants (top hit), with 91.4% in monocots, such as Hordeum vulgare ssp. vulgare (33.3%; 5,017 transcripts), Brachypodium distachyon (26.0%; 3,913 transcripts), Oryza sativa ssp. japonica group (13.5%; 2,025 transcripts), Sorghum bicolor (7.3%; 1,101 transcripts), O. sativa ssp. indica group (4.9%; 743 transcripts), T. aestivum (3.6%; 537 transcripts), Zea mays (2.1%; 317 transcripts), and other H. vulgare subspecies (0.7%; 100 transcripts) (Additional file 4). Notably, 10,656 of the 15,047 contigs could not be aligned to full-length cDNAs in the barley full-length cDNA database ( using BLAT, suggesting that most of the contigs were distinct from the barley transcripts. Finally, we obtained 29,617 wheat transcripts by combining the 15,047 non-redundant contigs with 14,570 non-redundant full-length cDNAs (Steps 1–5 in Figure 1). This dataset is available upon request.

Mapping reads to the wheat transcript dataset

Characterisations of contigs and cDNAs from the 29,617 transcripts are presented in Table 1. To determine whether our contigs reflected the effect of −P in wheat, 51,130,274 of 115,495,198 quality-controlled paired-end reads were aligned back to the wheat transcript dataset (contigs and cDNAs) using Bowtie (step 6 in Figure 1, Table 2), which is designed for short-read mapping onto the genome. We used 39.6–46.7% of the uniquely-mapped reads in the expression analysis of each treatment. Replicates in all treatments were highly correlated (coefficient > 0.94), and reads from the same treatment were merged to calculate reads per kilobase of exon model per million mapped reads (RPKM), which indicates the relative transcription amount. We observed that 70% (10,144) of the 14,570 non-redundant cDNAs were associated with at least one unique read. On average, 92% of transcript length was covered by reads, with coverage depth of each transcript averaging 420 reads. These statistics suggested that nearly complete coverage of the entire cDNA length was obtained in this analysis. Many reads were impossible to align back to the contigs, possibly because the redundant contigs had been removed in the previous analysis (Step 3 in Figure 1, with the longest contigs per locus selected by Oases).

Table 1 Characterisation of the 29,617 wheat transcripts obtained from contigs and cDNAs in TriFLDB
Table 2 Statistical summary of the reads aligned against the set of contigs and full-length cDNAs in TriFLDB

Distribution of RPKM values was assessed by comparing contigs and cDNAs (Additional file 5). Average RPKM values from each treatment ranged from 2.75–4.75 for contigs and 31.95–33.68 for cDNAs. The distributions of maximum and median RPKM values were lower for contigs than cDNAs, suggesting that most cDNAs were highly expressed, probably reflecting the overall distribution of gene expression in this study. We identified 529 contigs, however, with higher expressions than the average observed for cDNAs under one or more of the four tissue/treatment combinations. Some of these contigs may correspond to transcripts rarely represented in the full-length cDNA library or transcripts that were expressed at low levels in other conditions.

Identification of differentially-expressed wheat transcripts under −P

A G-test (FDR [False Discovery Rate] <0.01) of the RPKM-derived read counts was performed to detect differences in gene expression between control and –P-treated plants, and to identify responsive wheat transcripts under –P conditions (step 7 in Figure 1). In roots, 1,004 transcripts were upregulated and 892 were downregulated; in shoots, 2,833 and 1,382 transcripts were upregulated and downregulated, respectively. The most numerous responsive transcripts were those upregulated in shoots, which were more than twice as abundant as those upregulated in roots (Figure 2). This result may be a consequence of the fact that most of the cDNAs in TriFLDB were constructed from shoot samples. On average, 23.4% of the transcripts in the responsive transcript set were contigs not covered by cDNAs in TriFLDB (Figure 2). The wheat transcript set contained more contigs than cDNAs, but there were more responsive cDNAs than contigs. The low-expression contigs might reflect small differences in expression between the two conditions that may not be detectable statistically, making the characterization of these transcripts difficult (Additional file 5). We were able to demonstrate, nonetheless, that the de novo assembly strategy can improve transcriptome analysis of a non-model species and that these upregulated or downregulated contigs could be functionally annotated as Pi responses in wheat under –P.

Figure 2
figure 2

Distribution of transcripts upregulated or downregulated in response to −P. The total number of upregulated or downregulated transcripts identified by RNA-Seq was determined in roots and shoots under –P conditions. Each bar shows the distribution of transcripts with their matching cDNAs in TriFLDB (light grey) and contigs (grey).

Functional annotation of responsive transcripts under –P

To obtain a functional annotation of upregulated wheat transcripts under –P conditions, we used Gene Ontology (GO) biological process categories. GO annotations were assigned to 40.7% (409) of the root transcripts and 34.4% (974) of the shoot transcripts. We identified the top 20 GO categories into which upregulated transcripts in roots and shoots were distributed (Figure 3). Although the numbers of transcripts differed, the overall categorisations were very similar between roots and shoots. Sixteen of the 20 GO categories were represented in both roots and shoots: oxidation-reduction process, protein phosphorylation, transmembrane transport, metabolic process, regulation of transcription (DNA-dependent), lipid metabolic process, proteolysis, transport, translation and carbohydrate metabolic process, defence response, response to oxidative stress, biosynthetic process, cation transport, response to stress, and cell redox homeostasis. Transcripts from both roots and shoots most belonged to the oxidation-reduction process category. This category was represented by many instances of cytochrome P450 and oxidoreductase 2OG-FeII, both of which were reported to be upregulated in response to oxidative stress in Arabidopsis[20]. The protein phosphorylation category included several protein kinases, which may participate in signal transduction during many cellular processes under –P conditions, including metabolism, transcription, and cell cycle progression and differentiation. The lipid metabolic process category had many instances of lipase, which can alter membrane lipid composition to maintain internal Pi homeostasis under −P [21].

Figure 3
figure 3

Distribution of Gene Ontology (GO) biological process categories for upregulated contigs. The number of upregulated transcripts in roots and shoots categorised in the top twenty GOs are summarised. The y-axis indicates the category. The x-axis indicates the number of transcripts in a category.

Analysis of IPS1 expression by RNA-Seq and qRT-PCR

Because RPKM values obtained from RNA-Seq indicated significant upregulation of TaIPS1, the wheat IPS1 homologue, in root and shoot samples after 10 d of –P conditions, its expression levels were further analysed. IPS1, a non-protein coding gene that includes highly conserved sequences near miR399 complementary regions in different plant species [22], has been found to be strongly upregulated under −P in rice [10]. The homologous sequences TaIPS1.1 (accession number EU753150), TaIPS1.2 (EU753151), and TaIPS1.3 (EU753152) have been isolated and deposited in GenBank, but molecular studies, such as expression analyses, have not been reported.

Based on RNA-Seq, expression levels increased 341-fold in roots and 13-fold in shoots under –P conditions. In this study, we did not distinguish homeologous forms, and we present the results for each gene as an integration of counts from the three genomes of hexaploid wheat as reported by Pellny et al. [23]. The primary goal of our study to identify –P-responsive wheat genes, but we noted that using an integration of counts from the three homeologous forms for each gene could cause problems with transcript abundance when reads were aligned back to the wheat transcript set using Bowtie. A quantitative real-time PCR (qRT-PCR) analysis was used to confirm these results under identical conditions. Expression levels increased 368-fold in roots and 17-fold in shoots according to qRT-PCR, confirming the results obtained using RNA-Seq analysis (Figure 4). These data demonstrated that these transcriptomes accurately reflected the response of wheat to Pi starvation and suggest that the IPS1-mediated signalling cascade may also function in wheat.

Figure 4
figure 4

Expression of IPS1 and several upregulated transcripts under –P conditions was determined using RNA-Seq and qRT-PCR. IPS1, IPS2, RNS1, and MGD expression in roots and IPS1, SPX1, GDPD, and PAP expression in shoots under –P were also analysed using qRT-PCR at 0 and 10 d after –P treatment. Transcript expression levels were normalised using an internal control (Ubiquitin1) and plotted relative to expression on day 10. Bars represent mean ± SE from the three experiments. Fold changes based on RPKM values according to RNA-Seq are plotted on the same graph.

A highly conserved PHR1-IPS1-miR399-UBC24/PHO2 signalling cascade

Members of the IPS1 gene family from different plant species, including rice, have a highly-conserved 23-nt-long motif that exhibits complementarity with the miR399 involved in Pi response [22]. Because IPS1 is strongly upregulated in rice [10], we investigated the conserved responsive genes by comparing the wheat transcripts with rice transcripts used in our previous RNA-Seq study [10]. To assign Rice Annotation Project (RAP) IDs to responsive transcripts in wheat under –P conditions, we searched for homology to RAP ( proteins using BLASTP. Of the 29,617 wheat transcripts in our constructed dataset, open reading frames (ORFs) were predicted for 21,864. Among these, 9,435 transcripts with reciprocal top hits (E-value ≤1E-06) were identified.

A total of 7,449 transcripts were found to be orthologous to their corresponding rice genes based on a threshold of ≥ 60% identity and ≥ 60% coverage. In addition, BLASTN was used to identify rice orthologues of wheat non-protein coding genes. Orthologues of the responsive wheat transcripts were verified for their responses by comparing their data with those of the responsive rice transcripts. Because most well-characterised transcripts have been found to be upregulated under −P conditions in plants, we focused on the upregulated transcripts (Figure 2). In wheat, 376 of the 1,004 upregulated transcripts in roots and 1,326 of the 2,833 upregulated transcripts in shoots possessed orthologues, which were then compared with those found in the responsive rice transcripts. Thirty-nine transcripts in roots and 21 transcripts in shoots were upregulated in both wheat and rice, suggesting their importance in adaptation to –P conditions in Poaceae species (Table 3). The 16 transcripts upregulated in both roots and shoots are of particular interest.

Table 3 Upregulated transcripts found in both wheat and rice under –P conditions

The Pi regulatory mechanism has been elucidated through Arabidopsis mutant analyses and involves PHR1 (phosphate starvation response 1), a MYB-type transcription factor, acting as a key factor in regulating downstream –P-responsive gene expression, including that of IPS1, through the P1BS cis-acting element (GNATATNC) of the PHR1 binding site (Figure 5). This mechanism is conserved in vascular plants and unicellular algae [24]. As in conserved genes in wheat and rice, the P1BS cis-element exists in the promoter of wheat-orthologous rice genes; IPS1 is upregulated primarily in the roots of wheat (Table 3) and rice [10], suggesting conservation of the IPS1-mediated signalling cascade under the control of PHR1 (PHR2 in rice [25]). In Arabidopsis, SIZ1 (small ubiquitin-like modifier [SUMO] E3 ligase) [16], miR399, and UBC24/PHO2 (ubiquitin E2 conjugase) are also involved in the cascade (Figure 5) [26]. According to genome sequencing information, SIZ1 is conserved in rice [27]; miR399 expression has been confirmed in rice, and a potential orthologue of UBC24/PHO2 has been identified by assembly using genomic DNA and expressed sequence tags (ESTs) [26]. Transcribed sequences encoding miR399, miR399-binding sites, or protein sequences homologous to N- or C-terminal extensions of UBC24/PHO2 have been observed in wheat [26]. We also detected rice-orthologous wheat transcripts of PHR1, UBC24/PHO2, and SIZ1, although these transcripts were not highly responsive to −P in wheat (data not shown) or rice [10]. Mature miRNAs shorter than the sequencing length of the short reads could be difficult to detect in our RNA-Seq libraries, however. Based on the identification of HvmiR399s [28] and the upregulation of HvIPS1[29], the IPS1-miR399-UBC24/PHO2 signalling cascade appears to be conserved in barley [18]. Consequently, the PHR1-IPS1-miR399-UBC24/PHO2 signalling cascade is probably also conserved in wheat (Figure 5). In silico comparative analysis [30] suggested the presence of conserved cis-acting elements, such as the P1BS-like motif, and trans-acting factors that are capable of regulating the sole putative wheat high-affinity phosphate transporter TaPT2, which is expressed in a tissue-specific and Pi-dependent fashion in both monocots and dicots.

Figure 5
figure 5

Overview of the P-dependent signalling cascades that affect Pi remobilisation, Pi uptake, and changes in Pi metabolism. The functions of wheat transcripts and rice orthologues of Pi-related transcripts were characterised under –P conditions. Asterisks show that the P1BS cis-acting element (GNATATNC), an imperfect palindromic sequence [24], is located in the upstream region (1 kb).

We also studied the upregulated expressions of other responsive rice-orthologous wheat transcripts (Table 3). SPX1, upregulated under –P conditions, acts as a negative regulator for the expression of PHR2 and, consequently, Pi transporters in rice [31]. RNS1-3 (RNase) [32], PAP (purple acid phosphatase) [33], and PHO1 (which transfers Pi from roots to shoots in rice [34]) may function in the remobilisation of Pi. In Arabidopsis, PHR1 SUMOylated by SIZ1 positively regulates RNS1 expression [35]. Pht1;10 (phosphate transporter) [36] may function in the uptake of Pi. AtPht1;8 repression and AtPht1;9 expression are attenuated in the pho2 mutant of Arabidopsis[26].

Genes involved in lipid metabolism, such as those that function in the synthesis of galactolipids and sulfolipids, were strongly induced under −P conditions. These include SQD1 (UDP-sulfoquinovose synthase 1) [37], MGD (MGDG [monogalactosyldiacylglycerol] synthases) [38], and GDPD (glycerophosphoryl diester phosphodiesterase) [39], which may work to re-route Pi during lipid metabolism. UDP-glucose pyrophosphorylase may act as a sugar-signalling network under –P conditions [40, 41]. These factors may be related to the PHR1-IPS1-miR399-UBC24/PHO2 signalling cascade, as Pi remobilisation, Pi uptake, and Pi-related metabolism are under the control of that cascade (Figure 5). Upregulation of several transcripts in this cascade was also confirmed by qRT-PCR (Figure 4). These results suggested that the PHR1-IPS1-miR399-UBC24/PHO2 signalling cascade functions as a potent adaptation to −P in wheat.

Transcriptional control of other −P responsive transcripts

We identified several upregulated regulatory wheat transcripts involved in −P response from top hits to Arabidopsis proteins using BLASTP. (Contig transcript searches were performed with BLASTX.) WRKY6 (RFL_Contig5159) and PHO1 (RFL_Contig2349) transcripts were upregulated in roots. In Arabidopsis, enhanced WRKY6 mutants are more sensitive to −P conditions, having lower Pi contents in shoots compared with wild-type seedlings [42]. WRKY6 is involved in responses to –P by regulating PHOSPHATE1 (PHO1) expression [42]. PHO1 plays a role in Pi translation from roots to shoots, aiding plant adaptation to –P [43]. Salt and mannitol stress-inducible WRKY33 (tplb0001n04) and SKIP (tplb0006k10)[44, 45], and WRKY70 (Contig.7004.23) [46], which have pivotal roles in determining the balance between salicylic acid- and jasmonic acid-dependent defence pathways, were also upregulated in shoots. SWI3C (RFL_Contig4840), SWI3D (tplb0017g02), and GCN5-related N-acetyltransferase (Contig.1767.6) were upregulated in shoots. SWI/SNF (SWITCH/SUCROSE NONFERMENTING) chromatin-remodelling complexes mediate ATP-dependent alterations of DNA-histone contacts. Histone H2A.Z regulates the expression of several classes of phosphate starvation response genes [47]. Although mechanisms of transcription rate modulation entailing chromatin structure alteration have not been fully elucidated under –P in Arabidopsis, SWI/SNF complexes and Gcn5 histone acetyltransferase are necessary for full induction of several phosphatase genes and PHO84 high-affinity phosphate transporter gene in yeast [4852].

Finally, we investigated downregulated wheat transcripts under –P and found that PsbQ-like 1 (RFL_Contig3541), PsbQ-like 2 (RFL_Contig550), PSBP-1 (tplb0004p24), PSBQ-1 (tplb0011f13), LHCB1.5 (tplb0016n06), LHCA6 (RFL_Contig363), LHCB3 (tplb0013f07), and Photosystem I reaction centre subunit N (tplb0007a14) were downregulated in shoots. These genes are involved in photosynthesis. In Brassica nigra leaf petiole suspension cells, the rate of photosynthesis and photosynthetic product partitioning were altered under –P [53]. Glyceraldehyde-3-phosphate dehydrogenase (RFL_Contig1308) and sedoheptulose-1,7-bisphosphatase (SBPase) (tplb0007f04) were also downregulated in shoots under –P. These are key regulatory enzymes in CO2 reduction and the regeneration phase of the Calvin cycle for carbon fixation pathways, respectively. SBPase, which catalyses the dephosphorylation of sedoheptulose-1,7-bisphosphate into sedoheptulose-7-phosphate and Pi, is specific to the eukaryotic Calvin cycle and plays vital roles in regulating the pathways in the cycle [54] and improving photosynthetic capacity [55]. Downregulation under –P of transcripts associated with photosynthesis and alteration of the balance of carbon metabolism are well documented in Arabidopsis[8].


In this study, we demonstrated the use of short-read sequence data to rapidly characterise a wheat transcriptome and have contributed significantly to the corpus of wheat transcript data. Differentially-expressed transcripts under –P in wheat included, on average, 23.4% of the contigs not covered by cDNAs in TriFLDB. The induction of these responses requires a sophisticated regulatory system, however, and details of this regulation in wheat have only recently begun to be elucidated. Comparison of upregulated wheat and rice transcripts revealed that the PHR1-IPS1-miR399-UBC24/PHO2 signalling cascade is conserved in both crop species. Data from our previous study of rice transcripts and other supportive studies in other species confirmed that our analysis captured the transcriptome for –P response in wheat. This study thus represents a genomic approach to discovering wheat transcripts when genome sequences are unavailable. This contribution is significant to the development of genomic resources for wheat and other species and should accelerate the progress of functional genomic studies and breeding programs.


Plant materials and stress treatment

Seeds of the wheat cultivar ‘Chinese Spring’ were germinated and grown hydroponically in nutrient media (0.323 mM NaH2PO4; 10 mg L−1 P) [56] in a growth chamber. After 14 d, seedlings were subjected to a –P stress treatment by being transferred to a similar medium (0 mM NaH2PO4; 0 mg L−1 P). Plants were maintained under –P for 10 d at 23°C under a 16 h light/8 h dark cycle, with the light period extending from 6:00 AM to 10:00 PM. Root and shoot samples were collected as described previously [10].

RNA isolation and quality control

Total RNA was extracted from all tissue samples using a RNeasy Plant Mini kit (Qiagen, Hilden, Germany). RNA quality was assessed using a Bioanalyzer (Agilent, Palo Alto, CA, USA); as suggested by the manufacturer, only those samples with RIN (RNA integrity number) scores greater than 8.0 were used in subsequent analyses.

Illumina sequencing and quality control

Twelve paired-end (PE) cDNA libraries were used to generate 107,298,935 PE reads from root and shoot tissues. Sequencing was performed on each library to generate 100-bp PE reads for transcriptome sequencing on an Illumina High-Seq 2000 platform. Library construction was accomplished using commercial products (Illumina), with sequencing performed by commercial service providers (Takara, Shiga, Japan). The sequence data generated in this study were deposited in the DDBJ Sequence Read Archive (Accession No. DRA000737). Low-quality bases (Q < 15) were trimmed from both ends using a customised program, and the adapters were trimmed using Cutadapt ver. 1.0 ( with the parameters ‘-f fastq -e 0.1 -O 5 -m 20’.

De novo assembly

To generate a non-redundant set of transcripts, we performed a de novo assembly with publicly available programs using 73,804,720 PE RNA-Seq reads (36,902,360 pairs; 97.5 average read length, 7,198,965,163 total bases, 260 average insert length) obtained from four libraries, with each library consisting of three replicates of one of the four tissue/treatment combinations. Reads from each library were aligned to cDNAs in TriFLDB using Bowtie, and the calculated RPKM values were used to conduct a correlation analysis of the replicates. The replicate pair with the highest correlation coefficient among the three replicate pairs was selected for de novo assembly. Trans-ABySS (ver. 1.3.2) [16], Velvet (ver. 1.2.03; (ver. 0.2.05; [17], and Trinity (ver. r2012-01-25, --seqType fq and --kmer_method meryl) [18] were used in this study. Those programs were developed to assemble short reads using a de Bruijn graph algorithm [57]. Various assembly parameters were also optimised to obtain the best results from each program. Trinity, a program developed specifically for de novo transcriptome assembly from short-read RNA-Seq data, was also used to assemble the single k-mers (k = 21). Trinity has been shown to be efficient and sensitive in recovering full-length transcripts and isoforms in yeast, mice, and whiteflies [18]. Velvet (‘-fastq –short’ [velveth], ‘-read_trkg yes’ [velvetg])-Oases (default parameters) and ABySS (ver. 1.3.3, OVERLAP_OPTIONS=‘--no-scaffold’ SIMPLEGRAPH_OPTIONS=‘--no-scaffold’) [58] were also used to assemble individual k-mer lengths from every odd number between 21 and 51.

Using a multiple k-mer (MK) strategy for the de novo assembly gave better results than did a non-multiple k-mer approach because it was able to capture both weakly expressed transcripts with small k-mer values and highly expressed genes with large k-mer values [59]. In this study, Velvet-Oases (‘-merge yes’) and Trans-ABySS (default) were used to assemble different k-mer lengths from 21 to 51; both yielded better results than those obtained with the individual k-mer approach. We therefore used the MK strategy for downstream analyses. We also performed de novo assembly with Oases and Trans-ABySS at different k-mer lengths, using every odd number from 21 to 95; these results did not improve the assessment. To remove redundant contigs, short contigs that were entirely covered by longer ones with more than 90% identity were removed using cd-hit-est (ver. 4.5.7) [60] with default parameters. Furthermore, Oases produces isoforms from a locus; the longest transcripts of each locus were selected to construct a non-redundant set of contigs to identify responsive transcripts.

Calculation of RPKM values and the detection of differentially-expressed transcripts

The PE RNA-Seq reads from each treatment were aligned back to the cDNA, and the transcripts were assembled using Bowtie 2 ver. 2.0.0-beta6, with the parameters ‘-k 2 --no-discordant --no-mixed -q’. Expression levels of all unique transcripts mapped onto the full-length cDNAs and contigs were quantified using the RPKM values [61]. The RPKM value of each transcript was calculated using uniquely-aligned reads with an in-house program; each value was incremented by 1 to avoid division by 0 following the fold change calculations. Differentially expressed transcripts were detected using a G-test with an FDR value cut-off < 0.01. Details of this process have been described in previous work [10].

Gene ontology analysis

The GO terms assigned to the responsive transcripts were obtained from InterProScan 5 Release Candidate 2 ( with the parameters ‘-f tsv -t p -dp –goterms’ for each GO category after those transcripts were predicted with ORFs using with the parameter ‘-B’ in Trinity (ver. r2012-04-27).

qRT-PCR analysis of wheat transcripts

First-strand cDNA was synthesised using a Transcriptor First Strand synthesis kit (Roche, Basel, Switzerland). Expression of Pi-upregulated transcripts was analysed by qRT-PCR in a LightCycler 480 system (Roche) with transcript-specific primers (Additional file 6). The detection threshold cycle of each reaction was normalised using Ubiquitin1 primers 5-GGAGGACACAGAAAGGCAAC-3 and 5-CTCCGTGGTGGCCAGTAAGT-3. Three technical replicates of each treatment were used in the analysis.


  1. Vance CP, Uhde-Stone C, Allan DL: Phosphorus acquisition and use: critical adaptations by plants for securing a nonrenewable resource. New Phytol. 2003, 157 (3): 423-447. 10.1046/j.1469-8137.2003.00695.x.

    Article  CAS  Google Scholar 

  2. Ozturk L, Eke S, Torun B, Cakmak I: Variation in phosphorus efficiency among 73 bread and durum wheat genotypes grown in a phosphorus-deficient calcareous soil. Plant Soil. 2005, 269 (1–2): 69-80.

    Article  CAS  Google Scholar 

  3. Wang SW, Yin LN, Tanaka H, Tanaka K, Tsujimoto H: Identification of wheat alien chromosome addition lines for breeding wheat with high phosphorus efficiency. Breeding Sci. 2010, 60 (4): 371-379. 10.1270/jsbbs.60.371.

    Article  CAS  Google Scholar 

  4. Su JY, Xiao YM, Li M, Liu QY, Li B, Tong YP, Jia JZ, Li ZS: Mapping QTLs for phosphorus-deficiency tolerance at wheat seedling stage. Plant Soil. 2006, 281 (1–2): 25-36.

    Article  CAS  Google Scholar 

  5. Su JY, Zheng Q, Li HW, Li B, Jing RL, Tong YP, Li ZS: Detection of QTLs for phosphorus use efficiency in relation to agronomic performance of wheat grown under phosphorus sufficient and limited conditions. Plant Sci. 2009, 176 (6): 824-836. 10.1016/j.plantsci.2009.03.006.

    Article  CAS  Google Scholar 

  6. Li YJ, Liu JZ, Li B, Li JY, Li ZS: Chromosomal control of the tolerance to soil phosphorus deficiency in genome of common wheat. Chin J Genet. 1999, 26: 529-538.

    CAS  Google Scholar 

  7. Wasaki J, Yonetani R, Kuroda S, Shinano T, Yazaki J, Fujii F, Shimbo K, Yamamoto K, Sakata K, Sasaki T: Transcriptomic analysis of metabolic changes by phosphorus stress in rice plant roots. Plant Cell Environ. 2003, 26 (9): 1515-1523. 10.1046/j.1365-3040.2003.01074.x.

    Article  CAS  Google Scholar 

  8. Wu P, Ma LG, Hou XL, Wang MY, Wu YR, Liu FY, Deng XW: Phosphate starvation triggers distinct alterations of genome expression in Arabidopsis roots and leaves. Plant Physiol. 2003, 132 (3): 1260-1271. 10.1104/pp.103.021022.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  9. Misson J, Raghothama KG, Jain A, Jouhet J, Block MA, Bligny R, Ortet P, Creff A, Somerville S, Rolland N: A genome-wide transcriptional analysis using Arabidopsis thaliana Affymetrix gene chips determined plant responses to phosphate deprivation. Proc Natl Acad Sci USA. 2005, 102 (33): 11934-11939. 10.1073/pnas.0505266102.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  10. Oono Y, Kawahara Y, Kanamori H, Mizuno H, Yamagata H, Yamamoto M, Hosokawa S, Ikawa H, Akahane I, Zhu Z: mRNA-Seq reveals a comprehensive transcriptome profile of rice under phosphate stress. Rice. 2011, 4 (2): 50-65. 10.1007/s12284-011-9064-0.

    Article  Google Scholar 

  11. Mochida K, Yoshida T, Sakurai T, Ogihara Y, Shinozaki K: TriFLDB: a database of clustered full-length coding sequences from Triticeae with applications to comparative grass genomics. Plant Physiol. 2009, 150 (3): 1135-1146. 10.1104/pp.109.138214.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  12. Paux E, Roger D, Badaeva E, Gay G, Bernard M, Sourdille P, Feuillet C: Characterizing the composition and evolution of homoeologous genomes in hexaploid wheat through BAC-end sequencing on chromosome 3B. Plant J. 2006, 48 (3): 463-474. 10.1111/j.1365-313X.2006.02891.x.

    Article  CAS  PubMed  Google Scholar 

  13. Choulet F, Wicker T, Rustenholz C, Paux E, Salse J, Leroy P, Schlub S, Le Paslier M-C, Magdelenat G, Gonthier C: Megabase level sequencing reveals contrasted organization and evolution patterns of the wheat gene and transposable element spaces. Plant Cell. 2010, 22 (6): 1686-1701. 10.1105/tpc.110.074187.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  14. Garg R, Patel RK, Jhanwar S, Priya P, Bhattacharjee A, Yadav G, Bhatia S, Chattopadhyay D, Tyagi AK, Jain M: Gene discovery and tissue-specific transcriptome analysis in chickpea with massively parallel pyrosequencing and web resource development. Plant Physiol. 2011, 156 (4): 1661-1678. 10.1104/pp.111.178616.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  15. Duan J, Xia C, Zhao G, Jia J, Kong X: Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data. BMC Genomics. 2012, 13 (1): 392-10.1186/1471-2164-13-392.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  16. Birol I, Jackman SD, Nielsen CB, Qian JQ, Varhol R, Stazyk G, Morin RD, Zhao Y, Hirst M, Schein JE: De novo transcriptome assembly with ABySS. Bioinformatics. 2009, 25 (21): 2872-2877. 10.1093/bioinformatics/btp367.

    Article  CAS  PubMed  Google Scholar 

  17. Schulz MH, Zerbino DR, Vingron M, Birney E: Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012, 28 (8): 1086-1092. 10.1093/bioinformatics/bts094.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  18. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011, 29 (7): 644-652. 10.1038/nbt.1883.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  19. Martin JA, Wang Z: Next-generation transcriptome assembly. Nat Rev Genet. 2011, 12 (10): 671-682. 10.1038/nrg3068.

    Article  CAS  PubMed  Google Scholar 

  20. Thibaud M-C, Arrighi J-F, Bayle V, Chiarenza S, Creff A, Bustos R, Paz-Ares J, Poirier Y, Nussaume L: Dissection of local and systemic transcriptional responses to phosphate starvation in Arabidopsis. Plant J. 2010, 64 (5): 775-789. 10.1111/j.1365-313X.2010.04375.x.

    Article  CAS  PubMed  Google Scholar 

  21. Lin WY, Lin SI, Chiou TJ: Molecular regulators of phosphate homeostasis in plants. J Exp Bot. 2009, 60 (5): 1427-1438. 10.1093/jxb/ern303.

    Article  CAS  PubMed  Google Scholar 

  22. Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, Leyva A, Weigel D, Garcia JA, Paz-Ares J: Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet. 2007, 39 (8): 1033-1037. 10.1038/ng2079.

    Article  CAS  PubMed  Google Scholar 

  23. Pellny TK, Lovegrove A, Freeman J, Tosi P, Love CG, Knox JP, Shewry PR, Mitchell RA: Cell walls of developing wheat starchy endosperm: comparison of composition and RNA-Seq transcriptome. Plant Physiol. 2012, 158 (2): 612-627. 10.1104/pp.111.189191.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  24. Rubio V, Linhares F, Solano R, Martin AC, Iglesias J, Leyva A, Paz-Ares J: A conserved MYB transcription factor involved in phosphate starvation signaling both in vascular plants and in unicellular algae. Gene Dev. 2001, 15 (16): 2122-2133. 10.1101/gad.204401.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  25. Zhou J, Jiao F, Wu Z, Li Y, Wang X, He X, Zhong W, Wu P: OsPHR2 is involved in phosphate-starvation signaling and excessive phosphate accumulation in shoots of plants. Plant Physiol. 2008, 146 (4): 1673-1686. 10.1104/pp.107.111443.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  26. Bari R, Datt Pant B, Stitt M, Scheible WR: PHO2, microRNA399, and PHR1 define a phosphate-signaling pathway in plants. Plant Physiol. 2006, 141 (3): 988-999. 10.1104/pp.106.079707.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  27. Miura K, Jin JB, Hasegawa PM: Sumoylation, a post-translational regulatory process in plants. Curr Opin Plant Biol. 2007, 10 (5): 495-502. 10.1016/j.pbi.2007.07.002.

    Article  CAS  PubMed  Google Scholar 

  28. Schreiber AW, Shi BJ, Huang CY, Langridge P, Baumann U: Discovery of barley miRNAs through deep sequencing of short reads. BMC Genomics. 2011, 12: 129-10.1186/1471-2164-12-129.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  29. Huang CY, Shirley N, Genc Y, Shi B, Langridge P: Phosphate utilization efficiency correlates with expression of low-affinity phosphate transporters and noncoding RNA, IPS1, in barley. Plant Physiol. 2011, 156 (3): 1217-1229. 10.1104/pp.111.178459.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  30. Tittarelli A, Milla L, Vargas F, Morales A, Neupert C, Meisel LA, Salvo GH, Penaloza E, Munoz G, Corcuera LJ: Isolation and comparative analysis of the wheat TaPT2 promoter: identification in silico of new putative regulatory motifs conserved between monocots and dicots. J Exp Bot. 2007, 58 (10): 2573-2582. 10.1093/jxb/erm123.

    Article  CAS  PubMed  Google Scholar 

  31. Wang C, Ying S, Huang H, Li K, Wu P, Shou H: Involvement of OsSPX1 in phosphate homeostasis in rice. Plant J. 2009, 57 (5): 895-904. 10.1111/j.1365-313X.2008.03734.x.

    Article  CAS  PubMed  Google Scholar 

  32. MacIntosh GC, Hillwig MS, Meyer A, Flagel L: RNase T2 genes from rice and the evolution of secretory ribonucleases in plants. Mol Genet Genomics. 2010, 283 (4): 381-396. 10.1007/s00438-010-0524-9.

    Article  CAS  PubMed  Google Scholar 

  33. Zhang Q, Wang C, Tian J, Li K, Shou H: Identification of rice purple acid phosphatases related to posphate starvation signalling. Plant Biology. 2011, 13 (1): 7-15. 10.1111/j.1438-8677.2010.00346.x.

    Article  PubMed  Google Scholar 

  34. Secco D, Baumann A, Poirier Y: Characterization of the rice PHO1 gene family reveals a key role for OsPHO1;2 in phosphate homeostasis and the evolution of a distinct clade in dicotyledons. Plant Physiol. 2010, 152 (3): 1693-1704. 10.1104/pp.109.149872.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  35. Miura K, Rus A, Sharkhuu A, Yokoi S, Karthikeyan AS, Raghothama KG, Baek D, Koo YD, Jin JB, Bressan RA: The Arabidopsis SUMO E3 ligase SIZ1 controls phosphate deficiency responses. Proc Natl Acad Sci USA. 2005, 102 (21): 7760-7765. 10.1073/pnas.0500778102.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  36. Paszkowski U, Kroken S, Roux C, Briggs SP: Rice phosphate transporters include an evolutionarily divergent gene specifically activated in arbuscular mycorrhizal symbiosis. Proc Natl Acad Sci USA. 2002, 99 (20): 13324-13329. 10.1073/pnas.202474599.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  37. Essigmann B, Guler S, Narang RA, Linke D, Benning C: Phosphate availability affects the thylakoid lipid composition and the expression of SQD1, a gene required for sulfolipid biosynthesis in Arabidopsis thaliana. Proc Natl Acad Sci USA. 1998, 95 (4): 1950-1955. 10.1073/pnas.95.4.1950.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  38. Kobayashi K, Awai K, Nakamura M, Nagatani A, Masuda T, Ohta H: Type-B monogalactosyldiacylglycerol synthases are involved in phosphate starvation-induced lipid remodeling, and are crucial for low-phosphate adaptation. Plant J. 2009, 57 (2): 322-331. 10.1111/j.1365-313X.2008.03692.x.

    Article  CAS  PubMed  Google Scholar 

  39. Wang XM, Yi KK, Tao Y, Wang F, Wu ZC, Jiang D, Chen X, Zhu LH, Wu P: Cytokinin represses phosphate-starvation response through increasing of intracellular phosphate level. Plant Cell Environ. 2006, 29 (10): 1924-1935. 10.1111/j.1365-3040.2006.01568.x.

    Article  CAS  PubMed  Google Scholar 

  40. Ciereszko I, Johansson H, Hurry V, Kleczkowski LA: Phosphate status affects the gene expression, protein content and enzymatic activity of UDP-glucose pyrophosphorylase in wild-type and pho mutants of Arabidopsis. Planta. 2001, 212 (4): 598-605. 10.1007/s004250000424.

    Article  CAS  PubMed  Google Scholar 

  41. Hammond JP, White PJ: Sucrose transport in the phloem: integrating root responses to phosphorus starvation. J Exp Bot. 2008, 59 (1): 93-109.

    Article  CAS  PubMed  Google Scholar 

  42. Chen YF, Li LQ, Xu Q, Kong YH, Wang H, Wu WH: The WRKY6 transcription factor modulates PHOSPHATE1 expression in response to low Pi stress in Arabidopsis. Plant Cell. 2009, 21 (11): 3554-3566. 10.1105/tpc.108.064980.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  43. Hamburger D, Rezzonico E, MacDonald-Comber Petetot J, Somerville C, Poirier Y: Identification and characterization of the Arabidopsis PHO1 gene involved in phosphate loading to the xylem. Plant Cell. 2002, 14 (4): 889-902. 10.1105/tpc.000745.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  44. Jiang Y, Deyholos MK: Functional characterization of Arabidopsis NaCl-inducible WRKY25 and WRKY33 transcription factors in abiotic stresses. Plant Mol Biol. 2009, 69 (1–2): 91-105.

    Article  CAS  PubMed  Google Scholar 

  45. Lim GH, Zhang X, Chung MS, Lee DJ, Woo YM, Cheong HS, Kim CS: A putative novel transcription factor, AtSKIP, is involved in abscisic acid signalling and confers salt and osmotic tolerance in Arabidopsis. New Phytol. 2010, 185 (1): 103-113. 10.1111/j.1469-8137.2009.03032.x.

    Article  CAS  PubMed  Google Scholar 

  46. Li J, Brader G, Kariola T, Palva ET: WRKY70 modulates the selection of signaling pathways in plant defense. Plant J. 2006, 46 (3): 477-491. 10.1111/j.1365-313X.2006.02712.x.

    Article  CAS  PubMed  Google Scholar 

  47. Smith AP, Jain A, Deal RB, Nagarajan VK, Poling MD, Raghothama KG, Meagher RB: Histone H2A.Z regulates the expression of several classes of phosphate starvation response genes but not as a transcriptional activator. Plant Physiol. 2010, 152 (1): 217-225. 10.1104/pp.109.145532.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  48. Sarnowski TJ, Rios G, Jasik J, Swiezewski S, Kaczanowski S, Li Y, Kwiatkowska A, Pawlikowska K, Kozbial M, Kozbial P: SWI3 subunits of putative SWI/SNF chromatin-remodeling complexes play distinct roles during Arabidopsis development. Plant Cell. 2005, 17 (9): 2454-2472. 10.1105/tpc.105.031203.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  49. Santisteban MS, Kalashnikova T, Kalashnikova T, Smith MM: Histone H2A.Z regulates transcription and is partially redundant with nucleosome remodeling complexes. Cell. 2000, 103 (3): 411-422. 10.1016/S0092-8674(00)00133-1.

    Article  CAS  PubMed  Google Scholar 

  50. Sudarsanam P, Winston F: The Swi/Snf family nucleosome-remodeling complexes and transcriptional control. Trends Genet. 2000, 16 (8): 345-351. 10.1016/S0168-9525(00)02060-6.

    Article  CAS  PubMed  Google Scholar 

  51. Barbaric S, Luckenbach T, Schmid A, Blaschke D, Horz W, Korber P: Redundancy of chromatin remodeling pathways for the induction of the yeast PHO5 promoter in vivo. J Biol Chem. 2007, 282 (38): 27610-27621. 10.1074/jbc.M700623200.

    Article  CAS  PubMed  Google Scholar 

  52. Wippo CJ, Krstulovic BS, Ertel F, Musladin S, Blaschke D, Sturzl S, Yuan GC, Horz W, Korber P, Barbaric S: Differential cofactor requirements for histone eviction from two nucleosomes at the yeast PHO84 promoter are determined by intrinsic nucleosome stability. Mol Cell Biol. 2009, 29 (11): 2960-2981. 10.1128/MCB.01054-08.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  53. Duff SM, Moorhead GB, Lefebvre DD, Plaxton WC: Phosphate Starvation Inducible; Bypasses’ of adenylate and phosphate dependent glycolytic enzymes in Brassica nigra suspension Cells. Plant Physiol. 1989, 90 (4): 1275-1278. 10.1104/pp.90.4.1275.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  54. Raines CA: The Calvin cycle revisited. Photosynth Res. 2003, 75 (1): 1-10. 10.1023/A:1022421515027.

    Article  CAS  PubMed  Google Scholar 

  55. Lefebvre S, Lawson T, Zakhleniuk OV, Lloyd JC, Raines CA, Fryer M: Increased sedoheptulose-1,7-bisphosphatase activity in transgenic tobacco plants stimulates photosynthesis and growth from an early stage in development. Plant Physiol. 2005, 138 (1): 451-460. 10.1104/pp.104.055046.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  56. Yoshida S, Forno AD, Cock HJ, Gomez AK: Laboratory manual for physiological studies of rice. 1976, Philippines: Manila,, 3rd,

    Google Scholar 

  57. Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008, 18 (5): 821-829. 10.1101/gr.074492.107.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  58. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I: ABySS: a parallel assembler for short read sequence data. Genome Res. 2009, 19 (6): 1117-1123. 10.1101/gr.089532.108.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  59. Zhao QY, Wang Y, Kong YM, Luo D, Li X, Hao P: Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study. BMC Bioinforma. 2011, 12 (Suppl 14): S2-10.1186/1471-2105-12-S14-S2.

    Article  CAS  Google Scholar 

  60. Li W, Godzik A: Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006, 22 (13): 1658-1659. 10.1093/bioinformatics/btl158.

    Article  CAS  PubMed  Google Scholar 

  61. Mortazavi A, Williams BA, Mccue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008, 5 (7): 621-628. 10.1038/nmeth.1226.

    Article  CAS  PubMed  Google Scholar 

Download references


We thank Ms. F. Aota, Ms. K. Ohtsu, and Ms. K. Yamada for their technical assistance.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Youko Oono.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

YO conceived and coordinated this study, performed the experiments, analysed and interpreted the data, and drafted the manuscript. FK, YK, and TY performed the general statistical analysis on the RNA-Seq data and participated in interpreting the results and in useful discussions. HH, TI, and TM provided valuable insights in the discussion and revision of the manuscript. All authors have read and approved the final manuscript.

Electronic supplementary material


Additional file 1:Comparison of the de novo assembly of three datasets using Trinity ( k = 21), Oases (MK) and Trans-ABySS (MK) programs. Bars indicate the number of contigs. The dashed line indicates the mean contig length, and the solid line indicates the median contig length. (PDF 173 KB)


Additional file 2:Statistical analysis of the non-redundant set of wheat transcripts obtained from three assembly programs.(PDF 50 KB)


Additional file 3:Assessment of contigs aligned to full-length wheat cDNAs in TriFLDB. All quality-controlled reads were aligned to these full-length cDNAs using Bowtie and three assembly programs. The proportion of numbers and bases of the full-length transcripts covered by each assembly program were calculated. (PDF 70 KB)


Additional file 4:Conservation of sequences between wheat transcripts and other land plants. The number of contigs showing significant similarities (E-value <1E-03) when compared with nr according to BLASTX is shown. (PDF 79 KB)

Additional file 5:RPKM values of contigs and cDNAs in TriFLDB. (PDF 77 kb) (PDF 77 KB)

Additional file 6:PCR primers used for qRT-PCR analysis.(PDF 59 KB)

Authors’ original submitted files for images

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Oono, Y., Kobayashi, F., Kawahara, Y. et al. Characterisation of the wheat (triticum aestivum L.) transcriptome by de novo assembly for the discovery of phosphate starvation-responsive genes: gene expression in Pi-stressed wheat. BMC Genomics 14, 77 (2013).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: