- Research article
- Open Access
Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.)
BMC Genomics volume 17, Article number: 350 (2016)
Nitrogen (N) is an essential and often limiting nutrient to plant growth and development. Previous studies have shown that the mRNA expressions of numerous genes are regulated by nitrogen supplies; however, little is known about the expressed non-coding elements, for example long non-coding RNAs (lncRNAs) that control the response of maize (Zea mays L.) to nitrogen. LncRNAs are a class of non-coding RNAs larger than 200 bp, which have emerged as key regulators in gene expression.
In this study, we surveyed the intergenic/intronic lncRNAs in maize B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency using ribosomal RNA depletion and ultra-deep total RNA sequencing approaches. By integration with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen-responsive lncRNAs were identified that exhibited unique expression patterns. Co-expression network analysis showed that the nitrogen-responsive lncRNAs were enriched mainly in one of the three co-expressed modules. The genes in the enriched module are mainly involved in NADH dehydrogenase activity, oxidative phosphorylation and the nitrogen compounds metabolic process.
We identified a large number of lncRNAs in maize and illustrated their potential regulatory roles in response to N stress. The results lay the foundation for further in-depth understanding of the molecular mechanisms of lncRNAs’ role in response to nitrogen stresses.
Maize (Zea mays L.) is an important cereal grain crop that is grown widely in various agro-ecological environments worldwide. Apart from being used as human food, maize is also largely used for livestock feed and industrial materials in developed countries . Plant growth and grain development require an abundant supply of nutrients particularly nitrogen (N) [2–4]. As a major yield-determining factor, N is a vital plant nutrient for plant growth and development. It not only provides an N source for amino acids, nucleic acids, chlorophyll and ATP (adenosine triphosphate) [5–7], but also mediates the utilization of phosphorus, potassium and other nutrients in plants . Massive amounts of synthetic nitrogen fertilizer are currently applied to crop fields, however, the crops do not use significant amounts. The excess nitrogen inevitably becomes a major component of global environmental pollution [9–12]. To reduce nitrogen pollution while increasing productivity, enabling crops to use nitrogen more efficiently is critical [13–15]. To achieve this goal, multiple quantitative trait loci (QTLs) in the related pathways have been identified [16–19] and differential gene expression studies in response to nitrogen resources and stresses have been reported [20–24]. However, nitrogen assimilation and its associated metabolic pathways are highly complicated. The underlying molecular mechanism of nitrogen regulation remains largely unknown [25–27].
In recent years, studies of long non-coding RNAs (lncRNAs), a class of non-coding RNA molecules longer than 200 nt, have extended our understanding of the eukaryotic transcriptome. Studies in animals and humans have indicated that lncRNAs play a key role in regulating diverse biological processes, such as transcriptional regulation, dosage compensation and genomic imprinting [28–31]. However, the study of lncRNAs in plants remains largely unexplored. With the rapid development of high-throughput sequencing technologies, large numbers of lncRNAs have been identified in silico or experimentally in Arabidopsis, maize, rice and other plant species. Boerner and McGinnis  identified 1802 potential lncRNAs in the maize genome using a computational pipeline from 18,668 full-length cDNA sequences. Li  identified 20,163 putative lncRNAs and 1704 high-confidence lncRNAs by exploiting available public expressed sequence tag (EST) databases and RNA-seq datasets from 30 different experiments. In addition, several studies have demonstrated the regulation of lncRNA expression in response to abiotic stresses in plants [34–38]. Through in silico genome-wide analysis of Arabidopsis full-length cDNA databases, 76 ncRNAs including five siRNA precursors and 14 natural antisense transcripts of protein-coding genes were identified. Twenty-two lncRNAs related to abiotic stress response were identified by studying stress expression profiles . Zhang et al.  reported the deep sequencing of mRNAs derived from drought-stressed maize and 1724 lncRNAs were identified as drought-responsive. In eukaryotes, most transcripts, including mRNAs and lncRNAs have a polyadenylated (poly (A)+) tail at their 3′ ends. However, non-polyadenylated transcripts, such as ribosomal RNAs (rRNAs) and replication-dependent histone mRNAs end in a conserved 3′ stem-loop sequence , and some lncRNAs [42–44] transcribed by RNA polymerase II are present in large numbers among transcripts. Yang et al.  explored poly(A) + and poly(A)- RNAs from HeLa cells and H9 human embryonic stem cells using deep sequencing, and found that a few excised introns accumulate in cells and constitute a new class of non-polyadenylated lncRNAs. Di et al.  performed total RNA-seq for seedlings of Arabidopsis thaliana under four stress conditions and identified 245 poly (A) + and 58 poly (A)- lncRNAs that are differentially expressed under various stress stimuli. Thus, total RNA sequencing should help us to detect more lncRNAs because it selects RNAs independent of the presence of a poly(A) tail.
To investigate the potential role of lncRNAs with/without polyA tails in response to nitrogen resources, we first performed ultra-deep total RNA sequencing with only rRNA depletion and identified a collection of intergenic/intronic lncRNAs genes expressed in maize leaves. Among the 7245 identified putative lncRNAs, 637 were nitrogen-responsive. Eighty-five differentially expressed N responsive lncRNAs were further classified into three co-expressed modules, some of which are involved in metabolic processes associated with energy, oxidative phosphorylation, phosphorus and nitrogen compounds. These results suggest that lncRNAs might have unique roles in the response to nitrogen.
Plant materials and nitrogen stress treatments
The elite maize inbred line B73 was germinated in a plastic container (20 × 20 × 40 cm) filled with a mixture of quartz sand (80 %) and vermiculite (20 %) in a greenhouse at JAAS (Jiangsu Academy of Agricultural Sciences). Plants were watered daily until the three-leaf stage and then watered with nutrient solution containing 5 mM CaCl2, 0.5 mM KH2PO4, 2 mM MgSO4, 0.05 mM EDTA-Fe-Na Salt, 10 uM MnCl2, 0.3 uM CuSO4, 1 uM ZnSO4, 50 uM H3BO3 and 0.5 uM Na2MoO4. Two different nitrate (KNO3) concentrations were used: one as sufficient N conditions (15 mM) and one as limited N conditions (0.15 mM). Leaves at the seven-leaf (V7) stage were collected and stored at −80 °C for further analysis. Three independent biological replicates were grown to validate RNA expression via quantitative real-time PCR (qPCR).
Leaf N content measurement
To analyze the total nitrogen content of the leaves, B73 leaves from the two nitrogen treatments were dried at 105 °C until their weight was stable. They were then ground to fine powder to ensure digestion. Total N contents were then determined using San++ Continuous Flow Analyzer (Skalar, the Netherlands).
RNA extraction, rRNA depletion and total RNA-seq sequencing
The total RNA of each sample was isolated using a Takara MiniBEST universal RNA extraction kit (Takara, Japan) and digested with RNase-free DNase (Qiagen, Germany) according to the manufacturer’s protocol. RNA was then purified and concentrated using an RNeasy column (Takara). The RNA concentration and quality were evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, USA). RNA samples were treated with the RiboMinus™ plant rRNA removal kit (Invitrogen, CA, USA) for rRNA depletion. Total RNA-seq libraries were then constructed and sequenced using an Illumina HiSeq™ 2500 with paired-end method by Berry Genomics Co. Ltd., China.
The Tuxedo suite  was used to complete RNA-seq analysis in this study. First, the FastQC  program was employed to assess the quality of the reads. The SolexaQA++ v3.1  program was introduced to perform quality trimming using the Q20 value (Phred score 20, 1 in 100 chance of incorrect base call). Any reads less than 40 bp were removed after trimming. Cleaned reads were then aligned to the ribosomal RNA (rRNA) sequence databases using the BWA  program with default parameters. Any reads containing rRNA sequences were removed and the remaining reads were used for further analysis. The maize genome assembly B73 (v3) was downloaded from the Plant Ensembl database (http://plants.ensembl.org, v26), which contains 39,556 genes and 63,174 transcripts. Cleaned reads were aligned to the maize reference genome using Tophat v2.0.13 . Low-quality alignment results were filtered out using Samtools  with a mapping quality threshold of 20. After the alignment, Cufflinks (version 2.1.1)  was used to assemble reads into transcripts. Subsequently, the assembled transcripts were merged using Cuffmerge  to obtain a non-redundant unified set of transcripts. Transcripts from the two assemblies were compared with reference annotation using the Cuffcompare  program. The class codes in the Cuffcompare output were used to generate a consensus assembly. The number of fragments per kilobase per million mapped reads (FPKM) per gene were calculated using cuffquant . Cuffdiff  was used to perform pairwise comparisons between samples to identify differentially expressed transcripts. Cuffdiff was used to test for differential expression and regulation among the assembled transcripts across the different samples using the Cufflinks output.
Computational identification of intergenic and intronic lncRNAs
To find lncRNAs, a strict computational strategy was performed as described by Iyer et al.  and Xiao et al. . First, transcripts with class code “I” and “U” were selected for further non-coding analysis. The Gffread program was used to extract these transcripts. Then, these transcripts were aligned against uniref90 and Pfam protein databases using the CPC  program to assess their protein-coding potential. Non-coding transcripts larger than 200 bp, with an FPKM > 1 and a CPC score < -1, were finally considered as candidate lncRNAs for further analysis.
Co-expression network analysis
To construct the co-expression network, a set of microarray data under limited and sufficient nitrogen conditions in maize produced by a previous study  were employed. The data set generated 90 gene expression profiles containing 84,246 probe sets using the Affymetrix chip platform. Combined expression profiles were extracted and normalized using the GEOquery  package. Co-expression correlation between lncRNA probes and non-lncRNA probes was then calculated using Pearson correlation with R2 ≥ 0.85. The normalized expression data from lncRNAs probes and co-expressed probes were extracted to construct an unsigned co-expression network using the WGCNA  package with a soft threshold = 8. Module assignment of lncRNAs and non-lncRNAs was identified using Topological Overlap Matrix (TOM). Eigengenes of each module were evaluated and the co-expressed network was visualized by the Cytoscape  program.
GO (Gene Ontology) term enrichment analysis
The eigengene probes of each module were assigned putative functions by searching using the Blastx  program against the uniprot protein database , using a cut-off e-value ≤ 1e-15. Coding eigengenes were then submitted to AgriGO online toolkits  for gene ontology term enrichment. Fisher’s exact test was applied for the enrichment analysis and the false discovery rate (FDR) was assessed using the Bonferroni method. The significance level was set to 0.1 to identify the significant functional terms.
Validation and quantification of lncRNAs
To validate the lncRNAs, 14 lncRNAs were selected and subjected to a PCR test using B73 genomic DNA and cDNA to validate the accuracy of the assembly. Primers were designed using mInDel  are shown in Additional file 1: Table S1 and Additional file 2: Table S2. To confirm the RNA-seq expression results, total RNA was used to synthesize cDNA using a PrimeScript™ RT reagent kit (Takara) and random hexamer primers. Ubiquitin was used as the internal reference gene control . qPCR was performed using SYBR Premix DimerEraser™ kits (Takara) on a Real Time PCR System (Roche LightCyclerR 96, USA), according to the manufacturer’s instructions. Quantification results of target transcripts were calculated using the comparative ΔΔCt method. Three biological replicates for each selected transcript were used.
Phenotypic and physiological responses of maize seedlings to N stress
The influence of N stress of varied intensities on developing maize seedlings (inbred line B73) was monitored by measuring the N content in the leaves. Before N stress, there was no significant difference detected between sufficient nitrogen and limited nitrogen treatments. After treatment, the leaf color differed between sufficient nitrogen and limited nitrogen supplies (Fig. 1a). Measurement of the leaf N content demonstrated that the N content in the nitrogen-limited maize leaves was significantly lower than that in the nitrogen-sufficient treatment. The results suggested that seedlings of inbred line B73 were sensitive to the N stress and that the internal N content was altered by the N treatments (Fig. 1b).
Ultra-deep total RNA sequencing and mapping
To obtain a comprehensive understanding of the transcriptome under nitrogen stress, total RNAs of V7 leaves were isolated from the maize inbred line B73 grown under sufficient and limited N conditions. RNA-Seq libraries were constructed from the total RNA with rRNA depletion and sequenced by the paired-end method (100 bp × 2) using the Illumina HiSeq2500 platform. Approximately 15 Gb (SN, sufficient nitrogen) and 13 Gb (LN, limited nitrogen) of sequences were obtained. After removing low quality sequences below the Q20 threshold, short sequences less than 40 bp in length, and those with residual rRNA, 83.3 % SN and 67.3 % LN of the high-quality reads were successfully aligned against the maize B73 reference genome (V3) using the splice junction mapping algorithm in Tophat2  (Table 1).
To identify unannotated RNA transcripts, Cufflinks produced a merged data set of all nitrogen-treated RNA-seq data sets using the RABT (reference annotation-based transcript) assembly algorithm. As a result, 68,541 loci with 72,169 transcripts from B73 were generated. Comparison with the known B73 reference gene set (Plant Ensembl database, v26) by the Cuffcompare program produced a non-redundant combined set of 66,342 loci with 72,169 transcripts for further analysis. Meanwhile, the comparison results showed that 20.16 % (14,547) of transcripts were completely matched to annotated coding genes, while 34,945 (48.42 %) transcripts were mapped to unknown intergenic regions and 3.82 % (2755) transcripts were located entirely within a reference intron (Fig. 2). In total, 2119 (2.94 %) transcripts were presumed to be novel isoforms. The results indicated that the majority of the transcripts were from the intergenic or intronic regions, and the total RNA-seq strategy is capable of recovering these RNAs.
Computational identification and characterization of intergenic and intronic lncRNAs
Potential lncRNAs and novel protein-coding mRNAs were identified based on their sequence, amino acid peptide lengths and protein-coding potential, using CPC against UniRef90 and Pfam protein databases (Fig. 3a). In this report, only intronic and intergenic transcripts (“I” and “U”, respectively) were selected for lncRNA prediction. Of the 37,700 transcripts with class code “I” and “U”, 16,516 were identified with non-coding transcripts after CPC analysis (CPC score < -1). Non-coding transcripts were aligned against Pfam protein databases for domain filtering. Finally, 7245 non-coding transcripts longer than 200 bp and with an FPKM > 1 were considered as LncRNAs and 5481 were potential novel protein-coding transcripts (CPC score > 1) (Additional file 3: Table S3 and Additional file 4: Table S4).
Among the 7245 putative lncRNAs, most were intergenic transcripts (6,211 lncRNAs) (class code “U”), while 1034 were located fully within an intron of a protein-coding gene (class code “I”) (Additional file 3: Table S3). The average length of lncRNAs was shorter than that of coding RNAs and mostly had only one exon (Fig. 3b, c). LncRNAs from intergenic regions could also be named as long intervening non-coding RNAs (lincRNAs). According to their genome positions, these lncRNAs overlapped with parts of inter- and intragenic sequences and were widely distributed in maize genome (Fig. 4). A fasta-formatted file containing the identified lncRNAs file and a GTF-formatted annotated file are provided in Additional file 5: Table S5.
Moreover, we used Megablast  to identify sequence similarities with known plant ncRNAs (PNRD, plant non-coding RNA database ). As a result, 167 intergenic/intronic lncRNAs showing high homology to known lncRNAs from PNRD lncRNA databases were considered as known lncRNAs. In addition, four miRNA precursors, two tRNAs, one snRNAs and 16 snoRNAs were identified as known ncRNAs (Additional file 6: Table S6).
Evolutionary conservation of intergenic/intronic lncRNAs
To evaluate the sequence conservation, we investigated the sequence similarity relationships of intergenic/intronic lncRNAs among three monocot genomes obtained from the plant ensembl genome database (http://plants.ensembl.org/) using Blast  alignment. Annotated coding RNAs (class code, “=”) sequences served as a control. Distributions of Blast scores were then visualized using SimiTri  (Fig. 4, Additional file 7: Table S7).
The extent of sequence similarity observed among intergenic and intronic lncRNAs was higher than that seen in randomly selected intergenic regions. However, both intergenic and intronic lncRNAs appeared to show less similarity among the different genomes compared with the annotated coding RNA set (Fig. 5a, b, c). A Mann-Whitney U test showed that sequence similarity of the intergenic/intronic lncRNAs was significantly different from the corresponding set (p = 2.2 × 10−12).
Unique expression pattern of lncRNAs
According to RNA types, expressed transcripts were classified into two major clusters: the lncRNAs group and the annotated coding RNAs group. The expression patterns of the two clusters were calculated separately. In detail, the difference between lncRNAs and coding RNAs was measured statistically using a two-tailed Mann-Whitney U-Test (Fig. 6, Additional file 8: Table S8).
The expression pattern of lncRNAs showed significant differences to that of the coding genes (p < 2.2e-16 in both SN and LN). The results suggested lncRNAs have their own unique expression patterns and their overall expression levels were significantly lower than those of coding RNAs.
Nitrogen-responsive intergenic and intronic lncRNAs
To investigate the potential role of intergenic and intronic lncRNAs under nitrogen stress, we performed differentially expressed genes analysis between samples with different N treatments. Based on combined GTF annotation, all transcripts’ expression profiles were obtained by Cufflinks. Differentially expressed genes were analyzed by Cuffdiff and 2391 transcripts with q values < 0.05 were identified as differentially expressed. In detail, 637 of these differentially expressed transcripts were from intergenic and intronic lncRNAs (620 intergenic and 17 intronic RNAs). Under high nitrogen conditions, 426 intergenic/intronic lncRNAs were downregulated, including 417 intergenic and nine intronic RNAs, while 211 lncRNAs (203 lincRNAs and eight intronic lncRNAs) were upregulated (Fig. 7, Additional file 9: Table S9). The remaining 1307 were other differentially expressed transcripts. Meanwhile, we observed that 6608 lncRNAs were statistically insignificant under the two nitrogen treatments. Differentially expressed lncRNAs were further evaluated using co-expression analysis to infer their potential function.
Co-expression network of nitrogen-responsive lncRNAs
Compared with specific coding genes and microRNAs, most lncRNAs’ functions, particularly in response to nitrogen resources, remain largely unknown. Many reports have suggested that co-expressed genes are usually members of the same pathway or protein complexes and are functionally related or controlled by the same transcriptional regulatory program [65–67]. Genes or proteins inside a co-expressed module can be co-regulated. Therefore, computational construction of coding and non-coding gene co-expression networks could be used to infer the potential biological functions of the lncRNAs .
After similarity analysis of microarray probes with E values < 1e-50, 85 probes were identified and considered as differentially expressed intergenic/intronic lncRNAs (Additional file 10: Table S10). We used a well-developed computational algorithm, WGCNA, to construct the co-expression network for coding RNAs and lncRNAs. The distribution of association between coding genes and lncRNA genes was calculated (Pearson correlation, R 2 ≥ 0.85). The significantly correlated genes were selected to construct the co-expression network. Excluding the reserved non-coexpressed gray module, we dissected three modules that included 33 lncRNAs that comprise various nodes in the network. Interestingly, we noted that most lncRNAs (32 of 33) were enriched in the blue module M1 (Fig. 8, Table 2); therefore, we focused on analyzing module M1. Module M1 comprises 32 N-responsive intergenic and intronic lncRNAs and 239 maize protein-coding genes (Megablast against maize annotated coding genes with e-value ≤ 1e-50) (Additional file 11: Table S11). Using intramodular connectivity calculations, two lncRNAs: TCONS_00068420 (A1ZM007479_at, E value = 6e-96) and TCONS_00020903 (A1ZM007576_at, E value = 2e-121) exhibited very high intramodular connectivity, which suggested that these lncRNAs play an important hub role in regulating the eigengenes of module M1 (Fig. 9). A full list of the genes in the blue modules is provided in Additional file 11: Table S11.
Function term enrichment of M1 eigengenes
The coding eigengenes of module M1 were further assigned and enriched to different GO terms using AgriGO  toolkits. We observed some significantly enriched genes in the biological process category of oxidative phosphorylation (GO: 0003954, corrected p value < 0.0005), generation of precursor metabolites and energy (GO: 0006091, p value < 0.033) and oxidation reduction (GO: 0055114, p < 0.054). In the molecular function category, NADH dehydrogenase activity (GO: 0003954, p < 5.80e-09) and oxidoreductase activity (GO: 0016491, p < 0.0033) were highly and significantly enriched (Fig. 10, Table 3). Besides, eigengenes were also assigned to GO terms such as phosphorylation (GO: 0016310), phosphate metabolic process (GO: 0006796), phosphorus metabolic process (GO: 0006793), nitrogen compound metabolic process (GO: 0006807), biosynthetic process (GO: 0009058) and cellular macromolecule metabolic process (GO: 0044260) (Additional file 12: Dataset S1).
Validation and quantification of lncRNAs
We selected 14 lncRNAs, eight with introns and six without introns, to conduct PCR validations using genomic DNA and cDNA. The results showed a high degree of consistency for the product sizes between assembled lncRNAs and the actually amplified product, both at genomic and transcriptome levels (Fig. 11, Additional file 1: Table S1). The results excluded the possibility of mis-assembly of sequences and alternative splicing. To ensure the accuracy and reliability of the RNA-seq results, a set of independent biological replicates of the nitrogen treatments were subjected to qPCR to confirm the expression changes. Ten transcripts were randomly selected for qPCR (Fig. 12, Additional file 2: Table S2). The results showed good consistency between the qPCR results and the high-throughput sequencing results.
Compared with traditional microarrays, RNA-seq is superior for the detection of novel lncRNAs because of its greater sensitivity, high throughput and no need for prior sequence information [68, 69]. In mRNA sequencing, the mRNAs are captured based on the presence of a poly (A) tail. Thus, lncRNAs with polyA tails can be captured. However, Cheng et al.  suggested that 40 % of lncRNA transcripts were not polyadenylated. To obtain a global view of lncRNAs, we constructed and sequenced an entire RNA library. Only ribosomal RNAs were removed. The small RNAs, mRNAs, and all forms of lncRNAs were retained. As a result, we detected more lncRNAs than any other reported mRNA sequencing projects.
Although recent studies of lncRNAs suggested that individual lncRNAs might play important and diverse biological roles, only a few plant lncRNAs have been confirmed to regulate abiotic stress. In this study, we surveyed the intergenic/intronic lncRNAs in B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency. Integrated with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen responsive lncRNAs were found. Moreover, a number of lncRNAs were identified for the first time as specifically expressed under different N conditions. The highly specific temporal and spatial expression pattern is similar to previous observations in other plant species .
The functions of most lncRNAs remain largely unknown, therefore, we constructed a gene co-expression network of nitrogen-responsive lncRNAs and coding genes and identified three modules using public microarray data sets under different nitrogen treatments. Interestingly, co-expressed lncRNAs were mainly clustered in module M1. Thus, lncRNAs may play a key role in gene expression regulation of module M1. Further functional enrichment results suggested associations with oxidation and reduction, generation of precursor metabolites and energy processes. This result indicated that N treatment during maize fertilization could have a profound influence on the energy and substrate metabolism in leaves, which is highly consistent with the expectation that the category of nitrogen compound metabolic process (GO: 0006807) could be found in the N treatment experiment. However, the most significant process was NADPH/NADH dehydrogenation (GO: 0003954, GO: 0050136, GO: 0008137, GO: 0016655 GO: 0016651). NADH and NADPH are the reduced forms of NAD+ and NADP+, respectively. The assimilation of nitrogen is associated with high NADH/NADPH consumption . N nutrient absorption improves the photosynthesis system which is one of the biggest resources of NADPH production in plants . NADPH also acts as an electron donor in carbon dioxide fixation in the Calvin cycle (light-independent reactions)  and lipid biosynthesis . Phosphorylation (GO: 0016310) was also found, which is a vital procedure in carbon fixation, and in lipids and starch assembly. Previous studies showed that plant growth and biomass production are largely connected to the activity of primary metabolism in the source leaf, including carbon fixation in photosynthesis, large amounts of nitrogen for amino acids and proteins, phosphorus for the synthesis of RNA and the realization of energy metabolism. Environmental stress is usually accompanied by a rebalancing of the cellular C-N-P homeostasis  in plants. The annotated results in this study offered new insights into the potential roles of lncRNAs in C-N-P rebalancing in response to nitrogen.
One limitation is that only 85 lncRNAs were represented by microarray probes. With the rapid accumulation of total RNA-seq data, we hope that a set of combined expression data containing all lncRNAs will soon be available, which would allow the construction of a more comprehensive co-expression network in the future.
Our current understanding of lncRNA regulation in response to nitrogen is still in its infancy. Several approaches can be used to determine their biological functions, including lncRNA silencing and structure disruption. The present study has laid a foundation for future research in this direction.
Genome-wide identification, characterization, differential expression and co-expression network analysis of intergenic/intronic lncRNA in maize leaves provided a global overview of transcriptional responses of lncRNAs to N stress. Co-expression analyses suggested that the expression of N responsive lncRNAs is highly enriched in a co-expressed module that is related to energy metabolic pathways. Future efforts will be devoted to understanding the interaction of these nitrogen-responsive lncRNAs, especially those with hub lncRNA functions in the network module. Experimental approaches such as overexpression, RNA interference and promoter analysis have been demonstrated as useful strategies to characterize their functions, which would provide valuable information for nitrogen-use efficiency improvement in maize.
Availability of supporting data
The data set supporting the results of this article is available in the NCBI SRA database as SRP059655.
Ethics approval and consent to participate
Consent for publication
false discovery rate
fragments per kilobase per million mapped reads
long intervening non-coding RNAs
long non-coding RNA
Tester M, Langridge P. Breeding technologies to increase crop production in a changing world. Science. 2010;327(5967):818–22.
Xu G, Fan X, Miller AJ. Plant nitrogen assimilation and use efficiency. Annu Rev Plant Biol. 2012;63:153–82.
Atilio JB, Causin HF. The central role of amino acids on nitrogen utilization and plant growth. J Plant Physiol. 1996;149(3–4):358–62.
Novoa R, Loomis RS. Nitrogen and plant production. Plant Soil. 1981;58(1-3):177–204.
Lam H, Coschigano KT, Oliveira IC, Melo-Oliveira R, Coruzzi GM. The molecular-genetics of nitrogen assimilation into amino acids in higher plants. Annu Rev Plant Biol. 1996;47(1):569–93.
Shadchina TM, Dmitrieva VV. Leaf chlorophyll content as a possible diagnostic mean for the evaluation of plant nitrogen uptake from the soil. J Plant Nutr. 1995;18(7):1427–37.
Tills AR, Alloway BJ. The effect of ammonium and nitrate nitrogen sources on copper uptake and amino acid status of cereals. Plant Soil. 1981;62(2):279–90.
Brady NC, Weil RR. The nature and proprieties of soils. New Jersey: Upper Saddle River; 2002.
Conway GR, Pretty JN. Unwelcome harvest: agriculture and pollution. London: Routledge; 2013.
Zhu ZL, Chen DL. Nitrogen fertilizer use in China--Contributions to food production, impacts on the environment and best management strategies. Nutr Cycl Agroecosys. 2002;63(2-3):117–27.
Zhou Q. Interaction between heavy metals and nitrogen fertilizers applied to soil-vegetable systems. B Environ Contam Tox. 2003;71(2):338–44.
Humbert S, Subedi S, Cohn J, Zeng B, Bi Y, Chen X, Zhu T, McNicholas PD, Rothstein SJ. Genome-wide expression profiling of maize in response to individual and combined water and nitrogen stresses. BMC Genomics. 2013;14(1):1–13.
Tilman D, Balzer C, Hill J, Befort BL. Global food demand and the sustainable intensification of agriculture. P Natl Acad Sci. 2011;108(50):20260–4.
Fageria NK, Baligar VC. Enhancing nitrogen use efficiency in crop plants. Adv Agron. 2005;88:97–185.
Masclaux-Daubresse C, Daniel-Vedele F, Dechorgnat J, Chardon F, Gaufichon L, Suzuki A. Nitrogen uptake, assimilation and remobilization in plants: challenges for sustainable and productive agriculture. Ann Bot-London. 2010;105(7):1141–57.
Loudet O, Chaillou S, Merigout P, Talbotec JEL, Daniel-Vedele FCCO. Quantitative trait loci analysis of nitrogen use efficiency in Arabidopsis. Plant Physiol. 2003;131(1):345–58.
Agrama H, Zakaria AG, Said FB, Tuinstra M. Identification of quantitative trait loci for nitrogen use efficiency in maize. Mol Breeding. 1999;5(2):187–95.
Raun WR, Johnson GV. Improving nitrogen use efficiency for cereal production. Agron J. 1999;91(3):357–63.
Hirel B, Bertin P, Quiller EI, Bourdoncle W, Attagnant CEL, Dellay C, Gouy AEL, Cadiou S, Retailliau C, Falque M, Gallais A. Towards a better understanding of the genetic and physiological basis for nitrogen use efficiency in maize. Plant Physiol. 2001;125(3):1258–70.
Kant S, Bi Y, Rothstein SJ. Understanding plant response to nitrogen limitation for the improvement of crop nitrogen use efficiency. J Exp Bot. 2011;62(4):1499–509.
Peng M, Bi Y, Zhu T, Rothstein SJ. Genome-wide analysis of Arabidopsis responsive transcriptome to nitrogen limitation and its regulation by the ubiquitin ligase gene NLA. Plant Mol Biol. 2007;65(6):775–97.
Herv ASAB, Canosa IES, Santero E. Transcriptome analysis of Pseudomonas putida in response to nitrogen availability. J Bacteriol. 2008;190(1):416–20.
Cai H, Lu Y, Xie W, Zhu T, Lian X. Transcriptome response to nitrogen starvation in rice. J Biosci. 2012;37(4):731–47.
Jager D, Sharma CM, Thomsen J, Ehlers C, Vogel JOR, Schmitz RA. Deep sequencing analysis of the Methanosarcina mazei Go1 transcriptome in response to nitrogen availability. P Natl Acad Sci. 2009;106(51):21878–82.
Dixon R, Kahn D. Genetic regulation of biological nitrogen fixation. Nat Rev Microbiol. 2004;2(8):621–31.
Simons M, Saha R, Guillard L, Clement G, Armengaud P, Canas R, Maranas CD, Lea PJ, Hirel B. Nitrogen-use efficiency in maize (Zea mays L.): from ‘omics’ studies to metabolic modelling. J Exp Bot. 2014;65(19):5657–71.
Krapp A. Plant nitrogen assimilation and its regulation: a complex puzzle with missing pieces. Curr Opin Plant Biol. 2015;25:115–22.
Guil SON, Esteller M. Cis-acting noncoding RNAs: friends and foes. Nat Struct Mol Biol. 2012;19(11):1068–75.
Muers M. RNA: Genome-wide views of long non-coding RNAs. Nat Rev Genet. 2011;12(11):742–3.
Mercer TR, Mattick JS. Structure and function of long noncoding RNAs in epigenetic regulation. Nat Struct Mol Biol. 2013;20(3):300–7.
Fatica A, Bozzoni I. Long non-coding RNAs: new players in cell differentiation and development. Nat Rev Genet. 2014;15(1):7–21.
Boerner S, McGinnis KM. Computational identification and functional predictions of long noncoding RNA in Zea mays. PLoS One. 2012;7(8):e43047.
Li L, Eichten SR, Shimizu R, Petsch K, Yeh C, Wu W, Chettoor AM, Givan SA, Cole RA, Fowler JE, et al. Genome-wide discovery and characterization of maize long non-coding RNAs. Genome Biol. 2014;15(2):R40.
Contreras-Cubas C, Palomar M, Arteaga-V A, Zquez M, Reyes JEL, Covarrubias AA. Non-coding RNAs in the plant response to abiotic stress. Planta. 2012;236(4):943–58.
Liu J, Jung C, Xu J, Wang H, Deng S, Bernad L, Arenas-Huertero C, Chua NH. Genome-wide analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis. Plant Cell. 2012;24(11):4333–45.
Wang H, Chung PJ, Liu J, Jang I, Kean MJ, Xu J, Chua N. Genome-wide identification of long noncoding natural antisense transcripts and their responses to light in Arabidopsis. Genome Res. 2014;24(3):444–53.
Matsui A, Nguyen AH, Nakaminami K, Seki M. Arabidopsis non-coding RNA regulation in abiotic stress responses. Int J Mol Sci. 2013;14(11):22642–54.
Qi X, Xie S, Liu Y, Yi F, Yu J. Genome-wide annotation of genes and noncoding RNAs of foxtail millet in response to simulated drought stress by deep sequencing. Plant Mol Biol. 2013;83(4-5):459–73.
Amor BB, Wirth S, Merchan F, Laporte P, D Aubenton-Carafa Y, Hirsch J, Maizel A, Mallory A, Lucas A, Deragon JM, et al. Novel long non-protein coding RNAs involved in Arabidopsis differentiation and stress responses. Genome Res. 2009;19(1):57–69.
Zhang W, Han Z, Guo Q, Liu Y, Zheng Y, Wu F, Jin W. Identification of maize long non-coding RNAs responsive to drought stress. PLoS One. 2014;9(6):e98958.
Dominski Z, Marzluff WF. Formation of the 3' end of histone mRNA. Gene. 1999;239(1):1–14.
Yang L, Duff MO, Graveley BR, Carmichael GG, Chen L, et al. Genomewide characterization of non-polyadenylated RNAs. Genome Biol. 2011;12(2):R16.
Beaulieu YB, Kleinman CL, Landry-Voyer A, Majewski J, Bachand F. Polyadenylation-dependent control of long noncoding RNA expression by the poly(A)-binding protein nuclear 1. PLoS Genet. 2012;8(11):e1003078.
Di C, Yuan J, Wu Y, Li J, Lin H, Hu L, Zhang T, Qi Y, Gerstein MB, Guo Y, et al. Characterization of stress-responsive lncRNAs in Arabidopsis thaliana by integrating expression, epigenetic and structural features. Plant J. 2014;80(5):848–61.
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78.
Andrews S, et al. FastQC: A quality control tool for high throughput sequence data. 2010. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Accessed May 2014.
Cox MP, Peterson DA, Biggs PJ. SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics. 2010;11(1):485.
Li H, Durbin R. Fast and accurate long-read alignment with Burrows--Wheeler transform. Bioinformatics. 2010;26(5):589–95.
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–11.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Iyer MK, Niknafs YS, Malik R, Singhal U, Sahu A, Hosono Y, Barrette TR, Prensner JR, Evans JR, Zhao S, et al. The landscape of long noncoding RNAs in the human transcriptome. Nat Genet. 2015;47(3):199–208.
Xiao HM, Yuan ZT, Guo DH, Hou BF, Yin CL, Zhang WQ, Li F. Genome-wide identification of long noncoding RNA genes and their potential association with fecundity and virulence in rice brown planthopper, Nilaparvata lugens. BMC Genomics. 2015;16(1):749.
Kong L, Zhang Y, Ye Z, Liu X, Zhao S, Wei L, Gao G. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 2007;35 suppl 2:W345–9.
Yang XS, Wu J, Ziegler TE, Yang X, Zayed A, Rajani MS, Zhou D, Basra AS, Schachtman DP, Peng M, et al. Gene expression biomarkers provide sensitive indicators of in planta nitrogen status in maize. Plant Physiol. 2011;157(4):1841–52.
Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 2007;23(14):1846–7.
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9(1):559.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
Altschul SF, Madden TL, Sch A, Ffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–402.
Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, et al. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 2006;34 suppl 1:D187–91.
Du Z, Zhou X, Ling Y, Zhang Z, Su Z. agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 2010;38 suppl 2:W64–70.
Lv YD, Liu YH, Zhao H. mInDel: a high-throughput and efficient pipeline for genome-wide InDel marker development. BMC Genomics. 2016;17(1):1–5.
Lin F, Jiang L, Liu Y, Lv Y, Dai H, Zhao H. Genome-wide identification of housekeeping genes in maize. Plant Mol Biol. 2014;86(4):543–54.
Yi X, Zhang Z, Ling Y, Xu W, Su Z. PNRD: a plant non-coding RNA database. Nucleic Acids Res. 2015;43(D1):D982–9.
Parkinson J, Blaxter M. SimiTri-visualizing similarity relationships for groups of sequences. Bioinformatics. 2003;19(3):390–5.
D Haeseleer P, Liang S, Somogyi R. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics. 2000;16(8):707–26.
Liao Q, Liu C, Yuan X, Kang S, Miao R, Xiao H, Zhao G, Luo H, Bu D, Zhao H, et al. Large-scale prediction of long non-coding RNA functions in a coding--non-coding gene co-expression network. Nucleic Acids Res. 2011;39(9):3864–78.
Ruan J, Dean AK, Zhang W. A general co-expression network-based approach to gene expression analysis: comparison and applications. BMC Syst Biol. 2010;4(1):8.
Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18(9):1509–17.
Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57–63.
Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G, et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005;308(5725):1149–54.
Evans JR. Photosynthesis and nitrogen relationships in leaves of C3 plants. Oecologia. 1989;78(1):9–19.
Flood PJ, Harbinson J, Aarts MG. Natural genetic variation in plant photosynthesis. Trends Plant Sci. 2011;16(6):327–35.
Ohlrogge J, Browse J. Lipid biosynthesis. Plant Cell. 1995;7(7):957.
Schl U, Ter U, Colmsee C, Scholz U, Br A, Utigam A, Weber AP, Zellerhoff N, Bucher M, Fahnenstich H, Sonnewald U. Adaptation of maize source leaf metabolism to stress related disturbances in carbon, nitrogen and phosphorus balance. BMC Genomics. 2013;14(1):442.
We are grateful to financial support by Natural Science Foundation of China (31271728, 31401393), Jiangsu Agricultural Science and Technology Independent Innovation Fund (cx2009) and Natural Science Foundation of Jiangsu Province, China (BK20141385).
The authors declare that they have no competing interests.
HZ, ZHP and YDL conceived the project idea. YDL, ZKL, MG, WCQ, TFZ and FL performed data analysis. YDL wrote the paper. HZ and ZHP revised the manuscript. All authors have read, edited and approved the current version of the manuscript.
The 7524 identified intergenic and intronic lncRNAs. (XLSX 555 kb)
The 5481 identified novel coding RNAs. (XLSX 346 kb)
Annotation results based on known ncRNA database. (XLSX 26 kb)
Blast similarity results of lncRNAs and annotated coding genes across related species. (XLSX 489 kb)
Expression information of lncRNAs and annotated coding RNAs. (XLSX 2992 kb)
Differentially expressed transcripts between two N conditions. (XLSX 76 kb)
The 85 corresponding probes of N-responsive lncRNAs. (XLSX 17 kb)
239 coding genes among eigengenes of the M1 module. (XLSX 34 kb)
Enriched GO terms of eigengenes of the M1 module. (XLSX 11 kb)
Primers from 14 lncRNAs for PCR testing. (XLSX 15 kb)
Primers from randomly selected 10 differentially expressed RNAs for qPCR testing. (XLSX 15 kb)
The FASTA-formatted sequences file of identified lncRNAs and the GTF-formatted annotation file of identified lncRNAs. (ZIP 1290 kb)
About this article
Cite this article
Lv, Y., Liang, Z., Ge, M. et al. Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.). BMC Genomics 17, 350 (2016). https://doi.org/10.1186/s12864-016-2650-1
- Long noncoding RNA
- Nitrogen response
- Coexpression network