Identification of genes associated with nitrogen-use efficiency by genome-wide transcriptional analysis of two soybean genotypes
© Hao et al; licensee BioMed Central Ltd. 2011
Received: 1 July 2011
Accepted: 26 October 2011
Published: 26 October 2011
Soybean is a valuable crop that provides protein and oil. It requires a large amount of nitrogen (N) to accumulate high levels of N in the seed. The yield and protein content of soybean seeds are directly affected by the N-use efficiency (NUE) of the plant, so improvements in NUE should improve both the yield and the quality of soybean products. Genetic engineering is one approach to improving NUE, but at present it is hampered by the lack of information on genes associated with NUE. Solexa sequencing is a recently developed method for estimating gene expression at the transcriptional level. Here, the expression profiles of two soybean varieties were compared under N-limited conditions to identify genes related to NUE.
Two soybean genotypes were grown under N-limited conditions: a low-N-tolerant variety (No.116) and a low-N-sensitive variety (No.84-70). Shoots and roots were used for sequencing. Eight libraries were generated, 2 genotypes × 2 tissues (roots and shoots) × 2 time periods [short-term (0.5 to 12 h) and long-term (3 to 12 d) responses], and their transcriptomes were compared by high-throughput tag-sequencing analysis. The eight libraries were L1, 116-shoot short-term; L2, 84-70-shoot short-term; L3, 116-shoot long-term; L4, 84-70-shoot long-term; L5, 116-root short-term; L6, 84-70-root short-term; L7, 116-root long-term; and L8, 84-70-root long-term. They yielded 5,739,999, 5,846,807, 5,731,901, 5,970,775, 5,476,878, 5,900,343, 5,930,716, and 5,862,642 clean tags, corresponding to 224,154, 162,415, 191,994, 181,792, 204,639, 206,998, 233,839 and 257,077 distinct tags, respectively. The clean tags were mapped to the reference sequences for annotation of expressed genes. Many genes showed substantial differences in expression among the libraries. In total, 3,231 genes involved in twenty-two metabolic and signal transduction pathways were up- or down-regulated. Twenty-four genes were randomly selected and their expression patterns were checked by quantitative RT-PCR; twenty-one of the twenty-four showed expression patterns consistent with the Digital Gene Expression (DGE) data.
A number of soybean genes were differentially expressed between the low-N-tolerant and low-N-sensitive varieties under N-limited conditions. Some of these genes may be candidates for improving NUE. These findings will help to provide a detailed understanding of NUE mechanisms, and also provide a basis for breeding soybean varieties that are tolerant to low-N conditions.
Plants require large amounts of nitrogen (N) for their growth and survival. This N accounts for approximately 2% of total plant dry matter. N is a necessary component of proteins, enzymes, and metabolic products involved in the synthesis and transfer of energy. At present, the increase in investment in agriculture is mainly due to the use of nitrogen fertilizer, because it directly affects yield. Nitrogen fertilizer consumption has been increasing since the early 1960s, and has stabilized slightly over the last decade. Plants can use only approximately 30-40% of the applied N, and more than 40% of the N fertilizer is lost via leakage into the atmosphere, groundwater, lakes and rivers. Such leakage results in serious environmental pollution. The United Nations Environment Programme recently reported that N pollution, water shortages and global warming are the main global threats. Improved crop and N management is required to optimize crop production and reduce the environmental risks due to N losses.
Improving N-use efficiency (NUE) by genetic improvement is necessary for the development of agriculture. NUE comprises assimilation efficiency, which involves N uptake and assimilation, and utilization efficiency, which involves N remobilization. The mechanisms regulating these processes are complex, but they must be well understood if NUE in plants is to be improved. To study the whole physiological process, plants grown under low- and high-nitrogen conditions have been compared, and the genes, proteins, and other metabolites that play roles in the various steps of nitrogen uptake, assimilation, and remobilization have been described in detail. NUE differs significantly among genotypes, so high-NUE genotypes can be selected from the initial plant material. Therefore, one important approach to improving the NUE of crop plants is to develop an understanding of the plant response to N limitation by comparing two extreme genotypes, using methods that include transcription profiling, mutant analysis, and characterization of plants that grow well under N-limited conditions.
Soybean requires more N than other major crops to sustain seed growth. As a legume, soybean can acquire N for its growth via its N-fixing symbiosis with rhizobacteria, which form nodules on the roots and can fix atmospheric N. In addition, soybean can draw mineral nitrogen from the soil. These processes may not supply enough N for soybeans to maximize yield, especially in high-yield environments. With the human population explosion, the energy crisis and environmental pollution, improving the efficiency of N nutrition of plants has become a research hotspot. Therefore, improving the NUE of soybean is a very urgent issue. Genetic engineering is one strategy to enhance the NUE of soybean.
More knowledge of soybean gene expression and regulation under N-limited conditions is needed to understand the responses of this crop to different N regimes. Such information is vital for improving the NUE of soybean, and would also help to clarify the signal transduction pathways and the mechanisms that regulate the N-uptake, assimilation and remobilization pathways.
Next-generation sequencing techniques are opening new opportunities for the life sciences, and have dramatically improved the efficiency and speed of gene discovery. This technology can rapidly produce huge numbers of short sequencing reads, making it possible to analyze a complex sample containing a large amount of nucleic acids by simultaneously sequencing the contents of the entire sample. Digital gene-expression (DGE) tag profiling is an approach for expression analysis that, driven by Solexa/Illumina technology, creates genome-wide expression profiles by sequencing. The ability to identify, quantify, and annotate expressed genes at the whole-genome level without prior sequence knowledge enables a new scale of biological experimentation, supporting higher-confidence target discovery, disease classification, and pathway studies. DGE tag profiling also offers researchers a global method, orthogonal to hybridization arrays, for validating array results, with an almost unlimited dynamic range and a tunable depth of coverage for rare transcript discovery and quantification. For example, DGE analysis was used to study gene expression in the gastric lymph nodes of Scottish blackface lambs subjected to persistent Teladorsagia circumcincta infection. To validate gene expression in the developing digits of two individuals of Hipposideros armiger, DGE-tag profiling of developing digits in a pooled sample of two Myotis ricketti was analyzed. Age-related autocrine diabetogenic effects of transgenic resistin in spontaneously hypertensive rats were investigated by gene expression profile analysis. This technique has also been used in plant research. Early developing cotton fiber was analyzed by deep sequencing, which revealed differential expression of genes in a fuzzless/lintless mutant. DGE signatures were also used to study maize development; the results of that study provided a basis for the analysis of short-read expression data and resolved specific expression signatures that will help define the mechanisms of action of the maize RA3 gene. In addition, Solexa/Illumina technology was used to analyze gene expression during female flower development. Overall, the DGE approach has provided more valuable tools for qualitative and quantitative gene expression analysis than the earlier microarray-based assays.
Here, we report the first genome-wide analysis of gene expression in soybean seedlings under low-N stress. Using the Solexa sequencing system, the transcriptomes of seedlings of two soybean varieties, one tolerant and one sensitive to low-nitrogen conditions, were compared. By investigating the expression of genes related to N utilization, a number of candidate genes that may be important in this process were identified.
Screening soybean varieties for tolerance to low-N conditions
To obtain soybean varieties with different NUEs, 145 varieties were screened (Additional file 1). Soybean seeds were germinated and grown hydroponically in one-half-strength modified Hoagland solution containing 2 mM Ca(NO3)2·4H2O, 2.5 mM KNO3, 0.5 mM NH4NO3, 0.5 mM KH2PO4, 1 mM MgSO4·7H2O, 0.05 mM Fe-EDTA, 0.005 mM KI, 0.1 mM H3BO3, 0.1 mM MnSO4·H2O, 0.03 mM ZnSO4·7H2O, 0.0001 mM CuSO4·5H2O, 0.001 mM Na2MoO4·2H2O, and 0.0001 mM CoCl2·6H2O. The containers used to grow the seedlings in this solution were 45 × 33 × 20 cm black plastic boxes fitted with a foam board with 80 holes. Two N levels were tested in these experiments (N1, 10% of the normal N concentration; N2, normal N concentration). The concentration of N in the N1 solution was chosen on the basis of a preliminary experiment: at this N level, stress symptoms (yellow leaves and dwarfing) were observed within 12 days. The culture solution was refreshed every 3 days. This experiment was conducted once. For a preliminary evaluation of N deficiency in soybean plants, the ratios of various parameters, such as relative dry weight, stem length and root length, were compared between plants grown in N1 and N2 conditions.
Based on the results of the first screening, three low-N-tolerant varieties and two low-N-sensitive varieties were selected and grown in nutrient solution at two N levels. This experiment was repeated three times. Samples were harvested separately after 0 h and 12 d of treatment. The dry plant weight, stem length, root length and nitrogen content were determined and used as the criteria for screening for genotypes with high NUE. Because different cultivars show genotype-related differences in these biological characteristics, nitrogen-use efficiency was estimated using relative indices under several nitrogen levels.
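The relative indices mentioned above can be illustrated with a small sketch. It assumes that a relative index is simply the ratio of a trait measured under low N (N1) to the same trait under normal N (N2); the function name, variety labels and measurements are hypothetical and are not data from this study.

```python
def relative_index(low_n_value: float, normal_n_value: float) -> float:
    """Ratio of a trait under low N (N1) to the same trait under normal N (N2)."""
    if normal_n_value <= 0:
        raise ValueError("normal-N value must be positive")
    return low_n_value / normal_n_value


# Hypothetical dry weights (g per plant); illustrative only.
dry_weight = {
    "variety_A": {"low_n": 0.90, "normal_n": 1.00},
    "variety_B": {"low_n": 0.45, "normal_n": 0.90},
}

# Rank varieties from most to least tolerant by relative dry weight.
ranked = sorted(
    dry_weight,
    key=lambda v: relative_index(dry_weight[v]["low_n"], dry_weight[v]["normal_n"]),
    reverse=True,
)
```

Using a ratio rather than the absolute value removes the genotype-specific baseline, so a naturally small variety is not mistaken for a sensitive one.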
Plant material and stress treatments
Seeds of the No.116 (low-N-tolerant) and No.84-70 (low-N-sensitive) soybean varieties were germinated and grown hydroponically in half-strength modified Hoagland solution. The seedlings were grown for 10 days until the first trifoliate leaves had fully developed, and were then grown with 10% of the normal N concentration. The roots and shoots were harvested separately after 0.5, 2, 6 and 12 h, and after 3, 6, 9 and 12 d of this treatment. The plant tissues were frozen in liquid nitrogen and kept at -80°C until RNA isolation.
Solexa/Illumina sequencing was carried out by BGI-Shenzhen, China. The main reagents and supplies were the Illumina Gene Expression Sample Prep kit and the Illumina Sequencing Chip (flowcell), and the main instruments were the Illumina Cluster Station and the Illumina HiSeq™ 2000 System. The experimental process is summarized as follows. Total RNA (6 μg) was extracted, and mRNA was purified with oligo(dT) magnetic beads. Oligo(dT) was then used as a primer to synthesize first- and second-strand cDNA. The 5' ends of tags can be generated by either of two endonucleases, Nla III or Dpn II. Usually, the bead-bound cDNA is digested with the restriction enzyme Nla III, which recognizes and cuts at CATG sites. All fragments other than the 3' cDNA fragments attached to the oligo(dT) beads are washed away, and Illumina adaptor 1 is ligated to the sticky 5' end of the digested bead-bound cDNA fragments. The junction of Illumina adaptor 1 and the CATG site forms the recognition site of Mme I, an endonuclease whose cutting site is separate from its recognition site: it cuts 17 bp downstream of the CATG site, producing tags with adaptor 1. After the 3' fragments are removed by magnetic bead precipitation, Illumina adaptor 2 is ligated to the 3' ends of the tags, giving tags with different adaptors at each end, which form the tag library. After 15 cycles of linear PCR amplification, 95-bp fragments are purified by 6% TBE PAGE gel electrophoresis. After denaturation, the single-stranded molecules are fixed onto the Illumina Sequencing Chip (flowcell). Each molecule grows into a single-molecule cluster sequencing template through in situ amplification. Four types of nucleotides labeled with four colors are then added, and sequencing is performed by the sequencing-by-synthesis (SBS) method. Each tunnel (lane) of the flowcell generates millions of raw reads with a sequencing length of 35 bp.
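The tag produced by this protocol can be mimicked in silico. The sketch below assumes the tag is the 3'-most CATG site plus the 17 bases downstream of it, as the description above implies; the function name and the example sequence are hypothetical, and the real enzymes may behave less cleanly than this string operation.

```python
from typing import Optional


def three_prime_tag(cdna: str, site: str = "CATG", tag_len: int = 17) -> Optional[str]:
    """Return the 3'-most `site` plus the next `tag_len` bases, or None.

    None is returned when the cDNA has no site, or when fewer than
    `tag_len` bases follow the 3'-most site.
    """
    seq = cdna.upper()
    pos = seq.rfind(site)  # 3'-most occurrence of the NlaIII site
    if pos == -1:
        return None
    tag = seq[pos:pos + len(site) + tag_len]
    return tag if len(tag) == len(site) + tag_len else None


# Hypothetical cDNA with two CATG sites; only the 3'-most one yields the tag.
example_cdna = "AAACATGTTTTTCATG" + "ACGTACGTACGTACGTA" + "GGG"
example_tag = three_prime_tag(example_cdna)
```

Each resulting tag is 21 nt long (4 nt of recognition site plus 17 nt), which is why the annotation step works with "CATG+17" sequences.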
Gene expression annotation
All tags were annotated using the database provided by Illumina. Briefly, a preprocessed database of all possible CATG+17-nt tag sequences was created from the soybean genome and transcriptome. All clean tags were mapped to the reference sequences, allowing at most a 1-bp mismatch. Clean tags that mapped to reference sequences from multiple genes were filtered out, and the remaining clean tags were designated as unambiguous clean tags. The number of unambiguous clean tags for each gene was calculated and then normalized to TPM (number of transcripts per million clean tags) [16, 17].
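The filtering and normalization steps can be sketched as follows. This is a minimal illustration, assuming that the TPM denominator is the total number of clean tags in the library (as the definition "per million clean tags" suggests) and that the tag-to-gene mapping has already been computed; the function and variable names are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def unambiguous_tpm(tag_counts: Dict[str, int],
                    tag_to_genes: Dict[str, Set[str]]) -> Dict[str, float]:
    """Normalize per-gene counts of unambiguous tags to TPM."""
    total_clean = sum(tag_counts.values())
    if total_clean == 0:
        return {}
    gene_counts: Counter = Counter()
    for tag, count in tag_counts.items():
        genes = tag_to_genes.get(tag, set())
        if len(genes) == 1:  # keep only tags that map to exactly one gene
            (gene,) = genes
            gene_counts[gene] += count
    return {gene: count * 1_000_000 / total_clean
            for gene, count in gene_counts.items()}


# Hypothetical library: tag "t2" maps to two genes and is therefore dropped.
counts = {"t1": 30, "t2": 10, "t3": 60}
mapping = {"t1": {"g1"}, "t2": {"g1", "g2"}, "t3": {"g2"}}
tpm = unambiguous_tpm(counts, mapping)
```

Ambiguous tags still contribute to the denominator in this sketch, so the per-gene TPM values need not sum to one million.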
Analysis and screening of differentially expressed genes (DEGs)
The raw image data from sequencing are transformed by base calling into sequence data (raw data, or raw reads) and stored in FASTQ format. FASTQ files store both the read sequences and their quality; each read is described by four lines. Raw sequences contain 3' adaptor fragments as well as a few low-quality sequences and several types of impurities, and are transformed into clean tags by a series of data-processing steps. As described above, a virtual library containing all possible CATG+17-base sequences of the reference gene sequences was constructed; clean tags were mapped to it, allowing at most a 1-bp mismatch; tags mapping to multiple genes were filtered out; and the number of unambiguous clean tags for each gene was normalized to TPM (number of transcripts per million clean tags). A rigorous algorithm was used to identify differentially expressed genes between two samples. The P-value corresponds to the test for differential gene expression. Because many genes are tested at once, the FDR (false discovery rate) was used to set the P-value threshold for multiple testing. Assume that R differentially expressed genes have been selected, among which S genes truly show differential expression and V genes are false positives. If the error ratio Q = V/R must stay below a cutoff (e.g. 1%), the FDR should be preset to a number no larger than 0.01. FDR ≤ 0.001 and |log2Ratio| ≥ 1 were used as the thresholds for judging the significance of differences in transcript abundance. More stringent criteria, with a smaller FDR and a greater fold-change value, can be used to identify DEGs.
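The two thresholds can be applied as a simple filter. The sketch below assumes that an FDR value has already been computed for each gene by the differential-expression test; the function names and the floor used to avoid taking the logarithm of zero are assumptions, not details given in the text.

```python
import math


def log2_ratio(tpm_a: float, tpm_b: float, floor: float = 0.01) -> float:
    """log2(TPM_b / TPM_a); zero TPM is replaced by `floor` (an assumption)."""
    return math.log2(max(tpm_b, floor) / max(tpm_a, floor))


def is_deg(tpm_a: float, tpm_b: float, fdr: float,
           fdr_cutoff: float = 0.001, min_abs_log2: float = 1.0) -> bool:
    """True when FDR <= cutoff and the absolute log2 ratio is >= the threshold."""
    return fdr <= fdr_cutoff and abs(log2_ratio(tpm_a, tpm_b)) >= min_abs_log2


# Hypothetical genes as (TPM in library A, TPM in library B, FDR).
genes = {
    "gene_up": (10.0, 40.0, 0.0005),      # 4-fold change, significant
    "gene_small": (10.0, 15.0, 0.0005),   # 1.5-fold change, below threshold
    "gene_noisy": (10.0, 40.0, 0.01),     # 4-fold change, FDR too high
}
degs = [name for name, (a, b, fdr) in genes.items() if is_deg(a, b, fdr)]
```

A threshold of |log2Ratio| ≥ 1 corresponds to at least a two-fold change in either direction.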
Real-time quantitative RT-PCR (qRT-PCR) analysis
The expression of candidate genes was determined using qRT-PCR. Tissue samples were removed from the freezer and ground in liquid nitrogen. Total RNA was isolated using Trizol reagent according to the manufacturer's instructions. The quality of the RNA was assessed using an Agilent 2100 Bioanalyzer. First-strand cDNA was synthesized from total RNA using Superscript II reverse transcriptase (Invitrogen). Gene-specific primers were designed according to the gene sequences using Primer 5.0 software. Twenty-four pairs of primers were designed to amplify the 24 target genes, which were then cloned and sequenced. Using the sequences obtained, gene-specific primers were designed for each target gene for qPCR (Additional file 2). Where possible, primers were designed to span intron/exon boundaries to avoid amplification of genomic DNA in qRT-PCR. Quantitative RT-PCR was performed with the iQ™5 and MyiQ™ Real-Time PCR Detection Systems (Bio-Rad) in a final volume of 20 μl containing 2 μl of a 1/10 dilution of cDNA in water, 10 μl of 2 × SYBR Green Real-time PCR Master Mix (TOYOBO), and 10 μM of forward and reverse primers. The thermal cycling conditions were as follows: 40 cycles of 95°C for 5 s for denaturation and 55°C for 10 s for annealing and extension. qRT-PCR was performed on three biological replicates. Samples were run in triplicate on the same plate with a negative control that lacked cDNA. Positive controls were set up for each sample in triplicate using the soybean β-actin gene, which was also used to normalize gene expression. PCR efficiency was determined from a series of 2-fold dilutions of cDNA. The calculated efficiency of all primers was 0.9-1.0. The relative expression levels of genes were calculated using the 2^(-ΔΔCT) method, which is based on the difference in CT between the control β-actin products and the target gene products.
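The 2^(-ΔΔCT) calculation can be written out in a few lines. This sketch assumes an amplification efficiency close to 100% (consistent with the reported 0.9-1.0) and a single reference gene; which sample serves as the calibrator is not stated in the text, so the "control" arguments and the CT values below are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target gene in a sample relative to a control sample.

    Delta-CT is target CT minus reference-gene CT within each sample;
    delta-delta-CT is the sample delta-CT minus the control delta-CT.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical CT values: the target crosses threshold 2 cycles earlier in the
# sample than in the control, with the reference gene unchanged.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
```

A fold change above 1 indicates higher expression in the sample than in the control; a value below 1 indicates lower expression.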
Screening for soybean varieties with high NUE at the seedling stage
[Table: Performance of five soybean varieties under low-N and normal-N conditions. Columns: dry plant weight; total nitrogen accumulation in shoot; amount of N absorbed.]
[Table: Categorization and abundance of tags. Row groups: all tag mapping to gene; unambiguous tag mapping to gene; all tag-mapped genes; unambiguous tag-mapped genes; mapping to mitochondrion; mapping to chloroplast; mapping to genome. Measures: total % of clean tag; distinct tag number; distinct tag % of clean tag; % of ref genes.]
After dirty tags were filtered from the raw data, a total of 5,739,999, 5,846,807, 5,731,901, 5,970,775, 5,476,878, 5,900,343, 5,930,716 and 5,862,642 clean tags, corresponding to 224,154, 162,415, 191,994, 181,792, 204,639, 206,998, 233,839 and 257,077 distinct tags, were obtained for the L1, L2, L3, L4, L5, L6, L7 and L8 libraries, respectively. The eight resulting databases represent the expressed sequences (the transcriptome) of each library. Tags can be mapped to known transcripts to reveal the molecular events behind DGE profiles. In our study, the tag sequences of the eight DGE libraries were mapped to the sequences of the soybean (Glycine max) genome project, and they matched more than 80% of all sequence entries in the databases. The tags mapping to the database generated 29,503, 28,271, 27,960, 27,977, 29,174, 28,799, 29,217 and 30,581 tag-mapped transcripts for the L1, L2, L3, L4, L5, L6, L7, and L8 libraries, respectively.
Gene ontology functional enrichment analysis of DEGs
The P value for each GO term was calculated with the hypergeometric test:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

where N is the number of all genes with GO annotation; n is the number of DEGs in N; M is the number of all genes annotated to a given GO term; and m is the number of DEGs in M. The P values were Bonferroni-corrected, and a corrected P value ≤ 0.05 was chosen as the threshold. GO terms meeting this threshold were defined as significantly enriched among the DEGs. This analysis allowed us to determine the major biological functions of the differentially expressed genes.
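The symbols N, n, M and m defined above match the inputs of a standard upper-tail hypergeometric enrichment test. The sketch below assumes that test, that is, the probability of seeing m or more DEGs among the M genes annotated to a term, followed by the Bonferroni correction mentioned in the text. The function names and the example numbers are hypothetical.

```python
from math import comb
from typing import List


def enrichment_p(N: int, n: int, M: int, m: int) -> float:
    """Upper-tail hypergeometric probability P(X >= m).

    N: annotated genes; n: DEGs among them; M: genes annotated to the term;
    m: DEGs annotated to the term.
    """
    upper = min(n, M)
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(m, upper + 1))
    return tail / comb(N, n)


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each P value by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical term: 4 annotated genes, 2 DEGs, 2 genes in the term, both DEGs.
p_small = enrichment_p(4, 2, 2, 2)
corrected = bonferroni([0.01, 0.5])
```

Summing the upper tail directly is equivalent to one minus the cumulative probability of observing fewer than m DEGs, and avoids subtracting two nearly equal numbers when P is very small.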
Pathway enrichment analysis for DEGs
Different genes often cooperate to achieve their biological functions, and pathway-based analysis helps to further understand these functions. For the pathway-based analysis, KEGG, the major public pathway-related database, was used. Pathway enrichment analysis identifies metabolic or signal transduction pathways that are significantly enriched in DEGs in comparison with the whole-genome background. The formula used for this calculation is the same as that used in the GO analysis. Here, N is the number of genes with a KEGG annotation, n is the number of DEGs in N, M is the number of genes annotated to a specific pathway, and m is the number of DEGs in M. Pathways with a Q value ≤ 0.05 were defined as significantly enriched in differentially expressed genes. Pathway enrichment analysis thus shows which metabolic and signal transduction pathways the differentially expressed genes are associated with.
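The text does not say how the Q values were obtained. One common choice is the Benjamini-Hochberg adjustment, sketched below purely as an assumption; the function name and the example P values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


# Hypothetical pathway P values.
pathway_p = [0.01, 0.04, 0.03, 0.5]
pathway_q = benjamini_hochberg(pathway_p)
enriched = [i for i, value in enumerate(pathway_q) if value <= 0.05]
```

Unlike the Bonferroni correction used for the GO terms, this adjustment controls the expected proportion of false positives among the pathways called significant rather than the chance of any false positive.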
Differential gene expression between the two soybean varieties
Functional annotation of differentially expressed genes
After the differentially expressed genes had been identified, their annotations were established using GO functional enrichment analysis. In addition, all the genes were mapped to terms in the KEGG database and compared with the complete reference gene background to identify genes involved in pathways that were significantly enriched. Among all the genes with KEGG pathway annotations, 6,473 differentially expressed genes were identified between L1 and L2; 9,014 between L3 and L4; 6,758 between L5 and L6; and 8,628 between L7 and L8. In the four comparisons, the main significantly enriched pathways were the plant circadian rhythm pathway, the flavone/flavonol biosynthetic pathway, the glutathione metabolism pathway, the citrate cycle (TCA cycle), the alanine, aspartate and glutamate metabolism pathway, the nitrogen metabolism pathway, the phosphatidylinositol signaling system, and the protein export and ribosome pathways. We noted that the 'nitrogen metabolism' pathway was directly involved in nitrogen availability. Large amounts of energy are required to drive the nitrate assimilation, ammonium assimilation and amino acid biosynthesis pathways. The 'carbohydrate metabolism' pathway could provide most of the energy for these pathways.
Most differentially expressed annotated genes in L1 vs. L2, L3 vs. L4, L5 vs. L6, and L7 vs. L8 libraries based on expressed tag frequency (relative abundance expressed as TPM ratio)
L1 vs. L2 up: Papain family cysteine protease
L1 vs. L2 down: BRG-1 ASSOCIATED FACTOR 60; MYB transcription factor MYB52
L3 vs. L4 up: similar to PDF; Serine/threonine protein kinase; Gibberellin regulated protein; Protease inhibitor/seed storage/LTP family; Glycosyl hydrolases family 17
L3 vs. L4 down: Late embryogenesis abundant protein; Glyceraldehyde 3-phosphate dehydrogenase; Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; Ribosomal protein L6
L5 vs. L6 up: ABC transporter related protein; OTU-like cysteine protease family protein
L5 vs. L6 down: harpin inducing protein; conserved hypothetical protein; methyl esterase 17; disease resistance protein
L7 vs. L8 up: functional candidate resistance protein KR1; transcription factor homolog BTF3-like protein; glutathione S-transferase omega; dehydration-responsive family protein
L7 vs. L8 down: NAD dependent epimerase/dehydratase family; Ribosomal protein L6; UDP-glucuronosyl and UDP-glucosyl transferase; membrane associated ring finger 1,8
Genes encoding transcription factors
Transcription factors are essential for the regulation of gene expression, and changes in gene transcription are associated with changes in the expression of transcription factors. Our DGE results showed that forty-eight genes encoding transcription factors were differentially expressed, with fold changes of 1.85 to 62.54; thirty-one were up-regulated and seventeen were down-regulated. Among the forty-eight genes, six encoded bHLH family proteins, two encoded bZIP transcription factors, five encoded MYB transcription factors, one encoded a putative TATA element modulatory factor, one a GT-2 transcription factor, one an HMG box factor SOX-1, one an EIL1 transcription factor, one an auxin response factor, and one a BTF3-like protein transcription factor; the others all encoded zinc-finger family proteins.
Kinases play important roles in the development of eukaryotic cells, such as cell cycle control and cell-type determination and differentiation. They regulate metabolic processes in various organs and tissues, and facilitate and control growth, differentiation, reproductive activities, learning and memory. Kinases help the organism to cope with changing conditions and stresses in the environment. Because some of their targets are transcription factors, they also play a role in regulating transcription. Forty-two kinase genes were identified as significantly differentially expressed transcripts, including twenty-four up-regulated and eighteen down-regulated genes. Among the differentially expressed kinase genes, four were tyrosine kinases, nineteen were serine-threonine protein kinases, three were leucine-rich repeat transmembrane protein kinases, two were wall-associated kinases, two were stress-induced receptor-like kinases, and fifteen were other types of kinases.
Genes involved in carbon and energy metabolism
Many genes involved in carbon and energy metabolism were differentially expressed under low-N conditions. Altered expression of numerous genes involved in glycolysis, the citrate cycle, oxidative phosphorylation, nitrogen metabolism and photosynthesis was observed. For example, four genes involved in phosphorylation showed increased transcript abundance. These genes encoded casein kinase II subunit alpha (Glyma17g17790), Cdc2-related protein kinase (Glyma04g37630), triose-phosphate transporter family protein (Glyma14g23570) and glucose-6-phosphate 1-dehydrogenase (Glyma19g41450). The TPMs for those transcripts were up-regulated by 3.14 to 5.66-fold. Eight genes involved in photosynthesis were differentially expressed, including three genes encoding pfkB family carbohydrate kinases. Their expression was increased by 2.44-fold (Glyma01g07780), 2.44-fold (Glyma10g32050), and 3.9-fold (Glyma14g37260). Four genes encoded chloroplast-related proteins, including three up-regulated chlorophyll a/b binding protein genes (Glyma04g08370, Glyma05g24660, and Glyma09g07310) and one down-regulated chloroplast-targeted copper chaperone gene (Glyma03g37060). In addition, one gene encoding a photosynthetic reaction center protein (Glyma13g15560) was up-regulated. In the glycolysis pathway, genes encoding eight glycosyl hydrolase family members were differentially expressed; two were down-regulated and six were up-regulated, with the greatest increase (6.98-fold) observed for Glyma04g01030.
Nitrogen assimilation-related genes
Nitrogen assimilation is a fundamental biological process in plants. The assimilation of nitrogen has profound effects on plant productivity, biomass, and crop yield, and nitrogen deficiency can inhibit the formation of structural components. Some genes involved in nitrogen assimilation showed significant differential expression in this study. For example, our DGE results indicated that seven genes encoding amino acid transporter proteins were differentially expressed: four genes were up-regulated (Glyma06g09270, Glyma10g12290, Glyma11g11310, Glyma14g05890) and three genes were down-regulated (Glyma01g36590, Glyma13g44450, Glyma17g26590). In addition, two genes, encoding a glutamate synthase family protein (Glyma02g35560) and an asparagine synthetase (Glyma14g37440), were up-regulated, and one nitrate gene (Glyma10g08730) was down-regulated.
Other differentially regulated genes
Other genes also showed high-level differential expression related to low-N conditions. Among the DEGs, six genes related to oxidoreductase activity were identified, including a putative ACC-oxidase, a 3-hydroxyacyl-CoA dehydrogenase, a short-chain dehydrogenase and an omega-3 fatty acid desaturase. Six defense response genes were also identified, including a putative defensin-like protein, a candidate disease-resistance protein, a wound-induced protein, an abscisic acid-responsive HVA22 family protein, and a GDSL-motif lipase. In addition, one gene encoding a BURP domain protein and one gene encoding a CBS domain-containing protein were found. Another two genes (Glyma08g48240, encoding a UDP-glycosyltransferase, and Glyma10g38990, encoding a phosphoinositide binding protein) were also up-regulated. Expression of Glyma10g40580, encoding a gibberellin-regulated protein, was up-regulated 33.32-fold under low-N conditions. Expression of Glyma12g33350, encoding an aminotransferase family protein, was up-regulated 8.76-fold. Expression of Glyma14g07190, encoding a dehydration-responsive family protein, was up-regulated 17.63-fold. Some genes encoding ABC family proteins were also differentially expressed (Glyma19g13500, Glyma08g07560, Glyma06g14450, and Glyma16g28900).
Confirmation of tag-mapped genes by qRT-PCR
This study demonstrated differences in transcript abundance and regulation in response to low-N stress between two soybean varieties, one tolerant and one sensitive to low-N conditions. N stress occurs frequently under agricultural field conditions, and improving the NUE of plants requires strategies for manipulating the genetic architecture of soybean. In this study, numerous genes showed altered expression in plants under low-N stress. These differences in expression were analyzed by DGE profiling, which is a fully quantitative approach to gene expression analysis. Identification of differentially expressed genes provides a new platform for understanding the relationships between complex N responses and regulatory mechanisms. Tag-based deep sequencing gives a direct digital readout of cDNAs, showing a dynamic range of genes from transcript libraries. In these experiments, approximately 25,000-27,000 tag-mapped genes were identified for each library. Detailed analysis of N-related genes and pathways showed that approximately 15 significantly differentially expressed genes were enriched in various N-related metabolic or signaling pathways. In addition, several other biological processes that have not previously been linked to N stress, such as flavonoid biosynthesis, natural killer cell mediated cytotoxicity, flavone/flavonol biosynthesis, the phosphatidylinositol signaling system, and N-glycan biosynthesis, were dramatically altered during the N-stress response. Genes in these pathways might be novel genes relevant to NUE in soybean.
Nitrogen metabolism genes
Through annotation of the transcriptome and screening for differentially expressed genes, several putatively N-related genes were discovered. These included both up-regulated and down-regulated genes. Nitrogen is utilized by plants in several steps, including uptake, assimilation, translocation, recycling, and remobilization. These events are highly dynamic and complex, and numerous genes are potentially involved.
In plants, N uptake is based on the absorption kinetics of transporters across the root cell membranes, mass flow, and diffusion to the surface of single or composite roots. Among the candidate genes identified in this study, some may play roles in the uptake process, such as Glyma13g17730, Glyma17g10440, Glyma05g01450, and Glyma02g43740, all of which encode nitrate transporters that are presumably responsible for nitrate absorption from the soil. Some genes were related to the cell membrane, where they may play roles in nutrient absorption and/or the N-uptake process. These genes included a wall-associated kinase (Glyma19g21700) and a membrane-associating domain (Glyma16g08050).
Another fundamental biological process that occurs in plants is N assimilation. The major enzymes in N assimilation are glutamine synthetase (GS), glutamate synthase (GOGAT), glutamate dehydrogenase (GDH), aspartate aminotransferase (AspAT), and asparagine synthetase (AS). Each of these enzymes exists in multiple isoenzymic forms encoded by distinct gene families. Several candidate genes that may take part in N assimilation were found, such as glutamate dehydrogenase (Glyma02g07940), which might play a unique role in assimilating ammonia or catabolizing glutamate during these processes; an NADH glutamate synthase precursor (Glyma14g32500), which was hypothesized to be linked to the process by which NADH-GOGAT catalyzes the rate-limiting step of ammonia assimilation in root nodules; and asparagine synthetase (AS; Glyma14g37440), which is regulated by the carbon/nitrogen status of the plant. The levels of asparagine and AS activity are also controlled by environmental and metabolic signals. In this study, a gene (Glyma12g33350) encoding a predicted aspartate aminotransferase was found to be up-regulated under low-N conditions. In plants, aspartate aminotransferase (AAT, EC 2.6.1.1) plays a key role in primary N assimilation, the transfer of reducing equivalents, and the interchange of carbon and nitrogen pools among subcellular compartments.
Regulation of transcription factors and protein kinases
A single transcription factor can regulate the expression of multiple genes in a metabolic pathway, and transcription factors are important regulators of many plant responses. One approach to the genetic improvement of crops is therefore to modify metabolic pathways, and transcription factors might be potent tools for engineering enhanced stress tolerance in plants [36, 37]. Nitrate is the main source of nitrogen for plants, and it serves as the primary signal for several developmental processes including carbon/N metabolism and other metabolic pathways. It is likely that the expression of numerous genes is regulated in these processes, and some transcription factors and kinases are related to them. For example, expressing a Dof1 transcription factor in Arabidopsis improved growth and increased N assimilation under low-N conditions by regulating genes encoding enzymes for the production of carbon skeletons. Therefore, enhanced expression of key transcription factors could improve the stress tolerance of soybean.
The GATA factors constitute a subgroup of DNA-binding proteins whose members recognize HGATAR core sequences within promoters and enhancers. Many GATA factors can activate or inactivate genes in response to environmental deficiencies and/or to extract chemical elements (e.g., iron and nitrogen) from the surrounding environment. In fungi, some GATA factors regulate N metabolism and are required to activate the expression of N catabolic enzymes during periods of N deficiency. However, little is known about the functions of GATA factors in plants. In this study, a gene encoding a hypothetical GATA factor protein (Glyma12g08130) showed differential expression under low-N conditions. We hypothesize that this gene is involved in N assimilation in soybean. The function of the gene will be studied by RNA interference or by overexpression in transgenic plants in the near future.
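The HGATAR core sequence mentioned above is written in IUPAC degenerate notation (H = A, C or T; R = A or G). As a minimal sketch of how candidate GATA-factor binding sites could be located in a promoter, the following Python example scans both strands of a sequence for the motif. The function names and the test sequence are hypothetical and are shown for illustration only; this is not the procedure used in this study.

```python
import re

# IUPAC codes needed for the HGATAR motif (H = A/C/T, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "H": "[ACT]", "R": "[AG]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif="HGATAR"):
    """Return sorted (start, strand, site) tuples for every motif hit.

    Start positions are 0-based and always refer to the forward strand.
    A lookahead is used so that overlapping sites are also reported.
    """
    seq = seq.upper()
    pattern = re.compile("(?=(%s))" % motif_to_regex(motif))
    n, k = len(seq), len(motif)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    # Scan the reverse complement and convert hits to forward coordinates.
    rc = reverse_complement(seq)
    hits += [(n - m.start() - k, "-", m.group(1)) for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical 10-bp fragment, not a real soybean promoter.
print(find_motif("CCAGATAGCC"))
```

In practice the input would be the upstream region of a gene of interest, and any hit would only be a candidate site that requires experimental confirmation.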
Several lines of biochemical and genetic research indicate that reversible protein phosphorylation is involved in the regulation of plant responses to various environmental stimuli. Some protein kinases might be involved in the regulation of cell differentiation and N metabolism in nitrogen-fixing filamentous cyanobacteria [41, 42]. Wall-associated kinases are also involved in various processes in plants, including pathogen resistance, heavy-metal tolerance and organ development. However, little is known about their function in tolerance to nutrient deficiency. Our DGE results indicated that two genes encoding wall-associated kinases, Glyma19g21700 and Glyma19g21690, were up-regulated under N-limited conditions. In addition, a gene encoding a receptor-like kinase (Glyma13g09810) was differentially expressed between the two varieties under N-limited conditions. Recent studies revealed that higher plants also have genes encoding putative receptor kinases (receptor-like kinases; RLKs). For instance, the completely sequenced Arabidopsis genome contains more than 500 genes encoding RLKs, suggesting that higher plants, like animals, use receptor kinase signaling widely to modulate gene expression in response to diverse stimuli. Some research indicated that RLKs play important roles in plant growth and development as well as in hormone and stress responses. Therefore, we hypothesize that the Glyma13g09810 gene might be important for adaptation to low-N conditions in soybean.
Other differentially regulated genes
In addition to the genes described above, the transcript profiles of several other genes were altered under low-N conditions. For example, a gene encoding a BURP domain protein (Glyma04g35360) was differentially expressed. Some reports suggest that genes from the BURP family may be crucial for responses and adaptations to stresses. All the members of this family were shown to be induced by at least one type of stress treatment, such as drought, salt, cold, abscisic acid or nutrient stress. Therefore, the soybean BURP gene may be responsive to N stress. A gene encoding a CBS domain-containing protein was also differentially expressed between the two soybean varieties. Previous research revealed that CBS domain-containing proteins play important roles in stress response/tolerance and development in plants. To determine whether this protein has the potential to improve the tolerance of transgenic plants to low-N stress, its role in development and in the response to N stress should be investigated further. In addition, some published results suggest that a phosphatase is involved in modulating phosphoinositide signals during the stress response. Our results showed that one gene (Glyma10g38990) putatively encoding a phosphoinositide-binding protein was up-regulated. We suggest that this gene may function as a component of a stress response pathway that protects the plant against the effects of N deficiency.
The DGE results indicated that three genes predicted to be members of the ABC1 family were differentially expressed between N1 and N2 conditions. Several plant ABC1 genes participate in the abiotic stress response. Plants have evolved diverse adaptive physiological and biochemical mechanisms to resist various stresses, and thus the expression of many related genes is altered.
In the DGE analysis of differentially expressed genes under low-N conditions, fifty-three up-regulated and forty-seven down-regulated genes lacked annotation. We hypothesize that these genes are N-related transcripts. They may be unique to soybean and therefore absent from other species. Further research on these genes will be carried out based on the DEG information and bioinformatic analysis.
This study has demonstrated the usefulness of the digital gene expression (DGE) approach for identifying differentially expressed genes between two soybean genotypes in N-limiting conditions. A large data set of tag-mapped transcripts was obtained, which provides a strong basis for future research on the N nutrition of other crops. In addition, a new list of candidate targets for functional studies on genes involved in N utilization has been generated. Further work should concentrate on characterizing these genes. This could lead to a better understanding of the genetic basis of the phenotypic differences between the two soybean genotypes in N-limiting conditions, which is essential for improving the NUE of soybean.
This work was supported by a grant from the National Genetically Modified Organisms Breeding Major Projects (2009ZX08004-005) and grants from the Crop Germplasms Resource Protection Project (NB07-2130315-06, NB08-2130315-(25-31)-06). We also thank anonymous reviewers for their constructive comments.
- Horchani F, R'bia O, Hajri R, Aschi-Smiti A: Nitrogen Nutrition and Ammonium Toxicity in Higher Plants. International. 2011, 7 (1): 1-16.
- Raun WR, Solie JB, Stone ML, Martin KL, Freeman KW, Mullen RW, Zhang H, Schepers JS, Johnson GV: Improving nitrogen use efficiency in cereal grain production with optical sensing and variable rate application. Agron J. 2002, 94: 815-820. 10.2134/agronj2002.0815.
- Frink CR, Waggoner PE, Ausubel JH: Nitrogen fertilizer: retrospect and prospect. Proc Natl Acad Sci USA. 1999, 96: 1175-1180. 10.1073/pnas.96.4.1175.
- Lian XM, Wang SP, Zhang JW: Expression profiles of 10,422 genes at early stage of low nitrogen stress in rice assayed using a cDNA microarray. Plant molecular biology. 2006, 60 (5): 617-631. 10.1007/s11103-005-5441-7.
- Kant S, Bi YM, Rothstein SJ: Understanding plant response to nitrogen limitation for the improvement of crop nitrogen use efficiency. J Exp Bot. 2010, 1-12.
- Hirel B, Le Gouis J, Ney B, Gallais A: The challenge of improving nitrogen use efficiency in crop plants: towards a more central role for genetic variability and quantitative genetics within integrated approaches. Journal of Experimental Botany. 2007, 2369-2387.
- Sinclair TR, deWit CT: Photosynthesis and nitrogen requirements for seed production by various crops. Science. 1975, 189: 565-567. 10.1126/science.189.4202.565.
- Harper JE: Nitrogen metabolism. Soybeans: Improvement, Production, and Uses. Edited by: Wilcox JR, DA. 1987, ASA/CSSA/SSSA, Madison, WI, 497-533. second
- Ansorge WJ: Next-generation DNA sequencing techniques. N Biotechnol. 2009, 25: 195-203. 10.1016/j.nbt.2008.12.009.
- Vicki G: Digital Gene-Expression Profiling, Genetic Engineering&Biotechnology. News. 2009, 29 (7):
- Pemberton JM, Beraldi D, Craig BH, Hopkins J: Digital gene expression analysis of gastrointestinal helminth resistance in Scottish blackface lambs. Molecular Ecology. 2011, 20: 910-919. 10.1111/j.1365-294X.2010.04992.x.
- Wang Z, Dong D, Ru BH: Digital gene expression tag profiling of bat digits provides robust candidates contributing to wing formation. BMC Genomics. 2010, 11: 619. 10.1186/1471-2164-11-619.
- Wang QQ, Liu F, Chen XS, Ma XJ, Zeng HQ, Yang ZM: Transcriptome profiling of early developing cotton fiber by deep-sequencing reveals significantly differential expression of genes in a fuzzless/lintless mutant. Genomics. 2010, 96 (6): 369-376.
- Eveland AL, Satoh-Nagasawa N, Goldshmidt A, Jackson D: Digital Gene Expression Signatures for Maize Development. Plant Physiology. 2010, 154: 1024-1039. 10.1104/pp.110.159673.
- Wu T, Qin Z, Zhou X, Feng Z, Du Y: Transcriptome profile analysis of floral sex determination in cucumber. Journal of Plant Physiology. 2010, 167 (11): 905-913. 10.1016/j.jplph.2010.02.004.
- 't Hoen PA, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RH, de Menezes RX, Boer JM, van Ommen GJ, den Dunnen JT: Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 2008, 36: e141. 10.1093/nar/gkn705.
- Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA: Next-generation tag sequencing for cancer gene expression profiling. Genome Res. 2009, 19: 1825-1835. 10.1101/gr.094482.109.
- Audic S, Claverie J: The significance of digital gene expression profiles. Genome Res. 1997, 7: 986-995.
- Benjamini Y, Yekutieli D: The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics. 2001, 29: 1165-1188. 10.1214/aos/1013699998.
- Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, Tokimatsu T: KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008, 36: D480-484.
- Miflin BJ, Lea PJ: Amino Acid Metabolism. Annual Review of Plant Physiology. 1977, 28: 299-329. 10.1146/annurev.pp.28.060177.001503.
- Harding HP, Zhang YH, Zeng HQ, Novoa I: An Integrated Stress Response Regulates Amino Acid Metabolism and Resistance to Oxidative Stress. Molecular cell. 2003, 11: 619-633. 10.1016/S1097-2765(03)00105-9.
- Mizock BA: Alterations in carbohydrate metabolism during stress: A review of the literature. The American journal of medicine. 1995, 98: 75-84. 10.1016/S0002-9343(99)80083-7.
- Stitt M, Müller C, Matt P, Gibon Y: Steps towards an integrated view of nitrogen metabolism. J Exp Bot. 2002, 53 (370): 959-970. 10.1093/jexbot/53.370.959.
- Sudha G, Ravishankar GA: Involvement and interaction of various signaling compounds on the plant metabolic events during defense response, resistance to stress factors, formation of secondary metabolites and their molecular aspects. Plant Cell, Tissue and Organ Culture. 2002, 71: 181-212. 10.1023/A:1020336626361.
- Masclaux-Daubresse C, Daniel-Vedele F, Dechorgnat J, Chardon F, Gaufichon L, Suzuki A: Nitrogen uptake, assimilation and remobilization in plants: challenges for sustainable and productive agriculture. Annals of Botany. 2010, 105: 1141-1158. 10.1093/aob/mcq028.
- Kaiser WM, Kandlbinder A, Stoimenova M, Glaab J: Discrepancy between nitrate reduction rates in intact leaves and nitrate reductase activity in leaf extracts: what limits nitrate reduction in situ?. Planta. 2000, 210: 801-807. 10.1007/s004250050682.
- Hanks SK, Quinn AM, Hunter T: The protein kinase family: conserved features and deduced phylogeny of the catalytic domains. Science. 1988, 241: 42-52. 10.1126/science.3291115.
- Hunter T, Karin M: The regulation of transcription by phosphorylation. Cell. 1992, 70: 375-387. 10.1016/0092-8674(92)90162-6.
- Wu X, Walker MG, Luo J, Wei L: GBA server: EST-based digital gene expression profiling. Nucleic Acids Res. 2005, 33: 673-676. 10.1093/nar/gki480.
- Bi YM, Wang RL, Zhu T, Rothstein SJ: Global transcription profiling reveals differential responses to chronic nitrogen stress and putative nitrogen regulatory components in Arabidopsis. BMC Genomics. 2007, 8: 1-17. 10.1186/1471-2164-8-1.
- Glass ADM: Nitrogen Use Efficiency of Crop Plants: Physiological Constraints upon Nitrogen Absorption. Critical Reviews in Plant Sciences. 2003, 22: 453-470.
- Lam HM, Coschigano KT, Oliveira IC: The molecular genetics of nitrogen assimilation into amino acids in higher plants. Ann Rev Plant Physiol Plant Mol Biol. 1996, 47: 569-593. 10.1146/annurev.arplant.47.1.569.
- Lam HM, Peng SY, Coruzzi GM: Metabolic regulation of the gene encoding glutamine-dependent asparagine synthetase in Arabidopsis thaliana. Plant Physiol. 1994, 106: 1347-1357. 10.1104/pp.106.4.1347.
- de la Torre F, Suárez MF, Santis L, Cánovas FM: The aspartate aminotransferase family in conifers: biochemical analysis of a prokaryotic-type enzyme from maritime pine. Tree Physiol. 2007, 27 (9): 1283-91.
- Chen W, Provart NJ, Glazebrook J, Zhu T: Expression profile matrix of Arabidopsis transcription factor genes suggests their putative functions in response to environmental stresses. Plant Cell. 2002, 14: 559-574. 10.1105/tpc.010410.
- Jaglo-Ottosen KR, Gilmour SJ, Zarka DG, Schabenberger O, Thomashow MF: Arabidopsis CBF1 overexpression induces COR genes and enhances freezing tolerance. Science. 1998, 280: 104-106. 10.1126/science.280.5360.104.
- Yanagisawa S, Akiyama A, Kisaka H, Uchimiya H, Miwa T: Metabolic engineering with Dof1 transcription factor in plants: improved nitrogen assimilation and growth under low-nitrogen conditions. Proceedings of the National Academy of Sciences. 2004, 101: 7833-7838. 10.1073/pnas.0402267101.
- Berger H, Pachlinger R, Morozov I, Goller S, Narendja F, Caddick M, Strauss J: The GATA factor AreA regulates localization and in vivo binding site occupancy of the nitrate activator NirA. Molecular Microbiology. 2006, 59 (2): 433-446. 10.1111/j.1365-2958.2005.04957.x.
- Lowry Jason, Atchley William: Molecular Evolution of the GATA Family of Transcription Factors: Conservation within the DNA-Binding Domain. J Mol Evol. 2000, 50: 103-115.
- Zhang CC: A Gene Encoding a Protein Related to Eukaryotic Protein Kinases from the Filamentous Heterocystous Cyanobacterium Anabaena PCC 7120. Proc Natl Acad Sci. 1993, 90: 11840-11844. 10.1073/pnas.90.24.11840.
- Zhang CC, Libs L: Cloning and Characterization of the pknD Gene Encoding an Eukaryotic-Type Protein Kinase in the Cyanobacterium Anabaena PCC 7120. Mol Gen Genet. 1998, 258: 26-33. 10.1007/s004380050703.
- Kanneganti V, Gupta AK: Wall associated kinases from plants - an overview. Physiol Mol Biol Plants. 2008, 14: 1-2. 10.1007/s12298-008-0001-7.
- Mizuno S, Osakabe Y, Maruyama K, Ito T, Osakabe K, Sato T, Shinozaki K, Yamaguchi-Shinozaki K: Receptor-like protein kinase 2 (RPK 2) is a novel factor controlling anther development in Arabidopsis thaliana. The Plant Journal. 2007, 50 (5): 751-766. 10.1111/j.1365-313X.2007.03083.x.
- Ding XP, Hou X, Xie KB, Xiong LZ: Genome-wide identification of BURP domain-containing genes in rice reveals a gene family with diverse structures and responses to abiotic stresses. Planta. 2009, 230: 149-163. 10.1007/s00425-009-0929-z.
- Kushwaha HR, Singh AK, Sopory SK, Singla-Pareek SL, Pareek A: Genome wide expression analysis of CBS domain containing proteins in Arabidopsis thaliana (L.) Heynh and Oryza sativa L. reveals their developmental and stress regulation. BMC Genomics. 2009, 10: 200. 10.1186/1471-2164-10-200.
- Williams ME, Torabinejad J, Cohick E, Parker K, Drake EJ, Thompson JE, Hortter M, DeWald DB: Mutations in the Arabidopsis Phosphoinositide Phosphatase Gene SAC9 Lead to Over accumulation of PtdIns(4,5)P2 and Constitutive Expression of the Stress-Response Pathway. Plant Physiology. 2005, 138: 686-700. 10.1104/pp.105.061317.
- Gao QS, Zhang D, Xu L, Xu CW: Systematic Identification of Rice ABC1 Gene Family and Its Response to Abiotic Stress. Rice Science. 2011, 18 (2):
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.