Gene expression profiling in peanut using high density oligonucleotide microarrays
© Payton et al. 2009
Received: 29 December 2008
Accepted: 12 June 2009
Published: 12 June 2009
Transcriptome expression analysis in peanut to date has been limited to a relatively small set of genes and only recently has a significant number of ESTs been released into the public domain. Utilization of these ESTs for oligonucleotide microarrays provides a means to investigate large-scale transcript responses to a variety of developmental and environmental signals, ultimately improving our understanding of plant biology.
We have developed a high-density oligonucleotide microarray for peanut using 49,205 publicly available ESTs. To identify putatively tissue-specific genes and demonstrate the utility of this array for expression profiling in a variety of peanut tissues, we compared transcript levels in pod, peg, leaf, stem, and root tissues. Results from this experiment showed 108 putatively pod-specific/abundant genes, as well as transcripts whose expression was low or undetected in pod compared to peg, leaf, stem, or root. The transcripts significantly over-represented in pod include genes responsible for seed storage proteins and desiccation (e.g., late-embryogenesis abundant proteins, aquaporins, legumin B), oil production, and cellular defense. Additionally, almost half of the pod-abundant genes represent unknown genes, allowing for the possibility of associating putative function to these previously uncharacterized genes.
The peanut oligonucleotide array represents the majority of publicly available peanut ESTs and can be used as a tool for expression profiling studies in diverse tissues.
Cultivated peanut (Arachis hypogaea L.) is the second-most important legume in the world, with a total global production of 48 million tons. Legumes are the second-most important food crop following grains, representing an important source of protein for humans and livestock in North and South America, Africa, and Asia. Additionally, when considering oil production for cooking and fuels, peanut represents one of the highest value-added crops, with an annual worth of $1 billion to farmers and $6 billion to the overall economy in the U.S. alone.
Recent progress in functional genomics has enabled the study of plant responses at whole-transcriptome levels, revealing the complex nature of multi-genic responses in plants [2–4]. While genes and proteins expressed differentially under a variety of environmental perturbations and developmental stages have been identified in model plant systems such as Arabidopsis [2,5], studies on stress-induced or developmentally regulated genes in crop plants have been limited but are beginning to emerge [6–9]. While positional cloning and candidate gene approaches have begun to identify a number of structural genes or transcription factors controlling the larger response to abiotic and biotic stimuli [10,11], this work has been limited in peanut due to a lack of genomic data. Identification of such genes will have a significant effect on varietal development by traditional breeding and genetic engineering.
Greater attention is needed for genomic development in the Leguminosae. Despite its importance as both a cash crop and a staple, little is known about the genetic mechanisms in peanut that control disease resistance or susceptibility, stress tolerance, or pod development. Although significant efforts have gone into legume genomics, there is a paucity of genomic data for peanut, bean, and chickpea compared to soybean, Medicago truncatula, and Lotus japonicus [4,8]. In peanut, marker technology is relatively young and only recently have genetic maps been published [13–15]. Although an initial cDNA microarray with 384 unigenes was published, there are no reports of high-density oligonucleotide microarray platforms in peanut. As part of our ongoing effort to identify the molecular mechanisms underlying peanut development and response to abiotic stress, we have designed a custom oligonucleotide microarray using all publicly available peanut ESTs. There are several advantages to the oligonucleotide microarray approach, including uniformity of hybridization, probe performance and specificity, and the flexibility of customization or probe addition as more sequences enter the public domain [17–20]. To test the utility of this array for expression studies in both vegetative and reproductive tissues and to identify putatively pod-specific genes, we compared transcript abundance in pod, leaf, stem, root, and peg tissues. We present here the utility of the first large-scale publicly available peanut microarray and establish the foundation for investigation of molecular responses on a transcriptome scale.
Source tissue and number of ESTs from each library used to design the AH006 peanut microarray (column: ESTs for array design; treatments include drought + Aspergillus).
Gene expression profiles of different tissues provide information about the biological function of the genes expressed in those tissues [24,26]. For the pod abundant pool, only 21 transcripts could be assigned a putative function based on BLAST analysis. All tissues showed similar GO BP enrichments associated with metabolic processes (I), cellular processes (J), and response to stimuli (K). While peanut pod undisputedly is the most important organ from an agronomic perspective and the genes specifically up-regulated in that tissue are of interest, other tissue-specific genes or expression patterns may reveal significant information related to productivity, disease resistance, development, and physiological response. Figure 4 shows that the functional roles of putative tissue-specific genes are similar for leaf, stem, and peg compared to root. While this is not surprising given the similarities of genes highly expressed in green leaves or stems, it should be noted that the majority of peanut EST sequences in the public domain are from leaf and pod. However, despite the absence of a large number of ESTs from root libraries, there are genes whose expression appears to be root specific.
Due to the paucity of information on peanut in global repositories like NCBI, only half of the pod-abundant transcripts could be meaningfully annotated (Additional file 2). Two major categories of transcripts, namely storage proteins and desiccation-related proteins, were identified in pods. Five transcripts related to seed storage proteins such as globulin, conglutin and glycinin were abundant in pod tissues. The desiccation-related transcripts over-represented included seed maturation protein, LEA, early methionine labeled (EM), legumin, plasma membrane intrinsic proteins (aquaporins) and desiccation-related PCC13-62 proteins. In most higher plants the later seed maturation phase is characterized by a desiccation phase during which a number of proteins distinct from the storage proteins accumulate in embryos. Based on their accumulation pattern, it has been suggested that these particular proteins, called Late Embryogenesis Abundant (LEA) proteins, could be involved in seed desiccation tolerance [27,28]. In addition to their expression during seed desiccation, many of the genes coding for LEAs can be highly induced in immature seeds or activated in vegetative tissues upon osmotic stress, indicating that they are, in part, regulated at the transcriptional level. On the other hand, EM proteins could be responsible for the maintenance of a minimal water content allowing preservation of cell content in dried seeds [30,31].
Utilizing the Blast2GO tool, twelve transcripts with an Enzyme Commission (EC) number were mapped to twenty-five different Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Of these, 18 pathways relevant to pods are presented in Additional file 3. As expected, five major pathways leading to the production of sugars and starch involving the enzymes UDP-glucose pyrophosphorylase (EC:2.7.7.9) and dTDP-glucose 4,6-dehydratase (EC:4.2.1.46) were identified. Peanut being an oilseed crop, the pathways leading to lipid metabolism (2 pathways) and sulfur-containing amino acid metabolism (5 pathways) were predominant in pods. The peroxidase enzyme (EC:1.11.1.7), found to be abundant in pods, has multiple roles in plants. Apart from its reactive oxygen scavenging and water-stress signaling activity, the peroxidase enzyme also catalyses phenylpropanoid biosynthesis and phenylalanine metabolism, resulting in defense compounds which may protect the developing peanut pods in the soil. Two enzymes involved in pyruvate metabolism, phosphoenolpyruvate carboxylase (EC:4.1.1.31) and hydroxyacylglutathione hydrolase (EC:3.1.2.6), were also found to be more abundant in pods. Pyruvate thus generated may be involved in biosynthesis of secondary metabolites like terpenoids by the action of 1-deoxy-D-xylulose 5-phosphate synthase (EC:2.2.1.7). Together, the pathway analyses suggest that in pod tissues, apart from basic starch and lipid metabolism, secondary metabolites such as phenylpropanoids and terpenoids are also synthesized and may impart defense for developing pod tissues in soil.
List of primers for qRT-PCR analysis of tissue-abundant genes. Columns: gene; primer sequence (5'-3'); amplicon size (bp). Genes listed: late embryogenesis abundant protein 2; protein disulfide-isomerase precursor; putative GPI anchored protein; plasma membrane intrinsic protein 2; transmembrane emp24 domain-containing protein 2 precursor; desiccation-related protein PCC13-62.
Expression pattern of peanut pod abundant transcripts: microarray fold change and quantitative real-time PCR fold change for transmembrane emp24 domain-containing protein (emp24), protein disulfide-isomerase precursor (PDI), late embryogenesis abundant protein 2 (LEA2), and putative GPI anchored protein (GPI-AP).
Peanut, being an underrepresented crop in terms of genome sequencing and physical mapping, needs a comprehensive tool for dissecting complex mechanisms of development and tolerance to biotic and abiotic stresses. To attain this broad objective, we have designed and characterized a high-density oligonucleotide microarray suitable for transcript profiling of various peanut tissues. Analysis of pod-abundant transcripts suggested the presence of distinct pathways involved in generation of secondary metabolites, apart from the accumulation of transcripts for storage and desiccation-related proteins. These peanut microarrays are publicly available and can be upgraded with additional oligonucleotides designed from subsequent sequencing efforts from the peanut research community. The expression profiles generated by these peanut microarrays will provide starting points for in-depth studies on candidate genes that can be utilized in reverse genetics to assign gene functions.
Field-grown plants of peanut cultivar FlavRunner 458 were used for tissue collection. Harvested leaf, peg, stem, root, and pod tissues were immediately frozen in liquid nitrogen and stored at -80°C until further analysis.
Total RNA from the different tissues was isolated using the RNeasy Plant Mini Kit (Qiagen, Valencia, CA). Pooled frozen tissue from five plants was ground to a fine powder in liquid nitrogen, and approximately 100 mg of homogenized tissue was used for total RNA isolation according to the manufacturer's protocol, with two exceptions: the homogenized seed tissue was initially extracted in 600 μl of RLT buffer, and during purification samples were incubated in buffer RW1 for 5 min during the column washing step. RNA samples were treated with Turbo DNAfree (Ambion, Inc., Austin, TX) prior to cDNA synthesis.
An aliquot of 450 ng of total RNA was used for cDNA synthesis utilizing the Low RNA Input Fluorescence Linear Amplification Kit (Agilent Technologies). Resulting cDNA was transcribed into cRNA and labeled with either cyanine 3 or cyanine 5-labeled nucleotides (Perkin Elmer, Wellesley, MA) using T7 RNA polymerase (Agilent Technologies). Labeled cRNA was purified with RNeasy Mini columns (Qiagen, Valencia, CA). The cRNA quality and quantity were determined spectrophotometrically using a NanoDrop ND-1000 spectrophotometer.
Arrays were scanned using a GenePix® 4000B microarray scanner at 5-μm resolution and images were saved as uncompressed tagged image files. For detection of significantly differentially expressed genes, each slide image was processed by Agilent Feature Extraction software (version 9.1). This software measured Cy3 and Cy5 signal intensities of whole probes. Since dye bias tends to be signal intensity-dependent, probe sets for dye normalization were selected by rank consistency. Normalization was done by locally weighted linear regression (LOWESS). Ratios were log-transformed and significance values (p-values) were calculated based on a propagated error model and a universal error model. In this analysis, the threshold for significantly differentially expressed genes was a p-value ≤ 0.05 (the p-value is a measure of the confidence that the feature is not differentially expressed). Low-quality spot data generated due to artifacts were eliminated prior to data analysis. Processed intensities from the feature extraction analysis were imported into the TIGR Multiexperiment Viewer software (MEV 4.1), and genes with a p-value ≤ 0.05 and a more than two-fold difference in expression were defined as differentially expressed.
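The final selection rule described above (p-value ≤ 0.05 and a more than two-fold difference) can be illustrated with a minimal sketch. It assumes already normalized Cy5 and Cy3 intensities and precomputed p-values; the probe identifiers, intensities, and the function name `is_differentially_expressed` are hypothetical and are not part of Feature Extraction or MEV.

```python
import math

def is_differentially_expressed(cy5, cy3, p_value, p_cutoff=0.05, min_fold=2.0):
    """Return True when a feature passes both the p-value and fold-change cutoffs.

    cy5 and cy3 are assumed to be normalized, background-corrected intensities.
    """
    log2_ratio = math.log2(cy5 / cy3)
    # "More than two-fold" in either direction means |log2 ratio| > 1.
    return p_value <= p_cutoff and abs(log2_ratio) > math.log2(min_fold)

# Hypothetical feature records: (probe id, Cy5, Cy3, p-value)
features = [
    ("probe_A", 1200.0, 250.0, 0.001),  # up in the Cy5 channel, significant
    ("probe_B", 400.0, 390.0, 0.70),    # essentially unchanged
    ("probe_C", 100.0, 450.0, 0.03),    # down in the Cy5 channel, significant
    ("probe_D", 900.0, 300.0, 0.20),    # three-fold up, but fails the p-value cutoff
]

selected = [pid for pid, cy5, cy3, p in features
            if is_differentially_expressed(cy5, cy3, p)]
print(selected)
```

Working on the log2 scale treats up- and down-regulation symmetrically, which is why the ratios are log-transformed before thresholding.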
The Gene Ontology functional annotation tool Blast2GO was utilized to assign GO IDs and Enzyme Commission numbers, and to map transcripts to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The Blast2GO tool also enabled statistical analysis of over-representation of functional categories based on Fisher's exact test.
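As an illustration of how an over-representation test of this kind works, the following is a minimal sketch of a one-sided Fisher's exact test computed from the hypergeometric distribution. It is not Blast2GO's own code, and the gene counts in the example are hypothetical.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k), for over-representation.

    k: genes in the test set annotated with the GO term
    n: size of the test set (e.g., pod-abundant genes)
    K: genes in the reference set annotated with the GO term
    N: size of the reference set (assumed here to include the test set)
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum hypergeometric probabilities for k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 5 of 20 test genes carry a term found in 50 of 1000 reference genes.
p = fisher_enrichment_p(k=5, n=20, K=50, N=1000)
print(p)
```

With these counts only about one annotated gene would be expected in the test set by chance, so observing five gives a small p-value. In practice a multiple-testing correction would be applied across all GO terms tested.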
Total RNA samples were treated with Turbo DNAfree (Ambion, Inc., Austin, TX) prior to cDNA synthesis. One microgram of total RNA was used to synthesize first-strand cDNA using the SuperScript First-Strand Synthesis System for RT-PCR (Invitrogen, CA). The primers for pod-abundant genes and the actin standard were designed using Integrated DNA Technologies primer design tools. The efficiency of the primer pairs was determined on cDNA derived from the pod of the FlavRunner 458 cultivar using a 1:2 serial dilution series. Primer efficiency reactions were performed in triplicate in volumes of 25 μL using SuperArray SYBRGreen reaction mix (SuperArray Bioscience Corp., MD). Reactions were subjected to real-time qRT-PCR using the Roche LightCycler 480 Real-Time PCR System and data were analyzed using the LightCycler 480 quantification software (Roche Biochemicals, Indianapolis, IN).
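Primer efficiency from a serial dilution is conventionally estimated from the slope of crossing point versus log10 of relative template concentration, with efficiency E = 10^(-1/slope). The sketch below assumes that convention and uses hypothetical crossing points; the LightCycler 480 software performs its own fit, which is not reproduced here.

```python
import math

def amplification_efficiency(relative_conc, cp_values):
    """Estimate PCR efficiency from a dilution series by least-squares regression.

    Returns (efficiency, slope), where slope is Cp per log10 unit of template
    and efficiency = 10 ** (-1 / slope); a value of 2.0 means perfect doubling.
    """
    xs = [math.log10(c) for c in relative_conc]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cp_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cp_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return 10 ** (-1.0 / slope), slope

# Hypothetical 1:2 dilution series in which Cp rises by exactly one cycle per dilution.
dilutions = [1.0, 0.5, 0.25, 0.125, 0.0625]
cps = [20.0, 21.0, 22.0, 23.0, 24.0]
efficiency, slope = amplification_efficiency(dilutions, cps)
print(round(efficiency, 3), round(slope, 3))
```

For real triplicate data, the mean crossing point at each dilution (or all replicates individually) would be passed in.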
Samples were analyzed in a 25 μL volume using the Roche LightCycler 480 (Roche Biochemicals, Indianapolis, IN). Reactions were performed in triplicate using cDNA templates from five tissue samples for each gene. A master mix of SYBRGreen and primers was prepared for each primer pair. RT-PCR reactions were performed on 40 ng total RNA with 400 nM specific primers under the following conditions: one cycle of denaturation at 95°C for 10 min followed by 40 cycles of 95°C for 15 sec (denaturation) and 60°C for 15 sec (annealing and elongation). The PCR reaction was followed by a melting curve program (60-95°C with a heating rate of 0.1°C per second and continuous fluorescence measurement) and then a cooling program at 40°C. Negative controls lacking reverse transcriptase were run with all reactions. PCR products were also run on agarose gels to confirm the formation of a single product of the desired size. Crossing points for each transcript were determined using the second derivative maximum analysis with the arithmetic baseline adjustment. Crossing point values for each gene were normalized to the respective crossing point values for the reference gene actin. Data are presented as normalized ratios of genes along with standard deviations estimated using the Roche Applied Science E-method.
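The normalization of target crossing points to actin can be sketched with a simplified efficiency-corrected ratio. This is an assumption-laden illustration, not the Roche E-method itself (which relies on standard curves); the function name, efficiencies, and crossing points below are hypothetical.

```python
def normalized_ratio(cp_target, cp_reference, e_target=2.0, e_reference=2.0):
    """Efficiency-corrected target/reference ratio for one sample.

    Template amount is proportional to E ** (-Cp), so the ratio of target to
    reference is E_ref ** Cp_ref / E_target ** Cp_target.
    """
    return (e_reference ** cp_reference) / (e_target ** cp_target)

# Hypothetical crossing points for one pod-abundant gene and actin in two tissues.
pod_ratio = normalized_ratio(cp_target=22.0, cp_reference=20.0)
leaf_ratio = normalized_ratio(cp_target=25.0, cp_reference=20.0)

# Fold change of the gene in pod relative to leaf after actin normalization.
fold_change = pod_ratio / leaf_ratio
print(pod_ratio, leaf_ratio, fold_change)
```

Using measured primer efficiencies instead of the default of 2.0 avoids over- or under-estimating fold changes when amplification is not perfectly exponential.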
We thank Joseph Quilantan and Meenakshi Mittal for technical help. This research was supported by grants from the Ogallala Aquifer Program, a consortium between USDA-Agricultural Research Service, Kansas State University, Texas Agrilife Research, Texas Agrilife Extension Service, Texas Tech University, and West Texas A&M University, USDA-ARS CRIS 6208-21000-012-00D, and New Mexico State University Agricultural Science Center, Clovis.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.