- Research article
- Open Access
Gene coexpression clusters and putative regulatory elements underlying seed storage reserve accumulation in Arabidopsis
© Peng and Weselake; licensee BioMed Central Ltd. 2011
- Received: 24 August 2010
- Accepted: 2 June 2011
- Published: 2 June 2011
In Arabidopsis, a large number of genes involved in the accumulation of seed storage reserves during seed development have been characterized, but the relationship of gene expression and regulation underlying this physiological process remains poorly understood. A more holistic view of this molecular interplay will help in the further study of the regulatory mechanisms controlling seed storage compound accumulation.
We identified gene coexpression networks in the transcriptome of developing Arabidopsis (Arabidopsis thaliana) seeds from the globular to mature embryo stages by analyzing publicly accessible microarray datasets. Genes encoding the known enzymes in the fatty acid biosynthesis pathway were found in one coexpression subnetwork (or cluster), while genes encoding oleosins and seed storage proteins were identified in another subnetwork with a distinct expression profile. In the triacylglycerol assembly pathway, only the genes encoding diacylglycerol acyltransferase 1 (DGAT1) and a putative cytosolic "type 3" DGAT exhibited an expression pattern similar to that of genes encoding oleosins. We also detected a large number of putative cis-acting regulatory elements in the promoter regions of these genes. Promoter motifs for LEC1 (LEAFY COTYLEDON 1), DOF (DNA-binding-with-One-Finger), GATA, and MYB transcription factors (TFs), as well as SORLIP5 (Sequences Over-Represented in Light-Induced Promoters 5), are overrepresented in the promoter regions of fatty acid biosynthetic genes. The conserved CCAAT motifs for B3-domain TFs and binding sites for bZIP (basic-leucine zipper) TFs are enriched in the promoters of genes encoding oleosins and seed storage proteins.
Genes involved in the accumulation of seed storage reserves are expressed in distinct patterns and regulated by different TFs. The gene coexpression clusters and putative regulatory elements presented here provide a useful resource for further experimental characterization of protein interactions and regulatory networks in this process.
- Seed Development
- Coexpression Network
- Fatty Acid Biosynthesis
- Seed Storage Protein
- Gene Coexpression Network
Seed storage reserves accumulated during embryogenesis in higher plants are crucial for plant propagation, providing carbon and energy during germination prior to seedling establishment. In mature Arabidopsis seeds, storage lipids and proteins are the major storage compounds, each accounting for 30-45% of the seed dry weight. The past decade has witnessed substantial progress in the identification and characterization of genes involved in the de novo fatty acid (FA) biosynthesis and triacylglycerol (TAG) assembly pathways [[1, 4] and references therein]. This advancement is particularly evident in the model plant Arabidopsis, largely owing to the sequencing and release of its relatively compact genome in 2000. Moreover, characterization of transcription factors (TFs) has led to the identification of several master regulator genes that play critical regulatory roles in this biological process, including ABI3 (ABSCISIC ACID INSENSITIVE 3), LEC1 (LEAFY COTYLEDON 1), LEC2 and FUS3 (FUSCA 3) [6–17]. These TFs interact with each other and form complex regulatory networks [18–23], regulating multiple aspects of seed development, including storage reserve accumulation, through interaction with cognate cis-acting DNA elements in the promoter regions of target genes. ABI3, FUS3 and LEC2 contain a plant-specific 'B3' DNA-binding domain that targets RY-repeat regulatory elements, whereas LEC1 and L1L (LEC1-LIKE) contain an NF-YB domain that binds to the CCAAT boxes in the promoter region [24, 25]. An additional TF, WRINKLED 1 (WRI1), a member of the plant-specific APETALA 2 (AP2)/ethylene response element binding factor (EREB) family, is also known to control the transcription of many FA biosynthetic genes, and recent studies show that it acts by binding to the AW-box motif present in the promoter regions of 19 FA biosynthetic genes.
Moreover, ABI4 (an AP2 family protein) and various basic-leucine zipper (bZIP) TFs, including ABI5 and EEL (ENHANCED EM [EMBRYO MORPHOGENESIS] LEVEL), are known regulators of the expression of SEED STORAGE PROTEIN (SSP) genes, and act in the same signalling network but downstream of ABI3 [28, 29]. Distinct regulatory mechanisms control the accumulation of oils and proteins, perhaps with cross-talk to coordinate the synthesis of seed storage compounds. This coordination could help to explain the well-documented negative correlation (correlation coefficient ranging from -0.60 to -0.90) between oil and protein content in seeds of many oleaginous species [ and references therein]. Moreover, several TFs, such as LEC1, ABI3 and FUS3, have been demonstrated to regulate many genes in the synthesis of both oils and storage proteins in developing seeds [30–32].
In contrast to the great advancement in characterizing individual genes involved in the accumulation of seed storage reserves, the relationship between their expression and regulation is not well understood. A more holistic view of this biological process at the systems level would prove beneficial in developing strategies to further enhance seed yield and oil content, as well as in the modification of oil composition. To gain insights into global transcriptional dynamics in key cellular processes, microarray analysis is an effective method for measuring the transcript abundance of a large number of genes simultaneously. Datasets obtained from profiling experiments can be further used to infer gene regulatory networks. In Arabidopsis, two cDNA microarrays were designed several years ago based on the expressed sequence tag (EST) sequences available at the time. One array was used for tissue-specific expression profiling to identify genes that are preferentially expressed in developing seeds compared with vegetative leaves and roots, and the other was used to study the temporal pattern of gene expression during the critical period of seed filling. These transcriptional profiling studies in Arabidopsis seeds have greatly increased our understanding of overall alterations of gene expression during seed development and storage reserve accumulation. These two early cDNA-based microarrays, however, surveyed <3500 unique Arabidopsis genes.
More recently, Schmid et al. created a global gene expression atlas, AtGenExpress (Expression Atlas of Arabidopsis development), representing the Arabidopsis life cycle using the Arabidopsis ATH1 genome array (Affymetrix, Santa Clara, CA), which can measure nearly 24,000 genes in a single assay. In AtGenExpress, 237 chips were hybridized for 79 different samples collected from various organs, growth stages and environmental conditions, including 24 arrays for eight stages of maturing seeds. Since its release, this exceptionally large transcriptome dataset has been a goldmine for plant biologists seeking candidate genes for molecular characterization. A number of studies have further "mined" this dataset within different contexts of plant biology. Wang et al. extracted the expression data for several TFs experimentally determined to regulate seed development and for genes that code for enzymes in the FA biosynthesis pathway. Volodarsky et al. utilized the dataset to analyze hormone-related transcriptional activities in Arabidopsis. Vandepoele et al. constructed coexpression networks and predicted cis-regulatory elements for the cell cycle-related TF OBP1. Recently, the identification of gene coexpression networks has emerged as a popular method for predicting gene functions and interactions [38–41], and web-based tools such as Genevestigator and CressExpress have been developed to facilitate such analyses at a small scale for plant biologists. Transcriptional coordination, or coexpression, of genes may be an indication of functional relatedness, based on the "guilt-by-association" principle. In a coexpression network, a vertex or node represents a gene, whereas an edge is a connection inferred from the correlation coefficient calculated from the gene expression data.
Although the relationship between coexpression networks and true biological networks is often not clear, it has been shown that gene groups identified from modular (cluster) analysis in coexpression networks often exhibit an enrichment of certain Gene Ontology (GO) categories, suggesting the functional association of genes connected in a coexpression network. Hence, a coexpression edge can be considered a putative interaction between two genes. Genes in a coexpression network, particularly those expressed in a specific tissue or sharing a semantic similarity in the GO 'Biological Process' aspect, might be co-regulated through common TF binding sites in their upstream regions. This possibility has led to many attempts to identify overrepresented cis-motifs in coexpressed genes [46–50].
In the current study, we took advantage of this public transcriptome dataset in Arabidopsis, analyzed the raw data thoroughly in the context of seed storage reserve accumulation during seed development, and constructed coexpression networks for seed-expressed genes. We focused on genes involved in FA biosynthesis and the accumulation of storage lipids and proteins in developing seeds. This comprehensive analysis has resulted in the identification of a large number of genes that are possibly coexpressed and function cooperatively during seed maturation. Furthermore, we predicted a large number of cis-regulatory elements for key seed-expressed genes. This information could be useful in designing experiments to probe regulatory mechanisms underlying seed storage reserve accumulation.
Association of seed transcriptome with embryo morphology in developing Arabidopsis seeds
Arabidopsis developing seed samples used for AtGenExpress microarray experiments.
| Sample | Developmental stage | Embryo morphology |
| --- | --- | --- |
| Seeds stage 3 with siliques | C globular stage | Mid globular to early heart |
| Seeds stage 4 with siliques | D bilateral stage | Early heart to late heart |
| Seeds stage 5 with siliques | D bilateral stage | Late heart to mid torpedo |
| Seeds stage 6 without siliques | E expanded cotyledon stage | Mid torpedo to late torpedo |
| Seeds stage 7 without siliques | E expanded cotyledon stage | Late torpedo to early walking-stick |
| Seeds stage 8 without siliques | E expanded cotyledon stage | Walking-stick to early curled cotyledons |
| Seeds stage 9 without siliques | F mature embryo stage | Curled cotyledons to early green cotyledons |
| Seeds stage 10 without siliques | F mature embryo stage | |
Construction of gene coexpression networks in the Arabidopsis seed transcriptome
Network characteristics in the Arabidopsis seed coexpression network.
Total number of genes in the network
Mean number of connections per gene
Median number of connections per gene
Clustering coefficient a
Scale-free topology criterion b
Genes encoding fatty acid biosynthetic enzymes and seed storage reserve associated proteins are located in different subnetworks
In summary, our results suggest that genes acting in a biological process (FA biosynthesis) can be indicated by their presence in the same coexpression network cluster, but genes involved in the same pathway (TAG assembly) may not necessarily exhibit expression coherence. As a result, computational approaches that use coexpression networks to predict gene function will have limitations.
Putative regulatory elements underlying seed storage reserve accumulation
To computationally identify cis-acting regulatory elements, the upstream promoter sequences for the genes involved in storage reserve biosynthesis were extracted from the RSAT server. We included some 5'-UTR sequences because certain TF binding sites can be located within this region of a gene [27, 74]. On average, the G-C content in the promoter sequences of the gene set was found to be <35%, which is consistent with the compositional bias of nucleotides towards A-T enrichment observed in plant promoter regions [74, 75]. Two software tools, TFBS and fdrMotif, were used to search for putative TF-binding sites on both strands. Both tools depend on TF-binding profiles (position weight matrices, or PWMs) derived from experimentally determined binding sites for the prediction; we therefore compiled 118 PWMs from the literature [27, 74] and the JASPAR database (Additional File 4). In the JASPAR database, we only considered the binding profiles for plant-specific TFs because of their potential critical roles in regulating the accumulation of storage reserves during seed development, a unique physiological process in higher plants.
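The G-C content figure quoted above is a simple proportion, and a minimal Python sketch of how it could be computed is given here for illustration. The function names and the two toy sequences are hypothetical and are not taken from the study; ambiguous bases such as N are ignored, which is an assumption of this sketch.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (counts["G"] + counts["C"]) / total


def mean_gc_content(promoters: dict) -> float:
    """Average the per-sequence G-C fraction over a set of promoter sequences."""
    values = [gc_content(seq) for seq in promoters.values()]
    return sum(values) / len(values)


# Toy sequences for illustration only (not real Arabidopsis promoters).
promoters = {"geneA": "ATATTAGCAT", "geneB": "TTTTAAAACG"}
print(f"mean G-C content: {mean_gc_content(promoters):.2f}")
```

Averaging per-sequence fractions weights every promoter equally regardless of its length; pooling all bases before dividing would be an equally reasonable choice when sequence lengths differ.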
Overrepresented motifs identified in promoters of genes involved in fatty acid synthesis, and oleosin and seed storage protein accumulation.
CBF (LEC1 L1L)
For the genes and isoforms in the TAG assembly pathway, no overrepresented motifs were found. Our goal was to identify putative promoter elements that can be used for experimental studies (Additional File 5). Interestingly, promoter motifs for B3 domain TFs, such as ABI3, FUS3 and LEC2, were found to be overrepresented in promoters of genes encoding oleosins and SSPs. Motifs for bZIP factors (e.g., bZIP67) also appeared to be overrepresented in the promoter regions of these genes, although no binding matrices were available for the bZIP factors ABI5 or EEL.
Our computational promoter analysis was limited by the availability of experimentally determined TF-binding sites from which binding profiles of additional TFs can be derived. We compiled a list of 118 binding matrices for this analysis, but if binding profiles for other TFs can be generated from a reasonable number of known binding sites, we could identify more TFs that possibly regulate the accumulation of seed storage reserves. In addition, we only considered upstream sequences of 1000 bp plus 200 bp of 5'-UTR for each gene, because the majority of cis-acting regulatory elements are located in this region. Other genomic regions, including the 3'-UTR or even introns, can however also harbour TF binding sites.
Our analyses indicate that genes involved in the accumulation of seed storage reserves, along with known TF genes, are expressed in distinct patterns during seed maturation. Promoter motifs for CCAAT binding factors LEC1 and L1L, DOF and GATA factors, AP2 WRI1 as well as MYB factors are enriched in the promoter regions of genes involved in FA biosynthesis. Binding sites for B3-domain factors (ABI3/VP1 TF family) and bZIP factors are overrepresented in the promoter regions of genes encoding oleosins and seed storage proteins. When binding profiles for additional TFs become available, more putative regulatory elements will be detected, which in turn can be validated for functionality.
Retrieval and processing of raw hybridization data
The 24 raw hybridization intensity data files (.CEL files) for Arabidopsis seed development were retrieved from The Arabidopsis Information Resource (TAIR) gene expression data repository (http://www.arabidopsis.org/servlets/TairObject?type=hyb_descr_collection&id=1006710873). Microarray gene expression data analyses were performed using Bioconductor packages in the open-source R statistical environment. The raw data files were imported into Bioconductor using the Simpleaffy package. The hybridization and RNA sample qualities were assessed using a number of quality control metrics (data not shown), and the raw data were background corrected, normalized and transformed to log2 values using the GCRMA package. This normalization method builds on the robust multi-array average (RMA) approach and uses probe sequence information (G-C content) to estimate hybridization affinity. Genes expressed in seeds were identified using a log2 value of 6.0 as the cutoff for the binary 'present' or 'absent' calls, and any gene with 'present' calls in fewer than three samples (corresponding to one seed development stage) was considered "unexpressed" in these seed samples. After filtering, 12,353 genes expressed in at least one of the eight development stages in developing Arabidopsis seeds were used for subsequent high-level analyses. Custom Perl scripts were written to find the annotation of each gene in the latest CSV file, ATH1-121501.na30.annot.csv (November 15, 2009), released by Affymetrix for the ATH1 Genome Array; annotations were revised in some cases through sequence analysis using BLAST. For example, the TF gene WRINKLED1 (AT3G54320) was incorrectly annotated in the Affymetrix file as an aintegumenta-like protein or ovule development protein aintegumenta (Additional File 1).
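The present/absent filter described above can be illustrated with a short Python sketch using NumPy. This is not the authors' script (their analysis was done in R and Perl); the function name and the toy matrix are hypothetical, and treating a value exactly equal to the cutoff as 'present' is an assumption.

```python
import numpy as np


def filter_expressed(log2_expr, cutoff=6.0, min_present=3):
    """Return a boolean mask of genes called 'present' in at least `min_present` samples.

    log2_expr: array of shape (genes, samples) holding normalized log2 intensities.
    A sample counts as 'present' when its value is >= cutoff (inclusive by assumption).
    """
    present = np.asarray(log2_expr, dtype=float) >= cutoff
    return present.sum(axis=1) >= min_present


# Toy matrix: 3 genes x 6 samples (two stages with three replicates each).
expr = np.array([
    [7.1, 6.5, 6.0, 2.0, 2.1, 1.9],   # present in 3 samples: kept
    [6.2, 6.4, 3.0, 2.0, 2.1, 1.9],   # present in 2 samples: dropped
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],   # never present: dropped
])
mask = filter_expressed(expr)
filtered = expr[mask]
print(mask, filtered.shape)
```

Note that this sketch counts 'present' calls across all samples, as the text states; it does not require the three calls to come from the same development stage.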
Principal component analysis and association test of global gene expression with seed development
The normalized, log2-transformed gene expression data were used for principal component analysis (PCA) using the R prcomp function. For this analysis, expression values of the three replicates for each seed development stage were not combined, so that the reproducibility of biological replication could be assessed. Global testing of the association of the transcriptome with a particular variable (e.g., seed development stage) was carried out using the Globaltest package. This package tests the overall gene expression in group(s) of genes for significant association with a given variable. The test gives one P-value for the whole group instead of one P-value for each gene, which avoids the need for multiple-testing correction.
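As an illustration of the PCA step, the following Python sketch performs a centred, unscaled PCA through a singular value decomposition, which mirrors the default behaviour of R's prcomp (center = TRUE, scale. = FALSE). It is a stand-in written for this explanation rather than the code used in the study, and the simulated two-stage dataset is purely hypothetical.

```python
import numpy as np


def pca(samples_by_genes, n_components=2):
    """Centred, unscaled PCA via SVD; rows are samples, columns are genes."""
    X = np.asarray(samples_by_genes, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each gene across samples
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates on the PCs
    explained = s**2 / np.sum(s**2)              # fraction of variance per component
    return scores, explained[:n_components]


# Simulated data: two stages with three replicates each, four genes (toy values).
rng = np.random.default_rng(0)
stage_a = np.array([5.0, 5.0, 8.0, 8.0])
stage_b = np.array([8.0, 8.0, 5.0, 5.0])
X = np.vstack([stage_a + rng.normal(0, 0.1, 4) for _ in range(3)] +
              [stage_b + rng.normal(0, 0.1, 4) for _ in range(3)])
scores, explained = pca(X)
print(np.round(scores[:, 0], 2), np.round(explained, 3))
```

Keeping replicates as separate rows, as in the analysis described above, lets one check that replicates of the same stage fall close together on the leading components.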
Gene expression correlation analysis and construction of coexpression networks
For the inference of gene coexpression networks in the transcriptome of developing Arabidopsis seeds, we used the 12,353 genes expressed at moderate or high levels and measured their expression coherence with the Pearson correlation coefficient. We first used the median expression data of the genes in the eight samples to compute pairwise correlation coefficients in the R statistical environment, resulting in a 12,353 × 12,353 correlation matrix. We then removed self-pairing and duplication, and applied a correlation cutoff of 0.90, which retained over 1.7 million gene pairs representing 11,698 distinct genes for construction of the coexpression network for the Arabidopsis seed genes. This stringent correlation threshold was chosen to eliminate potential spurious correlations in the coexpression network. Network properties were determined using custom scripts. Coexpression networks were visualized using Cytoscape. For time-course clustering analysis, the gene expression values were standardized to have a mean value of zero and a standard deviation of one for each gene profile. This standardization ensures that genes with similar temporal profiles are close in Euclidean space during clustering, regardless of their absolute expression levels. The transformed expression values were then clustered using the fuzzy c-means (FCM) clustering algorithm in the Bioconductor Mfuzz package. We determined that six clusters separate the expression patterns inherent in the dataset well, and set the FCM fuzzification parameter m to 1.75, which allows for investigation of the clustering robustness. FCM assigns a membership value in the range of 0-1 to each gene as an indicator of how representative a gene profile is for a specific cluster, and profiles with different membership values were coloured differently.
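The correlation thresholding and per-gene standardization described above can be sketched in Python as follows. This is an illustrative reimplementation, not the authors' R code: the function names and toy profiles are hypothetical, the cutoff is applied to positive correlations only (r >= 0.90), and the sample standard deviation (ddof = 1) is used for standardization; both choices are assumptions, since the text does not specify them.

```python
from collections import Counter

import numpy as np


def coexpression_edges(expr, gene_ids, cutoff=0.90):
    """List gene pairs whose Pearson correlation across stages is >= cutoff."""
    r = np.corrcoef(np.asarray(expr, dtype=float))   # genes x genes matrix
    i, j = np.triu_indices_from(r, k=1)              # upper triangle: no self-pairs, no duplicates
    keep = r[i, j] >= cutoff
    return [(gene_ids[a], gene_ids[b], float(r[a, b])) for a, b in zip(i[keep], j[keep])]


def standardize_profiles(expr):
    """Scale each gene profile to mean 0 and standard deviation 1."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    return (expr - mean) / sd


# Toy median profiles over eight stages (values are illustrative only).
gene_ids = ["g1", "g2", "g3"]
expr = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8],             # rising
    [2, 4, 6, 8, 10, 12, 14, 16.5],       # rising, strongly correlated with g1
    [8, 7, 6, 5, 4, 3, 2, 1],             # falling, anti-correlated with g1
])
edges = coexpression_edges(expr, gene_ids)
degree = Counter(g for a, b, _ in edges for g in (a, b))   # connections per gene
z = standardize_profiles(expr)
print(edges, dict(degree))
```

For a matrix of the size reported here (12,353 genes), the full correlation matrix occupies roughly 1.2 GB as 64-bit floats, so a block-wise computation may be preferable on machines with limited memory.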
Computational analyses of transcription factor binding sites
The genomic sequences comprising 1000 bp upstream plus 200 bp of the 5' untranslated region (UTR) for the genes involved in storage reserve biosynthesis were retrieved from the RSAT server. If the intergenic region shared with the upstream neighbouring gene was <1000 bp long, we retrieved only the available upstream sequence, to avoid including the 3'-end sequence of the adjacent upstream gene. Putative TF binding sites on both strands were identified with two software tools, TFBS and fdrMotif. Briefly, the 118 TF binding profiles (position-specific weight matrices, or PWMs) were compiled from the literature [27, 74] and the JASPAR database, and converted into a format suitable for each software tool (Additional File 4). In the TFBS search, an 80% similarity cutoff was adopted. In the fdrMotif search, 10 background sequences were generated from a 4th-order Markov model for each input sequence, and an upper boundary of 0.15 for the false discovery rate (FDR), as suggested by fdrMotif, was adopted to control the FDR. Only putative binding sites predicted by both tools were retained for subsequent analysis. To ascertain the predictive performance, detected motifs were compared with curated motifs in the AtcisDB and AGRIS databases [90, 91]. Sequence logos for the predicted motifs for a TF binding profile were created with WebLogo.
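To make the PWM search concrete, the Python sketch below scans both strands of a sequence with a log-odds matrix and keeps windows whose relative score reaches 80%. It does not call TFBS or fdrMotif. The relative score, (score - min) / (max - min), is one common way to express a percentage similarity cutoff and is assumed here; the pseudocount, the uniform background, the function names, and the toy CCAAT count matrix are likewise hypothetical.

```python
import numpy as np

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert a 4 x width count matrix (rows A, C, G, T) into log2-odds scores."""
    counts = np.asarray(counts, dtype=float)
    freqs = (counts + pseudocount) / (counts.sum(axis=0) + 4 * pseudocount)
    return np.log2(freqs / background)


def scan(seq, pwm, rel_cutoff=0.80):
    """Return (forward-strand start, strand, matched window, relative score) hits."""
    seq = seq.upper()
    width = pwm.shape[1]
    min_s, max_s = pwm.min(axis=0).sum(), pwm.max(axis=0).sum()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start in range(len(s) - width + 1):
            window = s[start:start + width]
            if any(b not in BASES for b in window):
                continue                      # skip windows containing ambiguous bases
            score = sum(pwm[BASES.index(b), k] for k, b in enumerate(window))
            rel = (score - min_s) / (max_s - min_s)
            if rel >= rel_cutoff:
                # map minus-strand windows back to forward-strand coordinates
                pos = start if strand == "+" else len(seq) - start - width
                hits.append((pos, strand, window, round(float(rel), 3)))
    return hits


# Toy count matrix built from ten identical CCAAT sites (illustrative only).
counts = [[0, 0, 10, 10, 0],    # A
          [10, 10, 0, 0, 0],    # C
          [0, 0, 0, 0, 0],      # G
          [0, 0, 0, 0, 10]]     # T
pwm = pwm_from_counts(counts)
hits = scan("GTCCAATGTCATTGGTC", pwm)
print(hits)
```

In the workflow described above, a site would be kept only if a second, independent predictor also reported it, which could be expressed as a set intersection over (position, strand) pairs from the two tools.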
The authors are grateful for the financial support provided by Genome Canada, Genome Alberta, Alberta Advanced Education and Technology, and the Canada Research Chairs Program. We also thank three anonymous reviewers for their helpful comments and suggestions.
- Baud S, Lepiniec L: Regulation of de novo fatty acid synthesis in maturing oilseeds of Arabidopsis. Plant Physiol Biochem. 2009, 47 (6): 448-455. 10.1016/j.plaphy.2008.12.006.
- Verdier J, Thompson RD: Transcriptional regulation of storage protein synthesis during dicotyledon seed filling. Plant Cell Physiol. 2008, 49 (9): 1263-1271. 10.1093/pcp/pcn116.
- Weselake RJ, Taylor DC, Rahman MH, Shah S, Laroche A, McVetty PB, Harwood JL: Increasing the flow of carbon into seed oil. Biotechnol Adv. 2009, 27 (6): 866-878. 10.1016/j.biotechadv.2009.07.001.
- North H, Baud S, Debeaujon I, Dubos C, Dubreucq B, Grappin P, Jullien M, Lepiniec L, Marion-Poll A, Miquel M, Rajjou L, Routaboul JM, Caboche M: Arabidopsis seed secrets unravelled after a decade of genetic and omics-driven research. Plant J. 2010, 61 (6): 971-981. 10.1111/j.1365-313X.2009.04095.x.
- Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- West M, Yee KM, Danao J, Zimmerman JL, Fischer RL, Goldberg RB, Harada JJ: LEAFY COTYLEDON1 Is an Essential Regulator of Late Embryogenesis and Cotyledon Identity in Arabidopsis. Plant Cell. 1994, 6 (12): 1731-1745.
- Parcy F, Valon C, Kohara A, Misera S, Giraudat J: The ABSCISIC ACID-INSENSITIVE3, FUSCA3, and LEAFY COTYLEDON1 loci act in concert to control multiple aspects of Arabidopsis seed development. Plant Cell. 1997, 9 (8): 1265-1277.
- Lotan T, Ohto M, Yee KM, West MA, Lo R, Kwong RW, Yamagishi K, Fischer RL, Goldberg RB, Harada JJ: Arabidopsis LEAFY COTYLEDON1 is sufficient to induce embryo development in vegetative cells. Cell. 1998, 93 (7): 1195-1205. 10.1016/S0092-8674(00)81463-4.
- Luerssen H, Kirik V, Herrmann P, Misera S: FUSCA3 encodes a protein with a conserved VP1/ABI3-like B3 domain which is of functional importance for the regulation of seed maturation in Arabidopsis thaliana. Plant J. 1998, 15 (6): 755-764. 10.1046/j.1365-313X.1998.00259.x.
- Stone SL, Kwong LW, Yee KM, Pelletier J, Lepiniec L, Fischer RL, Goldberg RB, Harada JJ: LEAFY COTYLEDON2 encodes a B3 domain transcription factor that induces embryo development. Proc Natl Acad Sci USA. 2001, 98 (20): 11806-11811. 10.1073/pnas.201413498.
- Kwong RW, Bui AQ, Lee H, Kwong LW, Fischer RL, Goldberg RB, Harada JJ: LEAFY COTYLEDON1-LIKE defines a class of regulators essential for embryo development. Plant Cell. 2003, 15 (1): 5-18. 10.1105/tpc.006973.
- Lee H, Fischer RL, Goldberg RB, Harada JJ: Arabidopsis LEAFY COTYLEDON1 represents a functionally specialized subunit of the CCAAT binding transcription factor. Proc Natl Acad Sci USA. 2003, 100 (4): 2152-2156. 10.1073/pnas.0437909100.
- Monke G, Altschmied L, Tewes A, Reidt W, Mock HP, Baumlein H, Conrad U: Seed-specific transcription factors ABI3 and FUS3: molecular interaction with DNA. Planta. 2004, 219 (1): 158-166. 10.1007/s00425-004-1206-9.
- Santos-Mendoza M, Dubreucq B, Miquel M, Caboche M, Lepiniec L: LEAFY COTYLEDON 2 activation is sufficient to trigger the accumulation of oil and seed specific mRNAs in Arabidopsis leaves. FEBS Lett. 2005, 579 (21): 4666-4670. 10.1016/j.febslet.2005.07.037.
- Braybrook SA, Stone SL, Park S, Bui AQ, Le BH, Fischer RL, Goldberg RB, Harada JJ: Genes directly regulated by LEAFY COTYLEDON2 provide insight into the control of embryo maturation and somatic embryogenesis. Proc Natl Acad Sci USA. 2006, 103 (9): 3468-3473. 10.1073/pnas.0511331103.
- Braybrook SA, Harada JJ: LECs go crazy in embryo development. Trends Plant Sci. 2008, 13 (12): 624-630. 10.1016/j.tplants.2008.09.008.
- Mu J, Tan H, Zheng Q, Fu F, Liang Y, Zhang J, Yang X, Wang T, Chong K, Wang XJ, Zuo J: LEAFY COTYLEDON1 is a key regulator of fatty acid biosynthesis in Arabidopsis. Plant Physiol. 2008, 148 (2): 1042-1054. 10.1104/pp.108.126342.
- Baud S, Mendoza MS, To A, Harscoet E, Lepiniec L, Dubreucq B: WRINKLED1 specifies the regulatory action of LEAFY COTYLEDON2 towards fatty acid metabolism during seed maturation in Arabidopsis. Plant J. 2007, 50 (5): 825-838. 10.1111/j.1365-313X.2007.03092.x.
- Brocard-Gifford IM, Lynch TJ, Finkelstein RR: Regulatory networks in seeds integrating developmental, abscisic acid, sugar, and light signaling. Plant Physiol. 2003, 131 (1): 78-92. 10.1104/pp.011916.
- To A, Valon C, Savino G, Guilleminot J, Devic M, Giraudat J, Parcy F: A network of local and redundant gene regulation governs Arabidopsis seed maturation. Plant Cell. 2006, 18 (7): 1642-1651. 10.1105/tpc.105.039925.
- Gutierrez L, Van Wuytswinkel O, Castelain M, Bellini C: Combined networks regulating seed maturation. Trends Plant Sci. 2007, 12 (7): 294-300. 10.1016/j.tplants.2007.06.003.
- Santos-Mendoza M, Dubreucq B, Baud S, Parcy F, Caboche M, Lepiniec L: Deciphering gene regulatory networks that control seed development and maturation in Arabidopsis. Plant J. 2008, 54 (4): 608-620. 10.1111/j.1365-313X.2008.03461.x.
- Angelovici R, Fait A, Zhu X, Szymanski J, Feldmesser E, Fernie AR, Galili G: Deciphering transcriptional and metabolic networks associated with lysine metabolism during Arabidopsis seed development. Plant Physiol. 2009, 151 (4): 2058-2072. 10.1104/pp.109.145631.
- Reidt W, Wohlfarth T, Ellerstrom M, Czihal A, Tewes A, Ezcurra I, Rask L, Baumlein H: Gene regulation during late embryogenesis: the RY motif of maturation-specific gene promoters is a direct target of the FUS3 gene product. Plant J. 2000, 21 (5): 401-408. 10.1046/j.1365-313x.2000.00686.x.
- Yamamoto A, Kagaya Y, Toyoshima R, Kagaya M, Takeda S, Hattori T: Arabidopsis NF-YB subunits LEC1 and LEC1-LIKE activate transcription by interacting with seed-specific ABRE-binding factors. Plant J. 2009, 58 (5): 843-856. 10.1111/j.1365-313X.2009.03817.x.
- Cernac A, Benning C: WRINKLED1 encodes an AP2/EREB domain protein involved in the control of storage compound biosynthesis in Arabidopsis. Plant J. 2004, 40 (4): 575-585. 10.1111/j.1365-313X.2004.02235.x.
- Maeo K, Tokuda T, Ayame A, Mitsui N, Kawai T, Tsukagoshi H, Ishiguro S, Nakamura K: An AP2-type transcription factor, WRINKLED1, of Arabidopsis thaliana binds to the AW-box sequence conserved among proximal upstream regions of genes involved in fatty acid synthesis. Plant J. 2009, 60 (3): 476-487. 10.1111/j.1365-313X.2009.03967.x.
- Bensmihen S, Rippa S, Lambert G, Jublot D, Pautot V, Granier F, Giraudat J, Parcy F: The homologous ABI5 and EEL transcription factors function antagonistically to fine-tune gene expression during late embryogenesis. Plant Cell. 2002, 14 (6): 1391-1403. 10.1105/tpc.000869.
- Alonso R, Onate-Sanchez L, Weltmeier F, Ehlert A, Diaz I, Dietrich K, Vicente-Carbajosa J, Droge-Laser W: A pivotal role of the basic leucine zipper transcription factor bZIP53 in the regulation of Arabidopsis seed maturation gene expression based on heterodimerization and protein complex formation. Plant Cell. 2009, 21 (6): 1747-1761. 10.1105/tpc.108.062968.
- Kroj T, Savino G, Valon C, Giraudat J, Parcy F: Regulation of storage protein gene expression in Arabidopsis. Development. 2003, 130 (24): 6065-6073. 10.1242/dev.00814.
- Kagaya Y, Okuda R, Ban A, Toyoshima R, Tsutsumida K, Usui H, Yamamoto A, Hattori T: Indirect ABA-dependent regulation of seed storage protein genes by FUSCA3 transcription factor in Arabidopsis. Plant Cell Physiol. 2005, 46 (2): 300-311. 10.1093/pcp/pci031.
- Kagaya Y, Toyoshima R, Okuda R, Usui H, Yamamoto A, Hattori T: LEAFY COTYLEDON1 controls seed storage protein genes through its regulation of FUSCA3 and ABSCISIC ACID INSENSITIVE3. Plant Cell Physiol. 2005, 46 (3): 399-406. 10.1093/pcp/pci048.
- Girke T, Todd J, Ruuska S, White J, Benning C, Ohlrogge J: Microarray analysis of developing Arabidopsis seeds. Plant Physiol. 2000, 124 (4): 1570-1581. 10.1104/pp.124.4.1570.
- Ruuska SA, Girke T, Benning C, Ohlrogge JB: Contrapuntal networks of gene expression during Arabidopsis seed filling. Plant Cell. 2002, 14 (6): 1191-1206. 10.1105/tpc.000877.
- Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU: A gene expression map of Arabidopsis thaliana development. Nat Genet. 2005, 37 (5): 501-506. 10.1038/ng1543.
- Wang H, Guo J, Lambert KN, Lin Y: Developmental control of Arabidopsis seed oil biosynthesis. Planta. 2007, 226 (3): 773-783. 10.1007/s00425-007-0524-0.
- Volodarsky D, Leviatan N, Otcheretianski A, Fluhr R: HORMONOMETER: a tool for discerning transcript signatures of hormone action in the Arabidopsis transcriptome. Plant Physiol. 2009, 150 (4): 1796-1805. 10.1104/pp.109.138289.
- Vandepoele K, Quimbaya M, Casneuf T, De Veylder L, Van de Peer Y: Unraveling transcriptional control in Arabidopsis using cis-regulatory elements and coexpression networks. Plant Physiol. 2009, 150 (2): 535-546. 10.1104/pp.109.136028.
- Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H, Kinoshita K: COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 2008, 36 (Database issue): D77-82.
- Nayak RR, Kearns M, Spielman RS, Cheung VG: Coexpression network based on natural variation in human gene expression reveals gene interactions and functions. Genome Res. 2009, 19 (11): 1953-1962. 10.1101/gr.097600.109.
- Mutwil M, Usadel B, Schutte M, Loraine A, Ebenhoh O, Persson S: Assembly of an interactive correlation network for the Arabidopsis genome using a novel heuristic clustering algorithm. Plant Physiol. 2010, 152 (1): 29-43. 10.1104/pp.109.145318.
- Hruz T, Laule O, Szabo G, Wessendorp F, Bleuler S, Oertle L, Widmayer P, Gruissem W, Zimmermann P: Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes. Adv Bioinformatics 2008. 2008, 147 (3): 1004-1016.
- Srinivasasainagendra V, Page GP, Mehta T, Coulibaly I, Loraine AE: CressExpress: a tool for large-scale mining of expression data from Arabidopsis. Plant Physiol. 2008, 147 (3): 1004-1016. 10.1104/pp.107.115535.
- Wolfe CJ, Kohane IS, Butte AJ: Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics. 2005, 6: 227. 10.1186/1471-2105-6-227.
- Marco A, Konikoff C, Karr TL, Kumar S: Relationship between gene co-expression and sharing of transcription factor binding sites in Drosophila melanogaster. Bioinformatics. 2009, 25 (19): 2473-2477. 10.1093/bioinformatics/btp462.
- Xulvi-Brunet R, Li H: Co-expression networks: graph properties and topological comparisons. Bioinformatics. 2010, 26 (2): 205-214. 10.1093/bioinformatics/btp632.
- Kreiman G: Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes. Nucleic Acids Res. 2004, 32 (9): 2889-2900. 10.1093/nar/gkh614.
- Haberer G, Mader MT, Kosarev P, Spannagl M, Yang L, Mayer KF: Large-scale cis-element detection by analysis of correlated expression and sequence conservation between Arabidopsis and Brassica oleracea. Plant Physiol. 2006, 142 (4): 1589-1602. 10.1104/pp.106.085639.
- Obayashi T, Kinoshita K, Nakai K, Shibaoka M, Hayashi S, Saeki M, Shibata D, Saito K, Ohta H: ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res. 2007, 35 (Database issue): D863-9.
- Lenka SK, Lohia B, Kumar A, Chinnusamy V, Bansal KC: Genome-wide targeted prediction of ABA responsive genes in rice based on overrepresented cis-motif in co-expressed genes. Plant Mol Biol. 2009, 69 (3): 261-271. 10.1007/s11103-008-9423-4.PubMedView ArticleGoogle Scholar
- Goeman JJ, van de Geer SA, de Kort F, van Houwelingen HC: A global test for groups of genes: testing association with a clinical outcome. Bioinformatics. 2004, 20 (1): 93-99. 10.1093/bioinformatics/btg382.PubMedView ArticleGoogle Scholar
- Yang YH, Xiao Y, Segal MR: Identifying differentially expressed genes from microarray experiments via statistic synthesis. Bioinformatics. 2005, 21 (7): 1084-1093. 10.1093/bioinformatics/bti108.PubMedView ArticleGoogle Scholar
- Chen X, Truksa M, Shah S, Weselake RJ: A survey of quantitative real-time PCR internal reference genes for expression studies in Brassica napus. Anal Biochem. 2010, 405 (1): 138-140. 10.1016/j.ab.2010.05.032.PubMedView ArticleGoogle Scholar
- Watts DJ, Strogatz SH: Collective dynamics of 'small-world' networks. Nature. 1998, 393 (6684): 440-442. 10.1038/30918.PubMedView ArticleGoogle Scholar
- Zhang B, Horvath S: A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005, 4: 17-Google Scholar
- Albert R: Scale-free networks in cell biology. J Cell Sci. 2005, 118 (Pt 21): 4947-4957.PubMedView ArticleGoogle Scholar
- Borgatti SP, Mehra A, Brass DJ, Labianca G: Network analysis in the social sciences. Science. 2009, 323 (5916): 892-895. 10.1126/science.1165821.PubMedView ArticleGoogle Scholar
- Aoki K, Ogata Y, Shibata D: Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol. 2007, 48 (3): 381-390. 10.1093/pcp/pcm013.PubMedView ArticleGoogle Scholar
- Beisson F, Koo AJ, Ruuska S, Schwender J, Pollard M, Thelen JJ, Paddock T, Salas JJ, Savage L, Milcamps A, Mhaske VB, Cho Y, Ohlrogge JB: Arabidopsis genes involved in acyl lipid metabolism. A 2003 census of the candidates, a study of the distribution of expressed sequence tags in organs, and a web-based database. Plant Physiol. 2003, 132 (2): 681-697. 10.1104/pp.103.022988.PubMedPubMed CentralView ArticleGoogle Scholar
- Eccleston VS, Ohlrogge JB: Expression of lauroyl-acyl carrier protein thioesterase in brassica napus seeds induces pathways for both fatty acid oxidation and biosynthesis and implies a set point for triacylglycerol accumulation. Plant Cell. 1998, 10 (4): 613-622.PubMedPubMed CentralGoogle Scholar
- Ramli US, Baker DS, Quant PA, Harwood JL: Control analysis of lipid biosynthesis in tissue cultures from oil crops shows that flux control is shared between fatty acid synthesis and lipid assembly. Biochem J. 2002, 364 (Pt 2): 393-401.PubMedPubMed CentralView ArticleGoogle Scholar
- Bao X, Ohlrogge J: Supply of fatty acid is one limiting factor in the accumulation of triacylglycerol in developing embryos. Plant Physiol. 1999, 120 (4): 1057-1062. 10.1104/pp.120.4.1057.PubMedPubMed CentralView ArticleGoogle Scholar
- Weselake RJ, Shah S, Tang M, Quant PA, Snyder CL, Furukawa-Stoffer TL, Zhu W, Taylor DC, Zou J, Kumar A, Hall L, Laroche A, Rakow G, Raney P, Moloney MM, Harwood JL: Metabolic control analysis is helpful for informed genetic manipulation of oilseed rape (Brassica napus) to increase seed oil content. J Exp Bot. 2008, 59 (13): 3543-3549. 10.1093/jxb/ern206.PubMedPubMed CentralView ArticleGoogle Scholar
- Weselake RJ: Storage lipids. Plant Lipids - Biology, Utilization and Manipulation. Edited by: Murphy DJ. 2005, Oxford, UK: Blackwell Publishing, 162-221.Google Scholar
- Ohlrogge J, Browse J: Lipid biosynthesis. Plant Cell. 1995, 7 (7): 957-970.PubMedPubMed CentralView ArticleGoogle Scholar
- Saha S, Enugutti B, Rajakumari S, Rajasekharan R: Cytosolic triacylglycerol biosynthetic pathway in oilseeds. Molecular cloning and expression of peanut cytosolic diacylglycerol acyltransferase. Plant Physiol. 2006, 141 (4): 1533-1543. 10.1104/pp.106.082198.PubMedPubMed CentralView ArticleGoogle Scholar
- Tzen J, Cao Y, Laurent P, Ratnayake C, Huang A: Lipids, Proteins, and Structure of Seed Oil Bodies from Diverse Species. Plant Physiol. 1993, 101 (1): 267-276.PubMedPubMed CentralView ArticleGoogle Scholar
- Weselake RJ, Pomeroy MK, Furukawa TL, Golden JL, Little DB, Laroche A: Developmental profile of diacylglycerol acyltransferase in maturing seeds of oilseed rape and safflower and microspore-derived cultures of oilseed rape. Plant Physiol. 1993, 102 (2): 565-571.PubMedPubMed CentralView ArticleGoogle Scholar
- Gao MJ, Lydiate DJ, Li X, Lui H, Gjetvaj B, Hegedus DD, Rozwadowski K: Repression of seed maturation genes by a trihelix transcriptional repressor in Arabidopsis seedlings. Plant Cell. 2009, 21 (1): 54-71. 10.1105/tpc.108.061309.PubMedPubMed CentralView ArticleGoogle Scholar
- Ogas J, Kaufmann S, Henderson J, Somerville C: PICKLE is a CHD3 chromatin-remodeling factor that regulates the transition from embryonic to vegetative development in Arabidopsis. Proc Natl Acad Sci USA. 1999, 96 (24): 13839-13844. 10.1073/pnas.96.24.13839.PubMedPubMed CentralView ArticleGoogle Scholar
- Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, Radenbaugh A, Singh S, Swing V, Tissier C, Zhang P, Huala E: The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 2008, 36 (Database issue): D1009-14.PubMedGoogle Scholar
- Finkelstein RR, Lynch TJ: The Arabidopsis abscisic acid response gene ABI5 encodes a basic leucine zipper transcription factor. Plant Cell. 2000, 12 (4): 599-609.PubMedPubMed CentralView ArticleGoogle Scholar
- Thomas-Chollier M, Sand O, Turatsinze JV, Janky R, Defrance M, Vervisch E, Brohee S, van Helden J: RSAT: regulatory sequence analysis tools. Nucleic Acids Res. 2008, 36 (Web Server Issue): W119-27.PubMedPubMed CentralView ArticleGoogle Scholar
- Megraw M, Baev V, Rusinov V, Jensen ST, Kalantidis K, Hatzigeorgiou AG: MicroRNA promoter element discovery in Arabidopsis. RNA. 2006, 12 (9): 1612-1619. 10.1261/rna.130506.PubMedPubMed CentralView ArticleGoogle Scholar
- Pandey SP, Krishnamachari A: Computational analysis of plant RNA Pol-II promoters. BioSystems. 2006, 83: 38-50. 10.1016/j.biosystems.2005.09.001.PubMedView ArticleGoogle Scholar
- Lenhard B, Wasserman WW: TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics. 2002, 18 (8): 1135-1136. 10.1093/bioinformatics/18.8.1135.PubMedView ArticleGoogle Scholar
- Li L, Bass RL, Liang Y: fdrMotif: identifying cis-elements by an EM algorithm coupled with false discovery rate control. Bioinformatics. 2008, 24 (5): 629-636. 10.1093/bioinformatics/btn009.PubMedPubMed CentralView ArticleGoogle Scholar
- Bryne JC, Valen E, Tang MH, Marstrand T, Winther O, da Piedade I, Krogh A, Lenhard B, Sandelin A: JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res. 2008, 36 (Database Issue): D102-106.PubMedGoogle Scholar
- Yanagisawa S, Sheen J: Involvement of maize Dof zinc finger proteins in tissue-specific and light-regulated gene expression. Plant Cell. 1998, 10 (1): 75-89.PubMedPubMed CentralView ArticleGoogle Scholar
- Jiao Y, Ma L, Strickland E, Deng XW: Conservation and divergence of light-regulated genome expression patterns during seedling development in rice and Arabidopsis. Plant Cell. 2005, 17 (12): 3239-3256. 10.1105/tpc.105.035840.PubMedPubMed CentralView ArticleGoogle Scholar
- Muller B, Sheen J: Cytokinin and auxin interaction in root stem-cell specification during early embryogenesis. Nature. 2008, 453 (7198): 1094-1097. 10.1038/nature06943.PubMedPubMed CentralView ArticleGoogle Scholar
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5 (10): R80-10.1186/gb-2004-5-10-r80.PubMedPubMed CentralView ArticleGoogle Scholar
- R Development Core Team: R: a language and environment for statistical computing. The R Foundation for Statistical Computing. 2008, Vienna, AustriaGoogle Scholar
- Wilson CL, Miller CJ: Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics. 2005, 21 (18): 3683-3685. 10.1093/bioinformatics/bti605.PubMedView ArticleGoogle Scholar
- Wu ZJ, Irizarry RA, Gentleman R, Martinez-Murillo F, Spencer F: A model-based background adjustment for oligonucleotide expression arrays. J Am Stat Assoc. 2003, 99: 909-917.View ArticleGoogle Scholar
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4 (2): 249-264. 10.1093/biostatistics/4.2.249.PubMedView ArticleGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.PubMedPubMed CentralView ArticleGoogle Scholar
- Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD: Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007, 2 (10): 2366-2382. 10.1038/nprot.2007.324.PubMedPubMed CentralView ArticleGoogle Scholar
- Futschik ME, Carlisle B: Noise-robust soft clustering of gene expression time-course data. J Bioinform Comput Biol. 2005, 3 (4): 965-988. 10.1142/S0219720005001375.PubMedView ArticleGoogle Scholar
- Molina C, Grotewold E: Genome wide analysis of Arabidopsis core promoters. BMC Genomics. 2005, 6 (1): 25-10.1186/1471-2164-6-25.PubMedPubMed CentralView ArticleGoogle Scholar
- Palaniswamy SK, James S, Sun H, Lamb RS, Davuluri RV, Grotewold E: AGRIS and AtRegNet. A platform to link cis-regulatory elements and transcription factors into regulatory networks. Plant Physiol. 2006, 140 (3): 818-829. 10.1104/pp.105.072280.PubMedPubMed CentralView ArticleGoogle Scholar
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14 (6): 1188-1190. 10.1101/gr.849004.PubMedPubMed CentralView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.