Identification of nutrient partitioning genes participating in rice grain filling by singular value decomposition (SVD) of genome expression data
© Anderson et al; licensee BioMed Central Ltd. 2003
Received: 04 April 2003
Accepted: 10 July 2003
Published: 10 July 2003
In order to identify rice genes involved in nutrient partitioning, microarray experiments have been done to quantify genomic scale gene expression. Genes involved in nutrient partitioning, specifically grain filling, will be used to identify other co-regulated genes and DNA binding proteins. Proper identification of the initial set of bait genes used for further investigation is critical. Hierarchical clustering is useful for grouping genes with similar expression profiles, but its utility decreases as data complexity and systematic noise increase. Also, its rigid classification of genes is not consistent with our belief that some genes exhibit multifaceted, context-dependent regulation.
Singular value decomposition (SVD) of microarray data was investigated as a method to complement current techniques for gene expression pattern recognition. The usefulness of SVD in finding likely participants in grain filling was measured by comparison with results obtained previously via clustering. Eighty-four percent of these known grain-filling genes were re-identified after detailed SVD analysis. An additional set of 28 genes exhibited a stronger grain-filling pattern than those grain-filling genes that were unselected. They also had upstream sequences containing motifs over-represented among grain-filling genes.
The pattern-based perspective that SVD provides complements widely used clustering methods. The singular vectors provide information about patterns that exist in the data. Other aspects of the decomposition indicate the extent to which a gene exhibits a pattern similar to those provided by the singular vectors. Thus, once a set of interesting patterns has been identified, genes can be ranked by their relationship with said patterns.
Grain filling aspects of nutrient partitioning are intensely studied as they affect the yield and quality of many important cereals. This quality can be measured in nutritional and aesthetic terms. The grain-filling process of cereal development typically has two processes: dilatory and filling. Together these processes encompass the synthesis, transport, and storage of carbohydrates, fatty acids, proteins, and minerals. The dilatory process is characterized by high biosynthetic activity and low dry matter accumulation. During the filling phase all plant resources contribute toward a steady rate of starch accumulation in the starch storage unit. Genes that influence the grain filling process are particularly important in achieving the goal of manipulating nutrient partitioning pathways.
In Zhu et al. (2003), several genes responsible for grain filling in rice were computationally identified. There, clustering of gene expression profiles was used to identify grain filling genes and their transcription factors from 21,000 rice genes. The method consisted of an initial identification of nutrient partitioning genes based on annotation, and selection of genes that potentially participate in the grain-filling process by clustering of expression profiles via Self-Organizing Map (SOM), followed by hierarchical clustering influenced by the SOM gene ordering. A set of grain filling related, nutrient partitioning gene clusters was identified via informed visual inspection of the hierarchical clustering results. This initial set of genes formed the sole basis for identification of a wider range of grain filling related genes with diverse functions, over-represented cis-acting regulatory elements, and associated transcription factors. Such an approach provided a powerful way to associate genes with traits of interest, to identify key regulators as putative target genes in this complicated biological process, and to identify strategies for improvement of crop yield and nutrient value by pathway engineering. However, the identified genes and their regulatory networks require thorough functional validation by experimental methods such as reverse genetics. These experimental validation steps are usually time-consuming and expensive. Thus, improvement of microarray data analysis by false positive reduction becomes necessary.
Competitive learning schemes like the Kohonen SOM and hierarchical clustering are popular methods for visualization and identification of patterns in a large set of gene expression profiles. SOM analysis can provide nonexclusive classifications, but requires an estimate for the number of classes (nodes) and is usually carried out in a low-dimensional space. Hierarchical clustering is a more frequently used method, but visualization via one-dimensional lists can lead to poor resolution of related genes even if a SOM gene ordering influences the branch flipping, as implemented in the software tool Cluster.
Recently, singular value decomposition (SVD) has emerged as an alternative method for genomic research. Several groups have demonstrated its utility in identifying global, cyclic patterns of gene expression [4, 5], and its application in reduction of experimental and biological noise in microarray datasets [5, 6]. SVD is a feature generation technique that facilitates the exploration of multiple dimensions of data variability. SVD is an operation applied to a matrix that results in a list of vectors, which contain features measuring different aspects of variation in the data. One can produce multiple nonexclusive gene orderings or classifications based on significant feature vectors. The patterns exhibited by one or more feature vectors, singly or in combination, may correspond to biological processes.
To improve accuracy of target identification by avoiding exclusive clustering and exploring a wider range of dimensionality in expression pattern variation, we have examined the utility of singular value decomposition in identifying grain-filling genes. In this manuscript, we focus on nutrient partitioning genes potentially involved in grain filling. After evaluating the full spectrum of expression patterns, we address the identification of grain filling genes by a measure of correlation with a familiar expression pattern, conceptually conforming to grain filling. In this manner we have identified several genes potentially involved in grain filling, and evaluated them by comparison with grain filling genes identified in an earlier study. These genes have similar expression profiles showing significant differential expression during rice grain development and tissue specificity to panicle and grain. Cis-acting regulatory element surveys also support their role in the grain filling process.
List of Probe Set IDs for newly identified genes (Class C) and unresolved genes (Class B)
a) Class C
b) Class B
YNT1_ANASP Q05067 ANABAENA SP. (STRAIN PCC 7120). HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN IN NTCA/BIFA 3' REGION (ORF1) (FRAGMENT).
gi|7267708| sucrose-phosphate synthase-like protein [Arabidopsis thaliana]
gi|4467848| ADP-glucose pyrophosphorylase large subunit [Hordeum vulgare]
gi|7576210| palmitoyl-protein thioesterase precursor-like [Arabidopsis thaliana]
gi|4467144| putative phosphatidylinositol synthase [Arabidopsis thaliana]
gi|5734442| hexose transporter [Lycopersicon esculentum]
gi|7108597|AF129478_1 K+ transporter HAK5 [Arabidopsis thaliana]
ABCX_PORPU P51241 PORPHYRA PURPUREA. PROBABLE ATP-DEPENDENT TRANSPORTER YCF16.
gi|322850|PC1257 alpha-amylase (EC 3.2.1.1) (clone alphaAmy8-C) – rice (fragment)
PA2L_VIPAA P17935 VIPERA AMMODYTES AMMODYTES (WESTERN SAND VIPER). PHOSPHOLIPASE A2 HOMOLOG, AMMODYTIN L PRECURSOR.
AMYG_NEUCR P14804 NEUROSPORA CRASSA. GLUCOAMYLASE PRECURSOR
PSS_METJA Q58609 METHANOCOCCUS JANNASCHII. CDP-DIACYLGLYCEROL – SERINE O-PHOSPHATIDYLTRANSFERASE (EC 2.7.8.8) (PHOSPHATIDYLSERINE SYNTHASE).
gb|Z95637 acyl-CoA:1-acylglycerol-3-phosphate acyltransferase from Brassica napus. [Arabidopsis thaliana]
gi|2129657|S71286 oleosin, 20 K – Arabidopsis thaliana
PTSN_ECOLI P31222 ESCHERICHIA COLI. NITROGEN REGULATORY IIA PROTEIN
gi|7939571| phospholipase D [Arabidopsis thaliana]
gi|2981620| mutated 3-ketoacyl-CoA thiolase [Arabidopsis thaliana]
KDGL_DROME Q01583 DROSOPHILA MELANOGASTER (FRUIT FLY). DIACYLGLYCEROL KINASE (EC 2.7.1.107) (DIGLYCERIDE KINASE) (DGK) (DAG KINASE).
YDEX_ECOLI P77257 ESCHERICHIA COLI. HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YDEX.
PTFA_MYCGE P47308 MYCOPLASMA GENITALIUM. PTS SYSTEM, FRUCTOSE-SPECIFIC IIABC COMPONENT (EIIABC-FRU) (FRUCTOSE-PERMEASE IIABC COMPONENT) (PHOSPHOTRANSFERASE ENZYME II, ABC COMPONENT) (EC 2.7.1.69) (EII-FRU / EIII-FRU).
gi|4587543|AC006577_10 Belongs to the PF|00657 Lipase/Acylhydrolase with GDSL-motif family.
GPI2_YEAST P46961 P48014 SACCHAROMYCES CEREVISIAE (BAKER'S YEAST). N-ACETYLGLUCOSAMINYL-PHOSPHATIDYLINOSITOL BIOSYNTHETIC PROTEIN GPI2.
TRPC_AZOBR P26938 AZOSPIRILLUM BRASILENSE. INDOLE-3-GLYCEROL PHOSPHATE SYNTHASE
gi|2285885| sulfate transporter [Arabidopsis thaliana]
FCGN_HUMAN P55899 HOMO SAPIENS (HUMAN). IGG RECEPTOR FCRN LARGE SUBUNIT P51 PRECURSOR
AMHX_BACSU P54983 BACILLUS SUBTILIS. AMIDOHYDROLASE AMHX (EC 3.5.1.-) (AMINOACYLASE).
KICH_YEAST P20485 SACCHAROMYCES CEREVISIAE (BAKER'S YEAST). CHOLINE KINASE (EC 2.7.1.32).
YKG8_CAEEL P46558 CAENORHABDITIS ELEGANS. HYPOTHETICAL CHOLINE KINASE LIKE B0285.8 IN CHROMOSOME III.
gi|735880| geranylgeranyl pyrophosphate synthase-related protein
gi|2129660|S69197 oleoyl-[acyl-carrier-protein] hydrolase (EC 3.1.2.14) (clone TE 1–7) – Arabidopsis thaliana
YDEZ_ECOLI P77651 ESCHERICHIA COLI. HYPOTHETICAL ABC TRANSPORTER PERMEASE PROTEIN YDEZ.
YCKJ_BACSU P42200 BACILLUS SUBTILIS. PROBABLE AMINO-ACID ABC TRANSPORTER PERMEASE PROTEIN.
R104_SACPA Q92378 SACCHAROMYCES PARADOXUS (YEAST). MEIOTIC RECOMBINATION PROTEIN REC104.
gi|3044212| acyl-CoA oxidase [Arabidopsis thaliana]
GLGC_BACCL P30522 BACILLUS CALDOLYTICUS. GLUCOSE-1-PHOSPHATE ADENYLYLTRANSFERASE (EC 2.7.7.27) (ADP-GLUCOSE SYNTHASE) (ADP-GLUCOSE PYROPHOSPHORYLASE) (FRAGMENT).
NEPU_THEVU Q08751 THERMOACTINOMYCES VULGARIS. NEOPULLULANASE (EC 3.2.1.135) (ALPHA-AMYLASE II).
gi|4490321| nitrate transporter [Arabidopsis thaliana]
gi|9294650| lipase/acylhydrolase; myrosinase-associated protein [Arabidopsis thaliana]
gi|4115931| contains similarity to Guillardia theta ABC transporter (GB:AF041468) [Arabidopsis thaliana]
gi|8570057| ESTs AU056822(S20908), C26441(C12328), C28477(C61243) correspond to a region of the predicted gene.~Arabidopsis thaliana putative acyl-coA dehydrogenase (AF049236) [Oryza sativa]
The involvement of the novel set (C) of genes in grain filling is supported by the over-representation of the grain filling cis-element. A survey of cis elements of the C set of genes shows that they have more in common with grain filling genes than the unselected genes do. The element AACA was found to be over-represented among grain-filling genes in earlier work by Zhu et al., and is more abundant among the novel genes. AACA was part of the motif CAACA, which occurred in 12 of the 14 promoters. This motif was described as the RAV1 AAT binding consensus sequence of the Arabidopsis thaliana transcription factor RAV1. AACA was also found in an E4-TATA Box element contained in 6 of the 14 promoters. In the set of unselected genes, AACA was found in the motif TAACAAA, which occurred in only 2 of the 8 promoters. This motif was described as a binding site for GAMYB. Considering the similarities in expression pattern and promoter elements between the novel genes and the previously identified grain filling genes, the novel set of genes is likely to be involved in grain filling as well.
Our association of genes to singular vectors was not exclusive. A gene's expression profile can be described as a linear combination of the right singular vectors, and said gene can be associated with those vectors that contribute the most. Clustering provides nested clusters with mutual exclusivity among clusters at a given correlation threshold. When there are genes with low intensity patterns, clustering can result in groups with a mixture of patterns and low internal correlation. Sets of genes strongly correlated with a particular singular vector will have greater internal correlation. Despite these differences, both methods agree on a majority of grain filling genes, which had very significant differential expression during grain development. This could be expected as both clustering and SVD seek to minimize the squared error.
The area of disagreement between methods concerned genes with low differential expression. They were listed as grain filling in the control set but were not selected with the SVD method. These unselected genes are not any less important than those identified with SVD; they simply exhibit small changes in expression level during grain filling, changes that may be insignificant given the presence of errors. In many cases regulatory genes have small changes in expression level, while target genes further along in the cascade have larger changes. The increased sensitivity of pattern detection will improve our ability to extract these target genes before using them to search for regulatory genes with other methods. Almost all of these genes with a weak grain-filling pattern were originally part of hierarchical clustering nodes with very low internal correlation (average pair-wise correlation). A more stringent classification that excluded such nodes would have resulted in a more concordant set of gene expression profiles.
The unselected set might have been misclassified in the earlier study, possibly carrying out roles important in the grain filling process but at sites physically distant from the grain body itself. Their expression profiles did not follow a pattern similar to that of known grain filling genes, and during vegetative growth their expression levels were consistently higher. These genes, in general, lacked promoter elements common to grain filling genes. They also had conserved promoter elements that were not found in grain filling genes. If functional in rice, their light-repressed elements would explain the higher expression in root, which grows in dark conditions. It is possible that the higher expression levels observed during stem and leaf senescence are due to suppression of photosynthetic activity. Rice is a monocot plant and there are portions of stem that are etiolated due to blockage of light from permanent leaf encirclement. During senescence, the light sensitivity of stem and leaf is biochemically reduced. This may result in a de facto etiolated state, also explaining the elevated gene expression. The misclassification of these genes was due to the combined effects of including nodes with low internal correlation and information loss from hierarchical clustering. It should also be noted that a greater fraction of these unselected genes have potentially unreliable probe sets: 0.3, compared with 0.107 for the set of novel genes and 0.115 for the set of agreed-upon genes. This unreliability is indicated by the Probe Set ID suffixes 'r' and 'i', which mark sequences for which it was not possible to pick a full set of unique probes or for which there are fewer than fifteen probes.
When using hierarchical clustering to classify gene expression profiles, there are several drawbacks to consider. Generally, microarray data is information-rich, with multiple dimensions of variability. The ordering of genes produced by hierarchical clustering reduces this variability to a single dimension, which may not accurately reflect the differences between expression profiles. As a result, closeness in this single dimension may not reflect similarities occurring in a higher dimensional space. These factors impact difficult-to-classify profiles more significantly. In Zhu et al. (2003) this led to grain-filling genes with less obvious expression profiles being grouped with a mixture of other profiles, resulting in the selection of unreliable patterns over better candidates. The SOM ordering used to influence the hierarchical clustering ordering helped with this problem, but there are other ways to improve classification accuracy. To avoid these pitfalls, a more stringent selection criterion could have been used with the hierarchical clustering results to build a core set, followed by a profile ranking based on correlation with the core set. The second round of selection would help to recover profiles with a weaker pattern, which would have been randomly ordered by hierarchical clustering. Another way to avoid these pitfalls would be to take advantage of the many "fuzzy" clustering algorithms, which generate non-exclusive assignment of genes to clusters [12, 13].
While exploring this use of SVD, we learned a few caveats concerning the contribution of noise to the pattern spectrum. The matrix that is decomposed by SVD is usually a dissimilarity matrix like the covariance matrix. The singular values, $w_i$, are typically used to indicate the significance of each right singular vector to the dataset. The first n vectors that cumulatively account for greater than 90% of the dataset's variation are sometimes used to describe the dataset and reduce its dimensionality. The rest of the right singular vectors are then considered noise. This interpretation is not always correct and should be tempered by a study of the right singular vectors themselves. In the situation where the dataset contains significant additive noise or where the means are not centered, the previous assumption could result in one ignoring informative patterns exhibited by the right singular vectors with relatively low singular values. The primary signal would represent this noise (unconformable pattern) and mask the other patterns. Inspection of the primary right singular vector would then show a relatively constant level across all samples, which classifies experiments poorly and correlates poorly with differentially expressed genes. Although it would have been possible to filter out strong noise from the dataset and recalculate the SVD, this was not necessary. The condition was dealt with by focusing on right singular vectors that correlate with our conceptual grain-filling pattern, regardless of the magnitude of their singular values. These "grain filling" right singular vectors were then used to classify the genes in our dataset. In fact, a classification of genes using the right singular vector with the largest singular value was of low quality.
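The idea of choosing right singular vectors by their resemblance to a conceptual pattern, rather than by singular value, can be illustrated with a small sketch. Everything here is assumed for illustration: the function name `vectors_matching_template`, the 0.8 correlation threshold, the step-shaped template, and the toy matrix are hypothetical and are not taken from the study. Absolute correlation is used because the sign of a singular vector is arbitrary.

```python
import numpy as np

def vectors_matching_template(Vt, template, min_abs_corr=0.8):
    """Return indices of rows of Vt whose |Pearson r| with template meets the threshold."""
    corrs = np.array([np.corrcoef(v, template)[0, 1] for v in Vt])
    keep = np.flatnonzero(np.abs(corrs) >= min_abs_corr)
    return keep, corrs

# Toy matrix (hypothetical): a strong, nearly flat pattern plus a weak step pattern
f = np.array([1.0, 1.2, 1.0, 1.0, 1.2, 1.0])      # nearly constant across samples
s = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])   # low early, high late
f, s = f / np.linalg.norm(f), s / np.linalg.norm(s)
g1 = np.array([1.0, 1.0, 1.0, 1.0]) / 2.0         # gene loadings for the flat pattern
g2 = np.array([1.0, -1.0, 1.0, -1.0]) / 2.0       # gene loadings for the step pattern
A = 10.0 * np.outer(g1, f) + 1.0 * np.outer(g2, s)

U, w, Vt = np.linalg.svd(A, full_matrices=False)

# Assumed conceptual pattern: low in the first three samples, high in the last three
template = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
keep, corrs = vectors_matching_template(Vt, template)
print("singular values:", np.round(w, 3))
print("selected vector indices:", keep)
```

In this toy case the vector that matches the template is the second one, even though its singular value is ten times smaller than the first.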
As the coefficients for each right singular vector are used to classify genes and the genes in each class should ideally have coefficients different from genes in other classes, the challenge to identify a vector that produces a good classification can be simplified by measurement of the entropy for each vector's coefficient distribution. Vectors with the lowest entropy, even if they have a small singular value, have the most ordered coefficient distribution and may be quite useful in classifying genes into distinct groups. We will follow up on this idea in future applications of SVD to RNA dynamics.
SVD can be used to reduce the dimensionality of a data set, but our method uses the singular vectors generated by the decomposition to identify patterns that may relate to grain filling in rice. In this way, we attempt to avoid overlooking any dimensions of expression profile variance. A gene ranking based on similarity to interesting feature-vectors allows recovery of profiles with weaker but relevant grain filling patterns. This method selected genes with greater differential expression than the questionable set presented by clustering. The newly identified genes are important because they represent genes that have a stronger pattern of grain filling, which were not easily visually identified from the hierarchical clustering. It is likely that if the previous method only relied on SOM, more of these genes would have been identified. SVD provides much information about patterns of variability in a dataset rather than a rigid assignment of genes to clusters. This added perspective, plus the ability to amplify or attenuate specific patterns in the dataset, complements the classifications given by commonly used clustering techniques.
We conclude that SVD is a useful alternative method that complements widely used clustering methods for studying gene function. SVD identified grain-filling related genes, providing additional, valuable candidate genes for improving grain composition and yield.
The dataset comprised expression levels of 491 genes in 33 samples, with emphasis on the 17 samples directly related to grain filling. The complete dataset used is available at http://www.blackwell-science.com/products/journals/suppmat/PBI/PBI006/PBI006sm.htm. Based on their sequence annotation and functional classification, the 491 genes were selected because their products are presumably involved in or associated with three major pathways of nutrient partitioning: the synthesis and transport of fatty acids, carbohydrates, and proteins. The 17 grain filling related tissue samples include panicle 1–3 cm, panicle 4–7 cm, panicle 8–14 cm, panicle 15–20 cm, seed 0 day, seed 2 day, seed 4 day, seed 7 day, seed 9 day, seed (soft dough), seed (hard dough), embryo, endosperm, seed coat (milk stage), aleurone, and seed (milk stage). A complete description of the experimental protocols used to generate this dataset can be found in Zhu et al. (2003).
In our matrix A' each row corresponded to a different gene and each column corresponded to one of 17 different conditions. The $a_{ij}$ cell in A' was the expression level of gene i under condition j. The data in A' was transformed to the n × m matrix A according to the protocol in Zhu et al. (2002). During this transformation, values of $a_{ij}$ less than 5 were set equal to 5, and the values were then log2-transformed. Next, the expression vectors were median-centered and normalized such that the sum of squares for each expression vector was equal to one. In efforts to validate our results, we also investigated gene expression level in a wider range of samples, including the 17 mentioned above, totaling 33 samples. The normalization applied to this broader set was the same as that described for the set of 17, above. Note that the difference in sample number will affect the median centering and normalization steps, making smaller deviations from the median less obvious.
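The transformation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `preprocess` and the toy matrix are hypothetical, the log2 transform is assumed to apply to all values after flooring, and centering and normalization are assumed to act on each gene's expression vector (each row).

```python
import numpy as np

def preprocess(raw, floor=5.0):
    """Floor, log2-transform, median-center and unit-normalize each gene (row)."""
    a = np.maximum(np.asarray(raw, dtype=float), floor)   # values below the floor are set to it
    a = np.log2(a)
    a = a - np.median(a, axis=1, keepdims=True)           # median-center each expression vector
    norm = np.sqrt((a ** 2).sum(axis=1, keepdims=True))   # root of the sum of squares per gene
    norm[norm == 0] = 1.0                                 # flat profiles stay all zeros
    return a / norm

# Toy matrix: 3 hypothetical genes x 4 hypothetical conditions (not data from the study)
raw = np.array([[2.0, 10.0, 40.0, 160.0],
                [100.0, 100.0, 100.0, 100.0],
                [8.0, 16.0, 32.0, 64.0]])
A = preprocess(raw)
print(np.round(A, 3))
```

A gene with no variation across conditions (the second row) ends up as a zero vector, so it cannot contribute to any pattern in the decomposition.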
The SVD theorem (Press et al., 1992) is stated in Equation 1. U (n × q) and V (q × q) contain orthogonal vectors, and W (q × q) is a diagonal matrix of coefficients or singular values, denoted $w_1, w_2, \ldots, w_q$.

$$A = U W V^{T} \qquad (1)$$

q is the rank of A, and is generally the smaller of the two dimensions n and m. The decomposition was performed using the commercial software package S-PLUS™ (Insightful Co., Seattle, WA) according to Golub and van Loan (1996). The rows of $V^{T}$, or V transposed, are the right singular vectors, $v_j$. Each right singular vector, alone or in combination with other vectors, describes a pattern of variation in A that could be indicative of a biological process. The columns of U are the left singular vectors, $u_j$. Each coefficient, $u_{ij}$, indicates the relative contribution of pattern $v_j$ to the expression profile of gene i. The singular value $w_i$ indicates the relative contribution of pattern $v_i$ to all gene expression patterns in A. The square of each singular value divided by the sum of the squared singular values defines the relative variance for that singular value. This relative variance indicates how much of the variance in A is explained by a particular singular vector. The expression profile of any gene can be written as a linear combination of these singular vectors and the singular values in W.
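As a rough illustration of the decomposition and the quantities derived from it, the sketch below uses NumPy instead of S-PLUS, which was the software actually used. The random matrix is a hypothetical stand-in for the normalized expression matrix, and the variable names are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the normalized n x m matrix A (20 genes x 6 conditions)
A = rng.standard_normal((20, 6))

# Thin SVD: U is n x q, w holds the q singular values, rows of Vt are the right singular vectors
U, w, Vt = np.linalg.svd(A, full_matrices=False)

# Relative variance: each squared singular value over the sum of squared singular values
rel_var = w ** 2 / np.sum(w ** 2)

# Expression profile of gene i as a linear combination of the right singular vectors
i = 0
profile_i = sum(U[i, j] * w[j] * Vt[j] for j in range(len(w)))

print("relative variance:", np.round(rel_var, 3))
print("max reconstruction error for gene 0:", np.max(np.abs(profile_i - A[i])))
```

The coefficient on pattern j for gene i is the product of $u_{ij}$ and $w_j$, so ranking genes by $u_{ij}$ for a fixed j orders them by how strongly that pattern contributes to each profile.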
The right singular vectors that match our preconception of a grain filling pattern of expression, for example, low expression during panicle development and increasing expression during grain development, were identified after A was decomposed. For each interesting pattern, $v_j$, the genes, $g_i$, were sorted by $u_{ij}$ and those in the top 80th percentile were selected. These top scorers were compared to 98 genes previously identified as grain filling-related nutrient partitioning genes by Zhu et al., which they used as a template for selecting other genes and transcription factors involved in grain filling. In Zhu et al., the 98 genes were manually selected by visualization of a hierarchical clustering informed by a SOM grouping of the 491 potential nutrient partitioning genes (Figures 5, 6). The quality of the ordering given by $u_j$ was assessed by plotting the percent of the 98 found having a percentile greater than p for all p less than 1. Similarly, the percent of those genes selected that are in the set of 98, for all p, is plotted.
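The two curves described above can be computed along the following lines. This is a sketch under assumptions: the function names, the way percentile rank is defined (fraction of genes with a lower score), and the toy scores and reference set are all hypothetical, not the study's data or code.

```python
import numpy as np

def percentile_ranks(scores):
    """Fraction of genes with a strictly lower score; values lie in [0, 1)."""
    order = np.argsort(np.argsort(scores))
    return order / len(scores)

def recovery_curves(scores, is_reference, thresholds):
    """For each p, select genes with percentile rank > p and report two fractions:
    the share of reference genes selected, and the share of selected genes that are reference."""
    ranks = percentile_ranks(np.asarray(scores, dtype=float))
    is_reference = np.asarray(is_reference, dtype=bool)
    found, purity = [], []
    for p in thresholds:
        selected = ranks > p
        hits = np.sum(selected & is_reference)
        found.append(hits / is_reference.sum())
        purity.append(hits / selected.sum() if selected.any() else np.nan)
    return np.array(found), np.array(purity)

# Toy example: 10 hypothetical genes scored by u_ij, 3 of them in a hypothetical reference set
scores = np.array([0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.4, 0.6, 0.5])
is_reference = np.array([1, 0, 1, 0, 0, 0, 0, 0, 1, 0], dtype=bool)
found, purity = recovery_curves(scores, is_reference, thresholds=[0.0, 0.5, 0.8])
print(found, purity)
```

Plotting `found` and `purity` against the thresholds gives curves of the kind described in the text, with the reference set playing the role of the 98 previously identified genes.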
We observed the entropy (E) of various distributions during our study, and the generalized formula we used is shown in Equations 2 and 3 for a vector F, containing N scalars.
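The exact form of the entropy used in the study is not shown in this text, so the sketch below is only one plausible choice and should be read as an assumption: a Shannon entropy of the normalized squared components of F, scaled by log N so that it ranges from 0 (a single dominant component) to 1 (all components equal). The function name `normalized_entropy` is hypothetical.

```python
import numpy as np

def normalized_entropy(F):
    """Assumed form: Shannon entropy of p_i = F_i^2 / sum(F^2), divided by log(N)."""
    F = np.asarray(F, dtype=float)
    p = F ** 2 / np.sum(F ** 2)
    nz = p > 0                          # terms with p_i = 0 contribute nothing
    return float(-np.sum(p[nz] * np.log(p[nz])) / np.log(len(F)))

print(normalized_entropy([1.0, 1.0, 1.0, 1.0]))   # evenly spread components
print(normalized_entropy([1.0, 0.0, 0.0, 0.0]))   # one dominant component
```

Other definitions, for example an entropy computed from a histogram of the coefficients, would rank vectors differently, so the choice matters when comparing vectors.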
After genes were classified, their promoter sequences were identified to check if pattern similarity could be related to conserved cis elements. The statistically significant elements were identified with a PERL script and annotated with the PLACE database. The PERL script identified motifs among promoter sequences for a given set of genes. Those elements that matched an annotated cis-acting regulatory DNA element from the PLACE database were then presented. We limited our investigation to elements located within 2 kb of the transcriptional start site that had an e-value less than 3E-02. At the time of publication, not all probe sets could be associated with high quality assembled upstream sequences.
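The details of the PERL script are not given, so the following Python sketch only illustrates the simplest part of such a survey: counting how many promoters in a set contain a given motif. The function name and the promoter sequences are invented for illustration; the sketch does not assess statistical significance, scan the reverse strand, or query the PLACE database.

```python
def promoters_with_motif(promoters, motif):
    """Return the names of promoters whose sequence contains the motif (forward strand only)."""
    motif = motif.upper()
    return [name for name, seq in promoters.items() if motif in seq.upper()]

# Hypothetical promoter sequences, invented for illustration only
promoters = {
    "geneA": "TTGACAACATTGC",
    "geneB": "GGGTAACAAACCT",
    "geneC": "ATATATGCGCGTA",
}

with_caaca = promoters_with_motif(promoters, "CAACA")
with_taacaaa = promoters_with_motif(promoters, "TAACAAA")
print(len(with_caaca), "of", len(promoters), "promoters contain CAACA")
print(len(with_taacaaa), "of", len(promoters), "promoters contain TAACAAA")
```

Counts of this kind, compared between gene sets, correspond to statements such as a motif occurring in 12 of 14 promoters in one set and 2 of 8 in another.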
1. Zhu T, Budworth P, Chen W, Provart N, Chang HS, Guimil S, Estes B, Zou G, Wang X: Transcriptional Control of Nutrient Partitioning During Rice Grain Filling. Plant Biotechnol J. 2003, 1: 59-70. 10.1046/j.1467-7652.2003.00006.x.
2. Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.
3. Kohonen T: Self-Organization and Associative Memory. 1989, Springer-Verlag Telos, 2
4. Holter NS: Fundamental patterns underlying gene expression profiles: Simplicity from complexity. Proc Natl Acad Sci USA. 2000, 97: 9409-9414. 10.1073/pnas.150242097.
5. Alter O, Brown PO, Botstein D: Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000, 97: 10101-10106. 10.1073/pnas.97.18.10101.
6. Dewey GT, Galas DJ: Dynamic models of gene expression and classification. Funct Integr Genomics. 2001, 1: 269-278. 10.1007/s101420000035.
7. Everitt BS, Dunn G: Applied Multivariate Data Analysis. 2001, Arnold, London
8. Kagaya Y, Ohmiya K, Hattori T: RAV1, a novel DNA-binding protein, binds to bipartite recognition sequence through two distinct DNA-binding domains uniquely found in higher plants. Nucleic Acids Res. 1999, 27: 470-478. 10.1093/nar/27.2.470.
9. Cordes S, Deikman J, Margossian LJ, Fischer RL: Interaction of a developmentally regulated DNA-binding factor with sites flanking two different fruit-ripening genes from tomato. Plant Cell. 1989, 1: 1025-1034. 10.1105/tpc.1.10.1025.
10. Gubler F, Kalla R, Roberts JK, Jacobsen JV: Gibberellin-regulated expression of a myb gene in barley aleurone cells: evidence for Myb transactivation of a high-pI alpha-amylase gene promoter. Plant Cell. 1995, 7: 1879-1891. 10.1105/tpc.7.11.1879.
11. Degenhardt J, Tobin EM: A DNA binding activity for one of two closely defined phytochrome regulatory elements in an Lhcb promoter is more abundant in etiolated than in green plants. Plant Cell. 1996, 8: 31-41. 10.1105/tpc.8.1.31.
12. Theodoridis S, Koutroumbas K: Pattern Recognition. 1999, Academic Press, San Diego, CA
13. Gasch AP, Eisen MB: Exploring the conditional coregulation of yeast expression through fuzzy k-means clustering. Genome Biol. 2002, 3: research0059.1-0059.22. 10.1186/gb-2002-3-11-research0059.
14. Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H: A Draft Sequence of the Rice Genome (Oryza sativa L. ssp. Japonica). Science. 2002, 296: 92-100. 10.1126/science.1068275.
15. Press WH, Teukolsky SA, Vetterling W, Flannery BP: Numerical recipes in C: the art of scientific computing. 1992, Cambridge University Press, Cambridge, 2
16. Golub G, Van Loan C: Matrix Computations. 1996, Johns Hopkins University Press, Baltimore, MD
17. Higo K, Ugawa Y, Iwamoto M, Korenaga T: Plant cis-acting regulatory DNA elements (PLACE) database. Nucleic Acids Res. 1999, 27: 297-300. 10.1093/nar/27.1.297.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.