Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts
© Rensink et al; licensee BioMed Central Ltd. 2005
Received: 13 May 2005
Accepted: 14 September 2005
Published: 14 September 2005
The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale.
All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices.
Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.
The Solanaceae family encompasses a number of species of agronomic and ornamental importance. With regard to cultivation for food consumption, in 2003, potato was the world's fifth largest crop in world-wide production acreage and the solanaceous vegetables tomato, eggplant, and pepper ranked 11th, 19th, and 22nd, respectively. Species grown for ornamental purposes include petunia and Nicotiana species. While not consumed for food, these horticultural species are a substantial component of the US agronomic economy. For example, petunia represents greater than $148M output per year in the US. Tobacco represents another crop of significant economic importance with $1.6B in crop value in 2003. A close relative of tobacco, Nicotiana benthamiana, has been utilized as an experimental model for viral research and disease resistance studies. Coupled with the robust ability of virus induced gene silencing to silence transcripts, N. benthamiana has emerged as a model species for disease resistance research.
The Solanaceae have been bred and developed for a variety of purposes. Potato has been bred for tubers (modified stems) while tomato, pepper, and eggplant have been bred for enhanced fruit production. Likewise, petunia has been bred and selected for floral phenotypes while tobacco has been bred for leaf size. While these modern varieties are accentuated for particular morphological features, these species share common taxonomic features of the Solanaceae such as alternate leaves, flower parts in fives, and fruit as a berry or capsule. Compared with other plant families such as the Poaceae, the range of genome sizes of solanaceous species is fairly narrow, ranging from 900 to 4600 Mb per haploid genome. Early studies of the Solanaceae genome revealed conservation of gene content among potato, tomato, tobacco, petunia, and eggplant. These studies employed relatively small scale cross-hybridization studies using cDNA and random genomic DNA clones, in which a set of 20 tomato cDNA clones were hybridized with a panel of solanaceous species including Lycopersicon, Solanum, Datura, Petunia, and Nicotiana. For the cDNA clones, there was strong hybridization across the Solanaceae; however, with the genomic clones (50 in total), there was a reduced degree of cross-hybridization with the non-Lycopersicon species. These data suggested conservation among the coding sequences while the non-coding sequences had undergone substantial divergence.
Conserved gene content prompts the question of conserved gene order, i.e., synteny, across the Solanaceae. A number of solanaceous species have a base chromosome number of 12, including the main vegetable crop species potato, tomato, pepper and eggplant. Using markers developed from tomato, a strong degree of co-linearity between potato and tomato has been demonstrated, with the differences attributable to paracentric inversions occurring between these two species [7, 8]. Using the same approach in pepper, 18 homologous linkage blocks between tomato and pepper could be identified. In eggplant, tomato markers yet again revealed syntenic regions between tomato and eggplant. While these synteny studies utilized anonymous DNA clones as markers, comparative mapping of phenotypes such as fruit morphology, pigmentation and disease resistance revealed syntenic mapping of these traits across the Solanaceae.
These early studies relied heavily on cDNA and random genomic clones. The advent of high throughput sequencing projects such as Expressed Sequence Tags (ESTs) has resulted in the generation of hundreds of thousands of sequences for solanaceous species. For this study, a total of 441,154 ESTs were collected from the public database (dbEST) representing the solanaceous species tomato (162,621), potato (189,864), pepper (29,894), tobacco (26,497), and N. benthamiana (26,918). The available solanaceous ESTs, along with Expressed Transcripts (ETs) available in GenBank, can be clustered into gene indices that represent a non-redundant set of transcripts and facilitate analysis of redundant EST collections. Using potato and tomato gene indices, a comparative analysis of tomato and potato ESTs revealed that approximately 80% of the potato ESTs had a significant sequence match with a tomato EST at the nucleotide level (E value cutoff of 10^-10).
In this study, we report the construction and comparative analyses of gene indices for six solanaceous species (tomato, potato, tobacco, pepper, petunia and N. benthamiana). These gene indices represent a total of 116,207 non-redundant sequences which we have utilized to assess sequence conservation among the Solanaceae on a genomic scale. We significantly extended previous studies on sequence similarity and conservation among these species as well as documented more thoroughly the characteristics of the coding portion of the Solanaceae genome. Using computational methods, we have identified putative orthologs among these species and generated a phylogenetic tree to ascertain the relationship and sequence divergence among these species. In addition to these computational approaches, we assessed the similarity of expression profiles in mature leaves to experimentally validate the sequence conservation of these species using heterologous hybridization to potato cDNA microarrays. The comparison of the solanaceous transcripts to the predicted proteomes of the near-complete genome sequences of Arabidopsis, rice, as well as to 21 other plant gene indices resulted in the identification of solanaceous transcripts without putative homologs, suggesting that a portion of these transcripts have a high likelihood of being unique to the Solanaceae. These analyses provide insight into the overall sequence conservation among eudicots (Arabidopsis and Solanaceae) as well as between the Solanaceae and the monocots (i.e., rice).
Assembly of sequences into gene indices for potato, tomato, petunia, tobacco, pepper, and N. benthamiana
Summary of gene indices of potato, tomato, pepper, tobacco, Nicotiana benthamiana and petunia. EST: expressed sequence tag; ET: expressed transcript; TC: tentative consensus; sEST: singleton EST; sET: singleton ET. TCs are the assembled clusters of redundant and overlapping EST and ET sequences. The total unique sequences for each gene index are created by combining the TCs, sETs, and sESTs.
Assessment of the transcript sampling
Tissue representation of EST sequences among the gene indices. For each species, the origin of the library was determined and the total number of sequences from each source calculated. a. For the potato ESTs, 62,931 of the Mixed/Other ESTs were derived from a series of stolon and tuber cDNA libraries. b. For the N. benthamiana ESTs, 18,817 of the Mixed/Other ESTs were derived from a single cDNA library constructed by pooling mRNA from abiotic and biotic stressed leaves, roots, and callus.
Analysis of the GC content of Solanaceae gene indices
Functional annotation of the gene indices
Sequence conservation among six Solanaceae species
Identification of orthologs among solanaceous species. The number and percentage of reciprocal best hit pairs determined by BLAST searches (E value cutoff 10^-10) are listed; percentages are calculated relative to the total unique sequences of the species in the first column.
Identification of transcripts likely unique to Solanaceae
Identification of Solanaceae specific transcripts. Number of transcripts identified in the Solanaceae gene indices with no matches in Arabidopsis, rice or any of the 21 plant gene indices. Table columns: transcripts with no matches in Arabidopsis or rice; transcripts with no matches in plant gene indices (including Arabidopsis and rice).
Identification of Solanaceae species-specific transcripts. The left panel shows the number of sequences without matches in any of the Solanaceae gene indices. The right panel shows the number of sequences for each species without matches to Arabidopsis, rice, or any plant gene index, including Solanaceae. Panel titles: Unique among Solanaceae gene index; Solanaceae specific transcripts.
Expression profiling of solanaceous species
A high degree of sequence conservation among Solanaceae family members had been suggested previously based on small scale assays and analyses. Here, we report, for the first time, a large scale comparison of six Solanaceae family members. Although the analyses in this study confirmed the high degree of sequence conservation, they also revealed a large number of Solanaceae specific transcripts and sequence divergence among the Solanaceae.
Transcript sampling for the Solanaceae gene indices
To date, only a limited amount of genomic sequence data is publicly available for the Solanaceae. Therefore, the EST sequence data assembled in this study were used to assess the diversity of transcripts among the Solanaceae. The assessment of the annotation by GO terms of the six gene indices indicated an overall similar functional composition of the transcripts. In addition, the GC content was consistent with that of Arabidopsis and was similar among the Solanaceae, with tobacco being the exception. These data show that the sequences used in this study provide a valid representation of the various solanaceous genomes. The wide range of different library sources of the sequences did not affect the number of sequence matches among the different Solanaceae species, indicating the absence of a high percentage of tissue specific transcripts. This can be explained by the close developmental relationship between most plant organs, as flowers can be considered modified leaves and stolons modified stems. A low number of tissue specific transcripts was also observed in Arabidopsis using Massively Parallel Signature Sequencing. Among five different libraries of callus, inflorescence, leaves, roots and siliques, less than 0.25% of the transcripts showed tissue specificity. Also in maize, using cDNA microarrays, only 7% of the genes were expressed in a highly tissue specific manner among seven different organs. In contrast, the frequency of EST sequences can be used in comparative analyses to evaluate differential expression. This approach has been used for tomato and potato [16, 31], but can only be successfully employed with a large number of diverse libraries and deep sequencing, as most tissue specific transcripts may be expressed at low levels and therefore be relatively rare and not be sampled by sequencing.
A single microarray platform was successfully applied for heterologous hybridization of Solanaceae species. For transcripts with significant sequence similarity to the potato probes on the cDNA microarray, reliable expression data could be obtained. Similar hybridization characteristics were found using heterologous hybridization to a fish cDNA microarray; the number of elements that could be detected on the microarray was correlated with the phylogenetic distance. Cross-species hybridization was also shown for human and bovine orthologous genes on a human cDNA microarray. The global expression data indicated that the conserved transcripts were expressed similarly in leaf tissue of the six Solanaceae species examined.
Solanaceae species contain unique transcripts
Overall, a high degree of sequence conservation among the Solanaceae was observed, in accordance with previous small scale studies; for up to 81% of the gene index sequences, significant matches at the nucleotide level could be found within the Solanaceae, consistent with the level of sequence conservation observed at the protein level. A more stringent approach based on orthology revealed that for the largest gene indices, potato and tomato, a putative ortholog could be identified at the nucleotide level for 47% of the unique transcripts in the gene index. In addition, comparison of the Solanaceae gene indices to Arabidopsis, rice, and 21 other gene indices revealed transcripts without matches to these non-solanaceous species as well as transcripts without matches to individual Solanaceae species. Depending on the stringency of alignment, 16–19% of the transcripts did not have a match among the plant sequences examined. A similar approach was used to identify transcripts specific for legumes. These results show that between these closely related species there was still substantial sequence divergence, which was supported by the sequence divergence among 308 orthologous transcripts of six Solanaceae, Arabidopsis and rice. The available EST sequences provide only a snapshot of the genome; thus, the number of unique transcripts may be lower but is likely still substantial, as the transcript sampling among the Solanaceae proved to be representative. The large number of EST sequences available for tomato and potato is likely to contain the most abundant transcripts, so a large number of transcripts without sequence homology is likely to remain with increased EST sequencing until more sequence data is generated.
The outlier for most analyses appeared to be tobacco, with a low number of significant matches among Solanaceae, Arabidopsis, and rice. No obvious explanation could be found for this, but it is unlikely that tobacco contains a much higher plant specific gene content. Matsuoka et al. report on the EST sequencing of a cell suspension library of tobacco, which was the origin of a large portion of the tobacco gene index. In that study, a low number of tobacco sequences matched sequences from other plant species, consistent with our analyses. The GO assignments and the identification of orthologs indicated that the tobacco sequence sample did contain transcripts similar to those in the other five Solanaceae gene indices, validating the general conclusions for the Solanaceae species in this study, including tobacco.
The finding of a large number of transcripts without matches among the Solanaceae species will complicate efforts to establish a single reference genome for the Solanaceae by sequencing a single representative species. Although a high level of synteny exists among the Solanaceae, it is unclear how novel genes evolved and whether there is a large difference in gene content among the Solanaceae. Fortunately, for three Solanaceae species (tomato, potato and tobacco), genome sequencing projects are in progress. The availability of three draft genome sequences will allow for detailed analysis of genome conservation and an understanding of the genes involved in the different phenotypes within the Solanaceae.
In summary, this study documents for the first time the genomic scale comparison of the available coding sequences (ESTs and ETs) from six Solanaceae species. Sequence comparisons at the nucleotide level among potato, tomato, pepper, petunia, tobacco and N. benthamiana, including ortholog analysis, confirmed a high level of sequence conservation. In addition, phylogenetic analysis and comparative analyses with Arabidopsis, rice and 21 other gene indices revealed sequence divergence during speciation as evidenced by transcripts likely unique among the Solanaceae and unique to individual Solanaceae species. Global expression profiling showed similar expression patterns of conserved genes in mature leaves among the six solanaceous species.
Gene indices were constructed essentially as described. In summary, all available sequences for potato, tomato, pepper, petunia, tobacco and N. benthamiana were collected from GenBank, and sequences with over 94% sequence identity over 40 or more bases with unmatched overhangs of at most 30 bases in length were placed in clusters using the Paracel Transcript Assembler to generate tentative consensus sequences (TCs) and singleton ESTs and ETs. The TCs were searched against a non-redundant protein database to provide a putative annotation for the TC, with a minimum of 30% identity over 20% of the length of the translated TC. All gene indices are available at . The 21 gene indices used for searches against the Solanaceae gene indices were: Ice plant (v4.0), Cacao (v1.0), Cotton (v6.0), Grape (v4.0), Barley (v9.0), Sugar beet (v1.0), Brassica napus (v1.0), Sunflower (v3.0), Lettuce (v2.0), Lotus (v3.0), Wheat (v10.0), Maize (v15.0), Medicago truncatula (v8.0), Onion (v1.0), Pinus (v5.0), Poplar (v2.0), Rye (v3.0), Sorghum bicolor (v8.0), Sugarcane (v2.1), Soybean (v12.0) and Spruce (v1.0). GO terms were transitively annotated based on sequence similarity (E value cutoff of 10^-10) to Arabidopsis proteins (Release 5, which has been manually curated for molecular function GO terms). The Plant/GOSlim reduced ontologies were used.
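The clustering and annotation thresholds stated above can be written as two simple predicates. The following is a minimal sketch only and is not the Paracel Transcript Assembler; the class and function names are hypothetical, and the 30-base overhang figure is treated here as an upper bound, which is an assumption.

```python
from dataclasses import dataclass

# Thresholds quoted in the text. Names are illustrative, not from any tool.
MIN_IDENTITY_PCT = 94.0   # "over 94% sequence identity" (strictly greater)
MIN_OVERLAP_BASES = 40    # "over 40 or more bases"
MAX_OVERHANG_BASES = 30   # assumption: 30 bases is the largest overhang allowed


@dataclass(frozen=True)
class PairwiseOverlap:
    """Summary of one pairwise nucleotide overlap between two sequences."""
    percent_identity: float  # identity across the overlap, in percent
    overlap_length: int      # length of the overlap, in bases
    max_overhang: int        # longest unmatched overhang at either end, in bases


def same_cluster(overlap: PairwiseOverlap) -> bool:
    """Return True if the overlap meets all three clustering thresholds."""
    return (
        overlap.percent_identity > MIN_IDENTITY_PCT
        and overlap.overlap_length >= MIN_OVERLAP_BASES
        and overlap.max_overhang <= MAX_OVERHANG_BASES
    )


def passes_annotation_filter(percent_identity: float,
                             aligned_length: int,
                             translated_tc_length: int) -> bool:
    """Protein hit must have >= 30% identity over >= 20% of the translated TC.

    Lengths are in amino acids; integer arithmetic avoids rounding issues.
    """
    covers_enough = aligned_length * 100 >= 20 * translated_tc_length
    return percent_identity >= 30.0 and covers_enough
```

For instance, an overlap of 40 bases at 95% identity with a 30-base overhang would be clustered under these assumptions, whereas one at exactly 94% identity would not.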
Each of the six gene indices was pair-wise matched against the other gene indices using WU-BLAST with BLASTN and TBLASTX options. BLAST scores were filtered for significant hits using an E value cut-off as indicated in the text. Each of the six gene indices was searched against the predicted rice and Arabidopsis proteomes using BLASTX, and the top hit was picked for each entry of the gene indices using an E value cutoff of 10^-5. Putative orthologs among the six Solanaceae species, rice and Arabidopsis were identified essentially as described. In summary, the non-redundant sets of eight gene indices were compiled and searched against each other using BLASTN. The reciprocal best hit pairs with a cutoff E value of 10^-10 were clustered to generate the ortholog groups. A total of 308 clusters that contained at least one transcript from each of the 8 species were selected, and one representative sequence for each species was chosen for each group by counting the reciprocal matches in the clusters. Multiple sequence alignments for each of the 308 clusters were performed, and sequences at both ends without consensus matches were removed. Sequences from each species were concatenated together in the same order and aligned to each other using CLUSTAL W. A neighbor joining tree was generated using PHYLIP (Phylogeny Inference Package) (Felsenstein, J. 2004, distributed by the author. Department of Genome Sciences, University of Washington, Seattle).
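The reciprocal best hit step and the selection of clusters that contain every species can be illustrated as follows. This is a sketch under stated assumptions rather than the pipeline used in the study: hits are assumed to be already parsed into (query, subject, E value, bit score) tuples, the best hit is taken to be the one with the highest bit score, and pairs are joined into groups by single linkage. All function names are hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-10  # cutoff quoted in the text for reciprocal best hits


def best_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Map each query to its best subject among hits passing the cutoff.

    `hits` is an iterable of (query, subject, evalue, bitscore) tuples.
    Assumption: "best" means the highest bit score.
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > cutoff:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, cutoff=E_VALUE_CUTOFF):
    """Return sorted (a, b) pairs where a's best hit is b and b's is a."""
    forward = best_hits(a_vs_b, cutoff)
    reverse = best_hits(b_vs_a, cutoff)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def ortholog_groups(pairs):
    """Join reciprocal best hit pairs into groups (single linkage, union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for member in list(parent):
        groups[find(member)].add(member)
    return list(groups.values())


def complete_groups(groups, species_of, n_species):
    """Keep only groups with at least one member from each of n_species."""
    return [g for g in groups if len({species_of(m) for m in g}) == n_species]
```

With eight species, `complete_groups(groups, species_of, 8)` corresponds to the selection of clusters that contain at least one transcript from each species; `species_of` is a caller-supplied function mapping a sequence identifier to its species.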
Microarray hybridizations and data analysis
Potato cDNA microarrays were constructed as described. Potato, tobacco, tomato, petunia, pepper and N. benthamiana plants were grown in Percival growth chambers (Percival Scientific, Inc., Perry, IA) at 25°C and 16 h light for 4–6 weeks. Total RNA was extracted from mature leaves using the Qiagen RNeasy kit (Qiagen, Valencia, CA) and labeled as described previously. Hybridization and washing were performed essentially as described. After the final washing step and spin-drying, slides were scanned using an Axon scanner at maximum laser power (Axon Instruments, Union City, CA) at both 532 and 635 nm. The PMT values for both wavelengths were adjusted to capture a similar number of normalized counts for each channel.
The TIFF images were quantified using Genepix 5.0 (Axon Instruments, Union City, CA). The software automatically flags spots that cannot be found in one of the channels; these spots were excluded from further analysis. Spots containing >30% saturated pixels in either channel or with a diameter <70 μm were also flagged and not used for subsequent analysis. Local background was subtracted from the signal value (mean pixel intensity). The data were normalized using the quantile method in the limma package of BioConductor. Flagged spots were given a weight of 0 using the weight function within the package, which excludes these spots from affecting the normalization. All analyses used the average of the two on-slide replicates. If one of the two replicates was flagged, the remaining value was used for analysis.
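The spot filtering, replicate handling and quantile normalization described above can be sketched in plain Python. This is an illustration only and does not reproduce the limma implementation, which also handles ties, missing values and spot weights; the function names are hypothetical, and returning no value when both replicates are flagged is an assumption, since the text does not cover that case.

```python
def is_flagged(saturated_fraction_ch1, saturated_fraction_ch2,
               diameter_um, not_found=False):
    """Flag a spot: not found, >30% saturated pixels in a channel, or <70 um."""
    too_saturated = max(saturated_fraction_ch1, saturated_fraction_ch2) > 0.30
    return not_found or too_saturated or diameter_um < 70


def background_corrected(mean_signal, local_background):
    """Subtract local background from the mean pixel intensity of a spot."""
    return mean_signal - local_background


def replicate_value(value_a, flagged_a, value_b, flagged_b):
    """Average two on-slide replicates, using the unflagged one if one is flagged.

    Assumption: None is returned when both replicates are flagged.
    """
    usable = [v for v, flagged in ((value_a, flagged_a), (value_b, flagged_b))
              if not flagged]
    return sum(usable) / len(usable) if usable else None


def quantile_normalize(columns):
    """Give every column the same distribution (mean of the sorted columns).

    `columns` is a list of equal-length lists of floats with no missing
    values. Ties within a column are broken by original position.
    """
    n_rows = len(columns[0])
    sorted_columns = [sorted(column) for column in columns]
    rank_means = [sum(col[i] for col in sorted_columns) / len(columns)
                  for i in range(n_rows)]
    normalized_columns = []
    for column in columns:
        order = sorted(range(n_rows), key=lambda i: column[i])
        normalized = [0.0] * n_rows
        for rank, row_index in enumerate(order):
            normalized[row_index] = rank_means[rank]
        normalized_columns.append(normalized)
    return normalized_columns
```

In this sketch, flagged spots would be removed before calling `quantile_normalize`, which approximates the effect of assigning them a weight of 0 in the normalization.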
Funding for this work was provided through a grant from the National Science Foundation Plant Genome Research Program (DBI-0218166).
- Food and Agricultural Organization of The United Nations, FAOSTAT. 2005, [http://apps.fao.org/default.jsp]
- United States Department of Agriculture (USDA), National Agricultural Statistics Service, Floriculture Crops. 2005, [http://usda.mannlib.cornell.edu/reports/nassr/other/zfc-bb/]
- United States Department of Agriculture (USDA), National Agricultural Statistics Service, Crop Production. 2005, [http://usda.mannlib.cornell.edu/reports/nassr/field/pcp-bban/]
- Lu R, Martin-Hernandez AM, Peart JR, Malcuit I, Baulcombe DC: Virus-induced gene silencing in plants. Methods. 2003, 30: 296-303. 10.1016/S1046-2023(03)00037-9.
- Arumuganathan K, Earle ED: Nuclear DNA Content of Some Important Plant Species. Plant Molecular Biology Reporter. 2004, 9: 208-218.
- Zamir D, Tanksley S: Tomato genome is comprised largely of fast-evolving, low copy-number sequences. Mol Gen Genet. 1988, 213: 254-261. 10.1007/BF00339589.
- Bonierbale MW, Plaisted RL, Tanksley SD: RFLP Maps Based on a Common Set of Clones Reveal Modes of Chromosomal Evolution in Potato and Tomato. Genetics. 1988, 120: 1095-1103.
- Tanksley SD, Ganal MW, Prince JP, de Vicente MC, Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ, Grandillo S, Martin GB, et al.: High density molecular linkage maps of the tomato and potato genomes. Genetics. 1992, 132: 1141-1160.
- Livingstone KD, Lackney VK, Blauth JR, van Wijk R, Jahn MK: Genome mapping in capsicum and the evolution of genome structure in the solanaceae. Genetics. 1999, 152: 1183-1202.
- Doganlar S, Frary A, Daunay MC, Lester RN, Tanksley SD: A comparative genetic linkage map of eggplant (Solanum melongena) and its implications for genome evolution in the solanaceae. Genetics. 2002, 161: 1697-1711.
- Doganlar S, Frary A, Daunay MC, Lester RN, Tanksley SD: Conservation of gene function in the solanaceae as revealed by comparative mapping of domestication traits in eggplant. Genetics. 2002, 161: 1713-1726.
- Thorup TA, Tanyolac B, Livingstone KD, Popovsky S, Paran I, Jahn M: Candidate gene analysis of organ pigmentation loci in the Solanaceae. Proc Natl Acad Sci U S A. 2000, 97: 11192-11197. 10.1073/pnas.97.21.11192.
- Grube RC, Radwanski ER, Jahn M: Comparative genetics of disease resistance within the solanaceae. Genetics. 2000, 155: 873-887.
- Adams MD, Soares MB, Kerlavage AR, Fields C, Venter JC: Rapid cDNA sequencing (expressed sequence tags) from a directionally cloned human infant brain cDNA library. Nat Genet. 1993, 4: 373-380. 10.1038/ng0893-373.
- Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J: The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res. 2001, 29: 159-164. 10.1093/nar/29.1.159.
- Ronning CM, Stegalkina SS, Ascenzi RA, Bougri O, Hart AL, Utterbach TR, Vanaken SE, Riedmuller SB, White JA, Cho J, Pertea GM, Lee Y, Karamycheva S, Sultana R, Tsai J, Quackenbush J, Griffiths HM, Restrepo S, Smart CD, Fry WE, Van der HR, Tanksley S, Zhang P, Jin H, Yamamoto ML, Baker BJ, Buell CR: Comparative analyses of potato expressed sequence tag libraries. Plant Physiol. 2003, 131: 419-429. 10.1104/pp.013581.
- Carels N, Bernardi G: Two classes of genes in plants. Genetics. 2000, 154: 1819-1825.
- Carels N, Hatey P, Jabbari K, Bernardi G: Compositional properties of homologous coding sequences from plants. J Mol Evol. 1998, 46: 45-53.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Lee Y, Sultana R, Pertea G, Cho J, Karamycheva S, Tsai J, Parvizi B, Cheung F, Antonescu V, White J, Holt I, Liang F, Quackenbush J: Cross-referencing eukaryotic genomes: TIGR Orthologous Gene Alignments (TOGA). Genome Res. 2002, 12: 493-502. 10.1101/gr.212002.
- Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange BM, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun WL, Chen L, Cooper B, Park S, Wood TC, Mao L, Quail P, Wing R, Dean R, Yu Y, Zharkikh A, Shen R, Sahasrabudhe S, Thomas A, Cannings R, Gutin A, Pruss D, Reid J, Tavtigian S, Mitchell J, Eldredge G, Scholl T, Miller RM, Bhatnagar S, Adey N, Rubano T, Tusneem N, Robinson R, Feldhaus J, Macalma T, Oliphant A, Briggs S: A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science. 2002, 296: 92-100. 10.1126/science.1068275.
- Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H: A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002, 296: 79-92. 10.1126/science.1068037.
- Yu J, Wang J, Lin W, Li S, Li H, Zhou J, Ni P, Dong W, Hu S, Zeng C, Zhang J, Zhang Y, Li R, Xu Z, Li S, Li X, Zheng H, Cong L, Lin L, Yin J, Geng J, Li G, Shi J, Liu J, Lv H, Li J, Wang J, Deng Y, Ran L, Shi X, Wang X, Wu Q, Li C, Ren X, Wang J, Wang X, Li D, Liu D, Zhang X, Ji Z, Zhao W, Sun Y, Zhang Z, Bao J, Han Y, Dong L, Ji J, Chen P, Wu S, Liu J, Xiao Y, Bu D, Tan J, Yang L, Ye C, Zhang J, Xu J, Zhou Y, Yu Y, Zhang B, Zhuang S, Wei H, Liu B, Lei M, Yu H, Li Y, Xu H, Wei S, He X, Fang L, Zhang Z, Zhang Y, Huang X, Su Z, Tong W, Li J, Tong Z, Li S, Ye J, Wang L, Fang L, Lei T, Chen C, Chen H, Xu Z, Li H, Huang H, Zhang F, Xu H, Li N, Zhao C, Li S, Dong L, Huang Y, Li L, Xi Y, Qi Q, Li W, Zhang B, Hu W, Zhang Y, Tian X, Jiao Y, Liang X, Jin J, Gao L, Zheng W, Hao B, Liu S, Wang W, Yuan L, Cao M, McDermott J, Samudrala R, Wang J, Wong GK, Yang H: The Genomes of Oryza sativa: a history of duplications. PLoS Biol. 2005, 3: e38. 10.1371/journal.pbio.0030038.
- Wortman JR, Haas BJ, Hannick LI, Smith RKJ, Maiti R, Ronning CM, Chan AP, Yu C, Ayele M, Whitelaw CA, White OR, Town CD: Annotation of the Arabidopsis genome. Plant Physiol. 2003, 132: 461-468. 10.1104/pp.103.022251.
- Yuan Q, Ouyang S, Wang A, Zhu W, Maiti R, Lin H, Hamilton J, Haas B, Sultana R, Cheung F, Wortman J, Buell CR: The institute for genomic research osa1 rice genome annotation database. Plant Physiol. 2005, 138: 18-26. 10.1104/pp.104.059063.
- The Institute for Genomic Research (TIGR), Plant Gene Indices. 2005, [http://www.tigr.org/tdb/tgi/plant.shtml]
- Meyers BC, Vu TH, Tej SS, Ghazal H, Matvienko M, Agrawal V, Ning J, Haudenschild CD: Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat Biotechnol. 2004, 22: 1006-1011. 10.1038/nbt992.
- Fernandes J, Brendel V, Gai X, Lal S, Chandler VL, Elumalai RP, Galbraith DW, Pierson EA, Walbot V: Comparison of RNA expression profiles based on maize expressed sequence tag frequency analysis and micro-array hybridization. Plant Physiol. 2002, 128: 896-910. 10.1104/pp.010681.
- Fei Z, Tang X, Alba RM, White JA, Ronning CM, Martin GB, Tanksley SD, Giovannoni JJ: Comprehensive EST analysis of tomato and comparative genomics of fruit ripening. Plant J. 2004, 40: 47-59. 10.1111/j.1365-313X.2004.02188.x.
- Renn SC, Aubin-Horth N, Hofmann HA: Biologically meaningful expression profiling across species using heterologous hybridization to a cDNA microarray. BMC Genomics. 2004, 5: 42. 10.1186/1471-2164-5-42.
- Adjaye J, Herwig R, Herrmann D, Wruck W, Benkahla A, Brink TC, Nowak M, Carnwath JW, Hultschig C, Niemann H, Lehrach H: Cross-species hybridisation of human and bovine orthologous genes on high density cDNA microarrays. BMC Genomics. 2004, 5: 83. 10.1186/1471-2164-5-83.
- Graham MA, Silverstein KA, Cannon SB, VandenBosch KA: Computational identification and characterization of novel genes from legumes. Plant Physiol. 2004, 135: 1179-1197. 10.1104/pp.104.037531.
- Matsuoka K, Demura T, Galis I, Horiguchi T, Sasaki M, Tashiro G, Fukuda H: A comprehensive gene expression analysis toward the understanding of growth and differentiation of tobacco BY-2 cells. Plant Cell Physiol. 2004, 45: 1280-1289. 10.1093/pcp/pch155.
- The Gene Ontology. 2005, [http://www.geneontology.org]
- Washington University BLAST. 2005, [http://blast.wustl.edu/]
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680.
- Rensink WA, Iobst S, Hart A, Stegalkina S, Liu J, Buell CR: Gene expression profiling of potato responses to cold, heat, and salt stress. Funct Integr Genomics. 2005, In press.
- Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology. 2004, 3: Article 3.
- BioConductor. 2005, [http://www.bioconductor.org]
- The Institute for Genomic Research (TIGR), Potato Functional Genomics & Solanaceae Resources. 2005, [http://www.tigr.org/tdb/potato]
- Gene Expression Omnibus. 2005, [http://www.ncbi.nlm.nih.gov/geo]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.