Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii
© Weng et al; licensee BioMed Central Ltd. 2005
Received: 05 March 2005
Accepted: 06 June 2005
Published: 06 June 2005
The lycophyte Selaginella moellendorffii is a member of one of the oldest lineages of vascular plants on Earth. Fossil records show that the lycophyte clade arose 400 million years ago, 150–200 million years earlier than angiosperms, a group of plants that includes the well-studied flowering plant Arabidopsis thaliana. S. moellendorffii has a genome size of approximately 100 Mbp, as small as or smaller than that of A. thaliana. S. moellendorffii therefore has the potential to provide significant comparative information for understanding the evolution of vascular plants.
We sequenced 2181 Expressed Sequence Tags (ESTs) from a S. moellendorffii cDNA library. Assembly yielded 1301 non-redundant sequences, comprising 291 contigs and 1010 singletons. Approximately 75% of the ESTs matched proteins in the non-redundant protein database. Among the 1301 clusters, 343 were categorized according to the Gene Ontology (GO) hierarchy and were compared to the GO mapping of A. thaliana tentative consensus sequences. We compared S. moellendorffii ESTs to the A. thaliana and Physcomitrella patens EST databases using the tBLASTX algorithm. Approximately 60% of the ESTs exhibited similarity with both A. thaliana and P. patens ESTs, whereas 13% and 1% of the ESTs had exclusive similarity with A. thaliana and P. patens ESTs, respectively. A substantial proportion of the ESTs (26%) had no match with A. thaliana or P. patens ESTs.
We discovered 1301 putative unigenes in S. moellendorffii. These results give an initial insight into its transcriptome that will aid in the study of the S. moellendorffii genome in the near future.
Our understanding of biology has been greatly improved by studying genome structure and gene function in a broad sampling of model organisms such as Mus musculus (mouse), Drosophila melanogaster (fruit fly), Danio rerio (zebrafish), Caenorhabditis elegans (nematode), and Arabidopsis thaliana [1–5]. Comparative genomics has made it clear that orthologs of many proteins that act as signal transduction components, transcriptional regulatory factors, and metabolic enzymes can be identified between and among these model organisms. As a result, the knowledge gained from comparative and evolutionary studies of these species can provide insights into homologous processes in a wide range of other organisms, from crop plants to humans. Within plants, however, most of the efforts in genomics have been focused on crop plants or economically important plants such as Oryza sativa (rice), Zea mays (maize), and Lycopersicon esculentum (tomato) [8–10]. Thus, coupled with the sequencing of the A. thaliana genome, these efforts have provided data on only a single branch of the plant evolutionary tree, namely members of the Monocotyledonae and Dicotyledonae, collectively termed the angiosperms and commonly known as flowering plants. As a result, the community of plant scientists has little sequence data on other plant lineages that could provide insights into common mechanisms of how plants develop and survive in a terrestrial environment, nor does it have evolutionary benchmarks that might reveal how angiosperms have come to dominate most world ecosystems.
Expressed sequence tag (EST) sequencing has been used as an efficient and economical approach for large-scale gene discovery. It has also successfully provided frameworks for many genome projects [18, 19]. Recently, a large number of ESTs have been generated from various plant species and deposited in GenBank, including both model and crop plants like A. thaliana, rice, wheat, and maize as well as species representative of clades other than angiosperms, such as gymnosperms, cycads, and mosses [20–23]. Although over 1000 ESTs from another Selaginella species, S. lepidophylla, also known as the resurrection plant, have been deposited in GenBank, no manuscript reporting their analysis has been published. In this paper, we describe 2181 ESTs generated from a S. moellendorffii cDNA library. These ESTs were assembled into 1301 clusters, annotated using the BLASTX algorithm, surveyed for their abundance within the dataset, and classified into functional groups according to the Gene Ontology (GO) hierarchy. Finally, a comparative genomics approach was used to compare S. moellendorffii ESTs with those of A. thaliana and Physcomitrella patens and to look for genes unique to S. moellendorffii.
Results and Discussion
Generation of S. moellendorffii cDNA library and ESTs
To gain a broad coverage of S. moellendorffii transcripts, we collected and pooled whole S. moellendorffii plants for mRNA extraction and subsequent cDNA library construction. To enrich for full-length cDNA clones, double-stranded cDNA was size-fractionated before cloning. Based upon the average insert sizes of 35 cDNA clones chosen at random from the library, we estimate that the cDNA library has an average insert size of 850 bp. 2304 clones were sequenced from the 5' end of the cDNAs, which generated 2181 vector-trimmed EST sequences with an average sequencing read length of 640 bp.
Assembly of S. moellendorffii ESTs
Annotation of S. moellendorffii ESTs
To annotate S. moellendorffii ESTs, the 1301 putative unigenes were translated dynamically in all six reading frames and searched for homology against the NCBI non-redundant (nr) protein database using BLASTX. BLASTX hits with E-values less than 10⁻⁵ were taken to be significant. Among the 1301 unigenes, 962 (74%) had BLASTX hits in the nr database, while the remaining 339 (26%) had hits with E-values greater than 10⁻⁵ or no hit. When a less permissive cutoff E-value of 10⁻¹⁰ was adopted, the numbers of unigenes with and without BLASTX hits changed slightly to 891 (68%) and 410 (32%), respectively. Our dataset showed that the inferred translation products of most S. moellendorffii ESTs appear to be similar to proteins in other organisms, but that there was also a percentage of ESTs that represented potential Selaginella- or lycophyte-specific genes. Interestingly, 15 ESTs had at least their top five BLASTX hits from non-plant organisms, including six from bacteria or cyanobacteria (SmoC-1_02_N06, SmoC-1_01_C17, SmoC-1_02_B19, SmoC-1_06_K12, SmoC-1_cn167, SmoC-1_03_D21), two from fungi (SmoC-1_06_O23, SmoC-1_02_H20), one from an insect (SmoC-1_06_K02), three from nematodes (SmoC-1_04_D10, SmoC-1_02_L08, SmoC-1_cn108), one from fish (SmoC-1_04_F24), and two from mammals (SmoC-1_02_H05, SmoC-1_03_F21). These data suggest that homologs have either not yet been identified in, or are absent from, other plant lineages, although in one case (SmoC-1_06_O23) a more distantly related A. thaliana gene was returned by BLASTX, and in a further three cases BLASTN analysis of the EST-others database identified potential homologs in P. patens (SmoC-1_02_N06, SmoC-1_06_K12) and S. lepidophylla (SmoC-1_cn167).
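The effect of the two E-value cutoffs can be illustrated with a minimal sketch. It assumes that the best BLASTX E-value for each unigene has already been parsed into a dictionary; the unigene names, the toy E-values, and the helper `count_hits` are hypothetical and are not part of BLAST or of the pipeline used in this study.

```python
from typing import Dict, Optional, Tuple


def count_hits(best_evalue: Dict[str, Optional[float]], cutoff: float) -> Tuple[int, int]:
    """Return (with_hit, without_hit) for a given E-value cutoff.

    A unigene counts as having a hit only if its best E-value is
    strictly below the cutoff; None means no hit was returned at all.
    """
    with_hit = sum(1 for e in best_evalue.values() if e is not None and e < cutoff)
    return with_hit, len(best_evalue) - with_hit


# Hypothetical toy data: unigene id -> best BLASTX E-value (None = no hit).
toy = {
    "unigene_a": 1e-40,
    "unigene_b": 3e-8,
    "unigene_c": 2e-3,
    "unigene_d": None,
}

print(count_hits(toy, 1e-5))   # (2, 2)
print(count_hits(toy, 1e-10))  # (1, 3)

# Percentages reported in the text, recomputed from the raw counts.
for hits in (962, 891):
    print(hits, f"{100 * hits / 1301:.1f}%")
```

Tightening the cutoff can only move unigenes from the "hit" to the "no hit" group, which is why the two counts always sum to 1301.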
Highly represented S. moellendorffii ESTs
The most abundantly represented ESTs in the S. moellendorffii cDNA library.
Number of ESTs
Top BLASTX hit in non-redundant protein database
Best Identity Description
Ribulose bisphosphate carboxylase small subunit) [Larix laricina]
Ferredoxin, chloroplast precursor [Silene latifolia subsp. alba]
chlorophyll a/b-binding protein [Lycopersicon esculentum]
latex plastidic aldolase-like protein [Hevea brasiliensis]
chlorophyll a/b-binding protein [Pinus sylvestris]
photosystem-1 H subunit GOS5 [Oryza sativa]
Plastocyanin, chloroplast precursor [Physcomitrella patens]
glutamine synthetase cytosolic isoenzyme 1 [Vitis vinifera]
S-adenosylmethionine synthetase [Pinus contorta]
Early light-induced protein, chloroplast precursor (ELIP) [Pisum sativum]
Subtilisin-chymotrypsin inhibitor [Triticum aestivum]
Cytochrome B6-F complex iron-sulfur subunit 1, chloroplast precursor [Nicotiana tabacum]
PSII subunit PsbW [Physcomitrella patens]
Catalase 3 [Glycine max]
Photosystem I reaction center subunit XI, chloroplast precursor [Hordeum vulgare]
Carbonic anhydrase [Pisum sativum]
photosystem I reaction center subunit V, chloroplast, [Arabidopsis thaliana]
hypothetical protein K08H10.2a [Caenorhabditis elegans]
ubiquitin conjugating enzyme [Zea mays]
Chlorophyll a-b binding protein 36, chloroplast precursor [Nicotiana tabacum]
core protein [Pisum sativum]
Oxygen-evolving enhancer protein 2, chloroplast precursor [Cucumis sativus]
expressed protein [Arabidopsis thaliana]
photosystem I-N subunit [Phaseolus vulgaris]
chloroplastic iron superoxide dismutase [Barbula unguiculata]
chloroplast ferredoxin-NADP+ oxidoreductase precursor [Capsicum annuum]
Photosystem II 22 kDa protein, chloroplast precursor [Lycopersicon esculentum]
Three highly expressed S. moellendorffii transcripts corresponded to genes encoding enzymes of metabolism, including an aldolase-like protein, a putative glutamine synthetase cytosolic isoenzyme involved in nitrogen assimilation [29, 30], and a putative S-adenosylmethionine synthetase required for the synthesis of the major methyl group donor involved in the methylation of a variety of biomolecules ranging from histones to secondary metabolites, and for the biosynthesis of ethylene [31, 32].
Other relatively abundant ESTs included one encoding a putative subtilisin-chymotrypsin inhibitor, exhibiting 49% amino acid sequence identity with the wheat subtilisin-chymotrypsin inhibitor, which may play a role in plant defense by inhibiting the serine proteinases of pathogens. Two transcripts that matched an A. thaliana expressed protein and a Pisum sativum core protein may function as membrane channel proteins. Interestingly, one highly expressed EST matched a C. elegans protein of unknown function with an E-value of 10⁻¹², and is only more distantly related to an A. thaliana late embryogenesis abundant protein.
There were five highly expressed ESTs that did not yield significant matches using BLASTX (E > 10⁻⁵). These are putative Selaginella-specific genes and may encode proteins with functions unique to Selaginella or lycophytes. The two most highly expressed ESTs in this project, represented by clusters SmoC1_cn126 and SmoC1_cn125, had 105 and 46 copies in their clusters, respectively, but returned no BLASTX hits with the nr protein database or BLASTN hits with the NCBI EST-others database. To determine whether these sequences represented bona fide Selaginella genes, we amplified the corresponding sequences by PCR using genomic DNA as a template (data not shown). Both sequences amplified successfully, and both had introns, indicating that they were not derived from DNA contamination from prokaryotic symbionts. The conceptual translation of the SmoC1_cn126 contig contains three repeats of the motif "XXXGXXTCDKCAQTGVCTCGKN", which aligns with similar cysteine-rich motifs in proteins with epidermal growth factor repeats. Using a low BLASTX stringency (E = 0.002), SmoC1_cn125 matched a Cynodon dactylon metallothionein-like protein (GB:AAS88721.1, 75% identical within a 20 amino acid motif). The other three highly expressed S. moellendorffii-specific ESTs offer no clues for functional annotation. The biological function of the proteins encoded by these genes, and the question of whether high transcript abundance is predictive of high protein expression, will be a matter for future investigation.
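Counting repeats of a motif such as the one quoted above can be sketched with a regular expression in which each "X" matches any single residue. This is only an illustration: the protein string below is hypothetical, built by joining three invented motif instances with made-up linkers, and is not the actual SmoC1_cn126 translation.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif in which 'X' stands for any single residue."""
    return re.compile("".join("." if ch == "X" else re.escape(ch) for ch in motif))


MOTIF = "XXXGXXTCDKCAQTGVCTCGKN"  # motif quoted in the text

# Hypothetical protein: three motif instances joined by made-up linkers.
toy_protein = (
    "MK" + "AAAGSSTCDKCAQTGVCTCGKN"
    + "PLS" + "QRSGDETCDKCAQTGVCTCGKN"
    + "TT" + "LLLGPPTCDKCAQTGVCTCGKN" + "R"
)

pattern = motif_to_regex(MOTIF)
# finditer reports non-overlapping matches, scanning left to right.
starts = [m.start() for m in pattern.finditer(toy_protein)]
print(len(starts), starts)  # 3 [2, 27, 51]
```

Because the wildcard positions accept any residue, a scan of this kind tolerates variation in the "X" positions while requiring the conserved cysteine-rich core to match exactly.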
Functional categorization of S. moellendorffii ESTs
The most sensitive method to find new members of known gene families among EST sequences is to search for homology of the translated ESTs to motifs extracted from a multiple alignment of known gene family members. To functionally categorize S. moellendorffii ESTs using motif homology searches, we translated the 1301 unigenes in six reading frames and imported them into InterProScan, which aligned 491 clusters to InterPro entries (E < 10⁻⁵). Mapping of InterPro entries to GO assigned 562 GO accession numbers to 343 of the 491 InterPro hits. The 562 accession numbers further generated 964 individual GO mappings in the three major ontologies (biological processes, molecular functions and cellular components). The apparent discrepancies between these values arise from the fact that not all InterPro hits had available GO accession numbers associated with them, one InterProScan entry could be assigned to more than one GO accession number, and one GO accession number could be mapped under multiple parental categories.
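The reason the counts at each step differ can be made concrete with a small sketch of the InterPro-to-GO lookup. It assumes the "InterPro:ID name > GO:term ; GO:id" line layout of the interpro2go mapping file; the accessions, domain names, and GO identifiers below are placeholders invented for illustration and are not real InterPro or GO records.

```python
from collections import defaultdict

# Illustrative lines in the interpro2go layout; all accessions and
# terms are placeholders, not real InterPro or GO records.
TOY_INTERPRO2GO = """\
InterPro:IPR900001 Toy domain A > GO:toy binding ; GO:9000001
InterPro:IPR900001 Toy domain A > GO:toy transport ; GO:9000002
InterPro:IPR900002 Toy domain B > GO:toy binding ; GO:9000001
"""


def parse_interpro2go(text):
    """Map each InterPro accession to the set of GO ids it is linked to."""
    mapping = defaultdict(set)
    for line in text.splitlines():
        if not line.startswith("InterPro:"):
            continue  # skip comment or blank lines
        left, right = line.split(" > ", 1)
        ipr = left.split()[0].split(":", 1)[1]
        go_id = right.rsplit(" ; ", 1)[1].strip()
        mapping[ipr].add(go_id)
    return mapping


# Hypothetical cluster -> InterPro hits (IPR900003 has no GO mapping).
cluster_hits = {
    "cluster_1": ["IPR900001"],
    "cluster_2": ["IPR900002"],
    "cluster_3": ["IPR900003"],
}

ipr2go = parse_interpro2go(TOY_INTERPRO2GO)
cluster_go = {
    cluster: sorted({go for ipr in hits for go in ipr2go.get(ipr, ())})
    for cluster, hits in cluster_hits.items()
}
mapped = [c for c, gos in cluster_go.items() if gos]
n_assignments = sum(len(gos) for gos in cluster_go.values())
print(len(cluster_hits), len(mapped), n_assignments)  # 3 2 3
```

In this toy case three clusters have InterPro hits, only two receive GO terms, and those two carry three GO assignments in total, mirroring on a small scale the one-to-none and one-to-many relationships described above.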
The GO categorization of S. moellendorffii ESTs by biological process, molecular function, and cellular component.
Gene Ontology term
Nucleic acid metabolism
Cell growth and/or maintenance
Response to stimulus and stress
Metal ion binding
Electron transporter activity
Structural molecule activity
Translation regulator activity
Signal transducer activity
Enzyme regulator activity
Transcription regulator activity
Comparison of GO assignments between A. thaliana ESTs and S. moellendorffii ESTs.
Gene Ontology term
Cell growth and/or maintenance
Response to stimulus and stress
Structural molecule activity
Translation regulator activity
Signal transducer activity
Enzyme regulator activity
Transcription regulator activity
The current GO annotations for plants are based solely on the annotated proteins of A. thaliana and O. sativa, both of which are angiosperms. Since the lycophyte clade diverged from other plant lineages 400 million years ago, 200 million years before angiosperms arose, it is perhaps to be expected that a large proportion of S. moellendorffii genes could not be accurately assigned to GO categories in a database containing only angiosperm gene entries. We expect that the representation of plant species other than angiosperms will benefit resources such as InterPro and in turn will lead to further resolution within GO.
Comparative genomics of S. moellendorffii ESTs
One important objective of comparative genomics is to trace gene evolution, including the emergence, development, and loss of orthologous genes in different organisms over evolutionary time. To survey the S. moellendorffii ESTs in an evolutionary context, we used the S. moellendorffii unigene sequences as queries to search for homologous sequences in the A. thaliana and P. patens EST databases using the tBLASTX algorithm (cutoff E-value = 10⁻⁶). We chose A. thaliana and P. patens ESTs as tBLASTX databases for two reasons. First, A. thaliana and P. patens are representatives of the most diverged lineages of land plants, namely angiosperms and bryophytes. They flank Selaginella in the plant phylogenetic tree, and last shared a common ancestor over 400 million years ago, thus providing ample opportunity for the evolutionary divergence of individual genes and gene families. Second, the large quantities of A. thaliana and P. patens ESTs in GenBank (472,278 and 104,027, respectively) provide substantial coverage of the transcriptome in these two species. Using them as BLAST databases makes it possible to do a relatively comprehensive genomic analysis even in the absence of the full genome sequence of P. patens.
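Once the tBLASTX searches against the two EST databases have been filtered at the chosen cutoff, sorting the unigenes into the four possible groups is a matter of set arithmetic. The sketch below assumes that the significant hits have already been reduced to sets of query identifiers; the identifiers and the function `partition_by_similarity` are hypothetical.

```python
def partition_by_similarity(queries, hits_at, hits_pp):
    """Split query ids by which EST database(s) gave a significant hit."""
    return {
        "both": queries & hits_at & hits_pp,
        "A. thaliana only": (queries & hits_at) - hits_pp,
        "P. patens only": (queries & hits_pp) - hits_at,
        "neither": queries - hits_at - hits_pp,
    }


# Hypothetical toy data: ten unigenes and their significant hits.
queries = {f"u{i}" for i in range(1, 11)}
hits_at = {"u1", "u2", "u3", "u4", "u5", "u6", "u7"}
hits_pp = {"u1", "u2", "u3", "u4", "u5", "u6", "u8"}

groups = partition_by_similarity(queries, hits_at, hits_pp)
for name, members in groups.items():
    print(f"{name}: {len(members)} ({100 * len(members) / len(queries):.0f}%)")
```

The four groups are disjoint and together cover every query, so their percentages always sum to 100.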
Top 20 S. moellendorffii EST tBLASTX hits for A. thaliana ESTs that are not present within the P. patens EST database.
Best BLASTX Descriptor in A. thaliana
oligopeptide transporter OPT family protein
putative Mg-protoporphyrin IX chelatase
putative caffeoyl-CoA 3-O-methyltransferase
chloroplast membrane protein (ALBINO3)
cullin family protein
putative UDP-galactose/UDP-glucose transporter
nicotinate phosphoribosyltransferase family protein
glycoside hydrolase family 77 protein
amine oxidase family protein
RNase L inhibitor protein-related
putative isoflavone reductase
transducin / WD-40 repeat family protein
dehydration stress-induced protein
putative membrane protein
paired amphipathic helix repeat-containing protein
We sequenced 2181 ESTs from the lycophyte S. moellendorffii, putatively representing 1301 unigenes. Our data showed that a large proportion of the genes had homologous genes in the well-studied model plant A. thaliana and other plant species. By browsing the putative functional annotations of these ESTs, researchers will be able to choose S. moellendorffii genes of interest and compare them to their orthologs in other species. We also found a substantial number of putative Selaginella-specific genes that do not share similarity with known genes, some of which are very highly expressed. Considering the complexity of the plant kingdom and a time span of more than 150 million years between the divergences of lycophytes and angiosperms, it will not be surprising to identify gene functions in S. moellendorffii that are not present in A. thaliana. When the draft genome sequence of S. moellendorffii is completed and released, this EST resource will also play an important role in the mapping and annotation of the genome. As a member of a clade that arose after the bryophytes and before all other vascular plants, S. moellendorffii will provide new opportunities for studying plant evolution, particularly those adaptations relating to fundamental traits that facilitated the transition of green plants to the land, such as lignification in vascular plants, root/stem/leaf organography, complex patterns of sporophyte branching, and the elaboration of reproductive structures.
Plant material and cDNA library construction
S. moellendorffii was obtained from Plant Delights Nursery (Raleigh, NC). Plants were grown at 23°C in a greenhouse with a photoperiod of 16 h light/8 h dark. The cDNA library used in this study was made from RNA extracted from pooled tissue including stems, microphylls, strobili, and rhizophores of S. moellendorffii plants. Briefly, fresh tissue was ground in liquid nitrogen and total RNA was extracted using the RNeasy Max Kit (QIAGEN, Valencia, CA), treated with RNase-free DNase, and precipitated in 2 M lithium chloride. Poly A+ RNA was isolated from total RNA using the Dynabeads mRNA Purification Kit (Dynal Biotech, Brown Deer, WI). The cDNA library was constructed from 1 μg mRNA using the Creator Smart cDNA Library Construction Kit (CLONTECH, Palo Alto, CA). After first-strand synthesis, the full-length double-stranded cDNAs were synthesized by primer extension. Full-length double-stranded cDNAs were digested with Sfi I and size-fractionated using a CHROMA SPIN-400 column (CLONTECH, Palo Alto, CA). cDNA-containing fractions were pooled and ethanol precipitated. The cDNAs were then cloned into pDNR-LIB at the Sfi I site and electroporated into E. coli DH10B cells (Invitrogen, Carlsbad, CA). The library had an un-amplified titer of 1.6 × 10⁶ colony-forming units mL⁻¹ and a total complexity of 3.2 × 10⁶ colonies. To estimate the average insert size of the library, plasmid DNAs were extracted from 35 randomly picked clones from the library, digested with Sfi I, and analyzed by agarose gel electrophoresis.
EST sequencing and dbEST submission
A total of 18,432 colonies from the un-amplified S. moellendorffii cDNA library were arrayed into 48 384-well plates using a Q-Pix multifunction colony picker (Genetix). Plasmid DNA was isolated from 2304 clones picked from the first six 384-well plates. Sequences of cDNAs were determined from their 5' end by conventional procedures using the big-dye terminators on the ABI 3730xl DNA analyzer (Applied Biosystems, Foster City, CA) at the Purdue Genomics Center, using T7-ZL (5'-TAATACGACTCACTATAGGG-3') as the 5'-sequencing primer. The vector sequence was trimmed from the original EST sequences, resulting in 2181 sequences. The 2181 ESTs have been submitted to GenBank dbEST under the accession numbers DN837577 to DN839757.
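The counts in this section are mutually consistent, as the short arithmetic check below shows. The variable names are chosen for illustration only, and the check assumes that the accession range is contiguous with one accession per EST.

```python
WELLS_PER_PLATE = 384

arrayed = 48 * WELLS_PER_PLATE    # colonies arrayed in total
sequenced = 6 * WELLS_PER_PLATE   # clones picked from the first six plates
usable_ests = 2181                # sequences remaining after vector trimming

# Assumes a contiguous accession range, one accession per EST.
accessions = 839757 - 837577 + 1

print(arrayed, sequenced, accessions)              # 18432 2304 2181
print(f"{100 * usable_ests / sequenced:.1f}%")     # 94.7%
```

Under these assumptions, about 95% of the sequenced clones yielded a usable vector-trimmed EST, and the number of accessions matches the number of ESTs submitted.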
EST clustering and homology search
The 2181 EST sequences were imported into the stackPACK v2.2 clustering system (Electric Genetics, Reston, VA) through WebPipe for clustering with default settings, and contig consensus sequences were generated from the clusters. The 1301 non-redundant EST sequences were exported through WebReport in FASTA format. BLASTX analyses using the nr database were performed on the 1301 unigene sequences, using an E-value of 10⁻⁵ as the cutoff threshold. The complete BLASTX annotation of 1301 S. moellendorffii unigenes can be viewed at .
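How contigs and singletons are tallied from a clustering result, and what the reported totals imply, can be sketched as follows. This does not reproduce the stackPACK algorithm; it only counts cluster sizes from a hypothetical EST-to-cluster assignment. The "redundancy" figure uses one common definition, (total ESTs − unigenes) / total ESTs, which is an assumption and not a value stated in this paper.

```python
from collections import Counter


def contig_singleton_counts(est_to_cluster):
    """Count clusters with more than one EST (contigs) and with exactly one (singletons)."""
    sizes = Counter(est_to_cluster.values())
    contigs = sum(1 for n in sizes.values() if n > 1)
    singletons = sum(1 for n in sizes.values() if n == 1)
    return contigs, singletons


# Hypothetical toy assignment of five ESTs to three clusters.
toy = {"est1": "c1", "est2": "c1", "est3": "c2", "est4": "c3", "est5": "c1"}
print(contig_singleton_counts(toy))  # (1, 2)

# Totals reported in the text.
TOTAL_ESTS, CONTIGS, SINGLETONS = 2181, 291, 1010
unigenes = CONTIGS + SINGLETONS              # 1301
ests_in_contigs = TOTAL_ESTS - SINGLETONS    # 1171
mean_per_contig = ests_in_contigs / CONTIGS  # about 4.0 ESTs per contig
redundancy = 1 - unigenes / TOTAL_ESTS       # about 0.40 under the assumed definition
print(unigenes, ests_in_contigs, f"{mean_per_contig:.2f}", f"{redundancy:.3f}")
```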
Functional categorization of ESTs
To search for functional protein domains in the translated ESTs, the 1301 unigene sequences were merged into one FASTA file and imported into InterProScan, which was run on a local SUN Unix server. BlastProDom, Coil, FPrintScan, HMMPIR, HMMPfam, HMMSmart, HMMTigr, ProfileScan, ScanRegExp, and Seg superfamily were selected as the database methods. All the sequences were translated in six reading frames and aligned to the entries in the selected databases. EST clusters that had positive InterProScan hits (E < 10⁻⁵) were automatically assigned InterPro accession numbers. According to the mapping of InterPro entries to GO, GO accession numbers were assigned to EST clusters, which were used to classify ESTs into functional groups by molecular function, cellular component, and biological process. To compare the distribution of GO categories between S. moellendorffii ESTs and A. thaliana TCs, the GO assignments for A. thaliana ESTs were obtained from TIGR. The complete InterPro assignment and GO mapping of S. moellendorffii ESTs can be accessed in the supplemental data (see Additional file 1).
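Six-frame translation, the step that precedes the domain search, can be sketched in a few lines. This is a minimal illustration using the standard genetic code, not the translation routine built into InterProScan; the function names and the short test sequence are hypothetical, and any codon containing a character other than A, C, G, or T is rendered as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frames("ATGGCCTAA")  # hypothetical 9-nt sequence
print(frames["+1"], frames["+2"], frames["-1"])  # MA* WP LGH
```

Searching all six frames is what allows a 5' EST read of unknown orientation and reading frame to be compared against protein domain databases.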
Comparison of S. moellendorffii ESTs to A. thaliana and P. patens ESTs
A total of 472,278 A. thaliana ESTs and 104,027 P. patens ESTs, retrieved from GenBank by searching 'Arabidopsis / Physcomitrella and gbdiv est' in NCBI Entrez, were saved to a local server. The 1301 S. moellendorffii unigenes were translated in six reading frames and searched for homology against the six-frame translations of A. thaliana ESTs and P. patens ESTs, respectively, using the tBLASTX algorithm. An E-value of 10⁻⁶ was set as the stringency threshold. The complete result of S. moellendorffii unigenes tBLASTX against A. thaliana and P. patens ESTs can be viewed at .
To amplify the genomic sequences of the two most highly expressed ESTs (SmoC1_cn126 and SmoC1_cn125) in S. moellendorffii, PCR was performed with two pairs of primers designed from their EST contig sequences: CC1170 (5'-CGAGCTCGTAGTGATAGTGTC-3') and CC1171 (5'-AACCATAGGAGAGGAAGACC-3') for SmoC1_cn126; CC1228 (5'-ATAGCTTAGCTGCTTTCTTCTC-3') and CC1229 (5'-ATACTACTCATGTCGCAGCTC-3') for SmoC1_cn125. The template was genomic DNA extracted from 50 mg fresh tissue of S. moellendorffii as described previously. PCR was performed using an initial 2 min denaturation at 94°C, followed by 25 cycles, each consisting of a 0.5 min denaturation at 94°C, a 0.5 min annealing at 50°C, and a 1 min extension at 72°C. These 25 cycles were followed by a 5 min extension at 72°C. PCR products were purified using the QIAquick PCR Purification Kit (QIAGEN) and sequenced at the Purdue Genomics Center.
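The cycling program described above can be written down as data and summed to give the total programmed hold time. The sketch below is illustrative only: the `Step` class and function names are hypothetical, ramping time between temperatures is ignored, and the GC fraction is a simple base count rather than a melting-temperature model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    minutes: float


INITIAL = Step("initial denaturation", 94, 2.0)
CYCLE = (
    Step("denaturation", 94, 0.5),
    Step("annealing", 50, 0.5),
    Step("extension", 72, 1.0),
)
N_CYCLES = 25
FINAL = Step("final extension", 72, 5.0)


def total_hold_minutes() -> float:
    """Sum of programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + N_CYCLES * per_cycle + FINAL.minutes


def gc_fraction(primer: str) -> float:
    return sum(primer.count(base) for base in "GC") / len(primer)


# Primer sequences as given in the text.
PRIMERS = {
    "CC1170": "CGAGCTCGTAGTGATAGTGTC",
    "CC1171": "AACCATAGGAGAGGAAGACC",
    "CC1228": "ATAGCTTAGCTGCTTTCTTCTC",
    "CC1229": "ATACTACTCATGTCGCAGCTC",
}

print(total_hold_minutes())  # 57.0
for name, seq in PRIMERS.items():
    print(name, len(seq), f"{gc_fraction(seq):.2f}")
```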
This research was supported by a grant from the National Science Foundation to C.C. and a pilot project grant from the Department of Biochemistry, Purdue University. This is journal paper number 2005-17677 from the Purdue University Agricultural Experiment Station. We thank Dr. Jo Ann Banks for critically reading the manuscript.
- Mouse Genome Sequencing Consortium: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420: 520-562. 10.1038/nature01262.
- Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF: The genome sequence of Drosophila melanogaster. Science. 2000, 287: 2185-2195. 10.1126/science.287.5461.2185.
- Grunwald DJ, Eisen JS: Headwaters of the zebrafish – emergence of a new model vertebrate. Nat Rev Genet. 2002, 717-724. 10.1038/nrg892.
- The C. elegans Sequencing Consortium: Genome Sequence of the Nematode C. elegans: A Platform for Investigating Biology. Science. 1998, 282: 2012-2018. 10.1126/science.282.5396.2012.
- Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- O'Brien SJ, Menotti-Raymond M, Murphy WJ, Nash WG, Wienberg J, Stanyon R, Copeland NG, Jenkins NA, Womack JE, Marshall Graves JA: The Promise of Comparative Genomics in Mammals. Science. 1999, 286: 458-481. 10.1126/science.286.5439.458.
- Miller W, Makova KD, Nekrutenko A, Hardison RC: Comparative genomics. Annu Rev Genomics Hum Genet. 2004, 5: 15-56. 10.1146/annurev.genom.5.061903.180057.
- Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X: A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002, 296: 79-92. 10.1126/science.1068037.
- Martienssen RA, Rabinowicz PD, O'Shaughnessy A, McCombie WR: Sequencing the maize genome. Curr Opin Plant Biol. 2004, 7: 102-107. 10.1016/j.pbi.2004.01.010.
- Tanksley SD, Ganal MW, Prince JP, de Vicente MC, Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ, Grandillo S, Martin GB: High density molecular linkage maps of the tomato and potato genomes. Genetics. 1992, 132: 1141-60.
- Pryer KM, Schneider H, Zimmer EA, Banks JA: Deciding among green plants for whole genome studies. Trends in Plant Sci. 2002, 7: 550-554. 10.1016/S1360-1385(02)02375-0.
- Stewart WN, Rothwell GW: Paleobotany and the evolution of plants. 1993, Cambridge University Press, Cambridge, UK, 2nd edition.
- Kenrick P, Crane PR: The origin and early evolution of plants on land. Nature. 1997, 389: 33-39. 10.1038/37918.
- Wang W, Tanurdzic M, Luo M, Sisneros N, Kim HR, Weng JK, Kudrna D, Mueller C, Arumuganathan K, Carlson J: Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: A new resource for plant comparative genomics. BMC Plant Biol. 2005, 5: 10. 10.1186/1471-2229-5-10.
- The Green Plant BAC Library Project. [http://www.greenbac.org]
- JGI Approved Community Sequencing Program Projects for 2005. [http://www.jgi.doe.gov/sequencing/cspseqplans.html]
- Whitfield CW, Band MR, Bonaldo MF, Kumar CG, Liu L, Pardinas JR, Robertson HM, Soares MB, Robinson GE: Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee. Genome Res. 2002, 12: 555-566. 10.1101/gr.5302.
- Jongeneel CV: Searching the expressed sequence tag (EST) databases: panning for genes. Brief Bioinform. 2000, 1: 76-92.
- Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF: Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991, 252: 1651-1656.
- NCBI expressed sequence tag database. [http://www.ncbi.nlm.nih.gov/dbEST]
- Kirst M, Johnson AF, Baucom C, Ulrich E, Hubbard K, Staggs R, Paule C, Retzel E, Whetten R, Sederoff R: Apparent homology of expressed genes from wood-forming tissues of loblolly pine (Pinus taeda L.) with Arabidopsis thaliana. Proc Natl Acad Sci USA. 2003, 100: 7383-7388. 10.1073/pnas.1132171100.
- Brenner ED, Stevenson DW, McCombie RW, Katari MS, Rudd SA, Mayer KF, Palenchar PM, Runko SJ, Twigg RW, Dai G: Expressed sequence tag analysis in Cycas, the most primitive living seed plant. Genome Biol. 2003, 4: R78. 10.1186/gb-2003-4-12-r78.
- Nishiyama T, Fujita T, Shin-I T, Seki M, Nishide H, Uchiyama I, Kamiya A, Carninci P, Hayashizaki Y, Shinozaki K: Comparative genomics of Physcomitrella patens gametophytic transcriptome and Arabidopsis thaliana: implication for land plant evolution. Proc Natl Acad Sci USA. 2003, 100: 8007-8012. 10.1073/pnas.0932694100.
- stackPACK. [http://www.egenetics.com/stackpack.html]
- NCBI. [http://www.ncbi.nlm.nih.gov]
- McCarter JP, Mitreva MD, Martin J, Dante M, Wylie T, Rao U, Pape D, Bowers Y, Theising B, Murphy CV: Analysis and functional classification of transcripts from the nematode Meloidogyne incognita. Genome Biol. 2003, 4: R26. 10.1186/gb-2003-4-4-r26.
- McKersie BD, Murnaghan J, Jones KS, Bowley SR: Iron-superoxide dismutase expression in transgenic alfalfa increases winter survival without a detectable increase in photosynthetic oxidative stress tolerance. Plant Physiol. 2000, 122: 1427-1438. 10.1104/pp.122.4.1427.
- Fita I, Rossmann MG: The active center of catalase. J Mol Biol. 1985, 185: 21-37. 10.1016/0022-2836(85)90180-9.
- Mann AF, Fentem PA, Stewart GR: Identification of two forms of glutamine synthetase in barley (Hordeum vulgare). Biochem Biophys Res Commun. 1979, 88: 515-521. 10.1016/0006-291X(79)92078-3.
- Oliveira IC, Coruzzi GM: Carbon and Amino Acids Reciprocally Modulate the Expression of Glutamine Synthetase in Arabidopsis. Plant Physiol. 1999, 121: 301-310. 10.1104/pp.121.1.301.
- Yang SF, Hoffman NE: Ethylene biosynthesis and its regulation in higher plants. Annu Rev Plant Physiol. 1984, 35: 155-189. 10.1146/annurev.pp.35.060184.001103.
- Lamblin F, Saladin G, Dehorter B, Cronier D, Grenier E, Lacoux J, Bruyant P, Laine E, Chabbert B, Girault F: Overexpression of a heterologous sam gene encoding S-adenosylmethionine synthetase in flax (Linum usitatissimum) cells: Consequences on methylation of lignin precursors and pectins. Physiol Plant. 2001, 112: 223-232. 10.1034/j.1399-3054.2001.1120211.x.
- Poerio E, Gennaro SD, Maro AD, Farisei F, Ferranti P, Parente A: Primary structure and reactive site of a novel wheat proteinase inhibitor of subtilisin and chymotrypsin. Biol Chem. 2003, 384: 295-304. 10.1515/BC.2003.033.
- Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman A, Binns D, Biswas M, Bradley P, Bork P: The InterPro Database, 2003 brings increased coverage and new features. Nuc Acids Res. 2003, 31: 315-318. 10.1093/nar/gkg046.
- Mapping of InterPro entries to GO. [http://www.geneontology.org/external2go/interpro2go]
- Gene Ontology Consortium. [http://www.geneontology.org]
- Gene Ontology Consortium: Creating the gene ontology resource: design and implementation. Genome Res. 2001, 11: 1425-1433. 10.1101/gr.180801.
- TIGR Arabidopsis Gene Index. [http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=arab]
- Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J: The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res. 2001, 29: 159-164. 10.1093/nar/29.1.159.
- Mirkin BG, Fenner TI, Galperin MY, Koonin EV: Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evol Biol. 2003, 3: 2. 10.1186/1471-2148-3-2.
- Jordan IK, Rogozin IB, Wolf YI, Koonin EV: Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 2002, 12: 962-968. 10.1101/gr.87702. Article published online before print in May 2002.
- Guo D, Chen F, Inoue K, Blount JW, Dixon RA: Downregulation of caffeic acid 3-O-methyltransferase and caffeoyl CoA 3-O-methyltransferase in transgenic alfalfa: impacts on lignin structure and implications for the biosynthesis of G and S lignin. Plant Cell. 2001, 13: 73-88. 10.1105/tpc.13.1.73.
- Kipreos ET, Lander LE, Wing JP, He WW, Hedgecock EM: cul-1 is required for cell cycle exit in C. elegans and identifies a novel gene family. Cell. 1996, 85: 829-839. 10.1016/S0092-8674(00)81267-2.
- Koh S, Wiles AM, Sharp JS, Naider FR, Becker JM, Stacey G: An oligopeptide transporter gene family in Arabidopsis. Plant Physiol. 2002, 128: 21-29. 10.1104/pp.128.1.21.
- Norambuena L, Marchant L, Berninsone P, Hirschberg CB, Silva H, Orellana A: Transport of UDP-galactose in plants: Identification and functional characterization of AtUTr1, an Arabidopsis thaliana UDP-galactose/UDP-glucose transporter. J Biol Chem. 2002, 277: 32923-32929. 10.1074/jbc.M204081200.
- Petrucco S, Bolchi A, Foroni C, Percudani R, Rossi GL, Ottonello S: A maize gene encoding an NADPH binding enzyme highly homologous to isoflavone reductases is activated in response to sulfur starvation. Plant Cell. 1996, 8: 69-80. 10.1105/tpc.8.1.69.
- Braz AS, Finnegan J, Waterhouse P, Margis R: A plant orthologue of RNase L inhibitor (RLI) is induced in plants showing RNA interference. J Mol Evol. 2004, 59: 20-30. 10.1007/s00239-004-2600-4.
- Purdue University Selaginella Page. [http://research.e-enterprise.purdue.edu/selaginella]
- Edwards K, Johnstone C, Thompson C: A simple and rapid method for the preparation of plant genomic DNA for PCR analysis. Nucleic Acids Res. 1991, 19: 1349.