plantsUPS: a database of plants' Ubiquitin Proteasome System
© Du et al; licensee BioMed Central Ltd. 2009
Received: 21 March 2009
Accepted: 16 May 2009
Published: 16 May 2009
The ubiquitin/26S proteasome system (UPS), a serial cascade of protein ubiquitination and degradation, is the final step in the life of most cellular proteins. Many genes are involved in this system, but they have not yet been identified in many species. The growing availability of genomic sequence data is increasing the demand for data management and analysis. Genomic data for plants such as Populus trichocarpa, Medicago truncatula, Glycine max and others are now publicly accessible. It is time to integrate information on the classes of genes underlying complex protein systems such as the UPS.
We developed a database of the UPS in higher plants, named 'plantsUPS'. Candidate genes were identified by a combination of automated search and manual curation. Extensive annotations were generated for each gene, including basic gene characterization, protein features, GO (gene ontology) assignment, microarray probe set annotation and expression data, as well as cross-links among different organisms. A chromosome distribution map, multi-sequence alignment, and phylogenetic trees for each species or gene family were also created. A user-friendly web interface and regular updates make plantsUPS valuable to researchers in related fields.
plantsUPS enables the exploration and comparative analysis of the UPS in higher plants. It currently archives more than 8000 genes from seven plant species, distributed across 11 UPS gene families. plantsUPS is freely available to all users at http://bioinformatics.cau.edu.cn/plantsUPS.
The ubiquitin/26S proteasome system (UPS) is the major pathway of protein degradation. The UPS can affect all aspects of cellular function, and plays an important role in physiological processes such as hormonal responses, biotic stress and photomorphogenesis. In the UPS, substrate proteins destined for degradation are tagged with the 76-residue protein ubiquitin through a serial cascade known as ubiquitination, and are finally hydrolysed by the 26S proteasome. Ubiquitination proceeds in three steps, catalyzed by three different enzymes or enzyme complexes: ubiquitin activating enzyme (E1), ubiquitin conjugating enzyme (E2), and ubiquitin protein ligase (E3). There are approximately 1300 E3s in the Arabidopsis genome, and similarly large numbers in other plants. However, in most plant species, the genome-wide classification and annotation of UPS genes, especially of the E3 families, are not yet available. Rapidly accumulating sequence data have made the genomes of seven important higher plants publicly available: Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), Populus trichocarpa, Medicago truncatula, grape (Vitis vinifera), soybean (Glycine max), and maize (Zea mays). Consequently, systematic analysis of these genomes is now both necessary and urgent. Until now, however, no database has covered the UPS across higher plants. The only comprehensive UPS database is PlantsUBQ, which provides information for a single species, Arabidopsis. To help researchers interested in the plant UPS, we developed the platform 'plantsUPS'. It archives more than 8000 genes from the seven plant species above, belonging to 11 UPS gene families (one each for E1 and E2, and nine for E3).
Construction and content
Genome sequence data acquisition
The Arabidopsis genome data used in plantsUPS are from TAIR release 8, and the rice data are from the Rice Genome Annotation Project release 5. Populus, soybean, grape, Medicago and maize genome data were compiled from the Populus genome project, the Soybean Genome Project, Genoscope, MGSC and MaizeSequence.org, respectively; in each case we used the latest version available in February 2009. We used maize protein-coding genes for analysis; however, because the annotation of the maize genome is highly complex and unfinished, the genes in plantsUPS should not be regarded as a complete UPS profile of maize.
UPS gene identification
In plantsUPS, we combined BLAST and InterProScan searches to predict UPS gene members computationally for the 11 gene families. BLAST (E-value ≤ 10.0) served as the primary search and was followed by InterProScan. However, for the RBX (Ring-Box) and DDB families (DDB being a component of CDD, the CUL4-RBX1-CDD complex), there is no consensus IPR (InterPro) accession for identification, so only BLAST was used, with a stricter threshold (E-value ≤ 1e–30). InterProScan results served as the main evidence for family assignment. Gene families and the corresponding IPR accessions are listed in supplementary Table S1 [see Additional file 1]. Subsequently, we curated the candidates manually, based on published reports, to reduce the false-positive rate.
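The two-stage selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used for plantsUPS: it assumes BLAST hits are available in the standard 12-column tabular format (E-value in column 11), that InterProScan results have already been reduced to a mapping from gene identifier to IPR accessions, and that each BLAST query is a known seed protein of one family. The function names, the seed-to-family mapping, and the placeholder accession strings are hypothetical; the real family-to-IPR assignments are those in Table S1.

```python
from collections import defaultdict

# Cutoffs taken from the text: a permissive primary BLAST search, and a strict
# cutoff for families that have no consensus InterPro accession.
PRIMARY_EVALUE = 10.0
STRICT_EVALUE = 1e-30
BLAST_ONLY_FAMILIES = {"RBX", "DDB"}


def parse_blast_tabular(lines):
    """Yield (query, subject, evalue) from 12-column BLAST tabular output."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield fields[0], fields[1], float(fields[10])


def select_candidates(blast_lines, query_family, interpro_hits, family_iprs):
    """Return {family: sorted candidate gene ids}.

    query_family  -- hypothetical mapping: seed protein id -> family name
    interpro_hits -- mapping: gene id -> set of IPR accessions found on it
    family_iprs   -- mapping: family name -> set of diagnostic IPR accessions
    """
    accepted = defaultdict(set)
    for query, subject, evalue in parse_blast_tabular(blast_lines):
        family = query_family[query]
        if family in BLAST_ONLY_FAMILIES:
            # No consensus IPR accession: rely on a strict BLAST cutoff alone.
            if evalue <= STRICT_EVALUE:
                accepted[family].add(subject)
        elif evalue <= PRIMARY_EVALUE:
            # Permissive BLAST hit must be confirmed by a diagnostic domain.
            domains = interpro_hits.get(subject, set())
            if domains & family_iprs.get(family, set()):
                accepted[family].add(subject)
    return {family: sorted(genes) for family, genes in accepted.items()}
```

The permissive primary cutoff keeps sensitivity high at the BLAST stage, leaving specificity to the domain check; candidates returned by such a filter would still need the manual curation step described in the text.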
We built plantsUPS on a typical LAMP (Linux + Apache + MySQL + PHP) platform. The dataset is stored in MySQL 4.1 [see Additional file 2], and the web interface is implemented as PHP scripts (PHP version 4.4) on Red Hat Linux, served by an Apache server.
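As an illustration of how a relational store of this kind supports per-family, per-species queries, the following Python sketch uses the standard-library sqlite3 module as a stand-in for MySQL. The table and column names are hypothetical and are not taken from the actual plantsUPS schema (which is described in Additional file 2); the sketch only shows the general idea of linking genes to families and counting them by species.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene_family (
            family_id    INTEGER PRIMARY KEY,
            name         TEXT NOT NULL UNIQUE,
            enzyme_class TEXT NOT NULL          -- 'E1', 'E2' or 'E3'
        );
        CREATE TABLE gene (
            gene_id   TEXT PRIMARY KEY,
            species   TEXT NOT NULL,
            family_id INTEGER NOT NULL REFERENCES gene_family(family_id)
        );
        """
    )
    return conn


def count_genes_by_species(conn, family_name):
    """Return {species: number of genes} for one gene family."""
    rows = conn.execute(
        """
        SELECT g.species, COUNT(*)
        FROM gene AS g
        JOIN gene_family AS f ON g.family_id = f.family_id
        WHERE f.name = ?
        GROUP BY g.species
        ORDER BY g.species
        """,
        (family_name,),
    ).fetchall()
    return dict(rows)
```

A parameterized query (the `?` placeholder) is used so that a family name supplied through a web form cannot alter the SQL statement.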
Utility and discussion
Web function and comparative tools
Extensive gene annotation
Microarray expression value
GPLs in plantsUPS
Genes data statistics in plantsUPS
In the future, we will continue to incorporate new information, develop more comparative tools for plantsUPS, and extend the range of species covered. Regular updates and related analyses will provide users with up-to-date UPS information.
plantsUPS is the first platform covering the UPS in seven sequenced higher plants. It will assist researchers in related fields by providing comprehensive information on UPS gene families and their members. The plantsUPS resource is freely available via http://bioinformatics.cau.edu.cn/plantsUPS.
We thank Ms. Wenying Xu for discussions and critical suggestions. This work was supported by grants from the Ministry of Science and Technology of China (2006CB100105) and the China Agriculture University.
- PlantsUBQ. [http://plantsubq.genomics.purdue.edu/]
- Poole RL: The TAIR database. Methods in molecular biology (Clifton, NJ). 2007, 406: 179-212.
- Ouyang S, Zhu W, Hamilton J, Lin H, Campbell M, Childs K, Thibaud-Nissen F, Malek RL, Lee Y, Zheng L: The TIGR Rice Genome Annotation Resource: improvements and new features. Nucleic acids research. 2007, 35 (Database issue): D883-887.
- Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al: The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006, 313: 1596-1604.
- Soybean Genome Project. [http://www.phytozome.net/soybean]
- Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, et al: The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007, 449 (7161): 463-467.
- The Medicago Genome Sequence Consortium (MGSC). [http://www.medicago.org/genome/index.php]
- Maize Genome Sequencing Project. [http://www.maizesequence.org/index.html]
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research. 1997, 25 (17): 3389-3402.
- Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R: InterProScan: protein domains identifier. Nucleic acids research. 2005, 33 (Web Server issue): W116-120.
- Kozik A, Kochetkova E, Michelmore R: GenomePixelizer – a visualization program for comparative genomics within and between species. Bioinformatics (Oxford, England). 2002, 18 (2): 335-336.
- Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles – database and tools update. Nucleic acids research. 2007, 35 (Database issue): D760-765.
- Xu G, Ma H, Nei M, Kong H: Evolution of F-box genes in plants: different modes of sequence divergence and their relationships with functional diversification. Proceedings of the National Academy of Sciences of the United States of America. 2009, 106 (3): 835-840.
- Gingerich DJ, Hanada K, Shiu SH, Vierstra RD: Large-scale, lineage-specific expansion of a bric-a-brac/tramtrack/broad complex ubiquitin-ligase gene family in rice. The Plant cell. 2007, 19 (8): 2329-2348.