- Open Access
Ashbya Genome Database 3.0: a cross-species genome and transcriptome browser for yeast biologists
BMC Genomics volume 8, Article number: 9 (2007)
The Ashbya Genome Database (AGD) 3.0 is an innovative cross-species genome and transcriptome browser based on release 40 of the Ensembl developer environment.
AGD 3.0 provides information on 4726 protein-encoding loci and 293 non-coding RNA genes present in the genome of the filamentous fungus Ashbya gossypii. A synteny viewer depicts the chromosomal location and orientation of orthologous genes in the budding yeast Saccharomyces cerevisiae. Genome-wide expression profiling data obtained with high-density oligonucleotide microarrays (GeneChips) are available for nearly all currently annotated protein-coding loci in A. gossypii and S. cerevisiae.
AGD 3.0 hence provides yeast- and genome biologists with comprehensive report pages including reliable DNA annotation, Gene Ontology terms associated with S. cerevisiae orthologues and RNA expression data as well as numerous links to external sources of information. The database is accessible at http://agd.vital-it.ch/.
Approximately 90% of the genes present in the genomes of the filamentous fungus Ashbya gossypii and the budding yeast Saccharomyces cerevisiae were found to be orthologous and syntenic . This degree of similarity was remarkable since both species display distinct morphologies as well as growth and differentiation properties. In the case of Ashbya, for example, there is no evidence for sexual reproduction although its genome contains the relevant complement of genes involved in yeast meiosis and spore development . Budding and fission yeasts are key model organisms for high-throughput studies aiming at elucidating gene function by deletion analysis [3, 4], expression profiling of conserved processes such as the mitotic cell cycle [5–7] and meiotic development [8–10] and by analysing protein localization  or protein-protein interaction patterns [12, 13]. The data obtained in these species are believed to be useful for analysing networks and predicting gene function in yeasts and related organisms in the emerging field of systems biology . Similar functional genomics and expression profiling experiments are currently being carried out using A. gossypii but there is currently no comprehensive source of information that provides cross-species coverage of high-throughput annotation, functional analysis and expression profiling data.
Technological progress in the field of DNA sequencing led to the production of a huge amount of information and spawned the development of appropriate bioinformatics tools for DNA data interpretation and analysis. The Ensembl project provides a comprehensive annotation database covering 19 species. Moreover, it enables software engineers to build standalone applications within the freely available Ensembl development environment . The urgent need for a global and coherent approach to genome annotation has lead to the formation of the Gene Ontology (GO) consortium. GO develops a controlled and structured vocabulary (Ontology) for describing the process a gene product is involved in as well as its molecular function and sub-cellular localization . This effort has yielded annotation information very useful for the categorization of genes across species and has proven particularly pertinent for microarray data interpretation ( and references therein).
This paper reports the release of Ashbya Genome Database 3.0, an innovative cross-species genome and transcriptome browser based on the Ensembl environment. The DNA annotation and high-density oligonucleotide microarray RNA expression data available via AGD are regularly updated to continue providing an excellent source of online information for yeast and genome biologists.
Construction and content
AGD 3.0 was developed using the Ensembl Application Programming Interface (API) and base web code release 40 [18–20]. Further details on the AGD developer environment have been published previously .
AGD 3.0 provides highly reliable and in many cases manually verified DNA annotation data on 4726 protein-coding genes present in the genome of A. gossypii. GeneOntology annotation data associated with orthologues from S. cerevisiae are displayed in the A. gossypii gene report page. Whole-genome expression profiling data are displayed for 4451 annotated genes represented in the database and present on proprietary GeneChips used to study gene expression during A. gossypii spore germination (R. Rischatsch, P. Demougin, A. Gattiker, M. Primig and P. Philippsen, unpublished). A number of similar GeneChip-based transcriptome studies that cover mitotic growth  and meiotic development [10, 22, 23] in budding yeast have also been incorporated. This facilitates cross-species interpretation of transcriptional patterns and relative expression levels. Critically, the expression signals were obtained with the same robust microarray platform and experiments that are based on the same array architecture (e.g. Saccharomyces S98 and Ashbya SYNG001 GeneChips) and are therefore directly comparable even between species [24–26].
To analyze homology at the DNA and protein levels we used BLASTZ  and BLAT  as implemented in the Compara system . Syntenic regions were computed by merging homology regions (identified by BLASTZ) that were in the same orientation and not more than 5000 bp apart in both organisms. Among the set of merged regions, only those covering at least 10000 bp were retained to ensure that syntenic regions contained a meaningful number of loci. Furthermore, complementary data from the Orthologous Matrix project (using a conservative approach to the identification of orthologs) were included to provide independent phylogenetic information on 287 species [29, 30].
Utility and discussion
The database can be searched by using a systematic or standard locus name (including aliases) from A. gossypii or S. cerevisiae via the welcome page. A wildcard option is provided (e.g. "CDC*"). Genes can also be searched through cross-references to GenBank/EMBL Protein identification numbers, UniProt accession numbers and Gene Ontology identifiers (e.g. "GO:0005737"). Finally, it is possible to retrieve genes or genome regions by DNA sequence coordinates.
The home page presents the A. gossypii genome as well as links to navigate to the S. cerevisiae, S. pombe and N. crassa genomes. The user can click on the karyotype or input sequence coordinates to display a specific chromosome. The MapView shows a summary display of the chromosome features. Clicking on a portion of the chromosome brings up the ContigView that shows features at three different scales. In the DetailedView section, pop-up menus can be used to display different features: genes, homologous regions from other fungal species mapped by BLASTZ and, notably, the positions of the target sequences covered by oligonucleotide probes on Ashbya and yeast GeneChips. Selecting the "View alongside..." option in the navigation menu brings up the MultiContigView, which displays homologous and syntenic (similarly oriented) genes from two species. The information in the gene pages has been substantially extended since the previous release . Cross-references to Gene Ontology (GO, based on data supplied by SGD in S. cerevisiae homologs), GenBank/EMBL and UniProtKB have been added. Clicking on a GO term allows browsing all entries sharing a given annotation. Furthermore, links to facilitate information retrieval from external sources have been added. This includes the display of recently published articles from the Google Scholar service, and the display of community annotation entries from GermOnline [31, 32].
The conservation of gene order across genomes can be visualized at two scales. The SyntenyView page is available from the chromosome page and displays locations of large-scale syntenic regions on the chromosomes of two selected species (Figure 1, panel a). The MultiContigView page can be accessed from the synteny view or from the Orthologue Prediction section of a gene page and displays the genomic maps of two organisms at a high resolution. In that view, homologous genes are linked by blue lines (panel b).
The gene report page contains a new section that displays putative orthologs of genes present in the Ashbya genome. Information from the Compara pipeline  and the Orthologous Matrix (OMA) [29, 30] is summarized. While both methods start with an all-against-all Smith-Waterman protein similarity search the results are not identical because different algorithms are used. Compara data indicate best reciprocal BLASTP hits among the four loaded fungal genomes, indicating putative orthologs even with relatively low similarity. OMA data contains 287 complete genomes from all kingdoms and attempts to detect truly orthologous relationships.
A section of the report page covers microarray expression data annotated using our Microarray Information Management and Annotation System (MIMAS) . Users can view information on the array type, experimental details and a detailed sample history that is available in a popup menu via clickable sample names. Normalized microarray expression signals are displayed in the context of the DNA annotation (Figure 2, panel a) in a bar diagram as linear or log2-transformed values and the percentile of genes transcribed at the given level is indicated. Currently, expression levels are available from samples of germinating Ashbya somatic spores incubated in rich medium for five, seven and nine hours (panel b). The array data volume is expected to increase in size rapidly as results from ongoing research will constantly be added. By clicking on the appropriate links in the "orthologs prediction" section of the report page users can access comparable data available for mitotically growing and sporulating budding yeast (panel c). It should be noted that the expression signals obtained with microarrays in different species indicate relative but not absolute mRNA levels.
The sequence and annotation information is kept synchronized between AGD and the GenBank/EMBL/DDBJ collaboration . The data are accessible in any of the databases under accession numbers AE016814 through AE016821. The 'Export data' link in AGD also allows creating custom files in a range of formats containing sequence data and/or feature annotation within a certain sequence range. Results of the microarray study using proprietary Ashbya GeneChips will be published elsewhere and the raw expression signal values will be made available via ArrayExpress, a certified public repository at the European Bioinformatics Institute .
Upcoming releases of AGD will contain information about gene function from large-scale deletion studies in A. gossypii and predicted protein-protein interaction data on the basis of high-throughput studies in S. cerevisiae [12, 13]. Finally, high-density oligonucleotide microarray expression data from the species represented in AGD will continuously be integrated as they become available.
AGD 3.0 is a useful and innovative database for researchers in the yeast and genome evolution fields. The gene annotation data is steadily verified and improved and it is now complemented with high-density oligonucleotide microarray expression data. This approach marks out AGD 3.0 as the first database that enables users to view a comprehensive report page including information on gene DNA annotation and RNA expression across conserved fungal species.
Availability and requirements
Project name: Ashbya Genome Database
Project home page: http://agd.vital-it.ch/
Operating system(s): Linux
Programming language: Perl
Licence: the database is freely accessible for academic users under the GNU GPL.
Restrictions to use by non-academics: commercial users are referred to Syngenta for more details on access to microarray expression signals obtained with proprietary GeneChips.
Deep links into AGD from external sources are possible via the uniform resource locator (URL) http://agd.unibas.ch/Ashbya_gossypii/geneview?gene=[NAME] where [NAME] can be a budding yeast gene systematic name (e.g. YPR175W) or standard name (SPO11) or an A. gossypii systematic name (e.g. AEL267C).
Dietrich FS, Voegeli S, Brachat S, Lerch A, Gates K, Steiner S, Mohr C, Pohlmann R, Luedi P, Choi S, Wing RA, Flavier A, Gaffney TD, Philippsen P: The Ashbya gossypii genome as a tool for mapping the ancient Saccharomyces cerevisiae genome. Science. 2004, 304 (5668): 304-307. 10.1126/science.1095781.
Brachat S, Dietrich FS, Voegeli S, Zhang Z, Stuart L, Lerch A, Gates K, Gaffney T, Philippsen P: Reinvestigation of the Saccharomyces cerevisiae genome annotation by comparison to the genome of a related fungus: Ashbya gossypii. Genome Biol. 2003, 4 (7): R45-10.1186/gb-2003-4-7-r45.
Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, Chu AM, Connelly C, Davis K, Dietrich F, Dow SW, El Bakkoury M, Foury F, Friend SH, Gentalen E, Giaever G, Hegemann JH, Jones T, Laub M, Liao H, Liebundguth N, Lockhart DJ, Lucau-Danila A, Lussier M, M'Rabet N, Menard P, Mittmann M, Pai C, Rebischung C, Revuelta JL, Riles L, Roberts CJ, Ross-MacDonald P, Scherens B, Snyder M, Sookhai-Mahadeo S, Storms RK, Veronneau S, Voet M, Volckaert G, Ward TR, Wysocki R, Yen GS, Yu K, Zimmermann K, Philippsen P, Johnston M, Davis RW: Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999, 285 (5429): 901-906. 10.1126/science.285.5429.901.
Decottignies A, Sanchez-Perez I, Nurse P: Schizosaccharomyces pombe essential genes: a pilot study. Genome Res. 2003, 13 (3): 399-406. 10.1101/gr.636103.
Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW: A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell. 1998, 2 (1): 65-73. 10.1016/S1097-2765(00)80114-8.
Rustici G, Mata J, Kivinen K, Lio P, Penkett CJ, Burns G, Hayles J, Brazma A, Nurse P, Bahler J: Periodic gene expression program of the fission yeast cell cycle. Nat Genet. 2004, 36 (8): 809-817. 10.1038/ng1377.
Spellman P, Sherlock G, Zhang M, Iyer V, Anders K, Eisen M, Brown P, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of Saccharomyces cerevisae by microarray hybridization. Mol Biol Cell. 1998, 9: 3273-3297.
Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, Brown PO, Herskowitz I: The transcriptional program of sporulation in budding yeast. Science. 1998, 282 (5389): 699-705. 10.1126/science.282.5389.699.
Mata J, Lyne R, Burns G, Bahler J: The transcriptional program of meiosis and sporulation in fission yeast. Nat Genet. 2002, 32 (1): 143-147. 10.1038/ng951.
Primig M, Williams RM, Winzeler EA, Tevzadze GG, Conway AR, Hwang SY, Davis RW, Esposito RE: The core meiotic transcriptome in budding yeasts. Nat Genet. 2000, 26 (4): 415-423. 10.1038/82539.
Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK: Global analysis of protein localization in budding yeast. Nature. 2003, 425 (6959): 686-691. 10.1038/nature02026.
Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532.
Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrin-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MH, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670.
de Lichtenberg U, Jensen LJ, Brunak S, Bork P: Dynamic complex formation during the yeast cell cycle. Science. 2005, 307 (5710): 724-727. 10.1126/science.1105103.
Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Graf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, Parker A, Proctor G, Prlic A, Rae M, Rios D, Redmond S, Schuster M, Sealy I, Searle S, Severin J, Slater G, Smedley D, Smith J, Stabenau A, Stalker J, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Hubbard TJ: Ensembl 2006. Nucleic Acids Res. 2006, 34 (Database issue): D556-61. 10.1093/nar/gkj133.
The Gene Ontology (GO) project in 2006. Nucleic Acids Res. 2006, 34 (Database issue): D322-6. 10.1093/nar/gkj021.
Wrobel G, Chalmel F, Primig M: goCluster integrates statistical analysis and functional interpretation of microarray expression data. Bioinformatics. 2005, 21 (17): 3575-3577. 10.1093/bioinformatics/bti574.
Stabenau A, McVicker G, Melsopp C, Proctor G, Clamp M, Birney E: The Ensembl core software libraries. Genome Res. 2004, 14 (5): 929-933. 10.1101/gr.1857204.
Curwen V, Eyras E, Andrews TD, Clarke L, Mongin E, Searle SM, Clamp M: The Ensembl automatic gene annotation system. Genome Res. 2004, 14 (5): 942-950. 10.1101/gr.1858004.
Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, Clamp M: An overview of Ensembl. Genome Res. 2004, 14 (5): 925-928. 10.1101/gr.1860604.
Hermida L, Brachat S, Voegeli S, Philippsen P, Primig M: The Ashbya Genome Database (AGD)--a tool for the yeast community and genome biologists. Nucleic Acids Res. 2005, 33 (Database issue): D348-52. 10.1093/nar/gki009.
Hochwagen A, Wrobel G, Cartron M, Demougin P, Niederhauser-Wiederkehr C, Boselli MG, Primig M, Amon A: Novel response to microtubule perturbation in meiosis. Mol Cell Biol. 2005, 25 (11): 4767-4781. 10.1128/MCB.25.11.4767-4781.2005.
Williams RM, Primig M, Washburn BK, Winzeler EA, Bellis M, Sarrauste de Menthiere C, Davis RW, Esposito RE: The Ume6 regulon coordinates metabolic and meiotic gene expression in yeast. Proc Natl Acad Sci U S A. 2002, 99 (21): 13431-13436. 10.1073/pnas.202495299.
Bammler T, Beyer RP, Bhattacharya S, Boorman GA, Boyles A, Bradford BU, Bumgarner RE, Bushel PR, Chaturvedi K, Choi D, Cunningham ML, Deng S, Dressman HK, Fannin RD, Farin FM, Freedman JH, Fry RC, Harper A, Humble MC, Hurban P, Kavanagh TJ, Kaufmann WK, Kerr KF, Jing L, Lapidus JA, Lasarev MR, Li J, Li YJ, Lobenhofer EK, Lu X, Malek RL, Milton S, Nagalla SR, O'Malley J P, Palmer VS, Pattee P, Paules RS, Perou CM, Phillips K, Qin LX, Qiu Y, Quigley SD, Rodland M, Rusyn I, Samson LD, Schwartz DA, Shi Y, Shin JL, Sieber SO, Slifer S, Speer MC, Spencer PS, Sproles DI, Swenberg JA, Suk WA, Sullivan RC, Tian R, Tennant RW, Todd SA, Tucker CJ, Van Houten B, Weis BK, Xuan S, Zarbl H: Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods. 2005, 2 (5): 351-356. 10.1038/nmeth754.
Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martinez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, Yu W: Multiple-laboratory comparison of microarray platforms. Nat Methods. 2005, 2 (5): 345-350. 10.1038/nmeth756.
Wrobel G, Primig M: Mammalian male germ cells are fertile ground for expression profiling of sexual reproduction. Reproduction. 2005, 129 (1): 1-7. 10.1530/rep.1.00408.
Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W: PipMaker--a web server for aligning two genomic DNA sequences. Genome Res. 2000, 10 (4): 577-586. 10.1101/gr.10.4.577.
Kent WJ: BLAT--the BLAST-like alignment tool. Genome Res. 2002, 12 (4): 656-664. 10.1101/gr.229202. Article published online before March 2002.
Gil M, Dessimoz C, Gonnet GH: A dimensionless fit measure for phylogenetic distance trees. J Bioinform Comput Biol. 2005, 3 (6): 1429-1440. 10.1142/S0219720005001636.
Dessimoz C, Canarozzi G, Gil M, Marghadant D, Roth A, Schneider A, Gonnet GH: OMA, A Comprehensive, Automated Project for the Identification of Orthologs from Complete Genome Data: Introduction and First Achievements . Lecture Notes in Computer Science. 2005, Berlin / Heidelberg , Springer , Volume 3678/2005:
Wiederkehr C, Basavaraj R, Sarrauste de Menthiere C, Hermida L, Koch R, Schlecht U, Amon A, Brachat S, Breitenbach M, Briza P, Caburet S, Cherry M, Davis R, Deutschbauer A, Dickinson HG, Dumitrescu T, Fellous M, Goldman A, Grootegoed JA, Hawley R, Ishii R, Jegou B, Kaufman RJ, Klein F, Lamb N, Maro B, Nasmyth K, Nicolas A, Orr-Weaver T, Philippsen P, Pineau C, Rabitsch KP, Reinke V, Roest H, Saunders W, Schroder M, Schedl T, Siep M, Villeneuve A, Wolgemuth DJ, Yamamoto M, Zickler D, Esposito RE, Primig M: GermOnline, a cross-species community knowledgebase on germ cell differentiation. Nucleic Acids Res. 2004, 32 Database issue: D560-7. 10.1093/nar/gkh055.
Gattiker A, Niederhauser-Wiederkehr C, Moore J, Hermida L, Primig M: The Germ Online cross-species systems browser provides comprehensive information on genes and gene products relevant for sexual reproduction. Nucleic Acids Res. 2007
Hermida L, Schaad O, Demougin P, Descombes P, Primig M: MIMAS: an innovative tool for network-based high density oligonucleotide microarray data management and annotation. BMC Bioinformatics. 2006, 7: 190-10.1186/1471-2105-7-190.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL: GenBank. Nucleic Acids Res. 2006, 34 (Database issue): D16-20. 10.1093/nar/gkj157.
Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, Coulson R, Farne A, Lara GG, Holloway E, Kapushesky M, Lilja P, Mukherjee G, Oezcimen A, Rayner T, Rocca-Serra P, Sharma A, Sansone S, Brazma A: ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucleic Acids Res. 2005, 33 (Database issue): D553-5. 10.1093/nar/gki056.
We thank R. Poehlmann for IT support, V. Jongeneel and R. Fabbretti for hosting and maintaining the database at Vital-IT and J. Moore for critical reading of the manuscript. A. Gattiker is supported by the Swiss Institute of Bioinformatics.
AG developed the database, RR contributed microarray expression data, PD participated in array data production, FSD participated in genome annotation and GeneChip development, PP lead the genome annotation project, MP participated in the development of AGD 3.0' concept and wrote the paper. All authors read and approved the final manuscript.