- Open Access
Predicting the fungal CUG codon translation with Bagheera
© Mühlhausen and Kollmar; licensee BioMed Central Ltd. 2014
- Received: 9 November 2013
- Accepted: 21 May 2014
- Published: 29 May 2014
Many eukaryotes have been shown to use alternative schemes to the universal genetic code. While most Saccharomycetes, including Saccharomyces cerevisiae, use the standard genetic code and translate the CUG codon as leucine, some yeasts, including many but not all of the “Candida”, translate the same codon as serine. It has been proposed that the change in codon identity was accomplished by an almost complete loss of the original CUG codons, making the CUG positions within the extant species highly discriminative for one translation scheme or the other.
In order to improve the prediction of genes in yeast species by providing the correct CUG decoding scheme we implemented a web server, called Bagheera, that allows determining the most probable CUG codon translation for a given transcriptome or genome assembly based on extensive reference data. As reference data we use 2071 manually assembled and annotated sequences from 38 cytoskeletal and motor proteins belonging to 79 yeast species. The web service includes a pipeline, which starts with predicting and aligning homologous genes to the reference data. CUG codon positions within the predicted genes are analysed with respect to amino acid similarity and CUG codon conservation in related species. In addition, the tRNACAG gene is predicted in genomic data and compared to known leu-tRNACAG and ser-tRNACAG genes. Bagheera can also be used to evaluate any mRNA and protein sequence data with the codon usage of the respective species. The usage of the system has been demonstrated by analysing six genomes not included in the reference data.
Gene prediction and consecutive comparison with reference data from other Saccharomycetes are sufficient to predict the most probable decoding scheme for CUG codons. This approach has been implemented into Bagheera (http://www.motorprotein.de/bagheera).
- Reference Data
- Codon Usage
- Gene Prediction
- Yeast Species
- Translation Scheme
It has long been known that many organisms show alterations to the universal genetic code [1, 2]. These codon reassignments could have happened under strong AT or GC pressure, which might lead to the complete disappearance of the reassigned codon, followed by a tRNA with a different amino acid identity taking over the decoding of the respective codon during its reappearance (“codon capture” theory). In a mutually exclusive scenario, the codon passes through a transitional state in which it is decoded ambiguously by two tRNAs (“ambiguous intermediate” theory). An example of the latter scenario is the reassignment of the CUG codon from leucine to serine in Candida yeasts [5–7], which cannot be accomplished by a single mutation in the anticodon of a serine tRNA. Indeed, Candida species contain a single tRNA with a CAG anticodon (Ser-tRNACAG).
The general timeline of the switch from leucine to serine decoding of the CUG codon in fungi has already been investigated. Briefly, the unusual Ser-tRNACAG appeared about 270 million years (Ma) ago. However, the genera Candida (CUG codes for serine) and Saccharomyces (CUG codes for leucine) separated from each other about 180 Ma ago, implying that the codon ambiguity remained for about 100 Ma in the ancestors of the yeasts [8, 9]. The ancestor of the Saccharomyces lost the mutant Ser-tRNACAG and retained the wild-type Leu-tRNACAG, while the ancestor of the Candida lost the Leu-tRNACAG and maintained the mutant Ser-tRNACAG, changing the identity of the CUG codon from leucine to serine. A whole genome comparison showed that only a minor fraction of the CUG codons present in Candida albicans have equivalent CUGs in Saccharomyces cerevisiae, implying that almost all original CUG codons disappeared in C. albicans.
However, the decoding cannot be derived unambiguously from the species names (e.g. “Candida” species exist all over the yeast tree [10–13]). In several taxonomically broad protein family analyses [14–16] we have observed that CUG positions are conserved within many of these sequences, and that many map to structurally conserved residues and can thus often be unambiguously assigned to either leucine (large hydrophobic residue, at alignment positions highly enriched in hydrophobic residues) or serine (small polar residue). These observations also suggest that the data can be used as reference for the assignment of the CUG codon translation to further sequenced yeast species in the future.
Here, we provide a tool with which it can be determined quickly and easily whether a yeast species uses the Standard Codon Usage or the Alternative Yeast Codon Usage (AYCU). The tool is suitable for data from both whole genome projects and transcriptome analyses. It is also intended to provide a reference page for species using the AYCU. In addition, the tool allows easy examination of the correct decoding of existing annotated genes by translating mRNA using the Standard or Alternative Yeast Codon Usage and by verifying the translation of a given protein sequence via gene reconstruction in the respective species. The tool closes an important gap in yeast research because the NCBI taxonomy does not reflect the latest phylogeny, and the assignment of the genetic code at the NCBI webpages is wrong for many species. For example, Lodderomyces elongisporus, Hyphopichia burtonii, Candida tenuis, and others are denoted as using the Standard Code instead of the AYCU, although use of the AYCU has been known for many years for e.g. Lodderomyces and C. tenuis.
The workflow for the verification of the CUG translation in a given protein sequence is shown in Figure 1B. Briefly, a gene reconstruction of the query protein in the selected species is performed with WebScipio. Optionally, the gene reconstruction can be performed with less stringent parameters, which include relaxed values for the parameters --minimal identity, --maximal mismatches and --minimal score to allow prediction of less similar genes. This option is very useful if the respective query gene has not been derived from one of the reference species but from a closely related one. The coding DNA obtained from the gene reconstruction is re-translated into protein sequence using the translation scheme of the selected species, which is already implemented in WebScipio.
For the translation of a provided mRNA into protein, the mRNA is first split into codons, which are then translated using the specified translation scheme. Extra nucleotides are ignored.
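The translation step described above can be illustrated with a minimal Python sketch. It is not the Bagheera implementation; the function name `translate` and the handling of unknown codons as `X` are assumptions made for illustration. The two tables differ only in the CUG codon, which is read as leucine in the Standard Codon Usage and as serine in the AYCU.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered by first, second, third base (U, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
# Alternative Yeast Codon Usage: identical, except that CUG is read as serine.
AYCU = {**STANDARD, "CUG": "S"}


def translate(mrna, table):
    """Split the sequence into codons and translate them with the given table.

    Trailing nucleotides that do not form a complete codon are ignored.
    Codons containing unknown characters are reported as 'X' (an assumption
    of this sketch).
    """
    seq = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    usable = len(seq) - len(seq) % 3
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, usable, 3))
```

For instance, `translate("AUGCUGUAAGG", STANDARD)` and `translate("AUGCUGUAAGG", AYCU)` differ only at the second residue, and the two trailing nucleotides are dropped in both cases.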
Identification and annotation of the reference data
Fungal actin and actin-related proteins, dynactin proteins, myosins, kinesins, dynein heavy chains, tubulins, actin-capping proteins CapZ, coronins and WASP homologs have been extracted from previously published datasets [14–16, 37]. The sequences were updated based on newer genome assemblies if necessary. The reference data for the other proteins and the species not included in the published datasets have essentially been obtained as described in . Briefly, the corresponding genes have been identified in TBLASTN searches starting with the respective protein sequences of the Saccharomyces cerevisiae homologs. The respective genomic regions were submitted to AUGUSTUS to obtain gene predictions. However, AUGUSTUS feature sets are only available for a few species of the Saccharomycetes clade. Therefore, all hits were subsequently analysed manually at the genomic DNA level. When necessary, gene predictions were corrected by comparison with the homologs already included in the multiple sequence alignments.
Reference tRNACAG genes were predicted with tRNAscan-SE in 45 yeast species. Intron regions were removed and tRNAs aligned manually.
All sequence-related data (protein names, corresponding species, sequences, and gene structure reconstructions) and references to genome sequencing centres are available at CyMoBase (http://www.cymobase.org). A list of the reference species and their abbreviations as used in the alignments and trees, as well as anamorph and alternative names, can be accessed through the web server and as Additional file 1. Additional file 1 also includes references to published genomes, and detailed information and acknowledgments of the respective sequencing centres. All gene structures for the reference dataset have been reconstructed with Scipio/WebScipio [36, 38].
Results and discussion
The first step in gene annotation is gene prediction, which can be done in a genome-wide scan or for single genes. Many gene prediction programs allow using different codon translation tables, but this option is not available in most of the gene prediction web interfaces. Even if it were, it would require the user to know a priori whether the target organism belongs to the species not using the standard codon table. The yeasts are especially confusing, as the “Candida” species are well known to use the AYCU in contrast to Saccharomyces cerevisiae. But this has only been shown for a few Candida species, including some of the most pathogenic, and many yeast species are called Candida although there is no monophyletic “Candida clade”. Codon decoding schemes cannot unambiguously be derived from single-gene studies because the respective gene might not contain the codon in question at all, or the respective amino acids might not be at meaningful positions. Meaningful positions would be those that are strongly conserved in evolution and therefore lie in the core of the proteins or at conserved binding interfaces at the surface. In the course of our continuous efforts in identifying and annotating cytoskeletal and motor proteins [14–16, 37] we have already assembled and annotated 2071 sequences from 18 protein families in 79 yeast species. Some of the data have already been used to evaluate the CUG encoding in 60 completely sequenced yeasts (Mühlhausen and Kollmar, unpublished data). Here, these data are used as reference dataset in a pipeline for the prediction of the CUG codon translation, which can be accessed by users through a web interface.
The reference data
Currently, CyMoBase [23, 24], a database for manually assembled and annotated cytoskeletal and motor proteins, contains 26 protein families with annotated proteins in 79 yeast species. All sub-families of these protein families, for example the α-, β-, and γ-tubulins, already existed in the last common ancestor of the opisthokonts or even the eukaryotes and are therefore treated as independent proteins. Not all protein families in CyMoBase have been analysed at the same depth, e.g. only two dynein light-intermediate chain proteins are available so far. Also, some sub-families like the class-17 myosins or the class-4 kinesins are only present in early diverging yeast species and not in e.g. Candida albicans or Saccharomyces cerevisiae. These proteins do not provide the necessary statistical basis and taxonomic sampling for a CUG prediction and were not included in the Bagheera reference data. Bagheera’s reference data thus consist of 18 protein families (38 independent proteins) comprising 2071 sequences. These data will increase in the future in the course of our continuous efforts in annotating cytoskeletal and motor proteins. Most of the reference proteins are considerably longer than the average yeast protein, for example the myosins (1100 to 2400 amino acids) and the dynein heavy chain proteins (about 4000 amino acids), and the reference data therefore comprise significantly more data than the sole numbers of proteins and sequences might imply (Additional files 1 and 2).
The web interface
Regarding the prediction of the most probable CUG translation in large-scale data, Bagheera does the following: i) The user provides genomic data, e.g. a genome assembly, transcriptome assembly or long-read EST data, or data from single to multiple gene analyses. ii) The tool predicts cytoskeletal and motor proteins and aligns the predicted sequences to the respective protein families. iii) The respective positions of the CUG codons of the predicted sequences are compared to the reference data. iv) The tool predicts tRNACAG and performs a sequence similarity search and sequence alignment with reference leu-tRNACAG and ser-tRNACAG genes.
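Step iii) can be illustrated with a minimal Python sketch. It is not Bagheera's actual scoring: the residue classes, the simple majority vote and the function names are assumptions chosen for illustration. For each CUG position in a predicted sequence, the residues found in the reference sequences at the same alignment column are inspected. Large hydrophobic residues count as support for leucine, small or polar residues as support for serine.

```python
from collections import Counter

# Assumed residue classes (illustrative only, not taken from Bagheera).
LARGE_HYDROPHOBIC = set("LIVMFWY")  # supports CUG = leucine
SMALL_OR_POLAR = set("STNQGA")      # supports CUG = serine


def vote_for_column(reference_column):
    """Return 'Leu', 'Ser' or None for one alignment column.

    reference_column is a string holding the reference residues aligned to
    one CUG position of the query; gaps ('-') are simply not counted.
    """
    counts = Counter(reference_column.upper())
    leu = sum(counts[aa] for aa in LARGE_HYDROPHOBIC)
    ser = sum(counts[aa] for aa in SMALL_OR_POLAR)
    if leu == ser:
        return None  # column is not informative
    return "Leu" if leu > ser else "Ser"


def predict_cug_translation(reference_columns):
    """Majority vote over all CUG positions of the predicted genes."""
    votes = Counter(v for v in map(vote_for_column, reference_columns) if v)
    if votes["Leu"] == votes["Ser"]:
        return "undetermined", votes
    scheme = "Standard" if votes["Leu"] > votes["Ser"] else "AYCU"
    return scheme, votes
```

A call such as `predict_cug_translation(["SSTA", "TSSN", "LLSSS"])` returns the scheme label together with the per-scheme vote counts, so that weakly supported predictions remain visible.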
The verification of the translation of a single sequence depends on the type of sequence provided. The translation of a given cDNA sequence is done as follows: i) The user provides an mRNA sequence and specifies the translation scheme. ii) The tool translates the mRNA into protein. For any given protein sequence the workflow is: i) The user provides a protein sequence and specifies the corresponding species. ii) The tool performs a gene reconstruction for the protein with WebScipio and re-translates the coding regions using the corresponding CUG translation scheme as provided by WebScipio. iii) CUG translations in the user-provided protein are compared to translations in the re-translated sequence.
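The comparison in the protein workflow can be sketched as follows. This is a hedged illustration, not the code behind the web server: it assumes that the coding DNA from the gene reconstruction is already available as a string and that the user-provided protein is collinear with it (no gaps or frameshifts). The function names `check_cug_positions` and `correct_protein` are hypothetical.

```python
def check_cug_positions(coding_dna, protein, cug_residue):
    """List CUG positions whose given residue differs from the expected one.

    coding_dna  -- coding sequence from the gene reconstruction (assumed input)
    protein     -- user-provided protein, assumed collinear with coding_dna
    cug_residue -- 'L' for the Standard Codon Usage, 'S' for the AYCU
    Returns tuples of (1-based protein position, given residue, expected residue).
    """
    dna = coding_dna.upper().replace("U", "T")
    mismatches = []
    for index, given in enumerate(protein.upper()):
        codon = dna[3 * index: 3 * index + 3]
        if codon == "CTG" and given != cug_residue:
            mismatches.append((index + 1, given, cug_residue))
    return mismatches


def correct_protein(coding_dna, protein, cug_residue):
    """Return the protein with all CUG positions set to the expected residue."""
    residues = list(protein.upper())
    for position, _given, expected in check_cug_positions(
        coding_dna, protein, cug_residue
    ):
        residues[position - 1] = expected
    return "".join(residues)
```

An empty list from `check_cug_positions` means that every CUG codon in the reconstructed gene agrees with the chosen scheme; otherwise `correct_protein` yields the consistently translated sequence.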
Care has to be taken when using transcriptome assembly or long-read EST data as input. Transcriptome data do not represent all coding regions of a species, which might lead to wrong assignment of predicted proteins to the reference data. For example, actin and the actin-related proteins are closely related, as are the α-, β-, and γ-tubulins and the members of the other multi-gene protein families. If only actin genes are present in the transcriptome data, these would be identified as closest homologs and aligned to all of the actin-related proteins of the reference data. While key residues for folding and ATP-binding are of course conserved between actin and all actin-related proteins, residues in loop regions are less conserved or not conserved at all. Within the same sub-family these regions could also contain valuable information, because loops might be conserved within the entire sub-family. When proteins from different sub-families are aligned, this information might at best be lost and in the worst case even lead to contradictory results.
In the first step of the prediction pipeline, homologs to the proteins of the reference data need to be identified. Here, one gene is predicted for every protein family present in the reference data using TBLASTN and AUGUSTUS-PPX [25, 26]. We chose BLAST as search algorithm because it is very fast and not as restrictive in terms of sequence homology as, for example, BLAT. To optimize the search and subsequent gene prediction the user can select one of the AUGUSTUS feature sets, which contain species-optimized parameters and are available for a number of yeast species. The reference proteins used in the BLAST search are taken from the species that was selected for the AUGUSTUS prediction. In most cases, the BLAST hits do not cover the entire genes but miss the N- and C-termini and low-complexity regions. In the latter case, and when genes are split into several exons, the search results in several BLAST hits belonging to the same gene. These partial hits are combined and extended in the 5' and 3' directions because AUGUSTUS gene predictions are significantly better when intergenic regions are included in the genomic regions.
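Combining partial hits and extending the resulting region can be sketched in a few lines of Python. The sketch assumes that all hits lie on the same contig and strand and are given as 1-based inclusive coordinates. The values of `max_gap` and `flank` are placeholders, not the settings used by Bagheera.

```python
def merge_hits(hits, max_gap=2000, flank=500, contig_length=None):
    """Combine partial hits into padded genomic regions.

    hits          -- iterable of (start, end) pairs on one contig and strand
    max_gap       -- hits closer than this are assigned to the same gene (assumed)
    flank         -- nucleotides added in the 5' and 3' directions (assumed)
    contig_length -- if given, regions are clipped to the end of the contig
    """
    regions = []
    for start, end in sorted((min(h), max(h)) for h in hits):
        if regions and start - regions[-1][1] <= max_gap:
            # Hit belongs to the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])

    padded = []
    for start, end in regions:
        new_start = max(1, start - flank)
        new_end = end + flank
        if contig_length is not None:
            new_end = min(contig_length, new_end)
        padded.append((new_start, new_end))
    return padded
```

The padded regions, rather than the raw hits, would then be passed on to gene prediction, so that some intergenic sequence is included on both sides.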
Prediction of the CUG codon translation
Prediction of the tRNACAG
Independent support for the proposed translation scheme is provided by tRNA prediction, which is performed with tRNAscan-SE. Subsequently, a BLAST search against reference leu-tRNACAG and ser-tRNACAG genes indicates the most probable identity of the tRNACAG in the query data. In addition, the predicted tRNACAG is aligned with the reference data for visual inspection.
Any given protein sequence can be checked for correct translation of CUG codons. This is an important option for all users who obtained protein sequences from databases that have not yet resolved the codon translation. For example, gene annotations on the homepages of species genome projects are often provided with non-uniform translations. All CUG codons are highlighted, and differences between the correct and the given translation are indicated. If the given translation is partially incorrect, the correctly translated protein sequence can be downloaded in FASTA format.
Table: Details of the CUG codon prediction in the genomes of six yeast species. Recoverable column headings: Proteins with CUG; CUG position conservation; Std codon usage. Species analysed: Candida bracarensis CBS 10154, Candida castellii CBS 4332, Candida maltosa Xu316, Candida nivariensis CBS 9983, Nakaseomyces bacillisporus CBS 7720, Nakaseomyces delphensis CBS 2170.
Limits of Bagheera
Possible limits of the tool are that the query genes might not contain CUG codons and that the database contains only 18 protein families with 38 independent proteins. However, the proteins of most families of the reference data, e.g. myosins and dyneins, are very long and each of these proteins contains at least a few CUG codons (Figure 3). A whole-genome sequence analysis will therefore always provide enough data for unambiguous assignment of the codon usage. Actins and tubulins also belong to the most widely used proteins for species phylogenies (e.g. recent analyses: [42–45]), and because of their high abundance in the cell it is highly likely that they are included in transcriptome assembly data and small-scale analyses. Although the presence of a leu-tRNACAG or ser-tRNACAG gene is a very strong indication for the Standard Codon Usage or the AYCU, respectively, these genes are often not present in the genomes (e.g. in Saccharomyces cerevisiae) or might contain extremely long introns of more than 250 nucleotides, hindering their identification and prediction.
With this software we demonstrated that the most probable codon translation scheme for a given yeast genome can be determined by predicting motor and cytoskeletal proteins and comparing them to reference data. In total, 2071 sequences from 38 proteins belonging to 79 yeast species were included in the reference data providing a two-fold basis for the prediction of the most probable translation scheme: the amino acid composition at CUG positions and the conservation of CUG positions. The presence of hydrophobic amino acids in the reference data suggests the translation of the predicted CUG codons as leucine, while polar and small amino acids suggest their translation as serine. In addition, matching of CUG codons in the predicted genes with CUG codons in the reference data provides further support for the standard or alternative yeast codon usage. This information was implemented into a CUG codon prediction pipeline accessible via a web server called Bagheera. The predictive power of this implementation was demonstrated by a case study of the genomes of six Saccharomycetes species. In addition, the webserver offers the possibility to verify the translation of the CUG codons in any given protein sequence. Moreover, the webserver can be used as reference for the translation scheme used by individual yeast species.
Project name: Bagheera – Predicting CUG codon translation in yeasts
Project home page: http://www.motorprotein.de/bagheera
Operating system: Platform independent
Programming language: Ruby
Licence: The source code for the web application and a command line tool can be obtained upon request and used under a Creative Commons License.
Any restrictions to use by non-academics: No.
We would like to thank Dr. Björn Hammesfahr for help with CyMoBase, Marcel Hellkamp for help with Lucullus, Fabian Meyer for help with Ajax, and Prof. Christian Griesinger for his continuous generous support. This project has been funded by grant KO 2251/13-1 of the Deutsche Forschungsgemeinschaft (DFG) and was partly supported by the Göttingen Graduate School of Neurosciences and Molecular Biosciences (DFG Grants GSC 226/1 and GSC 226/2).
- Jukes TH, Osawa S, Muto A, Lehman N: Evolution of anticodons: variations in the genetic code. Cold Spring Harb Symp Quant Biol. 1987, 52: 769-776. 10.1101/SQB.1987.052.01.086.
- Jukes TH, Osawa S: Evolutionary changes in the genetic code. Comp Biochem Physiol B. 1993, 106: 489-494. 10.1016/0300-9629(93)90243-W.
- Osawa S, Jukes TH: Codon reassignment (codon capture) in evolution. J Mol Evol. 1989, 28: 271-278. 10.1007/BF02103422.
- Schultz DW, Yarus M: Transfer RNA mutation and the malleability of the genetic code. J Mol Biol. 1994, 235: 1377-1380. 10.1006/jmbi.1994.1094.
- Ohama T, Suzuki T, Mori M, Osawa S, Ueda T, Watanabe K, Nakase T: Non-universal decoding of the leucine codon CUG in several Candida species. Nucleic Acids Res. 1993, 21: 4039-4045. 10.1093/nar/21.17.4039.
- Pesole G, Lotti M, Alberghina L, Saccone C: Evolutionary Origin of Nonuniversal Cug(ser) Codon in Some Candida Species as Inferred from a Molecular Phylogeny. Genetics. 1995, 141: 903-907.
- Sugita T, Nakase T: Non-universal usage of the leucine CUG codon and the molecular phylogeny of the genus Candida. Syst Appl Microbiol. 1999, 22: 79-86. 10.1016/S0723-2020(99)80030-7.
- Yokogawa T, Suzuki T, Ueda T, Mori M, Ohama T, Kuchino Y, Yoshinari S, Motoki I, Nishikawa K, Osawa S: Serine tRNA complementary to the nonuniversal serine codon CUG in Candida cylindracea: evolutionary implications. Proc Natl Acad Sci USA. 1992, 89: 7408-7411. 10.1073/pnas.89.16.7408.
- Massey SE, Moura G, Beltrão P, Almeida R, Garey JR, Tuite MF, Santos MAS: Comparative evolutionary genomics unveils the molecular mechanism of reassignment of the CTG codon in Candida spp. Genome Res. 2003, 13: 544-557. 10.1101/gr.811003.
- Kurtzman CP, Suzuki M: Phylogenetic analysis of ascomycete yeasts that form coenzyme Q-9 and the proposal of the new genera Babjeviella, Meyerozyma, Millerozyma, Priceomyces, and Scheffersomyces. Mycoscience. 2010, 51: 2-14. 10.1007/S10267-009-0011-5.
- Kurtzman CP: Phylogeny of the ascomycetous yeasts and the renaming of Pichia anomala to Wickerhamomyces anomalus. Antonie Van Leeuwenhoek. 2011, 99: 13-23. 10.1007/s10482-010-9505-6.
- Kurtzman C, Fell JW, Boekhout T: The Yeasts: A Taxonomic Study. 2011, London, Burlington, San Diego: Elsevier, 5
- Kurtzman CP, Robnett CJ: Relationships among genera of the Saccharomycotina (Ascomycota) from multigene phylogenetic analysis of type species. FEMS Yeast Res. 2013, 13: 23-33. 10.1111/1567-1364.12006.
- Odronitz F, Kollmar M: Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species. Genome Biol. 2007, 8: R196. 10.1186/gb-2007-8-9-r196.
- Eckert C, Hammesfahr B, Kollmar M: A holistic phylogeny of the coronin gene family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts protein function. BMC Evol Biol. 2011, 11: 268. 10.1186/1471-2148-11-268.
- Hammesfahr B, Kollmar M: Evolution of the eukaryotic dynactin complex, the activator of cytoplasmic dynein. BMC Evol Biol. 2012, 12: 95. 10.1186/1471-2148-12-95.
- Butler G, Rasmussen MD, Lin MF, Santos MAS, Sakthikumar S, Munro CA, Rheinbay E, Grabherr M, Forche A, Reedy JL, Agrafioti I, Arnaud MB, Bates S, Brown AJP, Brunke S, Costanzo MC, Fitzpatrick DA, de Groot PWJ, Harris D, Hoyer LL, Hube B, Klis FM, Kodira C, Lennard N, Logue ME, Martin R, Neiman AM, Nikolaou E, Quail MA, Quinn J, et al: Evolution of pathogenicity and sexual reproduction in eight Candida genomes. Nature. 2009, 459: 657-662. 10.1038/nature08064.
- Ruby Programming Language. http://www.ruby-lang.org/
- Ruby on Rails. http://rubyonrails.org
- PyBioMaps is a framework to manage and visualize scientific data in a browser. http://pypi.python.org/pypi/PyBioMaps
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinformatics. 2009, 10: 421. 10.1186/1471-2105-10-421.
- Odronitz F, Kollmar M: Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase). BMC Genomics. 2006, 7: 300. 10.1186/1471-2164-7-300.
- CyMoBase - a database for cytoskeletal and motor proteins. http://www.cymobase.org
- Stanke M, Waack S: Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003, 19 (2): ii215-ii225.
- Keller O, Kollmar M, Stanke M, Waack S: A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics. 2011, 27 (6): 757-763. 10.1093/bioinformatics/btr010.
- Katoh K, Frith MC: Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics. 2012, 28: 3144-3146. 10.1093/bioinformatics/bts578.
- Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970, 48: 443-453. 10.1016/0022-2836(70)90057-4.
- Gotoh O: An improved algorithm for matching biological sequences. J Mol Biol. 1982, 162: 705-708. 10.1016/0022-2836(82)90398-9.
- Smith TF, Waterman MS: Identification of common molecular subsequences. Journal of Molecular Biology. 1981, 147: 195-197. 10.1016/0022-2836(81)90087-5.
- Hirschberg DS: A linear space algorithm for computing maximal common subsequences. Commun ACM. 1975, 18: 341-343. 10.1145/360825.360861.
- Döring A, Weese D, Rausch T, Reinert K: SeqAn An efficient, generic C++ library for sequence analysis. BMC Bioinformatics. 2008, 9: 11. 10.1186/1471-2105-9-11.
- Talavera G, Castresana J: Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007, 56: 564-577. 10.1080/10635150701472164.
- Price MN, Dehal PS, Arkin AP: FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE. 2010, 5: e9490. 10.1371/journal.pone.0009490.
- Lowe TM, Eddy SR: tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucl Acids Res. 1997, 25: 0955-0964. 10.1093/nar/25.5.0955.
- Odronitz F, Pillmann H, Keller O, Waack S, Kollmar M: WebScipio: an online tool for the determination of gene structures using protein sequences. BMC Genomics. 2008, 9: 422. 10.1186/1471-2164-9-422.
- Kollmar M, Lbik D, Enge S: Evolution of the eukaryotic ARP2/3 activators of the WASP family: WASP, WAVE, WASH, and WHAMM, and the proposed new family members WAWH and WAML. BMC Res Notes. 2012, 5: 88. 10.1186/1756-0500-5-88.
- Keller O, Odronitz F, Stanke M, Kollmar M, Waack S: Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics. 2008, 9: 278. 10.1186/1471-2105-9-278.
- Kent WJ: BLAT–the BLAST-like alignment tool. Genome Res. 2002, 12: 656-664. 10.1101/gr.229202. Article published online before March 2002.
- Gabaldón T, Martin T, Marcet-Houben M, Durrens P, Bolotin-Fukuhara M, Lespinet O, Arnaise S, Boisnard S, Aguileta G, Atanasova R, Bouchier C, Couloux A, Creno S, Cruz JA, Devillers H, Enache-Angoulvant A, Guitard J, Jaouen L, Ma L, Marck C, Neuvéglise C, Pelletier E, Pinard A, Poulain J, Recoquillay J, Westhof E, Wincker P, Dujon B, Hennequin C, Fairhead C: Comparative genomics of emerging pathogens in the Candida glabrata clade. BMC Genomics. 2013, 14: 623. 10.1186/1471-2164-14-623.
- Sugiyama H, Ohkuma M, Masuda Y, Park SM, Ohta A, Takagi M: In vivo evidence for non-universal usage of the codon CUG in Candida maltosa. Yeast. 1995, 11: 43-52. 10.1002/yea.320110106.
- Tsui CKM, Daniel H-M, Robert V, Meyer W: Re-examining the phylogeny of clinically relevant Candida species and allied genera based on multigene analyses. FEMS Yeast Res. 2008, 8: 651-659. 10.1111/j.1567-1364.2007.00342.x.
- Sekimoto S, Rochon D, Long JE, Dee JM, Berbee ML: A multigene phylogeny of Olpidium and its implications for early fungal evolution. BMC Evol Biol. 2011, 11: 331. 10.1186/1471-2148-11-331.
- Verkley GJM, Quaedvlieg W, Shin H-D, Crous PW: A new approach to species delimitation in Septoria. Stud Mycol. 2013, 75: 213-305.
- Hoffmann K, Pawłowska J, Walther G, Wrzosek M, de Hoog GS, Benny GL, Kirk PM, Voigt K: The family structure of the Mucorales: a synoptic revision based on comprehensive multigene-genealogies. Persoonia. 2013, 30: 57-76. 10.3767/003158513X666259.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.