Recent dermatophyte divergence revealed by comparative and phylogenetic analysis of mitochondrial genomes

Background Dermatophytes are fungi that cause superficial infections of the skin, hair, and nails. They are the most common agents of fungal infections worldwide. Dermatophytic fungi constitute three genera, Trichophyton, Epidermophyton, and Microsporum, and the evolutionary relationships between these genera are epidemiologically important. Mitochondria are considered to be of monophyletic origin and mitochondrial sequences offer many advantages for phylogenetic studies. However, only one complete dermatophyte mitochondrial genome (E. floccosum) has previously been determined. Results The complete mitochondrial DNA sequences of five dermatophyte species, T. rubrum (26,985 bp), T. mentagrophytes (24,297 bp), T. ajelloi (28,530 bp), M. canis (23,943 bp) and M. nanum (24,105 bp), were determined and compared with the E. floccosum sequence. The mitochondrial genomes of all 6 species were found to harbor the same set of genes arranged in identical order, indicating that these dermatophytes are closely related. Genome size differences were largely due to variable lengths of non-coding intergenic regions and the presence/absence of introns. Phylogenetic analyses based on complete mitochondrial genomes reveal that the divergence of the dermatophyte clade was later than that of other groups of pathogenic fungi. Conclusion This is the first systematic comparative genomic study on dermatophytes, a highly conserved and recently-diverged lineage of ascomycota fungi. The data reported here provide a basis for further exploration of interrelationships between dermatophytes and will contribute to the study of mitochondrial evolution in higher fungi.

the most common species in hospital isolates (72-95%) [4,5]. Morbidity is less commonly associated with M. nanum, while T. ajelloi is a geophilic fungus that only rarely infects humans.
Mitochondria are generally accepted as descendants of endosymbiotic alpha-proteobacteria [6,7] and are considered to be of monophyletic origin [8][9][10]. As vital physiological processes and basic adaptive strategies do not always correlate with trees derived from ribosomal sequences [11,12], mitochondrial DNA (mtDNA) sequences have become a popular tool for phylogenetic studies. Individual gene sequences often contain a limited number of informative sites and can lead to incongruent phylogenetic trees. In contrast, entire mitochondrial genomes tend to produce reliable phylogenetic trees [11][12][13]. Despite the emergence of new technologies for rapid DNA sequence determination, sequencing of complete mtDNAs is still more feasible and economical than whole-genome sequencing. Furthermore, complete mtDNA sequences reveal gene content, order and position, and provide further information regarding introns and intergenic regions [10].
The number of mitochondrial genomes sequenced has increased greatly over the past decade, notably through the interdisciplinary collaboration of the Organelle Genome Megasequencing Program (http://megasun.bch.umontreal.ca/ogmp). Thousands of complete mtDNA sequences are already available from taxonomically diverse organisms including fungi, plants and animals. This resource provides unprecedented insights into the origin and evolution of the mitochondrial genome [13]. In comparison to the genomes of free-living alpha-proteobacteria, the number of genes contained within the modern mitochondrial genome has been greatly reduced. It is inferred that many previously functional genes have been transferred to the nucleus; others appear to have been replaced by pre-existing nuclear genes of similar function [14]. Moreover, recent studies have suggested that positive selection plays a role in mitochondrial evolution [15][16][17], while mtDNA polymorphisms are thought to be maintained within populations via selection on the joint mitochondrial-nuclear genotype [15].
Nevertheless, only one sequence is derived from a dermatophyte, i.e. E. floccosum [12]. Indeed, the dearth of publicly available genomic data is a major barrier to biomedical research on dermatophytes [23].
We report here complete mtDNA sequences for 5 dermatophytes including 3 species of Trichophyton (T. rubrum, T. mentagrophytes and T. ajelloi) and 2 species of Microsporum (M. canis and M. nanum). These sequences, with the previously reported E. floccosum mtDNA sequence [12], have permitted systematic comparative analysis of dermatophytes. The mitochondrial genomes of dermatophytes are highly conserved, indicating that these superficial fungi are closely related. Furthermore, phylogenetic analysis based on complete mitochondrial genomes has revealed that dermatophytes arose very late among the ascomycota fungi. This is the first comparative genomic study on dermatophytes and will provide valuable insights into the genomics and phylogeny of this important group of fungal pathogens.

General features
The mitochondrial sequences of T. rubrum, T. mentagrophytes, T. ajelloi, M. canis and M. nanum are circular DNAs of 26,985 bp, 24,297 bp, 28,530 bp, 23,943 bp and 24,105 bp, respectively (Fig. 1). The mitochondrial genome of M. canis is the smallest among the known dermatophytes, while the previously determined E. floccosum genome is the largest and exceeds 30 kb [12]. The differences in mitochondrial genome size are primarily due to interspecific variation in intergenic regions and introns (see below).
With the exception of T. rubrum and E. floccosum, which carry several additional hypothetical genes, each genome encodes 15 conserved protein-coding genes, 2 rRNA genes (rnl and rns), and 25 tRNA genes (Table 1, Fig. 1). The dermatophyte mitochondrial genomes are highly compact: structural genes account for >70% of each genome (Table 1). All genes are encoded on the same DNA strand. Moreover, all the mitochondrial genomes retain colinearity at the nucleotide level without detectable DNA rearrangement (Fig. 2). The high conservation of genome structure indicates that these dermatophytes are closely related.
The overall G+C content of the 5 genomes is ~24% (Table 1), consistent with the characteristic AT-rich nature of fungal mitochondrial genomes. The G+C content of genomic regions encoding RNA genes is usually higher than the genome average (Table 1), while the majority of protein-coding genes, with the exception of cox1, have a lower G+C content (Fig. 1). The unusual G+C content of cox1 may reflect its unusual location sandwiched between 2 tRNA gene clusters.
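The G+C comparison described above can be sketched in a few lines of Python. This is an illustration only: the function names and the toy sequence are assumptions, not part of the original analysis, and the only parameter taken from the text is the 1 kb window size given in the Figure 1 legend. Windows wrap around the end of the sequence because the genomes are circular.

```python
def gc_content(seq):
    """Fraction of G or C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=1000, step=1000):
    """G+C fraction in windows along a circular genome.

    Returns a list of (window_start, gc_fraction) tuples. Windows that run
    past the end of the sequence wrap around to its beginning.
    """
    n = len(seq)
    if window > n:
        raise ValueError("window longer than genome")
    doubled = seq + seq[:window]  # lets the last windows wrap around
    return [(start, gc_content(doubled[start:start + window]))
            for start in range(0, n, step)]


def classify_windows(seq, window=1000, step=1000):
    """Label each window as higher or lower than the genome-wide average."""
    mean_gc = gc_content(seq)
    return [(start, "higher" if gc > mean_gc else "lower")
            for start, gc in sliding_gc(seq, window, step)]


# Toy sequence (hypothetical, not a real mtDNA): AT-rich with one GC-rich patch.
toy = "AT" * 40 + "GC" * 10 + "AT" * 50
print(round(gc_content(toy), 2))
print(classify_windows(toy, window=20, step=20))
```

On a real genome the same functions would be called with the default 1 kb window; the labels "higher" and "lower" correspond to the two curve colours described in the figure legend.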
Synonymous base substitutions (dS) are considered to be selectively neutral, whereas substitutions that change the encoded amino acid, or non-synonymous substitutions (dN), are exposed to selection. The dN/dS ratio therefore affords an index of selective pressures operating on protein-coding genes. Because selection for adaptive amino acid substitutions increases dN, a dN/dS ratio of >1 is found in genes subject to selection for change (positive selection). In contrast, a dN/dS ratio of <1 indicates conservation of essential amino acid sequences (negative selection). We calculated the dN/dS ratio for each of the 15 dermatophyte protein-coding genes. This revealed that all genes are subject to strong negative selection; no statistically significant sites of positive selection were found (see Additional file 1).
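The interpretation of the dN/dS ratio can be illustrated with a short calculation. The sketch below is not the method used in this study (the ratios reported here were computed with the Datamonkey server); it shows only the final arithmetic of a simple counting approach, in which observed proportions of differences are corrected with the Jukes-Cantor formula before the ratio is taken. The function names and the input counts are hypothetical, and no statistical test is performed.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differing sites for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """dN/dS from counts of differences and sites (counting-method sketch)."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


def selection_regime(omega):
    """Map a dN/dS ratio onto the categories used in the text."""
    if omega > 1:
        return "positive"
    if omega < 1:
        return "negative"
    return "neutral"


# Hypothetical counts for one gene pair: few amino acid changes,
# many silent changes.
omega = dn_ds(nonsyn_diffs=5, nonsyn_sites=700, syn_diffs=30, syn_sites=300)
print(round(omega, 3), selection_regime(omega))
```

A ratio well below 1, as in this toy case, is the pattern the text describes for all 15 protein-coding genes.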
Figure 1. Circular representations of the dermatophyte mitochondrial genomes. Protein-coding genes are represented by purple arrows. rRNA and tRNA genes are indicated by green blocks and blue triangles, respectively. G+C contents with a window size of 1 kb are shown as yellow (higher than genome average) or orange (lower than average) curves. The inner scale is marked at 5 kb intervals.
Figure 2. Pairwise linear genome comparisons of dermatophyte mtDNAs.
The genomic organization of the 15 conserved protein-coding genes in mtDNA is surprisingly identical across all the known dermatophytes (Fig. 2). Furthermore, the same gene order is also largely conserved in other pathogenic filamentous fungi such as Penicillium marneffei and Aspergillus niger that cause invasive infections [12]. However, the order of these conserved genes is markedly rearranged in the mtDNA of the opportunistic pathogen Candida albicans [24] (Fig. 3). The similarity of mtDNA gene order between different fungal species was found to correlate with their evolutionary relationships inferred from phylogenetic analysis (see below).
The large (rnl) and small (rns) ribosomal RNA genes are present in all dermatophyte mitochondrial genomes at the same relative locations. The rps5 gene encodes a ribosomal protein that may play a role in maintaining the integrity of the mitochondrial genome [25]; this gene is present within the intron of rnl as expected. The intronic location of this ribosomal protein is therefore maintained in all known dermatophyte mitochondrial genomes.
The previously described nad4L-nad5 consecutive gene unit was observed in all 5 mitochondrial genomes. This gene pair is believed to be present in all ascomycota as well as the basidiomycota and zygomycota, but is interrupted by one or more genes in chytridiomycota. This may reflect early phylogenetic divergence of these fungi from the ascomycota [26]. Interestingly, the additional continuous gene unit, nad4L-nad5-nad2, characterized by one base overlap and no interruption in gene junctions, is shared by all dermatophyte mitochondrial genomes with the exception of the 2 earlier divergent species, M. canis and T. ajelloi (see phylogenetic section below). The arrangement of such uninterrupted blocks may reflect strong conservation during evolution from a common ancestor [27]; more dermatophytes may be expected to carry the consecutive nad4L-nad5-nad2 gene unit.
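Whether a gene unit such as nad4L-nad5 is interrupted by other genes can be checked directly from an annotated gene order. The following sketch assumes that each genome is represented as a list of gene names in map order on a single strand; the function name and the two example gene orders are hypothetical and do not reproduce any real genome. The check considers gene order only and does not examine base overlaps or spacers at gene junctions.

```python
def has_consecutive_unit(gene_order, unit):
    """True if `unit` occurs as an uninterrupted run in a circular gene order."""
    n, k = len(gene_order), len(unit)
    if k == 0 or k > n:
        return False
    # Append the first k-1 genes so that runs spanning the origin are found.
    wrapped = gene_order + gene_order[:k - 1]
    return any(wrapped[i:i + k] == unit for i in range(n))


# Hypothetical gene orders for illustration only.
intact = ["cox1", "atp9", "cox2", "nad4L", "nad5", "nad2", "cob"]
interrupted = ["cox1", "nad4L", "atp9", "nad5", "nad2", "cob"]

print(has_consecutive_unit(intact, ["nad4L", "nad5"]))          # pair present
print(has_consecutive_unit(intact, ["nad4L", "nad5", "nad2"]))  # triple present
print(has_consecutive_unit(interrupted, ["nad4L", "nad5"]))     # pair broken
```

Because the genomes are circular, the list is treated as wrapping around, so a unit that straddles the arbitrary start of the map is still detected.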

tRNAs and codon usage
All 5 mitochondrial genomes encode the same set of 25 tRNA species with the potential to deliver all 20 amino acids. Multiple tRNA isoacceptors exist only for leucine, serine, arginine and methionine. Interestingly, the organization of the 25 tRNA genes is conserved between all the mitochondrial genomes. The majority of tRNAs (23 of 25) are grouped into 3 clusters flanked by the atp8-rnl-cox1-atp9 gene cluster and containing 8, 3 and 12 tRNAs, respectively (Fig. 1). The exceptions are T. rubrum and E. floccosum, where the largest tRNA cluster between rnl and cox1 is interrupted and subdivided into two sub-clusters by a ~2.5 kb DNA fragment containing additional hypothetical genes (Fig. 2).
tRNA genes have been proposed to play a role in gene shuffling as they are located between protein-coding genes and can act as mobile elements [28]. Homology between isolated tRNA genes may permit genetic recombination, and therefore genetic rearrangement, while gene shuffling is less likely to take place via recombination between conserved clusters of tRNA genes [29,30]. The low number of isolated tRNA genes (2 of 25) may therefore contribute to the high conservation of dermatophyte mtDNA.
All mitochondrial protein-coding genes commence with a classical methionine codon (ATG) and terminate with TAA, the preferred fungal mitochondrial termination codon [31]. The only exception is atp9, where TGA is the translation stop codon. Table 2 summarizes codon usage for all protein-coding genes in the 5 mitochondrial genomes. The most frequently used codons are TTA (Leu), ATA (Ile), and TTT (Phe), accounting for about one third of all codons and indicating a clear preference for amino acids with nonpolar side chains. This is likely to reflect the fact that the majority of the encoded polypeptides are integral membrane proteins. As expected from the AT-rich nature of dermatophyte mtDNA, the third position of the most frequently used codons is strongly biased towards A or T (Table 2). Furthermore, the most rarely used codons, appearing no more than 5 times in all genomes (TGC, TGG, GGC, CTG, CTC, AGG, CGA, CCG, CCC and ACC), or absent (CGC, CGG, and ACG), all contain 2 or more G/C nucleotides in each codon.
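A codon usage tally of the kind summarized above is straightforward to compute from coding sequences. The sketch below is a minimal illustration: the function names and the short toy coding sequence are assumptions, and the stop codons to accept are passed in explicitly rather than being derived from a genetic code table.

```python
from collections import Counter


def codon_usage(cds):
    """Count codons in one coding sequence, read in frame from position 0."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length is not a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def third_position_at_fraction(usage):
    """Fraction of counted codons whose third base is A or T."""
    total = sum(usage.values())
    at3 = sum(n for codon, n in usage.items() if codon[2] in "AT")
    return at3 / total


def start_stop_ok(cds, stops=("TAA",)):
    """True if the CDS begins with ATG and ends with one of `stops`."""
    cds = cds.upper()
    return cds.startswith("ATG") and cds[-3:] in stops


# Toy CDS (hypothetical): ATG TTA TTA ATA TTT GGC TAA
toy_cds = "ATGTTATTAATATTTGGCTAA"
usage = codon_usage(toy_cds)
print(usage.most_common(3))
print(round(third_position_at_fraction(usage), 2))
print(start_stop_ok(toy_cds))
```

Summing the counters from every protein-coding gene in a genome (for example with `sum(counters, Counter())`) would give a genome-wide table, and a gene ending in TGA could be checked with `stops=("TAA", "TGA")`.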

Intergenic regions and introns
The dermatophyte genomes are highly compact with short intergenic regions. In M. canis there is no intergenic region >1 kb, while only one intergenic region exceeds 1 kb in the other genomes. Because protein-coding genes and RNAs are highly conserved between species, the differences in mitochondrial genome sizes between the different dermatophytes are largely explained by length variation in intergenic regions. The longest intergenic region in M. canis is only 919 nt while all other intergenic regions in this species are under 500 nt. In contrast, T. ajelloi mtDNA contains 4 long intergenic regions (>500 bp) and the longest (located at the nad2-cob gene junction) spans >2.5 kb (Fig. 1). This divergence in intergenic regions explains the >4 kb difference in the genome sizes of T. ajelloi and M. canis.
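The lengths of intergenic regions follow directly from gene coordinates. The sketch below assumes genes are given as (name, start, end) tuples with 1-based inclusive coordinates, sorted by start position on a circular genome; the function name and the toy coordinates are hypothetical. The final line uses the two genome sizes reported in this study to show the size difference between T. ajelloi and M. canis.

```python
def intergenic_lengths(genes, genome_length):
    """Lengths of gaps between consecutive genes on a circular genome.

    `genes` is a list of (name, start, end) with 1-based inclusive
    coordinates, sorted by start. Overlapping or abutting genes give 0.
    Returns a dict keyed by "left-right" junction name.
    """
    gaps = {}
    for i, (name, _start, end) in enumerate(genes):
        next_name, next_start, _next_end = genes[(i + 1) % len(genes)]
        if i == len(genes) - 1:
            next_start += genome_length  # last junction wraps past the origin
        gaps[f"{name}-{next_name}"] = max(0, next_start - end - 1)
    return gaps


# Toy annotation (hypothetical coordinates on a 10,000 bp circle).
toy_genes = [("cob", 101, 1300), ("nad3", 1801, 2600), ("nad1", 2700, 9900)]
gaps = intergenic_lengths(toy_genes, genome_length=10_000)
print(gaps)
print([junction for junction, length in gaps.items() if length > 400])

# Genome sizes reported in the text: T. ajelloi minus M. canis.
size_difference = 28_530 - 23_943
print(size_difference)
```

The difference of 4,587 bp is the ">4 kb" gap between the largest and smallest of the 5 newly sequenced genomes that the text attributes to intergenic regions.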
mtDNA intergenic regions diverge not only in size but also in sequence. Indeed, multiple pairwise alignment of the different genomes indicates that colinearities are interrupted almost exclusively in intergenic regions (Fig. 2). However, many intergenic regions were found to be conserved between T. rubrum and T. mentagrophytes, consistent with the close evolutionary relationship of these 2 species as revealed by phylogenetic analysis (Fig. 4).
Remarkably, the longest intergenic region in all genomes is located between cob and nad3 (with the exception of T. ajelloi; see above), but pairwise alignment revealed little similarity between the cob-nad3 intergenic sequences of the different species, as indicated by the blank space in the linear comparison view (Fig. 2). Such divergent intergenic regions may serve as potential genetic markers for species/strain identification of dermatophytes.
In contrast, all the dermatophyte mtDNA sequences determined here harbor a single intron (within the rnl gene), with the exception of T. ajelloi, which carries an additional group I intron (801 bp) within nad1 (Fig. 1). However, previous mtDNA sequence studies revealed that another strain of T. rubrum (IP1817.89) harbors 2 additional introns within nad1 and atp9, respectively [33]. Moreover, the E. floccosum mitochondrial genome carries 4 additional introns located within cox1, nad5 (2 introns) and cob [12]. The number of introns therefore varies between species and even between strains, and also contributes to variation in mitochondrial genome size.

Phylogenetic analysis
Broad phylogenetic trees of fungi were previously constructed based on rDNA [34][35][36] or nuclear protein-coding genes [37,38] but these studies did not permit the elucidation of higher-order relationships. A combination of 6 gene regions was recently employed to construct a fungal phylogenetic tree comprising ~200 species [39]. Unfortunately, no dermatophytes were included in this study. We therefore performed phylogenetic analysis based on the complete mitochondrial genomes of 35 species of ascomycota, including 6 dermatophytes, 12 other filamentous fungi, and 17 yeasts (Fig. 4). The high bootstrap values of most nodes indicate the robustness of the tree computed. Fungal species of ascomycota are clustered into 3 distinct groups corresponding to subphyla Pezizomycotina (filamentous fungi), Saccharomycotina (budding yeast) and Taphrinomycotina (fission yeast) respectively (Fig. 4). This confirmed the reliability of mtDNA sequences in fungal phylogenetic analysis.
Interestingly, the tree reported here divides the clade of filamentous fungi into 2 subgroups (Fig. 4). With only a few exceptions, the dermatophytes cluster together with invasive pathogenic fungi of humans and animals, while the other filamentous fungi, mostly pathogens of plants or insects, form a parallel branch (Fig. 4). This suggests that host adaptation has driven the evolution of filamentous fungi. Indeed, previous phylogenetic studies revealed separation between anthropophilic and geophilic species of Trichophyton [40] suggesting that ecology is a particularly strong driver of dermatophyte evolution [41].
In the tree established here all the dermatophyte species clustered into a single branch, confirming the monophyletic origin of the dermatophyte lineage. Aspergillus [42] and P. marneffei [43] comprise a separate branch that shares an immediate ancestor with the dermatophyte group (Fig. 4). However, the 2 sister branches of human pathogenic fungi (causing superficial and invasive infections, respectively) are represented by distinct patterns in the phylogenetic tree. The dermatophytes as a group show far less divergence but a longer ancestral branch than the Aspergillus-Penicillium clade (Fig. 4). This indicates that divergence from the latest common ancestor of the dermatophytes was later than that of the Aspergillus-Penicillium group.
Fossil evidence has allowed dating of the emergence of the ascomycota [44]. Based on this calibration, the dermatophyte lineage may be estimated to have diverged from other fungi about 32 million years ago (Ma). This result is consistent with a previous rough estimate (~50 Ma) based on nucleotide substitution rates in the small ribosomal subunit RNA [45]. However, the timing of the radiation of the dermatophytes is much later than the divergence of Candida and Saccharomyces at 723 Ma as previously estimated using 20-188 protein sequences [46]. The high conservation of the dermatophyte mitochondrial genome also suggests that the different dermatophytes diverged only recently.
Figure 4. Maximum likelihood phylogenetic tree based on concatenated mitochondrial proteins.
Conventional phenotypic taxonomy has divided the dermatophytes into 3 genera: Trichophyton, Microsporum and Epidermophyton [47]. Though only a limited number of dermatophyte species were included in the present study, the phylogenetic tree established here does not follow this genus demarcation (Fig. 4). Indeed, recent molecular phylogenetic studies have revealed that both Trichophyton and Microsporum are paraphyletic [48], prompting reevaluation of the phylogenetic relationships between different dermatophytes [41]. Remarkably, the divergence of T. ajelloi from the inferred common ancestor was much earlier than that of the other dermatophyte species (Fig. 4). This is consistent with the geophilic features of T. ajelloi: the soil environment may have afforded an early ecological niche for all dermatophyte species prior to more recent adaptation to specialized hosts including animals and humans. An earlier study based on 25S rRNA sequences reported that T. ajelloi and T. terrestre (not included in the present study) are separated from the 'true dermatophytes' [49], which further supports the suggestion that Microsporum, as well as the zoophilic and anthropophilic Trichophyton species, evolved from a geophilic member of Trichophyton [48].

Conclusion
Previous studies into the evolutionary relationships between dermatophyte species have been based on nuclear ribosomal internal transcribed spacers (ITS) [40,50,51], large ribosomal RNA subunits (LSU) [49], chitin synthase (CHS) [52][53][54] and DNA topoisomerase II genes [55], as well as on PCR fingerprinting [56] and restriction fragment length polymorphism (RFLP) [57] analysis of mitochondrial DNA. The dermatophytes were found to constitute a homogeneous group of species with low genetic diversity contrasting with high phenotypic heterogeneity [41,58]. We now report comparative analysis of 6 complete mitochondrial genomes from all 3 dermatophyte genera (Trichophyton, Microsporum and Epidermophyton). The composition and organization of genes within the mtDNAs of all dermatophytes analyzed were found to be substantially identical, reinforcing the view that dermatophytes are closely related and constitute a highly conserved lineage of filamentous fungi.
Comparative genomics provides a powerful tool for uncovering similarities and differences between species.
The present study represents the first application of systematic comparative genomics to dermatophyte phylogeny. The common features shared by all (or the majority) of dermatophyte mitochondrial genomes are as follows.
1. Retention of genome colinearity with high nucleotide sequence similarity (>90%) in coding regions; differences between species are largely restricted to introns and intergenic regions.
2. Strict conservation of the nad4L-nad5 gene unit; the consecutive nad4L-nad5-nad2 gene unit is also probably present in most dermatophytes.
3. The presence of 3 tRNA gene clusters of identical composition with 2 isolated tRNA genes at identical locations in all dermatophyte mtDNAs.
4. A characteristic rarity of intronic sequences compared to other fungal species.
Phylogenetic analysis has confirmed the monophyletic origin of dermatophytic fungi; these form a distinct clade among filamentous fungi. Compared with other pathogenic fungi such as those causing invasive infections, dermatophytes comprise a closely-related and recently-diverged lineage of ascomycota fungi. The genomic data presented here will allow further exploration of the relationships between different dermatophyte species and will be of general utility in the study of mitochondrial evolution in higher fungi.
Culture conditions and harvesting of mycelia were as described previously [59]. Mycelia were ground to powder in liquid nitrogen and samples were transferred to liquid nitrogen-cooled 15 ml Falcon tubes.
[64] comparison of translations with the GenBank non-redundant protein database and manual curation. Ribosomal RNA genes were identified by comparison with the published rRNA sequences of E. floccosum (GenBank accession: AY916130). Transfer RNA genes were identified using the tRNAscan-SE program [65].

Comparative genomics and phylogenetic analysis
Genomic comparisons of dermatophyte mtDNAs employed GenomeComp [66]. Orthologs between the mitochondrial genomes of T. rubrum and C. albicans were identified by bidirectional BLASTP comparisons. Fourteen of the 15 conserved proteins (excluding Rps5) were used for whole mitochondrial genome-based phylogenetic analysis of 18 filamentous fungi (including 6 dermatophytes) and 17 yeasts. The sequences of the selected proteins were extracted from the fungal mitochondrial genomes in the GenBank database. Protein sequence alignment was carried out for each individual protein using ClustalW [67]. Multi-alignments were then manually checked and trimmed with BioEdit (version 6.0, by Tom Hall, Department of Microbiology, North Carolina State University, Raleigh). The Datamonkey server was used to calculate the mean dN/dS values of protein-coding genes for dermatophytes [68].
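The bidirectional BLASTP step amounts to keeping only reciprocal best hits. The sketch below shows that logic on hit tables that have already been parsed into dictionaries; it does not run BLAST, and the function names, protein identifiers and bit scores are all hypothetical. Ties between equally scoring hits are not handled specially.

```python
def best_hit(hits):
    """Map each query to its top-scoring subject.

    `hits` is a dict: query -> list of (subject, bit_score) tuples.
    Queries with no hits are skipped.
    """
    return {query: max(subjects, key=lambda hit: hit[1])[0]
            for query, subjects in hits.items() if subjects}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical parsed BLASTP results (identifiers and scores are made up).
tr_vs_ca = {
    "Tr_cox1": [("Ca_cox1", 900.0), ("Ca_cox3", 40.0)],
    "Tr_nad2": [("Ca_nad2", 350.0)],
    "Tr_orf1": [("Ca_cob", 30.0)],
}
ca_vs_tr = {
    "Ca_cox1": [("Tr_cox1", 905.0)],
    "Ca_nad2": [("Tr_nad2", 340.0)],
    "Ca_cob": [("Tr_cob", 700.0), ("Tr_orf1", 28.0)],
}
print(reciprocal_best_hits(tr_vs_ca, ca_vs_tr))
```

In this toy case the hypothetical protein Tr_orf1 is dropped because its best hit, Ca_cob, has a better match elsewhere, which is the behaviour that makes the reciprocal criterion a conservative ortholog filter.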
The dataset, a concatenation of 14 proteins comprising 4,298 amino acids, was analyzed with TREE-PUZZLE [69] to construct the maximum likelihood (ML) tree. Before tree construction, ProtTest [70] was used to determine the best-fitting substitution model for the sequence data, and the WAG model was selected. Rate heterogeneity was modelled with a gamma distribution with 8 rate categories, and the α-parameter was estimated from the dataset. The reliability of the tree was assessed by bootstrapping. One hundred bootstrap replicate datasets were generated using the SEQBOOT program from the PHYLIP package (version 3.68, by Joe Felsenstein, Department of Genome Sciences, University of Washington, Seattle). For each of the 100 datasets an ML tree was constructed using the same parameters as described above. TREE-PUZZLE was then used with the 'consensus of user-defined trees' option to generate a consensus tree. Using the 400 Ma ascomycota fossil [44] as the primary calibration point, the date of dermatophyte divergence was estimated using MEGA 4.0 [71].
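The concatenation and bootstrap resampling steps can be sketched as follows. This is an illustration of the general idea of column resampling, not a reimplementation of SEQBOOT or of any other program named above; the function names, the taxon labels and the tiny alignments are hypothetical, and tree building itself is left out.

```python
import random


def concatenate(alignments, taxa):
    """Join per-protein alignments end to end for the given taxa.

    Each alignment is a dict: taxon -> aligned sequence (equal lengths).
    """
    for aln in alignments:
        if len({len(aln[t]) for t in taxa}) != 1:
            raise ValueError("sequences in an alignment differ in length")
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {taxon: "".join(seq[c] for c in cols)
            for taxon, seq in alignment.items()}


def bootstrap_datasets(alignment, n=100, seed=0):
    """Generate `n` replicates from one seeded random number generator."""
    rng = random.Random(seed)
    return [bootstrap_replicate(alignment, rng) for _ in range(n)]


# Hypothetical two-protein, three-taxon example.
taxa = ["sp1", "sp2", "sp3"]
protein_a = {"sp1": "MKLF", "sp2": "MKIF", "sp3": "MRLF"}
protein_b = {"sp1": "GW-A", "sp2": "GWSA", "sp3": "GFSA"}
supermatrix = concatenate([protein_a, protein_b], taxa)
replicates = bootstrap_datasets(supermatrix, n=100, seed=1)
print(supermatrix["sp1"], len(replicates), len(replicates[0]["sp1"]))
```

Each replicate keeps the original alignment length and preserves whole columns, so every taxon is resampled at the same positions; a tree would then be inferred from each replicate and the 100 trees summarized as a consensus.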