Sequencing of mitochondrial genomes of nine Aspergillus and Penicillium species identifies mobile introns and accessory genes as main sources of genome size variability

Background The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated.
Results Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25–36 Kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus) contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing the Aspergillus systematics.
Conclusions The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population studies. Despite the conservation of the core genes, the mitochondrial genomes of Aspergillus and Penicillium species examined here exhibit a significant amount of interspecies variation. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired by horizontal gene transfer of mitochondrial plasmids and intron homing.


Background
The genera Aspergillus and Penicillium contain some of the most beneficial as well as the most harmful fungal species. They include many industrial producers of important antibiotics, enzymes and pharmaceuticals, which have brought about a massive transformational impact on human health [1]. Several other species are extensively used in production of foods or other useful compounds. Pathogenic fungi, on the other hand, can significantly affect livestock and crops, both as agents of infection and by means of contamination with mycotoxins, and they commonly cause allergic and occasionally life-threatening infections in humans [2]. Fungal infections in humans are notoriously difficult to diagnose and treat, especially in immunocompromised patients. To achieve a better understanding of their biology, nuclear genomes of several Aspergillus and Penicillium species have been sequenced, yet only a handful of mitochondrial genomes have been sequenced and annotated.
With a few notable exceptions such as Podospora anserina [3], Chaetomium thermophilum [4], and Rhizoctonia solani (S. Pakala, unpublished), fungal mitochondrial genomes are small, with an average size of 44,681 bp (based on the fungal mitochondrial genomes in the NCBI organelle database, January 2012). For a fraction of the cost required to sequence a nuclear genome, mitochondrial genomic sequences may provide vital clues into the evolution, population genetics, and biology of these fungi. Aside from energy metabolism, mutations in mitochondrial genes have been linked to cellular differentiation, cell death and senescence pathways, as well as drug resistance and hypovirulence [5][6][7][8][9]. The widespread uniparental inheritance and high copy number of these organelles make them promising markers for cost-effective species identification and for studying fungal population structure [10]. Mitochondrial DNA can be a rich source of novel genotyping markers due to the presence of highly mobile introns in many fungal mitochondria [11]. Finally, fungal mitochondria may serve as valuable experimental models for studies of human heart and muscle diseases linked to mitochondrial dysfunction [12]. Mitochondrial biology is gaining notice with the advent of so-called "three parent in vitro fertilization" as a means of producing disease-free children from a mother with an inherited mitochondrial genetic disease [13]. In Aspergillus fumigatus, the newly available ability to perform sexual crosses will potentially allow for the genetic analysis of mitochondrial mutations and their phenotypes.
Despite their importance, only a few complete Aspergillus and Penicillium mitochondrial genomes have been reported [14][15][16][17]. In A. fumigatus, the ratio of mitochondrial to nuclear genomes is 12:1, based on optical mapping [18]. Although mitochondrial sequence reads are generated in every eukaryotic genome sequencing project, most studies only report nuclear genomes. As a result, little is known about the mitochondrial genome organization in many important fungi, such as the human pathogen A. fumigatus and the penicillin producer P. chrysogenum, which impedes their functional studies. Here we report the complete sequence and annotation of mitochondrial genomes of six Aspergillus and three Penicillium species. The accompanying comparative analysis of these and related publicly available genomes provides insight into mitochondrial genome organization, distribution of group I introns and plasmid-encoded genes, and phylogenetic relationships among these fungi.

Results and discussion
Assembly and annotation of mitochondrial genomes The following mitochondrial genomic DNAs were sequenced, assembled, and/or annotated in this study: Aspergillus fumigatus AF293, Aspergillus fumigatus A1163, Aspergillus fumigatus 210, Aspergillus clavatus NRRL 1, Aspergillus oryzae RIB40, Aspergillus flavus NRRL 3357, Neosartorya fischeri NRRL 181 (teleomorph of Aspergillus fischerianus), Aspergillus terreus NIH 2624, Penicillium chrysogenum 54-1255, Penicillium marneffei ATCC 18224, and Talaromyces stipitatus ATCC 10500 (teleomorph of Penicillium stipitatum) ( Table 1 and Additional file 1). Additional strains in the process of being sequenced were also analyzed with respect to SNPs and DIPs. Sequence reads for A. fumigatus AF293, A. fumigatus A1163, A. clavatus, N. fischeri and P. chrysogenum were generated in the course of nuclear genome sequencing projects [18][19][20][21]. The A. oryzae mitochondrial genome [16] was re-annotated in this study to include protein coding genes. The A. terreus mitochondrial genome was assembled using Sanger reads obtained from GenBank [GenBank:AAJN00000000]. After being trimmed and rotated, mitochondrial sequences were processed through the standard J. Craig Venter Institute (JCVI) annotation pipeline to ensure annotation consistency.

Mitochondrial genome size variation and sources of genome expansion
The sequenced Aspergillus and Penicillium mitochondrial genomes showed remarkable variation in size, ranging from 24,658 bp to 36,351 bp (Table 1 and Figure 1). The differences in length can be primarily attributed to the number and length of the introns present in the genomes. For example, protein coding genes in the two smallest genomes, A. terreus and P. chrysogenum, do not contain introns, whereas the larger genomes contain one or more introns within ORFs. In contrast, the largest mitochondrial genome, T. stipitatus, contains a total of 11 introns. The presence and number of accessory genes also contribute to the larger size of some of the mitochondrial genomes. Repeat content within the analyzed species was determined to be insignificant (0–1% of the genomes, data not shown). Comparison of nuclear genome sizes shows no correlation with the mitochondrial DNA sizes (Table 1). Comparative analysis of 11 A. fumigatus strains showed little intraspecies variation in their mitochondrial DNA (Additional file 2). To identify single nucleotide polymorphisms (SNPs) and deletion insertion polymorphisms (DIPs), Illumina reads of 10 A. fumigatus strains, generated in the course of a genome sequencing study (Nierman, unpublished), were aligned against the AF293 mitochondrial DNA. The analysis showed few SNPs and no significant DIPs. In total, we identified 15 candidate SNPs. The number of SNPs in individual strains varied from zero (F15861 and F15767) to nine (AF210). Six SNPs were located in intergenic regions, and nine were located in coding regions including eight non-synonymous SNPs. Five of these eight non-synonymous SNPs were found in cob, cox1, nad1, and nad2 genes. A 1.1 kb deletion was detected in strain A1163 (30,696 bp) compared to strains AF293 (31,765 bp) and AF210 (31,762 bp). The deletion in A1163 was confirmed by PCR using primers flanking the missing region. In AF293 and AF210, the region contains an 843-bp open reading frame (AFUA_m0390, AFUD_m0390; hypothetical protein), which is missing in A1163. A putative homolog of this hypothetical protein (NFIA_m0370) is present in the closely-related N. fischeri mitochondrial genome, suggesting a recent gene loss in A. fumigatus A1163.
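The size and SNP figures quoted above can be cross-checked with simple arithmetic. The following is only a sanity-check sketch using the numbers reported in this section; the variable names are illustrative and not part of the original analysis.

```python
# Genome sizes (bp) and SNP tallies as reported in the text above.
sizes_bp = {"AF293": 31765, "AF210": 31762, "A1163": 30696}
orf_length_bp = 843  # ORF present in AF293/AF210 but missing in A1163

# Net size difference between A1163 and the two other strains.
deletion_vs_af293 = sizes_bp["AF293"] - sizes_bp["A1163"]
deletion_vs_af210 = sizes_bp["AF210"] - sizes_bp["A1163"]

# Candidate SNPs split by location.
snps = {"intergenic": 6, "coding": 9}
total_snps = sum(snps.values())

print(deletion_vs_af293, deletion_vs_af210, total_snps)
```

The net differences (1,069 bp and 1,066 bp) are consistent with the reported deletion of roughly 1.1 kb, which is large enough to span the 843-bp ORF, and the two SNP classes sum to the 15 candidate SNPs.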
Likewise, only minor differences were found between mitochondrial genomes of A. oryzae and A. flavus as well as between two strains of P. marneffei (ATCC 18224 and MP1). The P. marneffei genomes are 99.9% identical and their sizes differ only by 6 bp (Table 1). The genomes of A. flavus and A. oryzae differ in size by 3 bp and are also 99.9% identical. It should be noted that some of the strains analyzed were genetically modified during strain development. For example, the P. chrysogenum strain was developed as a result of an extensive strain improvement program, which may have affected its mitochondrial DNA as well [20].
Our results are consistent with previous studies of mitochondrial intraspecies polymorphism. With a few exceptions, most Pezizomycotina mitochondrial genomes show little variation. By contrast, Aspergillus japonicus, P. anserina, Neurospora crassa and some other fungi exhibit significant mitochondrial intraspecies polymorphism and genome size variation, which has been attributed to mobile introns [11]. Thus, population surveys based on RFLP have demonstrated the presence of different mitochondrial haplotypes in wild-type subpopulations of P. anserina [22].

Core mitochondrial genes
All sequenced Aspergillus and Penicillium mitochondrial genomes contain 14 core genes involved in oxidative phosphorylation, ATP synthesis and mitochondrial protein synthesis, all present on the forward strand (Additional file 3). In addition, these genomes carry a complete set of tRNAs, the small and large subunits of ribosomal RNA, and the mitochondrial ribosomal protein S5. Additional file 4 depicts the protein-coding genes and non-coding RNAs in the reference mitochondrial genome of A. fumigatus AF293. The core genes share a high level of sequence conservation (Additional files 5 and 6) and synteny. Figure 2 shows the conservation of gene order of the core genes in all the Aspergillus and most of the Penicillium mitochondrial genomes annotated in this study. The exception is the atp9 gene, which is located between nad2 and cob in P. marneffei ATCC 18224. In all the other genomes, atp9 lies between cox1 and nad3. The number of tRNA genes varies from 25 to 31 with no particular correlation between the number and mitochondrial or nuclear genome size (Table 1).
To explore the possibility that the core genes can provide new insights into Aspergillus evolution, we performed a phylogenetic analysis of 14 concatenated core proteins encoded by 13 Aspergillus and Penicillium mitochondrial genomes ( Figure 3). The obtained phylogenetic tree clusters the A. nidulans and A. niger nodes together with a bootstrap support of 83%. Notably, A. terreus and A. oryzae form another group with bootstrap support of 80%. The tree topology also indicates that P. chrysogenum is not closely related to P. marneffei and T. stipitatus. This finding is consistent with morphological observations [20,23,24]. The topology of the Maximum Parsimony based tree is consistent with that of the Maximum-Likelihood tree (data not shown).
We have also examined individual mitochondrial genes (cox1, cob, nad5) and rDNA internal-transcribed spacers (ITS), which have been proposed as markers for species identification and classification to complement common nuclear DNA markers [10]. Phylogenetic analysis based on these individual protein coding and non-coding genes did not yield phylogenies with >75% bootstrap support for any of the trees (data not shown) and/or did not have enough power to resolve the relationships between the species. The single-gene trees show incompatible topologies with each other and with the topology obtained from the 14 concatenated proteins (Figure 3). This confirms that individual mitochondrial genes are not suitable for species identification in Aspergillus species. As shown previously [10], the presence of introns makes cox1 and other mitochondrial genes poor candidates for building fungal phylogenies, although cox1 is often used in animal phylogenetic studies.
In contrast, our results suggest that the concatenated core mitochondrial proteins can be of value for species level phylogeny construction. Interestingly, the concatenated tree topology is identical to the topology obtained previously from over 100 concatenated nuclear proteins with identical number of introns among orthologs [20]. The A. terreus and A. oryzae grouping is also supported by previously published studies based on multilocus phylogenetic analysis of both nuclear ribosomal and protein-coding genes [25] and concatenated nuclear protein-coding genes [26,27]. However, the A. terreus and A. oryzae nodes form two separate groups in phylogenies based on only nuclear ribosomal genes [28]. Our analysis highlights the challenges still facing the Aspergillus systematics including the unresolved Aspergillus tree backbone.

Accessory mitochondrial genes
In addition to the core set, most mitochondrial genomes contain accessory genes. The most compact genome (A. terreus) contains only the core set of mitochondrial genes, while other genomes contain at least one accessory gene. Some accessory genes are shared by a subset of closely related mitochondrial genomes, but do not show sequence similarity to other genes in public databases. These genes have been annotated as 'hypothetical'. Notably, A. fumigatus strains AF293 and A1163 contain three hypothetical genes each, with >99% identity.
In contrast, two accessory genes, A. clavatus ACLA_m0040 and N. fischeri NFIA_m0030, share similarity with mitochondrial DNA and RNA polymerase genes from distantly related fungi. They are also the only two protein-coding genes on the reverse strand, and are located between cob and nad1 in their respective genomes. ACLA_m0040 has a potential frame shift and thus appears to be a pseudogene. Homologous DNA polymerase sequences are found within the main mitochondrial genome (as in Glomerella graminicola M1.001) or on linear mitochondrial plasmids (as in pAL2-1 of P. anserina, pClK1 of the phytopathogenic fungus Claviceps purpurea and pFP1 from Fusarium proliferatum). NFIA_m0030 is present as a C-terminal fragment (lacking a start codon) and is most similar to the RNA polymerase genes found on linear plasmids in Blumeria graminis f. sp. hordei (pBgh) and C. purpurea (pClK1).
The similarity to mitochondrial plasmid-encoded genes suggests that both A. clavatus and N. fischeri polymerase genes were acquired by horizontal gene transfer followed by integration of ancestral mitochondrial plasmids into mitochondrial DNA. Indeed, linear fungal mitochondrial plasmids typically encode DNA and RNA polymerases, while circular plasmids have a single gene for a DNA polymerase and a reverse transcriptase [29].
In several studies, plasmids integrated into mitochondrial DNA have been implicated in the mechanism underlying mycelial senescence in fungi [30][31][32].
Another set of accessory genes annotated in Aspergillus and Penicillium species encodes putative homing endonucleases. Although these genes (HEGs for "homing endonuclease genes") can be found within introns and intergenic regions, in this study all HEGs were associated with "homing" group I introns (see next section). Based on sequence conservation, all HEGs were assigned to either LAGLIDADG or GIY-YIG families. The association between HEGs and introns is found in many other species. Homing introns are considered highly mobile, invasive genetic elements common in fungi and plants, but they can be also found in some animals and prokaryotes. They propagate via the double-strand-break-repair pathway into the specific target sequence ("homing site"), which is recognized and cleaved by the endonuclease [33]. Some endonucleases can also function as maturases by facilitating self-splicing of introns [34,35]. HEGs themselves are considered mobile and are typically bounded by two halves of the homing site (15-45 bp). HEGs are considered selfish elements, but may also contribute to mitochondrial DNA integrity [36]. The variability of HEGs in Aspergillus and Penicillium species can be exploited to develop phylogenetic markers that might be useful for differentiation of various species. Variants of HEGs are now being engineered for targeted cleavage of genomic sequences with potential applications in biotechnology, medicine and agriculture [37].
Figure 3 Phylogenetic tree based on 14 concatenated core mitochondrial proteins from 13 genomes. Gibberella zeae was used as an outgroup. Branch lengths correspond to substitutions per site calculated using a Maximum Likelihood approach. Identical topology was predicted using the Maximum Parsimony approach.

Diversity, evolution and origin of mitochondrial group I intron insertions
Most fungal mitochondrial genomes sequenced to date contain one or more group I or group II introns. In the Pezizomycotina subphylum (to which all these species belong), the largest number of mitochondrial introns, a total of 33, have been documented for P. anserina [3], while Mycosphaerella graminicola is currently the only species identified that lacks mitochondrial introns entirely [38].
The Aspergillus and Penicillium species sequenced here also have variable intron distribution. Similar to previous observations, the number and insertion sites of these introns vary, even between closely related species, suggesting cyclical intron gain and loss through horizontal transfer. Our analysis shows that the variation in intron number is the primary source of difference in genome size (Figure 1). Thus, the T. stipitatus genome contains eleven introns, while A. terreus and P. chrysogenum genomes both contain only a single intron.
The mitochondrial DNA sequences analyzed here contain group I introns distributed between three protein coding genes and the large subunit ribosomal RNA (LSU) (Table 1). As observed in a variety of mitochondrial genomes [10], the cox1 gene contains the most variable number of introns. For the Aspergillus species, A. nidulans has three cox1 introns, A. clavatus and N. fischeri each have two, and A. fumigatus, A. flavus and A. oryzae contain a single intron. A. terreus is the only Aspergillus species in this study that does not contain an intron in the cox1 gene. The cox1 intron insertion sites also vary between the Aspergillus species. The cox1 intron insertion site in A. fumigatus is seen in the closely related N. fischeri genome and in the more distantly related P. marneffei and T. stipitatus genomes. The cox1 insertion site in A. flavus and A. oryzae is also present in A. clavatus.
Figure 4 Secondary structure of the mitochondrial LSU rRNA intron. This intron is present in all the mitochondrial genomes described here, with the sequence shown being from A. fumigatus. Grey highlighted residues are identical in all species. All of the introns contain a mitochondrial S5 protein ORF in the P8 stem-loop. The primary difference between the Aspergillus and Penicillium species is the presence of an additional stem-loop structure, P6a.1, that is present in the Aspergillus species (except A. nidulans), but absent in the Penicillium species. P. chrysogenum is the only species to contain an extended P9.1 region. 5'SS: 5' splice site; 3'SS: 3' splice site.
For the Penicillium species, P. marneffei and T. stipitatus contain seven and eight cox1 introns, respectively, while P. chrysogenum does not contain any cox1 introns. Between P. marneffei and T. stipitatus, six of the intron insertion sites are identical. All of the cox1 intron insertion sites have been previously observed in other distantly related fungi such as Saccharomyces cerevisiae, P. anserina and N. crassa [3,39,40].
The other two protein coding genes that contain group I introns are the cob and nad1 genes. A. nidulans and A. clavatus contain a common cob intron, while P. marneffei contains a single intron and T. stipitatus contains two cob introns. These two Penicillium species are also the only two in this study with an intron in the nad1 gene. The cox1, cob, and nad1 introns in all the species encode either a LAGLIDADG or GIY-YIG homing endonuclease (as discussed above).
By contrast, the one intron common to all the mitochondrial genomes described here is found at a single location in the LSU rRNA gene. These introns are closely related in secondary structure (Figure 4) with the only major difference between the genera being an additional stem-loop structure (P6c) present in all the Aspergillus introns (except A. nidulans), but absent in the Penicillium species. All Pezizomycotina mitochondria sequenced to date, with the exception of the Dothideomycetes (Phaeosphaeria nodorum and Mycosphaerella graminicola), contain a mitochondrial LSU rRNA intron inserted at this position. Intron insertions are also commonly found in the mitochondrial LSU gene in plants and other fungi.
Our results are consistent with two previously observed characteristics of Pezizomycotina mitochondrial LSU introns. First, these introns do not contain endonuclease ORFs, but do contain the mitochondrial ribosomal protein S5 ORF within the P8 stem-loop of the intron. In S. cerevisiae and other fungi, mitochondrial S5 is nuclear encoded and transported into the mitochondrial matrix [41] but is not encoded in the nuclear genome of any Pezizomycotina fungi sequenced to date. No studies have thoroughly examined the function of the mitochondrial S5 ORF, but decreased expression of mitochondrial S5 from the intron ORF in N. crassa leads to mitochondrial small ribosomal subunit assembly defects and decreased mitochondrial protein expression [42,43]. Second, though group I introns are normally considered self-splicing, the Pezizomycotina mitochondrial LSU introns tested to date cannot self-splice [44][45][46][47]. These introns require a mitochondrial tyrosyl-tRNA synthetase (TyrRS) as a structure-stabilizing splicing cofactor found only in Pezizomycotina [47].
The intron distribution in the genera described here suggests two distinct mechanisms of intron evolution within the lineage. The presence of homing endonucleases in all of the cox1, cob, and nad1 introns suggests they likely follow the previously proposed "omega" cycle of intron gain and loss [48]. In this model, horizontal transmission into an intronless site is promoted by a functional homing endonuclease, followed by endonuclease degradation, and eventually, intron loss. The cycle can then be restarted by a new horizontal transmission event. The sporadic distribution of introns in the protein coding genes analyzed here indicates that horizontal gene transfer may be quite common in the Aspergillus and Penicillium mitochondrial genomes. It also highlights the challenges associated with using cox1 and other mitochondrial genes to build species phylogenies.
The mitochondrial LSU intron likely follows a slightly different evolutionary trajectory. The widespread distribution of mitochondrial LSU introns in Pezizomycotina that contain the S5 ORF suggests that this intron insertion event occurred after the divergence from the yeasts, and became fixed within the lineages. Fixation was likely due to the selective advantage from the S5 gene, but also from accumulated mutations in the intron, which resulted in its dependence on the nuclear encoded mitochondrial TyrRS splicing factor. This idea is supported by the observation that the Dothideomycetes, the only Pezizomycotina group that lacks mitochondrial LSU introns, contain degraded adaptations of the mitochondrial TyrRS splicing factor necessary for intron splicing [46]. A plausible scenario is that this degeneration occurred after mitochondrial LSU intron loss. It remains unclear how the Dothideomycetes have compensated for the loss of the S5 protein [38,49].

Conclusions
We report here the complete sequence and annotation of mitochondrial genomes of six Aspergillus and three Penicillium species, which represent the two most significant genera among filamentous fungi. The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations. The accompanying comparative analysis of these and related publicly available genomes provides insight into genome organization and phylogenetic relationships among these organisms. By clustering together A. terreus and A. oryzae, the phylogenetic tree based on 14 concatenated core mitochondrial proteins has a different topology from some previously published single protein trees, but is similar to trees built using multiple nuclear proteins. This suggests that core genes in mitochondrial and nuclear genomes co-evolved in the Aspergillus lineage.
Despite the conservation of the core genes, mitochondrial genomes of Aspergillus and Penicillium species exhibit a significant amount of interspecies variation, consistent with experimental evidence for intraspecies horizontal transfer and recombination in mitochondrial DNA [50]. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired via swapping and integration of mitochondrial plasmids and intron homing followed by gene or intron loss. Annotated core and accessory genes can serve as complementary markers in future population genetics and evolution studies.

Sequencing and assembly
The following genomes were sequenced at JCVI using the Sanger technology: A. fumigatus AF293, A. fumigatus A1163, A. clavatus NRRL 1, A. flavus NRRL 3357, N. fischeri NRRL 181, P. chrysogenum 54-1255, P. marneffei ATCC 18224, and T. stipitatus ATCC 10500 (Additional file 1). The A. fumigatus AF210 genome was sequenced using a combination of the 454 GS FLX Titanium (Roche) and Illumina Genome Analyzer II (Illumina) instruments. The reads were assembled at JCVI using the Celera Assembler [51]. The P. chrysogenum mitochondrial genome was previously reported [20], but the sequence was not trimmed. We therefore trimmed, rotated, and re-annotated the original sequence to ensure annotation consistency. The A. terreus NIH 2624 genome was assembled at JCVI using Sanger reads or traces obtained from NCBI [GenBank:AAJN00000000], which were deposited by the Broad Institute. The A. nidulans FGSC A4 genome was assembled at JCVI using reads provided by the Broad Institute (with additional sequencing performed at JCVI). The remaining complete mitochondrial genome sequences used in this study were either obtained from NCBI (www.ncbi.nlm.nih.gov) or sequenced and assembled at JCVI (see Genome closure section below). The following mitochondrial genome sequences were obtained from NCBI:

Genome closure
Mitochondrial contigs were completed using available de novo contigs and/or mapping to mitochondrial reference genomes. Seed contigs from de novo assemblies of all genomic data were analyzed for high coverage and for the presence of common mitochondrial genes. Simultaneously, sequence reads were mapped to known references to generate a read set for de novo assembly. Resulting mitochondrial contigs were iteratively extended through recruiting sequence data by aligning to contig edges until no gaps remained. All contigs were manually examined for quality. All contained at least 2-fold high-quality coverage of every base and had evidence of circularity based on mate pairing and/or overlapping contig edges. The resulting scaffolds were trimmed and rotated to facilitate comparative analysis. The deletion in A. fumigatus A1163 was confirmed by PCR using primers flanking the missing region (Primers: AF1163C16842: 5'-ATTGTTCATTATTCTACAGTTAAGCC-3' and AF1163C17705: 5'-AATTAGTATCCTCATCTTCCTTAGG-3'). Annotated scaffolds have been deposited in NCBI [GenBank:JQ346807, GenBank:JQ346808, GenBank:JQ346809, GenBank:JQ354994, GenBank:JQ354995, GenBank:JQ354996, GenBank:JQ354997, GenBank:JQ354998, GenBank:JQ354999, GenBank:JQ355000 and GenBank:JQ355001].
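The trimming and rotation step can be illustrated with a minimal sketch. It assumes that "trimming" means removing the duplicated sequence where a circular contig's end overlaps its start, and that "rotation" means choosing a common start coordinate; both function names and the `min_overlap` threshold are hypothetical and do not reproduce the JCVI procedure.

```python
def trim_terminal_overlap(contig, min_overlap=20):
    """Drop the duplicated tail if the contig's end repeats its beginning.

    An exact terminal overlap of at least min_overlap bases is treated
    here as evidence of circularity (an assumption of this sketch).
    """
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return contig[:-k]
    return contig


def rotate_circular(seq, new_start):
    """Rotate a circular sequence so that 0-based index new_start comes first."""
    if not seq:
        return seq
    k = new_start % len(seq)
    return seq[k:] + seq[:k]


# Toy example: the last four bases repeat the first four.
circular = trim_terminal_overlap("AACCGGTTAACC", min_overlap=4)
rotated = rotate_circular(circular, 3)
```

Rotating every genome to begin at the same gene makes gene order directly comparable across species, which is presumably why the scaffolds were rotated before the synteny analysis.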

SNP and DIP identification in A. fumigatus mitochondrial DNA
SNPs and small DIPs were predicted using CLC Genomics Workbench from CLC Bio (www.clc-bio.com). Illumina reads from target strains were mapped to the reference A. fumigatus AF293 mitochondrial genome. The mapping parameters used were 0.9 for Length fraction and 0.9 for Similarity. Non-specific matches were ignored. Default cost parameters were used. To call SNPs and small DIPs, we used stringent parameters that were obtained from extensive manual evaluation of alignments. The following cut-offs were used in CLC to call SNPs: (i) read coverage of at least 10; and (ii) variant support from at least 99% of the reads. The quality of read alignments and the regions surrounding the called SNP locations were manually inspected. Low complexity regions were ignored. The SNPs that passed these filtering criteria were retained. MUMmer [52] and BLASTn [53] were used to check for the presence of large-scale rearrangements and insertions/deletions.
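The two cut-offs above amount to a simple per-site filter. The sketch below is not the CLC Genomics Workbench implementation; the `VariantCall` record and `passes_filters` function are hypothetical names used only to show how the coverage and read-support thresholds combine.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    position: int       # 1-based position on the reference genome
    coverage: int       # number of reads covering the position
    variant_reads: int  # number of reads supporting the variant allele


def passes_filters(call, min_coverage=10, min_percent=99):
    """Apply the two cut-offs: coverage >= 10 and >= 99% supporting reads."""
    if call.coverage < min_coverage:
        return False
    # Integer comparison avoids floating-point edge cases at exactly 99%.
    return call.variant_reads * 100 >= call.coverage * min_percent


# Toy calls: well supported, too shallow, and mixed support.
calls = [
    VariantCall(position=1200, coverage=250, variant_reads=250),
    VariantCall(position=5400, coverage=8, variant_reads=8),
    VariantCall(position=9100, coverage=300, variant_reads=280),
]
kept = [c.position for c in calls if passes_filters(c)]
```

Requiring near-unanimous read support is a natural choice for a high-copy organelle genome, since it excludes sites where only a minority of reads carry the variant.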

Repetitive regions identification
RepeatMasker (http://www.repeatmasker.org/) was used to check for the presence of high copy interspersed repeats and low complexity DNA sequences. PrintRepeats [54] (http://www.genome.ou.edu/miropeats.html) was used to identify low copy repeats.

Mitochondrial genome annotation
All mitochondrial genomes analyzed in this study were annotated at JCVI, except for A. niger, A. tubingensis, A. nidulans and P. marneffei MP1 (Additional file 1). Mitochondrial A. oryzae tRNA genes were obtained from NCBI. Open reading frames (ORFs) were identified using Artemis [55] with genetic code 4. Functional assignments were made based on sequence similarity to characterized fungal mitochondrial proteins using BLASTp searches against NCBI databases. ORFs containing more than 100 amino acids, and no sequence homology to known genes, were designated as hypothetical genes. tRNA genes were identified using tRNAscan-SE and ribosomal RNA genes were identified using BLASTn [53]. Core mitochondrial-encoded genes were identified by all-against-all comparison using BLASTp.
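The effect of genetic code 4 on ORF detection can be sketched as follows. In that code, TGA encodes tryptophan rather than a stop, so only TAA and TAG terminate an ORF. This is a simplified illustration, not the Artemis algorithm: it scans the forward strand only and, as an assumption of the sketch, accepts only ATG as a start codon, although code 4 permits alternative starts.

```python
STOP_CODONS = {"TAA", "TAG"}  # genetic code 4: TGA encodes Trp, not stop


def find_orfs(seq, min_aa=100):
    """Return (start, end, frame) for forward-strand ORFs longer than min_aa.

    Coordinates are 0-based and half-open; end includes the stop codon.
    Only ATG is treated as a start codon (a simplification of this sketch).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop
                if n_aa > min_aa:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# A 121-codon ORF that contains TGA codons throughout; under the standard
# code it would terminate at the first TGA.
long_orf = "ATG" + "TGA" * 120 + "TAA"
short_orf = "ATG" + "GCT" * 50 + "TAA"
hits = find_orfs(long_orf)
```

The 100-amino-acid threshold mirrors the cut-off used above to designate hypothetical genes; shorter ORFs are simply not reported by this function.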

Group I intron annotation
Approximate group I intron insertion boundaries were initially established as interruptions in the protein-coding genes and LSU rRNA gene identified through BLASTp and BLASTn searches. Precise 5' intron boundaries were determined by identifying the conserved U-G or C-G within the intron's P1 stem. The 3' intron boundaries were determined by identifying the G at the end of the intron and the ability for the downstream sequence to form the P10 guide sequence stem [47,56]. The identified boundaries were confirmed by tBLAST searches of the putative spliced products. The mitochondrial LSU group I intron secondary structures were constructed from the previously described A. nidulans secondary structure [47].
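The first step, locating approximate introns as interruptions between aligned gene segments, can be sketched as a gap computation over exon hit coordinates. The function name, the coordinate convention (1-based, inclusive, single strand) and the toy coordinates are assumptions of this sketch; precise boundaries still require the P1 and P10 inspection described above.

```python
def introns_from_exon_hits(exon_hits, min_gap=1):
    """Return approximate intron intervals as gaps between exon hits.

    exon_hits: list of (start, end) genomic intervals for one gene,
    1-based inclusive, all on the same strand (assumed by this sketch).
    """
    hits = sorted(exon_hits)
    introns = []
    for (_, end1), (start2, _) in zip(hits, hits[1:]):
        gap = start2 - end1 - 1
        if gap >= min_gap:
            introns.append((end1 + 1, start2 - 1))
    return introns


# Toy coordinates: one 1,200-bp interruption, then two contiguous hits.
approx = introns_from_exon_hits([(1501, 1800), (1, 300), (1801, 2100)])
```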

Synteny analysis of core genes
OrthoMCL [57] was used to identify the orthologous relationships between the 15 core protein-coding genes. The first step was an all-against-all BLASTp search with an expect value of 1e-05. This was followed by the MCL clustering algorithm using default parameters and the main inflation value (−I) set to 1.5. The orthologous clusters were displayed in Figure 2 using SynView [58].

Phylogenetic analysis
To generate phylogenetic trees, 14 core proteins encoded by 13 genomes were first concatenated and then aligned using Muscle [59]. Regions with poor alignments were removed with Gblocks using default settings [60]. Maximum-Likelihood (ML) trees were generated using the Randomized Axelerated Maximum Likelihood (RAxML) program [61]. Multiple ML trees were generated and the best-scoring tree was identified. One hundred bootstrapped trees were generated and used to assign the bootstrap support values to the best-scoring ML tree. The JTT amino acid substitution model was used with the Gamma model of rate heterogeneity. Gibberella zeae (anamorph Fusarium graminearum) was used as an outgroup. A Maximum Parsimony based tree was generated using the Protpars program of the PHYLIP package [62].
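The concatenation step that precedes alignment can be sketched as below. The function names, the input layout and the toy sequences are hypothetical; the essential point is that every genome contributes the same genes in the same order, so that aligned columns remain comparable across species.

```python
def concatenate_proteins(proteins_by_genome, gene_order):
    """Join each genome's proteins in a fixed gene order.

    proteins_by_genome: {genome_name: {gene_name: protein_sequence}}
    Raises KeyError if any genome lacks one of the requested genes.
    """
    concatenated = {}
    for genome, proteins in proteins_by_genome.items():
        missing = [g for g in gene_order if g not in proteins]
        if missing:
            raise KeyError(f"{genome} lacks: {', '.join(missing)}")
        concatenated[genome] = "".join(proteins[g] for g in gene_order)
    return concatenated


def to_fasta(records):
    """Format {name: sequence} as FASTA text for an external aligner."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Toy input with two genes; the real analysis used 14 core proteins.
toy = {
    "A_fumigatus": {"cox1": "MKW", "cob": "MRI"},
    "P_chrysogenum": {"cob": "MRL", "cox1": "MKF"},
}
joined = concatenate_proteins(toy, gene_order=["cox1", "cob"])
fasta_text = to_fasta(joined)
```

The resulting FASTA text would then be passed to an aligner such as Muscle, as described above.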