Genomic evidence for intraspecific hybridization in a clonal and extremely halotolerant yeast
© The Author(s). 2018
Received: 23 January 2018
Accepted: 2 May 2018
Published: 15 May 2018
The black yeast Hortaea werneckii (Dothideomycetes, Ascomycota) is one of the most extremely halotolerant fungi, capable of growth at NaCl concentrations close to saturation. Although dothideomycetous fungi are typically haploid, the reference H. werneckii strain has a diploid genome consisting of two subgenomes with a high level of heterozygosity.
To explain the origin of the H. werneckii diploid genome, we here report the genome sequencing of eleven strains isolated from different habitats and geographic locations. Comparison of nine diploid and two haploid strains showed that the reference genome was likely formed by hybridization between two haploids and not by endoreduplication as suggested previously. Results also support additional hybridization events in the evolutionary history of the investigated strains; however, exchange of genetic material in the species otherwise appears to be rare. Possible links between such unusual reproduction and the extremotolerance of H. werneckii remain to be investigated.
H. werneckii appears to be able to form persistent haploid as well as diploid strains, is capable of occasional hybridization between relatively heterozygous haploids, but is otherwise limited to clonal reproduction. The reported data and the first identification of haploid H. werneckii strains establish this species as a good model for studying the effects of ploidy and hybridization in an extremotolerant system unperturbed by frequent genetic recombination.
The black yeast Hortaea werneckii has been studied for more than two decades for its extreme halotolerance, which exceeds that of most other fungi. Sequencing of its genome revealed another intriguing trait of the species: a diploid genome consisting of two (nearly) complete, but highly heterozygous, subgenomes [1, 2]. Ploidy changes in fungi are often associated with environmental perturbations. Considering that genomes of species related to H. werneckii are typically haploid, a possible link between the unusual diploid genome of H. werneckii and its extremotolerant biology was proposed, but not investigated further. The divergence between the two H. werneckii subgenomes is relatively large, with only 89% of nucleotides conserved in regions that could be uniquely aligned between the two subgenomes. However, the expression of the paralogues was conserved to a high degree.
Reports of whole genome duplication in fungi remain scarce. They include examples from early diverging fungi [4, 5] and also from asco- and basidiomycetes. The best known and most studied example is the ancient duplication in Saccharomyces cerevisiae, which was recently reported to have occurred through an ancient hybridization. Interspecies hybrids are often asexual, but in the example of Zygosaccharomyces parabailii it was shown that hybrids can regain fertility by inactivating one mating-type locus.
Although a mating locus was identified in the reference H. werneckii genome, sexual reproduction has not yet been observed in this species. However, the once widespread belief that up to a fifth of fungal species are strictly clonal has been challenged by increasingly powerful genomic and population genetic/genomic analyses.
In the absence of a large population dataset and with only one available H. werneckii genome sequence, the origin of the reference genome duplication could not be investigated. It was also unknown whether the diploid genome is a general trait of the species or just a peculiarity of the reference strain. When diploidy was first confirmed by sequencing the reference genome [1, 2], two explanations of its formation were discussed: an endoreduplication and an intraspecific hybridization. Both MAT alleles of the reference strain identified at the time were MAT1–1 and the nucleotide sequence of the MAT genes was only 88.7% identical. Therefore endoreduplication was proposed as a preliminary hypothesis until further data became available. Here we tested this hypothesis by sequencing eleven H. werneckii genomes and using comparative genomics. However, rather than providing evidence for endoreduplication, our results indicate the existence of not only one, but several hybridization events in the evolution of the twelve compared genomes.
Table 1 Strains of Hortaea werneckii compared in this study
Ex culture collection strain number
Name in this study
Isolation habitat and location of sampling site
Hypersaline water; Sečovlje saltpans, Slovenia; reference genome 
Hypersaline water; Santa Pola saltpans, Spain
Soil on the sea coast; Namibia
Hypersaline water; Sečovlje saltpans, Slovenia
Human; Keratomycosis; Brazil
Human; Trichomycosis nigra; Italy
Deep sea water; Italy
Human; Tinea nigra; Portugal
Spider web; Atacama desert; Chile
Spider web; Atacama desert; Chile
Spider web; Atacama desert; Chile
Rock wall in a cave; Atacama desert; Chile
Table 2 Statistics of sequenced H. werneckii genomes
Genome assembly size (Mb)
Number of contigs
CDS total length (Mb)
CDS total length (% of genome)
Gene models (n)
Number of exons (n)
Exons per gene (average)
GC content (%)
As observed previously for the reference H. werneckii genome, high heterozygosity was also observed within the other diploid strains of the species. The proportion of diploid genomes that could be uniquely aligned between the two subgenomes ranged from 67.30% for genome E to 93.66% for genome G, while the share of identical nucleotides in the aligned regions was between 88.31 and 91.92% (Additional file 1: Table S3).
The possible evidence for sexual reproduction was analyzed through linkage disequilibrium. The index of association between the loci was calculated in ten randomly sampled subsets of 1000 SNPs. The null hypothesis of absent linkage (related to sexual reproduction) was rejected for all subsets with p < 0.001, indicating that H. werneckii is clonal (Additional file 1: Figure S6).
Sequencing of eleven genomes of the black yeast Hortaea werneckii provides us with new insights into the evolutionary history of the unusual duplicated genomes of this species.
Reference-guided assembly and annotation of the genomes showed that nine of them were similar in size (the average difference from the reference genome size, 49.9 Mbp, was less than 3%; Sinha et al.), number of predicted proteins, GC content and other characteristics (Table 2). The assemblies and predicted proteomes can be considered almost complete (Additional file 1: Table S1). In contrast, genomes of strains C and D were clear outliers. The genome assembly size and gene count of these two strains are half those of the reference strain and the other strains. Also, while in other strains more than 60% of Benchmarking Universal Single-Copy Ortholog genes/proteins were found in two copies, no duplicated genes/proteins were found in the assembled genomes C and D and their predicted proteomes (Additional file 1: Table S1). The strains C and D were thus considered to be haploid, and this classification was supported by all subsequent analyses. When reads were mapped to the reference genome, two thirds of the reads from eight strains could not be uniquely aligned to a single locus (Additional file 1: Table S2). This can be explained by the diploid nature of the reference genome and the resulting failure of the mapping algorithm to map reads to only one of the duplicated loci. However, in the case of three strains (B-D) the proportion of uniquely mapping reads was much higher, reflecting the closer phylogenetic relatedness of these strains to the reference strain (Fig. 2).
Additional insight into the relationship between the haploid and diploid strain genomes was provided by mapping the reads of the haploid genomes (C and D) against assembled genomes of diploid strains, resulting in three mapping patterns (Fig. 1, Additional file 1: Figures S1–S3): (i) Reads from each haploid strain mapping to a different subgenome within the genomes A and B, indicating that these diploid genomes are hybrids of ancestors of strains C and D, as also demonstrated by the alignment of assembled genomes C and D to the reference genome A (Additional file 1: Figure S4); (ii) Reads from haploid strains mapping with higher affinity to the same half of the genomes of strains E and F due to a closer relatedness of both C and D to one of the subgenomes within strains E and F (Fig. 3); (iii) Relatively homogeneous mapping throughout the genomes, likely due to a larger (and equal) phylogenetic distance between the genomes of strains C and D and both subgenomes of each of the diploid strains G-L.
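The reasoning used to separate haploid from diploid assemblies can be sketched as a simple rule on the share of duplicated BUSCO orthologs. This is only an illustration: the function names, the 0.5 cutoff and the example counts are assumptions made for this sketch and are not values taken from the study, which reports only that diploid strains had more than 60% duplicated orthologs and strains C and D had none.

```python
def duplicated_share(duplicated: int, total: int) -> float:
    """Fraction of searched single-copy orthologs that were found in two copies."""
    if total <= 0:
        raise ValueError("total number of orthologs must be positive")
    return duplicated / total


def ploidy_call(duplicated: int, total: int, cutoff: float = 0.5) -> str:
    """Label an assembly by its duplicated share (cutoff is a hypothetical choice)."""
    share = duplicated_share(duplicated, total)
    return "diploid-like" if share >= cutoff else "haploid-like"


# Made-up counts for illustration only
print(ploidy_call(duplicated=65, total=100))
print(ploidy_call(duplicated=0, total=100))
```

A rule of this kind is only a first indication; in the study the classification was additionally supported by assembly size, gene count and read mapping.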
2. Genomes of strains G-L are a result of hybridization events distinct from the event producing genomes of strains A and B. While a more detailed investigation of the evolutionary history of these genomes will be difficult to perform unless their corresponding haploid genomes are identified, phylogenies of genes representing the duplicated genomes (Fig. 3, Additional file 1: Figure S5) can be best explained by six hybridizations in addition to the one producing the reference genome (Fig. 5a). One of the hybridization parents of strains E and F was much more closely related to the haploid genomes C and D, explaining the preferential mapping of these haploids to only approximately half of the genomes of strains E and F (Fig. 1). The level of heterozygosity within the diploid strains matched these inferred phylogenetic distances between the subgenomes, which were shortest in the genome G and largest in strains E and F (Additional file 1: Table S3).
3. Although comparative genomics points to several hybridization events in the evolution of the analyzed H. werneckii strains, recombination between different phylogenetic lineages within the species appears to be rare. Apart from one hybridization event there appears to have been no major exchange of genetic material in the phylogenetic lineages connecting genomes A and B on the one hand and the genomes C and D on the other hand, despite the large geographical and/or phylogenetic distances between these strains (Fig. 5). Any such exchange would disrupt the almost perfect separation of the reference genome into two subgenomes, each more closely related to either strain C or strain D. Interestingly, few to no large-scale losses within the subgenomes occurred after hybridization, each still representing approximately half of the duplicated genome (as noted already by Sinha et al.). The high degree of concordance between gene trees (Fig. 3, Additional file 1: Figure S5) also points to little (if any) exchange of genetic material between the investigated phylogenetic lineages (apart from the few hybridization events) and fulfils the “strong phylogenetic signal” criterion for clonality. Clonality is also indicated by the linkage disequilibrium analysis, which rejected the null hypothesis of sexuality in H. werneckii.
A largely clonal evolutionary history of H. werneckii interspersed by occasional hybridization events is well described by the concept of restricted recombination, which claims that in clonal species recombination is not necessarily completely absent, but is rather rare enough not to disrupt the prevalent clonal population structure pattern. Considering the prevalence of mating loci in H. werneckii genomes (Fig. 4) it is difficult to explain the apparent absence of sexual reproduction. The presence of a mating locus, however, is not sufficient for mating. Many restrictions to recombination are possible, both extrinsic and intrinsic, from limited dispersal and population bottlenecks, skewing the balance of mating types (H. werneckii can be found in sea water, but is by far the most abundant in hypersaline environments, which are often well-isolated from each other) to hybrid incompatibility. Also, the limited (and thus potentially biased) selection of strains sequenced in this study might not have been sufficient to detect recombination. For example, the yeast Cryptococcus gattii can appear clonal if sampled locally, but is recombining on a global scale.
When the reference H. werneckii genome was first sequenced, the mating locus was identified as heterothallic. However, here we show that in all strains the mating locus was at least initially homothallic, although MAT1–2 may have degenerated into a pseudogene in strains A, B, D and G and, due to poor conservation, could not be identified as such in the original investigation of the reference genome. This does not exclude the possibility of mating as the mechanism of hybridization in the lineage leading to genomes A and B. In both haploids MAT1–1 is well conserved and haploid C also has a conserved MAT1–2. Even if only MAT1–1 remained functional in the ancestors of strains A and B, sexual reproduction between strains of the same mating type (unisexuality) has been observed in fungi before. Another possible mechanism of hybridization is parasexuality, that is, fusion of vegetative cells, in this case not followed by a mitotic or meiotic haploidization, yielding a persistent diploid.
Several questions related to the diploid H. werneckii genome remain to be addressed by further sampling, sequencing and studies of physiology. Can diploid strains revert to haploids? Considering the strong evidence for clonality and the lack of an obvious mechanism that could split a diploid genome back into parent haploid genomes, this does not appear to happen often, if at all. An alternative option is the persistence of haploid strains on the evolutionary scale, and occasional hybridization of these haploids into persistent diploids. Additionally, it is not clear how ancient the hybridization events are. Strains I-L, isolated from the same cave in Atacama, share one subgenome but not the other, suggesting that at least these hybridizations might have occurred relatively recently in a geographically limited area, with one of the parents in common.
Finally, if, as it appears here, the hybridization events do not significantly contribute to recombination in the species, does the unusual ploidy of H. werneckii play a role in its extremotolerant physiology? The osmotolerant Pichia sorbitophila is a known example of a species formed by (in this case interspecies) hybridization. Our initial investigations have not identified any substantial physiological differences between haploid and diploid strains (unpublished data), but further research is needed to either confirm or reject these preliminary results.
Comparative genomics of eleven strains of the black yeast H. werneckii, in the past extensively studied for its extreme halotolerance, showed that the majority of investigated strains were diploid, arising from several hybridization events between relatively heterozygous ancestors in a species that otherwise appears to be limited to clonal reproduction. This, together with the first identification of haploid H. werneckii strains, establishes the species as a good (extremotolerant) fungal model for studying the effects of ploidy and hybridization on the evolution of the genome in a system largely unperturbed by sexual reproduction and genetic recombination between the phylogenetic lineages.
Cultures, media and growth conditions
Eleven Hortaea werneckii strains collected from various habitats around the world (Table 1) were obtained from the Ex Culture Collection of the Department of Biology, Biotechnical Faculty, University of Ljubljana (Slovenia). They were selected to represent different parts of the intraspecific phylogeny as estimated by the standard phylogenetic markers (data not shown) and with no prior knowledge about their ploidy. Cultivation of biomass for the isolation of DNA was performed in the standard chemically defined medium Yeast Nitrogen Base (YNB, Qbiogene), with 0.5% ammonium sulphate (w/v) and 2% glucose (w/v). The pH was adjusted to 7.0 prior to autoclaving. For solid media, 2% agar (w/v) was added. All cultures were grown at 24 °C. Liquid cultures were grown on a rotary shaker at 180 rpm. Cells were harvested in the mid-exponential growth phase (OD600 = 0.8–1.0) by centrifugation (10 min at 5000×g); the pellet was frozen in liquid nitrogen and kept at −80 °C until DNA isolation.
DNA for sequencing was isolated from the prepared biomass. The frozen pellet was first homogenized using a pestle and mortar. 100 mg of the homogenate was transferred to 2 ml microcentrifuge tubes, each with one stainless steel ball, placed in holders pre-cooled with liquid nitrogen and additionally homogenized in Retsch Mixer Mill 301 (Thermo Fisher Scientific, USA) at 20 Hz for 1 min. 300 μl MicroBead Solution buffer was added and the mixture was completely thawed on ice. This homogenate was then used for DNA extraction using the UltraClean Microbial DNA isolation kit (MO BIO Laboratories, USA) according to the manufacturer's instructions. Contaminating RNA was removed with RNAse A (Thermo Fisher Scientific, USA). The quantity, purity and integrity of the isolated DNA were evaluated by agarose electrophoresis, spectrophotometrically with NanoDrop 2000 (Thermo Fisher Scientific, USA) and by Qubit fluorometry (Thermo Fisher Scientific, USA).
The sequencing was performed by GATC Biotech AG (Germany) on Genome Sequencer Illumina HiSeq with 2× 150 bp Nextera libraries in a multiplexed mode. The resulting output was demultiplexed, the quality was checked with FastQC and the reads were trimmed for adaptors and quality (Q20 threshold) with the bbduk script (https://jgi.doe.gov/data-and-tools/bbtools/).
Sequencing reads, assembly and annotation data have been deposited in Genbank under the BioProject PRJNA428320 and in Open Science Framework (https://doi.org/10.17605/OSF.IO/HQWXG).
Mapping of reads to non-haploid genomes is a non-trivial task that has to be optimised on a case-by-case basis. Mapping with bwa mem to the published reference genome of H. werneckii (GenBank MUNK00000000.1) was tested here with different combinations of parameter values: (i) default values; (ii) discarding reads mapping to more than one locus (using options “-c” and “-r”, followed by removing all mappings flagged with “XA:Z:” or “SA:Z:”); all of this was combined with different thresholds for the lowest reported alignment score (option “-T”: 30 (default), 60, 90, 95).
After genome C was determined to be haploid, the mapping was repeated with this genome used as the reference. Due to the lack of comparable reads, genome A was excluded from this analysis. Mapped reads were sorted with samtools 1.6, and duplicates were marked with picard 2.10.2. Variant calling was performed with Genome Analysis Toolkit 3.8 according to “GATK Best Practices” with the “hard filtering” option. Ploidy was set to 2 for diploid strains and 1 for haploids. The most stringent filtering criterion was depth of coverage. Aligning highly heterozygous diploid genomes to the haploid reference is associated with a risk of aligning only reads of one subgenome, but not the other (due to excessive dissimilarity of these reads from the reference), resulting in underestimation of the heterozygosity. Thus only reads with a high depth of coverage were used for the subsequent analyses (Additional file 1: Figure S7).
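The post-mapping filter described in (ii) can be sketched as a small SAM text filter. This is a minimal illustration under stated assumptions, not the script used in the study: the function name `unique_alignments` is hypothetical, and the score threshold is applied here to the `AS:i:` tag after mapping, whereas the study set it with the bwa mem option “-T” during mapping.

```python
from typing import Iterable, Iterator


def unique_alignments(sam_lines: Iterable[str], min_score: int = 30) -> Iterator[str]:
    """Yield header lines and alignments without XA:Z:/SA:Z: tags and with AS >= min_score."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep SAM header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        tags = fields[11:]  # optional tags follow the 11 mandatory SAM columns
        # XA:Z: lists alternative hits, SA:Z: marks supplementary (chimeric) alignments
        if any(tag.startswith(("XA:Z:", "SA:Z:")) for tag in tags):
            continue
        score = next((int(tag[5:]) for tag in tags if tag.startswith("AS:i:")), None)
        if score is None or score < min_score:
            continue  # records without an alignment score (e.g. unmapped reads) are dropped
        yield line
```

In practice the function could be fed the lines of an uncompressed SAM stream and its output written back to a file before sorting; reading from and writing to BAM would require an additional library.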
Assembly and annotation
The genomes were assembled with IDBA-Hybrid 1.1.3 with the published H. werneckii genome used as a reference to guide the assembly process. The maximum k value selected was 180, minimum support in each iteration was 2, similarity for alignment 0.95, seed kmer 20, maximum allowed gap in the reference was 100 and the minimum size of contigs was 500.
Annotation of protein-coding and tRNA genes was performed with MAKER 2.31.8. The fungal subset of the Swissprot database (retrieved on 19. 7. 2017) and the published predicted proteome of H. werneckii were used as evidence. Three ab initio gene predictors were used in the MAKER pipeline. Semi-HMM-based Nucleic Acid Parser (SNAP) was bootstrap-trained within MAKER based on the gene models derived from the alignment of the protein datasets to the genome as recommended by Campbell et al. (2014). GeneMark-ES (Lomsadze et al., 2014) was self-trained and Augustus was used with the training parameters for Neurospora crassa.
The completeness of the genome assembly and gene prediction was evaluated with the Benchmarking Universal Single-Copy Orthologs (BUSCO 3) software in genomic and proteomic modes, using the dataset for fungi. The genomic mode was used with Augustus trained on the genome of Neurospora crassa. All other parameters were left at default values.
Pairwise alignments of genomes A, C and D (discarding all contigs shorter than 25 kbp) were calculated with the nucmer algorithm, as implemented in Mummer 3.23, and plotted with the mummerplot utility as described by Hane et al.
The differences between the subgenomes of diploid H. werneckii strains (as a measure of heterozygosity within the genomes) were calculated by separating the diploid genomes into two subgenomes as described by Sinha et al. and pairwise aligning the subgenomes with the nucmer algorithm of Mummer 3.23 using anchor matches unique in both the reference and query. The alignment was summarized with the show-coords algorithm (Mummer 3.23) using default parameters and the resulting table was analyzed with R to count the proportion of the genome covered by the alignments and the average share of identical nucleotides in the alignments.
A phylogenetic network was reconstructed from SNP data called using genome C as the reference (as described above). The dissimilarity distance matrix was calculated by the R package poppr and used to construct the phylogenetic network with the Neighbor-Net algorithm as implemented in the R package phangorn [18, 30].
Gene phylogenetic trees were constructed from predicted coding sequences of all genomes sequenced here and the reference genome. First, BLAST clustering (1e-40 e-value threshold) and analysis of alignments (with 80% identical nucleotides threshold) were used to identify CDSs existing in exactly two copies in diploid genomes and in one copy in haploid genomes, using the stand-alone BLAST+ 2.7.1 and processing the results with a custom script. Sequences from each resulting CDS cluster were aligned with MAFFT 7.215 with the “--auto” option and default parameters, and the alignment was optimized with Gblocks 0.91 using options “-b3=10 -b4=3 -b5=n”. The alignment was used for the reconstruction of phylogeny with PhyML 3.1 if it was longer than 200 nucleotides and contained on average at least 15 nucleotide differences between the gene pairs. The Hasegawa-Kishino-Yano (HKY85) nucleotide substitution model was used, and the alpha parameter of the gamma distribution of substitution rate categories and the proportion of invariable sites were estimated by PhyML. Finally, the trees were sorted into clusters of trees with similar topology measured by the normalized Robinson-Foulds distance calculated by the ETE Toolkit 3.1.1; the minimum similarity within the cluster was 0.80. The largest clusters of trees were visualized with DensiTree 2.2.5 and a strict consensus tree was calculated for each cluster with the consensus_tree.py script in QIIME, using only nodes occurring more than 50% of the time.
Phylogenies of RNA polymerase II and beta tubulin genes were estimated by automatically aligning the nucleotide sequences with MAFFT, estimating the custom model of nucleotide substitution with jModelTest 2.1.10 and generating the phylogenetic tree with PhyML 3.1. The alpha parameter of the gamma distribution of substitution rate categories and the proportion of invariable sites were estimated by PhyML. Branch supports were estimated with aLRT as Chi2-based supports.
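The two summary statistics derived from the alignment table can be sketched as follows. This is an illustrative Python version rather than the R code used in the study; the function name, the input layout (contig, start, end, percent identity) and the choice of a length-weighted mean identity are assumptions made for this sketch.

```python
from collections import defaultdict


def summarize_alignments(alignments, genome_length):
    """Return (proportion of genome covered, length-weighted mean % identity).

    alignments: iterable of (contig, start, end, percent_identity), 1-based inclusive.
    """
    by_contig = defaultdict(list)
    aligned_bases = 0
    identical_bases = 0.0
    for contig, start, end, identity in alignments:
        lo, hi = min(start, end), max(start, end)  # reverse-strand hits have start > end
        by_contig[contig].append((lo, hi))
        length = hi - lo + 1
        aligned_bases += length
        identical_bases += length * identity / 100.0
    if aligned_bases == 0:
        raise ValueError("no alignments given")

    covered = 0
    for intervals in by_contig.values():
        intervals.sort()
        cur_lo, cur_hi = intervals[0]
        for lo, hi in intervals[1:]:
            if lo <= cur_hi + 1:  # overlapping or adjacent: extend the current block
                cur_hi = max(cur_hi, hi)
            else:
                covered += cur_hi - cur_lo + 1
                cur_lo, cur_hi = lo, hi
        covered += cur_hi - cur_lo + 1

    return covered / genome_length, 100.0 * identical_bases / aligned_bases
```

Merging overlapping intervals before counting avoids counting a position twice when several alignments cover it, which would otherwise inflate the covered proportion.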
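The idea of a dissimilarity distance matrix computed from SNP genotypes can be illustrated with a minimal sketch. It is a simplification and not the poppr implementation: each strain is represented by one genotype symbol per locus, and the distance is the proportion of loci at which two strains differ. The function name and the toy genotypes are hypothetical.

```python
from itertools import combinations


def dissimilarity_matrix(genotypes):
    """Pairwise proportion of loci with different genotypes.

    genotypes: dict mapping strain name to a sequence with one genotype per locus.
    """
    names = sorted(genotypes)
    n_loci = len(genotypes[names[0]])
    if n_loci == 0 or any(len(genotypes[name]) != n_loci for name in names):
        raise ValueError("all strains need the same non-zero number of loci")
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        differing = sum(1 for x, y in zip(genotypes[a], genotypes[b]) if x != y)
        dist[a][b] = dist[b][a] = differing / n_loci
    return dist


# Toy genotypes at four SNP loci (made up for illustration)
matrix = dissimilarity_matrix({"C": "AAGT", "D": "AAGC", "X": "TTGC"})
print(matrix["C"]["D"], matrix["C"]["X"], matrix["D"]["X"])
```

A matrix of this kind is the input that a network method such as Neighbor-Net takes; the network construction itself is not shown here.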
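The criterion for passing an alignment on to tree reconstruction can be sketched as below. The sketch assumes that "on average at least 15 nucleotide differences between the gene pairs" means the mean number of differing columns over all pairs of sequences in the alignment; this interpretation, like the function names, is an assumption and not a description of the custom script used in the study.

```python
from itertools import combinations


def mean_pairwise_differences(alignment):
    """Mean number of differing columns over all pairs of aligned sequences."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs)
    return total / len(pairs)


def informative_enough(alignment, min_length=200, min_mean_diff=15):
    """True if the alignment is longer than min_length and sufficiently variable."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences in an alignment must have equal length")
    return length > min_length and mean_pairwise_differences(alignment) >= min_mean_diff
```

Filtering of this kind removes alignments that are too short or too conserved to resolve the relationships between the subgenomes.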
Linkage disequilibrium and mating type loci
Linkage disequilibrium was estimated by calculating the index of association r̄d using the package poppr in R [18, 29]. The index was calculated on ten datasets, each containing 1000 randomly sampled SNPs. The p-value for the rejection of the null hypothesis (that the loci are not linked and the population is sexual) was estimated with 999 permutations of each dataset.
Mating genes were identified by BLAST searches against the assembled H. werneckii genomes and predicted proteomes, using homologues from other dothideomycetous fungi as queries. Annotated genomes were used to identify the flanking genes. The function of the resulting predicted proteins was inferred by BLAST comparison with the most similar proteins in the GenBank database.
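The logic of the test can be illustrated with a small sketch of the standardized index of association and its permutation test. It is a simplified haploid version written for illustration, not the poppr implementation: per-locus distances between two individuals are 0 or 1, variances are taken over all pairs of individuals, and alleles are shuffled independently within each locus to simulate free recombination. Function names, the random seed and the toy data are assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def rbar_d(genotypes):
    """Standardized index of association for haploid genotypes (one allele per locus)."""
    pairs = list(combinations(range(len(genotypes)), 2))
    n_loci = len(genotypes[0])
    # d[j][p] is 1 if the two individuals of pair p differ at locus j
    d = [[int(genotypes[a][j] != genotypes[b][j]) for a, b in pairs] for j in range(n_loci)]
    locus_vars = [_variance(dj) for dj in d]
    observed = _variance([sum(column) for column in zip(*d)])  # variance of total distances
    expected = sum(locus_vars)  # expected variance if loci were independent
    denom = 2 * sum(sqrt(vj * vk) for vj, vk in combinations(locus_vars, 2))
    if denom == 0:
        raise ValueError("need at least two polymorphic loci")
    return (observed - expected) / denom


def permutation_p_value(genotypes, n_perm=999, seed=0):
    """One-sided p-value for the null hypothesis of no linkage between loci."""
    rng = random.Random(seed)
    observed = rbar_d(genotypes)
    columns = [list(column) for column in zip(*genotypes)]
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(column, len(column)) for column in columns]
        if rbar_d(list(zip(*shuffled))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy clonal population: two genotypes, five loci in complete linkage
clones = ["AAAAA"] * 5 + ["TTTTT"] * 5
print(permutation_p_value(clones, n_perm=199, seed=1))
```

A value of the index close to zero is expected under free recombination, while values significantly above the permuted distribution indicate linkage between loci and thus clonality.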
The authors acknowledge the financial support from the Slovenian Research Agency to the Infrastructural Centre Mycosmo (MRIC UL) and to the programs P1–0170 and P1–0207. This research was also funded by the Ministry of Higher Education, Science and Technology of the Republic of Slovenia, as a Young Researcher grant to JZ (grant no. 382228–1/2013).
Availability of data and materials
Sequence data generated and analysed during the current study are available in Genbank under the BioProject with the accession code PRJNA428320 and in Open Science Framework (https://doi.org/10.17605/OSF.IO/HQWXG or ARK c7605/osf.io/hqwxg).
CG analyzed and interpreted the data and drafted the manuscript. JES contributed to the analysis and interpretation of data. JZ and PZ cultivated the fungal strains and prepared the material for sequencing. NGC and PZ contributed to conception of the study. All authors contributed to design of the experiments, revised the manuscript draft and approved the final version of the manuscript.
Ethics approval and consent to participate
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Lenassi M, Gostinčar C, Jackman S, Turk M, Sadowski I, Nislow C, et al. Whole genome duplication and enrichment of metal cation transporters revealed by de novo genome sequencing of extremely halotolerant black yeast Hortaea werneckii. PLoS One. 2013;8:e71328.
2. Sinha S, Flibotte S, Neira M, Formby S, Plemenitaš A, Gunde-Cimerman N, et al. Insight into the recent genome duplication of the halophilic yeast Hortaea werneckii: combining an improved genome with gene expression and chromatin structure. G3-Genes Genomes Genet. 2017;7:2015–22.
3. Todd RT, Forche A, Selmecki A. Ploidy variation in fungi: polyploidy, aneuploidy, and genome evolution. Microbiol Spectr. 2017;5.
4. Ma LJ, Ibrahim AS, Skory C, Grabherr MG, Burger G, Butler M, et al. Genomic analysis of the basal lineage fungus Rhizopus oryzae reveals a whole-genome duplication. PLoS Genet. 2009;5.
5. Corrochano LM, Kuo A, Marcet-Houben M, Polaino S, Salamov A, Villalobos-Escobedo JM, et al. Expansion of signal transduction pathways in fungi by whole-genome duplication. Curr Biol. 2016;26:1577–84.
6. Albertin W, Marullo P. Polyploidy in fungi: evolution after whole-genome duplication. Proc R Soc B-Biological Sci. 2012;279:2497–509.
7. Marcet-Houben M, Gabaldón T. Beyond the whole-genome duplication: phylogenetic evidence for an ancient interspecies hybridization in the baker’s yeast lineage. PLoS Biol. 2015;13.
8. Ortiz-Merino RA, Kuanyshev N, Braun-Galleani S, Byrne KP, Porro D, Branduardi P, et al. Evolutionary restoration of fertility in an interspecies hybrid yeast, by whole-genome duplication after a failed mating-type switch. PLoS Biol. 2017;15.
9. Taylor JW, Hann-Soden C, Branco S, Sylvain I, Ellison CE. Clonal reproduction in fungi. Proc Natl Acad Sci. 2015;112:8901–8.
10. Tibayrenc M, Ayala FJ. Reproductive clonality of pathogens: a perspective on pathogenic viruses, bacteria, fungi, and parasitic protozoa. Proc Natl Acad Sci. 2012;109:E3305–13.
11. Engelthaler DM, Hicks ND, Gillece JD, Roe CC, Schupp JM, Driebe EM, et al. Cryptococcus gattii in North American Pacific Northwest: whole-population genome analysis provides insights into species evolution and dispersal. MBio. 2014;5.
12. Phadke SS, Feretzaki M, Clancey SA, Mueller O, Heitman J. Unisexual reproduction of Cryptococcus gattii. PLoS One. 2014;9.
13. Louis VL, Despons L, Friedrich A, Martin T, Durrens P, Casarégola S, et al. Pichia sorbitophila, an interspecies yeast hybrid, reveals early steps of genome resolution after polyploidization. G3-Genes Genomes Genet. 2012;2:299–311.
14. Clevenger J, Chavarro C, Pearl SA, Ozias-Akins P, Jackson SA. Single nucleotide polymorphism identification in polyploids: a review, example, and recommendations. Mol Plant. 2015;8:831–46.
15. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
16. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
17. Alkan C, Coe BP, Eichler EE. GATK toolkit. Nat Rev Genet. 2011;12:363–76.
18. R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2017. Available from: https://www.r-project.org
19. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2009.
20. Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–8.
21. Campbell MS, Holt C, Moore B, Yandell M. Genome annotation and curation using MAKER and MAKER-P. Curr Protoc Bioinforma. 2014;2014:4.11.1–4.11.39.
22. Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
23. Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 2008;18:1979–90.
24. Stanke M, Morgenstern B. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 2005;33:W465–7.
25. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–2.
26. Waterhouse RM, Tegenfeldt F, Li J, Zdobnov EM, Kriventseva EV. OrthoDB: a hierarchical catalog of animal, fungal and bacterial orthologs. Nucleic Acids Res. 2013;41:D358–65.
27. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5.
28. Hane JK, Rouxel T, Howlett BJ, Kema GHJ, Goodwin SB, Oliver RP. A novel mode of chromosomal evolution peculiar to filamentous ascomycete fungi. Genome Biol. 2011;12.
29. Kamvar ZN, Brooks JC, Grünwald NJ. Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Front Genet. 2015;6.
30. Schliep K, Potts AJ, Morrison DA, Grimm GW. Intertwining phylogenetic trees and networks. Methods Ecol Evol. 2017;8:1212–20.
31. Altschul SF, Madden TL, Shaffer AA, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.
32. Katoh K, Toh H. Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 2008;9:286–98.
33. Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–77.
34. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.
35. Hasegawa M, Kishino H, Yano T. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol. 1985;22:160–74.
36. Huerta-Cepas J, Serra F, Bork P. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol Biol Evol. 2016;33:1635–8.
37. Bouckaert RR. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics. 2010;26:1372–3.
38. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.
39. Darriba D, Taboada GL, Doallo R, Posada D. JModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9:772.
40. Agapow P-M, Burt A. Indices of multilocus linkage disequilibrium. Mol Ecol Notes. 2001;1:101–2.