The genome of the truffle-parasite Tolypocladium ophioglossoides and the evolution of antifungal peptaibiotics
© Quandt et al. 2015
Received: 15 January 2015
Accepted: 14 July 2015
Published: 28 July 2015
Two major mycoparasitic lineages, the family Hypocreaceae and the genus Tolypocladium, exist within the fungal order, Hypocreales. Peptaibiotics are a group of secondary metabolites almost exclusively described from Trichoderma species of Hypocreaceae. Peptaibiotics are produced by nonribosomal peptide synthetases (NRPSs) and have antibiotic and antifungal activities. Tolypocladium species are mainly truffle parasites, but a few species are insect pathogens.
The draft genome sequence of the truffle parasite Tolypocladium ophioglossoides was generated and numerous secondary metabolite clusters were discovered, many of which have no known putative product. However, three large peptaibiotic gene clusters were identified using phylogenetic analyses. Peptaibiotic genes are absent from the predominantly plant and insect pathogenic lineages of Hypocreales, and are therefore exclusive to the largely mycoparasitic lineages. Using NRPS adenylation domain phylogenies and reconciliation of the domain tree with the organismal phylogeny, it is demonstrated that the distribution of these domains is likely not the product of horizontal gene transfer between mycoparasitic lineages, but represents independent losses in insect pathogenic lineages. Peptaibiotic genes are less conserved between species of Tolypocladium and are the product of complex patterns of lineage sorting and module duplication. In contrast, these genes are more conserved within the genus Trichoderma and consistent with diversification through speciation.
Peptaibiotic NRPS genes are restricted to mycoparasitic lineages of Hypocreales, based on current sampling. Phylogenomics and comparative genomics can provide insights into the evolution of secondary metabolite genes, their distribution across a broader range of taxa, and their possible function related to host specificity.
Hypocreales is home to a wide array of ecologically diverse fungi. Some are devastating plant pathogens (e.g., Fusarium spp.), while others form numerous lineages of both insect pathogens (e.g., Cordyceps) and mycoparasites (e.g., Trichoderma). At the divergence of the four most derived families (Clavicipitaceae, Cordycipitaceae, Hypocreaceae, and Ophiocordycipitaceae) of Hypocreales, there was a major shift away from plant-based nutrition to either insect pathogenesis or fungal parasitism, i.e., mycoparasitism. Two major lineages of mycoparasites are found within the order, although other mycoparasites exist (e.g., several species of Polycephalomyces; [3, 4]). The first and larger of these two lineages is the family Hypocreaceae, most notable for mycoparasitic Trichoderma spp. used in biological control of plant pathogenic fungi, and Trichoderma reesei E.G. Simmons, the industrial workhorse for cellulase production [6, 7]. The second major lineage of mycoparasites, the genus Tolypocladium, is nested within the insect pathogenic family, Ophiocordycipitaceae [1, 8]. Most species of Tolypocladium parasitize the truffles of Elaphomyces [Eurotiales, Ascomycota], ectomycorrhizal fungi closely related to Aspergillus and Penicillium [9, 10]. Tolypocladium ophioglossoides (Ehrh. ex J.F. Gmel.) Quandt, Kepler & Spatafora is a commonly collected truffle parasite with a broad geographic distribution throughout many parts of the Northern Hemisphere [11, 12]. There are, however, a few Tolypocladium species that attack insects and rotifers, and based on current multigene phylogenies some of these are inferred to be reversals to insect pathogenesis [1, 8, 13]. One of these is a beetle pathogen, T. inflatum, which was the first source of the immunosuppressant drug, cyclosporin A. Evidence from multigene studies has also shown a close phylogenetic relationship between T. ophioglossoides and T. inflatum.
Secondary metabolism is defined as the synthesis of often bioactive, small molecules that are not essential to the growth of an organism. Genes related to production of secondary metabolites are often clustered together in close proximity within a genome and coregulated. A wide variety of secondary metabolites, including the ergot alkaloids, fumonisins, and destruxins, is produced by species of Hypocreales [16–18]. Many of these metabolites are produced by nonribosomal peptide synthetases (NRPSs), which are often large, multi-modular proteins that produce short peptides frequently incorporating non-standard amino acids. NRPS modules are composed of three primary functional domains: adenylation (A), thiolation (T), and condensation (C) domains. Due to their high level of amino acid and nucleotide conservation, the A-domains are frequently used to reconstruct the evolutionary histories of these genes [20, 21]. Polyketide synthases (PKSs) are another class of secondary metabolite producing enzymes that are common in fungi and are also modular in nature. They are related to fatty acid synthases, and assemble small bioactive molecules based on acetyl-CoA or malonyl-CoA subunits. The other major classes of secondary metabolite-producing enzymes are terpene synthases and dimethylallyltryptophan (DMAT) synthases, both of which have been reported from hypocrealean taxa. Fungal secondary metabolite clusters often include genes required for regulation of expression of the gene cluster and for decoration, epimerization, and transport of the mature secondary metabolite [24, 25].
Peptaibols, or peptaibiotics, are antibiotic secondary metabolite products produced by very large NRPS enzymes (up to 21,000 amino acids in length). Their name derives from their structure: they are Peptides containing the uncommon non-proteinogenic amino acid α-aminoisobutyric acid (AIB) and a C-terminal amino ethanol. The presence of AIB residues promotes helix formation, and several of these helices form multimeric units that in turn form voltage-gated ion channels capable of inserting into cell membranes, where they disrupt membrane potential, causing leakiness [27, 28]. Peptaibols are produced by Trichoderma spp. and other members of Hypocreaceae (The Peptaibol Database), leading to the proposition that they may play a role in mycoparasitism. There is at least one empirical study to support this in Trichoderma. Further studies have found that peptaibols function, along with cell wall degrading enzymes, to synergistically inhibit new cell wall synthesis in fungal prey of Tr. harzianum [31–33]. Wiest et al. identified and characterized the first peptaibol synthetase NRPS modular structure (a product of the tex1 gene) from Tr. virens along with the 18 residue peptaibol product. Since that time, several other peptaibol NRPS genes have been identified in Tr. virens and other species [21, 35, 36].
Efrapeptins are another class of peptaibiotics described almost exclusively from Tolypocladium spp. that have antifungal and insecticidal properties [37–39]. They differ from orthodox peptaibols by the presence of a mitochondrial ATPase inhibiting C-terminal “blocking group” N-peptido-1-isobutyl-2[1-pyrrole-(1-2-α)-pyrimidinium,2,3,4,6,7,8-hexahydro]-ethylamine. Eight of these, named efrapeptins A and C-I, have been isolated from T. inflatum [38, 41].
Numerous genomes from species of the mycoparasitic genus Trichoderma (Hypocreaceae) have been sequenced [42, 43], and more recently the genomes of several insect pathogens in Hypocreales have been completed (e.g., Cordyceps militaris, Beauveria bassiana, Metarhizium spp., and Ophiocordyceps sinensis) [44–47], including T. inflatum, the beetle pathogenic congener of T. ophioglossoides. The genomes of all of these species are rich in secondary metabolite genes and clusters, ranging from 23 to 51 secondary metabolite gene clusters per genome. Comparisons of gene content and expression are beginning to shed light on mechanisms underlying host specificity and the evolution of primary and secondary metabolism. In this study, the draft genome of the truffle parasite T. ophioglossoides was generated to compare its gene content and secondary metabolite content to those of closely related insect pathogens and more distantly related mycoparasites. The secondary metabolite potential of T. ophioglossoides is characterized with a focus on understanding the evolution of gene clusters encoding peptaibiotics.
Results and discussion
Genome assembly and structure
Table: Genome statistics for Tolypocladium ophioglossoides compared to T. inflatum (Bushley et al. 2013), including the number of secondary metabolite (SM) clusters.
An overview of secondary metabolites in T. ophioglossoides
The T. ophioglossoides and T. inflatum genomes harbor 45 and 55 core secondary metabolite genes (NRPSs, PKSs, terpene synthases, and DMAT synthases) spread across 38 and 38 secondary metabolite gene clusters, respectively. Similar to the contrast observed between closely related Metarhizium species, T. ophioglossoides and T. inflatum differ in the types of core secondary metabolite genes they possess, with 21 shared between the two species, 34 unique to T. inflatum, and 24 unique to T. ophioglossoides (Additional file 1). Notably, T. ophioglossoides does not contain the simA NRPS gene, or any of the other genes in the simA cluster responsible for the production of cyclosporin A in T. inflatum. However, T. ophioglossoides does share with T. inflatum a Hypocreales-conserved core set of genes that flank the simA region. Cyclosporin production has been reported from several species of Tolypocladium [50, 51]; however, none of the truffle parasites, including T. ophioglossoides, has been demonstrated to produce the compound. Whether possession of the simA gene is a derived character state, or whether there has been a single loss or multiple losses within the genus, remains unknown and requires further sampling.
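The shared and unique gene counts above can be checked against the per-genome totals with a few lines of arithmetic. The following is a minimal sketch using only the numbers reported in this paragraph; the variable names are illustrative and not part of the study's pipeline.

```python
# Counts taken from the text; variable names are illustrative only.
shared = 21        # core genes shared by both species
unique_tinf = 34   # core genes found only in T. inflatum
unique_toph = 24   # core genes found only in T. ophioglossoides

# Per-genome totals implied by the shared and unique counts.
total_toph = shared + unique_toph
total_tinf = shared + unique_tinf

# Fraction of each species' core genes that is shared with the other species.
frac_shared_toph = shared / total_toph
frac_shared_tinf = shared / total_tinf

print(total_toph, total_tinf)
print(round(frac_shared_toph, 2), round(frac_shared_tinf, 2))
```

The totals reproduce the 45 and 55 core genes reported for the two genomes, so fewer than half of the core genes in either species have a counterpart in the other.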
Secondary metabolite core genes predicted from the T. ophioglossoides genome include 15 PKSs, two PKS-like genes, 15 NRPSs, six NRPS-like genes, three hybrid NRPS-PKS genes, and four terpene synthases (Additional file 1). No DMAT synthases were identified in the T. ophioglossoides genome. Based on A-domain homology, two putative siderophore synthetases (one intracellular [TOPH_02853] and one extracellular [TOPH_02629]) were among the predicted NRPSs (Additional files 1 and 2). This is in contrast to T. inflatum, which possesses three putative siderophore synthetases (two extracellular and one intracellular). The entire Pseurotin-A precursor synthetase hybrid NRPS-PKS cluster (TOPH_07102) was identified in the T. ophioglossoides genome. Pseurotin-A, an antifungal compound described from several Aspergillus spp. [Eurotiales, Ascomycota], was also recently identified in the genome of M. robertsii. The disjunct distribution of this secondary metabolite cluster raises several questions about the evolutionary mechanisms (e.g., horizontal gene transfer vs. complex patterns of gene loss) that may have led to this distribution.
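As a quick consistency check, the per-class counts listed above can be tallied and compared with the genome-wide total of 45 core genes. This is a minimal sketch; the dictionary keys are labels chosen here for illustration.

```python
# Core secondary metabolite genes per class, as listed in the text.
core_genes = {
    "PKS": 15,
    "PKS-like": 2,
    "NRPS": 15,
    "NRPS-like": 6,
    "hybrid NRPS-PKS": 3,
    "terpene synthase": 4,
    "DMAT synthase": 0,  # none identified in T. ophioglossoides
}

total = sum(core_genes.values())
print(total)  # 45
```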
T. ophioglossoides possesses the destruxins synthetase NRPS gene (TOPH_08872) (Additional file 1). Destruxins are known for their insecticidal properties in Metarhizium spp., and the entire destruxins synthetase cluster was characterized by Wang et al. (2012) in M. robertsii. Homologs of the other essential genes in the destruxins cluster are present in the T. ophioglossoides TOPH_08872 cluster, except for dtxS4 (Additional file 3), an aspartic decarboxylase responsible for producing β-alanine, one of the amino acids incorporated into destruxins. There are inversions in this cluster between M. robertsii and T. ophioglossoides as well, including the dtxS2 aldo-keto reductase homolog (TOPH_08871) and an ABC transporter (TOPH_08869), which was not found to be essential in destruxins production in M. robertsii. For these reasons, it is unlikely that T. ophioglossoides produces destruxins, although it may produce another group of related compounds. The sequenced strain of Tr. virens shares a homolog of the destruxins NRPS gene (Tv62540) (Additional file 3), but destruxins have not been reported to be produced by that species either.
To date, only two secondary metabolites have been reported to be produced by T. ophioglossoides: ophiocordin (also reported as balanol) and ophiosetin [55–57]. Ophiosetin is structurally similar to equisetin, an antibiotic with inhibitory activity against HIV-1 integrase, and both are produced by NRPS-PKS hybrid genes. Based on phylogenetic analysis of A-domains, cluster synteny with the equisetin cluster, and sequence homology, this study identifies the putative ophiosetin synthetase cluster around the hybrid NRPS-PKS, TOPH_07403 (Additional file 1). Further studies involving transformations and chemical verification and characterization of this cluster will be necessary to confirm this genotype-chemotype linkage. Ophiocordin is a polyketide and no putative gene or gene cluster related to its production was identified here. Except for the two peptaibiotic clusters discussed below, the remaining 33 secondary metabolite gene clusters are not yet associated with a specific gene product.
Peptaibiotics of Tolypocladium
There are differences between species and gene membership within A-domain clades, however. For instance, Clade 1 is enriched in A-domains from the Trichoderma peptaibols (62 Trichoderma A-domains vs. 27 Tolypocladium A-domains), while Clade 2 contains more A-domains from Tolypocladium spp. (17 Trichoderma A-domains vs. 33 Tolypocladium A-domains). Importantly, Clade 1 contains A-domains that encode for incorporation of AIB, as well as other A-domains that encode for incorporation of isovaline, leucine, isoleucine, alanine, glycine, valine, and serine [21, 35]. Clade 2 is known to include A-domains that encode for valine, glutamine, asparagine, leucine, and isoleucine [21, 35]. Clade 3, which is known only to incorporate a single amino acid, proline [21, 35], has a relatively equal distribution of both Trichoderma (10) and Tolypocladium (8) A-domains. Prolines are proposed to play an important structural role in peptaibols by creating a kink in the peptaibol chain, and Clade 3 occupies a long branch within the tree, suggesting it is highly diverged from the other A-domains (Additional file 2).
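To make the clade-level contrast concrete, the A-domain counts given above can be converted to proportions per genus. The sketch below uses only the counts reported in this paragraph; the function and variable names are hypothetical.

```python
# A-domain counts per clade as reported in the text: (Trichoderma, Tolypocladium).
clade_counts = {
    "Clade 1": (62, 27),
    "Clade 2": (17, 33),
    "Clade 3": (10, 8),
}

def trichoderma_fraction(counts):
    """Return the fraction of A-domains in a clade that come from Trichoderma."""
    trich, toly = counts
    return trich / (trich + toly)

fractions = {name: trichoderma_fraction(c) for name, c in clade_counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.2f} Trichoderma, {1 - frac:.2f} Tolypocladium")
```

Under these counts roughly 70 % of Clade 1 A-domains are from Trichoderma, about two thirds of Clade 2 A-domains are from Tolypocladium, and Clade 3 is close to evenly split.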
Peptaibols were reported from T. geodes based on chemical isolation, but all previous reports of the compounds were from fungi in Hypocreaceae (or from Boletaceae [Basidiomycota], where they are likely produced by hypocreaceous mycoparasites of Boletaceae fruiting bodies [60, 61]). A gene cluster responsible for the production of these peptaibols has not been identified, and to date no genomic sequence data have been produced for T. geodes from which to predict which genes or clusters may be responsible for their production. Efrapeptins, which are peptaibiotics originally described from T. inflatum, have been reported from several species of Tolypocladium, and it remains unknown whether the products of these clusters in T. ophioglossoides are of the efrapeptin class or of the more traditional class of peptaibols. Regardless, the phylogenetic diversity of peptaibiotic NRPSs, as revealed by phylogenomic analyses of A-domains, supports a greater chemical diversity of peptaibiotics than currently known from chemical analyses.
There are at least two possible explanations for these findings in Tolypocladium spp. First, the divergence of the A-domains could be great enough to distort the evolutionary history as represented in the phylogenetic tree, especially since many of the branches of the Tolypocladium A-domain tree do not have strong bootstrap support. This explanation is not supported by the data, because based on the species phylogeny, amino acid divergence between Tolypocladium species is lower than that between Trichoderma species (Fig. 1). Second, incomplete lineage sorting could lead to this pattern of coalescence, suggesting that the ancestor to the genus possessed a multitude of these A-domains within one or several peptaibiotic genes that have undergone a complex history of ancient gains (duplications) and losses.
In contrast, five of the A-domains inferred to be present in the common ancestor of Trichoderma spp. deeply coalesce, and 59 are shared in the common ancestor. This pattern indicates a higher degree of domain tree – species tree congruence, which could be explained by two different mechanisms: 1) vertical descent with expansion and maintenance or 2) horizontal gene transfer from Tolypocladium to the common ancestor of Trichoderma. To test for signatures of horizontal gene transfer, the A-domain tree was reconciled with a modified species tree in which Trichoderma was sister to Tolypocladium, and outgroup A-domains were included to root the domain tree. This produced a smaller deep coalescent cost (212 v. 275), but the reduction is due to the smaller number of extinctions required (118 v. 181; in Cordycipitaceae, Clavicipitaceae, and O. sinensis), because the number of duplications remained the same (91 v. 91). In this simulated reconciliation, 14 A-domains were inferred to deeply coalesce, and of these, seven were inherited in each lineage. Taken together, this suggests that the diversity of Trichoderma A-domains cannot solely be characterized as the product of horizontal transfer from Tolypocladium A-domains. Instead it suggests that the common ancestor of Trichoderma possessed a small number of peptaibiotic NRPSs and A-domains that largely diversified in a manner consistent with speciation of the genus.
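The argument that the lower cost under the modified species tree comes entirely from fewer extinctions can be verified from the reported values. This is a minimal sketch with hypothetical names; it only compares the figures quoted above and does not reproduce the reconciliation itself, which was run in Mesquite.

```python
# Values reported in the text: (modified species tree, original species tree).
deep_coalescent_cost = (212, 275)
extinctions = (118, 181)
duplications = (91, 91)

def difference(pair):
    """Return the original-tree value minus the modified-tree value."""
    modified, original = pair
    return original - modified

cost_drop = difference(deep_coalescent_cost)
extinction_drop = difference(extinctions)
duplication_drop = difference(duplications)

print(cost_drop, extinction_drop, duplication_drop)  # 63 63 0
```

The drop in deep coalescent cost (63) equals the drop in inferred extinctions (63), while duplications are unchanged, which matches the interpretation given in the text.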
T. ophioglossoides peptaibiotic gene clusters
The remaining T. ophioglossoides peptaibiotic NRPS gene, TOPH_08469 (10 modules), is located within a cluster containing two PKS genes (TOPH_08457 and TOPH_08462), several ABC transporters (TOPH_08453, TOPH_08459, and TOPH_08470), an esterase (TOPH_08466), an epimerase (TOPH_08468), a hydrolase (TOPH_08465), two cytochrome P450s (TOPH_08458 and TOPH_08455), and a RadR transcription regulator (TOPH_08461) (Fig. 5b). It remains to be seen if the products of the PKSs are incorporated into the peptide created by TOPH_08469. Some of the A-domains within TOPH_08469 are divergent (Fig. 3), especially those that group (without support) as the earliest diverging lineages of Clade 2, and the presence of these A-domains in the tree causes the support for this clade to weaken substantially.
Peptaibiotic cluster synteny between mycoparasitic and insect pathogenic Tolypocladium spp.
The region of the scaffold containing the third large T. ophioglossoides peptaibiotic gene, TOPH_08469, does not align well with any portion of the T. inflatum genome. Similarly, the final peptaibol cluster in T. inflatum, containing two peptaibiotic NRPS genes (TINF07827 and TINF07876), does not align well to any portion of the T. ophioglossoides genome. The lack of synteny between these clusters in Tolypocladium spp. highlights the substantial genomic rearrangement between these closely related taxa. Campbell et al. observed patterns of differential gene loss in Botrytis spp. within an ancient, horizontally-transferred, secondary metabolite gene cluster, leading to a patchy distribution of the genes within the clusters. This is not the pattern seen in the peptaibiotic clusters in Tolypocladium spp., in which the protein models are not reciprocal best BLAST hits (except for the protein models in the “relic” cluster in T. inflatum). Thus, despite the fact that their products may have similar functions, these peptaibiotic NRPS genes are divergent and located within nonhomologous gene clusters.
Mixed homology of peptaibol A-domains in Hypocreales
Using the moderately to strongly supported nodes (≥50 maximum likelihood bootstrap percentage [MLBP]) in the A-domain phylogeny as a guide for module homology, the peptaibol NRPS genes are more conserved among the Trichoderma species examined than among species of Tolypocladium (Fig. 6), a finding reflected in the domain tree – species tree reconciliation analyses (Fig. 4). Using whole genome data of the sampled species of Trichoderma, A-domains from the three Tr. virens peptaibol NRPS genes (tex1, tex2, and tex3) were identified for the phylogenetic analyses; Tr. virens peptaibol NRPSs contain 18, 14, and 7 modules, respectively. Tr. reesei has two peptaibol NRPSs (Tr_23171 and Tr_123786), which possess 18 and 14 modules. In the annotation of Tr. atroviride IMI 206040, the A-domain HMM identified one 19-module peptaibol NRPS (Ta_317938), and several single A-domain protein models that group within the three peptaibiotic clades and are all located on scaffold 29. Further examination of this Tr. atroviride gene region using the JGI genome browser (Grigoriev et al. 2014) revealed that all of the ab initio gene predictions of that region predict a single protein model that is the approximate length of the 14-module peptaibol genes in Tr. virens and Tr. reesei. Degenkolb et al. identified a homolog of the 14-module peptaibol gene from a different strain of Tr. atroviride, and thus this is likely a mis-annotation of Tr. atroviride scaffold 29. Alignment of this scaffold in Tr. atroviride (scaffold 29) to those of Tr. virens and Tr. reesei (Additional file 5) revealed high nucleotide homology. However, the flanking regions did not align well, and a BLAST search of the Tr. atroviride genome using the nucleotide sequences from Tr. virens and Tr. reesei revealed that the 14-module peptaibol gene in Tr. atroviride is located within a different portion of the genome than in the other two Trichoderma spp. (Additional file 5).
Comparing the Trichoderma A-domains, each species possesses one ortholog of the large 18 or 19 module NRPS gene (Fig. 6), and at the nucleotide level these regions of their genomes also align and are syntenous (Additional file 5). The A-domains are syntenous in their arrangement within the large peptaibol NRPS genes across the species, except for: (a) the insertion of a Clade 3 domain at the third module, and (b) a duplication of either the seventeenth or eighteenth A-domains, which are most closely related. Within the two 14 module NRPS genes in Tr. virens and Tr. reesei, there is complete synteny of the A-domains. In Tr. virens, it has been demonstrated that this 14 module NRPS, Tex2, is responsible for two different sizes of peptaibols (11 and 14 residues in length). Due to differences in annotation (see above), the 14 module peptaibol gene in Tr. atroviride is not compared in this analysis. The short 7 module peptaibol synthetase from Trichoderma spp. is found only in Tr. virens. Between these three groups of peptaibol NRPSs, the terminating residues are all orthologous, as well as the initiating residues in the larger classes of peptaibol NRPSs.
In Tolypocladium, there is a very different pattern of homology and synteny between the peptaibiotic NRPSs of the two species. Only a few of the A-domain relationships within Tolypocladium are statistically supported (≥50 MLBP) (Figs. 3 and 6). The first, third, and last A-domains of the largest NRPS in both species (TOPH_03025 and TINF07827) are orthologous, but not the other domains within those two genes. There are several instances of intragenic module duplications, which are known to occur within NRPS genes and have been proposed to play a role in the evolution of novel metabolites. Within TOPH_03035, for example, there is strong support for a shared ancestry between modules 2, 3 and 6 (Fig. 6), indicating that these modules are the product of lineage specific duplications (Fig. 4). This indicates a more complicated evolutionary history of these genes in Tolypocladium.
The lack of module synteny and orthology between Tolypocladium peptaibiotic gene modules is comparable to the lack of genomic synteny observed between their clusters. Part of this is due to the deep coalescence of the Tolypocladium A-domains. This evidence indicates that Tolypocladium peptaibiotic genes are not highly similar but are the products of more ancient divergences. This is notable, because in contrast to Trichoderma spp., all of which exhibit some degree of mycoparasitism (Druzhinina et al. 2011), T. ophioglossoides and T. inflatum have different ecologies, which are characterized by mycoparasitism and insect pathogenicity, respectively. Thus, if peptaibols are important in successful mycoparasitism (as the case has been made in Trichoderma spp. [30, 31]), then there may be less selective pressure to maintain a specific mycoparasitic function of these extremely large (>10,000 amino acid) NRPS genes in more ecologically diverse lineages. Future genome-scale studies sampling other hypocrealean lineages containing both mycoparasites and insect pathogens, including the genus Polycephalomyces, could further test this hypothesis.
The genome of T. ophioglossoides is rich in secondary metabolite gene clusters, and 31 out of 38 of these clusters have no putative product. Given this potential and its life history as a mycoparasite, this species should be targeted for future studies to discover novel natural compounds with potential antibiotic, including antifungal, activity. The simA NRPS gene cluster, responsible for the production of the immunosuppressant cyclosporin, is not present in the T. ophioglossoides genome, but three large peptaibiotic genes are present within two clusters. These are the first data to suggest the potential for peptaibiotic production from a mycoparasitic species of Tolypocladium. This study confirms the presence of three phylogenetic clades of peptaibiotic NRPS A-domains from Tolypocladium and Trichoderma spp., and that peptaibiotics in general are limited to the mycoparasitic lineages of Hypocreales, based on current sampling. Reconciliation of the A-domain tree with the organismal phylogeny reveals that the peptaibiotic NRPSs of Trichoderma and Tolypocladium are likely the product of different mechanisms of diversification. Trichoderma is characterized by A-domain diversification that is largely consistent with speciation, whereas Tolypocladium is characterized by A-domain diversity that shows patterns of deep coalescence. Deep coalescence is inconsistent with peptaibiotic NRPS diversity being the product of HGT to Tolypocladium; rather, it is the product of complex patterns of lineage sorting and gains and losses of A-domains from hypocrealean ancestors. While the diversity of peptaibiotic NRPSs in Trichoderma could possibly be explained by HGT in the common ancestor of the three species, none of the Tolypocladium peptaibiotic NRPSs analyzed here are candidates for HGT. Further research is required to identify the structures of specific metabolites of the Tolypocladium gene clusters and to determine if these peptaibiotics are produced during mycoparasitism by T. ophioglossoides or if these genes are present in other mycoparasitic lineages of Hypocreales.
T. ophioglossoides strain CBS 100239 was grown for 7 days in a shaking incubator in potato dextrose broth (PDB) inoculated with plugs of tissue growing on potato dextrose agar for collection of tissue for DNA extraction. Tissue was harvested via filtration, frozen at −80 °C in 1.5 mL tubes, and then lyophilized for 24 h. Lyophilized tissue was ground using a mortar and pestle, and DNA was extracted using a Qiagen DNeasy Plant Mini kit following the standard protocol starting at the step with the addition of lysis buffer AP1 and eluted in 50 μL water. Tissue for RNA extraction was grown in Yeast Malt (YM) broth, minimal media (MM) containing autoclaved insect cuticle with proteins removed using the protocol in Andersen (1980), and MM containing lyophilized Elaphomyces muricatus peridium for 24 h, then harvested into liquid nitrogen and stored at −80 °C until extraction. RNA was extracted using the Qiagen RNeasy Plant kit following the manufacturer’s protocol. The small insert DNA library was prepared using New England Biolabs NEBNext reagents, and size selection (350 bp) was performed using gel extraction. Nextera Mate Pair Sample Preparation of a large insert (6800 bp) library and sequencing were conducted by the Core Labs at the Center for Genome Research and Biocomputing (CGRB) at Oregon State University. The Illumina TruSeq RNA Sample Preparation Kit v2 was used for RNA library construction, using the manufacturer’s suggested protocols including Agencourt AMPure magnetic beads for cleaning steps. All libraries were sequenced on the Illumina HiSeq2000 at the Core Labs of the CGRB with paired-end 101 cycles for DNA libraries and single-end 51 cycles for RNA libraries.
Assembly, annotation, and bioinformatic analyses
Using scripts in the fastx toolkit, raw reads were trimmed (to 50 bp in length) and filtered based on quality score (all bases ≥ q20). Initial de novo assembly of the short insert reads was conducted in Velvet v. 1.19 with over 156 million reads, yielding an assembly with a median coverage depth of 74.45. The final trim length (50 bp) used in the assembly was chosen by trimming reads to different lengths (40–80 bp), quality filtering, and assembling each set; the assembly with the highest N50 and fewest contigs was selected as the “best” assembly. From that assembly, 50 million overlapping 150 bp paired reads were simulated with a 250 bp insert size using the program wgsim v. 0.3.1-r13 in “haploid” mode. Final assembly using the simulated overlapping short insert library reads and the mate pair reads from the 6 kb library was conducted in AllPaths-LG with default settings. The Core Eukaryotic Genes Mapping Approach (CEGMA) was used to estimate the completeness of the T. ophioglossoides genome. Scripts in the Mummer3 package were used to create a mummerplot between T. ophioglossoides and T. inflatum; specifically nucmer v. 3.07 was run with default settings.
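The read preparation and assembly selection described above can be sketched in plain Python. This is an illustration only, not the fastx toolkit or Velvet: the function names are hypothetical, discarding reads shorter than the trim length is an assumption, and ranking by N50 first and contig count second is one possible reading of "highest N50 and fewest contigs".

```python
def trim_and_filter(seq, quals, length=50, min_q=20):
    """Trim a read to `length` bases and keep it only if every retained
    base has a quality score of at least `min_q`. Returns None if rejected."""
    if len(seq) < length:          # assumption: too-short reads are discarded
        return None
    seq, quals = seq[:length], quals[:length]
    if min(quals) < min_q:
        return None
    return seq, quals

def n50(contig_lengths):
    """Return the contig length L such that contigs of length >= L
    contain at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0

def pick_best(assemblies):
    """Given {trim_length: [contig lengths]}, return the trim length whose
    assembly has the highest N50, breaking ties by fewest contigs."""
    return max(assemblies, key=lambda k: (n50(assemblies[k]), -len(assemblies[k])))

# Toy contig lengths for three hypothetical trim lengths.
toy_assemblies = {40: [5, 5, 5, 5], 50: [10, 6, 4], 60: [10, 5, 3, 2]}
print(pick_best(toy_assemblies))  # 50
```

In the toy example the 50 bp and 60 bp assemblies tie on N50, and the 50 bp assembly wins because it has fewer contigs.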
Gene model predictions were created using the Maker annotation pipeline incorporating RNA data assembled in Trinity using the Jellyfish v. 2.0 method of kmer counting. Other information given to Maker included a custom hidden Markov model (HMM) for T. ophioglossoides built by GeneMark-ES v 2.0, a SNAP HMM trained on Fusarium graminearum, which was also set as the species model for AUGUSTUS, and protein and/or EST data from the following hypocrealean taxa: F. graminearum, N. haematococca, Tr. reesei, Tr. virens, M. robertsii, T. inflatum, C. militaris, and B. bassiana. Annotation of transposable elements was performed in RepeatMasker v 3.2.8 with organism set to “fungi”, and custom repeat content was estimated using RepeatScout v 1.0.3 and scripts associated with that package. Non-overlapping ab initio protein models were aligned using BLAST against a custom database of all the protein models of all the hypocrealean taxa used in this study. Any of these protein models with a significant hit (e-value ≤ 1e−5) were included in the final protein set and used for downstream analyses.
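The final filtering step, retaining ab initio protein models with a significant BLAST hit, could be implemented along the following lines. This is a minimal sketch that assumes BLAST results in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the e-value is the 11th column; the source does not state which output format was used, and the function name and sequence identifiers are made up for illustration.

```python
import csv
import io

def queries_with_significant_hit(tabular_text, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit whose e-value
    is <= max_evalue, reading 12-column tab-separated BLAST output."""
    kept = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            kept.add(query_id)
    return kept

# Hypothetical hits: model_002 passes through its second hit, model_003 fails.
example = (
    "model_001\tref_A\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-50\t250\n"
    "model_002\tref_B\t30.0\t50\t35\t0\t1\t50\t10\t59\t0.5\t25\n"
    "model_002\tref_C\t45.0\t120\t66\t0\t1\t120\t3\t122\t2e-06\t60\n"
    "model_003\tref_D\t28.0\t40\t29\t0\t1\t40\t7\t46\t3.0\t20\n"
)
print(sorted(queries_with_significant_hit(example)))
```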
Using a set of NRPS A-domains from a wide array of published fungal genomes, an HMM was created for this study using the program HMMER 3.0. This HMM was then used to mine the 18 hypocrealean genomes used for this study for the identification of A-domains. Putative A-domains identified were filtered to remove short sequences (less than 100 bp) and, where applicable, cross-referenced with published reports of NRPSs from those species (e.g., Tr. virens Tex1). Additional annotation of secondary metabolite clusters was completed using the antiSMASH and SMURF pipelines. A-domain trees were reconciled with species trees in Mesquite v. 2.75 with the contained tree treated as unrooted.
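The length filter applied to putative A-domain hits can be illustrated with a small FASTA reader. This is a sketch under stated assumptions: the hits are taken to be available as FASTA text, the threshold of 100 is applied to sequence length as stored, and both function names and record names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

def drop_short(records, min_length=100):
    """Keep only sequences at least `min_length` characters long."""
    return {n: s for n, s in records.items() if len(s) >= min_length}

# Two made-up hits: one long enough to keep, one too short.
example_fasta = ">hitA\n" + "M" * 120 + "\n>hitB\n" + "M" * 40 + "\n"
filtered = drop_short(read_fasta(example_fasta))
print(list(filtered))  # ['hitA']
```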
Whole scaffold alignments were performed in the program Mauve with default progressiveMauve alignment settings.
Predicted A-domain amino acid sequences were aligned using MUSCLE v 3.8.31 under default settings. Gaps were removed manually, and all alignments were analyzed using RAxML v 7.2.6 using the Gamma model of rate heterogeneity and the WAG substitution matrix with 100 bootstrap replicates.
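Gap removal was done manually in this study. For readers who want a scripted equivalent, the sketch below drops every alignment column that contains a gap in any sequence; treating "gap removal" as removal of all gapped columns is an assumption, and the function name is hypothetical.

```python
def remove_gapped_columns(alignment, gap_chars="-."):
    """Given {name: aligned sequence}, return a new alignment in which
    every column containing a gap character in any sequence is removed."""
    names = list(alignment)
    seqs = [alignment[n] for n in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    # Indices of columns in which no sequence has a gap character.
    keep = [i for i, col in enumerate(zip(*seqs))
            if not any(c in gap_chars for c in col)]
    return {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}

toy_alignment = {"a": "MK-LV", "b": "MKALV", "c": "M-ALV"}
degapped = remove_gapped_columns(toy_alignment)
print(degapped)  # {'a': 'MLV', 'b': 'MLV', 'c': 'MLV'}
```

Removing all gapped columns is conservative and can discard informative sites, so a manual or threshold-based approach may retain more of the alignment.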
Whole genome phylogenomic analyses were executed in the HAL pipeline . Orthologous clusters of proteins were identified in MCL  across inflation parameters 1.2, 3 and 5. Briefly, orthologous clusters were filtered for retention of clusters with one sequence per genome and removal of any redundant clusters. The resulting unique, single-copy orthologous clusters of proteins were aligned in MUSCLE  with default settings; poorly aligned regions were identified using Gblocks (; gap removal setting = c, for conservative) and excluded from subsequent analyses. The aligned clusters were concatenated into a superalignment and maximum likelihood analysis was performed using RAxML v 7.2.6 with the Gamma model of rate heterogeneity and the WAG substitution matrix with 100 bootstrap replicates.
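The cluster filtering and concatenation steps can be sketched in Python as follows. This is not the HAL pipeline's code. It assumes sequence identifiers of the form `genome|protein`, interprets "redundant" clusters as clusters with identical membership (for example, the same cluster recovered at more than one inflation parameter), and uses hypothetical function names and genome labels.

```python
from collections import Counter


def is_single_copy(cluster, genomes):
    """True if the cluster holds exactly one sequence from every genome."""
    counts = Counter(seq_id.split("|", 1)[0] for seq_id in cluster)
    return set(counts) == set(genomes) and all(n == 1 for n in counts.values())


def unique_single_copy(clusters, genomes):
    """Keep single-copy clusters, dropping repeats with identical membership."""
    seen, kept = set(), []
    for cluster in clusters:
        key = frozenset(cluster)
        if key not in seen and is_single_copy(cluster, genomes):
            seen.add(key)
            kept.append(cluster)
    return kept


def concatenate(alignments, genomes):
    """Join per-cluster alignments (genome -> aligned sequence) end to end."""
    return {g: "".join(aln[g] for aln in alignments) for g in genomes}


genomes = ["Toph", "Tinf", "Tree"]  # hypothetical genome labels
clusters = [
    ["Toph|p1", "Tinf|p7", "Tree|p3"],             # single copy: kept
    ["Tree|p3", "Toph|p1", "Tinf|p7"],             # same members: redundant
    ["Toph|p2", "Toph|p9", "Tinf|p4", "Tree|p5"],  # paralogs: dropped
    ["Toph|p6", "Tinf|p8"],                        # missing a genome: dropped
]
print(unique_single_copy(clusters, genomes))
# [['Toph|p1', 'Tinf|p7', 'Tree|p3']]

aligned = [
    {"Toph": "MK", "Tinf": "MR", "Tree": "MK"},
    {"Toph": "LV", "Tinf": "LI", "Tree": "-V"},
]
print(concatenate(aligned, genomes))
# {'Toph': 'MKLV', 'Tinf': 'MRLI', 'Tree': 'MK-V'}
```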
Availability of supporting data
This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession LFRF00000000. The version described in this paper is version LFRF01000000.
Acknowledgements
The authors thank Mark Desanko, Cedar Hesse, and Chris Sullivan for providing technical and computational assistance for this project. This project was supported with funds from the NSF grant DEB-0732993 to JWS, and CAQ was supported by an NSF Graduate Research Fellowship.
References
- Sung G-H, Hywel-Jones NL, Sung J-M, Luangsa-Ard JJ, Shrestha B, Spatafora JW. Phylogenetic classification of Cordyceps and the clavicipitaceous fungi. Stud Mycol. 2007;57:5–59.
- Spatafora JW, Sung G-H, Sung J-M, Hywel-Jones NL, White JF. Phylogenetic evidence for an animal pathogen origin of ergot and the grass endophytes. Mol Ecol. 2007;16:1701–11.
- Kepler RM, Sung G-H, Harada Y, Tanaka K, Tanaka E, Hosoya T, et al. Host jumping onto close relatives and across kingdoms by Tyrannicordyceps (Clavicipitaceae) gen. nov. and Ustilaginoidea (Clavicipitaceae). Am J Bot. 2012;99:552–61.
- Kepler R, Ban S, Nakagiri A, Bischoff J, Hywel-Jones N, Owensby CA, et al. The phylogenetic placement of hypocrealean insect pathogens in the genus Polycephalomyces: an application of One Fungus One Name. Fungal Biol. 2013;117:611–22.
- Hjeljord L, Tronsmo A: Trichoderma and Gliocladium in biological control: an overview. In Trichoderma Gliocladium. Enzymes Biological Control and Commercial Applications. Edited by Harman GE, Kubicek CP; 2002:115–133.
- Durand H, Clanet M, Tiraby G. Genetic improvement of Trichoderma reesei for large scale cellulase production. Enzyme Microb Technol. 1988;10:341–6.
- Sukumaran RK, Singhania RR, Pandey A. Microbial cellulases: Production, applications and challenges. J Scientific and Industrial Research. 2005;64:832–44.
- Quandt CA, Kepler RM, Gams W, Araujo JPM, Ban S, Evans HC, et al. Phylogenetic-based nomenclatural proposals for Ophiocordycipitaceae (Hypocreales) with new combinations in Tolypocladium. IMA Fungus. 2014;5:121–34.
- Landvik S, Shailer NFJ, Eriksson OE. SSU rDNA sequence support for a close relationship between the Elaphomycetales and the Eurotiales and Onygenales. Mycoscience. 1996;37:237–41.
- LoBuglio KF, Berbee ML, Taylor JW. Phylogenetic origins of the asexual mycorrhizal symbiont Cenococcum geophilum Fr. and other mycorrhizal fungi among the ascomycetes. Mol Phylogenet Evol. 1996;6:287–94.
- Mains EB. Species of Cordyceps parasitic on Elaphomyces. Bull Torrey Bot Club. 1957;84:243–51.
- Kobayasi Y, Shimizu D. Monographic studies on Cordyceps 1. Group parasitic on Elaphomyces. Bull Natl Sci Museum, Tokyo. 1960;5:69–85.
- Sung G-H, Poinar GO, Spatafora JW. The oldest fossil evidence of animal parasitism by fungi supports a Cretaceous diversification of fungal-arthropod symbioses. Mol Phylogenet Evol. 2008;49:495–502.
- Borel JF. History of the discovery of cyclosporin and of its early pharmacological development. Wien Klin Wochenschr. 2002;114:433.
- Keller NP, Turner G, Bennett JW. Fungal secondary metabolism - from biochemistry to genomics. Nat Rev Microbiol. 2005;3:937–47.
- Isaka M, Kittakoop P, Thebtaranonth Y: Secondary Metabolites of Clavicipitalean Fungi. In Clavicipitalean Fungi Evolutionary Biology, Chemistry, Biocontrol And Cultural Impacts. Edited by White JF, Bacon CW, Hywel-Jones NL, Spatafora JW. New York, NY: Marcel Dekker; 2003:355–98.
- Desjardins AE, Proctor RH. Molecular biology of Fusarium mycotoxins. Int J Food Microbiol. 2007;119:47–50.
- Molnár I, Gibson DM, Krasnoff SB. Secondary metabolites from entomopathogenic Hypocrealean fungi. Nat Prod Rep. 2010;27:1241–75.
- Marahiel MA, Stachelhaus T, Mootz HD. Modular Peptide Synthetases Involved in Nonribosomal Peptide Synthesis. Chem Rev. 1997;97:2651–74.
- Bushley KE, Turgeon BG. Phylogenomics reveals subfamilies of fungal nonribosomal peptide synthetases and their evolutionary relationships. BMC Evol Biol. 2010;10:26.
- Wei X, Yang F, Straney DC. Multiple non-ribosomal peptide synthetase genes determine peptaibol synthesis in Trichoderma virens. Can J Microbiol. 2005;51:423–9.
- Jenke-Kodama H, Sandmann A, Müller R, Dittmann E. Evolutionary implications of bacterial polyketide synthases. Mol Biol Evol. 2005;22:2027–39.
- Fischbach MA, Walsh CT, Clardy J. The evolution of gene collectives: How natural selection drives chemical innovation. Proc Natl Acad Sci. 2008;105:4601–8.
- Zhang YQ, Wilkinson H, Keller NP, Tsitsigiannis D, An Z. Secondary metabolite gene clusters. In: An Z, editor. Handbook of industrial microbiology. New York: Dekker; 2004. p. 355–86.
- Hoffmeister D, Keller NP. Natural products of filamentous fungi: enzymes, genes, and their regulation. Nat Prod Rep. 2007;24:393–416.
- Chugh JK, Wallace BA: Peptaibols: models for ion channels Sequence alignments into subfamilies (SFs). 2001:565–570.
- Fox RO, Richards FM. A voltage-gated ion channel model inferred from the crystal structure of alamethicin at 1.5-A resolution. Nature. 1982;300:325–30.
- Chugh JK, Bruckner H, Wallace BA. Model for a Helical Bundle Channel Based on the High-Resolution Crystal Structure of Trichotoxin_A50E. Biochemistry. 2002;41:12934–41.
- Whitmore L, Wallace BA. The Peptaibol Database: a database for sequences and structures of naturally occurring peptaibols. Nucleic Acids Res. 2004;32(Database issue):D593–4.
- Röhrich CR, Iversen A, Jaklitsch WM, Voglmayr H, Berg A, Dörfelt H, et al. Hypopulvins, novel peptaibiotics from the polyporicolous fungus Hypocrea pulvinata, are produced during infection of its natural hosts. Fungal Biol. 2012;116:1219–31.
- Schirmbock M, Lorito M, Wang Y, Hayes CK, Arisan-atac I, Scala F, et al. Parallel Formation and Synergism of Hydrolytic Enzymes and Peptaibol Antibiotics, Molecular Mechanisms Involved in the Antagonistic Action of Trichoderma harzianum against Phytopathogenic Fungi. Appl Environ Microbiol. 1994;60:4364–70.
- Lorito M, Peterbauer C, Hayes CK, Harman GE. Synergistic interaction between fungal cell wall degrading enzymes and different antifungal compounds enhances inhibition of spore germination. Microbiology. 1994;140(Pt 3):623–9.
- Lorito M, Woo SL, D’Ambrosio M, Harman GE, Hayes CK, Kubicek CP, et al. Synergistic interaction between cell wall degrading enzymes and membrane affecting compounds. Mol Plant Microbe Interact. 1996;9:206–13.
- Wiest A, Grzegorski D, Xu B-W, Goulard C, Rebuffat S, Ebbole DJ, et al. Identification of peptaibols from Trichoderma virens and cloning of a peptaibol synthetase. J Biol Chem. 2002;277:20862–8.
- Mukherjee PK, Wiest A, Ruiz N, Keightley A, Moran-Diez ME, McCluskey K, et al. Two classes of new peptaibols are synthesized by a single non-ribosomal peptide synthetase of Trichoderma virens. J Biol Chem. 2011;286:4544–54.
- Degenkolb T, Agheheh RK, Dieckmann R, Neuhof T, Baker SE, Druzhinina IS, et al. The production of multiple small peptaibol families by single 14-module peptide synthetases in Trichoderma/Hypocrea. Chem Biodivers. 2012;9:499–535.
- Krishna K, Sukumar M, Balaram P, Unit MB. Structural chemistry and membrane modifying activity of the fungal polypeptides zervamicins, antiamoebins and efrapeptins. Pure Appl Chem. 1990;62:1417–20.
- Krasnoff SB, Gupta S. Efrapeptin production by Tolypocladium fungi (Deuteromycotina: Hyphomycetes): Intra- and interspecific variation. J Chem Ecol. 1992;18:1727–41.
- Bandani AR, Khambay BPS, Faull JL, Newton R, Deadman M, Butt TM. Production of efrapeptins by Tolypocladium species and evaluation of their insecticidal and antimicrobial properties. Mycol Res. 2000;104:537–44.
- Gupta S, Krasnoff SB, Roberts DW, Renwick JAA. Structure of Efrapeptins from the fungus Tolypocladium niveum: peptide inhibitors of mitochondrial ATPase. J Org Chem. 1992;57:2306–13.
- Nagaraj G, Uma MV, Shivayogi MS. Antimalarial Activities of Peptide Antibiotics Isolated from Fungi. Antimicrob Agents Chemother. 2001;45:145–9.
- Martinez D, Berka RM, Henrissat B, Saloheimo M, Arvas M, Baker SE, et al. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nat Biotechnol. 2008;26:553–60.
- Kubicek CP, Herrera-Estrella A, Seidl-Seiboth V, Martinez DA, Druzhinina IS, Thon M, et al. Comparative genome sequence analysis underscores mycoparasitism as the ancestral life style of Trichoderma. Genome Biol. 2011;12:R40.
- Zheng P, Xia Y, Xiao G, Xiong C, Hu X, Zhang S, et al. Genome sequence of the insect pathogenic fungus Cordyceps militaris, a valued traditional Chinese medicine. Genome Biol. 2011;12:R116.
- Xiao G, Ying S-H, Zheng P, Wang Z-L, Zhang S, Xie X-Q, et al. Genomic perspectives on the evolution of fungal entomopathogenicity in Beauveria bassiana. Sci Rep. 2012;2:483.
- Gao Q, Jin K, Ying SH, Zhang Y, Xiao G, Shang Y, et al. Genome sequencing and comparative transcriptomics of the model entomopathogenic fungi Metarhizium anisopliae and M. acridum. PLoS Genet. 2011;7:e1001264.
- Hu X, Zhang Y, Xiao G, Zheng P, Xia Y, Zhang X, et al. Genome survey uncovers the secrets of sex and lifestyle in caterpillar fungus. Chinese Sci Bull. 2013;58:2846–54.
- Bushley KE, Raja R, Jaiswal P, Cumbie JS, Nonogaki M, Boyd AE, et al. The genome of Tolypocladium inflatum: evolution, organization, and expression of the cyclosporin biosynthetic gene cluster. PLoS Genet. 2013;9:e1003496.
- Parra G, Bradnam K, Ning Z, Keane T, Korf I. Assessing the gene space in draft genomes. Nucleic Acids Res. 2009;37:289–97.
- Sedmera P, Havlicek V, Jegorov A, Segre AL. Cyclosporin D Hydroperoxide, a new metabolite of Tolypocladium terricola. Tetrahedron Lett. 1995;36:6953–6.
- Traber R, Dreyfuss MM. Occurrence of cyclosporins and cyclosporin-like peptolides in fungi. J Ind Microbiol. 1996;17:397–401.
- Wiemann P, Guo C-J, Palmer JM, Sekonyela R, Wang CCC, Keller NP. Prototype of an intertwined secondary-metabolite supercluster. Proc Natl Acad Sci U S A. 2013;110:17065–70.
- Kershaw M, Moorhouse E, Bateman R, Reynolds S, Charnley A. The role of destruxins in the pathogenicity of Metarhizium anisopliae for three species of insect. J Invertebr Pathol. 1999;74:213–23.
- Wang B, Kang Q, Lu Y, Bai L, Wang C. Unveiling the biosynthetic puzzle of destruxins in Metarhizium species. Proc Natl Acad Sci U S A. 2012;109:1287–92.
- Kneifel H, Konig WA, Loeffler W, Muller R. Ophiocordin, an antifungal antibiotic of Cordyceps ophioglossoides. Arch Microbiol. 1977;113:121–30.
- Boros C, Hamilton SM, Katz B, Kulanthaivel P. Comparison of Balanol from Verticillium balanoides and ophiocordin from Cordyceps ophioglossoides. J Antibiot (Tokyo). 1994;47:1010–6.
- Putri SP, Kinoshita H, Ihara F, Igarashi Y, Nihira T. Ophiosetin, a new tetramic acid derivative from the mycopathogenic fungus Elaphocordyceps ophioglossoides. J Antibiot (Tokyo). 2010;63:195–8.
- Sims JW, Fillmore JP, Warner DD, Schmidt EW. Equisetin biosynthesis in Fusarium heterosporum. Chem Commun (Camb). 2005;186–8.
- Tsantrizos YS, Pischos S, Sauriol F, Widden P. Peptaibol metabolites of Tolypocladium geodes. Can J Chem. 1996;172:165–72.
- Lee S-J, Yeo W-H, Yun B-S, Yoo I-D. Isolation and sequence analysis of new peptaibol, boletusin, from Boletus spp. J Pept Sci. 1999;5:374–8.
- Lee S-J, Yun B-S, Cho D-H, Yoo I-D. Tylopeptins A and B, new antibiotic peptides from Tylopilus neofelleus. J Antibiot (Tokyo). 1999;52:998–1006.
- Young C, McMillan L, Telfer E, Scott B. Molecular cloning and genetic analysis of an indole-diterpene gene cluster from Penicillium paxilli. Mol Microbiol. 2001;39:754–64.
- Fleetwood DJ, Scott B, Lane GA, Tanaka A, Johnson RD. A complex ergovaline gene cluster in Epichloë endophytes of grasses. Appl Environ Microbiol. 2007;73:2571–9.
- Campbell MA, Staats M, van Kan JAL, Rokas A, Slot JC. Repeated loss of an anciently horizontally transferred gene cluster in Botrytis. Mycologia. 2013;105:1126–34.
- Mukherjee PK, Horwitz BA, Kenerley CM. Secondary metabolism in Trichoderma--a genomic perspective. Microbiology. 2012;158(Pt 1):35–45.
- FASTQ/A short-reads pre-processing tools [http://hannonlab.cshl.edu/fastx_toolkit/]
- Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.
- Wgsim - read simulator for next generation sequencing [http://github.com/lh3/wgsim]
- Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A. 2011;108:1513–8.
- Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.
- Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008;18:188–96.
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.
- Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–70.
- Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 2008;18:1979–90.
- Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
- Stanke M, Morgenstern B. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 2005;33(Web Server issue):W465–7.
- RepeatMasker [http://www.repeatmasker.org]
- Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21 Suppl 1:i351–8.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic Local Alignment Search Tool. J Mol Biol. 1990;215:403–10.
- Eddy SR. A new generation of homology search tools based on probabilistic inference. Genome informatics. 2009;23:205–11.
- Blin K, Medema MH, Kazempour D, Fischbach MA, Breitling R, Takano E, et al. antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers. Nucleic Acids Res. 2013;41(Web Server issue):W204–12.
- Khaldi N, Seifuddin FT, Turner G, Haft D, Nierman WC, Wolfe KH, et al. SMURF: Genomic mapping of fungal secondary metabolite clusters. Fungal Genet Biol. 2010;47:736–41.
- Mesquite: a modular system for evolutionary analysis [http://mesquiteproject.org]
- Darling AE, Mau B, Perna NT. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5:e11147.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–90.
- Robbertse B, Yoder RJ, Boyd A, Reeves J, Spatafora JW. Hal: an automated pipeline for phylogenetic analyses of genomic data. PLoS Curr. 2011;3:RRN1213.
- Enright AJ, Van DS, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002;30:1575–84.
- Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–77.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.