- Research article
- Open Access
Chromosome level assembly and secondary metabolite potential of the parasitic fungus Cordyceps militaris
BMC Genomicsvolume 18, Article number: 912 (2017)
Cordyceps militaris is an insect pathogenic fungus that is prized for its use in traditional medicine. This and other entomopathogenic fungi are understudied sources for the discovery of new bioactive molecules. In this study, PacBio SMRT long read sequencing technology was used to sequence the genome of C. militaris with a focus on the genetic potential for secondary metabolite production in the genome assembly of this fungus.
This is first chromosome level assembly of a species in the Cordyceps genera. In this seven chromosome assembly of 33.6 Mba there were 9371 genes identified. Cordyceps militaris was determined to have the MAT 1-1-1 and MAT 1-1-2 mating type genes. Secondary metabolite analysis revealed the potential for at least 36 distinct metabolites from a variety of classes. Three of these gene clusters had homology with clusters producing desmethylbassianin, equisetin and emericellamide that had been studied in other fungi.
Our assembly and analysis has revealed that C. militaris has a wealth of gene clusters for secondary metabolite production distributed among seven chromosomes. The identification of these gene clusters will facilitate the future study and identification of the secondary metabolites produced by this entomopathogenic fungus.
Entomopathogenic fungi are a fascinating group of insect parasitic microbes, which include species from a variety of different fungal taxa including Beauvaria, Hirsutella, Metarhizium Cordyceps and Ophiocordyceps (Fig. 1). These entomopathogenic fungi typically have a lifecycle in which the host is infected and is killed in the process of fungal propagation. Two closely related genera of entomopathogemic fungi, Cordyceps and Ophiocordyceps (often times just referred to as cordyceps in common literature) are characterized by their unique lifecycle and a specific process by which they parasitize and reproduce using the insect host.
Fungal spores and hyphae are able to penetrate insect cuticle and then colonize and proliferate within their body cavity. As the insect’s body is used as a nutrient reservoir for growth, the insect’s behaviour is modified, eventually leading the host to die in an advantageous location for fungal spore dispersal. The fungus then emerges as a fruiting body from the corpse of the insect, which matures and disperses spores of the next generation. Spread globally, Cordyceps and Ophiocordyceps fungi have been described in climates across Asia, the Americas, Europe and Australia, with many of these species having not been characterized. Although these genera are believed to contain well over 400 species of fungi, there are a few standout examples which are revered for their medicinal potential or unusual host pathogenesis.
Ophiocordyceps sinensis, found in the mountains of Tibet, infects and kills ghost moth larvae to give the highly prized herbal remedy “dong chong xia cao,” which is believed to treat a plethora of disorders . This prized specimen is identified by the fruiting body growing from the ground, as the infected ghost moth larvae dies situated just below the surface of the soil with its head oriented upward, from which the fruiting body emerges. Ophiocordyceps unilateralis, also known as the zombie-ant fungus, is noted for its pathogenic process in ants, which is characterized by particular behaviour modifications in the host, that leaves the host ant perished with its jaw clamped to a leaf in prime location for spore dispersal . Cordyceps militaris, which is a common component of supplements as it is also believed to have medicinal potential, is often used as a cheaper and more readily available version of Ophiocordyceps sinensis [3, 4].
The genome of a handful of these fungi have been sequenced, however, the available assemblies are often fragmented in over 500 contigs [5,6,7,8]. These assemblies do indicate that these fungi are capable to producing natural products, possibly over 30 distinct molecules per species. Only a few of these natural products from entomopathogenic fungi have been isolated and described, including the immunosuppressant, cyclosporine, from Tolypocladium inflatum, and fingolimoid, the immunomodulatory molecule derived from the Isaria sinclarii natural product myriosin, signifying that these fungi may be an underexplored source of novel molecules [9, 10].
Natural products have been established as a source of bioactive molecules, however, discovery has dwindled implying the need for new sources. Fungi have been shown to produce a wealth of diverse molecules  suggesting that Cordyceps and Ophiocordyceps could be an understudied and relevant avenue for the discovery of natural products. Furthermore, by studying the secondary metabolites in Cordyceps and Ophiocordyceps not only is there the potential for discovery of new bioactives, but for the identification and study of chemicals that have a role in the process of host pathogenesis, from behaviour modifying molecules to insecticides . As genome sequencing becomes cheaper, faster, and more readily available, the possibility of taking a computational approach to genome mining for secondary metabolite discovery in fungi becomes a more realistic possibility, allowing for the study of secondary metabolites which may be cryptic under typical laboratory conditions . Indeed, laboratory culture of these organisms is a challenge due to their slow growth rates – a genome-based approach to their natural product genes is likely essential for this field to progress.
In this study, a new method of long read sequencing, Pacific Biosciences SMRT sequencing is applied to an exemplary sample from the Cordyceps genera, Cordyceps militaris, a strain isolated from butterfly pupa. The overarching goal is to provide a chromosome level genome assembly to serve as a model for the genera. Furthermore, as these fungi have the potential to produce many understudied natural products, this study is focused on the genetic potential for secondary metabolite expression in this organism.
General genome features
Purified genomic DNA isolated from culture of C. militaris grown up from a single colony was sequenced using the Pacific Biosciences platform using a sheared large insert library . Sequence data from 6 SMRT cells, providing approximately 180× coverage were assembled using two de novo assemblers, Celera with the PBcR protocol and the HGAP2 protocol from SMRT portal [15,16,17]. Both chosen assemblers were applied to self-correct the reads, a process in which the shorter PacBio reads were used to error correct the long PacBio reads. These corrected reads were then subsequently assembled into contigs. The PBcR-pipeline gave an assembly with 32 contigs, four of which had telomeric repeats (CCCTAA or TTAGGG)n on either the 5′ or 3′ end of the contig.
The second assembly protocol, the HGAP2 protocol from the SMRT portal software package, which also included an additional polishing step using the Quiver algorithm, yielded an even further improved assembly. This assembly, which contained 18 contigs, gave five assembled chromosomes, having telomeric repeats (CCCTAA or TTAGGG)n on both the 5′ and 3′ ends of the sequence and four having telomeric repeats on one of the 5′ or 3′ end of the sequence. After manually curating the assembly and submitting the curated assembly to the SMRT resequencing protocol, the resultant assembly contained 7 contigs, each with telomeres on both ends, indicative of 7 chromosomes. Coverage across these seven chromosomes, including regions where the assembly was manually curated is shown in Additional file 1: Figure S1. The coverage across the chromosomes is generally consistent in the assembly after manual curation and the SMRT resequencing protocol. The N50 (5.78kba) and Nmax (8.29kba) remain unchanged when comparing the initial assembly and the curated and resequenced assembly. However, there is a spike in coverage in the contig corresponding to chromosome IV, possibly implying a collapsed repeat region within the assembly (Additional file 1: Figure S1D). When this area of the assembly is further inspected, it is noted that there are a large number of short low quality repeated reads, which align on top of >30× coverage of high quality reads that span this repetitive region. The repetitive sequence of this region and the overabundance of low quality reads in this region was noted prior to manual curation. The initial assembly consisted of 3 contigs containing these short repetitive reads, each about 30,000 bp in length, which overlapped in this region. With an overall coverage of approximately 150×, a genome size of 33.6 Mba, and a GC content of 50.9%. (Table 1), the BUSCO completeness of this assembly is 98.2% with only 0.4% missing . Furthermore, the assembly is comparable to the previously sequenced C. militaris Cm01 (though that sequence is broken into a very large number of contigs) with a similar genome size of 32.2 Mba and a GC content of 51.4% .
The seven assembled C. militaris chromosomes range in size from 1.9 to 8.3 Mba. The sequenced haploid genomes of Aspergillus niger and Neurospora crassa contain eight and seven chromosomes, respectively [19, 20]. Furthermore, a karyotype analysis of Tolypocladium inflatum shows that this related species contains seven chromosomes ranging in size from 1.0 to 6.3 Mba , suggesting that the assembly with seven chromosomes is reasonable for C. militaris.
The MAKER genome annotation pipeline [22, 23] predicted 9907 genes for C. militaris. Passing the MAKER gene predictions through an additional evidence modeler using Funannotate gave a set of 9371 genes with a BUSCO analysis of the resulting gene set estimating a completeness of 93.7% with 3.9% of genes missing [18, 24]. Estimates of mean gene length, mean exon length, mean intron length and gene density (Table 2) are similar to those of C. militaris Cm01 and other filamentous ascomycete fungi (Table 3) [5,6,7, 12, 25, 26]. An Interpro analysis of the annotated genes using the Blast2GO suite was used to assign 8792 genes (93.8%) of genes InterPro IDs. A Gene Ontology (GO) annotation was assigned to 6453 of the genes (68.9%).
Mating type loci
The sequence of our isolate revealed the presence of only a MAT 1-2-1 mating type gene present on chromosome VII, supporting the notion that this fungus is indeed heterothalic. Both MAT 1-1, MAT 1-2 and strains with hybrid mating loci have been reported . The previously sequenced C. militaris Cm01 strain was determined to have both MAT 1-1-1 and MAT 1-1-2 mating type genes. The C. militaris Cm06 strain was reported to be a hybrid strain containing both the MAT 1-1 and MAT 1-2 mating types, with single spore isolates from this hybrid strain producing progeny with either the MAT 1-1 (93.3%) or MAT 1-2 (6.7%) loci . Both the MAT 1-1 and MAT 1-2 containing isolates have been shown to fruit, but only the hybrid strain containing both the MAT 1-1 and MAT 1-2 loci was able to produce mature spores . However, fruiting bodies were not observed under analogous conditions with the ATCC® 34164 strain. Comparison of the genomic regions containing the mating type genes in both our ATCC® 34164 strain and the Cm01 strain reveal that the genes in these regions are highly similar, with the exception of the MAT genes (Fig. 2).
One hallmark molecule of interest in C. militaris is the nucleoside analogue cordycepin. Although this biosynthesis is unknown, it is proposed that this mechanism is dependent upon a reduction step, believed to be potentially catalyzed by a ribonucleoside diphosphate reductase (RNR) [6, 27, 28]. However, our sequenced C. militaris is similar to the sequenced Cm01 strain, in that it only seems to possess two type I RNRs (genes A9K55_000536 and A9K55_003140), both of which have homologues in non-cordycepin producing fungus, and have been identified in C. militaris Cm01, leaving the biosynthesis of cordycepin elusive .
Secondary metabolite potential
Sequenced fungi from the Cordyceps, Ophiocordypces, and related genera have revealed the potential for production of over 30 diverse secondary metabolites per strain [7, 8, 26, 29, 30]. The limited number of prior systematic studies to identify bioactive secondary metabolites from Cordyceps and related fungi have shown that a number of novel molecules can be produced by these microbes [31,32,33,34,35]. However, these studies do not nearly capture the full secondary metabolite potential of these fungi, likely due to the fact that many of these metabolites may be cryptic and not expressed under the tested laboratory conditions. To determine whether this fungus could produce a wealth of secondary metabolites, the genetic potential for diverse metabolite production became the focus of the study. Using two gene cluster predictors, AntiSMASH and SMURF, all seven C. militaris chromosomes were profiled for the presence of genes that could be responsible for the biosynthesis of secondary metabolites [36,37,38]. The fungal version of AntiSMASH, predicted 32 secondary metabolites and SMURF predicted 25. Taken together, the two algorithms predicted the presence of 36 unique gene clusters which could be responsible for secondary metabolite production in C. militaris (Table 4).
Distribution of these secondary metabolite producing genes were mapped on the 7 chromosomes (Fig. 3a). No gene clusters for secondary metabolites were noted in the presumably collapsed area of the genome shown in chromosome IV. The 36 metabolite producing gene clusters were from a variety of classes, including eight nonribosomal peptide synthetases (NRPS), seven type 1 polyketide synthases (T1PKS), six polyketide synthase-nonribosomal peptide synthetase (PKS-NRPS) hybrids, four terpenes, one indole and ten falling into other classes (Fig. 3b).
For comparison with the previously sequenced strain, the antiSMASH algorithm was also used to predict the presence of natural product producing gene clusters in the in C. militaris Cm01. In Cm01 there were 28 natural product clusters identified by AntiSMASH, compared to the 32 in our strain of study. All of the 28 clusters were present in the ATCC® 34164 strain and the ATCC®34164 strain had four additional natural product clusters identified. These additional clusters are predicted to produce one indole (V-3), two T1PKS (VI-2, VI-8) and one T1PKS-NRPS (VII-5). Furthermore, using the ClusterFinder algorithm in the fungal version AntiSMASH, an additional 41 putative clusters were predicted, bringing the total to 73 predicted clusters from AntiSMASH with ClusterFinder, suggesting that the secondary metabolite potential of this organism is impressive.
Herein is described the first chromosome level assembly of a Cordyceps genome. This seven chromosome assembly has revealed that this heterothallic strain, which contains 9371 genes, is capable of producing a wealth of secondary metabolites. Of the 36 gene clusters identified in ATCC®34164 by both the antiSMASH and SMURF algorithms (Additional file 2: Table S1), 3 clusters, III-1, V-6 and VII-5, are of particular interest as they have homology with gene clusters from other organisms that produce characterized natural products [39, 40].
It seems that a logical product of cluster III-1 could be a 2-pyridone alkaloid molecule. The hybrid NRPS-PKS central to this cluster (A9K55_001190) is similar to the NRPS-PKS responsible for the production of desmethylbassianin (70% identity) and tenellin (67% identity) from Beauveria bassiana and fumosorinone (66% identity) from Isaria fumosoinone [41,42,43]. Both the desmethylbassanin and tennellin gene clusters consist of 4 genes: the NRPS-PKS hybrid, an enoyl reductase and two cytochrome P450s (Fig. 4). In C. militaris, based on the sequence of gene A9K55_001191 it seems that the missing enoyl reductase may be fused with the cytochrome P450. Interestingly, a structurally related pigmented derivative, militarinone A, and the variants militarinone B–D, have been isolated from the possible C. militaris anamorph, Paecilomyces militaris [44, 45]. However, militarinone A–D was not identified by mass in extracts of the C. militaris strain of interest in this study.
On chromosome V, a cluster (V-6) with homology to the emercellamide producing cluster is present (Fig. 5). The emercellamide family molecules produced from the hybrid NRPS-PKS containing cluster have been described in the marine fungus Emericella, as well as the fungus Aspergillus nidulans [46, 47]. Other related molecules, scopularide A and W493-B have been isolated from Scopulariopsis brevicaulis and Fusarium pseudograminearum, respectively [48,49,50]. The biosynthesis of emericellamide A in Aspergillus has been described and is shown to rely on four genes . Comparing this emercellamide-like cluster in C. militaris to the gene cluster producing emercellamide in Aspergillus shows a conservation of 4 genes: an NRPS, a PKS, an acyl-transferase and a CoA ligase. The NRPS present in C. militaris (A9K55_005039) has 98% coverage and 43% identity with the NRPS in Aspergillus nidulans.
A cluster with similarities to the equisitin-producing cluster (VII-5) is also present in C. militaris on chromosome VII. This molecule, equisetin, was described as having structural similarities to the cholesterol lowering molecule lovastatin and was first isolated from Fusarium equiseti with a described bioactivity as a HIV-1 integrase inhibitor [51,52,53]. The biosynthesis, studied in Fusarium heterosporum, reveals that the gene cluster consists of seven genes, two NRPS/PKS, two regulators, an oxidase, a methyltransferase and a transporter . The comparable cluster in C. militaris consists of five genes, homologous to the genes present in the F. heterosporum minus one of the regulators and the oxidase (Fig. 6). The NRPS/PKS present in C. militaris (A9K55_008762) has 99% coverage and 50% identity with the NRPS/PKS in Fusarium heterosporum and seems to be well conserved among fungi in the Aspergillus and Penicillum genera.
This assembly of a genome from the Cordyceps genera, Cordyceps militaris, has shown the potential for production of an array of potentially novel natural products. This species is predicted to produce at least 36 secondary metabolites, three of which have significant similarity to characterized gene clusters. As fungal secondary metabolites can be cryptic under standard laboratory conditions, this assembled genome will allow for the application of genome mining techniques to guide the discovery and identification of new natural products. This can progress forward through a variety of techniques; one approach is to heterologously express gene clusters identified in the C. militaris genome. Alternatively, utilizing the genome to extrapolate potential natural products for expression can give important clues about the structure and favorable culture conditions of a secondary metabolite associated with a characterized gene cluster. This knowledge can increase the likelihood of the production of the correlated molecule and simplify structural determination. Regardless, this assembly has shown that there is a great potential for the production of secondary metabolites in C. militaris and that this and other fungi from related Cordyceps and Ophiocordyceps genera could provide a wealth of molecular structural diversity.
Presented here is the first chromosome level assembly of a genome from the Cordyceps genera. This assembly and analysis has revealed that C. militaris has seven chromosomes containing a wealth of gene clusters for secondary metabolite production. Of the 36 gene clusters identified using the antiSMASH and SMURF algorithms, three clusters are found to have a high degree of similarity with clusters from other organisms that produce a known molecule. With this genome, further study and characterization of the secondary metabolites produced by C. militaris can be aided through genome based techniques including heterologous expression of gene clusters. As there is great potential for the production of secondary metabolites from C. militaris, this is one step towards discovering and characterizing the wealth of molecular structural diversity in this genera.
Phylogenetic tree construction
To compare the fungal species of interest, 18S rRNA sequences were obtained from the Silva database . Sequence alignment was performed using ClustalW . To construct the phylogenetic tree, The evolutionary history was inferred using the Neighbor-Joining method . The optimal tree with the sum of branch length = 0.35499733 is shown. The evolutionary distances were computed using the Maximum Composite Likelihood method  and are in the units of the number of base substitutions per site. The analysis involved 12 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All ambiguous positions were removed for each sequence pair. There were a total of 988 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 .
Fungal strain and maintenance
Cordyceps militaris ATCC® 34164 was received from the American Type Culture Collection (ATCC). This strain, as described in the ATCC records was isolated from a butterfly pupa. Fungal cultures were maintained at 23.0 °C on potato dextrose agar. The nrDNA of extracted genomic DNA was amplified using the ITS4 and ITS5 primer pairs, sequenced and compared against the BLAST database to determine sample validity [59,60,61].
Genomic DNA extraction and purification
Liquid cultures, containing 5 mL of seed media (10 g peptone, 40 g maltose, 10 g yeast extract, 1 g agar in 1 L DI water) in a culture tube were aseptically inoculated with a 3 mm square agar slab containing mycelial growth. The fungal mycelia were grown for 5 days at 23.0 °C. The mycelial mat was harvested, rinsed with sterile TE buffer and then frozen in liquid nitrogen. Frozen mycelia were macerated with a mortar and pestle with a spatula tip of aluminum oxide to aid in grinding the sample. The mycelial powder was transferred to a set of epitubes and 500 μL of CTAB DNA extraction buffer was added (100 mM Tris pH = 8.0, 10 mM EDTA, 2% CTAB, 2.8 M NaCl). The samples were incubated at room temperature for 5 min, then 2 μL of RNAse A (Thermo Scientific, 10 mg/mL) and 10 μL of Proteinase K were added (Invitrogen, 10 mg/mL) and inverted to mix. After centrifuging for 5 min, the pellet was ground in the epitube with a pellet pestle, then incubated for an additional 5 min before purification with phenol-chloroform. Each sample was washed twice with phenol-chloroform (50:50, phenol buffered with Tris pH 8.0, 600 μL) then twice with chloroform (600 μL). The resulting DNA containing aqueous portions were pooled and DNA was precipitated using cold ethanol (2.5× sample volume) and 3 M sodium acetate (0.1× sample volume). DNA was precipitated for at least 30 min by storing at −20 °C. The DNA precipitate was collected by centrifuging for 30 min, the pellet was washed with 70% ethanol and resuspended in TE buffer. The DNA was further purified with AMPure XP beads (Agencourt) by using an equal volume of beads to volume of DNA and eluting into TE. DNA was quantified using a PicoGreen assay (ThermoFisher) prior to sequencing.
The Cordyceps militaris DNA was sequenced using Pacific Biosciences RS II sequencing at the Genome Quebec Innovation Center (McGill University, Montreal, Canada). The sample was prepared using a large insert sheared DNA library and was sufficient for sequencing 6 SMRT cells.
The sequencing reads were assembled using two different assemblers. The first assembler chosen was the PBcR pipeline from the Celera assembler (version 8.3rc2) using a genome size of 32 Mba . The second assembler was SMRT portal (version 2.3.0) launched from an Amazon machine image. Assembly was performed on all 6 SMRT cells using the RS_HGAP Assembly.2 application with default settings and a genome size of 32 Mba . The resulting assembly yielded 18 contigs, with five of these contigs containing characteristic telomeric (CCCTAA)n or (TTAGGG)n repeats on both ends and four of these contigs containing telomeric repeats on one end. The 5 contigs with telomeres on both ends were taken to be fully sequenced chromosomes. The remaining 13 contigs were evaluated for overlapping regions that could possibly be used to join the contigs. Two of these 13 contigs were discarded due to low coverage (<30×), three of these remaining contigs were found to have overlapping regions that allowed them to be joined into 1 supercontig and the remaining 8 contigs were found to also contain overlapping regions that would allow them to be joined into a second supercontig. These overlapping regions were evaluated by subjecting the entire genome of 5 chromosomes initially assembled by SMRT portal, plus the two manually curated supercontig chromosomes, to the SMRT portal resequencing protocol as a reference genome, along with manually evaluating these overlapping regions by evaluating the reads that spanned the overlapping regions. The resulting assembly was in 7 contigs, with each end of the contig terminating in a telomeric repeat sequence.
Gene prediction, functional annotation and protein classification
Genome annotation was performed using the MAKER (version 2.31.8) pipeline using three ab inito gene prediction methods: Augustus trained for Fusarium graminearum, and GeneMark-ES and SNAP self-trained on the C. militaris genome [22, 23, 63,64,65]. Protein data from Cordyceps brongniartii, C. militaris, Cordyceps confragosa, Ophiocordyceps sinensis, Ophiocordyceps unilateralis and Tolypocladium ophioglossoides were used as protein evidence in MAKER. EST from Cordyceps militaris were downloaded from GeneBank and used as EST evidence in MAKER. Repeat elements were identified using Repeat Masker using the Repbase Library 20150807 . A final set of consensus gene predictions was chosen using Exonerate . The final gene set from MAKER was subjected to additional evidence modeling using Funannotate (0.7.0) . The gene models were functionally annotated using the BLAST component of the Blast2GO software package and searching against the NCBI nr protein database (accessed July 2017) with the best hit being selected . Gene families were established using the Interpro database using BlastProDOM, HMMPIR, HMMPfam, SuperFamily, SignalPHMM, HMMPanther . The BLAST hits were mapped to the Gene Ontology database and KEGG analysis was carried out . Secondary metabolite genes and gene clusters were predicted using both AntiSMASH, fungal version 4.0.0 and SMURF (accessed September 2016)[36, 38].
Attempt at fruiting body production
Using the protocol outlined in Zheng et al. fruiting body production was attempted with C. militaris ATCC® 34164 .
Attempt at militarinone production
Using the protocol in Schmidt et al. milititarone A–D production was attempted . Specifically, precultures of C. militaris were used to inoculate 150 mL of medium (2% glucose, 2% neopeptone, 0.5% glycine, 0.2% K2HPO4, MgPO4-7H2O) in still 500 mL Erlenmeyer flasks at 25 °C. After 20 days the broth was removed and the mycelia were collected and freeze dried, then extracted with methanol. This methanol extract was treated with water (1.5 mL for 10 g of extract) and then partitioned in a 1:1:1:1 mixture of ethyl acetate/methanol/hexane/1% acetic acid. The lower phase was collected, concentrated, and then analyzed for militarinone production via UPLC-MS. No mass peaks corresponding to ionized militarinone A–D or sodium or acetate adducts of those natural products were apparent.
Coverage and identity
Coverage and identity of C. militaris genes compared to genes in known biosynthetic clusters was determined using BLAST .
Hierarchical genome assembly process
PacBio corrected reads
Lo H-C, Hsieh C, Lin F-Y, Hsu T-H. A systematic review of the mysterious caterpillar fungus Ophiocordyceps sinensis in Dong-ChongXiaCao ( Dōng Chóng Xià Cǎo) and related bioactive ingredients. J Tradit Complement Med. 2013;3:16–32.
Pontoppidan M-B, Himaman W, Hywel-Jones NL, Boomsma JJ, Hughes DP. Graveyards on the move: the spatio-temporal distribution of dead Ophiocordyceps-infected ants. PLoS One. 2009;4:e4835.
Paterson RRM. Cordyceps – a traditional Chinese medicine and another fungal therapeutic biofactory? Phytochemistry. 2008;69:1469–95.
Yue K, Ye M, Zhou Z, Sun W, Lin X. The genus Cordyceps: a chemical and pharmacological review. J Pharm Pharmacol. 2013;65:474–93.
Xia E-H, Yang D-R, Jiang J-J, Zhang Q-J, Liu Y, Liu Y-L, et al. The caterpillar fungus, Ophiocordyceps sinensis, genome provides insights into highland adaptation of fungal pathogenicity. Sci Rep. 2017;7:1806.
Zheng P, Xia Y, Xiao G, Xiong C, Hu X, Zhang S, et al. Genome sequence of the insect pathogenic fungus Cordyceps militaris, a valued traditional Chinese medicine. Genome Biol. 2011;12:R116.
Bushley KE, Raja R, Jaiswal P, Cumbie JS, Nonogaki M, Boyd AE, et al. The genome of tolypocladium inflatum: evolution, organization, and expression of the cyclosporin biosynthetic gene cluster. PLoS Genet. Public Library of Science. 2013;9:e1003496.
Agrawal Y, Khatri I, Subramanian S, Shenoy BD. Genome sequence, comparative analysis, and evolutionary insights into Chitinases of Entomopathogenic fungus Hirsutella thompsonii. Genome Biol Evol. 2015;7:916–30.
Adachi K, Chiba K. FTY720 story. Its discovery and the following accelerated development of sphingosine 1-phosphate receptor agonists as immunomodulators based on reverse pharmacology. Perspect Med Chem. 2007;1:11–23.
Traber R, Hofmann H, Kobel H. Cyclosporins. New analogues by precursor directed biosynthesis. J Antibiot (Tokyo). 1989;42:591–7.
Hoffmeister D, Keller NP. Natural products of filamentous fungi: enzymes, genes, and their regulation. Nat Prod Rep. 2007;24:393–416.
de Bekker C, Ohm RA, Loreto RG, Sebastian A, Albert I, Merrow M, et al. Gene expression during zombie ant biting behavior reflects the complexity underlying fungal parasitic behavioral manipulation. BMC Genomics. 2015;16:620.
Keller NP, Turner G, Bennett JW. Fungal secondary metabolism – from biochemistry to genomics. Nat Rev Microbiol. 2005;3:937–47.
Rhoads A, Au KF. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics. 2015;13:278–89.
Yandell M, Ence D. A beginner’s guide to eukaryotic genome annotation. Nat Rev Genet. 2012;13:329–42.
Berlin K, Koren S, Chin C-S, Drake JP, Landolin JM, Phillippy AM. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol. 2015;33:623–30.
De Novo Assembly – Pacific Biosciences [Internet]. Available from: http://www.pacb.com/products-and-services/analytical-software/smrt-analysis/analysis-applications/de-novo-assembly/. Cited 8 Jun 2017.
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–2.
Baker SE. Aspergillus niger genomics: Past, present and into the future. Medical Mycology. 2006;44(Suppl 1):S17–S21.
Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, et al. The genome sequence of the filamentous fungus Neurospora crassa. Nature. Nature Publishing Group. 2003;422:859–68.
Stimberg N, Walz M, Schörgendorfer K, Kiick U. Electrophoretic karyotyping from Tolypocladium inflatum and six related strains allows differentiation of morphologically similar species. Appl Microbiol Biotechnol. 1992;37:485–9.
Campbell MS, Holt C, Moore B, Yandell M. Genome annotation and curation using MAKER and MAKER-P. Curr Protoc Bioinformatics. 2014;48:4.11.1–39.
Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008;18:188–96.
Palmer J. Funannotate: pipeline for genome annotation [Internet]. Available from: https://github.com/nextgenusfs/funannotate/. Cited 1 May 2017.
Cuomo CA, Guldener U, Xu J-R, Trail F, Turgeon BG, Di Pietro A, et al. The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization. Science. 2007;317:1400–2.
Quandt CA, Bushley KE, Spatafora JW. The genome of the truffle-parasite Tolypocladium ophioglossoides and the evolution of antifungal peptaibiotics. BMC Genomics. 2015;16:553.
Lennon MB, Suhadolnik RJ. Biosynthesis of 3′-deoxyadenosine by Cordyceps militaris. Mechanism of reduction. Biochim Biophys Acta. 1976;425:532–6.
Reichard P. Ribonucleotide reductases: substrate specificity by allostery. Biochem Biophys Res Commun. 2010;396:19–23.
Hu X, Zhang YJ, Xiao GH, Zheng P, Xia YL, Zhang XY, et al. Genome survey uncovers the secrets of sex and lifestyle in caterpillar fungus. Chin Sci Bull. 2013;58:2846–54.
Larriba E, Jaime MDLA, Carbonell-Caballero J, Conesa A, Dopazo J, Nislow C, et al. Sequencing and functional analysis of the genome of a nematode egg-parasitic fungus, Pochonia chlamydosporia. Fungal Genet Biol. Elsevier Inc. 2014;65:69–80.
Asai T, Yamamoto T, Oshima Y. Aromatic Polyketide production in Cordyceps indigotica, an Entomopathogenic fungus, induced by exposure to a Histone Deacetylase inhibitor. Org Lett. 2012;14:2006–9.
Grudniewska A, Hayashi S, Shimizu M, Kato M, Suenaga M, Imagawa H, et al. Opaliferin, a new Polyketide from cultures of Entomopathogenic fungus Cordyceps sp. NBRC 106954. Org Lett. 2014;16:4695–7.
Khaokhajorn P, Samipak S, Nithithanasilp S, Tanticharoen M, Amnuaykanjanasin A. Production and secretion of naphthoquinones is mediated by the MFS transporter MFS1 in the entomopathogenic fungus Ophiocordyceps sp. BCC1869. World J Microbiol Biotechnol. 2015;31:1543–54.
Varughese T, Rios N, Higginbotham S, Elizabeth Arnold A, Coley PD, Kursar TA, et al. Antifungal depsidone metabolites from Cordyceps dipterigena, an endophytic fungus antagonistic to the phytopathogen Gibberella fujikuroi. Tetrahedron Lett. 2012;53:1624–6.
Krasnoff SB, Reátegui RF, Wagenaar MM, Gloer JB, Gibson DM, Cicadapeptins I. II: new Aib-containing peptides from the Entomopathogenic fungus Cordyceps h eteropoda. J Nat Prod. 2005;68:50–5.
Weber T, Blin K, Duddela S, Krug D, Kim HU, Bruccoleri R, et al. antiSMASH 3.0--a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res. 2015;43:W237–43.
Fedorova ND, Moktali V, Medema MH. Bioinformatics approaches and software for detection of secondary metabolic gene clusters. Methods Mol Biol. 2012;944:23–45.
Khaldi N, Seifuddin FT, Turner G, Haft D, Nierman WC, Wolfe KH, et al. SMURF: genomic mapping of fungal secondary metabolite clusters. Fungal Genet Biol. 2010;47:736–41.
Fisch KM. Biosynthesis of natural products by microbial iterative hybrid PKS–NRPS. RSC Adv. 2013;3:18228.
Chooi Y-H, Tang Y. Navigating the fungal polyketide chemical space: from genes to molecules. J Org Chem. 2012;77:9933–53.
Liu L, Zhang J, Chen C, Teng J, Wang C, Luo D. Structure and biosynthesis of fumosorinone, a new protein tyrosine phosphatase 1B inhibitor firstly isolated from the entomogenous fungus Isaria fumosorosea. Fungal Genet Biol. 2015;81:191–200.
Eley KL, Halo LM, Song Z, Powles H, Cox RJ, Bailey AM, et al. Biosynthesis of the 2-Pyridone tenellin in the insect pathogenic fungus Beauveria bassiana. Chembiochem. WILEY-VCH Verlag. 2007;8:289–97.
Heneghan MN, Yakasai AA, Williams K, Kadir KA, Wasil Z, Bakeer W, et al. The programming role of trans-acting enoyl reductases during the biosynthesis of highly reduced fungal polyketides. Chem Sci The Royal Society of Chemistry. 2011;2:972.
Schmidt K, Gu W, Stoyanova S, Schubert B, Li Z, Hamburger M. Militarinone A, a Neurotrophic Pyridone alkaloid from Paecilomyces militaris. Org Lett. 2002;4(2):197–9.
Schmidt K, Riese U, Li Z, Hamburger M. Novel Tetramic acids and Pyridone alkaloids, Militarinones B, C, and D, from the insect pathogenic fungus Paecilomyces militaris. J Nat Prod. 2003;66:378–83.
D-C O, Kauffman CA, Jensen PR, Fenical W. Induced production of Emericellamides a and B from the marine-derived fungus Emericella sp. in competing co-culture. J Nat Prod. 2007;70:515–20.
Chiang Y-M, Szewczyk E, Nayak T, Davidson AD, Sanchez JF, Lo H-C, et al. Molecular genetic Mining of the Aspergillus Secondary Metabolome: discovery of the Emericellamide biosynthetic pathway. Chem Biol. 2008;15:527–32.
Lukassen M, Saei W, Sondergaard T, Tamminen A, Kumar A, Kempken F, et al. Identification of the Scopularide biosynthetic gene cluster in Scopulariopsis brevicaulis. Mar Drugs. 2015;13:4331–43.
Niheik K, Itoh H, Hashimoto K, Miyairi K, Okuno T. Antifungal Cyclodepsipeptides, W493 a and B, from Fusarium sp.: isolation and structural determination. Biosci Biotechnol Biochem. 1998;62:858–63.
Sørensen JL, Sondergaard TE, Covarelli L, Fuertes PR, Hansen FT, Frandsen RJN, et al. Identification of the biosynthetic gene clusters for the Lipopeptides Fusaristatin a and W493 B in Fusarium graminearum and F. pseudograminearum. J Nat Prod. 2014;77:2619–25.
Vesonder RF, Tjarks LW, Rohwedder WK, Burmeister HR, Laugal JA. Equisetin, an antibiotic from Fusarium equiseti NRRL 5537, identified as a derivative of N-methyl-2,4-pyrollidone. J Antibiot (Tokyo). 1979;32:759–61.
Campbell CD, Vederas JC. Biosynthesis of lovastatin and related metabolites formed by fungal iterative PKS enzymes. Biopolymers. 2010;93:755–63.
Sims JW, Fillmore JP, Warner DD, Schmidt EW. Equisetin biosynthesis in Fusarium heterosporum. Chem Commun (Camb). 2005:186–8.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–D6.
Thompson JD, Gibson TJ, Higgins DG. Multiple sequence alignment using ClustalW and ClustalX. Curr Protoc Bioinforma. 2002. doi: 10.1002/0471250953.bi0203s00.
Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–25.
Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc Natl Acad Sci U S A. 2004;101:11030–5.
Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–4.
Toju H, Tanabe AS, Yamamoto S, Sato H. High-coverage ITS primers for the DNA-based identification of ascomycetes and basidiomycetes in environmental samples. PLoS One. Public Library of Science. 2012;7:e40863.
Schoch CL, Seifert KA, Huhndorf S, Robert V, Spouge JL, Levesque CA, et al. Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for fungi. Proc Natl Acad Sci. 2012;109:6241–6.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563–9.
Stanke M, Keller O, Gunduz I, Hayes A, Waack S, Morgenstern B. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34:W435–9.
Borodovsky M, Lomsadze A. Eukaryotic gene prediction using GeneMark.hmm-E and GeneMark-ES. Curr Protoc Bioinforma. 2011. doi: 10.1002/0471250953.bi0406s35.
Korf I. Gene finding in novel genomes. BMC Bioinformatics. BioMed Central. 2004;5:59.
Smit, A. F, Hubley, R., Green P. RepeatMasker Home Page [Internet]. Repeat Masker 3.0. Available from: http://www.repeatmasker.org/. Cited 30 Sep 2015.
Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31.
O’Leary NA, Wright MW, Brister JR, Ciufo S, Haddad D, McVeigh R, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44:D733–45.
Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, et al. InterPro: the integrative protein signature database. Nucleic Acids Res. Oxford University Press. 2009;37:D211–5.
Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. Oxford University Press. 2017;45:D353–61.
We would like to acknowledge Purapharm Co. Ltd. Canada. for financial support for this project and the staff at the Génome Québec Innovation Centre for the guidance with experimental design for genome sequencing.
Funding was in part provided from CIHR grant MOP-57684.
Availability of data and materials
The datasets supporting the conclusions of this article are available in the National Center for Biotechnology Information (NCBI) under the bioproject number PRJNA323705, https://www.ncbi.nlm.nih.gov/bioproject/PRJNA323705.
Ethics approval and consent to participate
Consent for publication
Not applicable. The study does not include any individual person’s data.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Coverage across chromosomes. Coverage across chromosomes from SMRT analysis resequencing protocol assembly. (DOCX 1935 kb)
Predicted gene clusters in C. militaris. Predicted gene clusters are labeled, putative natural product class and the predicted length of each enzyme that is part of the putative cluster is given. (DOCX 106 kb)