
The massive 340 megabase genome of Anisogramma anomala, a biotrophic ascomycete that causes eastern filbert blight of hazelnut

Abstract

Background

The ascomycete fungus Anisogramma anomala causes Eastern Filbert Blight (EFB) on hazelnut (Corylus spp.) trees. It is a minor disease on its native host, the American hazelnut (C. americana), but is highly destructive on the commercially important European hazelnut (C. avellana). In North America, EFB has historically limited commercial production of hazelnut to west of the Rocky Mountains. A. anomala is an obligately biotrophic fungus that has not been grown in continuous culture, rendering its study challenging. There is a 15-month latency before symptoms appear on infected hazelnut trees, and only a sexual reproductive stage has been observed. Here we report the sequencing, annotation, and characterization of its genome.

Results

The genome of A. anomala was assembled into 108 scaffolds totaling 342,498,352 nt with a GC content of 34.46%. Scaffold N50 was 33.3 Mb and L50 was 5. Nineteen scaffolds with lengths over 1 Mb constituted 99% of the assembly. Telomere sequences were identified on both ends of two scaffolds and on one end of another 10 scaffolds. Flow cytometry estimated the genome size of A. anomala at 370 Mb. The genome exhibits two-speed evolution, with 93% of the assembly as AT-rich regions (32.9% GC) and the other 7% as GC-rich (57.1% GC). The AT-rich regions consist predominantly of repeats with low gene content, while 90% of predicted protein coding genes were identified in GC-rich regions. Copia-like retrotransposons accounted for more than half of the genome. Evidence of repeat-induced point mutation (RIP) was identified throughout the AT-rich regions, and two copies of the rid gene and one of dim-2, the key genes in the RIP mutation pathway, were identified in the genome. Consistent with its homothallic sexual reproduction cycle, both MAT1-1 and MAT1-2 idiomorphs were found. We identified a large suite of genes likely involved in pathogenicity, including 614 carbohydrate active enzymes, 762 secreted proteins and 165 effectors.

Conclusions

This study reveals the genomic structure, composition, and putative gene function of the important pathogen A. anomala. It provides insight into the molecular basis of the pathogen’s life cycle and a solid foundation for studying EFB.

Background

The investigation of biotrophic fungi – pathogens that require living host tissue – is complex and challenging. Because of their dependency on the host organism, biotrophs are difficult to isolate and grow in artificial media. They often have strict nutritional requirements and may require certain hormones or signaling chemicals secreted by the host to induce spore germination [1, 2]. Satisfying these conditions complicates any form of manipulation under laboratory conditions. Studies of rust fungi, which are basidiomycetes, and powdery mildew fungi, which are ascomycetes, highlight many of these challenges. Despite the significant economic impact of the resulting diseases, complete life cycles of these fungi have never been witnessed outside of their natural hosts. Consequently, despite substantial effort on the parts of many scientists, many details of host–pathogen interactions in rust and powdery mildew fungi remain poorly understood [3,4,5].

Advances in sequencing and bioinformatic tools have led to the rapid development of genomic techniques that facilitate investigation even of recalcitrant organisms. As the number of sequenced fungal genomes expands, patterns and features that are linked to obligate biotrophy have emerged [6, 7]. Genomic features, including both coding and non-coding elements, reveal characteristics of lifestyle and pathogen biology [8, 9]. A large repertoire of species-specific secreted small cysteine-rich proteins that represent candidate effectors is typical of biotrophs that have gene specific interactions with their host [10, 11]. Large genomes inflated with repetitive elements are another hallmark of biotrophic pathogens, as amplification of such elements contributes to a flexible genomic landscape that is highly adaptable to the gene-for-gene arms race that pathogens engage in with their hosts [11,12,13]. Identifying these characteristics of genomic features can fill in the blanks left by a lack of experimental data [14, 15].

One such fungal pathogen whose biology is not well understood is Anisogramma anomala, an ascomycete within the order Diaporthales. A. anomala causes Eastern Filbert Blight (EFB), a devastating disease of hazelnut (Corylus spp.). The native host of A. anomala, American hazelnut (C. americana), tolerates infection, displaying mild disease symptoms and small, non-threatening cankers [16,17,18]. Both host and fungus are abundant on the east coast of the U.S. However, nearly all cultivars of the commercially important European hazelnut (C. avellana) are highly susceptible and develop severe perennial cankers that girdle stems, resulting in branch die-back and eventual tree death [19,20,21]. As such, EFB is the primary limiting factor of commercial hazelnut production in North America [22]. Historically, C. avellana cultivation was restricted to the Pacific Northwest region outside of the native range of A. anomala, limiting hazelnut cultivation to a fraction of its potential growth range [23]. Today, after an inadvertent introduction in the 1960s [24], EFB is widespread in the Pacific Northwest where it significantly impacts commercial production. Disease management costs were alleviated only recently by the release of resistant cultivars [25]. Despite the economic importance of A. anomala and considerable efforts now underway to breed for resistance [26], the EFB pathosystem remains poorly understood.

To support disease management and resistance breeding efforts, there is a need for a better understanding of the biology of A. anomala and the EFB pathosystem. However, A. anomala is an obligate biotroph, presenting many of the same methodological difficulties as rust fungi, powdery mildews, and other biotrophic pathogens [27]. The only useful source of tissue of A. anomala is ascospores extracted from the stromata of cankers of infected hazelnut, and successful subculture has not been achieved. Ascospores represent the only known spore stage of A. anomala; no conidial stage has been documented. Ascospores, by nature, are sexual spores and are not isogenic. While A. anomala ascospores can germinate and form small, branching germ hyphae, the fungus cannot be grown continuously in culture. It is predicted that A. anomala exhibits some form of self-inhibition, as ascospores will germinate in axenic culture only with the addition of an adsorbent such as activated charcoal or bovine serum albumin (BSA) [27]. Even with these additives, germinated ascospores exhibit poor growth and form small colonies (~ 0.25–0.5 mm in diameter) that yield little biomass [28]. Furthermore, the disease exhibits a complex, two-year infection cycle, which normally includes 15–18 months of latency, during which it is not feasible to visibly identify infected trees (Figure S1) [20, 21, 29,30,31].

Despite the challenges of performing experimental host/pathogen research, we saw the importance of understanding more about A. anomala, both as a contribution to the U.S. hazelnut industry and to plant pathogen biology. Due to the lack of an experimental system by which to study A. anomala, we used a genomic approach to elucidate features of EFB biology and pathogenesis. An earlier draft genome of A. anomala was assembled and mined for sequences that would be useful as simple sequence repeat (SSR) primers to examine population biology of the fungus and assist with resistance breeding [28]. That study revealed that the genome of A. anomala is surprisingly large, > 300 megabases (Mb), and contains an abundance of transposons that constitute nearly 90% of the genome sequence. In this study, we present an updated and refined draft of the A. anomala genome sequence, its annotation, and analysis. Genomic analysis reveals characteristics of biotrophy, including a massive population of transposable elements (TEs), bimodal distribution of GC content, and a cache of genes encoding effector molecules. We also identified a number of genes that code for proteins predicted to be involved in pathogenesis and host/pathogen interactions. The annotated genome of A. anomala will serve as a vital resource for future research on the pathogen and EFB disease.

Results

A. anomala has a large, gene-poor genome

The mate-pair and paired-end reads of genomic DNA for Anisogramma anomala OR1 generated over 31 Gb of data that were assembled into a 342,525,599 nucleotide (nt) genome with an average 91 × coverage (Table S1). The final assembly was distributed across 112 scaffolds with a GC content of 34.46%. Four scaffolds with a combined length of 27,247 nt were removed from further analysis as contamination, resulting in a final assembly size of 342,498,352 nt across 108 scaffolds. More than half of the assembly (N50) was on 5 scaffolds with an N50 scaffold length of 33.3 Mb. The largest scaffold was 43.9 Mb (Table 1). Nineteen major scaffolds (> 1 Mb) represent over 99% of the genome. This demonstrates a marked improvement over the first version of the assembly, published in 2013 [28] (Table S2). We identified telomere sequences (repeats of TTAGGG) on both ends of the second and third largest scaffolds with lengths of 40.1 and 39.2 Mb respectively, indicating these two scaffolds represent full-length chromosomes. Telomere sequences were also found on one end of 10 other scaffolds. Of the 19 largest scaffolds, telomere sequences were found on one end or both ends in 10 scaffolds (Fig. 1). On the contig level, the N50 was 196,655 bp and the L50 was 528 (Table 1).
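The N50 and L50 statistics quoted above can be illustrated with a short calculation. The following is a minimal Python sketch, not the tool used for this assembly; the function name `n50_l50` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")

# Hypothetical toy lengths (in Mb), not the actual A. anomala scaffolds.
toy_lengths = [40, 30, 20, 5, 3, 2]
print(n50_l50(toy_lengths))  # (30, 2)
```

Here N50 is the length of the sequence at which the cumulative sum first reaches half of the total, and L50 is the number of sequences needed to get there, which is the sense in which five scaffolds hold more than half of the assembly.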

Table 1 Table of assembly statistics and feature summary for the haploid genome of A. anomala
Fig. 1

Repeat content and gene density distribution across major scaffolds (> 1 Mb). Size of bar reflects the length of the scaffold (x-axis). Repeat density of dispersed repeats in bins of 100 kb is represented as a heat map, ranging from white to black, with darker color indicating higher repeat density. The height of each scaffold bar (y-axis) ranges from 0 to 200 genes/Mb with the average gene density of the scaffold plotted in orange. Gene density was calculated per 100 kb and plotted along each scaffold in purple

To evaluate the completeness of the A. anomala genome assembly, we performed flow cytometry using nuclei released from 8-week-old mycelium. Based on flow cytometry, the genome size of A. anomala OR1 was estimated to be 370 Mb (Figure S2), slightly larger than, but consistent with, the size of the genome assembly.

Using a combination of RNA-seq evidence and ab initio gene prediction, we predicted 9,179 protein coding genes in the A. anomala genome. This gene set includes 94.4% of eukaryotic benchmarking universal single-copy orthologs (BUSCOs) and 95.5% of fungal BUSCOs. Average gene density on major scaffolds was approximately 25.8 genes/Mb and remained relatively consistent among major scaffolds (Fig. 1).

Gene models were annotated with Gene Ontology (GO) terms merged with InterPro IDs. Eighty-eight percent of gene models had BLASTp hits against the NCBI nr database. Approximately 75% of gene models have been annotated by biological process and 50% with a molecular function (Fig. 2, Table S3). Gene models were also annotated with KEGG Orthology (KO) terms, using a combination of the KEGG Automated Annotation Server and BlastKOALA. Roughly 38% of protein sequences were assigned KO identifiers, which make up 99 complete or nearly complete KEGG pathways (Table S3).

Fig. 2

Annotation of A. anomala predicted gene models by Gene Ontology (GO) categories. Functions of genes are shown by biological process and molecular functions

A. anomala has a large arsenal of effectors and CAZymes

To identify proteins that may be involved in virulence and disease, we identified genes that code for potential effectors, molecules that are involved in host/pathogen interactions. We first identified 762 proteins with signal peptides as evidence of secretion. Those proteins were then analyzed with EffectorP 2.0 to further predict potential effector proteins. One hundred and sixty-five proteins (1.8% of total proteins; 21.7% of secreted proteins) were predicted to be effector candidates (Table 1). All effector candidates were subjected to a BLASTp search of the NCBI nr database. Over half (55%) of candidate effectors returned no BLAST hit, and of those that did return a hit, 42% were hypothetical proteins or proteins with unknown function. For those effector candidates that match a protein with a known function, possible roles include one glycoside hydrolase, one cutinase, and two peptidases (Table S4).

Genes encoding putative effector molecules were evaluated for their proximity to the closest repeat element and the closest large RIP affected region (LRAR) as predicted by RIPPER. BUSCOs and a random subset of all genes were included for comparison (Fig. 3). On average, effectors were approximately 1.5 kb from the nearest TE while BUSCOs and a randomized set were 3 kb and 2.5 kb respectively. The closest distance to LRARs for effectors, BUSCOs, and the randomized set did not differ significantly from each other and averaged at 9900 bp, 9400 bp, and 9700 bp respectively.
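The distance comparison described above reduces to finding, for each gene, the closest annotated interval on the same scaffold. The following Python sketch shows one way to do this under stated assumptions: the function name `distance_to_nearest` and the coordinates are hypothetical, repeats are assumed to be non-overlapping and sorted by start, and this is not the procedure used in the study.

```python
from bisect import bisect_left

def distance_to_nearest(gene, repeats):
    """Distance in bp from a gene (start, end) to the nearest repeat.

    `repeats` is a list of non-overlapping (start, end) intervals on the
    same scaffold, sorted by start. Overlapping features give 0, and an
    empty repeat list gives None.
    """
    g_start, g_end = gene
    starts = [r[0] for r in repeats]
    i = bisect_left(starts, g_start)
    best = None
    # Only the repeat just upstream and the one just downstream can be closest.
    for j in (i - 1, i):
        if 0 <= j < len(repeats):
            r_start, r_end = repeats[j]
            gap = max(r_start - g_end, g_start - r_end, 0)
            best = gap if best is None else min(best, gap)
    return best

# Hypothetical coordinates for illustration only.
repeats = [(100, 500), (4000, 4800), (9000, 9500)]
print(distance_to_nearest((2000, 3000), repeats))  # 1000
```

Averaging this distance over effectors, BUSCOs, and a random gene set would give the kind of per-group comparison summarized in Fig. 3.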

Fig. 3

Comparison of the distance of well-conserved genes (BUSCOs), predicted effectors, and a randomized set of gene models from (A) the nearest repeat element or (B) the nearest large RIP affected region (LRAR) as predicted by RIPPER software

In addition to effector molecules, we also identified carbohydrate active enzymes (CAZymes) that may play a role in plant pathogenesis. Using the dbCAN3 meta server, we identified 614 potential CAZymes. These proteins include 298 glycoside hydrolases, 154 glycosyl transferases, and 41 carbohydrate esterases (Table 1, Table S3). Finally, we identified biosynthetic gene clusters with the fungal version of antiSMASH. Twenty-five biosynthetic gene clusters were predicted, including 8 polyketide synthase (PKS), 7 terpene synthesis, 9 nonribosomal peptide synthetase (NRPS) clusters, and 1 PKS/NRPS combination cluster (Table S5).

Genome and annotation statistics including genome size, repeat content, and different categories of protein coding genes (effectors, CAZymes, and biosynthetic gene clusters) were compared to related fungi (Table 2). Like other biotrophic fungi, A. anomala has a large genome with high repeat content (shown below). A large number of effectors (relative to total protein coding genes) and small numbers of biosynthetic gene clusters and CAZymes are other hallmarks shared between A. anomala and related biotrophic fungi.

Table 2 Genomic statistics of related fungi used for comparison in this study

A. anomala genome hosts a large population of transposable elements (TEs)

The A. anomala genome hosts a large population of TEs that accounts for approximately 88% of the final genome assembly (Table 3). Repeat content remained relatively constant at 88% across major scaffolds (Fig. 1). The TE population consists of 2,536 individual repeat families, making up over 300,000 individual interspersed elements (Table 3). The vast majority (90%) of repetitive sequences were Long Terminal Repeat (LTR) retrotransposons, mostly Copia-like elements, which alone account for over half of the genome assembly. Eight of the ten repeat families with the highest copy numbers (> 7,000 members each) were identified as Copia-like elements.

Table 3 Detailed breakdown of the repeat population in the A. anomala genome. Repeat elements are classified by the name assigned by RepeatClassifier

A. anomala exhibits a “two-speed” genome

The overall distribution of GC-content across major scaffolds remained relatively constant at approximately 34%. However, measurement of proportions of GC-distribution across the entire genome reveals two peaks, indicating a bimodal genome (Fig. 4). The first peak, at 32.9% GC indicates AT-rich regions. This peak accounts for 93% of the genome and 10% (933/9,179) of the protein coding genes. These AT-rich regions are gene poor, with an average gene density of 2.93 genes/Mb. The second peak, at 57.1% GC indicates GC equilibrated regions that account for 7% of the genome and 90% (8,246/9,179) of protein coding genes. These GC-equilibrated regions are over 100-fold more gene-dense with an average of 344 genes/Mb.
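The gene densities quoted above follow directly from the reported gene counts and genome fractions. As a quick arithmetic check, the Python sketch below recomputes them, assuming the rounded 93% and 7% fractions apply to the final 342,498,352 nt assembly; the variable names are illustrative only.

```python
assembly_mb = 342_498_352 / 1e6   # final assembly size reported in the text
genes_at, genes_gc = 933, 8246    # gene counts reported for each compartment
frac_at, frac_gc = 0.93, 0.07     # rounded genome fractions (assumed exact here)

# Genes per Mb in the AT-rich and GC-equilibrated compartments.
density_at = genes_at / (assembly_mb * frac_at)
density_gc = genes_gc / (assembly_mb * frac_gc)

print(round(density_at, 2))              # 2.93 genes/Mb
print(round(density_gc))                 # 344 genes/Mb
print(round(density_gc / density_at))    # 117-fold difference
```

Under these assumptions the ratio comes out at roughly 117, in line with the "over 100-fold" difference stated in the text.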

Fig. 4

Distribution of GC-content across the A. anomala genome. The genome was broken up into regions using Jensen-Shannon divergence, for which GC-content was calculated. Proportions of the genome were assigned to GC-content in 1 percent increments

We performed an enrichment analysis using Fisher’s exact test on the gene models within AT-rich genomic regions (Table S6, sheet 1). A number of GO terms are over-represented (p-value < 0.05) including beta-glucan/cellulose metabolism, peptidase/hydrolase activity, and ion transport (Table 4). Additionally, despite these regions encoding only 10% of protein coding genes, 30% (49/165) of predicted effector coding genes were found in these AT-rich hotspots.
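To make the logic of a Fisher's exact enrichment test concrete, the sketch below computes a one-sided p-value from the hypergeometric distribution using only the Python standard library. It is an illustration rather than the analysis pipeline used here: the function name `fisher_enrichment_p` is hypothetical, and the example applies the test to the effector counts quoted above instead of to GO terms.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k).

    N: genes in the background; K: background genes in the category;
    n: genes in the test set;   k: test-set genes in the category.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum hypergeometric probabilities for outcomes at least as extreme as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Counts from the text: 49 of 165 predicted effectors lie in AT-rich regions,
# which hold 933 of the 9,179 predicted genes.
p = fisher_enrichment_p(49, 165, 933, 9179)
print(p < 0.05)
```

If effectors were distributed like genes in general, about 17 of the 165 (165 × 933 / 9,179) would be expected in AT-rich regions, so the observed 49 falls far into the upper tail.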

Table 4 GO term enrichment analysis of genes within the AT-rich regions of the A. anomala genome

A. anomala exhibits a number of unique gene families

We performed an Orthofinder analysis to identify gene families shared with related fungal pathogens (Table S7). A super-gene phylogeny was constructed using 34 single-copy orthologous gene families and their corresponding protein sequences. Gene family counts were used to reconstruct ancestral gene family content and gain/loss of homologous gene families with Wagner parsimony and stochastic mapping (Fig. 5).

Fig. 5

Predicted pattern of gene family gain and loss in representative fungal genomes. Cladogram representation of Maximum Likelihood phylogeny of A. anomala and 15 related fungi based on 2,800 single copy orthologues. The total number of protein families in each species or node is estimated by Wagner parsimony and stochastic mapping. The numbers of the branches correspond to gene family gain (green) or loss (red) and inferred ancestral protein families (in oval). The numbers of gene families, unassigned genes, and total gene numbers are indicated for each species

There are 1,121 gene models that are not identified as orthologous to related fungal pathogens and are likely specific to A. anomala (Table S6, sheet 2). Of these unique genes, 83 are predicted to code for effectors, indicating that over half of the predicted effectors are unique to A. anomala. GO terms overrepresented include beta-glucan and cellulose metabolism (p-value < 0.05), suggesting a role in production of plant degrading compounds (Table S8). An additional 450 GO terms are underrepresented, mostly including processes involved in central metabolism and fungal growth and development.

The Orthofinder analysis and Wagner parsimony revealed 354 gene families gained and 721 lost in A. anomala since diverging from its last common ancestor with C. parasitica. Gene families that are expanded or gained in A. anomala account for an additional 32 putative effector genes, meaning that approximately 70% of putative effectors are in species-specific gene families or lineages of gene families that have expanded in A. anomala. GO terms overrepresented in gained/expanded gene families include catabolic processes and degradation of organic compounds (Table S9). The GO terms that are underrepresented include protein, organelle, and cellular biosynthetic processes.

Transposable elements show evidence of Repeat-induced point mutation (RIP)

The A. anomala genome encodes two genes that exhibit sequence homology and are orthologous to rid (RIP defective) in Neurospora crassa. The two genes are predicted to encode a C5-DNA methyltransferase and a modification methylase respectively. Both genes have been assigned GO terms for methyltransferase activity. A. anomala also encodes a homolog of dim-2, an additional methyltransferase identified in N. crassa to be involved in the RIP process (Figure S3).

Dinucleotide frequencies and RIP indices were calculated for a subset of up to 100 members for all identified repeat families (Fig. 6a). Compared to a control of non-repeat sequences, repeat sequences exhibit an over-abundance of TpA (6.7 × more frequent) and TpT (4.1 × more frequent) dinucleotides and under-abundance of GpC (3.9 × less frequent) and CpG (2.8 × less frequent) dinucleotides. RIP indices were also calculated for the same subsets of repeat families (Table 5). The mean TpA/ApT index for repetitive sequences is 1.14, while non-repeat sequences have an index of 0.48. The mean (CpA + TpG)/(ApC + GpT) index is 0.087 in repetitive sequences and 0.095 in non-repetitive sequences. There were no significant differences in dinucleotide frequencies or RIP indices between repeat classes.
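The two RIP indices are simple ratios of dinucleotide counts, so they can be sketched in a few lines. The Python example below is a minimal illustration, assuming overlapping dinucleotides counted on a single strand and the commonly cited cutoffs (TpA/ApT ≥ 0.89 and (CpA + TpG)/(ApC + GpT) ≤ 1.03); the function names and the toy sequence are hypothetical and this is not the software used in the study.

```python
from collections import Counter

def rip_indices(seq):
    """Return (TpA/ApT, (CpA + TpG)/(ApC + GpT)) for one DNA sequence."""
    seq = seq.upper()
    # Count overlapping dinucleotides on the given strand.
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    nan = float("nan")
    product = di["TA"] / di["AT"] if di["AT"] else nan
    denom = di["AC"] + di["GT"]
    substrate = (di["CA"] + di["TG"]) / denom if denom else nan
    return product, substrate

def passes_rip_thresholds(product, substrate):
    """Apply the cutoffs TpA/ApT >= 0.89 and substrate index <= 1.03."""
    return product >= 0.89 and substrate <= 1.03

# Hypothetical toy sequence, not taken from the A. anomala assembly.
print(rip_indices("TATAATAC"))  # (1.5, 0.0)
```

In practice the indices would be averaged over many members of a repeat family, as was done for the subsets of up to 100 members described above.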

Fig. 6

a Log(10) of average fold change in dinucleotide frequencies of all repeat families compared to non-repetitive control sequences. b Alignment based RIP analysis of repeat family with highest copy number (rnd-1_family-0). Each type of RIP mutation is represented by a different color, demonstrating the most dominant types of RIP within this repeat family

Table 5 Calculated RIP-indices for the five repeat families with the largest copy number. A TpA/ApT index ≥ 0.89 and (CpA + TpG)/(ApC + GpT) index ≤ 1.03 indicate RIP activity. The numbers presented are the mean values calculated for a subset of 100 repeat family members

An alignment-based RIP analysis of the repeat family with the highest copy number shows that A. anomala exhibits two dominant kinds of RIP (Fig. 6b). CpA→TpA and CpT→TpT mutations were dominant over other RIP-like mutations. The top 10 repeat families with the highest copy number were also analyzed with the alignment-based RIP analysis and demonstrate the same RIP mutational preference.
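The idea behind an alignment-based tally of RIP-like mutations can be sketched as follows. This Python example is a simplification under stated assumptions: it compares one aligned copy against a consensus, counts only C→T changes on the forward strand, and classifies each by the base 3' of the mutated cytosine. The function name and sequences are hypothetical, and dedicated tools also account for the complementary G→A changes.

```python
from collections import Counter

def rip_context_counts(consensus, copy):
    """Count C->T changes by the base 3' of the mutated C (forward strand only)."""
    if len(consensus) != len(copy):
        raise ValueError("sequences must be aligned to equal length")
    counts = Counter()
    for i in range(len(consensus) - 1):
        ref, alt, nxt = consensus[i], copy[i], consensus[i + 1]
        # Skip gap or ambiguous neighbours; only A, C, G, T define a context.
        if ref == "C" and alt == "T" and nxt in "ACGT":
            counts[f"Cp{nxt}->Tp{nxt}"] += 1
    return counts

# Hypothetical aligned pair for illustration only.
print(dict(rip_context_counts("CATCTGCG", "TATTTGCG")))
# {'CpA->TpA': 1, 'CpT->TpT': 1}
```

Summing such counts over all members of a repeat family gives the relative dominance of each mutation type, which is the kind of summary shown in Fig. 6b.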

A. anomala demonstrates genetic basis for homothallism

Homologs for both MAT1-1 and MAT1-2 idiomorphs have been identified in the A. anomala genome within the same 7 kilobase cluster (Table S6, sheet 3), consistent with evidence that the fungus is homothallic [20]. Homologs for the mat genes were identified through a BLASTp search of the NCBI nr database and verified by a pairwise sequence comparison to the corresponding genes in Cryphonectria parasitica [56]. Like the C. parasitica idiomorphs, three protein-coding genes are predicted to constitute MAT1-1 (MAT1-1-1, containing an alpha box motif; MAT1-1-2, a protein of unknown origin; and MAT1-1-3, containing an HMG motif), and a single protein-coding gene is predicted for MAT1-2 (MAT1-2-1, also containing an HMG motif). Within the A. anomala MAT locus, the gene encoding MAT1-2-1 was embedded between MAT1-1-1 and MAT1-1-2. Other genes usually associated with mating clusters in fungi, apn2 and sla2, were identified in close proximity to the other MAT protein coding genes. The entire MAT cluster is largely syntenic to that of Chrysoporthe cubensis, a closely related homothallic fungus. The MAT locus of A. anomala is more compact and contains no additional genes besides those directly involved in determining mating type (Fig. 7). RNAseq data indicate that all four of these MAT genes were expressed constitutively (Figure S4).

Fig. 7

Genomic region corresponding to mating-type locus in A. anomala. Gene models were identified as MAT homologs through BLASTp search of NCBI database and analyzed for synteny compared to Chrysoporthe cubensis, a homothallic fungus in the Cryphonectriaceae

Discussion

The final genome assembly of A. anomala OR1 is approximately 343 Mb. This assembly is thought to be relatively complete based on genome size estimation compared to flow cytometry data as well as identified BUSCOs. The A. anomala genome is very large by fungal standards, almost 10 times the ~ 37 Mb size of the genome of the average ascomycete (Table 2) [32,33,34,35, 37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55, 57, 58]. However, large genomes are not uncommon amongst obligate biotrophic pathogens. Powdery mildew fungi, which are ascomycetes, have genomes in excess of 100 Mb [11], and rust fungi, which are basidiomycetes often with complex life cycles, may have genomes approaching 1 Gb [59]. Both of these unrelated groups of fungi are subjected to the strong selective pressure imposed on biotrophic plant pathogens to maintain an intimate interaction with their host while avoiding recognition that initiates an immune response [60, 61]. The outcome of evolution driven by the pressure of a host/pathogen arms race is parallel adaptations resulting in remarkably similar genomes among biotrophic pathogens [7, 11, 32,33,34,35,36,37,38,39].

The expansion of the A. anomala genome is driven by the proliferation of TEs, rather than accumulation of protein coding genes. The TE population is made up primarily of LTR retrotransposons. Copia-like elements are by far the most abundant, in contrast to related fungi that are dominated by Gypsy-like repeats [48]. Despite the massive number of identified LTR retrotransposons, no single element has been determined to be intact with both 5’ and 3’ LTRs and the protein domains required for autonomous transposition, namely reverse transcriptase (RT), RNAse H (RH), and integrase (INT) [62]. One open question regarding the TEs in the A. anomala genome is how the invasion and uncontrolled replication of repetitive elements, largely of a single TE type, produced such extreme genome expansion.

Effector molecules play an important role in the colonization of biotrophic plant pathogens. Plants are able to recognize specific effector molecules through resistance (R) genes and activate a powerful hypersensitive response (HR) resulting in plant cell death which halts the spread of the invading pathogen [63]. If recognized, the pathogen is considered avirulent and the effector protein that triggers the HR response is characterized as an avirulence (avr) gene [64]. The pathogen responds by mutating or losing avr genes, so that they are no longer recognizable, or developing new effectors that avoid or suppress the effector-triggered immune response [65]. This relationship is the basis of the coevolutionary arms race between host plants and pathogens [66]. Most commercially available cultivars of C. avellana are protected by the R-gene “Gasaway”, named after the pollinizing cultivar that carried the dominant allele [67]. There is evidence that the Gasaway R-gene protects through an HR response [21]. However, Gasaway protected plants are overcome in regions of high pathogen pressure and diversity, suggesting that effectors and avr genes play a role in the breakdown of resistant cultivars [68,69,70].

For many putative effectors, there is no known function. As the goal is to be unrecognizable, there is no benefit in maintaining conserved effector genes. However, we know that effectors can play multiple roles in establishing and maintaining infection [71, 72]. The large arsenal of putative effectors encoded in the A. anomala genome allows for flexibility. This may also explain why we observed effectors in repeat-rich regions of the genome, where there are high rates of mutation and recombination [73]. The compartmentalization of effectors and genes involved in pathogenicity in repeat-rich regions fits the “two-speed” model of evolution [74].

CAZymes play an important role in both necrotrophic and biotrophic phytopathogenic infection. Necrotrophic fungi are known for having an arsenal of plant cell wall busting enzymes to launch an aggressive attack on their host. Necrotrophic ascomycetes code for between 600–800 CAZymes while A. anomala codes for 456 putative CAZymes, a typical number for biotrophic pathogens [75]. One of the most notable families of CAZymes encoded in the A. anomala genome is the glycoside hydrolase-18 (GH-18) family that includes all identified fungal chitinases [76]. It is predicted that both plant and fungal cell wall degrading enzymes are important for establishing biotrophic infection. Histological data of early infection of A. anomala on C. avellana shows a single germ hypha penetrating the plant cell wall, followed by the formation of intracellular vesicles [21]. CAZymes are required for initial penetration of the plant cell wall as well as the reformation of the fungal cell wall at the fungal/host interface [77]. Reasonable future steps would include investigating the expression of CAZymes during early infection to elucidate what genes are required to establish infection.

The Wagner parsimony analysis on Orthogroups of related fungal pathogens revealed the loss of 876 and gain of 285 gene families in A. anomala since it diverged from C. parasitica. Gene reduction in obligate parasites is a common trend, usually due to the loss of specific metabolic pathways as the parasite derives required compounds from their host [78]. KEGG pathway reconstruction revealed a number of missing or incomplete pathways for the biosynthesis of several amino acids including lysine, tryptophan, and asparagine [79]. It should be noted that the culture medium used for A. anomala contains yeast extract as well as additional asparagine to encourage growth [27]. Genes involved in pathways involved in energy generation (NADH dehydrogenase, nitrate reduction/assimilation) are missing as well. A. anomala exhibits parallel evolution to unrelated obligate biotrophic fungal pathogens that have independently lost similar biosynthetic and metabolic pathways [60, 80, 81].

The exception to the trend of gene loss is genes or gene families that encode effectors. Gene families that are unique or expanded in A. anomala contain 70% of predicted effectors. Biotrophs use effectors to maintain an intimate signaling relationship during infection [82, 83]. The need for a large and diverse effector arsenal drives the evolution of effector diversification and expansion [84, 85] as we observed with A. anomala. GO terms overrepresented in unique or expanding families include transmembrane transporters that are involved in the secretion of secondary metabolites that participate in pathogenesis. Amylases, peptidases, and catabolic activity GO terms are overrepresented in expanded families, likely aiding in adaptation to the obligate biotrophic lifestyle [10].

Despite TEs accounting for 88% of the final genome assembly, very few of these elements contain intact protein domains required for autonomous transposition. Sequences from all identified repeat families show evidence of RIP mutation. RIP is a defense mechanism that protects fungal genomes from TEs expanding unchecked [86, 87]. RIP functions by recognizing stretches over 400 bp of DNA with high (> 80%) sequence identity. The DNMT1 homologue RID (RIP defective) methylates cytosine residues, which then undergo spontaneous deamination into thymine. This induces C→T and G→A transitions in both copies of duplicated sequences, resulting in permanent mutational changes in the DNA sequence [88, 89].

Fungi that demonstrate evidence of RIP vary in the degree or effectiveness by which RIP acts on the genome. Neurospora crassa, in which RIP was first described [90], has a very efficient RIP system, to the point where N. crassa has almost no duplicated sequences, TEs, or duplicated genes [91]. In other fungi, RIP is often demonstrable, but less effective [92]. In the related fungus C. parasitica, roughly 14% of the 43.9 Mb genome represented TEs, and there was some limited evidence for RIP [48, 92]. The A. anomala genome exhibits indications of RIP activity. The RIP indices calculated for repeat families meet the accepted thresholds for RIP activity (TpA/ApT ≥ 0.89 and (CpA + TpG)/(ApC + GpT) ≤ 1.03) [93, 94] and the dinucleotide frequencies demonstrate a depletion of pre-RIP dinucleotides and an enrichment of post-RIP dinucleotides in repeat regions compared to non-repeat regions [92]. In spite of evidence of a functional RIP pathway, transposons have managed to overtake the A. anomala genome. The massive expansion of the TE population in the A. anomala genome underscores the observation that mere presence of an apparently functional RIP system is no guarantee that TEs will be held in check.

In addition to defending against the uncontrolled replication of TEs in a genome, RIP is a major driver of genome evolution. RIP induces mutations in duplicated sequences, but those mutations often bleed into neighboring regions, so-called “leaky RIP” [86, 95, 96]. Furthermore, the C→T and G→A mutations have a major impact on the GC content of a genome. The GC content of the A. anomala genome is relatively low, at 34%, but it is not equally distributed across the genome. The GC-content distribution reveals two peaks: 93% of the genome has a GC content of about 32%, while the remaining 7% is GC-rich (about 57% GC). The GC-poor stretches of DNA are broken up by GC-rich blocks that are gene-rich and TE-poor. These data demonstrate that A. anomala fits the “two-speed” genome model [96, 97, 98].

Analysis of the mating-type locus revealed that A. anomala has the genes for both MAT1-1 and MAT1-2 idiomorphs, providing molecular support for earlier evidence of homothallism [20]. Mating type systems, processes, and their associated genes are extraordinarily complicated in fungi, and many genes other than the MAT genes themselves may have different roles in the reproductive process [99]. In addition to controlling sexual development, MAT genes may be important in growth and virulence, including regulation of secondary metabolites and hyphal morphology.

Homothallism, such as that in A. anomala, is thought to be an evolutionary destination from which there is no likely return to a progenitor heterothallic state, an idea supported by research showing that Neurospora shifted multiple times from a heterothallic to a homothallic lifestyle, but never the reverse [100]. Chrysoporthe, which is closely related to Anisogramma, has MAT locus features similar to Neurospora, including pronounced influence of retrotransposons, but there was some evidence to suggest that the MAT1-2 and MAT1-1 idiomorphs of the heterothallic C. austroafricana evolved from a homothallic progenitor [101]. In both Neurospora and Chrysoporthe, the evolutionary transition of mating type is facilitated by TEs within the MAT locus. Like the rest of the A. anomala genome, the MAT locus is flanked by TEs, but the core genes for each idiomorph are found within the same 7 kb block with no TEs or additional genes. The mating cluster of C. cubensis includes additional genes not found in A. anomala as well as a 200 kb insertion of DNA that contains over 60 genes not related to determination of mating type (Fig. 7). One of the roles of sex in fungi and other organisms is to bring genetic variation to the species. It seems that a combination of homothallic sex and rampant genome invasion and expansion by transposons brings sufficient variability to A. anomala.

Conclusions

At nearly 350 Mb, the A. anomala genome represents the largest ascomycete genome yet characterized. Gene number and putative functions are typical of fungal plant pathogens, but runaway amplification of repeat sequences has led to a massively bloated genome, despite hallmarks of functional genome surveillance by RIP. The A. anomala genome characterization will serve as a resource for others investigating this economically important plant pathogen, and for those interested in fungal genome evolution.

Methods

Fungal strain

A. anomala is an obligate biotroph that has not been grown in continuous culture, so tissue is scarce and not clonal. Based on knowledge of EFB epidemiology [102, 103], the A. anomala population in Oregon is believed to descend from a single introduction event from east of the Rocky Mountains and to belong to a single lineage. However, mycelium from different trees in fields is not clonal, and DNA or RNA extracted from a collection of germinated ascospores is also not clonal. The closest approximation we have to homogeneous tissue is to harvest ascospores from a single canker on a single tree, with the understanding that it most likely represents the result of a single infection. We collected ascospores from individual cankers on infected branches harvested from hazelnut plants growing at the Oregon State University Smith Horticultural Research Farm, Corvallis, OR. These plants had been inoculated 18 months prior in the greenhouse using local diseased plant material as the inoculum source. We designate the strain presented here Oregon1 (OR1).

Ascospores were extracted following the protocol we used previously [28]. Briefly, the branches were cut into pieces 5–7 cm in length and surface-sterilized for 3 min in 10% bleach (0.525% sodium hypochlorite) followed by 1 min in 70% ethanol. After rinsing with sterile H2O, the stromata were hydrated in sterile H2O for 30 min and air-dried. The top of a canker was cut off with a sterile razor blade to expose the necks of the perithecia, and another sterile razor blade was inserted under the perithecia to provide pressure from below and push ascospores out of the perithecial necks. The spores from individual cankers were suspended in sterile H2O containing 10 ppm rifampicin and 100 ppm streptomycin and quantified with a hemocytometer. We found one canker that produced approximately 5.5 million ascospores, and these spores were used in this study unless noted otherwise.

To generate primary mycelium, a portion of the ascospores was adjusted to 1 × 10⁵ spores per ml and used to inoculate plates of culture medium overlaid with cellophane. The rest of the ascospores were stored at -80 °C. Half a milliliter of the spore suspension was spread on the cellophane surface in individual 9-cm diameter petri dishes. The medium contained (per liter) 2.7 g modified Murashige and Skoog basal salt mixture; 20 g sucrose; 2 g yeast extract; 2 g L-asparagine; 15 g Bacto agar; 0.25 g activated charcoal; and 10 mg rifampicin [27]. The cultures were grown at 18 °C in the dark for 8 weeks, by which time many spores had germinated and grown into opaque, whitish colonies approximately 0.25–0.5 mm in diameter. Mycelium was harvested by rinsing the cellophane with sterile H2O. A subset of plates was kept for four more weeks. By then the small colonies were turning grey and black, and the senescent mycelium was harvested as described above.

Nucleic acid extraction, genome sequencing and assembly

Mycelium from 8-week-old cultures was used for DNA extraction with the Gentra Puregene kit (Qiagen) following the fungi protocol. One paired-end DNA library with an insert size of approximately 350 bp (excluding adapters) was constructed using the TruSeq DNA Sample Prep kit (Illumina). Three mate-pair DNA libraries with insert sizes of approximately 3 kb, 6 kb, and 10 kb, respectively, were constructed using the Nextera Mate Pair Library Prep kit (Illumina) following the manufacturer’s instructions. All libraries were sequenced on the Illumina MiSeq platform.

The paired-end reads were trimmed with Trimmomatic v0.32 [104] in paired-end mode to remove adapter sequences, and reads shorter than 100 bp after trimming were dropped. The mate-pair reads were first trimmed with Trimmomatic in paired-end mode to remove external adapters, then trimmed with Trimmomatic in single-end mode to remove internal adapters at ligation junctions. Reads shorter than 35 bp after trimming were dropped. The resulting reads were processed with a custom Perl script, and only read pairs meeting the following conditions were retained for genome assembly: 1) both reads must have survived adapter trimming; 2) for read pairs in which external adapters were found, the junction adapter must be found in both reads; 3) for read pairs in which external adapters were not found, the junction adapter must be found in at least one read. After data processing, the sequence reads were assembled using AllPaths-LG release 52155 with default settings [105]. Assembled scaffolds were subjected to a BLASTn search of the GenBank database release 258 [106]. Any scaffolds for which the top hit was not fungal were removed as contamination.
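The three retention conditions for mate-pair reads can be expressed as a short decision function. The Python sketch below is only an illustration of that logic, not the custom Perl script used in the study. The `MatePair` record and its field names are hypothetical, and it assumes that adapter detection has already been summarized as booleans for each read pair.

```python
from dataclasses import dataclass


@dataclass
class MatePair:
    """Hypothetical per-pair summary of the two trimming passes."""
    r1_survived: bool             # read 1 still present after trimming
    r2_survived: bool             # read 2 still present after trimming
    external_adapter_found: bool  # external adapter seen in the pair
    junction_in_r1: bool          # junction adapter seen in read 1
    junction_in_r2: bool          # junction adapter seen in read 2


def keep_pair(pair):
    """Return True if the pair satisfies the three stated conditions."""
    # Condition 1: both reads must have survived adapter trimming.
    if not (pair.r1_survived and pair.r2_survived):
        return False
    # Condition 2: external adapter found, so the junction adapter
    # must be present in both reads.
    if pair.external_adapter_found:
        return pair.junction_in_r1 and pair.junction_in_r2
    # Condition 3: no external adapter, so the junction adapter
    # must be present in at least one read.
    return pair.junction_in_r1 or pair.junction_in_r2


def filter_pairs(pairs):
    """Keep only the pairs that pass keep_pair, preserving order."""
    return [p for p in pairs if keep_pair(p)]
```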

Flow cytometry

One hundred micrograms of freshly harvested 8-week-old mycelium were cut into fine pieces with a sterile razor blade in 500 μl LB01 buffer on ice to release the nuclei [107]. The mixture was passed through a 40 μm filter and washed with 200 μl LB01 buffer. Nuclei from 50 mg of young radish leaf, which has a 2C genome size of 1.1 Gb, were released the same way and used as a control. Nuclei solutions were treated with RNase A, stained with propidium iodide at room temperature for 20 min in darkness, and run through a Beckman Cytoflex flow cytometer. The experiment was repeated three times.
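Flow cytometry estimates of this kind are conventionally obtained by scaling the known DNA content of the reference standard by the ratio of the sample and standard fluorescence peaks. The sketch below assumes that standard ratio method, which the text does not spell out. The function names are hypothetical, the default of 1100 Mb corresponds to the radish 2C value given above, and the peak values in the usage note are invented for illustration and are not measurements from this study.

```python
def estimate_dna_content_mb(sample_peak, standard_peak, standard_dna_mb=1100.0):
    """Estimate DNA content per nucleus (Mb) from fluorescence peaks.

    sample_peak, standard_peak: mean fluorescence of the sample and
    reference-standard peaks, in the same arbitrary units.
    standard_dna_mb: DNA content of the standard peak (assumed here to
    be the radish 2C value of 1.1 Gb).
    """
    if sample_peak <= 0 or standard_peak <= 0:
        raise ValueError("fluorescence peaks must be positive")
    return sample_peak / standard_peak * standard_dna_mb


def mean_estimate_mb(replicates, standard_dna_mb=1100.0):
    """Average the estimate over (sample_peak, standard_peak) replicates."""
    if not replicates:
        raise ValueError("at least one replicate is required")
    values = [estimate_dna_content_mb(s, r, standard_dna_mb) for s, r in replicates]
    return sum(values) / len(values)
```

With made-up peaks of 2.0 and 8.0 arbitrary units, `estimate_dna_content_mb(2.0, 8.0)` gives 275.0 Mb.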

Repeat identification and masking

The assembled genome was soft-masked prior to gene prediction [108]. A comprehensive, non-redundant repeat library was created by integrating output from RepeatModeler [109, 110], TransposonPSI [111], and LTRharvest [112]. RepeatModeler v1.0.11 and TransposonPSI were run using default parameters to generate the first two repeat libraries. The third repeat library was built using LTRharvest. False positives were removed from the LTRharvest library by running LTRdigest with protein HMMs from Pfam [113] and GyDB [114] databases. LTR retrotransposons without domain hits were removed from the LTRharvest repeat library.

Each of the three repeat libraries was classified using RepeatClassifier, part of the RepeatModeler program suite, with Repbase version 23.08 [115], for consistency in identification and naming of repeat elements. The three repeat libraries were then merged and clustered with CD-HIT [116] at ≥ 80% identity to create a non-redundant library [117]. This custom library was used to soft-mask the A. anomala genome using RepeatMasker with the “xsmall” argument and default parameters [118].

Transcriptome sequencing, gene prediction and annotation

Ascospores, 8-week-old mycelium and 12-week-old senescent mycelium were used for RNA extraction with the Plant RNeasy kit (Qiagen) following the manufacturer’s instructions. Three mRNA libraries, one for each sample, were prepared using the TruSeq RNA Preparation kit (Illumina) following the manufacturer’s instructions. The libraries were sequenced on the Illumina MiSeq platform.

Gene models were predicted using the BRAKER2 annotation pipeline [119], incorporating GeneMark-ET [120, 121] and Augustus [122] for ab initio and evidence-based gene prediction. RNA-Seq reads were mapped to the genome assembly using STAR [123]. The RNA-Seq mapping results were used as evidence for gene prediction in the BRAKER2 pipeline, using the “fungus” argument for fungal gene prediction. Genome completeness was assessed through a BUSCO analysis of benchmarking eukaryotic and fungal single-copy orthologs [124].

Blast2GO v5.2.5 [125] was used to perform a BLASTp search of the NCBI nr database with an E-value cutoff of 1e-3. InterProScan v5.53 [126] results were imported into Blast2GO and merged with GO annotations. KEGG annotation terms [79, 127] were assigned using a combination of BlastKOALA v2.2 [128] and the KEGG Automatic Annotation Server (KAAS) [129] searched against eukaryote and prokaryote KEGG GENES databases (release v89.1), with the single-directional best hit method.

Secreted proteins were predicted using SignalP 5.0 [130] to identify signal peptide sequences. Predicted secreted proteins were then analyzed with EffectorP 2.0 [131, 132] to predict genes encoding potential effectors. Evidence including protein size and cysteine content was used for effector prediction. Potential function of effectors was evaluated by a BLASTp search [133] of the GenBank nr database (release 239) [106] with an E-value cutoff of 0.001. Functional domains were assigned using the CD-Search webserver with default settings against the Conserved Domain Database v3.20 [134, 135]. Carbohydrate active enzymes were predicted using the dbCAN3 meta server [136, 137, 138], which integrates HMMER [139], DIAMOND [140], and Hotpep [141] searches of the CAZy database [142]. Biosynthetic gene clusters were predicted and identified using antiSMASH v5.1.2 [143].

GC-content distribution

Analysis of GC-content was performed by segmenting genomic sequences into regions of differing GC-content using the Jensen-Shannon divergence at each sequence position, calculated using OcculterCut v1.1 [144]. Gene models associated with AT-rich genomic regions were used as a test set in a GO term enrichment analysis using a two-tailed Fisher’s Exact Test with a filter value of 0.05 in Blast2GO v5.2.5 [125].
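To make the segmentation idea concrete, the sketch below computes a length-weighted Jensen-Shannon divergence between the base compositions on either side of a candidate position and picks the position where it is largest. This is a generic, simplified illustration and does not reproduce OcculterCut's implementation. The function names and the `min_len` parameter are hypothetical, and the exhaustive scan is quadratic in sequence length, so it is suitable only for short examples.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (bits) of a symbol-count mapping."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two segments."""
    n_left, n_right = len(left), len(right)
    n = n_left + n_right
    c_left, c_right = Counter(left), Counter(right)
    pooled = c_left + c_right
    return (shannon_entropy(pooled)
            - (n_left / n) * shannon_entropy(c_left)
            - (n_right / n) * shannon_entropy(c_right))


def best_split(seq, min_len=10):
    """Return (position, divergence) of the split maximizing the divergence.

    Each side of the split must be at least min_len bases long.
    Returns (None, -1.0) if the sequence is too short to split.
    """
    best_pos, best_div = None, -1.0
    for pos in range(min_len, len(seq) - min_len + 1):
        div = js_divergence(seq[:pos], seq[pos:])
        if div > best_div:
            best_pos, best_div = pos, div
    return best_pos, best_div
```

For a toy sequence made of an AT-only block followed by a GC-only block of equal length, such as `"AT" * 20 + "GC" * 20`, the maximum falls at the boundary (position 40) with a divergence of 1 bit.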

Fungal super-gene phylogeny

We collected proteomes from 24 related ascomycete species to identify orthologous gene families. OrthoFinder v2.2.6 [145] was used under default settings to build orthogroups. Thirty-four single-copy orthologous gene families and their corresponding protein sequences were retrieved and aligned with MUSCLE v3.8.31 [146] and alignments were trimmed with TrimAl v1.4 [147] using the automated feature to select the best method. The trimmed alignments were automatically concatenated and partitioned using IQ-TREE v1.7-beta17 [148, 149]. The maximum likelihood tree was reconstructed with IQ-TREE under the LG + I + G model as selected using ModelFinder [150].
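The concatenation step produces a single supermatrix in which each gene alignment occupies its own coordinate range (partition). IQ-TREE performs this internally. The sketch below only illustrates the idea in plain Python, with hypothetical function and variable names, and pads any taxon missing from a gene with gap characters (which should not arise with strictly single-copy orthologs present in all species).

```python
def concatenate_alignments(alignments):
    """Concatenate per-gene alignments into a supermatrix.

    alignments: list of (gene_name, {taxon: aligned_sequence}) tuples.
    Returns (supermatrix, partitions), where supermatrix maps each taxon
    to its concatenated sequence and partitions is a list of
    (gene_name, start, end) with 1-based inclusive coordinates.
    """
    taxa = sorted({taxon for _, aln in alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa absent from this gene with gaps.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions
```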

Gene family counts from the OrthoFinder analysis were used to reconstruct ancestral gene family content and gain/loss of homologous gene families. These traits were reconstructed using Wagner parsimony in the Count software package [151] as well as stochastic mapping with GLOOME [152].

RIP analysis

RIP indices of individual repeat copies were calculated in RStudio v1.1.414 [153] using a custom R (v4.1.2) script (file S1) and the Biostrings package v2.62.0 [154]. RIPCAL v2 [155] was used for alignment-based analysis of repeat families and calculations of mutation frequencies. Large RIP-affected regions (LRARs) were identified as a minimum of seven consecutive sliding windows (window size = 1000 bp, slide size = 500 bp) with a minimum RIP product value of 1.1, a maximum RIP substrate value of 0.75, and a minimum composite (product – substrate) value of 0.01. RIP product, substrate, and composite values were calculated, and LRAR analysis was performed, using The RIPper [156].
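The LRAR criteria can be sketched as a sliding-window scan. The Python example below is a simplified illustration rather than The RIPper's implementation. It assumes the conventional index definitions (product = TpA/ApT, substrate = (CpA + TpG)/(ApC + GpT), composite = product − substrate), treats the stated cutoffs as inclusive, and skips windows in which an index is undefined. Function names are hypothetical, and coordinates are 0-based and half-open.

```python
def dinuc_count(seq, pair):
    """Count overlapping occurrences of a dinucleotide."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)


def window_is_rip_affected(win, min_product=1.1, max_substrate=0.75,
                           min_composite=0.01):
    """Apply the three per-window cutoffs described in the text."""
    at = dinuc_count(win, "AT")
    denom = dinuc_count(win, "AC") + dinuc_count(win, "GT")
    if at == 0 or denom == 0:
        return False  # index undefined; treated as not RIP-affected
    product = dinuc_count(win, "TA") / at
    substrate = (dinuc_count(win, "CA") + dinuc_count(win, "TG")) / denom
    composite = product - substrate
    return (product >= min_product and substrate <= max_substrate
            and composite >= min_composite)


def find_lrars(seq, window=1000, step=500, min_windows=7):
    """Return (start, end) spans covered by runs of >= min_windows
    consecutive RIP-affected windows."""
    seq = seq.upper()
    regions = []
    run_start, run_len, last_start = None, 0, None
    for start in range(0, len(seq) - window + 1, step):
        if window_is_rip_affected(seq[start:start + window]):
            if run_len == 0:
                run_start = start
            run_len += 1
            last_start = start
        else:
            if run_len >= min_windows:
                regions.append((run_start, last_start + window))
            run_len = 0
    if run_len >= min_windows:
        regions.append((run_start, last_start + window))
    return regions
```

A run that ends at the last window is still reported, which is why the length check is repeated after the loop.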

Availability of data and materials

The Anisogramma anomala OR1 genome sequence and assembly have been submitted to NCBI SRA and have been given the BioProject reference number PRJNA966177. Protein and coding sequence FASTA files can be found on FigShare under identifiers 24905166.v1 and 24898656.v1, respectively.

Abbreviations

BUSCO:

Benchmarking universal single-copy orthologs

CAZymes:

Carbohydrate-active enzymes

EFB:

Eastern Filbert Blight

GO:

Gene ontology

KEGG:

Kyoto encyclopedia of genes and genomes

LRAR:

Large RIP-affected region

RIP:

Repeat-induced point mutation

TEs:

Transposable elements

References

  1. Giovannetti M, Avio L, Sbrana C. Fungal spore germination and pre-symbiotic mycelial growth–physiological and genetic aspects. In: Koltai H, Kapulnik Y, editors. Arbuscular mycorrhizas: physiology and function. Dordrecht: Springer; 2010. p. 3–32.

  2. Chanclud E, Morel JB. Plant hormones: a fungal point of view. Mol Plant Pathol. 2016;17(8):1289–97.

  3. Lorrain C, Goncalves Dos Santos KC, Germain H, Hecker A, Duplessis S. Advances in understanding obligate biotrophy in rust fungi. New Phytol. 2019;222(3):1190–206.

  4. Glawe DA. The powdery mildews: a review of the world’s most familiar (yet poorly known) plant pathogens. Annu Rev Phytopathol. 2008;46:27–51.

  5. Bélanger RR, Bushnell WR, Dik AJ, Carver TL. The powdery mildews: a comprehensive treatise. St. Paul: American Phytopathological Society (APS Press); 2002.

  6. Spanu P, Kamper J. Genomics of biotrophy in fungi and oomycetes–emerging patterns. Curr Opin Plant Biol. 2010;13(4):409–14.

  7. Spanu PD. The genomics of obligate (and nonobligate) biotrophs. Annu Rev Phytopathol. 2012;50:91–109.

  8. Kemen E, Jones JD. Obligate biotroph parasitism: can we link genomes to lifestyles? Trends Plant Sci. 2012;17(8):448–57.

  9. Tang C, Xu Q, Zhao M, Wang X, Kang Z. Understanding the lifestyles and pathogenicity mechanisms of obligate biotrophic fungi in wheat: The emerging genomics era. The Crop Journal. 2018;6(1):60–7.

  10. Liang P, Liu S, Xu F, Jiang S, Yan J, He Q, et al. Powdery mildews are characterized by contracted carbohydrate metabolism and diverse effectors to adapt to obligate biotrophic lifestyle. Front Microbiol. 2018;9:3160.

  11. Spanu PD, Abbott JC, Amselem J, Burgis TA, Soanes DM, Stüber K, et al. Genome expansion and gene loss in powdery mildew fungi reveal tradeoffs in extreme parasitism. Science. 2010;330(6010):1543–6.

  12. Grandaubert J, Lowe RG, Soyer JL, Schoch CL, Van de Wouw AP, Fudal I, et al. Transposable element-assisted evolution and adaptation to host plant within the Leptosphaeria maculans-Leptosphaeria biglobosa species complex of fungal pathogens. BMC Genomics. 2014;15(1):1–27.

  13. Oliver KR, Greene WK. Transposable elements: powerful facilitators of evolution. BioEssays. 2009;31(7):703–14.

  14. Duplessis S, Bakkeren G, Hamelin R. Advancing knowledge on biology of rust fungi through genomics. Adv Bot Res. 2014;70:173–209.

  15. Bindschedler LV, Panstruga R, Spanu PD. Mildew-omics: how global analyses aid the understanding of life and evolution of powdery mildews. Front Plant Sci. 2016;7:123.

  16. Fuller A. The filbert or hazelnut. The Nut Culturist, Orange Judd Company, NY. 1908:118–46.

  17. Weschcke C. Hazels and filberts. Growing nuts in the north Webb, St Paul, MN. 1954:24–38.

  18. Farr DF, Bills GF, Chamuris GP, Rossman AY. Fungi on plants and plant products in the United States. St. Paul: APS Press; 1989.

  19. Pinkerton J, Johnson K, Mehlenbacher S, Pscheidt J. Susceptibility of European hazelnut clones to eastern filbert blight. Plant Dis. 1993;77(3):261–6.

  20. Gottwald T, Cameron H. Studies in the morphology and life history of Anisogramma anomala. Mycologia. 1979;71(6):1107–26.

  21. Pinkerton J, Stone J, Nelson S, Johnson K. Infection of European hazelnut by Anisogramma anomala: Ascospore adhesion, mode of penetration of immature shoots, and host response. Phytopathology. 1995;85(10):1260–8.

  22. Thompson M, HB L, SA M. Hazelnuts. Fruits Breeding (Edited by Jules Janick and James N. Moore). Volume III Chapter 3. 1996;184:125.

  23. Pinkerton J, Johnson K, Theiling K, Griesbach J. Distribution and characteristics of the eastern filbert blight epidemic in western Oregon. Plant Dis. 1992;76(11):1179–82.

  24. Davison A, Davidson R. Apioporthe and Monochaetia cankers reported in western Washington. Plant Disease Reporter. 1973.

  25. Julian J, Seavert C, Olsen J, editors. An economic evaluation of the impact of Eastern Filbert Blight resistant hazelnut cultivars in Oregon, USA. VII International Congress on Hazelnut. 2008;845.

  26. Snelling J, Mehlenbacher S, Heilsnis B, Mooneyham R, editors. Breeding hazelnuts resistant to eastern filbert blight. XXXI International Horticultural Congress (IHC2022): International Symposium on Breeding and Effective Use of Biotechnology and 1362;2022.

  27. Stone JK, Pinkerton J, Johnson K. Axenic culture of Anisogramma anomala: Evidence for self-inhibition of ascospore germination and colony growth. Mycologia. 1994;86(5):674–83.

  28. Cai G, Leadbetter CW, Muehlbauer MF, Molnar TJ, Hillman BI. Genome-wide microsatellite identification in the fungus Anisogramma anomala using Illumina sequencing and genome assembly. PLoS ONE. 2013;8(11): e82408.

  29. Gottwald TR, Cameron HR. Infection site, infection period, and latent period of canker caused by Anisogramma anomala in European filbert. Phytopathology. 1980;70(11):1083–7.

  30. Pinkerton J, Johnson K, Stone J, Ivors K. Maturation and seasonal discharge pattern of ascospores of Anisogramma anomala. Phytopathology. 1998;88(11):1165–73.

  31. Gottwald TR. Infection site, infection period, and latent period of canker caused by Anisogramma anomala in European filbert. Phytopathology. 1980;70(11):1083–7.

  32. Wicker T, Oberhaensli S, Parlange F, Buchmann JP, Shatalina M, Roffler S, et al. The wheat powdery mildew genome shows the unique evolution of an obligate biotroph. Nat Genet. 2013;45(9):1092–6.

  33. Wu Y, Ma X, Pan Z, Kale SD, Song Y, King H, et al. Comparative genome analyses reveal sequence features reflecting distinct modes of host-adaptation between dicot and monocot powdery mildew. BMC Genomics. 2018;19:1–20.

  34. Wadl PA, Mack BM, Beltz SB, Moore GG, Baird RE, Rinehart TA, et al. Development of genomic resources for the powdery mildew, Erysiphe pulchra. Plant Dis. 2019;103(5):804–7.

  35. Micali C, Göllner K, Humphry M, Consonni C, Panstruga R. The powdery mildew disease of Arabidopsis: a paradigm for the interaction between plants and biotrophic fungi. Arabidopsis Book. 2008;6:e0115.

  36. Duplessis S, Cuomo CA, Lin YC, Aerts A, Tisserant E, Veneault-Fourrey C, et al. Obligate biotrophy features unraveled by the genomic analysis of rust fungi. Proc Natl Acad Sci U S A. 2011;108(22):9166–71.

  37. Frantzeskakis L, Németh MZ, Barsoum M, Kusch S, Kiss L, Takamatsu S, et al. The Parauncinula polyspora draft genome provides insights into patterns of gene erosion and genome expansion in powdery mildew fungi. MBio. 2019;10(5): 01692–19. https://doi.org/10.1128/mbio.

  38. Kamper J, Kahmann R, Bolker M, Ma LJ, Brefort T, Saville BJ, et al. Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature. 2006;444(7115):97–101.

  39. Cissé OH, Almeida JM, Fonseca Á, Kumar AA, Salojärvi J, Overmyer K, et al. Genome sequencing of the plant pathogen Taphrina deformans, the causal agent of peach leaf curl. MBio. 2013;4(3):00055–13. https://doi.org/10.1128/mbio.

  40. Cuomo CA, Gueldener U, Xu JR, Trail F, Turgeon BG, Di Pietro A, et al. The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization. Science. 2007;317(5843):1400–2.

  41. King R, Urban M, Hammond-Kosack MC, Hassani-Pak K, Hammond-Kosack KE. The completed genome sequence of the pathogenic ascomycete fungus Fusarium graminearum. BMC Genomics. 2015;16(1):544.

  42. O’Connell RJ, Thon MR, Hacquard S, Amyotte SG, Kleemann J, Torres MF, et al. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses. Nat Genet. 2012;44(9):1060–5.

  43. Gómez Luciano LB, Tsai IJ, Chuma I, Tosa Y, Chen Y-H, Li J-Y, et al. Blast fungal genomes show frequent chromosomal changes, gene gains and losses, and effector gene turnover. Mol Biol Evol. 2019;36(6):1148–61.

  44. Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ, Thon M, Kulkarni R, Xu JR, Pan H, Read ND. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature. 2005;434(7036):980–6.

  45. Klosterman SJ, Subbarao KV, Kang S, Veronese P, Gold SE, Thomma BP, et al. Comparative genomics yields insights into niche adaptation of plant vascular wilt pathogens. PLoS Pathog. 2011;7(7): e1002137.

  46. Van Kan JA, Stassen JH, Mosbach A, Van Der Lee TA, Faino L, Farmer AD, et al. A gapless genome sequence of the fungus Botrytis cinerea. Mol Plant Pathol. 2017;18(1):75–89.

  47. Amselem J, Cuomo CA, van Kan JA, Viaud M, Benito EP, Couloux A, et al. Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genet. 2011;7(8): e1002230.

  48. Crouch JA, Dawe A, Aerts A, Barry K, Churchill AC, Grimwood J, et al. Genome sequence of the chestnut blight fungus Cryphonectria parasitica EP155: a fundamental resource for an archetypical invasive plant pathogen. Phytopathology. 2020;110(6):1180–8.

  49. Baroncelli R, Scala F, Vergara M, Thon MR, Ruocco M. Draft whole-genome sequence of the Diaporthe helianthi 7/96 strain, causal agent of sunflower stem canker. Genomics Data. 2016;10:151–2.

  50. Derbyshire M, Denton-Giles M, Hegedus D, Seifbarghy S, Rollins J, van Kan J, et al. The complete genome sequence of the phytopathogenic fungus Sclerotinia sclerotiorum reveals insights into the genome architecture of broad host range pathogens. Genome Biol Evol. 2017;9(3):593–618.

  51. Yin Z, Liu H, Li Z, Ke X, Dou D, Gao X, et al. Genome sequence of Valsa canker pathogens uncovers a potential adaptation of colonization of woody bark. New Phytol. 2015;208(4):1202–16.

  52. Coleman JJ, Rounsley SD, Rodriguez-Carres M, Kuo A, Wasmann CC, Grimwood J, et al. The genome of Nectria haematococca: contribution of supernumerary chromosomes to gene expansion. PLoS Genet. 2009;5(8): e1000618.

  53. Semeiks J, Borek D, Otwinowski Z, Grishin NV. Comparative genome sequencing reveals chemotype-specific gene clusters in the toxigenic black mold Stachybotrys. BMC Genomics. 2014;15(1):1–16.

  54. Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, et al. The genome sequence of the filamentous fungus Neurospora crassa. Nature. 2003;422(6934):859–68.

  55. Cuomo CA, Untereiner WA, Ma L-J, Grabherr M, Birren BW. Draft genome sequence of the cellulolytic fungus Chaetomium globosum. Genome Announc. 2015;3(1):e00021-e115.

  56. McGuire IC, Marra RE, Turgeon BG, Milgroom MG. Analysis of mating-type genes in the chestnut blight fungus, Cryphonectria parasitica. Fungal Genet Biol. 2001;34(2):131–44.

  57. Mohanta TK, Bae H. The diversity of fungal genome. Biol Proced Online. 2015;17:8.

  58. Espagne E, Lespinet O, Malagnac F, Da Silva C, Jaillon O, Porcel BM, et al. The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biol. 2008;9(5):R77.

  59. Tavares S, Ramos AP, Pires AS, Azinheira HG, Caldeirinha P, Link T, et al. Genome size analyses of Pucciniales reveal the largest fungal genomes. Front Plant Sci. 2014;5:422.

  60. Kemen AC, Agler MT, Kemen E. Host–microbe and microbe–microbe interactions in the evolution of obligate plant parasitism. New Phytol. 2015;206(4):1207–28.

  61. Gómez-Pérez D, Kemen E. Predicting lifestyle from positive selection data and genome properties in oomycetes. Pathogens. 2021;10(7):807.

  62. Muszewska A, Hoffman-Sommer M, Grynberg M. LTR retrotransposons in fungi. PLoS ONE. 2011;6(12): e29425.

  63. Wu L, Chen H, Curtis C, Fu ZQ. Go in for the kill: How plants deploy effector-triggered immunity to combat pathogens. Virulence. 2014;5(7):710–21.

  64. Jaswal R, Kiran K, Rajarammohan S, Dubey H, Singh PK, Sharma Y, et al. Effector biology of biotrophic plant fungal pathogens: Current advances and future prospects. Microbiol Res. 2020;241: 126567.

  65. De Wit PJ. Pathogen avirulence and plant resistance: a key role for recognition. Trends Plant Sci. 1997;2(12):452–8.

  66. Lo Presti L, Lanver D, Schweizer G, Tanaka S, Liang L, Tollot M, et al. Fungal effectors and plant susceptibility. Annu Rev Plant Biol. 2015;66:513–45.

  67. Mehlenbacher SA, Thompson MM, Cameron HR. Occurrence and Inheritance of Resistance to Eastern Filbert Blight in Gasaway Hazelnut. HortScience. 1991;26(4):410–1.

  68. Molnar TJ, Goffreda JC, Funk CR. Survey of Corylus Resistance to Anisogramma anomala from Different Geographic Locations. HortScience. 2010;45(5):832–6.

  69. Molnar TJ, Capik J, Zhao S, Zhang N. First Report of Eastern Filbert Blight on Corylus avellana “Gasaway” and “VR20-11” Caused by Anisogramma anomala in New Jersey. Plant Dis. 2010;94(10):1265.

  70. Sathuvalli VR, Mehlenbacher SA, Smith DC. Response of Hazelnut Accessions to Greenhouse Inoculation with Anisogramma anomala. HortScience. 2010;45(7):1116–9.

  71. Toruño TY, Stergiopoulos I, Coaker G. Plant-pathogen effectors: cellular probes interfering with plant defenses in spatial and temporal manners. Annu Rev Phytopathol. 2016;54:419–41.

  72. Sharpee WC, Dean RA. Form and function of fungal and oomycete effectors. Fungal Biol Rev. 2016;30(2):62–73.

  73. Plissonneau C, Benevenuto J, Mohd-Assaad N, Fouché S, Hartmann FE, Croll D. Using population and comparative genomics to understand the genetic basis of effector-driven fungal pathogen evolution. Front Plant Sci. 2017;8:119.

  74. Raffaele S, Kamoun S. Genome evolution in filamentous plant pathogens: why bigger can be better. Nat Rev Microbiol. 2012;10(6):417–30.

  75. Zhao Z, Liu H, Wang C, Xu J-R. Erratum to: comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi. BMC Genomics. 2014;15(1):1–15.

  76. Gruber S, Seidl-Seiboth V. Self versus non-self: fungal cell wall degradation in Trichoderma. Microbiology. 2012;158(1):26–34.

  77. Lyu X, Shen C, Fu Y, Xie J, Jiang D, Li G, et al. Comparative genomic and transcriptional analyses of the carbohydrate-active enzymes and secretomes of phytopathogenic fungi reveal their significant roles during infection and development. Sci Rep. 2015;5(1):15565.

  78. Scott K. Obligate parasitism by phytopathogenic fungi. Biol Rev. 1972;47(4):537–72.

  79. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999;27(1):29–34.

  80. Kemen E, Gardiner A, Schultz-Larsen T, Kemen AC, Balmuth AL, Robert-Seilaniantz A, et al. Gene gain and loss during evolution of obligate parasitism in the white rust pathogen of Arabidopsis thaliana. PLoS Biol. 2011;9(7): e1001094.

  81. McDowell JM. Genomes of obligate plant pathogens reveal adaptations for obligate parasitism. Proc Natl Acad Sci. 2011;108(22):8921–2.

  82. Pendleton AL, Smith KE, Feau N, Martin FM, Grigoriev IV, Hamelin R, et al. Duplications and losses in gene families of rust pathogens highlight putative effectors. Front Plant Sci. 2014;5:299.

  83. Bourras S, Praz CR, Spanu PD, Keller B. Cereal powdery mildew effectors: a complex toolbox for an obligate pathogen. Curr Opin Microbiol. 2018;46:26–33.

  84. Kamoun S. Groovy times: filamentous pathogen effectors revealed. Curr Opin Plant Biol. 2007;10(4):358–65.

  85. De Jonge R, Bolton MD, Thomma BP. How filamentous pathogens co-opt plants: the ins and outs of fungal effectors. Curr Opin Plant Biol. 2011;14(4):400–6.

  86. Gladyshev E. Repeat-Induced Point Mutation and Other Genome Defense Mechanisms in Fungi. Microbiol Spectr. 2017;5(4):687–99.

  87. Galagan JE, Selker EU. RIP: the evolutionary cost of genome defense. Trends Genet. 2004;20(9):417–23.

  88. Selker EU, Garrett PW. DNA sequence duplications trigger gene inactivation in Neurospora crassa. Proc Natl Acad Sci. 1988;85(18):6870–4.

  89. Cambareri EB, Jensen BC, Schabtach E, Selker EU. Repeat-induced G-C to A-T mutations in Neurospora. Science. 1989;244(4912):1571–5.

  90. Singer MJ, Marcotte BA, Selker EU. DNA methylation associated with repeat-induced point mutation in Neurospora crassa. Mol Cell Biol. 1995;15(10):5586–97.

  91. Wang L, Sun Y, Sun X, Yu L, Xue L, He Z, et al. Repeat-induced point mutation in Neurospora crassa causes the highest known mutation rate and mutational burden of any cellular life. Genome Biol. 2020;21:1–23.

  92. Clutterbuck AJ. Genomic evidence of repeat-induced point mutation (RIP) in filamentous ascomycetes. Fungal Genet Biol. 2011;48(3):306–26.

  93. Selker EU, Tountas NA, Cross SH, Margolin BS, Murphy JG, Bird AP, et al. The methylated component of the Neurospora crassa genome. Nature. 2003;422(6934):893–7.

  94. Margolin BS, Garrett-Engele PW, Stevens JN, Fritz DY, Garrett-Engele C, Metzenberg RL, et al. A methylated Neurospora 5S rRNA pseudogene contains a transposable element inactivated by repeat-induced point mutation. Genetics. 1998;149(4):1787–97.

  95. Selker EU, Stevens JN. DNA methylation at asymmetric sites is associated with numerous transition mutations. Proc Natl Acad Sci. 1985;82(23):8114–8.

  96. Frantzeskakis L, Kusch S, Panstruga R. The need for speed: compartmentalized genome evolution in filamentous phytopathogens. Mol Plant Pathol. 2019;20(1):3–7.

  97. Faino L, Seidl MF, Shi-Kunne X, Pauper M, van den Berg GC, Wittenberg AH, et al. Transposons passively and actively contribute to evolution of the two-speed genome of a fungal pathogen. Genome Res. 2016;26(8):1091–100.

  98. Dong S, Raffaele S, Kamoun S. The two-speed genomes of filamentous pathogens: waltz with plants. Curr Opin Genet Dev. 2015;35:57–65.

  99. Lee SC, Corradi N, Doan S, Dietrich FS, Keeling PJ, Heitman J. Evolution of the sex-related locus and genomic features shared in microsporidia and fungi. PLoS ONE. 2010;5(5):e10539.

  100. Gioti A, Mushegian AA, Strandberg R, Stajich JE, Johannesson H. Unidirectional evolutionary transitions in fungal mating systems and the role of transposable elements. Mol Biol Evol. 2012;29(10):3215–26.

  101. Kanzi AM, Steenkamp ET, Van der Merwe NA, Wingfield BD. The mating system of the Eucalyptus canker pathogen Chrysoporthe austroafricana and closely related species. Fungal Genet Biol. 2019;123:41–52.

  102. Muehlbauer MF, Tobia J, Honig JA, Zhang N, Hillman BI, Gold KM, et al. Population differentiation within Anisogramma anomala in North America. Phytopathology. 2019;109(6):1074–82.

  103. Tobia J, Muehlbauer MF, Honig JA, Pscheidt JW, Capik JM, Molnar TJ. Genetic Diversity Analysis of Anisogramma anomala in the Pacific Northwest and New Jersey. Manuscript submitted for publication. 2022.

  104. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.

  105. Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A. 2011;108(4):1513–8.

  106. Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, et al. Database resources of the National Center for Biotechnology Information in 2023. Nucleic Acids Res. 2023;51(D1):D29–38.

  107. Loureiro J, Rodriguez E, Doležel J, Santos C. Comparison of four nuclear isolation buffers for plant DNA flow cytometry. Ann Bot. 2006;98(3):679–89.

  108. Haridas S, Salamov A, Grigoriev IV. Fungal genome annotation. Fungal Genomics: Springer; 2018. p. 171–84.

  109. Smit A, Hubley R. RepeatModeler Open-1.0. 2008–2015. http://www.repeatmasker.org.

  110. Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci. 2020;117(17):9451–7.

  111. Haas B. TransposonPSI: an application of PSI-Blast to mine (retro-) transposon ORF homologies. Broad Institute, Cambridge, MA, USA. 2007.

  112. Ellinghaus D, Kurtz S, Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 2008;9(1):18.

  113. El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2019;47(D1):D427–32.

  114. Llorens C, Futami R, Covelli L, Domínguez-Escribá L, Viu JM, Tamarit D, et al. The Gypsy Database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res. 2010;39(suppl_1):D70–4.

  115. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110(1–4):462–7.

  116. Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–9.

  117. Coghlan A, Tsai IJ, Berriman M. Creation of a comprehensive repeat library for a newly sequenced parasitic worm genome. Protoc Exch. 2018. https://doi.org/10.1038/protex.2018.054.

  118. Smit A, Hubley R, Green P. RepeatMasker Open-4.0. 2013–2015. http://www.repeatmasker.org.

  119. Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. BRAKER2: incorporating protein homology information into gene prediction with GeneMark-EP and AUGUSTUS. Plant and Animal Genomes XXVI. 2018.

  120. Lomsadze A, Ter-Hovhannisyan V, Chernoff YO, Borodovsky M. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res. 2005;33(20):6494–506.

  121. Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 2008;18(12):1979–90.

  122. Stanke M, Keller O, Gunduz I, Hayes A, Waack S, Morgenstern B. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34(suppl_2):W435–9.

  123. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.

  124. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.

  125. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.

  126. Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, et al. InterProScan: protein domains identifier. Nucleic Acids Res. 2005;33(suppl_2):W116–20.

  127. Mao X, Cai T, Olyarchuk JG, Wei L. Automated genome annotation and pathway identification using the KEGG Orthology (KO) as a controlled vocabulary. Bioinformatics. 2005;21(19):3787–93.

  128. Kanehisa M, Sato Y, Morishima K. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J Mol Biol. 2016;428(4):726–31.

  129. Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35(suppl_2):W182–5.

  130. Petersen TN, Brunak S, Von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8(10):785–6.

  131. Sperschneider J, Dodds PN, Gardiner DM, Singh KB, Taylor JM. Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Mol Plant Pathol. 2018;19(9):2094–110.

  132. Sperschneider J, Gardiner DM, Dodds PN, Tini F, Covarelli L, Singh KB, et al. EffectorP: predicting fungal effector proteins from secretomes using machine learning. New Phytol. 2016;210(2):743–61.

  133. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:1–9.

  134. Wang J, Chitsaz F, Derbyshire MK, Gonzales NR, Gwadz M, Lu S, et al. The conserved domain database in 2023. Nucleic Acids Res. 2023;51(D1):D384–8.

  135. Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004;32(suppl_2):W327–31.

  136. Zhang H, Yohe T, Huang L, Entwistle S, Wu P, Yang Z, et al. dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 2018;46(W1):W95–101.

  137. Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 2012;40(W1):W445–51.

  138. Zheng J, Ge Q, Yan Y, Zhang X, Huang L, Yin Y. dbCAN3: automated carbohydrate-active enzyme and substrate annotation. Nucleic Acids Res. 2023;51(W1):W115–21.

  139. Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 2011;39(suppl_2):W29–37.

  140. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12(1):59–60.

  141. Busk PK, Pilgaard B, Lezyk MJ, Meyer AS, Lange L. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function. BMC Bioinformatics. 2017;18(1):214.

  142. Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 2014;42(D1):D490–5.

  143. Blin K, Shaw S, Steinke K, Villebro R, Ziemert N, Lee SY, et al. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 2019;47(W1):W81–7.

  144. Testa AC, Oliver RP, Hane JK. OcculterCut: A Comprehensive Survey of AT-Rich Regions in Fungal Genomes. Genome Biol Evol. 2016;8(6):2044–64.

  145. Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16(1):157.

  146. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.

  147. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25(15):1972–3.

  148. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, Von Haeseler A, et al. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37(5):1530–4.

  149. Nguyen L-T, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.

  150. Kalyaanamoorthy S, Minh BQ, Wong TK, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9.

  151. Csűrös M. Count: evolutionary analysis of phylogenetic profiles with parsimony and likelihood. Bioinformatics. 2010;26(15):1910–2.

  152. Cohen O, Ashkenazy H, Belinky F, Huchon D, Pupko T. GLOOME: gain loss mapping engine. Bioinformatics. 2010;26(22):2914–5.

  153. RStudio Team. RStudio: integrated development for R. Boston, MA: RStudio, Inc.; 2015. http://www.rstudio.com.

  154. Pagès H, Aboyoun P, Gentleman R, DebRoy S. Biostrings: Efficient manipulation of biological strings. R package version. 2017;2(0).

  155. Hane JK, Oliver RP. RIPCAL: a tool for alignment-based analysis of repeat-induced point mutations in fungal genomic sequences. BMC Bioinformatics. 2008;9(1):478.

  156. Van Wyk S, Harrison CH, Wingfield BD, De Vos L, van Der Merwe NA, Steenkamp ET. The RIPper, a web-based tool for genome-wide quantification of Repeat-Induced Point (RIP) mutations. PeerJ. 2019;7:e7447.

Acknowledgements

We thank Dr. Shawn Mehlenbacher, Oregon State University, for his generous gift of the infected hazelnut stems from which the fungal spores and DNA were isolated.

Funding

Funding from the following sources is gratefully acknowledged: the USDA-NIFA through the Specialty Crop Research Initiative Competitive Grants Program to TJM and BIH (Grant #2016-51181-25412); the Rutgers Microbial Biology Graduate Program for an initial PhD Research Fellowship to ABC; and the New Jersey Agricultural Experiment Station for partial salary support to GC, DCP, TJM, NZ and BIH.

Author information

Contributions

B.I.H., T.J.M., and G.C. conceived the project and designed the research; A.B.C. and G.C. performed the wet lab experiments; A.B.C., D.C.P., and G.C. assembled and annotated the genome; A.B.C. and N.Z. performed the evolutionary analyses; A.B.C., G.C., and B.I.H. wrote the paper with contributions from all authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Guohong Cai or Bradley I. Hillman.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Infected hazelnut stems for laboratory use were imported under USDA Permit # P526P-07–03455 to BIH. The authors comply with relevant institutional guidelines for plant studies.

This paper does not involve human or animal experiments. This paper does not include any personal information. The dataset of this study is being submitted to GenBank and should be available in time for the review process.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Cohen, A.B., Cai, G., Price, D.C. et al. The massive 340 megabase genome of Anisogramma anomala, a biotrophic ascomycete that causes eastern filbert blight of hazelnut. BMC Genomics 25, 347 (2024). https://doi.org/10.1186/s12864-024-10198-1

  • DOI: https://doi.org/10.1186/s12864-024-10198-1

Keywords