Comparative genome analyses of four rice-infecting Rhizoctonia solani isolates reveal extensive enrichment of homogalacturonan modification genes

Background Plant pathogenic isolates of Rhizoctonia solani anastomosis group 1-intraspecific group IA (AG1-IA) infect a wide range of crops, causing diseases such as rice sheath blight (ShB). ShB has become a serious disease in rice production worldwide. Additional genome sequences of the rice-infecting R. solani isolates from different geographical regions will facilitate the identification of important pathogenicity-related genes in the fungus. Results Rice-infecting R. solani isolates B2 (USA), ADB (India), WGL (India), and YN-7 (China) were selected for whole-genome sequencing. Single-Molecule Real-Time (SMRT) and Illumina sequencing were used for de novo sequencing of the B2 genome. The genomes of the other three isolates were then sequenced with Illumina technology and assembled using the B2 genome as a reference. The four genomes ranged from 38.9 to 45.0 Mbp in size, contained 9715 to 11,505 protein-coding genes, and shared 5812 conserved orthogroups. The proportion of transposable elements (TEs) and the average length of TE sequences in the B2 genome were nearly 3 times and 2 times greater, respectively, than those of ADB, WGL and YN-7. Although 818 to 888 putative secreted proteins were identified in the four isolates, only 30% of them were predicted to be small secreted proteins, which is a smaller proportion than what is usually found in the genomes of cereal necrotrophic fungi. Despite a lack of putative secondary metabolite biosynthesis gene clusters, the rice-infecting R. solani genomes were predicted to contain the most carbohydrate-active enzyme (CAZyme) genes among all 27 fungal genomes used in the comparative analysis. Specifically, extensive enrichment of pectin/homogalacturonan modification genes was found in all four rice-infecting R. solani genomes. Conclusion Four R. solani genomes were sequenced, annotated, and compared to other fungal genomes to identify distinctive genomic features that may contribute to the pathogenicity of rice-infecting R. solani. Our analyses provided evidence that R. solani genomes are more divergent among neighboring AGs than among AG1-IA isolates, and that the numerous predicted pectin modification genes in the rice-infecting R. solani genomes may contribute to the wide host range and virulence of this necrotrophic fungal pathogen. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07549-7.


This study aimed to sequence four genomes of R. solani isolated from ShB-infected rice and to conduct comparative genome analyses among them (R. solani AG1-IA) as well as with four R. solani genomes belonging to AGs other than AG1-IA (AG1-IB, AG2, AG3 and AG8). We hypothesized that rice-infecting R. solani would possess a large arsenal of cell wall-degrading genes to support its necrotrophic lifestyle and host range. To test this hypothesis, we assembled high-quality genome sequences for four rice-infecting R. solani strains that were isolated from rice grown in diverse geographic regions of the world (USA, China, and India) and compared them to publicly available genomes belonging to R. solani AG1-IA, AG1-IB, AG2-2IIIB, AG3, and AG8 [13,16,17,19,20]. We also selected representative genomes encompassing different nutritional lifestyles and hosts from Basidiomycota (9 genomes) and Ascomycota (9 genomes) for inclusion in our comparative analyses. In this study, pairwise whole-genome alignments suggest that macrosynteny exists among the rice-infecting R. solani genomes. This phylogenetic proximity is supported by a phylogenetic tree constructed using the maximum likelihood method as well as by the existence of a larger set of core orthogroups among rice-infecting AG1-IA genomes (5812 orthogroups) than among R. solani genomes from diverse AGs (3635 orthogroups). Comparative genome analyses also revealed that rice-infecting R. solani have a smaller set of SSPs compared to biotrophs and other necrotrophs (cereal). Conversely, rice-infecting R. solani genomes code for the highest number of CAZymes, which are predicted to be involved in plant cell wall modification and degradation. Specifically, all R. solani genomes used in this study, regardless of AG, were highly enriched in pectin-degrading genes, containing even more than the well-known pectin-degrading, necrotrophic fungus Verticillium dahliae. The high-quality genome sequencing data and comparative genomic results from this study are useful resources for functional analysis of pathogenicity genes in this important fungal pathogen of rice.

Results
High-quality genome sequences for rice-infecting R. solani AG1-IA isolates
Four rice-infecting R. solani isolates were collected from rice grown in the USA (B2), India (ADB, WGL), and China (YN-7) (Table 1). De novo sequencing of the B2 genome was achieved using Single-Molecule Real-Time (SMRT; Pacific Biosciences) sequencing; the WGL, ADB, and YN-7 genomes were sequenced using Illumina technology and subsequently assembled using the B2 genome as reference (Additional file 1: Fig. S1). B2 had the largest genome of the four isolates (45. [16]. Thus, all comparative genomic analyses hereafter used the B2 genome as the representative for rice-infecting R. solani AG1-IA isolates.

Phylogenetic proximity of R. solani isolates
To evaluate the phylogenetic relationships of all 27 fungal genomes used in this study (Additional file 2: Table S1), we constructed a maximum likelihood-based phylogenetic tree using single-copy orthogroups (Fig. 1a). We confirmed that the four rice-infecting R. solani isolates were most closely related to each other and to the previously sequenced Chinese R. solani isolate, followed by AG1-IB, and finally the remaining R. solani AG isolates used in this study. In addition, comparison of the R. solani genomes indicated that the rice-infecting R. solani genomes share 5812 orthogroups of protein-coding genes, while the genomes of different AG-ISG groups (AG1-IA B2, AG1-IB, AG2-2IIIB, AG3, and AG8) share only 3635 orthogroups (Fig. 1b). Of these orthogroups, 25 to 164 were specific to individual genomes of rice-infecting R. solani AG1-IA, while 318 to 3329 were AG-ISG specific.
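The core and genome-specific orthogroup counts above amount to set operations on a presence table. The following is a minimal Python sketch of that idea, not the authors' pipeline; the function name `core_and_specific` and the toy orthogroup IDs are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def core_and_specific(
    presence: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """presence maps a genome name to the set of orthogroup IDs found in it."""
    genomes = list(presence)
    # Core orthogroups: present in every genome.
    core = set.intersection(*(presence[g] for g in genomes))
    # Genome-specific orthogroups: present in exactly one genome.
    specific = {}
    for g in genomes:
        others = set().union(*(presence[o] for o in genomes if o != g))
        specific[g] = presence[g] - others
    return core, specific


# Toy, made-up presence data (not taken from the study).
presence = {
    "B2": {"OG1", "OG2", "OG3", "OG7"},
    "ADB": {"OG1", "OG2", "OG4"},
    "WGL": {"OG1", "OG2", "OG3"},
    "YN-7": {"OG1", "OG2", "OG5", "OG6"},
}
core, specific = core_and_specific(presence)
print(len(core), {g: len(s) for g, s in specific.items()})
```

Adding more divergent genomes to such a table can only shrink the intersection, which is consistent with the smaller core set reported for the cross-AG comparison.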

Genome synteny between R. solani isolates
To determine the synteny between R. solani genomes, we performed pairwise whole-genome alignments using PROmer [38], a script-based pipeline to align multiple, divergent sequences and identify similar genomic regions based on the translation of all six reading frames. Large syntenic size (Fig. 2a) and diagonal dot plots (Additional file 3: Fig. S2) suggested that the five rice-infecting R. solani genomes (the four from this study and the previously sequenced Chinese isolate) share a high degree of genome conservation, ranging from 66 to 70.9% (Additional file 4: Table S2) when the B2 genome was used as reference for comparison. However, the degree of genome conservation between the B2 genome and those of different R. solani AGs (AG1-IB, AG2-2IIIB, AG3, and AG8) dramatically decreases to 3.3 to 9.6%, suggesting that the AG1-IA genome is quite divergent from other AGs.
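The text does not state how the conservation percentages were derived from the alignments. One plausible reading, shown here purely as an assumption, is the fraction of reference bases covered by at least one alignment. The sketch below merges overlapping alignment intervals per reference contig and reports that fraction; the function names, contig names and coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end) on a reference contig


def covered_bases(intervals: List[Interval]) -> int:
    """Count bases covered by at least one interval, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    # Normalize each interval so start <= end, then sweep in sorted order.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def percent_conserved(
    alignments: Dict[str, List[Interval]], contig_lengths: Dict[str, int]
) -> float:
    """Percentage of the reference genome covered by alignments (assumed metric)."""
    covered = sum(covered_bases(alignments.get(c, [])) for c in contig_lengths)
    return 100.0 * covered / sum(contig_lengths.values())


# Toy, made-up reference contigs and alignment coordinates.
contig_lengths = {"contig1": 1000, "contig2": 500}
alignments = {
    "contig1": [(1, 300), (250, 600), (900, 800)],
    "contig2": [(101, 200)],
}
pct = percent_conserved(alignments, contig_lengths)
print(f"{pct:.1f}% of reference bases covered")
```

Merging before summing matters because overlapping alignments would otherwise be counted twice and inflate the percentage.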

Protein-coding gene conservation among R. solani isolates
To determine the degree of proteome similarity among AGs, we performed pairwise ortholog clustering to compare the protein sequences of all protein-coding genes. The number of shared orthologs ranged from 8798 to 9723 among the protein-coding genes of R. solani AG1-IA isolates (Additional file 5: Table S3). The average intra-AG proteome similarity of R. solani was 89.69%. In contrast, inter-AG protein-coding gene similarity was more diversified, with the percentage of shared predicted proteomes averaging 68.36%. Moreover, to determine the protein-coding gene similarity of genomes belonging to Basidiomycota, we added four Ustilago and four Trametes genomes to the five Agaricomycetes genomes used for phylogenetic analyses (Additional file 6: Table S4). The inter-AG similarity of R. solani genomes (compared with the B2 strain) was higher than that among different Basidiomycota genera (54.53%). In contrast, genomes within the same Basidiomycota genus showed higher protein-coding gene similarity than the inter-AG comparisons (Ustilago intra-genus, 88.08%; Trametes intra-genus, 83.41%) (Fig. 2b). When only single-copy orthologs were compared, the discrepancy was even greater (intra-AG1-IA, 63.89%; inter-AG, 33.27%).
Transposable element profiles of the rice-infecting R. solani isolates
Transposable elements (TEs), such as class I retrotransposons and class II DNA transposons, can create temporary or permanent genomic rearrangements and modifications [39], and the abundance and frequency of these genetic elements can significantly influence the size of eukaryotic genomes [40]. To define the repetitive element profiles for the rice-infecting R. solani genomes, we analyzed the type and proportion of repetitive elements in each newly sequenced genome. The B2 genome contains the largest proportion of TEs (26.74%) compared to the three other R. solani AG1-IA genomes: ADB (8.89%), WGL (9.16%), and YN-7 (6.18%) (Additional file 7: Table S5). The total numbers of TEs in the ADB and WGL genomes are comparable (ADB: 10,421; WGL: 10,248), lower than that of the B2 genome (17,123) and higher than that of YN-7 (8030). Specifically, the B2 genome contains the highest proportion of DNA transposons, Long Terminal Repeats (LTRs) and Long Interspersed Nuclear Elements (LINEs) among the R. solani AG1-IA genomes; the proportion of LTRs in B2 (20.14%) was more than 3 times that of ADB (5.47%), WGL (5.65%) and YN-7 (3.84%). However, all four newly sequenced AG1-IA genomes as well as the previously sequenced AG1-IA genome possess lower numbers of TEs compared to AG1-IB, AG2-2IIIB, AG3 and AG8. In terms of the average length of repetitive sequences, those of the PacBio-sequenced B2 genome are approximately twice as long (797.4 bp) as those of the Illumina-sequenced ADB (370 bp), WGL (385.6 bp), and YN-7 (345.2 bp) genomes in this study and of the previously sequenced genomes of AG1-IA, AG1-IB, AG2-2IIIB, AG3 and AG8.
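The fold differences quoted above can be checked directly from the reported values. The short Python sketch below does only that arithmetic; the percentages and lengths are those given in the text, while the helper name `fold_over_others` is hypothetical.

```python
# Values as reported in the text for the four newly sequenced AG1-IA genomes.
te_percent = {"B2": 26.74, "ADB": 8.89, "WGL": 9.16, "YN-7": 6.18}
ltr_percent = {"B2": 20.14, "ADB": 5.47, "WGL": 5.65, "YN-7": 3.84}
mean_te_len_bp = {"B2": 797.4, "ADB": 370.0, "WGL": 385.6, "YN-7": 345.2}


def fold_over_others(values, reference="B2"):
    """Ratio of the reference genome's value to each other genome's value."""
    return {k: values[reference] / v for k, v in values.items() if k != reference}


for label, values in [
    ("TE proportion", te_percent),
    ("LTR proportion", ltr_percent),
    ("Mean TE length", mean_te_len_bp),
]:
    folds = fold_over_others(values)
    print(label, {k: round(v, 2) for k, v in folds.items()})
```

Because B2 was assembled from long reads and the other three genomes from short reads mapped to B2, some of this difference may reflect assembly method as well as biology.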
Predicted secretome of rice-infecting R. solani isolates
The putative secretomes of the rice-infecting R. solani isolates were analyzed, and 818 to 888 predicted secreted protein genes were identified (Fig. 3a, Additional file 8: Table S6). This suggests that rice-infecting R. solani isolates have a secretome of intermediate size: smaller than those of necrotrophic ascomycetes (cereal) but larger than those of the brown-rot fungi Postia placenta and Dacryopinax sp. as well as the biotrophs Ustilago maydis and Blumeria graminis. The number of small secreted proteins (SSPs; putative effectors) in the four genomes ranges from 263 to 279, accounting for 30-33% of each isolate's predicted secretome. We identified 367 R. solani-specific orthogroups of SSPs. Among these, 12 (AG1-IB) to 105 (AG8) AG-specific SSPs were identified through ortholog comparison analysis of putative SSP gene sets (Additional file 9: Table S7), with the greatest number found in AG8. We also observed that Rhizoctonia AG1-IA genomes have relatively small numbers of predicted protein-coding genes and SSPs among the necrotrophic fungal groups (Fig. 3b). Furthermore, using the 272 SSPs identified in B2, we aligned their protein sequences to the genomes of R. solani AG1-IA, AG1-IB, AG2-2IIIB, AG3, and AG8. The B2 SSPs showed a high degree of homology among rice-infecting AG1-IA genomes but decreased homology to genomes of other R. solani AGs (Fig. 3c, Additional file 10: Table S8).

Predicted CAZyme genes of the rice infecting R. solani isolates
We identified cell wall degrading enzymes by searching for each of the different CAZyme gene families across all 27 fungal genomes (Fig. 4). These fungal genomes were then categorized into 11 groups considering their nutritional lifestyle and type of host. A chi-square test of proportions was then used to determine whether gene frequency variations between genomes of each grouping were significant (Additional file 11: Table S9 and Additional file 12: Table S10). Our analyses indicated that there was significant variation across all 11 groups for all CAZyme gene families except GTs. Rice-infecting R. solani genomes had the highest enrichment of CAZyme genes (725 genes) while the genomes of other R. solani AGs, necrotrophs (cereal and dicot) showed only moderate enrichment for these genes. In contrast, biotrophs and brown rot genomes contain a relatively low number of CAZymes compared to rice-infecting R. solani genomes.
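The chi-square test of proportions mentioned above compares, for one gene family, the observed counts in each group against the counts expected if all groups shared the same proportion. The authors performed their tests in R; the Python sketch below is only an illustration of the statistic itself, with a hypothetical function name and made-up counts. It returns the statistic and degrees of freedom and leaves the p-value to a statistics package.

```python
from typing import List, Tuple


def chi_square_homogeneity(counts: List[Tuple[int, int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic for equality of proportions across groups.

    counts holds one (genes_in_family, all_other_genes) pair per group.
    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [a + b for a, b in counts]
    col_totals = [sum(a for a, _ in counts), sum(b for _, b in counts)]
    grand = sum(row_totals)
    stat = 0.0
    for (a, b), rt in zip(counts, row_totals):
        for observed, ct in ((a, col_totals[0]), (b, col_totals[1])):
            # Expected count under the null of a common proportion.
            expected = rt * ct / grand
            stat += (observed - expected) ** 2 / expected
    return stat, len(counts) - 1


# Made-up example: two groups of 100 genes with 30 and 10 family members.
stat, dof = chi_square_homogeneity([(30, 70), (10, 90)])
print(stat, dof)
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom indicates that the family's share of genes differs between groups, which is the sense in which enrichment is reported here.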

Lignocellulose-degrading genes in rice-infecting R. solani isolates
To ascertain whether rice-infecting R. solani isolates can degrade lignocellulose in a similar fashion to fungi in the subdivision Agaricomycotina, we specifically searched for AA family-encoding genes (Additional file 13: Table S11). Genomes of white-rot fungi and other R. solani AGs had the highest total number of CAZyme genes (131 genes), followed by necrotrophs (cereal) (128 genes) and rice-infecting R. solani (121 genes). In contrast, brown-rot and biotroph genomes completely lacked these lignin depolymerization genes. We also examined the genomes to identify genes belonging to individual AA subfamilies. Genes belonging to the AA1 subfamily (EC 1.10.3.2) were most abundant in white-rot fungi but could also be found in rice-infecting R. solani isolates, hemibiotrophs, and necrotrophs (cereal); however, the presence of these AA1 genes was significantly lower in brown-rot fungi and biotrophs. In addition, while brown-rot and rice-infecting R. solani genomes did not contain any representatives of the AA2 subfamily, these genes were present in the genomes of necrotrophs (cereal) and white-rot fungi; the latter were highly enriched in manganese peroxidase (MnPs; EC 1.11.1.13) and versatile peroxidase (VPs; EC 1.11.1.16) genes. We also observed enrichment of AA8 subfamily genes in the genomes of rice-infecting R. solani isolates, while white-rot genomes had only 1 or 2 AA8 genes and cereal necrotroph genomes had none (except for Septoria nodorum, which had 1). Finally, the AA5 subfamily, which was absent in brown-rot fungi, was significantly enriched in other R. solani AGs and to a lesser extent in rice-infecting R. solani and white-rot fungi.

Pectin-degrading and modifying genes in rice-infecting R. solani
Extensive enrichment of genes belonging to the PL family was observed in both rice-infecting (82 genes) and other R. solani AGs (76 genes) (Additional file 13:

Genes for monocot-specific cell wall degrading enzymes
To determine whether rice-infecting R. solani isolates have any CAZyme genes that allow them to infect their monocot host, we searched for CAZyme genes that degrade arabinoxylans, ferulic acids, and mixed linked glucans (MLGs), such as α-L-arabinofuranosidases (  Table S12). α-L-arabinofuranosidase genes were enriched in necrotrophs, including R. solani AG1-IA and other R. solani AGs, but not in white- and brown-rot fungi. However, while feruloyl esterase genes were enriched in necrotrophs (cereal), symbionts, and hemibiotrophs, none were identified in rice-infecting R. solani. While the raw data suggested that there was an enrichment in genes encoding (1,3;1,4)-β-D-glucan endohydrolases/licheninases in rice-infecting R. solani genomes, our chi-square test failed to reject the null hypothesis, so the proportion of (1,3;1,4)-β-D-glucan endohydrolase/licheninase genes across the different fungal genomes is likely similar.

Prediction of secondary metabolite biosynthesis gene clusters
antiSMASH [41] was used to identify putative secondary metabolite biosynthesis gene clusters for polyketide synthase (PKS), terpene synthase (TS), nonribosomal peptide synthetase (NRPS), and other accessory enzymes in rice-infecting R. solani isolates (Additional file 15: Table S12). However, none of the secondary metabolite biosynthesis gene clusters predicted for rice-infecting R. solani isolates or members of related AGs contained PKS genes. In contrast, there was an abundance of secondary metabolite-producing enzymes predicted for necrotrophic ascomycetes (cereal), including type 1 PKS, type 2 PKS, NRPS, and TS. Despite the low abundance and diversity of putative secondary metabolite biosynthesis gene clusters in R. solani genomes, R. solani AGs and necrotrophic ascomycetes (cereal) have similar levels of terpenes, which suggests that secondary metabolites may not be important for R. solani virulence.

Discussion
De novo and reference-based genome assemblies of rice-infecting R. solani isolates
Both SMRT and Illumina sequencing technologies were used in the de novo sequencing of the B2 genome, which allowed us to exploit the ability of SMRT sequencing to generate long reads (5 to 20 kb) and precisely capture genomic regions containing repetitive elements and novel gene isoforms [42]. Utilizing these sequencing approaches facilitated assembly of the 45-Mbp B2 genome, which is much larger than a previously published 36.94-Mbp R. solani AG1-IA genome and has a relatively high proportion of repetitive sequences. It also provided an opportunity to accurately annotate the TE content of the B2 genome. When compared with both newly and previously sequenced R. solani AG1-IA genomes as well as with genomes of other AGs, the B2 genome was found to possess the highest proportion of TEs and the longest average TE length. Thus, the higher-quality B2 genome generated from the two sequencing methods allowed us to more accurately annotate the R. solani AG1-IA genomes for detailed comparative genome analyses of this important fungal pathogen.

Genomic differences among R. solani isolates
In previous studies, R. solani isolates have been classified based on their ability to undergo hyphal fusion or through sequence analysis of phylogenetic markers [9,43,44], but whole-genome comparisons were not available. Here, we compared five R. solani AG1-IA genomes and four neighboring AG representative genomes and found that genomic conservation decreased drastically when moving from comparisons among AG1-IA genomes to comparisons among genomes of different AGs. Moreover, the similarity among inter-AG genomes was lower than that within other Basidiomycota genera. This phylogenomic result suggests that the genome content of the R. solani species complex reflects multi-species features.

Small set of predicted putative effectors in rice-infecting R. solani isolates
It has been shown that necrotrophic fungi have fewer effectors than biotrophs [45]. Along this line, we found that the newly sequenced rice-infecting R. solani AG1-IA genomes had a relatively small set of effectors among the fungal genomes analyzed in this study. Previous reports suggest that the broad host range of R. solani is not dependent on the size of its secretome [46] but rather on the secretion of specific effectors that enable infection of a variety of different hosts, as is the case with the necrotroph Sclerotinia sclerotiorum [47]. Furthermore, it has been reported that Colletotrichum pathogens from different clades have tailored suites of CAZymes that are specific to their individual host range and infection lifestyle [48]. We speculate that the relatively low number of SSPs (putative effectors) may be compensated for by the large and diverse arsenal of CAZymes of rice-infecting R. solani, allowing them to be competitive pathogens with a broad host range. Additional bioinformatic and functional genomics analyses must be conducted in order to dissect the role of SSPs in rice-infecting R. solani genomes. However, we have provided evidence that the SSPs among the rice-infecting R. solani genomes share high homology, and the decreased homology of these SSPs among genomes of different R. solani AGs suggests that the SSPs of the R. solani species complex as a whole may be more diverse than previously expected.
Lignocellulose-degrading CAZyme genes in rice-infecting R. solani isolates
Essential for cell growth and differentiation, cell walls are primarily comprised of cellulose, hemicellulose, pectin, and lignin [49][50][51][52][53], which also provide plants with resistance to and protection from biotic and abiotic stresses [54][55][56]. Plant biomass-degrading fungi that inhabit diverse ecological niches such as forest litters, trees, crops, and grasses of the subdivision Agaricomycotina [34] are classified as either white- or brown-rot [57] based on their ability to degrade lignin. White-rot fungi use oxidative enzymes [58,59] like glyoxal oxidases to efficiently depolymerize lignin and class II heme peroxidases of the AA2 subfamily, such as MnPs and VPs, to degrade the lignin matrix and expose the embedded cellulose [60,61]. While we observed enrichment of glyoxal oxidase-encoding genes in rice-infecting R. solani genomes, we could not identify any peroxidase genes in the rice-infecting R. solani or brown-rot genomes. This apparent lack of peroxidases may explain why brown-rot fungi rapidly degrade cellulose but leave behind a chemically modified lignin matrix for long-term degradation by other microbes [62][63][64].
While a relationship between modes of wood decomposition and CAZyme families exists, some wood-decaying fungi cannot be strictly categorized as white- or brown-rot, suggesting that a continuum may exist between these two types of wood-decay fungi [57]. Rice-infecting R. solani isolates were enriched in lignocellulose-degrading CAZyme genes as well as strong cellulose-degrading enzymes, which are hallmarks of white- and brown-rot fungi, respectively. Hence, we hypothesize that R. solani may fall along the continuum between these two types of wood-decay fungi. Furthermore, the abundance of oxidoreductase and iron reductase genes in the R. solani genomes may indicate that these enzymes are involved in the production of hydroxyl radicals to drive lignocellulose attack, thereby working with crystalline cellulose-degrading LPMOs to expose the complex lignocellulose structure for further cell wall degradation by other CAZymes.

Enrichment of pectin-degrading enzymes in rice-infecting R. solani isolates
Pectin is a structural heteropolysaccharide with a 1,4-α-D-galacturonic acid (GalA) backbone that contributes to the mechanical strength of plants [65][66][67] by forming a gel-like matrix that interacts with cellulose and hemicellulose in the primary plant cell wall [51,68]. It is present in higher proportions in dicot (type I) cell walls than in monocot (type II) cell walls [50], and some reports suggest that dicot-specific fungal pathogens have higher amounts of pectin-degrading enzymes than monocot-specific pathogens [69,70]. Pectin is classified based on the degree of methoxylation, methylesterification, and/or acetylation of its backbone [71,72]. Homogalacturonan (HG), a polymer of 1,4-linked α-D-galactopyranosyluronic acid, exists in methylesterified or acetylated form in the primary cell walls of plants [65], and pectin methylesterases (PMEs; EC 3.1.1.11) and pectin acetylesterases (PAEs; EC 3.1.1.6) catalyze the demethylesterification and deacetylation of HG, respectively [73][74][75]. This process yields substrates for polygalacturonases (PGs) [76], pectin lyases (PNLs; EC 4.2.2.10), and pectate lyases (PLs; EC 4.2.2.2), which loosen the cell wall [77]. Demethylesterification of HGs can also lead to 'egg-box' formation and cell wall stiffening caused by the interaction of negatively charged demethylesterified HG with divalent cations such as calcium ions [77], which may explain the stiff, hollow-textured stem phenotypes observed in ShB-susceptible rice cultivars but not in moderately ShB-resistant rice cultivars (Lee et al., unpublished data). The significant enrichment of PMEs, PAEs, PGs, PNLs, and PLs in R. solani genomes suggests that these pathogens may have evolved a diverse suite of pectin depolymerization enzymes that allow them to efficiently breach host cell walls. These expanded homogalacturonan modification genes encode enzymes known to loosen cell walls during the infection process.
Similarly, mutations in Arabidopsis thaliana pectin methylesterase 35 (PME35) lead to the suppression of HG demethylesterification and a concomitant increased stem deformation rate, supporting the essential role of pectin in maintaining the integrity of the plant cell wall and supporting the plant's mechanical properties [65].
Gene family expansions and contractions are the signatures of an organism's adaptation to new ecological niches [78], as exemplified by the presence of numerous pectin-degrading genes in rice-infecting R. solani and neighboring R. solani AGs. However, dicot-specific fungal pathogens may not necessarily have more specialized pectin-depolymerizing enzyme suites than monocot-specific pathogens, as genes encoding HG-modifying enzymes are more highly enriched in rice-infecting R. solani isolates than in the dicot-specific pathogen V. dahliae. This enrichment in pectin-degrading genes indicates that R. solani can degrade a wide range of pectic substrates and polysaccharide linkages, allowing it to use multiple virulence mechanisms to invade a variety of hosts. Moreover, the large and diverse suite of pectin-degrading enzymes in rice-infecting R. solani isolates may not have evolved in response to the amount of pectin in host plant cell walls but rather as an efficient mechanism for loosening plant cell walls, breaking crosslinks with other cell wall components, and dissolving plant tissue.

Secondary metabolite biosynthesis clusters in R. solani isolates
Previous studies suggest an association between the loss of secondary metabolite genes and biotrophy [87,88]. However, most of the pathways for secondary metabolite synthesis in the biotrophic fungus Cladosporium fulvum were revealed to be cryptic [89]. Although our results suggest that R. solani possesses a limited number of secondary metabolite biosynthesis clusters, further research involving expression analyses of the putative secondary metabolite genes, together with metabolite extraction and chromatography, will be needed. These analyses will provide conclusive evidence about the extent of involvement of secondary metabolite genes in the lifestyle and pathogenicity of R. solani.

Conclusion
In this study, we analyzed the cell wall degrading enzyme profiles of four newly sequenced rice-infecting R. solani genomes. Comparative analyses of these rice-infecting R. solani genomes can help identify cell wall degrading mechanisms, such as homogalacturonan modification, that are utilized by this necrotrophic, rice-infecting ShB pathogen. As more R. solani genomes are sequenced in the future, reclassification of this fungal pathogen should be discussed and implemented. Moreover, our findings, along with the high-quality genome sequences of rice-infecting isolates of R. solani AG1-IA, provide additional genomic resources that can be used to further our understanding of the pathobiology of this necrotrophic fungal pathogen.

Fungal DNA extraction of R. solani isolates
Hyphal tip isolation and culture maintenance of R. solani isolates were conducted using Potato Dextrose Agar (PDA). R. solani hyphae-containing agar blocks were isolated from the actively growing mycelial portion of the fungus and cultured in the dark in liquid Potato Dextrose Broth (PDB) at 25°C on an orbital shaker (150 rpm) for 4-5 days. Mycelia were filtered using sterile Miracloth (Millipore, Sigma, Burlington, MA, USA), rinsed with sterile distilled water, and frozen in liquid nitrogen. Genomic DNA (gDNA) was extracted using a DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA), and the resulting DNA pellet was resuspended in 10 mM Tris-HCl (pH 8.0) buffer. The quality of the isolated gDNA was assessed using agarose gel-based electrophoresis, while the total DNA concentration was calculated based on UV-Vis measurements on a Nanodrop spectrophotometer. Isolated gDNAs were sent to the National Instrumentation Center for Environmental Management (NICEM), Seoul National University, Korea, for SMRT and Illumina-based sequencing.

Genome sequencing, assembly, and annotation
A PacBio-based assembly strategy was used to assemble the B2 genome. Raw PacBio RSII sequence reads were assembled and corrected using Canu v2.0 [90] and trimmed using Circlator v0.14.0 [91]. All resulting contigs were then joined, after which the Redundans assembly pipeline [92] and Pilon v1.22 [93] were used to improve the final draft genome.

Comparative analyses and ortholog clustering
For pairwise genomic comparisons, MUMmer v3.23 [38] was used to align and compare the whole genome sequences of R. solani isolates. PROmer, a built-in MUMmer package that generates and aligns translations of all six reading frames for genome sequences of interest, was used to determine the extent of synteny between the genomes used in this study. OrthoFinder v2.2.7 [102] was used for ortholog clustering to sort out single-copy gene families that would be the most phylogenetically informative. Single-copy ortholog genes in all fungal species were then aligned using ClustalW v2.1 [103], and poorly aligned regions were removed using trimAl v1.2 with the strict method [104]. RAxML v8.2.8 [105] with 1000 bootstrap replicates was used to construct a maximum likelihood-based phylogenetic tree. Ortholog genes were annotated with Gene Ontology (GO) terms using InterProScan v5.20 [106].
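Selecting single-copy gene families means keeping only orthogroups with exactly one member in every species. A minimal Python sketch of that filter follows; it does not reproduce OrthoFinder's own output handling, and the data structure, function name and gene IDs are hypothetical.

```python
from typing import Dict, List


def single_copy_orthogroups(
    orthogroups: Dict[str, Dict[str, List[str]]], species: List[str]
) -> List[str]:
    """Return IDs of orthogroups with exactly one gene in every listed species.

    orthogroups maps orthogroup ID -> {species name -> list of gene IDs}.
    """
    return sorted(
        og
        for og, members in orthogroups.items()
        if all(len(members.get(sp, [])) == 1 for sp in species)
    )


# Toy, made-up orthogroup table.
species = ["B2", "AG3", "AG8"]
orthogroups = {
    "OG0001": {"B2": ["b2_g1"], "AG3": ["ag3_g9"], "AG8": ["ag8_g4"]},
    "OG0002": {"B2": ["b2_g2", "b2_g3"], "AG3": ["ag3_g1"], "AG8": ["ag8_g7"]},
    "OG0003": {"B2": ["b2_g5"], "AG3": ["ag3_g2"]},
}
single_copy = single_copy_orthogroups(orthogroups, species)
print(single_copy)
```

Orthogroups with duplicated or missing members are excluded because paralogs and absences make it unclear which sequences should be compared between species when building the tree.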

Gene family analyses
Genes encoding plant cell wall degrading enzymes were predicted and categorized using dbCAN HMMER v6 [107]. EC assignments were taken from the classification in CAZyDB-ec-info.txt.07-20-2017. Each classified group from dbCAN was subdivided by EC classification using BLAST 2.2.26. Before phylogenetic analysis, the EC-classified protein sequences were aligned using ClustalW 2.1 and poorly aligned regions were removed using trimAl v1.2. Phylogenetic trees were constructed using RAxML version 8.2.9 with 1000 bootstrap replicates. We reconciled the gene tree resulting from this analysis with the species tree using NOTUNG 2.6 [108]. The secretome data of selected species were obtained from the Fungal Secretome Database (FSD) [109]. The database detects all possible secreted proteins by eliminating proteins with transmembrane or endoplasmic reticulum domains and using SignalP 3.0 [110]. The SSPs were then selected from each fungal secretome as proteins shorter than 300 amino acids, as previously described [45]. Exonerate 2.4.0 was utilized to perform protein-to-genome sequence alignments of the effectors among R. solani genomes [111]. Genes encoding laccases and peroxidases were predicted using fPoxDB [112], while putative secondary metabolite biosynthesis gene clusters were identified using antiSMASH v3.0 [41], and the P450 database [113] was searched to predict cytochrome P450 genes in each genome. Transcription factors were identified using the Fungal Transcription Factor Database (FTFD) pipeline [114], which utilizes data from InterPro v12 [115].
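The SSP selection step described above is a length filter applied to an already predicted secretome. The Python sketch below illustrates it under stated assumptions: the secretome is a mapping from protein ID to amino acid sequence, a trailing `*` stop symbol (if present) is not counted, and the function name and toy sequences are hypothetical.

```python
from typing import Dict


def select_ssps(secretome: Dict[str, str], max_length: int = 300) -> Dict[str, str]:
    """Keep secreted proteins strictly shorter than max_length amino acids.

    A trailing '*' (stop symbol) is ignored when measuring length; this is an
    assumption of the sketch, not something stated in the text.
    """
    return {
        pid: seq
        for pid, seq in secretome.items()
        if len(seq.rstrip("*")) < max_length
    }


# Toy, made-up secretome with placeholder sequences.
secretome = {
    "prot_a": "M" * 120,
    "prot_b": "M" * 300,
    "prot_c": "M" * 299 + "*",
    "prot_d": "M" * 850,
}
ssps = select_ssps(secretome)
share = 100.0 * len(ssps) / len(secretome)
print(sorted(ssps), f"{share:.0f}% of the toy secretome")
```

The same ratio, computed on the real secretomes, corresponds to the share of each predicted secretome that consists of SSPs.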

Statistical analysis Chi-square tests of proportions
Chi-square tests of proportions for comparative analyses of CAZyme genes and secondary metabolite biosynthesis clusters were performed using R [116].