New genome assemblies reveal patterns of domestication and adaptation across Brettanomyces (Dekkera) species

Background Yeasts of the genus Brettanomyces are of significant interest, both for their capacity to spoil and for their potential to contribute positively to different industrial fermentations. However, considerable variance exists in the depth of research and the knowledge base for the five currently known species of Brettanomyces. For instance, Brettanomyces bruxellensis has been heavily studied and many resources are available for this species, whereas Brettanomyces nanus is rarely studied and lacks a publicly available genome assembly altogether. The purpose of this study is to fill this knowledge gap and explore the genomic adaptations that have shaped the evolution of this genus. Results Strains for each of the five widely accepted species of Brettanomyces (Brettanomyces anomalus, B. bruxellensis, Brettanomyces custersianus, Brettanomyces naardenensis, and B. nanus) were sequenced using a combination of long- and short-read sequencing technologies. Highly contiguous assemblies were produced for each species. Structural differences between the species’ genomes were observed, and gene expansions in fermentation-relevant genes (particularly in B. bruxellensis and B. nanus) were identified. Numerous horizontal gene transfer (HGT) events were also observed in all Brettanomyces species, including an HGT event that is probably responsible for allowing B. bruxellensis and B. anomalus to utilize sucrose. Conclusions Genomic adaptations and some evidence of domestication that have taken place in Brettanomyces are outlined. These new genome assemblies form a valuable resource for future research in Brettanomyces.


Background
Most commercial alcoholic fermentations are currently performed by yeast from the genus Saccharomyces with the most common species being Saccharomyces cerevisiae. The domestication of S. cerevisiae is thought to have begun as early as prehistoric times [1]. To date, many commercially available strains have been selected for fermentation in harsh conditions, such as those encountered during wine, beer, and industrial bioethanol fermentations [2][3][4]. In parallel with Saccharomyces, a distantly related genus of budding yeasts, Brettanomyces (teleomorph Dekkera), has also convergently evolved to occupy this same fermentative niche [5].
There are currently five accepted species of Brettanomyces: B. anomalus, B. bruxellensis, B. custersianus, B. naardenensis, and B. nanus [6]. A sixth species, Brettanomyces acidodurans, was recently described, although it has only been tentatively assigned to this genus owing to its high genetic divergence relative to the other five species, and it has not been included in this study [7]. Brettanomyces species were originally characterized with a combination of morphological, physiological, and chemotaxonomical traits [8], although the phylogeny has since been defined and updated using several methodologies, often with conflicting results [8][9][10]. Three different phylogenies were originally presented based on analyses of the 18S or 26S ribosomal RNA sequences, which showed conflicting placement of B. custersianus and B. naardenensis [8]. Four additional phylogenies, based on either 18S or 26S rRNA sequences, or on the concatenated SSU, LSU, and elongation factor 1α sequences, have also been published [9,10]. These show a consistent placement for B. custersianus but somewhat inconsistent branching and poor branch support for B. naardenensis and B. nanus.
Brettanomyces spp. are most commonly associated with spoilage in beer, wine, and soft drinks due to the production of many off-flavour metabolites, including acetic acid and vinyl- and ethyl-phenols [5,11,12]. However, Brettanomyces can also represent an important and favorable component of traditional Belgian Lambic beers [13,14], and their use has increased in recent years in the craft brewing industry [15]. Furthermore, B. bruxellensis has shown potential in bioethanol production by outcompeting S. cerevisiae and for its ability to utilize novel substrates [16,17].
B. bruxellensis and, to a lesser extent, B. anomalus are the main species encountered during wine and beer fermentation, which has led to the majority of Brettanomyces research focusing only on these two species. The initial assembly of the triploid B. bruxellensis strain AWRI1499 [18] has enabled genomics to facilitate research on this organism [19][20][21][22][23]. Subsequent efforts have seen the B. bruxellensis genome resolved to chromosome-level scaffolds [24]. In contrast, the assemblies that are available for B. anomalus [25], B. custersianus, and B. naardenensis are less contiguous and are mostly un-annotated, while no genome assembly is currently available for B. nanus.
Brettanomyces genomes have been shown to vary considerably in terms of ploidy and karyotype, with haploid, diploid, and triploid strains of B. bruxellensis being observed [22,26]. In addition to ploidy variation, karyotypes can also vary widely, with chromosomal numbers in B. bruxellensis being estimated to range between 4 and 9 depending on the strain [27]. Currently available assemblies for Brettanomyces range from 10.2 Mb for B. custersianus to between 11.8 Mb and 15.4 Mb for B. bruxellensis (based on haploid genome size).
Recent advancements in third-generation long-read sequencing have enabled the rapid production of highly accurate and contiguous genome assemblies, particularly for microorganisms (reviewed in [28]). This study sought to fill knowledge gaps for various Brettanomyces species by sequencing and assembling genomes using current-generation long-read sequencing technologies [29], and then to use these new assemblies to explore the genomic adaptations that have taken place across the Brettanomyces genus.

Results
New genome assemblies for the Brettanomyces genus
Information regarding the species and strains used in this study is listed in Table 1. In the interest of obtaining high-quality and contiguous assemblies, haploid or homozygous strains were favored (the B. anomalus strain was the exception), with strains that featured in past studies prioritized. All strains had been isolated from commercial beverage products, with three from commercial fermentations.
Haploid assemblies were produced for all the Brettanomyces species (genome assembly summary statistics are shown in Table 2 and MinION sequencing statistics are available in Table S1). Genome sizes for B. bruxellensis and B. anomalus of 13.2 and 13.7 Mb, respectively, were well within the range of other publicly available Brettanomyces assemblies, which range from 11.8 Mb to 15.4 Mb [18,24,25,34,35]. The B. custersianus assembly size was 10.7 Mb, similar to assemblies of other B. custersianus strains (10.2 Mb to 10.4 Mb) [36]. The B. naardenensis assembly was 11.16 Mb, highly similar to the only other published assembly [37]. The B. nanus assembly was the smallest at only 10.2 Mb and represents the first whole-genome sequence for this species.
The overall contiguity of the assemblies varies due to differences in heterozygosity and sequencing read lengths. The B. anomalus strain is a heterozygous diploid organism and, while read coverage was high, the median read length was relatively low at 4.7 kb. This resulted in the lowest contiguity in the study, with the haploid assembly consisting of 48 contigs with an N50 of 640 kb. The B. nanus strain is a haploid organism and had a much higher median read length of 14.9 kb. As such, this assembly had the best contiguity, consisting of only 5 contigs with an N50 of 3.3 Mb.
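For readers unfamiliar with the N50 statistic used above, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the example contig lengths are hypothetical and are not taken from the assemblies or the Quast output reported here.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half of the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp); not data from this study.
example_lengths = [3_300_000, 2_900_000, 2_100_000, 1_200_000, 700_000]
example_n50 = n50(example_lengths)
```

Because the statistic depends only on the length distribution, an assembly with few long contigs will score a high N50 regardless of its correctness, which is why it is reported alongside BUSCO completeness.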
In order to assess the completeness of each assembly, BUSCO statistics were compiled for each genome (Table 2). Predicted genome completeness was high for the haploid assemblies, with between 3.8% (B. naardenensis) and 7.2% (B. anomalus) missing BUSCO genes (BGs). The assemblies were then processed with Purge Haplotigs [38] to remove duplicated and artifactual contigs. Duplication was low for not only the homozygous strains but also for the heterozygous B. anomalus assembly with between 0.5% (B. nanus) and 1.2% (B. anomalus) duplicate BGs.
Given the significant differences in the genome sizes within the Brettanomyces genus, it was of interest to determine if this size range was due to differences in overall gene number, gene compactness or both. The total number of predicted genes, gene densities (the percent of genome that is genic) and the number of orthogroups with multiple entries were calculated for each Brettanomyces genome, in addition to S. cerevisiae as a point of comparison (Table S2). B. nanus (smallest genome) had the fewest genes (5083), the highest gene density (78.1% genic) and the lowest number of expanded orthogroups (5.2%). Conversely, B. anomalus (largest genome) exhibited the highest number of genes (5735), the most ortholog duplicates (10.4%) and the largest proportion of intergenic sequences (62.2% genic).
Given the heterozygous nature of the B. anomalus genome, a diploid assembly was also generated for the strain AWRI953. The resultant diploid assembly was approximately twice the size of the haploid assembly and had a slightly improved N50 of 730 kb. While the genome size doubled, duplicated BGs increased only from 1.2% for the haploid assembly to 35.9% for the diploid assembly. In an ideal scenario, in which both alleles are faithfully separated, duplicated BGs would be closer to 100%. The low number of duplicated BGs was found to be mainly the result of a number of fragmented gene models being present in one of the two haplomes. It should be noted that while the diploid B. anomalus assembly is split into Haplome 1 (H1) and Haplome 2 (H2), these haplomes consist of mosaics of both parental haplotypes. This is an unavoidable artefact of assembly, where haplotype switching can randomly occur due to breaks in heterozygosity, and between chromosomes.

Taxonomy of Brettanomyces
This collection of high-quality Brettanomyces genomes allowed for a comprehensive phylogeny to be generated, which utilized the entire genome, as opposed to extrapolating relationships based upon ribosomal sequences. Codon-based alignments were produced for 3482 single-copy orthologues (SCOs) that were common across the five Brettanomyces species and Ogataea polymorpha (the closest available non-Brettanomyces genome), which was used as an outgroup. These concatenated alignments were used to calculate a maximum-likelihood tree (Fig. 1a) and to estimate average nucleotide identity (ANI) between pairs of genomes (Table 3). Individual gene trees were also generated for all SCO groups. These individual gene trees were then used to generate a coalescence-based phylogeny (Figure S1a) to check for consistency with, and to generate branch support values for, the concatenation-based phylogeny. As a point of comparison, this phylogenetic methodology was also performed on the members of the Saccharomyces genus (Fig. 1b, Figure S1b, and Table 4).
When compared to the distances between the members of the genus Saccharomyces, there is a much larger genetic distance separating the various Brettanomyces species. Indeed, there is a greater genetic distance between most of the Brettanomyces species than there is between any of the individual Saccharomyces species and the outgroup used for that phylogeny (Naumovozyma castellii). The largest separation was observed between

Extensive rearrangements are present throughout Brettanomyces genomes
In order to ascertain if larger-scale differences accompanied the extensive nucleotide diversity that was observed between the Brettanomyces species, whole-genome alignments were used to detect structural rearrangements between the genomes (Fig. 2). There were numerous small and several large translocations present between the B. bruxellensis and the B. anomalus assemblies (Fig. 2a) with a total of 71 syntenic blocks identified. The B. bruxellensis and B. custersianus assemblies showed less overall synteny, with the alignment broken into 93 syntenic blocks (although individual translocation units appear to be smaller; Figure S2). Comparing B. bruxellensis to the more distantly related species B. naardenensis (Fig. 2b) and B. nanus (Fig. 2c), these breaks in synteny are also common, with 91 and 117 syntenic blocks observed, respectively. The chromosomal rearrangements were also not limited to a single species or clade; when comparing B. nanus to B. naardenensis (Fig. 2d) there were 73 syntenic blocks identified, very similar to that occurring between B. bruxellensis and B. anomalus.
Given the heterozygous nature of the B. anomalus genome analyzed in this study, the genome was examined for the presence of large loss-of-heterozygosity (LOH) tracts. Three large contigs, comprising 2.14 Mb (15%) of the B. anomalus genome, were predicted to be homozygous (0.0353 SNPs/kb) while the rest of the genome is heterozygous (3.21 SNPs/kb) (Figure S3). The strains used in this study as reference for B. bruxellensis, B. custersianus, B. naardenensis, and B. nanus appeared homozygous as expected, with heterozygous SNP densities ranging from 0.01 SNPs/kb (B. naardenensis) to 0.05 SNPs/kb (B. bruxellensis).
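The SNP densities quoted above are simple per-contig ratios, and the following sketch illustrates one way such values could be tabulated to flag candidate LOH contigs. It is not the workflow used in this study: the function names, the 0.5 SNPs/kb cut-off, and the example contigs are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def snp_density_per_kb(snp_contigs, contig_lengths):
    """Heterozygous SNPs per kb for each contig.

    snp_contigs: iterable of contig names, one entry per heterozygous SNP.
    contig_lengths: dict mapping contig name to length in bp.
    """
    counts = Counter(snp_contigs)
    return {
        name: counts.get(name, 0) / (length / 1000)
        for name, length in contig_lengths.items()
    }


def flag_low_heterozygosity(densities, threshold=0.5):
    """Names of contigs whose SNP density falls below a hypothetical cut-off."""
    return sorted(name for name, d in densities.items() if d < threshold)


# Hypothetical example: two contigs with very different SNP densities.
lengths = {"ctg1": 100_000, "ctg2": 50_000}
snps = ["ctg1"] * 320 + ["ctg2"] * 2
densities = snp_density_per_kb(snps, lengths)
loh_candidates = flag_low_heterozygosity(densities)
```

A per-contig summary of this kind will miss LOH tracts that cover only part of a contig; a windowed density would be needed to detect those.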

Enrichment of fermentation-relevant genes
Given the apparent adaptation of Brettanomyces to the fermentative environment, each Brettanomyces genome was investigated for the presence of specific gene family expansions (Table 5). Both B. bruxellensis and B. nanus were predicted to have undergone copy number expansion of ORFs predicted to encode oligo-1,6-glucosidase enzymes (EC 3.2.1.10), which are commonly associated with starch and galactose metabolism (Fig. 3a). B. nanus was also predicted to possess an expanded set of genes encoding β-glucosidase (EC 3.2.1.21; Fig. 3b) and β-galactosidase (EC 3.2.1.23; Fig. 3c) activities, which are involved in the utilization of sugars from complex polysaccharides.
In addition to PIPOX, B. bruxellensis and B. anomalus share an expansion of S-formylglutathione hydrolase (EC 3.1.2.12), and B. anomalus contains an expansion of formate dehydrogenase (EC 1.17.1.9). These genes are part of methanol metabolism in other species (a capability
Potential HGT events that may have contributed to the evolution of Brettanomyces were investigated. Twelve Brettanomyces orthogroups were predicted to be the result of HGT from bacteria (Table 6). Of these bacterially derived gene families, a Glycoside Hydrolase family 32 gene (GH32), which was predicted to have β-fructofuranosidase activity (EC 3.2.1.26), is likely to have had a key phenotypic impact during the evolution of this genus. GH32 enzymes hydrolyse glycosidic bonds, and β-fructofuranosidase (invertase) is specifically responsible for the breakdown of sucrose into fructose and glucose monomers and is required for the utilization of sucrose as a carbon source.
To further confirm the bacterial origins of the Brettanomyces invertases, a protein-based phylogeny was created from the highest scoring eukaryote and prokaryote blast hits from the RefSeq non-redundant database, as well as from these three Brettanomyces invertases (Fig. 4a). The prokaryote and eukaryote invertases each form two distinct clades. Consistent with a bacterial-derived HGT event, the Brettanomyces invertase proteins reside within one of the two prokaryote clades and are evolutionarily distinct from the eukaryote groups. There are also three other eukaryote invertases that reside within a prokaryote clade, and two prokaryote invertases that reside within a eukaryote clade, which suggests that HGT of this important enzyme activity is not unique to Brettanomyces. To confirm the placement of the Brettanomyces invertases in the prokaryotic clade, three alternate topologies (within either of the eukaryote clades, as well as within the second prokaryote clade) were tested (Figure S4, Table S3). These constrained topologies were all significantly less likely than the unconstrained tree (Figure S4, Table S3).
The genomic context of the invertases present in B. bruxellensis and B. anomalus was also examined. These genes are predicted to reside within sub-telomeric regions (Fig. 4b). In Brettanomyces, there is significant structural variation and a general loss of synteny, which is typical of sub-telomeric regions in other species (Fig. 4b). For example, in B. nanus the NAG gene cluster resides within a different sub-telomere relative to B. bruxellensis and B. anomalus. The NAG genes are also present in B. naardenensis, but are not co-located, and appear to be missing entirely in B. custersianus. Likewise, homologues of the MPH3 and TIP1 genes, which are present across all the Brettanomyces species, are only found in this specific sub-telomeric region in B. bruxellensis and B. anomalus.

Discussion
New genome assemblies for the five Brettanomyces species are described, which generally exhibit significant improvements over previous assemblies produced for this genus. The most contiguous genome assembly described was that of B. nanus, which comprised only 5 contigs and had an N50 of 3.3 Mb. To the best of our knowledge, this makes the B. nanus assembly the most contiguous Brettanomyces assembly to date. When comparing the assemblies of the other species to the next most contiguous assembly available from other Brettanomyces sequencing studies, the B. anomalus assembly represents a 4.7-fold improvement over GCA_001754015.1 (261 contigs), B. custersianus a 9.4-fold improvement over GCA_001746385.1 (226 contigs) and B. naardenensis a 6.5-fold improvement over GCA_900660285.1 (104 contigs). While the predicted completeness for these new assemblies was generally high, there were also considerable differences in gene density and content. This was most prominent between B. nanus and B. anomalus, with the B. nanus genome containing fewer total genes, less intergenic sequence and lower duplication of specific orthogroups.
The high-quality genome sequences allowed for the calculation of a Brettanomyces whole-genome phylogeny. The topology of the whole-genome phylogeny generally agreed with those derived from rRNA sequences in the placement of B. bruxellensis, B. anomalus and B. custersianus [8][9][10]. However, these earlier studies were not able to consistently resolve the placement of B. nanus and B. naardenensis, with conflicting results between phylogenies based on 18S and 26S ribosomal RNA sequences. The whole-genome phylogeny indicates that the Brettanomyces genus comprises two clades, with B. nanus and B. naardenensis forming a clade separate from the other species. This whole-genome topology is consistent with previous 18S phylogenies. Comparison of ANI values identified that there is a larger genetic distance separating some Brettanomyces species than there is separating the Saccharomyces and Naumovozyma genera. While ANI values alone are generally insufficient for determining genus boundaries (at least in prokaryotes) [43], the extremely low ANIs that have been observed across the Brettanomyces genus merit further consideration of the taxonomy of this group and of whether it may be appropriate for the Brettanomyces genus to be refined.
B. nanus, and to a lesser extent B. bruxellensis, exhibited expansions of families of glucosidases and galactosidases that are responsible for the utilization of sugars from complex polysaccharides. These types of expansions are a hallmark of the domestication of beer and wine strains of S. cerevisiae and suggest that a similar process may be occurring in B. nanus [44][45][46]. The three known B. nanus strains were all isolated from beer samples obtained from Swedish breweries in 1952. The B. nanus strain AWRI2847 (CBS 1945) was found to have far less spoilage potential in beer than either B. bruxellensis or B. anomalus [47]. At the time this strain was isolated, microbial spoilage of beer was determined sensorially, and sharing yeast samples between both individual fermentations (re-pitching) and breweries was common practice [48]. Taken together, these practices may have allowed B. nanus to remain a long-term undetected contaminant, surviving successive serial re-pitchings and spreading to multiple breweries.
The ability of Brettanomyces to grow in nutrient-depleted conditions has largely been attributed to the utilization of alternative nitrogen sources such as free nitrates and amino acids [49][50][51]. The expansion of PIPOX in Brettanomyces may be partly responsible for this important survival trait. Proline, a substrate of PIPOX, is one of the more common amino acids in fermented wine and beer. Despite this abundance, proline is poorly utilized by S. cerevisiae; however, it is readily metabolized by B. bruxellensis [52][53][54][55]. PIPOX converts proline to 1-pyrroline-2-carboxylate, which can be further converted to D-ornithine by a general aminotransferase. Unlike proline oxidase (EC 1.5.1.2) and proline dehydrogenase (EC 1.5.5.2), which convert proline to 1-pyrroline-5-carboxylate, PIPOX represents an alternative avenue for proline utilization as a nitrogen source that is less impactful to redox homeostasis, which may allow its utilization during fermentation.
Horizontal gene transfer (HGT) has been reported as a mechanism of adaptive evolution in fungal species and to have contributed to the domestication of S. cerevisiae [56][57][58]. Similarly, an HGT event is predicted to have conferred the ability to utilize sucrose as a carbon source to B. bruxellensis and B. anomalus via the incorporation of a bacterially derived invertase. Previous phenotypic testing has shown B. bruxellensis and B. anomalus to be the only Brettanomyces species capable of utilizing sucrose [6], and this phenotype correlates with the presence of this HGT-derived invertase, which is only observed in the B. bruxellensis and B. anomalus genomes (there are no other invertase-encoding ORFs predicted in Brettanomyces). The genomic context illustrates further parallels to evolution in Saccharomyces. The invertases are shown to reside within sub-telomeres, which are genomic regions that have been shown to be hotspots for structural rearrangements and HGT events in Saccharomyces [59][60][61][62][63]. Sucrose utilization likely conferred a significant advantage in fruit fermentations, helping to shape the evolution of the common ancestor of B. bruxellensis and B. anomalus towards this fermentation specialization.

Conclusions
High quality genome assemblies for all five currently accepted Brettanomyces species are described, including the first assembly for B. nanus and the most contiguous assemblies available to date for B. anomalus, B. custersianus, and B. naardenensis. Comparative genome analysis established that the species are genetically distant and polyphyletic. Numerous indicators of domestication and adaptation in Brettanomyces were identified with some notable parallels to the evolution of Saccharomyces. Structural differences between the genomes of the Brettanomyces species and apparent loss of heterozygosity in B. anomalus were observed. Enrichments of fermentation-relevant genes were identified in B. anomalus, B. bruxellensis and B. nanus, as well as multiple horizontal gene transfer events in all Brettanomyces genomes, including a gene in the B. anomalus and B. bruxellensis genomes that is probably responsible for these species' ability to utilize sucrose.

Methods
Detailed workflows, custom scripts for computational analyses and genome annotations are available at https://github.com/mroach-awri/BrettanomycesGenComp (DOI: https://doi.org/10.5281/zenodo.3632185). All sequencing reads and genome assemblies have been deposited at the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under the BioProject: PRJNA554210. Raw FAST5-format files for all Oxford Nanopore sequencing are available from the European Bioinformatics Institute (EMBL-EBI) European Nucleotide Archive (ENA) under the study: ERP116386.

Library preparation and sequencing
Genomic DNA was extracted from liquid cultures using a QIAGEN Gentra Puregene Yeast/Bact Kit. B. bruxellensis was sequenced using PacBio RS-II SMRT sequencing. The sequencing library for B. nanus was multiplexed with other samples (not reported here) using the SQK-LSK109 and EXP-NBD103 kits following the Oxford Nanopore protocol NBE_9065_v109_revA 23MAY2018. For the remaining species, libraries were prepared using the SQK-LSK108 kit following the protocol GDE_9002_v108_revT_18OCT2016. Sequencing was performed on a MinION using FLO-MIN106 flow-cells. Demultiplexing and basecalling were performed using Albacore v2.3.1.
Illumina sequencing was performed on each strain using a combination of short-insert (TruSeq PCR-free) and mate-pair (2-5 kb insert and 6-10 kb insert) libraries. All libraries were barcoded and pooled in a single MiSeq sequencing run using 2 × 300 bp chemistry.
A diploid assembly for AWRI953 (B. anomalus) was also generated. Paired-end reads were mapped to the haploid assembly with BWA-MEM, and high-confidence SNPs were called using VarScan v2.3.9 [72]. Nanopore reads were mapped to the assembly using BWA-MEM.
Heterozygous SNPs were phased using the mapped Nanopore reads with HapCut2 commit: c2e6608 [73] and converted to VCF format with WhatsHap v0.16 [74]. New consensus sequences were called for each haplotype from the phased SNPs and the nanopore reads were binned according to which haplotype they mapped best. The two B. anomalus haplotypes were then independently reassembled from the haplotype-binned nanopore reads using the method described for the other species.
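The read-binning step described above can be pictured with the following sketch, which assigns each read to whichever haplotype consensus it aligns to with the higher score. This is an illustration only: the function name, the use of a single alignment score per read, and the decision to leave tied reads unassigned are assumptions, and the study's own scripts (see the repository cited in the Methods) may handle these cases differently.

```python
def bin_reads_by_haplotype(scores_h1, scores_h2):
    """Assign reads to the haplotype consensus with the higher alignment score.

    scores_h1, scores_h2: dicts mapping read ID to an alignment score against
    the haplotype 1 and haplotype 2 consensus sequences. A read missing from
    one dict is treated as unmapped to that haplotype.
    """
    bins = {"H1": [], "H2": [], "unassigned": []}
    for read_id in sorted(set(scores_h1) | set(scores_h2)):
        s1 = scores_h1.get(read_id)
        s2 = scores_h2.get(read_id)
        if s1 is not None and (s2 is None or s1 > s2):
            bins["H1"].append(read_id)
        elif s2 is not None and (s1 is None or s2 > s1):
            bins["H2"].append(read_id)
        else:
            # Equal scores: the read carries no haplotype-informative signal
            # (assumption: such reads are set aside rather than duplicated).
            bins["unassigned"].append(read_id)
    return bins


# Hypothetical alignment scores; not data from this study.
h1_scores = {"read1": 950, "read2": 400, "read3": 700}
h2_scores = {"read1": 900, "read2": 650, "read3": 700, "read4": 300}
read_bins = bin_reads_by_haplotype(h1_scores, h2_scores)
```

Reads that fall entirely within homozygous regions will tie under this scheme, so how they are distributed affects the coverage available to each haplotype assembly.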
All other Brettanomyces assemblies were aligned to the B. bruxellensis assembly using NUCmer (MUMmer) v4.0.0beta2 [75]. Dotplots were visualized and contigs with split alignments were manually inspected for indications of mis-assemblies using mapped alignments of Nanopore reads and Illumina mate-pair reads. Genome metrics were calculated with Quast [76] and completeness, duplication, and fragmentation were estimated using BUSCO v3.0.2 [77] with the odb9 Saccharomyceta dataset.

Phylogeny
OrthoFinder (Brettanomyces + O. polymorpha) was used to find SCOs over these genomes. Protein sequences were aligned with Muscle v3.8.31 [82] and then converted to codon-spaced alignments using PAL2NAL [83]. Average nucleotide identities were estimated using panito commit: f65ba29 (github.com/sanger-pathogens/panito). A rooted maximum likelihood phylogeny was generated with IQ-TREE [84] on the concatenated codon alignments. IQ-TREE was also used to generate gene trees for all SCOs, and then to generate a coalescence-based phylogeny from the SCO individual gene trees. Phylogenies were created using the same method for the Saccharomyces species + N. castellii (outgroup) to serve as a comparison.
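As an illustration of what a nucleotide identity estimate over concatenated alignments involves, the sketch below joins per-gene alignments into one sequence per species and then computes the percentage of identical positions between two aligned sequences. It does not reproduce panito's implementation; in particular, the choice to skip columns containing a gap or an ambiguous base is an assumption, and the function names and toy sequences are hypothetical.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) per taxon."""
    return {taxon: "".join(aln[taxon] for aln in alignments) for taxon in taxa}


def pairwise_identity(seq_a, seq_b, skip_chars="-N?"):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap or ambiguous base are ignored
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignments for two hypothetical genes and two hypothetical taxa.
gene1 = {"sp1": "ATGGCA", "sp2": "ATGACA"}
gene2 = {"sp1": "---TTT", "sp2": "GGGTTC"}
concatenated = concatenate_alignments([gene1, gene2], ["sp1", "sp2"])
identity = pairwise_identity(concatenated["sp1"], concatenated["sp2"])
```

Because only single-copy orthologues are aligned, an identity computed this way reflects coding sequence and will generally be higher than a whole-genome estimate that includes intergenic regions.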

Whole genome synteny visualization
Pairwise synteny blocks were generated between the reference B. bruxellensis assembly and the other haploid assemblies, as well as between the B. naardenensis and B. nanus assemblies. Contigs were placed in chromosome order using Purge Haplotigs [38] to generate placement files that were then used to rearrange contigs. Alignments between the assemblies were calculated using NUCmer with sensitive parameters (-b 500 -c 40 -d 0.5 -g 200 -l 12). Genome windows (20 kb windows, 10 kb steps) were generated for the assemblies and a custom script was used to pair syntenic genome windows based on the NUCmer alignments. Concordant overlapping and adjacent windows were merged, and overlapping discordant windows were trimmed. The synteny blocks were then visualized using Circos v0.69.6 [85].
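The windowing and merging steps can be sketched as follows. This is a simplified illustration rather than the custom script referred to above: it generates 20 kb windows at 10 kb steps along a single contig and merges overlapping or adjacent intervals on one axis only, whereas real synteny blocks must also be concordant on the partner genome. The function names and the treatment of the final, shorter window are assumptions.

```python
def sliding_windows(contig_length, window=20_000, step=10_000):
    """Return half-open (start, end) windows tiling a contig of given length."""
    windows = []
    start = 0
    while start < contig_length:
        end = min(start + window, contig_length)
        windows.append((start, end))
        if end == contig_length:
            # Assumption: the last window is truncated at the contig end.
            break
        start += step
    return windows


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical 45 kb contig.
example_windows = sliding_windows(45_000)
example_merged = merge_intervals(example_windows)
```

With a step of half the window size, every position away from the contig ends is covered by two windows, which is what makes merging of concordant neighbours possible.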

Gene enrichment
OrthoFinder (Saccharomycetaceae) annotations were used to identify gene-count differences between the Brettanomyces species. The ratio of the gene-count to the average gene-count was calculated for the Brettanomyces species over all OrthoFinder orthogroups. All orthogroups with a ratio ≥ 2 for any Brettanomyces species were subject to GO-enrichment analysis in BiNGO v3.0.3 [86] using the hypergeometric test with Bonferroni Family-Wise Error Rate (FWER) correction. Genes for overrepresented categories (p-value ≤ 0.05) were returned. Multiple sequence alignments were generated for GO-enriched orthogroups using Muscle, and maximum likelihood phylogeny trees were generated using PhyML within SeaView v4.7 [87] with default parameters (LG model, BioNJ starting tree, tree searching using NNI substitutions).
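The two calculations in this section, the gene-count ratio filter and the hypergeometric test with Bonferroni correction, can be sketched in a few lines. This is an illustration under stated assumptions and not the BiNGO implementation: the "average gene-count" is taken here to be the mean across the species in the table, and all function names, species labels and counts are hypothetical.

```python
from math import comb


def expanded_orthogroups(counts, min_ratio=2.0):
    """Find orthogroups where a species has >= min_ratio times the mean count.

    counts: dict of orthogroup -> dict of species -> gene count.
    Returns a dict of orthogroup -> list of species meeting the ratio.
    """
    hits = {}
    for orthogroup, per_species in counts.items():
        mean_count = sum(per_species.values()) / len(per_species)
        if mean_count == 0:
            continue
        for species, n in per_species.items():
            if n / mean_count >= min_ratio:
                hits.setdefault(orthogroup, []).append(species)
    return hits


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the GO term."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical gene counts for two orthogroups across five species.
example_counts = {
    "OG1": {"bru": 6, "ano": 1, "cus": 1, "naa": 1, "nan": 1},
    "OG2": {"bru": 1, "ano": 1, "cus": 1, "naa": 1, "nan": 1},
}
example_hits = expanded_orthogroups(example_counts)
example_p = hypergeom_upper_tail(k=2, n=4, K=3, N=10)
```

Bonferroni correction is conservative when many GO categories are tested, so categories reported as overrepresented under it are unlikely to be false positives, at the cost of missing weaker enrichments.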

Horizontal gene transfer
HGT events were predicted for the Brettanomyces species. Protein sequences for the assemblies were used in BLAST-P searches against the RefSeqKB non-redundant Fungi and Bacteria datasets [88], and the Alien Index (AI) was calculated as described in [89]. All Brettanomyces proteins with an AI score greater than 20 were investigated further. The multiple sequence alignments and trees were retrieved for the HGT candidates' orthogroups and several candidates were removed following manual inspection. A phylogeny was generated for one HGT prediction of interest. The Brettanomyces genes and the top blast hits from the RefSeq non-redundant database eukaryote and prokaryote datasets were aligned with Muscle, and the phylogeny was generated with IQ-TREE. Constrained trees were generated to test the Brettanomyces genes within alternate clades and these were assessed using IQ-TREE's tree topology tests.
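For orientation, the sketch below shows a commonly used formulation of the Alien Index: the difference between the natural logarithms of the best BLAST E-values against the recipient-like group (here Fungi) and the donor-like group (here Bacteria), with a small constant added to avoid taking the logarithm of zero. Whether the cited method [89] uses exactly this constant and sign convention is an assumption, and the function name and E-values are hypothetical.

```python
import math


def alien_index(best_fungal_evalue, best_bacterial_evalue, pseudocount=1e-200):
    """Alien Index for a protein, given its best E-values in two databases.

    Positive values mean the best bacterial hit is stronger than the best
    fungal hit. If a database returns no hit, an E-value of 1 can be supplied
    (an assumption for this sketch).
    """
    return math.log(best_fungal_evalue + pseudocount) - math.log(
        best_bacterial_evalue + pseudocount
    )


# Hypothetical E-values: a weak fungal hit and a strong bacterial hit.
example_ai = alien_index(1e-10, 1e-60)
# The cut-off of 20 is the threshold stated in the text above.
is_candidate = example_ai > 20
```

A high index only flags a candidate; as the text notes, alignments and trees still need manual inspection, because contamination or gene loss in related fungi can produce the same signal.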