Whole genome sequencing and comparison of this large collection of isolates has shown that all mycolactone-producing mycobacteria (collectively referred to in this study as M. ulcerans) evolved from a common M. marinum progenitor by a combination of horizontal gene transfer (pMUM and phage), IS-mediated deletion and point mutation (Figure 4). These evolutionary processes resulted in extensive gene loss (185 CDS), changes in gene expression (e.g. hsp18 and possibly sigM) and likely changes in gene function through positive selection, establishing M. ulcerans as a highly specialized, niche-adapted mycobacterium. Our data indicate these changes were followed by global dispersal of M. ulcerans and further diversification and adaptive evolution.
Scrutiny of the types of genes gained, lost or modified, in conjunction with other experimental evidence, provides some clues regarding the nature of the niche environment in which the M. ulcerans MRCA was able to flourish and in which today’s isolates still survive. Foremost among the DNA gained was the pMUM plasmid. The acquisition of this plasmid occurred early in the evolution of M. ulcerans, as demonstrated by the congruent tree topologies inferred from chromosome and plasmid sequence alignments (Figure 4). The synthesis of mycolactone, a potent immunosuppressive small molecule whose biosynthetic genes are carried on the pMUM plasmid, presumably gave a population of M. marinum cells the ability to persist in a place their generalist relatives could not. In previous research we have described how recombination and gene conversion have shaped the unusually repetitive gene structure of the mycolactone PKS, such that the 110 kb, three-gene PKS locus on pMUM comprises only 10 kb of unique sequence. This unusual gene structure and its resultant instability are strongly suggestive of intense selection acting on M. ulcerans populations to maintain mycolactone production [5, 56, 57].
The loss of the mevalonate pathway for synthesis of isoprenoid lipids was originally observed in the genome of M. ulcerans Agy99. In the current study we show that this trait was established in the M. ulcerans MRCA and was probably a key adaptive response of the bacterium following the acquisition of pMUM (Table 2). It is possible that loss of this metabolic capacity freed essential resources for critical mycolactone synthesis or, alternatively, that these pathways and metabolites were redundant in the niche environment occupied by M. ulcerans. Other potentially significant gene losses in M. ulcerans include selenocysteine synthase (selA), required for the synthesis of proteins containing selenocysteine, and the linked genes encoding the alpha and beta subunits of a putative selenocysteine-containing formate dehydrogenase, with a possible role in anaerobic growth. Together, these mutations suggest that the M. ulcerans MRCA lost the ability to grow anaerobically.
It is also striking that genes associated with the intracellular lifestyle of mycobacterial pathogens such as M. marinum and M. tuberculosis have been lost in M. ulcerans. Genes predicted to be inactive include those encoding four phospholipase enzymes (PlcB_2,3,5&6) and cueO (Table 2). As well as playing roles in intracellular replication in other bacteria, the products of these CDS are also components of the bacterial cell wall. There appears to have been significant selective pressure on M. ulcerans to reduce or change its cell wall and cell surface antigenic profile. In this respect another noteworthy pseudogene conserved in M. ulcerans is lgt (MUL_1594). Lgt is a prolipoprotein diacylglyceryl transferase that acylates prolipoproteins at a conserved N-terminal cysteine. An lgt mutation in Staphylococcus aureus causes growth rate attenuation, an accumulation of prolipoproteins in the culture supernatant, and reduced activation of innate immune responses [46, 58]. The loss of lgt in M. ulcerans might therefore lead to aberrantly acylated or non-acylated lipoproteins with reduced immunogenicity, as seen in the S. aureus mutant.
The modification of the cell wall appears to have continued in the BU-associated M. ulcerans lineage 3 genomes, which contain an additional 589 pseudogenes or deleted regions, of which 30% are predicted to have encoded antigens or cell wall associated proteins, including EsxA_2, EsxA_3, and Hspx_1. The deep branching and clonal nature of the African and Australian lineage 3 isolates, which are most commonly involved in human infections, bear the signature of passage through a second evolutionary bottleneck: gene deletions, further loss of gene function, chromosomal rearrangements and the expansion of another IS (IS2606 from the pMUM plasmid). Each of the M. ulcerans lineages probably represents a different ecotype, reflecting adaptation to related but distinct niche environments. It may be that each lineage is best described as an M. ulcerans ecovar.
We tried to identify a temporal signal in our sequence data to estimate divergence dates for particular lineages of the MuMC. However, while our phylogenetic inferences were highly robust (100% bootstrap values for major branches of our tree, Figure 4), no linear correlation between branch length and year of isolation was observed. We suspect there is variation in the effective number of generations per year across the complex, perhaps related to different niches, reservoirs or modes of living. Such variation has recently been observed in M. tuberculosis. The lack of a temporal signal in these data raises doubts about previous estimates of divergence time that have assumed a constant molecular clock rate among the lineages of the complex [26, 34].
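A test for temporal signal of the kind described above is commonly framed as a regression of root-to-tip distance on year of isolation. The following is a minimal sketch of that idea only, not the analysis performed in this study: the function name root_to_tip_regression, the use of root-to-tip distance as the measure of branch length, and all example values are assumptions introduced for illustration.

```python
def root_to_tip_regression(years, distances):
    """Ordinary least-squares fit of root-to-tip distance on isolation year.

    years     -- isolation year for each isolate (hypothetical input)
    distances -- root-to-tip distance for each isolate, in the same order
    Returns (slope, intercept, r_squared). Under a strict clock, the slope
    approximates the substitution rate per site per year and r_squared is
    close to 1; r_squared near 0 indicates no linear temporal signal.
    """
    n = len(years)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("years and distances must each vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Entirely hypothetical values, chosen only to exercise the function.
    example_years = [1970, 1980, 1990, 2000]
    example_distances = [0.002, 0.001, 0.001, 0.002]
    print(root_to_tip_regression(example_years, example_distances))
```

In such a framing, a low coefficient of determination or a non-positive slope would correspond to the absence of a linear relationship reported here. Dedicated tools additionally choose the root position and account for the shared ancestry of the tips, which this sketch ignores.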
However, sequence comparisons did reveal a compelling correlation between genes undergoing positive selection, as revealed by dN/dS analysis, and those CDS inactivated or deleted (Additional file 6: Table S5, Additional file 7: Table S6), with mutations in all these groups skewed towards CDS involved in cell wall and lipid biosynthesis. These patterns point to significant selective pressures acting on M. ulcerans populations to devote resources (substrate and energy) towards the synthesis of mycolactones and modification of cell wall structures. Intriguingly, many of the cell wall metabolites lost via mutation are known to be highly antigenic in other mycobacteria. One interpretation of these observations is that the bacteria are responding to pressures from a host immune system, a point argued in a previous study. Furthermore, when one considers that mycolactone is a potent immune suppressor with reported specificity for a mammalian microRNA that controls T-cell chemotaxis, this in turn leads to the idea that the niche occupied by M. ulcerans is a higher organism with a complex immune system. The discovery that Australian possums inhabiting BU endemic areas are susceptible to BU disease and harbour large numbers of M. ulcerans in their gastrointestinal tracts is consistent with this idea. Nevertheless, these arguments are not consistent with the significant lack of variation seen among M. ulcerans proteins with putative T-cell epitopes (Additional file 8: Table S7), where an immune escape hypothesis would predict hypervariability, not hyperconservation, in these regions.
This study has also reinforced the close relationship between isolate origin and genotype for M. ulcerans strains that cause Buruli ulcer, notably those from Africa and Australia, where multiple isolates from one region were sequenced. The complete resolution of strain differences afforded by whole genome sequencing has shown how the genotypes of M. ulcerans strains from two African countries correlate with place of origin. Isolates from the east of Benin are distinct from isolates from the west of the country or from a different country (Ghana, Figure 5). These findings suggest, as also demonstrated in a previous study in Ghana, that M. ulcerans transmission and microevolution generally occur at a local level and therefore that the source of the bacterium is somewhat fixed within a local region. This observation should guide our thinking regarding the source of the bacteria in BU endemic areas, indicating that animal reservoirs of M. ulcerans are unlikely to be highly mobile. Yet one should also consider that the relative paucity of genomic differences between isolates from Ghana and Benin reflects the relatively recent spread of the bacteria across this entire region. Efforts to establish the rate of mutation of these isolates, or genome analysis of a more temporally and spatially diverse collection of isolates from this region, might help estimate the amount of time M. ulcerans has been extant in West Africa.