Our goal was to compare the genomes of two very closely related Mycoplasma subspecies and accurately determine their degree of relatedness. This can be done at various levels by comparing the genome organization, the gene repertoire and the polymorphisms within the genes.
Genome plasticity in mycoplasmas of the mycoides cluster is greatly influenced by the "mobilome" and more specifically by ICE and IS. Such elements were evidenced in the 95010 genome and have been shown to drive overall genome plasticity.
Integrative conjugative elements have a modular structure and contain blocks of genes dedicated to integration into and excision from the chromosome, as well as conjugal transfer. Until recently, ICEs had been reported in a limited range of hosts belonging to the four major divisions of bacteria. However, whole genome sequencing projects suggest that ICEs are widespread in bacteria and could be one of the main types of shuttle for horizontal gene transfer. ICEs have now been identified in various Mycoplasma species, including M. fermentans, M. agalactiae and Mcc [16, 26, 27]. The ICE copy number in mycoplasmas seems to be small, for example only four copies of two ICE types in M. fermentans. ICE copy numbers are much higher in other bacterial species: in Orientia tsutsugamushi, duplicated elements, including ICEs, account for more than 37% of the genome [28, 29]. Our analysis of the Mmc genome shows that these elements have a direct impact on genome rearrangements, although the exact mechanisms leading to excision, integration and/or conjugation to another cell remain to be elucidated. In the genus Bacillus, transfer of ICE copies seems to be favoured by high densities of cells not carrying these elements and integration into a cell apparently leads to blocking the entry of additional copies [30, 31]. The possibility that these elements are involved in transferring virulence factors to and between mycoplasmas needs to be investigated.
Comparison of plasmids from the mycoides cluster suggested that various recombination events may have occurred during the spread of these plasmids among strains. More surprisingly, alignments of plasmid and ICE sequences in Mmc 95010 indicated that these mobile elements may have exchanged sequences. This new finding suggests that these two types of mobile elements could interact within mycoplasma cells and maybe even cooperate in transmission from cell to cell.
Insertion sequences are another driving force for genome plasticity in the Mycoplasma mycoides cluster. Comparisons of genomes revealed substantial diversity of IS type and copy number even between closely related strains (95010 and GM12), and showed that duplication of IS copies may lead to large DNA fragment inversions. This contrasts with the findings for two M. agalactiae genomes in which the presence of 15 IS copies and three ICE copies was not associated with any large-scale genetic rearrangement. In the case of the mycoides cluster, the major contribution of IS to genome plasticity is well illustrated by the comparison between Mmc and MmmSC genomes. The MmmSC PG1 genome has large numbers of IS copies, IS1634 having the highest copy number (N = 60). These IS elements have led not only to large DNA fragment inversions but also to large DNA fragment duplications and deletions. This is not unprecedented in the bacterial world and IS expansions may result from an evolutionary bottleneck due to bacterial population isolation. In the case of MmmSC PG1, this bottleneck may have been associated with the strict adaptation of this subspecies to the bovine lung. Indeed, IS1634 shares 97% identity with ISMbov3 which is found in M. bovis, a very common pathogen isolated from cow lungs. This close relatedness certainly indicates recent HGT and the absence of IS1634 from Mmc suggests that this IS was acquired by MmmSC from M. bovis. Such exchanges of IS between the mycoides cluster and the M. bovis/M. agalactiae cluster have already been proposed.
A striking feature of the MmmSC PG1 genome, as compared to those of Mmc 95010 and GM12, is the large number of pseudogenes in the vicinity of IS elements. Altogether, more than 98 of the originally described putative MmmSC genes are certainly pseudogenes as a result of frameshift mutations or IS insertions. This represents more than 9% of the total number of MmmSC genes that were annotated in 2004. High percentages of pseudogenes are often associated with a recent adaptation to a host and to virulence, as suggested for Yersinia pestis. Adaptation to a new host allows a massive clonal population growth in which all mutations affecting genes that are not essential for bacterial survival in the new environment are maintained. Such clonal expansion also explains the limited polymorphism of the housekeeping genes. Reductive evolution of this type has been described for various pathogens in addition to Yersinia pestis, including Orientia tsutsugamushi, the agent of scrub typhus, Rickettsia prowazekii, the agent of epidemic typhus, and Aliivibrio salmonicida, the agent of cold-water vibriosis [29, 35, 36]. In the case of MmmSC, adaptation to a new host may also have favoured the acquisition and multiplication of new IS types, such as IS1634. Similarly, pseudogenisation was also observed in M. bovis, where an adhesin of M. agalactiae was inactivated upon infection of a different host. Longer evolution times would possibly allow a streamlining of the genome with a reduction of the number of pseudogenes by a deletion process. Mmc is a ubiquitous pathogen that is present in numerous species (sheep, goats) all around the world; it is an opportunistic pathogen that can infect diverse organs and can even be found in the ear canal of healthy goats (or in parasites found in the ear canal). By contrast, MmmSC is strictly pathogenic and limited to a single host, cows, and to a single organ, the lung.
This is consistent with an Mmc ancestor, adapted to various ecological niches in small ruminants, adapting to a bovine host where it colonizes only the lungs, and evolving into what is now known as MmmSC. Genomic studies, and particularly the observation that intraspecies polymorphism in housekeeping genes is much more limited in MmmSC than in Mmc, support the hypothesis that MmmSC emerged only recently.
The availability of whole genome sequences may help unravel the genetic events underlying phenotypic diversity among closely related strains. As an example, the utilization of maltose in the M. mycoides cluster species has been studied by AbuGroun: no maltose utilization was found in MmmSC. Maltose is utilized by Mcc and more rapidly by Mmc. However, some Mmc strains failed to metabolize maltose at all. The presence of an alpha-glucosidase was also detected by a rapid colorimetric test using pNPG flooded on mycoplasma colonies. None of the MmmSC strains tested possessed any glucosidase activity although most Mcc and Mmc strains did. However, MmmSC strains express beta-glucosidases with variations which may be related to cytotoxicity. Our findings are in accordance with these observations. What remains to be verified is the integrity of the maltose gene cluster in the Mmc strains that fail to utilize this substrate, and the ability of Mmc strains to utilize starch. Mcc California kid is expected to be unable to use starch, although other Mcc strains should be able to metabolize this carbon source.
Surface proteins, and more specifically lipoproteins that play key roles in interactions with the environment, are determinants of the lifestyle of mycoplasmas. They contribute to the uptake of nutrients and can mediate essential functions during the infection cycle. Some play a role in cytadhesion, and others bind IgAs to allow the cells to escape cellular recognition. Surface proteins can also display mechanisms of phase variation as a means to escape the host immune responses. At the same time they are excellent immunogens, their lipid moiety acting with adjuvant-like proinflammatory activity and their protein part evoking an immune response. However, the type of immune response they trigger may vary according to the Lpp involved. In the case of MmmSC PG1, LppA seems to trigger a cellular response, involving CD4 cells producing interferon gamma, whereas LppQ, LppB and LppC do not. The presence of 86 genes coding for Lpp in the Mmc 95010 genome, as compared to 56 in MmmSC PG1, is in agreement with findings in other mycoplasma species. The number of Lpp genes in two strains of M. agalactiae is 100 and 67, the latter number being that of reference strain PG2. In M. agalactiae, poly-G tracts are suspected to be involved in genomic rearrangements and possibly in the control of expression of genes in the region encoding the so-called spma lipoprotein family [15, 16]. A locus encoding homologous predicted lipoproteins was found in Mmc 95010 with intergenic regions containing GC(T)7-20 motifs. This suggests that there may have been exchange of genes belonging to this family between these ruminant pathogens followed by divergent evolution of intergenic motifs and subsequent expansion of the gene families. In accordance with this hypothesis, only a single member of this family was found in the Mcc genome whereas variable expansions were observed in the two strains of M. agalactiae. In M. mycoides, intergenic nucleotide tracts are found at other loci.
Poly-TA tracts, with more than 10 repeats, were found at six locations in both the 95010 and the GM12 genomes. The size of these tracts differed between strains. However, these size variations should be interpreted with great care as most sequencing projects use cloned bacterial stocks: such variants may differ from the main population. In 95010, the percentage of Lpp that were detected by the proteomic analysis was 30%. This is slightly lower than reported by other studies in which amphiphilic proteins were first concentrated by Triton X-114 extraction. As a consequence, the differences may simply be due to differences in the detection sensitivity of the techniques used. In addition, Lpp expression may be driven by environmental conditions and our results apply only to mycoplasmas grown in rich media, in vitro. Co-incubation and adhesion to cells may well trigger the expression of a different set of Lpp as has been demonstrated for M. pneumoniae in contact with lung epithelial cells. This type of modulation of expression may play an important role in virulence.
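As an illustration of how poly-TA tracts such as those mentioned above could be located in a genome sequence, the following minimal Python sketch scans a DNA string with a regular expression. It is not the procedure used in this study: the function name `find_poly_ta_tracts`, the toy sequence, and the reading of "more than 10 repeats" as at least 11 consecutive TA dinucleotides are all assumptions made for the example.

```python
import re

# Assumption: "more than 10 repeats" means at least 11 consecutive TA units.
POLY_TA = re.compile(r"(?:TA){11,}")


def find_poly_ta_tracts(sequence):
    """Return (start, end, repeat_count) for each poly-TA tract.

    Coordinates are 0-based and the end is exclusive.
    """
    seq = sequence.upper()
    return [
        (m.start(), m.end(), (m.end() - m.start()) // 2)
        for m in POLY_TA.finditer(seq)
    ]


# Hypothetical toy sequence: one tract of 12 repeats and one of only 5.
toy = "GGC" + "TA" * 12 + "CCG" + "TA" * 5 + "GG"
for start, end, repeats in find_poly_ta_tracts(toy):
    print(f"poly-TA tract at {start}-{end}: {repeats} repeats")
```

Because a (TA)n run reads identically on the reverse complement strand, scanning a single strand is sufficient here. Comparing the reported repeat counts at homologous positions in two genomes would expose the size differences between strains, subject to the caveat about cloned stocks noted above.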
In fact, the evolution of the MmmSC genome may be shaped by unconstrained population growth in infected animals, followed by extreme transmission bottlenecks from host to host. Furthermore, current MmmSC strain populations may also be shaped by CBPP control strategies based on slaughter and vaccination. Existing MmmSC strains may well have adapted to this artificial selection that has been implemented for more than 100 years. The MmmSC genome is larger than that of Mmc, mostly due to gene duplications and the insertion of multiple copies of insertion sequences. IS elements seem to play a prominent role in this gene rearrangement process, as demonstrated during growth in vitro under conditions of stress induced by high temperature (41.5°C). Fever in CBPP-infected animals may induce a similar stress and favour gene rearrangements in MmmSC. However, the MmmSC genome is also characterized by a high degree of gene decay with more than 9% of the originally described genes likely to be pseudogenes. Many of these genes are not associated with any known function ("ORFans"), consistent with the notion that the number of genes in whole genomes is often overestimated. This also fits well with a non-adaptive genomic complexity theory allowing duplications or pseudogenes to be maintained in the absence of an adaptive selection that would lead to purifying selection and genome streamlining [49, 50].
Genome structure in both subspecies seems to have been affected by mobile genetic elements, although these elements differ in kind and in number. Integrative conjugative elements have been identified in Mmc, where they were shown to induce chromosomal rearrangements, but not in MmmSC. They may also have played a role in gene acquisition although this has not yet been demonstrated. Insertion sequences were identified in both subspecies but, here again, there are differences: Mmc and MmmSC have only two IS types in common, and MmmSC possesses only three IS types present in large copy numbers (95 copies) whereas Mmc possesses five IS types but only in lower copy numbers (24 copies). Again, the larger copy number in MmmSC may be associated with an evolutionary bottleneck, during which IS elements may provide transitory selective advantages to their host, such as HGT and genomic rearrangements.
Homologous recombination has been demonstrated experimentally in Mcc and Mmc strains. This does not seem to be the case in MmmSC where multiple attempts to obtain homologous recombinations in vitro have failed. These failures could be linked to the functional absence of two genes, recG and recR, which are disrupted by frameshift mutations.