C. pecorum is recognized as one of the most widely distributed chlamydial species, with a wide host range that includes livestock (sheep, goats, cattle and swine) and wildlife [15]. Despite this, very few C. pecorum genome sequences have been deposited in public repositories, which limits the investigation of genetic diversity in this chlamydial species.
In this study, we sequenced and examined the genomes of two C. pecorum isolates: PV7855 from an Alpine chamois, which is to date the only fully sequenced C. pecorum isolate from a wild ruminant, and PV6959 from a water buffalo [11, 12]. The genomes of our isolates were compared to those of E58, PV3056/3, P787 and W73, all recovered from farmed ruminants [13, 16]; these were selected because no C. pecorum sequences from wild ruminants or from water buffaloes are available in sequence repositories. Furthermore, PV6959 was isolated from a water buffalo reared on a farm, where the diagnosis of chlamydial meningoencephalomyelitis was confirmed by the observed clinical signs, seropositivity to Chlamydia, histological findings, detection of C. pecorum in brain tissue by immunofluorescence assay and, eventually, isolation of the organism in cell culture [12]. The two isolates were also compared with the two genomes obtained from Australian koalas (DBDeUG and MC/MarsBar) with regard to the two main variable regions, the PZ and the pmps.
The genome length and GC content of the PV7855 and PV6959 isolates are similar to those of the other C. pecorum isolates deposited in genome repositories. The C. pecorum genome, like those of other chlamydial species, has a conserved gene order and content [13]; one possible explanation is that Chlamydia has limited its exposure to frequent lateral gene transfer events, which can cause phenotypic modifications [17].
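For readers less familiar with the metric, GC content is simply the fraction of G and C bases among the unambiguous bases of a sequence. The following is a minimal sketch of that calculation, not the pipeline used in this study; the function name `gc_content` and the toy sequence are illustrative assumptions.

```python
def gc_content(sequence: str) -> float:
    """Return the GC fraction (0 to 1) of a nucleotide sequence.

    Assumption for this sketch: ambiguous symbols such as N are ignored,
    so the denominator counts only A, C, G and T.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / acgt


# Toy sequence (hypothetical, not taken from any C. pecorum genome).
toy = "ATGCGCNNTA"
print(f"GC content: {gc_content(toy):.1%}")
```

Applied to a whole assembled genome, the same ratio gives the value that is compared between isolates.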
The phylogenetic analysis based on orthologous genes shows a tight cluster of C. pecorum strains. Interestingly, PV6959 appears well separated from the other C. pecorum strains considered in this analysis, while PV7855 is very close to E58. This confirms the data from sequence analysis and the number of SNPs, which demonstrate that PV7855 and E58 are very similar in genetic structure. PV7855 and PV6959 are both isolates from ruminants and, as already demonstrated [4], are well separated from C. pecorum isolates from Australian koalas, possibly because of their different geographic origin and host preference. NeighborNet analysis further confirmed that PV7855 and E58 are evolutionarily close: the genome analysis revealed high similarity (97 to 100%) in the amino acid sequences of the pmps, the PZ, the trp system and the bioBFDA system, and the two genomes differ by a very low number of SNPs. To better clarify the phylogenetic distance between isolates from koalas and wild ruminants, more NGS analyses of different isolates are needed.
First, we focused our analysis on the pmp loci, which are known to be among the most polymorphic loci in the entire chlamydial genome. The members of the Pmp family, which are part of the Type V secretion system, are unique to chlamydial species [18] and possess a conserved domain structure that includes a C-terminal autotransporter beta-barrel domain, a central m-domain unique to this family of proteins and an N-terminal passenger domain that is involved in adhesion [19].
Pmps are involved in adhesion to the host cell as surface-exposed autotransporter proteins, but they are also involved in the immunopathogenesis of Chlamydia infections as potent antigenic proteins [20]. The number of pmps varies according to the chlamydial species. For example, C. pecorum contains 15 pmps, which show similarity in domain structure and are divided into six subtypes: A, B, D, E, H and G. Subtype G has the highest number of pmps (nine) and includes the most variable genes among all the pmp subtypes.
The number of pmps identified in PV7855 was 15, identical to the number identified in the C. pecorum isolates deposited in public databases. In contrast, 16 pmps were identified in PV6959, one of which, Cpecorum_PV6959_00982, corresponding to pmpG5, is not present in any other whole genome currently available. Even though PV6959 and E58 are characterized by the same disease pathotype and are evolutionarily very close (see NeighborNet analysis, Fig. 3B), this Pmp is not present in E58. We can hypothesize that this protein is not involved in this specific pathogenesis, but more analyses of isolates from ruminants affected by the same disease are needed to better understand a possible correlation. In a BLAST search, the pmpG5 detected in PV6959 showed a low degree of nucleotide identity (76%) with P787, E58, W73, PV3056/3 and DBDeUG, and Sanger sequencing of the region demonstrated that the gene is present and is not an assembly or sequencing error.
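To make the notion of nucleotide identity concrete, the sketch below computes percent identity between two sequences that have already been aligned. It is an illustration only: the function name `percent_identity`, the choice to skip gap columns and the toy sequences are assumptions, and BLAST computes identity over its own local alignments, so its reported values need not match this simplified calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption for this sketch: columns where either sequence has a gap
    ('-') are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not pmpG5).
print(percent_identity("ACGT-A", "ACGAAA"))
```

Counting the mismatching columns instead of the matching ones gives the number of SNPs between the two aligned sequences under the same assumptions.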
PmpG6 is the most polymorphic Pmp in PV7855 and PV6959 with respect to the Pmps of the other C. pecorum isolates. These data support other published studies showing that the Pmps belonging to the G family are those most subject to sequence diversity [8] and that the pmps are the most rapidly evolving genes in C. pecorum [4]. To date, few studies have investigated the function and expression of pmp genes in C. pecorum [21].
The second region analyzed in our study is the PZ, on which most studies focus to evaluate the presence or absence of a range of established chlamydial virulence factors [22]. In the PZ of C. pecorum, the number of pld genes can vary from four to five; this variation is not unique to C. pecorum [13], as C. trachomatis and C. muridarum also have a variable number of pld genes [23]. The PV7855 and PV6959 isolates each possess five copies of the phospholipase genes, of which one and two, respectively, are annotated as cardiolipin synthase. The gene annotated as cardiolipin synthase in PV7855 corresponds to a pld gene in the other strains. Cardiolipin synthase is present in three copies in Escherichia coli [24] and has been identified in C. trachomatis, where it appears to be expressed at 16 h after infection [25]. The role of the mac and pld genes is still unclear, but their respective functions appear to be linked [26], and they could be involved in evasion of the host immune system and in facilitating entry into and exit from the host cell. The mac genes, which encode membrane attack complex/perforin proteins, are not present in all Chlamydia species; they are found in C. pecorum, C. felis, C. ibidis, C. muridarum, C. pneumoniae, C. psittaci, C. suis and C. trachomatis [27]. In the two C. pecorum isolates sequenced in this study, the differences detected in the Pld amino acid sequences compared with strains already present in repositories may suggest a role for these proteins in differences in pathogenicity and virulence, but experimental studies are necessary to evaluate this hypothesis. Interestingly, PV6959 is the only strain among those considered in this study to have a hypothetical protein between the two tox genes instead of the two copies of pld genes. In contrast, the mac/perforin domain shows a very high degree of amino acid identity among the strains analyzed, between 95 and 99%.
The toxB cytotoxicity gene is present in a different number of copies depending on the chlamydial species; for example, there is only one copy in C. psittaci, C. felis and C. caviae, and three copies in C. muridarum. In C. pecorum, the toxB cytotoxicity gene is present in two copies, but its function is not yet fully known. It may contribute to the ability of the organism to switch from persistent infection to acute disease [13], and it could have different biological functions or a link to host specificity. Another peculiarity of the PV6959 genome is the absence of the two pld genes between the two toxB genes. This is the only such case among the genomes considered in this study, so sequencing more C. pecorum isolates could help to clarify the significance of this feature.
Genetic diversity also occurs in the biotin biosynthesis system, which produces biotin from pimeloyl-CoA. This system shows significant variability among the different chlamydial species and is absent in some of them (C. caviae, C. trachomatis and C. muridarum) [22]. Biotin is involved in many central metabolic pathways of the cell, which could indicate a role in host specificity [22]. In PV7855 and PV6959, the system is present and highly homologous to those of the other C. pecorum isolates.
Chlamydia species can be characterized by their ability to synthesize tryptophan, which depends on the trp system; this system is not complete in all chlamydial species. The trp system consists of prs, kynU, trpD, trpF, trpC, trpA, trpB and trpR and, in theory, allows the production of tryptophan from an anthranilate substrate [24]. In C. pecorum PV7855, sequenced in this study, the trp system is almost intact, consisting of trpD, trpF, trpC, trpA, trpB, trpR, prs and kynU. This complement of genes would permit the production of tryptophan from anthranilate. However, it does not allow the first step in tryptophan production, the conversion of chorismate to anthranilate, which would be catalyzed by trpE/G. It is hypothesized that anthranilate could be acquired through the capture of kynurenine from the host cell by an amino acid transporter, TyrP [13]. In PV7855, all of these genes are present and the system is highly homologous to those of the genomes considered in the comparison. In PV6959, however, the system is incomplete and, in particular, lacks the genes prs, kynU, trpF, trpC and trpD. This is the first time that the absence of these genes has been detected in a C. pecorum genome; until now, C. pecorum, together with C. felis and C. caviae, has been counted among the species in which the trp system is complete. The system is absent from the PZ of C. muridarum and C. psittaci, suggesting that the ability to synthesize tryptophan de novo is not mandatory for the transmission or survival of these species [22]. PV6959 is likely unable to produce tryptophan by itself, unless it manages to obtain indole in some other way.
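The completeness check described above amounts to comparing the genes annotated in an isolate with the expected trp gene set. A minimal sketch follows; the helper name `missing_genes` is hypothetical, and the gene list used for the PV6959-like example is inferred by subtracting the genes reported here as missing from the expected set, not read from the annotation files.

```python
# Expected trp system genes, in the order listed in the text.
TRP_SYSTEM = ("prs", "kynU", "trpD", "trpF", "trpC", "trpA", "trpB", "trpR")


def missing_genes(annotated, expected=TRP_SYSTEM):
    """Return the expected genes that are absent from an annotation list,
    preserving the order of the expected gene set."""
    present = set(annotated)
    return [gene for gene in expected if gene not in present]


# Illustrative inputs only (assumptions, see the paragraph above).
pv7855_like = list(TRP_SYSTEM)
pv6959_like = ["trpA", "trpB", "trpR"]

print(missing_genes(pv7855_like))  # no genes missing
print(missing_genes(pv6959_like))  # prs, kynU, trpD, trpF and trpC missing
```

In practice the annotated gene names would be parsed from the genome annotation, and name variants or unnamed hypothetical proteins would need to be resolved by sequence similarity before such a comparison is meaningful.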
Finally, an increasing number of recent studies have linked chlamydial plasmids to pathogenesis [8,9,10]. Plasmids are also recognized as carriers of virulence factors and are almost ubiquitous in chlamydial species [9]; we therefore included plasmids in our analysis. Plasmids in Chlamydia are small (usually about 7.5 kbp in length), highly conserved, non-integrative and non-conjugative, and they have been observed in many chlamydial species, including C. pecorum [8, 10]. They carry non-coding RNA sequences and eight open reading frames (ORF 1–8) [28], and their main role in C. trachomatis is to contribute to glycogen accumulation [29].
In this study, the plasmids identified contained eight CDSs and had a GC content in line with that of the other C. pecorum plasmids.
A comparison of the plasmids of PV6959, W73 and PV7855 shows slight differences. The W73 plasmid consists of eight genes, some of which are annotated as virulence factors and correspond to genes annotated as hypothetical proteins in the plasmids of the isolates studied here. In the Chlamydia plasmid, CDS 4 (the most polymorphic locus) and CDSs 5 and 6 (the most conserved loci) are associated with virulence [30]; these data are confirmed in pCpecPV7855, which is composed of eight genes, three of which encode hypothetical proteins.
One limitation of our study is that, among the genomes selected for the chromosomal phylogeny, only three strains (W73, DBDeUG and MC/MarsBar) have both a complete chromosome and a plasmid. Our plasmid phylogeny is therefore not directly comparable with the chromosomal phylogeny, which highlights the lack of complete C. pecorum genomes in public databases.