- Research article
- Open Access
Phylogenetic analysis of human Chlamydia pneumoniae strains reveals a distinct Australian indigenous clade that predates European exploration of the continent
BMC Genomics volume 16, Article number: 1094 (2015)
The obligate intracellular bacterium Chlamydia pneumoniae is a common respiratory pathogen, which has been found in a range of hosts including humans, marsupials and amphibians. Whole genome comparisons of human C. pneumoniae have previously highlighted a highly conserved nucleotide sequence, with minor but key polymorphisms and additional coding capacity when human and animal strains are compared.
In this study, we sequenced three Australian human C. pneumoniae strains, two of which were isolated from patients in remote indigenous communities, and compared them to all available C. pneumoniae genomes. Our study demonstrated a phylogenetically distinct human C. pneumoniae clade containing the two indigenous Australian strains, with estimates that the most recent common ancestor of these strains predates the arrival of European settlers to Australia. We describe several polymorphisms characteristic to these strains, some of which are similar in sequence to animal C. pneumoniae strains, as well as evidence to suggest that several recombination events have shaped these distinct strains.
Our study reveals a greater sequence diversity amongst both human and animal C. pneumoniae strains, and suggests that a wider range of strains may be circulating in the human population than current sampling indicates.
Chlamydia pneumoniae is an obligate intracellular bacterium and member of the Chlamydiaceae, a family of pathogens of higher eukaryotes with a distinct biphasic development cycle. Whilst C. pneumoniae is primarily recognised as an aetiological agent of community acquired pneumonia and other respiratory diseases in humans, it has a broad host range encompassing both warm [3–5] and cold blooded animals [6, 7]. Members of the Chlamydiaceae are characterised by their compact genomes and highly conserved gene content. C. pneumoniae has the greatest coding capacity of the Chlamydiaceae, with animal strains of C. pneumoniae having between 20 kbp (animal versus human C. pneumoniae) and almost 200 kbp (animal C. pneumoniae versus C. trachomatis serovar D) [9, 10] of extra nucleotide sequence. The additional coding capacity of C. pneumoniae is predominantly accounted for by the expansion of the polymorphic membrane protein (pmp) and inclusion membrane protein (inc) gene families [10–12], both of which are involved in the formation and maintenance of the chlamydial inclusion body and modulation of the host cell response [12, 13], as well as by a large number of species-specific metabolic and hypothetical protein genes [9, 10, 14].
In addition to its description as a cause of human respiratory disease, C. pneumoniae has been implicated in a variety of human pathologies, including cardiovascular disease, Alzheimer’s disease, ischaemic stroke, asthma and lung cancer [15–18]. Until recently, the majority of fully sequenced C. pneumoniae whole genomes were from strains that were isolated from respiratory pathologies [10, 19, 20], and demonstrated highly conserved nucleotide sequence content and gene order. Recently, several genomes from respiratory and cardiovascular strains were reported, as were whole genome sequences from atherosclerotic and Alzheimer’s C. pneumoniae strains, which allowed for comparison of strains isolated from different diseases, and demonstrated that only minor genetic differences were found between these strains [9, 21, 22].
A previous study examining the genetic diversity between human and animal C. pneumoniae suggested that a genetically distinct strain of human C. pneumoniae was present and circulating within Australian indigenous communities. PCR analysis of a small number of selected target genes was performed on two respiratory strains isolated from Indigenous Australian patients in geographically separate regions [24, 25]. In some instances, the nucleotide sequences of these strains placed them phylogenetically closer to animal strains of C. pneumoniae than to those circulating in human populations in Australia and worldwide.
To further explore the genetic diversity of Australian human C. pneumoniae strains, we sequenced the genomes of two human Australian indigenous C. pneumoniae strains and a third strain from an Australian Caucasian patient, and performed comparative genomic and phylogenetic analyses. In doing so, we (i) demonstrate that the indigenous Australian human strains form a separate clade branching earlier than other human C. pneumoniae strains; (ii) identify genetic markers unique to Australian indigenous and non-indigenous strains; and (iii) reveal evidence of limited recombination within C. pneumoniae strains from the greater human C. pneumoniae clade.
Phylogenetic relationships in human C. pneumoniae reveal a distinct Australian indigenous clade predating European exploration of the continent
C. pneumoniae strains SH511, 1979 [24, 25] and WA97001 were sequenced following capture of C. pneumoniae DNA using a set of species-specific SureSelectXT RNA probes [27–29]. Sequencing reads of C. pneumoniae WA97001, SH511 and 1979 were mapped to the reference genome, C. pneumoniae AR39, to check the efficacy of the SureSelectXT DNA captures. The genome of SH511 had the highest mean read depth of 1944×, followed by 1979, which had a mean read depth of 1887×. The SH511 and 1979 genomes assembled into 10 contigs and 31 contigs, respectively. In contrast, the C. pneumoniae WA97001 genome had a far lower read depth of 15× and assembled into 104 contigs.
In order to determine the evolutionary and phylogenetic relationships between the Australian C. pneumoniae strains and those previously published, Bayesian and coalescent estimation methods were used to construct phylogenetic trees based on whole genome alignments of all human C. pneumoniae strains and the three published animal C. pneumoniae strains.
Percentage pairwise identities between indigenous and non-indigenous strains ranged from 98.4 to 98.8 %, whilst identities among non-indigenous strains were 99.0 % or greater. Percentage identities of all strains used in the MrBayes analysis are outlined in Table 1. The resulting phylogenetic tree, represented in Fig. 1, demonstrates a clear demarcation of animal and human clades. The majority of non-indigenous human strains cluster into two clades: a large single clade that contains the AR39 and CWL029 subclades, and the smaller TW183 clade. Interestingly, the two Australian indigenous C. pneumoniae strains, SH511 and 1979, formed their own clade, which branched deepest from the main human C. pneumoniae grouping but was also considerably distant from the animal C. pneumoniae clade. The Australian Caucasian strain WA97001 and IOL207 (a strain isolated from a case of acute conjunctivitis) formed their own separate branches in the main human C. pneumoniae clade.
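The percentage pairwise identities reported above come from the whole genome alignment. As a minimal sketch of how such a figure can be computed, the function below counts matching bases over alignment columns where neither sequence has a gap. The gap handling is an assumption (the study used Geneious, whose exact treatment of gaps is not described here), and the sequences are toy fragments rather than real C. pneumoniae data.

```python
def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are skipped; other tools may count them
    as mismatches, which would lower the reported value.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only)
strain_x = "ATGCCGTA-A"
strain_y = "ATGCTGTAGA"
print(f"{pairwise_identity(strain_x, strain_y):.1f} %")
```

Here 8 of the 9 ungapped columns match, so the toy pair is about 88.9 % identical; on a genome-scale alignment the same calculation would be run for every pair of strains to fill a table such as Table 1.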
To investigate the evolutionary relationships of these deep-branching Australian indigenous human strains further, we determined the date of the most recent common ancestor (MRCA) of the indigenous Australian C. pneumoniae strains by using BEAST and ClonalFrame coalescent estimation methods. BEAST analysis of indigenous and non-indigenous C. pneumoniae strains reveals an MRCA for indigenous strains at 1028, with a 95 % credibility interval between 996 and 1062 years. The mean substitution rate was determined to be 4.64 × 10⁻⁴ substitutions per site, per year. ClonalFrame analysis of indigenous and non-indigenous C. pneumoniae strains reveals an MRCA of 1425 for the indigenous strains, with a mean substitution rate of 2.36 × 10⁻⁵ per site per year. Though the two programs differ in their predicted MRCA and substitution rates, which can be accounted for by the difference in their calculation methods, their estimates support similar evolutionary timelines and dates.
Identification of genetic markers that distinguish Australian indigenous strains from non-indigenous and animal C. pneumoniae strains
Using a PCR-based sequencing approach, we previously identified a series of potential genetic markers that could be used to distinguish Caucasian C. pneumoniae strains of different origins. In the current study, fine-detailed genomic comparisons identified a series of novel genetic markers unique to the Australian indigenous strains, as well as unexpected sequence diversity in the DC9, WA97001 and IOL207 strains, which support their distinct phylogenetic positions in the C. pneumoniae tree.
One of the most significant regions of genetic variation identified is located around four full-length IncA genes annotated in koala strain LPCoLN (CPK_ORF00546 to CPK_ORF00549); the differences in this region support our phylogenetic results. The most notable finding in this region for the three Australian strains was the observation that the Australian indigenous strains contain a full-length homolog of CPK_ORF00549 sharing 99.4 % nucleotide pairwise identity with the koala homolog (Fig. 2). The presence of this gene in strains SH511 and 1979, and its significant sequence identity to the koala/bandicoot homolog, supports the branching of the Australian indigenous clade earliest in the greater human C. pneumoniae phylogeny. Conversely, the Australian indigenous strains do not have a copy of CPK_ORF00547. This locus is also absent in the frog (DC9) strain and all strains within the TW183 clade, but is found in fragmented forms in all other human strains. Gene copy numbers and fragmentation with respect to the koala LPCoLN strain are represented in Fig. 2.
Another genetic marker unique to the Australian indigenous C. pneumoniae strains SH511 and 1979, was the presence of a 159 bp insertion in the gene homologous to koala CPK_ORF0341 (585 bp insertion compared to the AR39 homolog). Translation of the open reading frame suggests that this is a putative IncA gene which is full length in the koala strain. However, this gene is slightly truncated by 84 amino acids in indigenous strains (354 amino acids in length) and is only 154 amino acids in length in all other strains, including frog DC9 - due to a single nucleotide insertion 3’ which results in a frame shift (Fig. 3). Again, the large, strain-specific insertion and its sequence similarity to the koala homolog, supports the earliest branching of the Australian indigenous strains in the major human C. pneumoniae clade.
Sequence polymorphism has been described in the guaB/A-add operon in human and animal C. pneumoniae strains, with previous studies detailing that human C. pneumoniae strains encode fragmented inosine-5-monophosphate dehydrogenase (guaB) genes . In this study, we found that like the DC9 frog strain, the Australian indigenous strains and strain IOL207 encode for a full length, intact guaB gene. By comparison, all other human strains have a T/C transition at nucleotide position 262, which results in a stop codon (Fig. 4). Varied levels of sequence decay are evident in the Australian strains for GMP synthase (guaA) and adenosine deaminase (add). Deletions in both the guaA and add homologs of WA97001 result in truncations of these genes with loss of functional domains, whilst the Australian indigenous strains exhibit extensive sequence decay at this locus, resulting in the absence of guaA-add and the downstream hypothetical protein. Interestingly, whilst the entire guaA/B-add operon is absent in both koala and bandicoot strains, these genes are present in the frog strain DC9.
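To illustrate how a single transition at nucleotide position 262 can truncate guaB, the sketch below maps a 1-based nucleotide position to its codon and scans a coding sequence for the first in-frame stop. Position 262 is the first base of codon 88, so a C to T change there turns a codon such as CAA into TAA. The codon identity (CAA) and the surrounding sequence are hypothetical placeholders; the actual guaB sequence is not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(position_1based: int) -> tuple:
    """Return (1-based codon number, 0-based offset within that codon)."""
    index = position_1based - 1
    return index // 3 + 1, index % 3


def first_stop(cds: str):
    """Return the 1-based codon number of the first in-frame stop, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


# Hypothetical coding sequence: 87 filler codons, then CAA at codon 88
intact = "GCT" * 87 + "CAA" + "GCT" * 3
# Apply a C -> T transition at nucleotide position 262 (string index 261)
mutant = intact[:261] + "T" + intact[262:]

print(codon_at(262))       # codon number and offset for position 262
print(first_stop(intact))  # no premature stop in the intact toy sequence
print(first_stop(mutant))  # codon at which translation would terminate
```

In this toy case the intact sequence has no in-frame stop, while the mutant terminates at codon 88, which mirrors the fragmented guaB described for most human strains.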
Various sequence polymorphisms are evident in the Australian C. pneumoniae strains for the pmpE/F4 gene. Both indigenous Australian strains 1979 and SH511 are truncated as the result of several deletions, whilst a single nucleotide insertion in WA97001 results in a frameshift causing truncation of this gene. This results in the loss of the C-terminal autotransporter domain for all three strains - however the mid-gene region encoding for nine FXXN and eight GGA(I,L,V) amino acid motifs are highly conserved across all the human C. pneumoniae strains (Fig. 5). Additionally, whilst both the koala and bandicoot homologs of this gene display extensive sequence polymorphism, the DC9 frog homolog is highly similar in sequence to the non-indigenous human pmpE/F4 and encodes for the full-length protein.
Australian indigenous strains demonstrate characteristic recombination profiles with only a few instances shared with non-indigenous strains
In addition to estimation of the MRCA and mean substitution rate, ClonalFrame was used to determine the recombination profiles and any shared recombination loci in C. pneumoniae. Our study found that the Australian indigenous strains SH511 and 1979 had a distinct and almost identical recombination and nucleotide substitution profile, with only a single difference in recombination locus between the two: SH511 between 296,000 and 298,000 bp and 1979 between 310,000 and 316,000 bp. Additionally, SH511 and 1979 share a strongly supported recombination event with the atherosclerosis strain A03, and to a lesser extent with the Australian non-indigenous strain WA97001, between 778,000 and 784,000 bp, a region that encompasses hypothetical protein and putative IncA genes. In comparing recombination profiles across the non-indigenous C. pneumoniae strains, the Australian WA97001 strain shares a single strong recombination event with A03 and TW183 between 823,600 and 827,100 bp, which encompasses putative IncA genes. Several nucleotide substitution events are shared amongst the various C. pneumoniae strains, though the highest numbers of nucleotide substitutions occur in strains J138, IOL207 and DC9 (Fig. 6). A Phi test for recombination was performed on the C. pneumoniae whole genome alignment using SplitsTree4, which found a total of 16,329 informative sites and statistically significant evidence of recombination (p = 5.538 × 10⁻⁴).
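The Phi test operates on informative sites in the alignment. As a rough sketch, and assuming the usual parsimony-informative definition (a column in which at least two different bases each occur in at least two sequences), such sites could be counted as follows. This is not the SplitsTree4 implementation, and the alignment shown is a toy example.

```python
from collections import Counter


def is_informative(column: str) -> bool:
    """True if at least two bases each occur in two or more sequences.

    Assumption: gaps and ambiguity codes are ignored.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative_sites(alignment: list) -> int:
    """Count parsimony-informative columns in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return sum(is_informative("".join(col)) for col in zip(*alignment))


# Toy alignment of four hypothetical sequences
toy_alignment = [
    "ACGTA",
    "ACGTC",
    "ATGAC",
    "ATGAA",
]
print(count_informative_sites(toy_alignment))
```

In the toy alignment, columns 2, 4 and 5 are informative and columns 1 and 3 are invariant, so the count is 3.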
C. pneumoniae has been described as an ancient pathogen, with the broadest host range of any member of the Chlamydiaceae. Comparative whole genome studies examining the differences between human respiratory, non-respiratory [9, 21, 22, 36] and animal C. pneumoniae strains all demonstrate a highly conserved core genome with subtle strain-specific differences. We previously characterized some of these subtle differences using a PCR/sequencing approach and revealed that, for some genes, the two Australian indigenous human strains sequenced in this study shared genetic markers with the koala LPCoLN strain rather than with other human strains. To further explore the relationship of Australian indigenous and non-indigenous human strains, in the current study we obtained whole genome sequences for three Australian respiratory strains (SH511, 1979 and WA97001) and performed comparative analyses to understand their relationship to other previously characterized human and animal C. pneumoniae strains.
Using a variety of phylogenomic tools, our analysis suggests that the Australian C. pneumoniae indigenous strains form a phylogenetically distinct clade away from all other human C. pneumoniae strains sequenced to date. This is substantiated by unique sequence polymorphisms and recombination profiles associated with the Australian indigenous strains. In contrast to previous phylogenies constructed using sequenced PCR fragments, which alternately placed the Australian indigenous strains within either the human or animal branches of the tree, the use of whole genome sequences gives a more accurate description of the position of these strains within the greater C. pneumoniae evolutionary tree. Fine-detailed genomic comparisons also revealed several novel genetic markers in Australian indigenous human C. pneumoniae strains, beyond those identified in previous PCR-based studies.
The Australian indigenous strains demonstrate a copy number incongruity within the CPK_ORF00546 to CPK_ORF00549 IncA gene family. This gene family expansion was first described in the koala LPCoLN strain, with human C. pneumoniae strains exhibiting variable levels of gene fragmentation and gene loss at this locus. The Australian indigenous strains are unique in that they specifically encode a homolog of CPK_ORF00549: to date, SH511 and 1979 are the only human C. pneumoniae strains that encode this homolog. Previous studies have shown that C. pneumoniae encodes a far larger number of IncA and putative IncA proteins than other Chlamydiae [11, 12], many of which are species-specific. Strong recombination signals were also detected within several human C. pneumoniae strains at loci encoding IncA proteins, which suggests that recombination may account for the expanded number of IncA proteins in C. pneumoniae.
One of the more subtle genetic differences observed between the strains analysed was the maintenance of a partial purine biosynthesis pathway encoded by guaA/B-add . Previous studies demonstrated that the guaB gene is fragmented in human C. pneumoniae strains , however in this study we demonstrate that strains DC9, SH511 and 1979 encode for an intact guaB gene. Given that the Australian indigenous strains do not encode guaA-add, it is likely that the sequence for guaB was a recent acquisition from a strain most similar to DC9. Interestingly, in contrast to the koala and bandicoot strains where the entire guaA/B-add operon is absent [9, 37], the frog DC9 strain encodes guaA/B-add genes, with >99.5 % nucleotide pairwise identity to all human C. pneumoniae strains, with the exception of the three Australian strains. Studies in both C. psittaci and Chlamydia caviae have found evidence for horizontal gene transfer of the guaA/B-add operon between different chlamydial strains and species [33, 38], lending further support for the recent acquisition of guaB by the Australian indigenous strains. Whilst it is unclear what effect the presence or absence of guaA/B-add has on the growth and virulence of human and animal C. pneumoniae strains, a previous study examining the effect of mutations in the Chlamydia muridarum plasticity zone suggest that 5’ point mutations of guaB and add result in attenuated virulence in vivo, whilst guaA/B-add mutations do not affect the growth characteristics of these strains in vitro . These observations are similar to those reported for the growth and virulence of Borrelia burgdorferi and Francisella tularensis guaA/B +/− strains in vitro and in vivo [40, 41].
In order to further explore the evolutionary relationships of the Australian indigenous C. pneumoniae strains, BEAST and ClonalFrame analyses predicted that these strains had an MRCA of 1028 and 1425, respectively. Both of these estimations pre-date the known colonization of the Australian continent by Europeans by several hundred years, but are virtually identical to the previously estimated MRCA for strains within the non-indigenous clade at 1151 +/− 20 years .
Given this new evidence and our previous data suggesting that C. pneumoniae strains in humans likely originated from a zoonotic event(s) [9, 23], it is interesting to speculate on the origin of these indigenous human C. pneumoniae strains. Two possible evolutionary hypotheses to explain the deep-branching of these strains are proposed: (A), the Australian indigenous strains have evolved from a separate zoonotic transmission event, or alternate intermediate strain, to that of the other human C. pneumoniae strains. These ancestral strains were subsequently endemic on the Australian continent and continued to evolve in isolation to the non-indigenous C. pneumoniae strains. Alternatively (B), all human C. pneumoniae strains disseminated from a common intermediate strain, resultant from a single zoonotic event several thousand years ago, and evolved separately in response to their different ecological niches (Fig. 7). Our findings provide support for both hypotheses.
With respect to hypothesis (A), estimations from both BEAST and ClonalFrame analyses indicate an MRCA for the indigenous strains several hundred years prior to the first reported visitation of the Australian continent by Dutch or British explorers [42, 43]. This suggests the possibility that an endemic strain similar to our strains may have been circulating within the indigenous population prior to the arrival of European colonisation. Given the sequence similarity of the indigenous strains to the koala and bandicoot C. pneumoniae strains at several key loci (the absence of guaA-add, polymorphisms in pmpE/F4 and the IncA gene expansion), as well as those previously described , it is possible that a strain similar to these animal strains was zoonotically transmitted to humans on the Australian continent. Hunter-gatherer communities lived in close proximity and interacted with wild animals throughout human history, which would facilitate the transmission of a pathogen to humans. Serological studies examining the prevalence of chlamydial infection in remote indigenous communities have reported levels of almost 60 % adult female seroprevalence to C. pneumoniae . Several species of native Australian marsupials [9, 23, 37, 44] as well as an amphibian [7, 23] have been demonstrated to have genetic sequence similar to that of the koala LPCoLN strain. Studies have shown that koala and bandicoot C. pneumoniae strains readily infect various human-derived cell lines [3, 45, 46], and evidence for human carotid artery and PBMC strains which are genotypically similar to the koala strain at the ompA and yge-urk intergenic spacer loci have been reported . If the distinct phylogenetic clustering of SH511 and 1979 is a result of a separate zoonotic event to that of the main human C. pneumoniae lineage, then it is likely that the animal strain that they have evolved from is still unknown, and probably more similar to the frog DC9 strain in sequence and nucleotide content.
The alternate hypothesis (B) is that all human C. pneumoniae strains disseminated from a single zoonotic event (presumably in the Americas or Europe) and then differentiated along separate evolutionary paths, dependent on their geographical and disease niche. The estimated MRCA for indigenous and non-indigenous human strains differs by less than 200 years, and the two groups are phylogenetically much closer to each other than to the animal strains. The overall nucleotide pairwise identity of the Australian indigenous strains is more similar to other human strains of C. pneumoniae, even when significant similarities to animal strains at discrete loci are included. There are two possible mechanisms to explain the dissemination of these particular strains. Firstly, various strains of C. pneumoniae were circulating in the worldwide human population approximately 40 thousand years ago, which is well prior to the colonisation of the Australian continent, and one or some of these strains came to the continent with the arrival of the indigenous peoples. This would account for the characteristic sequence polymorphisms present in SH511 and 1979 but not in other human C. pneumoniae strains. Alternatively, the worldwide variation in human C. pneumoniae is far greater than has yet been determined, and several strain types were introduced to the Australian continent with European colonisation. This in turn would account for the overall sequence similarity of the SH511 and 1979 strains to non-indigenous human C. pneumoniae strains, in particular WA97001, with which they share a considerable number of SNPs, as opposed to the Australian marsupial strains, LPCoLN and B21. In both cases, genetically distinct subpopulations of C. pneumoniae could have spread throughout, and evolved in isolation within, the indigenous Australian population.
Genotypic variation amongst concurrent populations of monomorphic bacteria resulting from selective sweeps is well documented in both Chlamydia [49, 50] and other bacterial species . The differentiation of the main human C. pneumoniae lineage from both the indigenous and animal lineages could be explained by adaptation of these strains to selective and antigenic pressure as a result of extensive antibiotic treatment regimes .
Whilst our study provides evidence for a phylogenetically and genetically distinct branch of human C. pneumoniae, these inferences are made on a relatively small sample size, taken from two individuals from remote communities in the same state, over two decades ago. It is highly unlikely that sampling from the same remote communities and wider ranging communities will uncover the same strains as documented in this study; given the increased interaction between members of remote indigenous communities and neighbouring townships, as well as expanded antibiotic treatment regimes for a range of bacterial infections, including Chlamydia, within these communities. It is also possible that greater sampling for C. pneumoniae in countries outside Australia would uncover a wider range of strains, some of which may be similar to those described in this study.
In summary, we used a combination of comparative genomic and phylogenetic methods to determine the evolutionary position of three Australian human C. pneumoniae strains within the greater C. pneumoniae tree. Our study demonstrated a phylogenetically distinct human C. pneumoniae clade consisting of two Australian indigenous strains, that branched earlier in the human C. pneumoniae evolutionary tree with an estimated MRCA predating the exploration and colonisation of the continent by European settlers by several hundred years. Our findings indicate that a unique strain of C. pneumoniae evolved in isolation within the Australian indigenous population, as evidenced by the unique recombination profiles and distinct sequence polymorphisms in these strains. This suggests that a far greater level of sequence diversity is present amongst human and animal C. pneumoniae strains than previously surmised, and that further sampling of C. pneumoniae isolates from wider geographical regions may uncover strains which have evolved similarly to this unique C. pneumoniae clade.
Description of Chlamydia pneumoniae strains, cell culturing and DNA purification
Three Australian C. pneumoniae cultured isolates (WA97001, SH511 and 1979) were used for comparative analyses in this study. The non-indigenous isolate WA97001 is a clinical nasopharyngeal isolate from Western Australia, whilst isolates SH511 and 1979 are indigenous Australian isolates from two separate patients in remote Northern Territory communities [24, 25].
Isolate WA97001 was propagated on McCoy cells in T75 flasks for five passages, based on a previously described method. Infected cells were pooled and semi-purified using a sonication and centrifugation method prior to passage. The final semi-purified product was stored in an equal volume of SPG media. 500 μl of this semi-purified material was used for DNA extraction. Isolates SH511 and 1979 were extracted from non-viable archival culture material; 500 μl of each isolate was used for DNA extraction.
DNA extraction was performed using phenol:chloroform:IAA, based on a well described method , with the addition of 2 μl of glycogen prior to ethanol precipitation at −20 °C overnight. Precipitated DNA was dissolved in 50 μl of TE buffer. 500 ng of extracted DNA was used to perform pan-Chlamydiales 16S rRNA  and C. pneumoniae specific RpoB  PCR to confirm the presence of C. pneumoniae DNA, and 500 ng of stock DNA was electrophoresed on a 0.8 % TBE agarose gel to confirm high molecular weight DNA. Each DNA extraction yielded greater than 2 μg of high molecular weight genomic DNA, which was used for sequence capture and Illumina HiSeq 2500 whole genome sequencing at the Institute for Genome Sciences, Baltimore, Maryland.
Sequence capture, whole genome sequencing and assembly
Sequence capture was performed on total DNA extracted from WA97001, SH511 and 1979 with Agilent SureSelectXT DNA capture probes designed to C. pneumoniae reference strain AR39, using a hybridisation capture and amplification process [27–29]. Captured and amplified products were sequenced using the Illumina HiSeq 2500 platform, resulting in paired-end 100 base pair reads. Read quality was checked with FastQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/) and genomes were assembled de novo using SPAdes 3.0.0 with k-mer values set to 15, 21, 33, 51 and 71. All assembled contigs were aligned to the reference C. pneumoniae AR39 genome using BLASTn to remove non-chlamydial contigs. Concatenated genome contigs were annotated using the RAST pipeline and manually curated using ARTEMIS. Total read depth of WA97001, SH511 and 1979 was calculated by mapping the raw reads to the complete genome of C. pneumoniae AR39 using the BWA-backtrack algorithm of the BWA aligner. Raw reads were also mapped to the complete genome of C. pneumoniae LPCoLN for comparison. The BWA parameters used were as follows: the number of differences allowed between the reference and query was set at 0.04, the number of differences allowed in the seed was 2, the maximum number of gaps allowed in the alignment was 1 and the gap penalty was set at 11.
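The assembly and mapping settings above can be sketched as command lines. The helper functions below only build argument lists and do not run anything. The mapping of the stated BWA parameters onto `bwa aln` options (`-n`, `-k`, `-o`, `-O`) is our interpretation of the text, and all file names are hypothetical placeholders.

```python
import shlex


def spades_command(reads_1, reads_2, out_dir, kmers=(15, 21, 33, 51, 71)):
    """Build a SPAdes paired-end de novo assembly command."""
    return [
        "spades.py",
        "-1", reads_1,                           # forward reads
        "-2", reads_2,                           # reverse reads
        "-k", ",".join(str(k) for k in kmers),   # k-mer sizes from the text
        "-o", out_dir,
    ]


def bwa_aln_command(reference, reads, max_diff=0.04, seed_diff=2,
                    max_gap_opens=1, gap_open_penalty=11):
    """Build a BWA-backtrack (bwa aln) command.

    Assumed mapping of the described settings onto bwa aln options:
    -n differences allowed, -k differences in the seed,
    -o maximum gap opens, -O gap open penalty.
    The reference must already be indexed with `bwa index`.
    """
    return [
        "bwa", "aln",
        "-n", str(max_diff),
        "-k", str(seed_diff),
        "-o", str(max_gap_opens),
        "-O", str(gap_open_penalty),
        reference, reads,
    ]


# Hypothetical file names, for illustration only
print(shlex.join(spades_command("SH511_R1.fastq", "SH511_R2.fastq", "SH511_spades")))
print(shlex.join(bwa_aln_command("AR39.fasta", "SH511_R1.fastq")))
```

Note that `bwa aln` writes a suffix array index file to standard output, which would then be converted to alignments with `bwa sampe` for paired-end reads; that step is omitted here.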
Phylogenetic and recombination analyses
De novo assemblies and readmapped assembled consensus sequences for WA97001, SH511 and 1979 were aligned to the existing human C. pneumoniae whole genome sequences [10, 19–22, 61] and animal C. pneumoniae strains LPCoLN, B21 and DC9 [9, 22, 37] in Geneious 6.1.8 using the MAFFT plugin implementation. Coverage analyses for readmapped assemblies and manual curation of annotated genomes were performed using ARTEMIS.
Phylogenetic analyses were performed on whole genome alignments, with the LPCoLN koala C. pneumoniae strain indicated as an outlier. Whole genome alignments were also filtered for poorly aligned and gap regions using Gblocks 0.91b. Mid-point rooted trees were constructed with the MrBayes plugin in Geneious, utilising a Jukes-Cantor substitution model with four Markov Chain Monte Carlo (MCMC) chains and 1.1 million cycles, sampled every 1000 generations and the first 10,000 trees discarded as burn-in. Estimates of strain evolution over time were performed on whole genome alignments using the BEAST package. Indigenous, non-indigenous and animal isolates were defined in separate taxon sets and a GTR nucleotide substitution model was employed. MRCA priors were set at a normal distribution with a mean of 95.2 +/− 7.4. MCMC chain length was set to 5 × 10⁷ to ensure effective sample sizes were sufficient for strong posterior distribution statistics. ClonalFrame was used to determine homologous recombination within C. pneumoniae genomes, and progressive MAUVE was used to generate the input alignments. Three successive runs of ClonalFrame were performed on the whole genome alignment, each with 20,000 iterations and 10,000 of these discarded as burn-in. The three runs were checked for convergence and their trees combined for analysis. An additional Phi test for recombination was performed in SplitsTree4 using the whole genome alignment generated by MAFFT in Geneious.
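The Jukes-Cantor model named above assumes equal base frequencies and a single substitution rate. Its textbook distance correction, d = −(3/4) ln(1 − (4/3)p), converts an observed proportion of differing sites p into an estimated number of substitutions per site. The sketch below applies that formula; it is shown for orientation only and is not how MrBayes evaluates the model internally, which is through likelihood calculations on the tree.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion of differences p.

    Valid for 0 <= p < 0.75; at p >= 0.75 the correction is undefined.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Example: 98.4 % pairwise identity corresponds to p = 0.016
identity_percent = 98.4
p = 1.0 - identity_percent / 100.0
print(f"observed p = {p:.4f}, JC distance = {jukes_cantor_distance(p):.5f}")
```

At the identity levels reported in this study the correction is small, because very few sites are expected to have undergone more than one substitution.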
The accession numbers for the C. pneumoniae whole genome sequences used in the comparative analyses and phylogenies are outlined in Table 2.
Description of polymorphic hotspots in C. pneumoniae whole genome alignments
De novo and readmapping assemblies were used to construct whole genome alignments with previously described human and animal C. pneumoniae whole genomes in MAFFT  in Geneious . Single nucleotide polymorphisms (SNPs) and insertions/deletions were detected using the Variations/SNPs tool in Geneious, and larger scale differences were detected via manual scanning of the genome alignment. Sequence for genes which appeared to have significant deletions or insertions were manually extracted and sequence run against the BLAST  database to determine closest homologs. Sequences were translated and searched against the SMART database  to predict any changes in functional domains or protein motifs.
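As a minimal sketch of the kind of comparison described above, the function below walks two aligned sequences and reports substitution columns (candidate SNPs) separately from columns where only one sequence has a gap (candidate insertions/deletions). It is a simplified stand-in for the Geneious Variations/SNPs tool, not a reproduction of it, and the sequences are hypothetical.

```python
def find_variants(reference: str, query: str):
    """Return (snps, indel_columns) for two aligned sequences.

    snps: list of (1-based alignment column, reference base, query base)
    indel_columns: 1-based columns where exactly one sequence has a gap
    """
    if len(reference) != len(query):
        raise ValueError("sequences must come from the same alignment")
    snps = []
    indel_columns = []
    pairs = zip(reference.upper(), query.upper())
    for column, (ref_base, query_base) in enumerate(pairs, start=1):
        if (ref_base == "-") != (query_base == "-"):
            indel_columns.append(column)  # gap in exactly one sequence
        elif ref_base in "ACGT" and query_base in "ACGT" and ref_base != query_base:
            snps.append((column, ref_base, query_base))
    return snps, indel_columns


# Toy aligned fragments (hypothetical, for illustration only)
snps, indels = find_variants("ATGCCGTA-A", "ATGCTGTAGA")
print(snps)
print(indels)
```

For the toy pair this reports one substitution at column 5 (C in the reference, T in the query) and one indel at column 9. Columns are alignment coordinates, so they would need converting to genome coordinates before comparison with annotated loci.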
Availability of supporting data
The WA97001, SH511 and 1979 whole genome sequencing projects can be found in the National Center for Biotechnology Information (NCBI) BioProject database under accession numbers [Bioproject:PRJNA291806, Bioproject:PRJNA291802 and Bioproject:PRJNA291805], with reads deposited in the Sequence Read Archive under accession numbers [SRA:SRR2144962, SRA:SRR2144961 and SRA:SRR2144960], respectively.
This study was approved by the ethics committee of the Queensland University of Technology and the Menzies School of Health Research Human Research Ethics Committee. Ethics approval for the collection and analysis of strains SH511, 1979 and WA97001 was obtained from Queensland University of Technology, Menzies School of Health Research and the Princess Margaret Hospital for Children.
- pmp :
polymorphic membrane protein
- inc :
inclusion membrane protein
- PCR :
polymerase chain reaction
- MRCA :
most recent common ancestor
- SNP :
single nucleotide polymorphism
guanine monophosphate synthetase
- add :
Horn M, Collingro A, Schmitz-Esser S, Beier CL, Purkhold U, Fartmann B, et al. Illuminating the evolutionary history of Chlamydiae. Science. 2004;304(5671):728–30. doi:10.1126/science.1096330.
Hahn DL, Azenabor AA, Beatty WL, Byrne GI. Chlamydia pneumoniae as a respiratory pathogen. Front Biosci. 2002;7:E66–76. doi:10.2741/hahn.
Kutlin A, Roblin PM, Kumar S, Kohlhoff S, Bodetti T, Timms P, et al. Molecular characterization of Chlamydophila pneumoniae isolates from western barred bandicoots. J Med Microbiol. 2007;56(3):407–17. doi:10.1099/jmm.0.46850-0.
Storey C, Lusher M, Yates P, Richmond S. Evidence for Chlamydia pneumoniae of nonhuman origin. J Gen Microbiol. 1993;139:2621–6. doi:10.1099/00221287-139-11-2621.
Girjes AA, Carrick FN, Lavin MF. Remarkable sequence relatedness in the DNA encoding the major outer membrane protein of Chlamydia psittaci (koala type I) and Chlamydia pneumoniae. Gene. 1994;138(1):139–42. doi:10.1016/0378-1119(94)90796-X.
Bodetti TJ, Jacobson E, Wan C, Hafner L, Pospischil A, Rose K, et al. Molecular evidence to support the expansion of the host range of Chlamydophila pneumoniae to include reptiles as well as humans, horses, koalas and amphibians. Syst Appl Microbiol. 2002;25(1):146–52. doi:10.1078/0723-2020-00086.
Berger L, Volp K, Mathews S, Speare R, Timms P. Chlamydia pneumoniae in a free-ranging giant barred frog (Mixophyes iteratus) from Australia. J Clin Microbiol. 1999;37(7):2378–80.
Collingro A, Tischler P, Weinmaier T, Penz T, Heinz E, Brunham RC, et al. Unity in variety-the pan-genome of the chlamydiae. Mol Biol Evol. 2011;28(12):3253–70. doi:10.1093/molbev/msr161.
Myers GSA, Mathews SA, Eppinger M, Mitchell C, O’Brien KK, White OR, et al. Evidence that human Chlamydia pneumoniae was zoonotically acquired. J Bacteriol. 2009;191(23):7225–33. doi:10.1128/jb.00746-09.
Kalman S, Mitchell W, Marathe R, Lammel C, Fan L, Hyman RW, et al. Comparative genomes of Chlamydia pneumoniae and C. trachomatis. Nat Genet. 1999;21(4):385–9. doi:10.1038/7716.
Lutter EI, Martens C, Hackstadt T. Evolution and conservation of predicted inclusion membrane proteins in chlamydiae. Comp Funct Genomics. 2012. doi:10.1155/2012/362104.
Dehoux P, Flores R, Dauga C, Zhong GM, Subtil A. Multi-genome identification and characterization of Chlamydiae-specific type III secretion substrates: the Inc proteins. BMC Genomics. 2011. doi:10.1186/1471-2164-12-109.
Grimwood J, Olinger L, Stephens RS. Expression of Chlamydia pneumoniae polymorphic membrane protein family genes. Infect Immun. 2001;69(4):2383–9. doi:10.1128/iai.69.4.2383-2389.2001.
Mitchell CM, Hovis KM, Bavoil PM, Myers GSA, Carrasco JA, Timms P. Comparison of koala LPCoLN and human strains of Chlamydia pneumoniae highlights extended genetic diversity in the species. BMC Genomics. 2010. doi:10.1186/1471-2164-11-442.
Ramirez JA, Ahkee S, Summersgill JT, Ganzel BL, Ogden LL, Quinn TC, et al. Isolation of Chlamydia pneumoniae from the coronary artery of a patient with coronary atherosclerosis. Ann Intern Med. 1996;125(12):979–82. doi:10.7326/0003-4819-125-12-199612150-00008.
Dreses-Werringloer U, Bhuiyan M, Zhao YH, Gerard HC, Whittum-Hudson JA, Hudson AP. Initial characterization of Chlamydophila (Chlamydia) pneumoniae cultured from the late-onset Alzheimer brain. Int J Med Microbiol. 2009;299(3):187–201. doi:10.1016/j.ijmm.2008.07.002.
Hasan ZN. Association of Chlamydia pneumoniae serology and ischemic stroke. South Med J. 2011;104(5):319–21. doi:10.1097/SMJ.0b013e3182114954.
Zhan P, Suo LJ, Qian QA, Shen XK, Qiu LX, Yu LK, et al. Chlamydia pneumoniae infection and lung cancer risk: a meta-analysis. Eur J Cancer. 2011;47(5):742–7. doi:10.1016/j.ejca.2010.11.003.
Read TD, Brunham RC, Shen C, Gill SR, Heidelberg JF, White O, et al. Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 2000;28(6):1397–406. doi:10.1093/nar/28.6.1397.
Shirai M, Hirakawa H, Kimoto M, Tabuchi M, Kishi F, Ouchi K, et al. Comparison of whole genome sequences of Chlamydia pneumoniae J138 from Japan and CWL029 from USA. Nucleic Acids Res. 2000;28(12):2311–4. doi:10.1093/nar/28.12.2311.
Roulis E, Bachmann NL, Myers GS, Huston W, Summersgill J, Hudson A, et al. Comparative genomic analysis of human Chlamydia pneumoniae isolates from respiratory, brain and cardiac tissues. Genomics. 2015. doi:10.1016/j.ygeno.2015.09.008.
Weinmaier T, Hoser J, Eck S, Kaufhold I, Shima K, Strom TM, et al. Genomic factors related to tissue tropism in Chlamydia pneumoniae infection. BMC Genomics. 2015;16(1):268. doi:10.1186/s12864-015-1377-8.
Mitchell CM, Hutton S, Myers GSA, Brunham R, Timms P. Chlamydia pneumoniae is genetically diverse in animals and appears to have crossed the host barrier to humans on (at least) two occasions. PLoS Pathog. 2010. doi:10.1371/journal.ppat.1000903.
Asche LV, Hutton SI, Douglas FP. Serological evidence of the three chlamydial species in an aboriginal community in the Northern Territory. Med J Aust. 1993;158(9):603–4.
Hutton S, Dodd H, Asche V. C. pneumoniae successfully isolated. Today’s Life Science. 1993;5:2.
Coles KA, Timms P, Smith DW. Koala biovar of Chlamydia pneumoniae infects human and koala monocytes and induces increased uptake of lipids in vitro. Infect Immun. 2001;69(12):7894–7. doi:10.1128/iai.69.12.7894-7897.2001.
Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, Brockman W, et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotech. 2009;27(2):182–9. doi:10.1038/nbt.1523.
Christiansen MT, Brown AC, Kundu S, Tutill HJ, Williams R, Brown JR, et al. Whole-genome enrichment and sequencing of Chlamydia trachomatis directly from clinical samples. BMC Infect Dis. 2014. doi:10.1186/s12879-014-0591-3.
Bachmann NL, Sullivan MJ, Jelocnik M, Myers GSA, Timms P, Polkinghorne A. Culture-independent genome sequencing of clinical samples reveals an unexpected heterogeneity of infections by Chlamydia pecorum. J Clin Microbiol. 2015. doi:10.1128/jcm.03534-14.
Forsey T, Darougar S. Acute conjunctivitis caused by an atypical chlamydial strain: Chlamydia IOL 207. Br J Ophthalmol. 1984;68(6):409–11.
Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012. doi:10.1093/molbev/mss075.
Didelot X, Falush D. Inference of bacterial microevolution using multilocus sequence data. Genetics. 2007;175(3):1251–66. doi:10.1534/genetics.106.063305.
Read TD, Joseph SJ, Didelot X, Liang B, Patel L, Dean D. Comparative analysis of Chlamydia psittaci genomes reveals the recent emergence of a pathogenic lineage with a broad host range. MBio. 2013. doi:10.1128/mBio.00604-12.
Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006;23(2):254–67. doi:10.1093/molbev/msj030.
Roulis E, Polkinghorne A, Timms P. Chlamydia pneumoniae: modern insights into an ancient pathogen. Trends Microbiol. 2012. doi:10.1016/j.tim.2012.10.009.
Bodetti TJ, Timms P. Detection of Chlamydia pneumoniae DNA and antigen in the circulating mononuclear cell fractions of humans and koalas. Infect Immun. 2000;68(5):2744–7. doi:10.1128/IAI.68.5.2744-2747.2000.
Roulis E, Bachmann N, Polkinghorne A, Hammerschlag M, Kohlhoff S, Timms P. Draft genome and plasmid sequences of Chlamydia pneumoniae strain B21 from an Australian endangered marsupial, the western barred bandicoot. Genome Announcements. 2014. doi:10.1128/genomeA.01223-13.
Read TD, Myers GSA, Brunham RC, Nelson WC, Paulsen IT, Heidelberg J, et al. Genome sequence of Chlamydophila caviae (Chlamydia psittaci GPIC): examining the role of niche specific genes in the evolution of the Chlamydiaceae. Nucleic Acids Res. 2003;31(8):2134–47. doi:10.1093/nar/gkg321.
Rajaram K, Giebel AM, Toh E, Hu S, Newman JH, Morrison SG, et al. Mutational analysis of the Chlamydia muridarum plasticity zone. Infect Immun. 2015. doi:10.1128/iai.00106-15.
Jewett MW, Lawrence KA, Bestor A, Byram R, Gherardini F, Rosa PA. GuaA and GuaB Are Essential for Borrelia burgdorferi Survival in the Tick-Mouse Infection Cycle. J Bacteriol. 2009;191(20):6231–41. doi:10.1128/jb.00450-09.
Santiago AE, Mann BJ, Qin A, Cunningham AL, Cole LE, Grassel C, et al. Characterization of Francisella tularensis Schu S4 defined mutants as live-attenuated vaccine candidates. Pathog Dis. 2015. doi:10.1093/femspd/ftv036.
Heeres JEJE, Genootschap KNA. The part borne by the Dutch in the discovery of Australia 1606–1765. Luzac: Leiden : E.J. Brill; 1899.
Bennett S. Australian discovery and colonisation. Part II. Sydney: Hanson & Bennett; 1865.
Bodetti TJ, Viggers K, Warren K, Swan R, Conaghty S, Sims C, et al. Wide range of Chlamydiales types detected in native Australian mammals. Vet Microbiol. 2003;96(2):177–87. doi:10.1016/s0378-1135(03)00211-6.
Chacko A, Barker CJ, Beagley KW, Hodson MP, Plan MR, Timms P, et al. Increased sensitivity to tryptophan bioavailability is a positive adaptation by the human strains of Chlamydia pneumoniae. Mol Microbiol. 2014. doi:10.1111/mmi.12701.
Mitchell CM, Mathews SA, Theodoropoulos C, Timms P. In vitro characterisation of koala Chlamydia pneumoniae: morphology, inclusion development and doubling time. Vet Microbiol. 2009;136(1–2):91–9. doi:10.1016/j.vetmic.2008.10.008.
Cochrane M, Walker P, Gibbs H, Timms P. Multiple genotypes of Chlamydia pneumoniae identified in human carotid plaque. Microbiology. 2005;151:2285–90. doi:10.1099/mic.0.27781-0.
Rasmussen M, Guo X, Wang Y, Lohmueller KE, Rasmussen S, Albrechtsen A, et al. An aboriginal Australian genome reveals separate human dispersals into Asia. Science. 2011;334(6052):94–8. doi:10.1126/science.1211177.
Millman K, Black CM, Johnson RE, Stamm WE, Jones RB, Hook EW, et al. Population-based genetic and evolutionary analysis of Chlamydia trachomatis urogenital strain variation in the United States. J Bacteriol. 2004;186(8):2457–65. doi:10.1128/jb.186.8.2457-2465.2004.
Nunes A, Nogueira PJ, Borrego MJ, Gomes JP. Adaptive evolution of the Chlamydia trachomatis dominant antigen reveals distinct evolutionary scenarios for B- and T-cell epitopes: worldwide survey. PLoS One. 2010;5(10):e13171. doi:10.1371/journal.pone.0013171.
Shapiro BJ, Friedman J, Cordero OX, Preheim SP, Timberlake SC, Szabo G, et al. Population genomics of early events in the ecological differentiation of bacteria. Science. 2012;336(6077):48–51. doi:10.1126/science.1218198.
Sandoz KM, Rockey DD. Antibiotic resistance in Chlamydiae. Future Microbiol. 2010;5(9):1427–42. doi:10.2217/fmb.10.96.
Warford AL, Rekrut KA, Levy RA, Drill AE. Sucrose phosphate glutamate for combined transport of chlamydial and viral specimens. Am J Clin Pathol. 1984;81(6):762–4.
Sambrook J, Russell DW, Irwin N. Molecular cloning: a laboratory manual. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 2001.
Everett KDE, Bush RM, Andersen AA. Emended description of the order Chlamydiales, proposal of Parachlamydiaceae fam. nov. and Simkaniaceae fam. nov., each containing one monotypic genus, revised taxonomy of the family Chlamydiaceae, including a new genus and five new species, and standards for the identification of organisms. Int J Syst Bacteriol. 1999;49:415–40. doi:10.1099/00207713-49-2-415.
Campbell LA, Perez Melgosa M, Hamilton DJ, Kuo CC, Grayston JT. Detection of Chlamydia pneumoniae by polymerase chain reaction. Eur J Clin Microbiol. 1992;30(2):434–9.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–77. doi:10.1089/cmb.2012.0021.
Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics. 2008;9:75. doi:10.1186/1471-2164-9-75.
Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA, et al. Artemis: sequence visualization and annotation. Bioinformatics. 2000;16(10):944–5. doi:10.1093/bioinformatics/16.10.944.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi:10.1093/bioinformatics/btp324.
Grayston JT, Kuo CC, Campbell LA, Wang SP. Chlamydia pneumoniae sp. nov. for Chlamydia sp. Strain TWAR. Int J Syst Evol Microbiol. 1989;39(1):88–90. doi:10.1099/00207713-39-1-88.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–49.
Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059–66. doi:10.1093/nar/gkf436.
Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17(4):540–52.
Nelson DE, Taylor LD, Shannon JG, Whitmire WM, Crane DD, McClarty G, et al. Phenotypic rescue of Chlamydia trachomatis growth in IFN-gamma treated mouse cells by irradiated Chlamydia muridarum. Cell Microbiol. 2007;9(9):2289–98. doi:10.1111/j.1462-5822.2007.00959.x.
Rattei T, Ott S, Gutacker M, Rupp J, Maass M, Schreiber S, et al. Genetic diversity of the obligate intracellular bacterium Chlamydophila pneumoniae by genome-wide analysis of single nucleotide polymorphisms: evidence for highly clonal population structure. BMC Genomics. 2007. doi:10.1186/1471-2164-8-355.
Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56(4):564–77. doi:10.1080/10635150701472164.
Darling AE, Mau B, Perna NT. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5(6):e11147. doi:10.1371/journal.pone.0011147.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–402. doi:10.1093/nar/25.17.3389.
Schultz J, Milpetz F, Bork P, Ponting CP. SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci U S A. 1998;95(11):5857–64.
We would like to thank Val Asche, Susan Hutton and the Menzies School of Health in the Northern Territory for their initial work in isolating the Australian indigenous strains. We would also like to thank Avinash Kollipara for his assistance with cell culture propagation of the WA97001 strain. This work was funded by an Australian Government NHMRC grant (#APP1004666). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors declare that they have no competing interests.
ER performed the cell culture, DNA purification, sequence alignments, comparative genomics and phylogenetic analyses, data interpretation and drafted the manuscript. NB performed the assembly and read mapping of the genomes and assisted with drafting the manuscript. MH and GM performed the whole genome sequencing and assisted with data analysis. WH and AP aided in data interpretation and assisted with drafting the manuscript. PT conceived of the study, participated in the design and drafting of the manuscript, aided in data interpretation and critically revised the manuscript. All authors read and approved the final manuscript.
Cite this article
Roulis, E., Bachmann, N., Humphrys, M. et al. Phylogenetic analysis of human Chlamydia pneumoniae strains reveals a distinct Australian indigenous clade that predates European exploration of the continent. BMC Genomics 16, 1094 (2015). https://doi.org/10.1186/s12864-015-2281-y
- Chlamydia pneumoniae
- Whole genome sequencing
- Inclusion proteins
- Polymorphic membrane protein
- Inosine-monophosphate dehydrogenase