Defining natural species of bacteria: clear-cut genomic boundaries revealed by a turning point in nucleotide sequence divergence
© Tang et al.; licensee BioMed Central Ltd. 2013
Received: 24 January 2013
Accepted: 15 July 2013
Published: 18 July 2013
Bacteria are currently classified into arbitrary species, but whether they actually exist as discrete natural species has been unclear. To reveal genomic features that may unambiguously group bacteria into discrete genetic clusters, we carried out systematic genomic comparisons among representative bacteria.
We found that bacteria of Salmonella formed tight phylogenetic clusters separated by various genetic distances: whereas over 90% of the approximately four thousand shared genes had completely identical sequences among strains of the same lineage, the percentages dropped sharply to below 50% across the lineages, demonstrating the existence of clear-cut genetic boundaries by a steep turning point in nucleotide sequence divergence. Recombination assays supported the genetic boundary hypothesis, suggesting that genetic barriers had been formed between bacteria of even very closely related lineages. We found similar situations in bacteria of Yersinia and Staphylococcus.
Bacteria are genetically isolated into discrete clusters equivalent to natural species.
Bacteria are classified into species, which are organized into higher taxonomic ranks such as genera, families, orders, etc., based on levels of similarity among them. However, the definition of the fundamental taxonomic unit, the species, is still an unsolved issue. Over the past three centuries since their discovery, bacteria have been classified in numerous ways based on morphological, serological, biochemical or genetic properties, with the species being defined differently according to the method used for the classification. As a result, a bacterial pathogen may at one time be defined as an independent species and at another time as a variant of a species along with many other bacteria that share phenotypic or genetic similarities. For example, the human typhoid agent was originally treated as a species with the Latinized scientific name Salmonella typhi but was later re-classified as merely a serovar of another species, Salmonella enterica, together with over 2000 other “serovars” [1–3]; these 2000-plus serovars are mostly mild or non-pathogenic to humans and were, like S. typhi, initially also classified as separate species. Inclusion of the deadly human pathogen S. typhi in a species together with thousands of pathogenically very different bacteria has in fact caused enormous confusion in clinical as well as basic research settings. In addition to medicine, the recognition of natural bacterial species is also important for research and applications in industrial and agricultural areas. Essentially, all such confusion has resulted from the lack of a theory-based species concept and of a species definition supported by objective criteria.
Currently an expedient way is to categorize bacteria into taxonomic species by arbitrary cut-off values at 70% DNA-DNA association and 97% 16S rRNA sequence identity [4, 5]. However, since both kinds of data are continuous, the 70% and 97% criteria can hardly assign bacteria into discrete genetic groupings. More seriously, the wide ranges of genomic variation set by the 70% and 97% criteria would unavoidably classify a great diversity of phylogenetically different bacteria into the same species. Therefore, for a stable classification system that truly reflects the evolutionary relationships of bacteria, the basic taxonomic unit, i.e., species, needs to be defined on the basis of objective criteria that can assign bacteria into discrete genetic as well as biological clusters with clear-cut boundaries.
Previous work already suggests that bacteria exist in discrete clusters, as demonstrated by their distinct genome structures [6–8] and significantly reduced recombination efficiency among even very closely related bacteria, although it has been unclear whether the genetic isolation among the bacteria is “clear-cut”. Based on our earlier findings with Salmonella [10–12], we hypothesize that genetic boundaries may exist to isolate bacteria into phylogenetically discrete clusters equivalent to natural species. In this study, we use Salmonella as the primary model to explore the hypothesized genetic boundaries. We found sharp genetic distinctness among bacteria of closely related lineages and demonstrated the existence of an abrupt turning point in sequence divergence between any pair of Salmonella lineages compared. When we extended the work to other bacteria, including Yersinia and Staphylococcus, we found similar genetic boundaries. We propose that bacteria circumscribed by the genetic boundary be considered members of a natural species, and bacteria of a natural species should have cohesive genetic and biological attributes.
Genomic sequence comparison: high homogeneity and abrupt divergence within and across Salmonella lineages as molecular evidence of genetic boundaries
We used Salmonella as the primary model in this study mainly because of the close genetic relatedness [14, 15] and distinct biological properties [16–18] of these bacteria, in addition to the extraordinarily large number of lineages available for comparative studies. The serologically defined Salmonella types, called serotypes or serovars, may be monophyletic or polyphyletic. Examples of monophyletic Salmonella serotypes are those with antigenic formula 9,12:d:- for S. typhi and 1,2,12:a:[1,5] for S. paratyphi A. On the other hand, many Salmonella serotypes are polyphyletic, such as those with antigenic formula 6,7:c:1,5, which actually includes the diverse pathogens S. paratyphi C, S. choleraesuis and S. typhisuis, or 1,9,12:a:1,5, which can be differentiated into S. miami and S. sendai by biochemical assays. Even serotypes with the same name may contain multiple “lineages”, such as S. paratyphi B (1,4,,12:b:1,2), which can be divided into d-tartrate positive and negative lineages, with the former infecting a broad range of hosts and causing gastroenteritis and the latter infecting only humans and causing paratyphoid. The Salmonella strains compared in this study are either of monophyletic serotypes or representatives of individual lineages of polyphyletic serotypes according to our previous phylogenetic studies of these bacteria [7, 8, 19–22]. We compared the genomes of twenty-six strains from thirteen Salmonella lineages (Additional file 1: Table S1) to reveal potentially important genomic differences that may clearly distinguish the lineages on a phylogenetic basis. For this, we first identified genes common to these genomes. We found that all compared Salmonella genomes are indeed highly similar: the strains of different lineages share most of their genes, from 79% as between S. typhi and S. pullorum (3693 of the 4682 genes of S. pullorum RKS5078 are in common with genes of S. typhi Ty2) to 93% as between S. gallinarum and S. pullorum (4034 of the 4347 genes of S. gallinarum 287/91 are in common with genes of S. pullorum RKS5078; Additional file 2: Table S2). Within a lineage, this percentage may be lower or higher than 90% (Additional file 2: Table S2). As the percentage ranges of shared genes inside and across the Salmonella lineages are continuous or even overlapping, the hypothesized genetic boundaries among different Salmonella lineages were not supported in this regard.
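The shared-gene percentages quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name `shared_gene_percentage` is hypothetical and not part of the original analysis, and the only inputs are the gene counts stated in the text.

```python
def shared_gene_percentage(shared: int, total: int) -> float:
    """Percentage of a genome's genes that have a counterpart in the other genome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * shared / total


# Gene counts quoted in the text (see Additional file 2: Table S2 for the full set).
pullorum_vs_typhi = shared_gene_percentage(3693, 4682)       # S. pullorum RKS5078 vs S. typhi Ty2
gallinarum_vs_pullorum = shared_gene_percentage(4034, 4347)  # S. gallinarum 287/91 vs S. pullorum RKS5078

print(f"S. pullorum genes shared with S. typhi: {pullorum_vs_typhi:.1f}%")
print(f"S. gallinarum genes shared with S. pullorum: {gallinarum_vs_pullorum:.1f}%")
```

Rounded to whole numbers, these values correspond to the 79% and 93% given above.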
However, when we compared the levels of sequence identity between homologous genes, a drastic distinction stood out conspicuously, forming an acute turning point in sequence divergence between strains of a pair of lineages compared. Whereas within a particular lineage most of the genes had 100% sequence identity among independent strains, across different lineages the percentages of genes with 100% sequence identity dropped abruptly (Additional file 3: Table S3). With rare exceptions, the percentages of genes with 100% sequence identity were 85% or higher among strains of the same lineage and 12% or lower across the lineages (Additional file 4: Table S4). The exceptions were seen in the comparison of three lineages, S. enteritidis, S. gallinarum and S. pullorum, among which about 40% of the homologous genes have 100% sequence identity (Additional file 4: Table S4). Our explanation is that these three pathogens have not been diverging long enough to independently accumulate as many mutations. Nevertheless, clear-cut genetic boundaries have already been formed among them, delineating these three close relatives into distinct lineages.
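The quantity compared here, the fraction of shared genes with 100% sequence identity, can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the gene names and sequences are toy placeholders, the functions `identical_gene_fraction` and `classify_pair` are hypothetical, and the 0.85 default is simply one reading of the "85% or higher" within-lineage figure reported above.

```python
from typing import Dict


def identical_gene_fraction(genes_a: Dict[str, str], genes_b: Dict[str, str]) -> float:
    """Fraction of genes shared by two strains whose sequences are exactly identical."""
    shared = genes_a.keys() & genes_b.keys()
    if not shared:
        raise ValueError("the two strains share no genes")
    identical = sum(1 for gene in shared if genes_a[gene] == genes_b[gene])
    return identical / len(shared)


def classify_pair(fraction: float, within_min: float = 0.85) -> str:
    """Label a strain pair using an assumed cut-off (illustrative only)."""
    return "same lineage" if fraction >= within_min else "different lineages"


# Toy gene sets (placeholder names and sequences, not real Salmonella genes).
strain_a = {"g1": "ATGAAA", "g2": "ATGCCC", "g3": "ATGGGG", "g4": "ATGTTT"}
strain_b = {"g1": "ATGAAA", "g2": "ATGCCC", "g3": "ATGGGG", "g4": "ATGTTT"}
strain_c = {"g1": "ATGAAA", "g2": "ATGCCT", "g3": "ATGGGA", "g5": "ATGTAA"}

for label, other in (("a vs b", strain_b), ("a vs c", strain_c)):
    fraction = identical_gene_fraction(strain_a, other)
    print(f"{label}: {fraction:.2f} -> {classify_pair(fraction)}")
```

Only genes present in both strains enter the denominator, which mirrors the text's restriction to homologous genes; a real analysis would first need ortholog identification and alignment, which this sketch does not attempt.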
Genetic barriers assessed by DNA recombination assays between S. gallinarum and S. pullorum
The fowl pathogens S. gallinarum and S. pullorum have a common antigenic formula, 1,9,12:-:-, the former causing typhoid and the latter causing pullorum disease (dysentery). They are so closely related that, although originally treated as separate species, they have since the mid-1980s been classified into the same serovar of the same species and even the same subspecies (i.e., S. enterica subspecies enterica Serovar Gallinarum as separate biovars Gallinarum and Pullorum, respectively). However, their biological distinction (causing entirely different diseases) unambiguously indicates that they are different organisms (i.e., each being a natural species in its own right). Our recent work also reveals that the two pathogens have accumulated distinct sets of mutations including different pseudogenes [23, 24], further demonstrating genetic divergence of the two Salmonella lineages. Therefore, the existence of genetic barriers, if experimentally validated, would further support the genetic boundary hypothesis and facilitate the establishment of objective criteria for defining natural species of bacteria. Otherwise, the genetic boundary concept would need reconsideration.
We used the bacteriophage P22 to move DNA between S. pullorum and S. gallinarum by generalized transduction as previously described. We first moved the Tn10-inserted ompD159 gene from S. typhimurium LT2 [16, 26] to four S. pullorum strains RKS5078 [23, 27], CDC1983-67, SARB51 and 04–6767, and four S. gallinarum strains 287/91, RKS5021, SGSC2293 and 91–29327 (see strain information at http://www.ucalgary.ca/~kesander). Then we moved the ompD159 gene from one of the eight strains to the other seven strains and repeated this process for all of the eight strains. When we inspected transductants on LB plates containing tetracycline and compared their numbers among the bacterial strains used as the recipients of the DNA carried by the P22 phage, we saw a general tendency toward differential efficiency in incorporating the same donor DNA between S. pullorum and S. gallinarum: transduction of S. pullorum recipients with DNA from S. pullorum resulted in larger numbers of transductants than with DNA from S. gallinarum and, similarly, transduction of S. gallinarum recipients with DNA from S. gallinarum resulted in larger numbers of transductants than with DNA from S. pullorum (Additional file 5: Table S5). To validate this observation and rule out the possibility that a particular genomic DNA segment or a particular bacterial strain might have given non-representative results, we used additional DNA segments (Tn10-inserted leu-1151, bio-102, oxrA2 and cysA1367, in addition to ompD159, which was also included in the second set of transduction experiments for comparison) and additional S. pullorum and S. gallinarum strains (Additional file 6: Table S6). Again, the transduction efficiency was lower in across-lineage combinations (i.e., S. pullorum or S. gallinarum as recipient to receive S. gallinarum or S. pullorum DNA) than in recipient-donor combinations of the same lineage (Additional file 6: Table S6).
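The within-lineage versus across-lineage comparison described above can be laid out as a simple grouping of transductant counts. The sketch below is an assumption-laden illustration: the counts are invented placeholders and are not the values in Additional files 5 and 6, the helper `split_counts` is hypothetical, and the study itself analysed the transduction data with SPSS rather than with this code.

```python
from statistics import mean

# Placeholder records: (donor lineage, recipient lineage, transductant count).
# The counts are invented for illustration and do NOT reproduce Tables S5 or S6.
records = [
    ("pullorum", "pullorum", 120),
    ("pullorum", "pullorum", 95),
    ("gallinarum", "gallinarum", 110),
    ("gallinarum", "gallinarum", 130),
    ("pullorum", "gallinarum", 40),
    ("gallinarum", "pullorum", 35),
]


def split_counts(rows):
    """Separate counts into within-lineage and across-lineage groups."""
    within = [n for donor, recipient, n in rows if donor == recipient]
    across = [n for donor, recipient, n in rows if donor != recipient]
    return within, across


within, across = split_counts(records)
ratio = mean(across) / mean(within)

print(f"mean within-lineage transductants: {mean(within):.1f}")
print(f"mean across-lineage transductants: {mean(across):.1f}")
print(f"across/within ratio: {ratio:.2f}")
```

A ratio below 1 would correspond to the reduced across-lineage efficiency reported in the text; with real data, a statistical test on replicate counts would be needed before drawing that conclusion.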
Salmonella lineages as discrete clusters of bacteria: phylogenetic distinction
Genome structure comparison of the Salmonella lineages: abrupt dissimilarity
Genetic boundaries in Yersinia and Staphylococcus
This study addresses one key question: do bacteria exist as discrete clusters or do they spread continuously across the whole phylogenetic spectrum or, asked in another way, do bacteria exist as natural species that are isolated by genetic boundaries into discrete phylogenetic clusters? This question is central to bacterial systematics or, in a sense, to biology, but so far there has been no evidence-based answer or experimentally testable hypothesis. As a result, bacteria have been classified into species by largely arbitrary cut-offs at 70% DNA-DNA association and 97% 16S rRNA sequence identity. Therefore, genera, families and higher taxonomic ranks based on the arbitrary species can only be arbitrary and do not necessarily reflect the natural relationships among the bacteria accurately. Through this study, we show that genetic boundaries circumscribing bacteria are objective and can be described digitally. Specifically, Salmonella lineages, such as S. typhi, S. typhimurium, S. gallinarum and S. pullorum analyzed in this study, may be defined as species, since clear-cut genetic boundaries have been unambiguously demonstrated among them. We also demonstrated the existence of similar clear-cut genetic boundaries in other bacteria exemplified by Yersinia and Staphylococcus.
The first line of evidence indicating the existence of genetic boundaries isolating bacteria into discrete phylogenetic clusters was in fact provided by physical analyses of bacterial genomes with PFGE techniques. For example, the cleavage sites of certain endonucleases such as XbaI and SpeI are highly conserved within a Salmonella lineage; of great significance, the conservation of cleavage sites disappears abruptly across the lineages, even between those as closely related as S. pullorum and S. gallinarum (see Figure 3). A plausible explanation for the genomic conservation within a bacterial lineage (to be defined as natural species) is that bacteria of a species occupy a niche not congruent with those of bacteria in other species; a subpopulation of this species may become dominant in the niche and purge other subpopulations of the same species, retaining a genome structure representative of the extant species. Phylogenetic analysis shows that strains of the same Salmonella lineage cluster very tightly together and that different Salmonella lineages are clearly isolated by certain evolutionary distances on the genealogical tree as a result of independent accumulation of nucleotide variations over long evolutionary times.
To look into the molecular basis of the genetic boundaries isolating bacteria into natural clusters (i.e., species), we carried out systematic comparisons of the sequenced Salmonella genomes and found a sharp drop in the number of genes that have 100% sequence identity across the Salmonella lineages, a finding that further demonstrates the rather rigid genetic isolation among the Salmonella lineages. Genetic isolation between Salmonella lineages was also demonstrated by reduced recombination efficiency between S. gallinarum and S. pullorum, which are the most closely related Salmonella lineages so far known. We postulated the existence of genetic barriers among different Salmonella lineages, but the molecular basis is still largely unknown. Although the mismatch repair system may account for at least part of the genetic boundaries as a kind of barrier against gene flow between bacteria of different species [9, 30–32], the genetic boundaries are evidently formed by multiple factors, many of which are yet to be identified.
We conclude that bacteria exist as discrete biological clusters to be called natural (rather than arbitrary) species; the genetic boundaries separating the bacterial species can be described digitally and may be used for the establishment of objective and universal criteria to define bacterial species. Salmonella lineages are separated by clear-cut genetic boundaries and therefore each should be re-classified as a species. Our method requires whole genome sequences from only representative bacterial strains; most others can be compared by PFGE analyses.
Bacterial strains and phages
Information on the Salmonella strains used in this study can be found at the Salmonella Genetic Stock Center (http://www.ucalgary.ca/~kesander/). Bacteriophage P22 (HT105/1 int-201) was routinely grown on S. typhimurium LT2 and was used in the transduction experiments.
Enzymes and chemicals
I-CeuI, XbaI and SpeI were purchased from New England Biolabs, and proteinase K was from Roche. The other reagents were mainly from Sigma.
PFGE analyses of genomic DNA
Bacteriophage-mediated transduction experiments
Genomic and statistics analysis tools
Phylogenetic tree construction was done with MEGA4.0.2 and CLUSTALW. The statistical analyses of transduction data were performed by using software SPSS v20.
This work was supported by a Heilongjiang Innovation Endowment Award for graduate studies (YJSCX2012-197HLJ) to LT; a grant of the National Natural Science Foundation of China (NSFC30970078) and a grant of the Natural Science Foundation of Heilongjiang Province of China to GRL; and grants of the National Natural Science Foundation of China (NSFC30970119, 81030029, 81271786, NSFC-NIH 81161120416) and a Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP, 20092307110001) to SLL.
- Kauffmann F, Edwards PR: A revised, simplified Kauffmann-White schema. Acta Pathol Microbiol Scand. 1957, 41 (3): 242-246.
- Le Minor L, Popoff MY: Designation of Salmonella enterica sp. nov., nom. rev., as the type and only species of the genus Salmonella. Int J Syst Bacteriol. 1987, 37: 465-468. 10.1099/00207713-37-4-465.
- Popoff MY, Le Minor LE: Genus XXXIII. Salmonella. Bergey’s Manual of Systematic Bacteriology. Edited by: Brenner DJ, Krieg NR, Stanley JT. 2005, Springer, 764-799. 2
- Stackebrandt E, Frederiksen W, Garrity GM, Grimont PA, Kampfer P, Maiden MC, Nesme X, Rossello-Mora R, Swings J, Truper HG, et al: Report of the ad hoc committee for the re-evaluation of the species definition in bacteriology. Int J Syst Evol Microbiol. 2002, 52 (Pt 3): 1043-1047.
- Stackebrandt E, Goebel BM: Taxonomic note: a place for DNA-DNA reassociation and 16S rDNA sequence analysis in the present species definition in bacteriology. Int J Syst Bacteriol. 1994, 44: 846-849. 10.1099/00207713-44-4-846.
- Liu SL, Hessel A, Sanderson KE: The XbaI-BlnI-CeuI genomic cleavage map of Salmonella typhimurium LT2 determined by double digestion, end labelling, and pulsed-field gel electrophoresis. J Bacteriol. 1993, 175 (13): 4104-4120.
- Liu SL, Sanderson KE: Genomic cleavage map of Salmonella typhi Ty2. J Bacteriol. 1995, 177 (17): 5099-5107.
- Liu SL, Schryvers AB, Sanderson KE, Johnston RN: Bacterial phylogenetic clusters revealed by genome structure. J Bacteriol. 1999, 181 (21): 6747-6755.
- Zahrt TC, Mora GC, Maloy S: Inactivation of mismatch repair overcomes the barrier to transduction between Salmonella typhimurium and Salmonella typhi. J Bacteriol. 1994, 176 (5): 1527-1529.
- Liu WQ, Feng Y, Wang Y, Zou QH, Chen F, Guo JT, Peng YH, Jin Y, Li YG, Hu SN, et al: Salmonella paratyphi C: genetic divergence from Salmonella choleraesuis and pathogenic convergence with Salmonella typhi. PLoS One. 2009, 4 (2): e4510. 10.1371/journal.pone.0004510.
- Feng Y, Liu S-L: Pathogenic Salmonella. Omics, Microbial Modeling, and Technologies for Food-borne Pathogens. Edited by: Yan X, Juneja V, Fratamico PM, Smith J. 2011, Lancaster, Pennsylvania, USA: DEStech Publications, Inc, 43-68.
- Feng Y, Liu W-Q, Sanderson KE, Liu S-L: Comparison of Salmonella genomes. Salmonella from genome to function. Edited by: Porwollik S. 2011, Norfolk: Caister Academic Press, 49-67.
- Tang L, Liu SL: The 3Cs provide a novel concept of bacterial species: messages from the genome as illustrated by Salmonella. Antonie Van Leeuwenhoek. 2012, 101 (1): 67-72. 10.1007/s10482-011-9680-0.
- Crosa JH, Brenner DJ, Ewing WH, Falkow S: Molecular relationships among the Salmonelleae. J Bacteriol. 1973, 115 (1): 307-315.
- Le Minor L, et al: Genus III. Salmonella. Bergey’s Manual of Systematic Bacteriology. 1984, Baltimore: Williams & Wilkins, 427-458.
- McClelland M, Sanderson KE, Spieth J, Clifton SW, Latreille P, Courtney L, Porwollik S, Ali J, Dante M, Du F, et al: Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature. 2001, 413 (6858): 852-856. 10.1038/35101614.
- Parkhill J, Dougan G, James KD, Thomson NR, Pickard D, Wain J, Churcher C, Mungall KL, Bentley SD, Holden MT, et al: Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature. 2001, 413 (6858): 848-852. 10.1038/35101607.
- Parry CM, Hien TT, Dougan G, White NJ, Farrar JJ: Typhoid fever. N Engl J Med. 2002, 347 (22): 1770-1782. 10.1056/NEJMra020201.
- Liu SL, Hessel A, Sanderson KE: Genomic mapping with I-Ceu I, an intron-encoded endonuclease specific for genes for ribosomal RNA, in Salmonella spp., Escherichia coli, and other bacteria. Proc Natl Acad Sci U S A. 1993, 90 (14): 6874-6878. 10.1073/pnas.90.14.6874.
- Liu SL, Hessel A, Cheng HY, Sanderson KE: The XbaI-BlnI-CeuI genomic cleavage map of Salmonella paratyphi B. J Bacteriol. 1994, 176 (4): 1014-1024.
- Liu SL, Sanderson KE: I-CeuI reveals conservation of the genome of independent strains of Salmonella typhimurium. J Bacteriol. 1995, 177 (11): 3355-3357.
- Liu SL, Sanderson KE: The chromosome of Salmonella paratyphi A is inverted by recombination between rrnH and rrnG. J Bacteriol. 1995, 177 (22): 6585-6592.
- Feng Y, Xu HF, Li QH, Zhang SY, Wang CX, Zhu DL, Cao FL, Li YG, Johnston RN, Zhou J, et al: Complete genome sequence of Salmonella enterica serovar pullorum RKS5078. J Bacteriol. 2012, 194 (3): 744. 10.1128/JB.06507-11.
- Feng Y, Johnston RN, Liu GR, Liu SL: Genomic comparison between Salmonella Gallinarum and Pullorum: differential pseudogene formation under common host restriction. PLoS One. 2013, 8 (3): e59427. 10.1371/journal.pone.0059427.
- Liu SL: Physical mapping of Salmonella genomes. Methods Mol Biol. 2007, 394: 39-58. 10.1007/978-1-59745-512-1_3.
- Liu SL, Sanderson KE: A physical map of the Salmonella typhimurium LT2 genome made by using XbaI analysis. J Bacteriol. 1992, 174 (5): 1662-1672.
- Liu GR, Rahn A, Liu WQ, Sanderson KE, Johnston RN, Liu SL: The evolving genome of Salmonella enterica serovar Pullorum. J Bacteriol. 2002, 184 (10): 2626-2633. 10.1128/JB.184.10.2626-2633.2002.
- Thomson NR, Clayton DJ, Windhorst D, Vernikos G, Davidson S, Churcher C, Quail MA, Stevens M, Jones MA, Watson M, et al: Comparative genome analysis of Salmonella Enteritidis PT4 and Salmonella Gallinarum 287/91 provides insights into evolutionary and host adaptation pathways. Genome Res. 2008, 18 (10): 1624-1637. 10.1101/gr.077404.108.
- McClelland M, Jones R, Patel Y, Nelson M: Restriction endonucleases for pulsed field mapping of bacterial genomes. Nucleic Acids Res. 1987, 15 (15): 5985-6005. 10.1093/nar/15.15.5985.
- Chen F, Liu WQ, Eisenstark A, Johnston RN, Liu GR, Liu SL: Multiple genetic switches spontaneously modulating bacterial mutability. BMC Evol Biol. 2010, 10: 277. 10.1186/1471-2148-10-277.
- Chen F, Liu WQ, Liu ZH, Zou QH, Wang Y, Li YG, Zhou J, Eisenstark A, Johnston RN, Liu GR, et al: mutL as a genetic switch of bacterial mutability: turned on or off through repeat copy number changes. FEMS Microbiol Lett. 2010, 312 (2): 126-132. 10.1111/j.1574-6968.2010.02107.x.
- Gong J, Liu WQ, Liu GR, Chen F, Li JQ, Xu GM, Wang L, Johnston RN, Eisenstark A, Liu SL: Spontaneous conversion between mutL and 6 bpDeltamutL in Salmonella typhimurium LT7: association with genome diversification and possible roles in bacterial adaptation. Genomics. 2007, 90 (4): 542-549. 10.1016/j.ygeno.2007.06.009.
- Liu SL, Hessel A, Sanderson KE: The XbaI-BlnI-CeuI genomic cleavage map of Salmonella enteritidis shows an inversion relative to Salmonella typhimurium LT2. Mol Microbiol. 1993, 10 (3): 655-664. 10.1111/j.1365-2958.1993.tb00937.x.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.