Phylogenetic relationship and virulence inference of Streptococcus Anginosus Group: curated annotation and whole-genome comparative analysis support distinct species designation
© Olson et al.; licensee BioMed Central Ltd. 2013
Received: 11 June 2013
Accepted: 9 December 2013
Published: 17 December 2013
The Streptococcus Anginosus Group (SAG) represents three closely related species of the viridans group streptococci recognized as commensal bacteria of the oral, gastrointestinal and urogenital tracts. The SAG also cause severe invasive infections, and are pathogens during cystic fibrosis (CF) pulmonary exacerbation. Little genomic information or description of virulence mechanisms is currently available for SAG. We conducted intra- and inter-species whole-genome comparative analyses with 59 publicly available Streptococcus genomes and seven in-house closed, high-quality finished SAG genomes: S. constellatus (3), S. intermedius (2), and S. anginosus (2). For each SAG species, we sequenced at least one numerically dominant strain from CF airways recovered during acute exacerbation and an invasive, non-lung isolate. We also evaluated microevolution that occurred within two isolates that were cultured from one individual one year apart.
The SAG genomes were most closely related to S. gordonii and S. sanguinis, based on shared orthologs, and harbor a similar number of proteins within each COG category as other Streptococcus species. Numerous characterized Streptococcus virulence factor homologs were identified within the SAG genomes, including adherence, invasion and spreading factors, LPxTG cell wall proteins, and two-component histidine kinases known to be involved in virulence gene regulation. Mobile elements, primarily integrative conjugative elements and bacteriophage, account for greater than 10% of the SAG genomes. S. anginosus was the most variable species sequenced in this study, yielding both the smallest and the largest SAG genomes, containing multiple genomic rearrangements, insertions and deletions. In contrast, within the S. constellatus and S. intermedius species, there was extensive continuous synteny, with only slight differences in genome size between strains. Within S. constellatus we were able to determine important SNPs and changes in VNTR numbers that occurred over the course of one year.
The comparative genomic analysis of the SAG clarifies the phylogenetics of these bacteria and supports the distinct species classification. Numerous potential virulence determinants were identified and provide a foundation for further studies into SAG pathogenesis. Furthermore, the data may be used to enable the development of rapid diagnostic assays and therapeutics for these pathogens.
Keywords: Streptococcus Milleri group; Streptococcus Anginosus group; Streptococcus anginosus; Streptococcus constellatus; Streptococcus intermedius; Phylogenetics; Virulence; Comparative genomics; Whole-genome sequencing
The genus Streptococcus consists of Gram-positive cocci that are divided into sub-groups via numerous biochemical and molecular methods. The majority of Streptococcus species can be classified as either β-hemolytic, causing complete zones of lysis on blood agar plates, or α-hemolytic, forming green zones due to oxidation of hemoglobin by hydrogen peroxide to a green methemoglobin. Lancefield typing (based on specific carbohydrates within the bacterial cell wall) provides groupings that do not necessarily follow recognized species. The most clinically important are S. pyogenes, known as Lancefield Group A Streptococcus (GAS); S. agalactiae, Lancefield Group B Streptococcus (GBS); Lancefield group O (S. pneumoniae, S. mitis and S. pseudopneumoniae); and the variable Lancefield group species belonging to the Streptococcus Anginosus Group (SAG; also referred to as the Streptococcus Milleri Group, primarily by clinicians), which includes strains that are non-typeable by Lancefield typing as well as strains that are Lancefield group C, F and G. The majority of α-hemolytic streptococci are non-pyogenic, including the viridans group Streptococcus (VGS = Anginosus, Mitis and Salivarius groups), Mutans, and S. suis, which is a species that has not been assigned to a group. VGS are considered to be part of the normal microbiota in the human oropharyngeal, urogenital and gastrointestinal tracts. Many of the VGS are classified as α-hemolytic based on their activity on standard sheep’s blood agar. However, some strains can be β-hemolytic, including S. anginosus and S. constellatus strains that have been shown to produce a streptolysin S-like protein. S. intermedius produces β-hemolysis on human blood due to a human-specific hemolysin called intermedilysin and may also be β-hemolytic on sheep blood agar. Many other VGS behave similarly and also show β-hemolytic activity under anaerobic conditions, but not under the aerobic conditions in which these assays are usually conducted (Surette and Teal, unpublished data).
The taxonomic grouping of the SAG has historically been debated and the definitions have ranged from that of one to three species (with or without subspecies). The validity of S. anginosus (SA), S. intermedius (SI) and S. constellatus (SC) as individual species has been addressed through phenotypic analysis, DNA-DNA hybridization studies and genetic characterization, and currently there is little debate that there are at least three distinct species with additional subspecies [6, 7]. A recent study based on seven core housekeeping genes found that the SAG consist of three species, with S. constellatus divided into three subspecies [subsp. constellatus (SCC), subsp. pharyngis (SCP) and subsp. viborgensis] and S. anginosus divided into two subspecies [subsp. anginosus (SAA) and subsp. whileyi (SAW)]. The SAG are phenotypically diverse but most strains share common characteristics such as slow growth rate, a distinctive ‘caramel smell’, the ability to hydrolyze arginine, acetoin production from glucose, and an inability to ferment sorbitol. Lancefield sero-grouping is variable, with SA Lancefield types of A, C, F or G, while SC is typically Lancefield C, F or no antigen, and SI is generally not typeable using the Lancefield method. Almost half of all human SAG clinical isolates are Lancefield F type. Due to this phenotypic variability, molecular methods must be used for proper classification of SAG.
The SAG are part of the microbiota of the respiratory, gastrointestinal, and genitourinary tract with variable carriage levels . The SAG are also medically important for their ability to cause suppurative infections and have been isolated from numerous body sites [10, 11]. Of particular interest, SAG has been identified as the most common organism isolated from brain abscess [12, 13], liver abscess  and empyema [13, 15]. Their capacity to elicit pulmonary exacerbation and contribute to disease pathology in CF has also been demonstrated [14, 16, 17]. However, the exact mechanism for virulence within SAG has yet to be determined.
Although many Streptococcus species have the ability to cause disease, virulence studies within the Streptococcus genus often focus on GAS, GBS and S. pneumoniae [18–20]. SAG virulence and pathogenesis mechanisms have not yet been well studied; however, mechanisms have been identified within SAG that allow for the invasion of host cells, evasion of host immune activity, spreading, and colonization of host tissues. Intermedilysin is a cholesterol-dependent cytolysin produced by all SI strains that demonstrates specificity for human erythrocytes, and is essential for invasion of human cells by SI. S. constellatus and SA also exhibit the β-hemolytic phenotype on sheep’s blood agar; this hemolytic activity has been attributed to the cytolytic factor Streptolysin S-like peptides encoded by the sagA gene within a sag operon. Mutation of the luxS gene has also been shown to decrease hemolytic activity in SI. Capsules allow evasion of the host immune system, and encapsulated SAG strains have been isolated that have a greater virulence potential than non-encapsulated strains. They are more likely to cause larger abscesses, earlier spontaneous drainage and death in mice compared to non-encapsulated strains. Hyaluronan (HA) is a major component of the extracellular matrix of human connective tissue and is expressed by many cell types. HA upregulates the hyl gene, increasing SAG spreading and colonization within the host. Most SAG isolates have both hyaluronidase and chondroitin sulfatase activity. A detailed analysis of SAG virulence targets is required to achieve a firm understanding of the overall virulence potential within SAG.
The number of sequenced bacterial genomes has exploded with the advent of new sequencing technologies, which has allowed for comparative genomic analysis. Using Roche GS20 and Illumina technologies, we have sequenced to closure, sequence-polished and fully annotated seven SAG genomes, including representatives of each species. With the abundance of sequenced strains within the Streptococcus genus, we conducted a detailed comparative analysis utilizing 66 streptococcal genomes, including representatives from the SAG, Mitis, Pyogenic, Salivarius, Bovis and Mutans groups and S. suis strains. Such genomic comparisons allowed for detailed characterization, with insights gained into SAG phylogenomics, core genome, virulence potential, horizontally transferred genetic material, and microevolution within the host.
Results and discussion
An introduction to the SAG genome structure
[Table: Background information for SAG strains used in this study. Recovered column header: Extent of SAG disease. Recovered entries: broncho-pulmonary, septic arthritis, osteomyelitis, pyomyositis; total hip arthroplasty. Remaining cells not recovered.]
[Table: Summary of genome characteristics for sequenced SAG. Recovered column headers: G + C%; Avg length CDS (nt). Remaining cells not recovered.]
The G + C content for SAG strains ranged from 37.56 to 38.97% (avg 38.14, n = 18), with the overall G + C content increasing from SI to SC to SA (Table 2). Overall, the average of 38.14% is similar to the 38.57% average G + C content for all Streptococcus strains analyzed (Additional file 3). None of the SAG strains contained plasmid DNA similar to other Streptococcus [28–30]. All seven in-house sequenced SAG contained four rRNA operons and 58, 59 and 60 tRNA genes in SA, SC and SI respectively, with the majority of tRNA genes situated around rRNA operons, as seen with other sequenced Streptococcus strains. With regard to the number of tRNA and rRNA genes found within the finished genomes, SAG were most similar to S. mitis and S. suis, having fewer of both these RNA genes than streptococci in the pyogenic, mutans, bovis and salivarius groups [28, 29, 32–35].
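As a minimal sketch of how a per-strain G + C percentage and its average could be computed, the following uses hypothetical toy sequences (not SAG data). The function name `gc_percent` and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not the method used in the study.

```python
def gc_percent(sequence: str) -> float:
    """Return G + C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are ignored (assumption)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


def mean(values) -> float:
    """Arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy sequences standing in for whole genomes.
toy_genomes = {"strain_1": "ATGCGCATTA", "strain_2": "ATATATGCNN"}
per_strain = {name: gc_percent(seq) for name, seq in toy_genomes.items()}
average_gc = mean(per_strain.values())
```

For real genomes the same function would be applied to the concatenated replicon sequence of each strain before averaging across strains.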
The SAG strains had on average 1800 CDSs (coding DNA sequences; pseudogenes not included) with an average length of 934 bp, while the streptococcal average was 1943 CDSs with an average size of 893 bp. This difference in number of CDSs may be due in part to the differing size of streptococcal genomes, but it may also be partially due to the extra care taken to manually annotate the in-house SAG genomes, resulting in a higher number of pseudogenes. This has been demonstrated using S. pyogenes MGAS315, MGAS8232, SF370 and SSI-1, which in the original annotation had zero pseudogenes [29, 32–34], but when retrospectively examined for the presence of pseudogenes were shown to have 42, 50, 60 and 51 respectively. The seven in-house sequenced SAG genomes harbor 16 to 80 pseudogenes depending on the strain, with the most found in SAW C238 and the fewest in SI B196 (Table 2). When pseudogenes are included in the total CDSs, the average number of CDSs increases to 1855. An example of how sequence errors or pseudogenes can affect the average CDS size is seen in the whole-genome shotgun sequence for SCP SK1060. This genome had an average CDS size of 754 bp (Table 2); however, a closer look at this genome revealed a large number of essential genes that were truncated, including parC, gyrA, rpoA, uvrA, dnaG, dnaE, and ftsA. The lack of extra genetic material in SCP SK1060 compared to other SC strains (Figure 1A) also shows that the extra CDSs were created by sequence errors and not by novel genetic material. In this study, even after manual sequence verification, generating high quality finished genomes, all seven in-house sequenced SAG strains still contained predicted pseudogenes, defined by comparison with a full-length CDS present within another SAG strain from this study or within another genome in GenBank. This demonstrates the value of having single contig, high quality reference genomes for SAG to aid in the analysis of comparative genomic studies.
Accurate Phylogeny of SAG requires multiple genetic loci
With the increased number of genomic sequences available, the capacity for applying phylogenetic analysis methods has increased; however, there is an urgent need to evaluate their equivalency. Phylogenetic analysis using the single locus 16S rRNA has historically been the primary molecular method for determining species within the SAG and, indeed, this is how the seven in-house sequenced SAG strains were originally speciated. To assess the discriminatory power of 16S rRNA sequencing, we compared the results to those acquired from an in-house core-SNP pipeline, a multi-locus automated pipeline for phylogenomic analysis called AMPHORA, and two alternate single-locus targets, cpn60 and rpoB. For all analysis strategies, 66 sequenced streptococcal genomes were included in the SAG clustering analysis. In all iterations SI, SC, and SA strains clustered together by species, with one exception: SA F0211 clustered with the SC strains when using cpn60 as a reference.
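The in-house core-SNP pipeline is not described in this section, so the following is only a minimal sketch of the general idea behind a core-SNP comparison: keep alignment columns where every strain has an unambiguous base and at least two strains differ, then count pairwise differences over those columns. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def core_snp_columns(alignment: dict[str, str]) -> list[int]:
    """Return positions where all strains have A/C/G/T and at least two strains differ."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        bases = {s[i].upper() for s in seqs}
        # Skip columns with gaps or ambiguity codes; keep only variable columns.
        if bases <= set("ACGT") and len(bases) > 1:
            columns.append(i)
    return columns


def pairwise_snp_distances(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """Count differing core-SNP positions for every pair of strains."""
    cols = core_snp_columns(alignment)
    distances = {}
    for a, b in combinations(sorted(alignment), 2):
        distances[(a, b)] = sum(
            alignment[a][i].upper() != alignment[b][i].upper() for i in cols
        )
    return distances


# Hypothetical three-strain alignment of a shared (core) region.
toy_alignment = {"x": "ACGTAC", "y": "ACGTTC", "z": "AC-TTG"}
snp_positions = core_snp_columns(toy_alignment)
distances = pairwise_snp_distances(toy_alignment)
```

A distance matrix of this kind could then be passed to a tree-building method; that step is outside the scope of the sketch.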
Global ortholog analysis for SAG within the genus Streptococcus
A total genetic content comparison was conducted to determine the Streptococcus pan- and core genomes via ortholog comparison between 66 Streptococcus reference strains using OrthoMCL. A total of 11587 orthologous groups were identified within the 66 Streptococcus strains, with 7669 ortholog groups absent within SAG (Additional file 4). In a similar study using 11 species and 45 strains of Streptococcus, 9053 orthologous groups were identified, with 7442 absent within the S. dysgalactiae group. The differences in numbers are due to the increased number of species (16) and strains (66) used in the present study, which is the largest study of this type done for Streptococcus. The Streptococcus pan-genome increases by an average of 45 genes for each of the 21 additional strains used. Most genes are from the five Streptococcus species newly included in this study. This shows that the Streptococcus pan-genome should still be considered ‘open’, with new genes added as additional genomes are analyzed.
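The pan- and core-genome bookkeeping described above can be sketched with plain set operations. This assumes the ortholog clustering has already been reduced to a mapping from strain name to the set of ortholog group identifiers it contains; the strain names and group identifiers below are hypothetical, and nothing here reproduces OrthoMCL itself.

```python
def pan_and_core(groups_by_strain: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Pan-genome = groups in any strain; core genome = groups in every strain."""
    strain_sets = list(groups_by_strain.values())
    pan = set().union(*strain_sets)
    core = set.intersection(*strain_sets)
    return pan, core


def absent_from(pan: set[str], groups_by_strain: dict[str, set[str]], members: list[str]) -> set[str]:
    """Ortholog groups of the pan-genome found in none of the listed strains."""
    present = set().union(*(groups_by_strain[m] for m in members))
    return pan - present


# Hypothetical presence data: three strains, five ortholog groups.
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}
pan, core = pan_and_core(toy)
missing_in_AB = absent_from(pan, toy, ["strain_A", "strain_B"])
```

Applied to the figures reported above, 11587 groups in total with 7669 absent from SAG would leave 11587 − 7669 = 3918 groups present in at least one SAG strain.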
Comparison of individual species within SAG revealed that SC had the most core genes (1617), followed by SA (1323) and SI (1316). This is consistent with other species of Streptococcus: S. thermophilus (1271), S. pneumoniae (1619) and S. agalactiae (1806). Within SAG the average percentage of core genes is 75.23% for SC, 70.7% for SA and 72.9% for SI. The values for SAG are similar to those found for other Streptococcus species, including S. pyogenes (80.3%), S. agalactiae (79.1%) and S. thermophilus (81.4%).
[Table: Comparison of gene content of SAG to clinically important Streptococcus species. Recovered column headers: Average core genes; # of SAG core genes not found; Total SAG genes analyzed; % of SAG core genes found. Recovered rows: S. gordonii (a), S. sanguinis (b), S. mitis (c), S. pneumoniae (d), S. suis (e), S. agalactiae (f), S. pyogenes (g), S. mutans (h). Cell values and footnotes a to h not recovered.]
Differences in protein functional classifications within SAG
When SAG are compared to S. gordonii and S. sanguinis, there are only a few noticeable differences in the number of proteins within different COG categories (Figure 4B). Within SA there is an increase in replication, recombination, and repair proteins for SAW C238 (172) and SAW CCUG39159 (159), compared to an average of 117 for the other strains in Figure 4B. This difference is likely due to the increase in phage-related proteins within the recently identified SAW subspecies, which may have an increased propensity for the uptake of foreign DNA (largest SAG genomes in this study) as compared to SAA (Figure 1A). There is a decrease in amino acid transport and metabolism proteins (Group E) and poorly characterized proteins (Groups R and S) in SAG (112 and 472) compared to S. gordonii (173 and 570) and S. sanguinis (202 and 620). Some of these differences are due to the size of the genomes, as the genomes for both S. gordonii and S. sanguinis are larger than the SAG genomes, accounting for some of the increase in proteins found. COG groups L, E, R and S are also known to be variable within the genus; in S. mutans, many proteins from these groups are present within the accessory genome. The analysis of these differences will provide insight into the species-specific characteristics for SAG.
Orthologs identified as SAG-unique signatures
[Table: Unique genes found in SAG strains, as determined by OrthoMCL analysis. Recovered column header: % G + C. Recovered product annotations, in order: conserved hypothetical protein; putative phosphoglycerate mutase; conserved hypothetical protein; conserved hypothetical protein; tagatose 1,6-diphosphate aldolase; conserved hypothetical protein; conserved hypothetical protein. Remaining cells not recovered.]
SAG virulence factor repertoire identified for future pathogenomics investigations
Five of the 55 virulence proteins newly identified within SAG are inferred to be adhesion proteins, including a fibronectin-binding protein (Fbp54) important in adhesion for S. gordonii; a S. pneumoniae and S. mitis surface adhesion protein (PsaA); a laminin-binding protein important for adherence in GBS; a pullulanase protein important in Streptococcus adhesion; and Streptococcus enolase, a strong plasminogen-binding protein.
Twelve loci involved in invasion or evasion from host proteins were also identified in SAG. All five proteins encoded by the Streptococcus invasion locus silA to silE, four capsule proteins (Cps19FL to O) from S. pneumoniae, hemolysin proteins from the S. agalactiae hemolysin loci cylZ and cylG, and the UDP-glucose pyrophosphorylase protein (HasC) were present among the conserved SAG virulence proteins. Finally, three regulatory proteins were present: a two-component response regulator, CsrR (CovR), known to regulate expression of extracellular carbohydrates in Streptococcus; a protein with homology to salivaricin-A (SalX), a bacteriocin known to inhibit growth of streptococci; and GAPDH, an immunomodulatory protein important in streptococcal colonization.
Some virulence proteins were present in more than one species of SAG, revealing that the SAG virulence repertoire may vary with strain. Homologs for the S. pneumoniae polysaccharide capsule operon (Cps4), four proteins from the hemolysin complex of S. agalactiae (Cyl) and a S. pneumoniae surface-expressed adhesion protein (PavB) were found in SI, SCC and some SA. Interestingly, only some SA (C238, CCUG39159 and SK52T) and SCP contained homologs to sagA through sagI, which form the streptolysin S cytolytic toxin complex of GAS. Streptolysin S is a strong cytolytic toxin that provides the ability for transepithelial migration, and sagA homologs have recently been shown to confer β-hemolytic activity in SA. The cytolytic function can be detected via β-hemolysis on sheep’s blood agar plates. Predictions of β-hemolysis based on genomic analysis (SAW C238 and SCP C818, C232 and C1050) were congruent with the β-hemolytic phenotype observed on sheep blood (results not shown).
There was also a homolog present in SAG to a hyaluronidase precursor protein (HylA) that provides the ability to survive on host hyaluronic acid as a sole carbon source, as well as aiding in bacterial spreading by allowing for detachment from biofilms. This gene was found in all SC, SI and SAW strains, and the presence of the HylA protein was confirmed through traditional phenotypic growth testing for in-house sequenced SAG strains.
Internalin A (InlA) is a major invasion protein from Listeria monocytogenes that mediates the attachment and invasion of hepatocytes by L. monocytogenes and is encoded by the inlA gene. A homolog to this gene has been found in Streptococcus spp. and was termed the streptococcal leucine-rich (Slr) protein. A homolog to this protein was identified in all SCP, SI and some SA (C238, 62CV and CCUG39159), and is highly conserved within SAG, having 97.4% nucleotide identity and 97.2 PID. Internalin A orthologs have also been identified in many other sequenced streptococcal species, including S. sanguinis VMC66 [GenBank: EFX94353.1], S. pyogenes MGAS8232 [GenBank: AAL97968.1] and S. agalactiae 2603 V/R [GenBank: NC_004116]. Within sequenced S. sanguinis, inlA is found inserted between pyrR and a hypothetical protein, while in S. pyogenes this gene is located between metK and birA and in S. agalactiae it is located between lepA and a histidine diad domain protein encoding gene. The region around the inlA gene is conserved in all SAG, with inlA inserted between pyrR and a conserved hypothetical protein as previously seen in S. sanguinis. In SAG without inlA, pyrR and the gene encoding the conserved hypothetical protein are present; thus it appears that inlA was lost or never integrated into some SA and SCC. In S. gordonii [GenBank: NC_009009], SA 62CV [GenBank: EFW07950.1] and SC SK1060 [GenBank: EGV09572.1], there are remnants of a truncated leucine-rich protein located next to pyrR, which shows that inlA has been gained and lost in some Streptococcus strains. It has been shown in L. monocytogenes that loss or truncation of the inlA gene causes decreased invasive ability. Similar results were shown for slr, where an isogenic GAS strain lacking slr was significantly less virulent in a mouse model and more susceptible to phagocytosis by human polymorphonuclear leukocytes.
Two virulence genes were identified in SI that were not found in any of the other SAG. Both genes have been previously identified: nanA, a sialidase A precursor that may influence host-bacterial interactions, and a pneumolysin-like protein (Ply), identified as intermedilysin, which is a human erythrocyte-specific cytotoxin [4, 21]. There were no virulence traits found to be specific to either SC or SA. Overall, the use of VirDB has identified numerous virulence genes within SAG that may allow SAG to adhere, invade and spread within the host.
Evaluation of colonization potential: a genetic look at SAG LPXTG proteins
Bacterial attachment to host cells is essential in host colonization. Colonization occurs through interactions between bacterial surface-exposed proteins and host cell receptors. One of the most common motifs found in bacterial cell surface-exposed proteins is the LPxTG motif. The motif anchors the C-terminus of these externally facing surface fibrillar proteins. These LPxTG motif proteins are covalently attached to the cell wall by sortases, with sortase A (SrtA) being the most common within streptococci. Many LPxTG motif proteins have been associated with virulence in Gram-positive microorganisms; however, many also have no known associated function [20, 68]. Each SAG strain had at least one srtA sortase ortholog, with all SI having two srtC orthologs. SA F0211 is the only SA strain to have multiple sortases, with a SrtA and two SrtC orthologs. Increased sortase function may increase virulence potential through assembly of surface structures.
A total of 58 LPxTG motif proteins were identified within SAG; however, only 50 of these proteins had a signal peptidase in addition to the LPxTG motif to allow expression on the cell surface. The number of LPxTG motif proteins in SAG strains, ranging from 14 to 22, was in the mid-range for Streptococcus spp. [28, 29, 32, 70]. SA had the most LPxTG motif proteins, with 21 and 22 for SAW C238 and C1051 respectively, while SC and SI ranged from 14 to 17 (Additional file 12). One LPxTG motif protein, hyaluronate lyase precursor protein (HylA), was found in all SI, SC and SAW strains. This protein is known to be an important virulence factor, as discussed above. Twelve of the LPxTG motif proteins had collagen-binding domains, the same number found in S. equi subsp. zooepidemicus, and thus could be involved in SAG virulence. Also identified were eight proteins important as either pili or in fibrinogen-binding. The presence of numerous potential adherence proteins is common for oral microbiota, and thus high numbers of these proteins within SAG are not unexpected.
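A simple way to screen predicted proteins for candidate LPxTG motifs is a regular-expression search restricted to the C-terminal end of each sequence. The sketch below is illustrative only: the 50-residue window is a hypothetical parameter, the example sequences are invented, and the additional signal-sequence check mentioned above is not implemented. It is not the method used in the study.

```python
import re

# L, P, any residue, T, G
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein: str, c_terminal_window: int = 50) -> list[int]:
    """Return 0-based start positions of LPxTG motifs within the C-terminal window."""
    seq = protein.upper()
    window_start = max(0, len(seq) - c_terminal_window)
    return [m.start() for m in LPXTG.finditer(seq) if m.start() >= window_start]


# Hypothetical sequences: one with a C-terminal motif, one with an N-terminal motif only.
anchored = "M" + "A" * 60 + "LPKTG" + "A" * 20
not_anchored = "LPKTG" + "A" * 100

anchored_hits = find_lpxtg(anchored)
not_anchored_hits = find_lpxtg(not_anchored)
```

Restricting the search to the C-terminus reflects the statement above that the motif anchors the C-terminus of surface proteins; a motif elsewhere in the sequence is ignored by this sketch.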
Genetic analysis of two component (histidine kinases/response regulator) systems (TCS)
[Table: Comparative summary of two component system (TCS) histidine kinases identified in SAG with other Streptococcus species. Recovered column headers: Equivalent TCS HK in S. pneumoniae; Linked to virulence; SAG strain found in; Best non-SAG BlastP results. Recovered entries for "SAG strain found in": all SAG except SA SK1138, SA F0211 and SA SK52T; SI and SA; SI and SA C1051; SI and SCC; SA C1051, SA 62CV and SA SK52T. Remaining cells and footnotes not recovered.]
Thirteen of the 14 TCS found in SAG were found to have orthologs in other species of Streptococcus, including nine orthologs to TCS that have been linked to virulence in S. pneumoniae. Of these nine, four are well characterized: VicRK, CiaRH, ComDE, and BlpR (Table 5). These TCS are important to S. pneumoniae virulence, quorum sensing, competence, bacteriocin production, and stress response [71, 77]. Four additional SAG TCS (designated herein as SAGTCS1, SAGTCS6, SAGTCS9, and SAGTCS11) show homology to less characterized TCS that are also important for virulence in S. pneumoniae [78, 79]. Another TCS sequence (SAGTCS13) is poorly characterized, showing the most similarity to a characterized TCS from S. mutans. SAGTCS13 has been disrupted in SCP and SA, except SK52T, CCUG39159 and C1051, leaving only a truncated portion of the HK with an absent cognate RR. This truncation appears to be due to the insertion of the SAG operon within these strains of SAG, as a complete HK/RR system is observed in all strains that lack the SAG operon; the importance of this TCS has not been well characterized. Finally, SAGTCS14 was only found in SA C1051, 62CV and SK52T, with homology to uncharacterized TCS from S. oralis, S. mutans, S. macacae and Lactobacillus salivarius. The importance of TCS in many aspects of virulence is well known and shows the virulence potential for SAG; however, more work is required to fully characterize TCS within SAG.
Potential natural immunity conveyed by clusters of regularly interspaced short palindromic repeats (CRISPRs) within SAG
Clusters of regularly interspaced short palindromic repeats (CRISPRs) contribute to bacterial immunity to invasion from foreign DNA such as bacteriophage and plasmid DNA . CRISPR analysis in SAG strains showed that seven of the 18 strains analyzed contained CRISPRs (Additional file 13). Three of the seven strains had more than one CRISPR for a total of 12 CRISPRs identified within SAG (Additional file 13). Based on the Cas1 protein all CRISPRs except for one that had a Cas1 protein were of the CRISPR subtype II-A also known as the NMENI subtype [80, 81].
The CRISPR SconSK53-2 was the only non-Type II-A CRISPR (identified as Type I-C). It is located next to an integrase gene and is composed of cas3, cas5, cas8, cas7, cas4, cas1 and cas2 genes from 5′ to 3′, followed by a CRISPR region with a DR of 32 bp and 19 spacers (Figure 6B). This type of CRISPR has been previously identified within Streptococcus, and showed the most similarity to CRISPR-associated loci from S. sanguinis, S. mutans, S. parasanguinis, and S. pyogenes. The localization next to an integrase gene, together with the high similarity (85 to 93% AA identity) to a similar region within S. mutans LJ23 (GenBank NC_017768) while the surrounding area shows less similarity (<78% AA identity), suggests that this region was horizontally transferred into SCC SK53T.
S. anginosus C1051 had a total of three CRISPR regions, with Sang1051-1 and Sang1051-3 both appearing to be degenerate. Sang1051-1 has cas1 and cas2 genes and a CRISPR region with 21 DR of 37 bases in length (Figure 6B), similar to Sthe2c from S. thermophilus LMD-9; however, its Cas1 protein has less than 65% AA identity to other Cas1 proteins, whereas all other Cas1 proteins have matches greater than 90% based on AA identity. The third CRISPR region, Sang1051-3, had three spacers of 28 bases (Additional file 13). Although neither CRISPR-1 nor CRISPR-3 had a full complement of CRISPR-associated proteins, the fact that CRISPR-1 had 21 spacers suggests that this CRISPR may still be functional. S. anginosus C1051 and SI B196 did not have any detectable prophage integrated into their genomes. This may be due to the presence of multiple CRISPR elements within these genomes.
For the 12 CRISPRs described above, there were a total of 201 spacers (Additional file 14). Of these 201, a total of 50 had greater than 80% identity over 50% of their sequence to a sequence in GenBank. However, none of the spacers from SAG showed 100% DNA identity to anything in the NCBI nucleotide database, which is significant as it has been shown that a 100% match is required for immunity to foreign DNA. A total of 11 spacers showed >90% nucleotide similarity to previously identified Streptococcus-specific phages, including five phages from S. pneumoniae (11865, 8140, V22, Cp-1 and EJ-1); two phages from S. pyogenes (phi370.2 and phiNIH1); Streptococcus phage C1, isolated from Group C Streptococcus; S. oralis phage PH10; S. gordonii phage PH15; and a phage from S. gallolyticus subsp. gallolyticus UCN34 (Additional file 14). All spacers were unique except for 16 spacers from SA F0211; this one spacer was located within each of the three CRISPRs identified for SA F0211. These spacers were similar to an integrative conjugative element (ICE) from Vibrio fluvialis Ind1 (ICEVflInd1) and showed similarity to genes encoding pili and conjugative machinery. It appears that F0211 had multiple encounters with one or multiple conjugative elements similar to ICEVflInd1 and has acquired a means to prevent integration of this type of ICE. Perhaps CRISPRs have played a role in the evolution of SAG and may be responsible for the lack of foreign genetic material seen in some in-house sequenced SAG strains. We are currently investigating whether SAG CRISPRs might be variable enough to be useful in a subtyping scheme.
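The spacer-matching thresholds described above can be expressed as a small classification step applied to alignment summaries. The sketch below assumes each database hit has already been reduced to three integers (spacer length, aligned length, identical positions); the `SpacerHit` record, the category labels, and the reading of "over 50% of their sequence" as "at least 50% of the spacer length" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpacerHit:
    spacer_id: str
    spacer_length: int   # full length of the spacer (nt)
    aligned_length: int  # length of the aligned region (nt)
    identical: int       # identical positions within the aligned region


def classify_hit(hit: SpacerHit) -> str:
    """Label a hit as 'exact', 'similar' or 'no_match' using integer arithmetic."""
    if hit.aligned_length <= 0 or hit.spacer_length <= 0:
        return "no_match"
    # Exact: every base of the spacer aligned and identical.
    if hit.identical == hit.aligned_length == hit.spacer_length:
        return "exact"
    identity_ok = hit.identical * 100 > 80 * hit.aligned_length       # > 80% identity
    coverage_ok = hit.aligned_length * 100 >= 50 * hit.spacer_length  # >= 50% of spacer
    return "similar" if identity_ok and coverage_ok else "no_match"


# Hypothetical hits for 30-nt spacers.
hits = [
    SpacerHit("s1", 30, 30, 30),
    SpacerHit("s2", 30, 20, 18),
    SpacerHit("s3", 30, 10, 10),
    SpacerHit("s4", 30, 30, 24),
]
labels = [classify_hit(h) for h in hits]
```

Under this scheme only hits labelled "exact" would meet the 100% match criterion stated above for immunity to foreign DNA.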
Extensive horizontal gene transfer and accessory gene content in SAG
[Table: SAG accessory regions. Recovered column headers: Genome size (nt); Combined size ARs (nt); AR% of Genome; Combined Size HGT ARs (nt); % of Genome. Cell values not recovered.]
Investigation of SAG natural competence
Uptake of naked DNA from the environment, known as natural competence, allows bacteria to survive and thrive in variable environmental conditions. Natural competence has been studied in-depth for the streptococci with S. pneumoniae serving as the model organism . The complexity of natural competence systems and diversity of streptococcal competence genes has made in-silico prediction of natural competence difficult . Genome transcriptomics and tiling microarrays have shown that there are 22 essential genes required for natural transformation in S. pneumoniae. Orthologs to 21 of the essential competence genes were identified in all seven in-house sequenced SAG (Additional file 15). For all remaining WGS strains, there were truncated or absent genes likely owing to sequence errors; however, this would have to be experimentally confirmed. The only homolog absent in all strains was comW, which is also missing in the naturally competent S. zooepidemicus, S. sanguinis and S. mutans suggesting that it may not be absolutely required for natural competence.
None of the SCP examined to date (n = 34) are naturally competent, but competence has been shown in SI (Lacroix, Grinwis and Surette, unpublished data). Sequence analysis has revealed that the lack of natural competence in SC may be due to an insertion within the comEA gene. The resulting truncation yields a gene product approximately half the length of the original ComEA, and when this gene is inactivated in S. pneumoniae, competence is abrogated. The detected insertion has been experimentally validated using Sanger sequencing. All SCP strains had a nine nt in-frame deletion in the comD gene. This has been found to be a feature of all SCP (n = 34), but it still produces a functional ComD (Lacroix, Grinwis and Surette, unpublished data). The only other difference seen in the complement of competence genes within SAG was the number of comX orthologs. For most Streptococcus strains there are two comX orthologs (comX1 and comX2), whereas for the SAG there were three comX loci in most strains. However, one copy of comX was truncated in SCP C1050 and in some of the publicly available WGS strains included in our analysis (Additional file 16). Further functional studies will have to be completed in SAG to identify and characterize which competence proteins are essential.
Microevolution within SAG
Two strains sequenced in this study (SCP C232 and C818) were cultured from the same individual almost one year apart. Each isolate was the numerically dominant strain present during a pulmonary exacerbation. Genomic comparisons revealed that these strains were almost identical, differing by only 18 SNPs (Additional file 17). Six SNPs were found within intergenic regions, three caused synonymous mutations and nine caused non-synonymous mutations. Two of the non-synonymous mutations introduced stop codons, resulting in two truncated genes within SCP C818; one is a putative ABC transporter (ATPase portion) while the other is a 3-dehydroquinate dehydratase (aroD). Mutations in aroD have been extensively characterized in Salmonella typhi and have been shown to render bacteria auxotrophic for aromatic amino acids, p-aminobenzoate (pABA) and 2,3-dihydroxybenzoate. This results in the inability to produce ubiquinone and menaquinone, causing cellular respiration defects as well as defects in the cell envelope. Notably, aromatic amino acids have been shown to be abundant in the sputum produced in the lungs of CF patients, which may therefore provide an adequate source of these essential nutrients. Further studies will be required to determine the importance of this aroD mutation within SCP C818. Two SNPs were located within a region of the rpoB gene in which mutations are known to cause rifampicin resistance. Finally, a second type of divergence between SCP C232 and C818 was found in tandem repeat and microsatellite regions, which are potential targets for multilocus variable number tandem repeat analysis (MLVA). A total of five regions showed increased or decreased copy numbers of tandem repeats ranging in size from four to 41 nt (Additional file 18). Of these five regions, three are located in truncated genes, whereas the other two are within non-coding regions. Further analysis of potential MLVA targets within SAG is warranted.
Although both these strains are highly similar, appreciating the evolution that occurred during a year in a CF lung is critical to understanding the fitness advantage required for chronic bacterial pathogens.
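The grouping of coding SNPs into synonymous, non-synonymous and stop-introducing changes described above can be illustrated with a short Python sketch. This is not the pipeline used in the study; the function name `classify_snp` and the toy sequence are hypothetical, and the sketch assumes a forward-strand coding sequence that starts in frame. Here stop-introducing changes are reported separately as "nonsense", whereas the text counts them among the non-synonymous mutations.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# (NCBI table 11, used for bacteria, assigns the same amino acids.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base change at 0-based position `pos` of an in-frame CDS."""
    offset = pos % 3                      # position of the SNP within its codon
    start = pos - offset                  # first base of the affected codon
    ref_codon = cds[start:start + 3].upper()
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"                 # premature stop codon, truncated product
    return "non-synonymous"


# Toy example (not data from the study): ATG CAA GGT encodes M-Q-G.
toy_cds = "ATGCAAGGT"
print(classify_snp(toy_cds, 5, "G"))  # CAA -> CAG
print(classify_snp(toy_cds, 4, "C"))  # CAA -> CCA
print(classify_snp(toy_cds, 3, "T"))  # CAA -> TAA
```

SNPs in intergenic regions fall outside any coding sequence and would simply be counted separately before this step.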
This study presented the analysis and comparison of the whole-genome sequences for the three species within SAG, important pathogens with the capacity to cause serious infections throughout the body. Having sequenced strains from both respiratory and invasive infections, we identified no clear differences in gene content between these types of isolates, suggesting that host factors, rather than bacteria-specific virulence factors, may promote certain types of infection. Only eight genes were detected that were uniquely common to all seven in-house sequenced SAG; these were also found in most SAG draft genomes that are currently publicly available. The comparison of SC, SI and SA strains revealed significant differences with respect to virulence factors, surface proteins and carriage of horizontal genetic elements, with SA showing the most intra-species variability. Horizontal gene transfer between SAG and other pathogens within their environment has clearly played a significant role in the evolution of species within SAG, and will need to be studied in far greater detail. The detailed comparison of microevolution within SAG has identified potential targets for molecular typing methods as well as potential research questions regarding survival of a bacterial pathogen within the lung of a CF patient. The generation of our high quality finished reference genomes for the seven in-house sequenced SAG strains will provide a valuable resource for the analysis of future SAG draft genomes. This comparative genomic analysis provides a key genetic framework for assessing and understanding the molecular events contributing to SAG pathogenesis.
Four respiratory SAG isolates were cultivated on McKay agar from adult patients at the time of a pulmonary exacerbation. The remaining three isolates were from infections at other body sites to provide a comparison between respiratory and non-respiratory (invasive) SAG strains. A total of seven SAG strains were sequenced, with at least one respiratory and one invasive strain from each of the three species within the SAG. Three SCP strains were sequenced: C232 and C818, both respiratory strains isolated from one individual almost a year apart, and C1050, an invasive isolate from a non-related individual. Two SI strains were sequenced: C270, a respiratory isolate, and B196, an invasive isolate obtained from a CF patient coinciding with a pulmonary exacerbation. Finally, two SA strains were sequenced: SAW C238, a respiratory isolate, and SAA C1051, an invasive isolate (Table 1). All isolates were obtained in accordance with the University of Calgary ethics approval, and written approval for participation in the study was obtained from all human subjects providing bacterial isolates.
Chromosomal DNA isolation
Strains were cultured on Columbia blood agar plates (CBAB) or Brain heart infusion (BHI) agar and incubated for 24 h at 37°C plus 5% CO2. A single colony was inoculated into 20 mL of BHI broth and incubated as above. These cultures were then centrifuged at 5000 × g for 30 min and washed three times with sterile PBS. Cells were re-suspended in 1 mL of sterile PBS and lysed by physical disruption using the MiniBeadbeater-8™ (BioSpec Products, Inc). DNA was purified from lysates using standard phenol:chloroform extraction and ethanol precipitation. All genomic preps were run on an agarose gel to ensure chromosomal integrity. Finally, the DNA was quantified using the Qubit® (Invitrogen, Burlington, ON).
Genome sequencing, assembly, and gap closure
Genomic DNA was sequenced using the Roche GS20 standard platform as per manufacturer’s protocols (454 Life Sciences, Branford, CT). For gap closure, fosmid libraries were created as per manufacturer’s protocols using the CopyControl™ Fosmid Library Production Kit (Epicentre Technologies, Madison, WI), and Sanger sequencing was done on selected clones. Traditional PCR was done using a proof-reading Taq polymerase (HiFi Platinum Taq; Invitrogen) as per manufacturer’s protocols. PCR products were purified using the QIAquick PCR purification kit (Qiagen, Mississauga, ON) and sequenced with an ABI3730XL capillary electrophoresis instrument (Applied Biosystems, Foster City, CA). After sequencing on the Roche GS20 genome sequencer, the raw reads were assembled using the Newbler assembler software package v22.214.171.124. After closure of the genome to a single contiguous sequence, the ori was located by performing a simple BLASTP search using dnaA as a reference.
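Once a reference position such as the start of dnaA has been located, a closed circular chromosome is commonly re-indexed so that the sequence begins at that position. The following Python sketch illustrates only this re-indexing step; the function name `rotate_to_origin` and the coordinate convention (0-based) are assumptions made for illustration, and the BLASTP search itself is not reproduced.

```python
def rotate_to_origin(genome, origin_start):
    """Rotate a circular sequence so that it begins at 0-based index `origin_start`.

    The sequence is treated as circular: bases upstream of the new start
    are moved to the end, so no sequence is lost or duplicated.
    """
    if not genome:
        return genome
    origin_start %= len(genome)           # tolerate coordinates past the end
    return genome[origin_start:] + genome[:origin_start]


# Toy example: the hypothetical reference gene starts at index 3.
print(rotate_to_origin("AAACCCGGG", 3))
```

If the reference gene lies on the reverse strand, the sequence would first need to be reverse-complemented; that case is not handled here.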
Sequence errors and SNP confirmation
Potential sequence errors attributed to homonucleotide runs were manually tested in SCP C232 using primers designed to flank the region of interest. PCR amplification was conducted using Invitrogen HiFi Platinum proof-reading Taq polymerase (Invitrogen), following manufacturer’s instructions, with 1 μM of each oligonucleotide and the following thermocycling conditions: 94°C for 5 min; 40 cycles of 94°C for 30 s, 55°C for 30 s and extension at 68°C for at least 30 s (time varied for larger fragments); and a final extension at 68°C for 5 min. All SNPs found between SCP C232 and C818 were confirmed by Sanger sequencing as described above. For all SAG strains other than SCP C232, a combination of standard sequencing and Illumina sequencing technology was applied to correct base-calling errors caused by homonucleotide runs. Illumina sequencing runs were completed at the Iowa State University DNA facility using an Illumina GAII (Illumina Inc, San Diego, CA). For Illumina sequencing, 5 μg of genomic DNA was submitted. The DNA facility created libraries and bar-coded five strains (SCP C1050 and C818, SAA C1051, and SI C270 and B196), which were then pooled in a single lane using 36 bp reads, while SAW C238 was run in a single lane using 75 bp reads. To ensure that the Illumina data were reliable, 100 regions showing variation between the GS20 data and the Illumina data were analyzed by Sanger sequencing. In all cases, the Illumina data were observed to be correct. Illumina data and GS20 data were aligned and co-analyzed using CLC Genomics Workbench version 4.0 (CLC genomics, Cambridge, MA).
Draft and finished genomes were automatically annotated using an in-house version of the GenDB 2.2 genome annotation system. In the annotation pipeline, genes are predicted using a combination of the CRITICA and Glimmer 3.0 de novo gene predictors. Locations of ribosomal binding sites (RBS) were also predicted via CRITICA. RNAmmer and tRNAScan-SE were used to predict all rRNA subunits (5S, 16S, 23S) and tRNAs, respectively. Functional observations were collected from BLASTN alignments to the NCBI nucleotide (nt) database, and from BLASTP alignments to the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the NCBI non-redundant protein (nr) database. Observations related to protein families were collected using PSI-BLAST alignments to the SWISS-PROT (sp) and Clusters of Orthologous Groups (COG) databases. Conserved protein domain observations were collected as RPS-BLAST alignments to the Conserved Domain Database (CDD). Additional observations collected for each CDS included Hidden Markov Model protein sequence classification searches against the TIGRFAM and PFAM databases, membrane spanning helices searches via EMBOSS helix-turn-helix and TMHMM, and presence of signal peptide sequences. All functional observations were analyzed within the annotation system using a set of pre-defined heuristics to automatically assign a gene name and biological role for each CDS, when possible. Bi-directional BLASTP for all predicted CDS was run between each pair of genomes and within each genome to identify potential orthologs and paralogs, respectively.
All intergenic regions (with a 25 base pair (bp) elongation on either end) were analyzed through a separate customized pipeline for identification of potential reading frame shifts, short genes overlooked by the automatic pipeline, or genes in regions of localized atypical nucleotide composition. Regions with BLASTX alignments and EMBOSS getORF open reading frames (ORF) were identified as potential CDS regions, run through the function prediction pipeline and automatically marked for manual curation.
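Because pyrosequencing base-calling errors concentrate in homonucleotide runs, a useful first step when planning confirmation is to list where such runs occur in an assembly. The Python sketch below does this with a regular expression. It is an illustration only: the function name `homopolymer_runs` and the default minimum run length of five bases are assumptions, not values reported in the study.

```python
import re


def homopolymer_runs(seq, min_len=5):
    """Return (start, end, base) for every run of at least `min_len` identical bases.

    Coordinates are 0-based and end-exclusive. `min_len` is an illustrative
    threshold chosen for this sketch.
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Toy sequence: a run of six T followed later by a run of five A.
toy = "ACG" + "T" * 6 + "AC" + "A" * 5 + "G"
for start, end, base in homopolymer_runs(toy):
    print(start, end, base)
```

The reported intervals could then be used to decide where flanking primers or a second sequencing technology are most needed.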
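Bi-directional BLASTP comparisons are typically reduced to reciprocal best hits when nominating candidate orthologs between two genomes. The Python sketch below shows that reduction on already-parsed hits. It is not the GenDB implementation; the function names, the `(query, subject, bitscore)` tuple layout and the use of bit score as the ranking criterion are assumptions made for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples, e.g. parsed
    from tabular BLASTP output (the parsing step is not shown).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy hits with hypothetical protein identifiers.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b2", "a2", 95.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Running the same comparison of a genome against itself, and excluding self-hits, is one simple way to flag candidate paralogs.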
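The extraction of intergenic regions with a 25 bp elongation on either end can be sketched in Python as follows. This is a minimal illustration, not the customized pipeline itself: the function name `intergenic_regions` is hypothetical, coordinates are assumed to be 0-based and end-exclusive, and the chromosome is treated as linear, so a region spanning the origin of a circular genome is reported as two pieces.

```python
def intergenic_regions(genes, genome_length, flank=25):
    """Return intergenic intervals, each extended by `flank` bp on either end.

    `genes` is a list of (start, end) intervals, 0-based and end-exclusive.
    Overlapping genes are merged implicitly, and extended intervals are
    clipped to the sequence boundaries.
    """
    regions = []
    prev_end = 0
    for start, end in sorted(genes):
        if start > prev_end:              # a gap between annotated genes
            regions.append((max(0, prev_end - flank),
                            min(genome_length, start + flank)))
        prev_end = max(prev_end, end)
    if prev_end < genome_length:          # trailing gap after the last gene
        regions.append((max(0, prev_end - flank), genome_length))
    return regions


# Toy annotation: two overlapping genes followed by a third gene.
toy_genes = [(100, 400), (350, 600), (900, 1200)]
print(intergenic_regions(toy_genes, genome_length=1500))
```

Extending each region into the neighbouring genes means that frameshifted or mis-started genes at the boundaries are still visible to the downstream BLASTX and ORF searches.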
Manual curation of SAG genomes
All predicted CDSs were manually inspected with their GenDB observations to validate inference of the auto annotation. Predicted genes were inspected for potential frameshift errors and alternative start sites. All potential frameshift errors were experimentally validated with Sanger sequencing (described above). Orthologs between genomes were multi-annotated manually with inference from a chosen reference genome using the GenDB ortholog finding tool, based on ClustalW multiple alignments of sequences flagged from the bidirectional BLASTP observations. To facilitate annotating the genomes in a timely manner, sequencing corrections and annotation were finished concurrently. The final version of each corrected sequence was re-mapped to the annotations and observations using custom Perl scripts built to work within the GenDB system.
The manually curated, high quality finished genomes generated within this study have been deposited at GenBank with the following accessions: SAA C1051 [GenBank: CP003860], SAW C238 [GenBank: CP003861], SCP C232 [GenBank: CP003800], SCP C818 [GenBank: CP003840], SCP C1050 [GenBank: CP003859], SI B196 [GenBank: CP003857], SI C270 [GenBank: CP003858].
Genome visualization and analysis
The pan-genome analysis was done using Gview version 1.6, with SAW C238 as the seed genome and all SAG genomes added as listed in Figure 2 to create a pangenome reference. All individual SAG genomes were then compared to this pangenome using a BLAST atlas with the default settings in Gview version 1.6. BLAST atlases for each of the species within SAG were constructed using the basic settings in Gview version 1.6, creating a circular comparison with SAW C238, SCP C232 and SI B196 as the references for SA, SC and SI, respectively. MAUVE version 3.2.1 was used with default settings to construct linear views of the SAG chromosomes.
Orthologs were identified by OrthoMCL v2.0.2; for the in-house core-SNP pipeline, those orthologs present as single copies and common to all data sets were included for analysis. Each orthologous group was aligned using ClustalW (v1.8.2) and manually edited to correct for incorrectly predicted start sites. A subset of SNP loci present among all data sets (core) was identified and used to generate a meta-alignment using an in-house Perl script for downstream analysis. Alternatively, a set of core gene alignments was generated using AMPHORA. Of the 31 gene sets generated, 3 core genes (rplB, rplD, rplL) were discarded from further analysis because they were not present in all WGS genomes undergoing analysis. Phylogenetic trees were generated for comparison of each analysis method (in-house core-SNP pipeline, AMPHORA, 16S rRNA, groEL, and rpoB) using Phylogenetic estimation using Maximum Likelihood (PhyML 3.0) with nucleotide substitution models LG. To assess the stability of the tree branching patterns in the rpoB, cpn60 and 16S rRNA gene trees, bootstrap analysis with 100 pseudoreplicates was performed using evolutionary models and tree building as described above. The in-house core-SNP tree was analyzed using the approximate likelihood ratio test with a selection threshold of 0.8, which is comparable to bootstrap supports of 75%. The ratio of mean non-synonymous (dN) to synonymous (dS) substitutions per site (dN : dS ratio) within the two selected genes (rpoB, cpn60) was calculated in MEGA using the Nei-Gojobori method with Jukes-Cantor correction [122, 123].
OrthoMCL was used to identify orthologous gene groups from the proteomes of 66 sequenced Streptococcus strains (Additional file 2). Sequences and annotations for the WGS projects were obtained from the Broad Institute download site (http://broadinstitute.org/annotation/genome/Streptococcus_group/GenomesIndex.html), and all other fully annotated Streptococcus spp., excluding the seven in-house SAG, were obtained from NCBI. A BLASTP e-value of 1e-5 and a percent match length of 50% were used as cut-offs in the orthoMCL analysis. Because orthoMCL does not report signature genes, defined as genes that are only present in one strain, the signature genes for each strain were appended to the orthoMCL output. Venn diagrams for ortholog groups were generated using an in-house Perl script. The Venn diagrams do not represent all genes from all genomes because accessory genes are not included.
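The idea of reducing aligned core orthologs to a meta-alignment of SNP loci can be illustrated with a small Python sketch. It is not the in-house Perl script; the function name `core_snp_alignment` is hypothetical, and the rule used here (keep a column only if every strain has an unambiguous A, C, G or T and at least two different bases are present) is an assumption about what "present among all data sets" means.

```python
def core_snp_alignment(aligned):
    """Extract variable, gap-free columns from a multiple alignment.

    `aligned` maps strain name -> aligned sequence (all of equal length).
    Returns (column_indices, {strain: concatenated SNP bases}).
    """
    names = list(aligned)
    seqs = [aligned[name].upper() for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    keep = [
        i for i, column in enumerate(zip(*seqs))
        if set(column) <= set("ACGT")     # no gaps or ambiguity codes
        and len(set(column)) > 1          # at least two alleles
    ]
    snps = {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}
    return keep, snps


# Toy alignment with hypothetical strain names.
toy = {"s1": "ACGTAC-A", "s2": "ACGTTCGA", "s3": "ATGTACGA"}
columns, snp_alignment = core_snp_alignment(toy)
print(columns)
print(snp_alignment)
```

Concatenating the retained columns from every single-copy core ortholog would give one alignment per strain that can be passed to a tree-building program.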
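The two cut-offs and the appending of signature genes can be sketched in Python as below. This is an illustration under stated assumptions rather than the study's scripts: the function names are hypothetical, percent match length is approximated as aligned length divided by query length, and any protein that is not a member of an ortholog group is treated as a signature gene of its strain.

```python
def passes_cutoffs(evalue, aligned_length, query_length,
                   max_evalue=1e-5, min_match=0.5):
    """Apply an e-value cut-off and a minimum match-length fraction to one hit."""
    return evalue <= max_evalue and aligned_length / query_length >= min_match


def signature_genes(proteomes, ortholog_groups):
    """List, per strain, the proteins that belong to no ortholog group.

    `proteomes` maps strain -> set of protein IDs; `ortholog_groups` is a
    list of sets of protein IDs (one set per group).
    """
    grouped = set().union(*ortholog_groups)
    return {strain: sorted(ids - grouped) for strain, ids in proteomes.items()}


# Toy data with hypothetical identifiers.
proteomes = {"A": {"a1", "a2", "a3"}, "B": {"b1", "b2"}}
groups = [{"a1", "b1"}, {"a2", "b2"}]
print(passes_cutoffs(1e-20, 150, 200))
print(signature_genes(proteomes, groups))
```

The per-strain lists returned by `signature_genes` correspond to the entries that would be appended to the clustering output.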
Clustered COG analysis
A comparative COG analysis was performed using a custom set of Perl scripts. One representative from each accessory gene group from the orthoMCL output, as well as any non-orthologous predicted genes from each of the 66 Streptococcus genomes above, were blasted against the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) database. Significant BLASTP matches were filtered using the following criteria: 80% PID over a minimum of 80% of the protein length, or, for proteins over 100 amino acids in length, a minimum of 75% length and 30% PID. Matches were scored on a binary scale, wherein a significant match was assigned a value of one and a non-significant match was assigned a zero. Accessory gene matches to each COG group were hierarchically column-clustered with distance correlation. The resulting dendrogram was represented as a heatmap generated in R (http://www.r-project.org) using heatmap.2 from the gplots package (http://cran.r-project.org/web/packages/gplots/index.html), with each tile shade representing a functional COG category associated with the gene group. Gene shading of black indicates exclusion from the gene cluster. The row dendrogram of the heatmap was not hierarchically clustered and simply represents the order of the phylogenetic tree resulting from the in-house core-SNP analysis (described above). Gene groups with no significant hits, or with hits to proteins with no assigned functional category in the eggNOG database, were excluded from the heatmap representation depicted.
A list of the number of protein matches to the eggNOG database matching a COG functional category for each of the 66 Streptococcus reference strains was generated. Significant BLASTP hits were filtered using the following criteria: 80% PID over a minimum of 80% of the protein length, or, for proteins over 100 amino acids in length, a minimum of 50% length and 30% PID.
Genomes were scanned for the presence of CRISPRs using the online programs CRISPRFinder and CRISPRcompar [125, 126]. CRISPRs were classified based on their composition of CRISPR-associated proteins. Spacers were identified within the CRISPR regions, and BLASTN was used to determine whether there were matches to known mobile elements within the GenBank database.
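The two-tier significance filter and the binary scoring described above can be written as a short Python sketch. The function names are hypothetical, and the reading of the criteria (a strict tier for any protein, a relaxed tier only for proteins longer than 100 amino acids) is an interpretation of the text. The relaxed coverage threshold is a parameter because the text gives 75% for the clustered analysis and 50% for the per-strain counts.

```python
def is_significant(pid, coverage, protein_length, relaxed_coverage=75.0):
    """Decide whether a match passes the two-tier filter.

    `pid` and `coverage` are percentages; `protein_length` is in amino acids.
    Strict tier: >= 80% identity over >= 80% of the protein length.
    Relaxed tier (proteins > 100 aa only): >= `relaxed_coverage`% of the
    length with >= 30% identity.
    """
    strict = pid >= 80.0 and coverage >= 80.0
    relaxed = (protein_length > 100
               and coverage >= relaxed_coverage
               and pid >= 30.0)
    return strict or relaxed


def binary_score(pid, coverage, protein_length, relaxed_coverage=75.0):
    """Score a match as 1 (significant) or 0 (not significant)."""
    return int(is_significant(pid, coverage, protein_length, relaxed_coverage))


# Toy values, not data from the study.
print(binary_score(85.0, 90.0, 60))
print(binary_score(35.0, 60.0, 250))
print(binary_score(35.0, 60.0, 250, relaxed_coverage=50.0))
```

A matrix of such scores, with gene groups as columns and genomes as rows, is the kind of input that a clustering and heatmap function would then take.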
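Once a CRISPR array and its repeat sequence are known, the spacers are the sequences lying between consecutive repeats. The Python sketch below illustrates this on a toy array. It does not reproduce CRISPRFinder; the function name `extract_spacers` and the sequences are hypothetical, and the sketch assumes exact, non-degenerate repeat copies, which real arrays do not always have.

```python
def extract_spacers(array_seq, repeat):
    """Return the sequences found between consecutive exact copies of `repeat`."""
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] is the leader before the first repeat and parts[-1] the
    # trailer after the last repeat; neither is a spacer.
    return [part for part in parts[1:-1] if part]


# Toy array: leader, then three repeats separated by two spacers, then trailer.
repeat = "GTTTTAGAGC"
array = "AA" + repeat + "ACGTACGT" + repeat + "TTGGCCAA" + repeat + "CC"
print(extract_spacers(array, repeat))
```

Each extracted spacer could then be used as a BLASTN query against known mobile elements.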
Virulence analysis (Heatmaps)
A virulence database (VirDB) was constructed in-house using a combination of a literature search of the public NCBI protein database and Streptococcus-specific virulence genes from the Virulence Factors of Pathogenic Bacteria database (VFDB), http://www.mgc.ac.cn/VFs/main.htm. The VirDB was made non-redundant using CD-HIT with default values, and then curated manually to ensure that genes from the same operon were not collapsed into a single entry. The initial virulence database comprised 234 genes, but was later reduced to 189 virulence-associated genes. A protein BLAST was performed using cut-offs of 35% identity (PID) and a high-scoring segment pair (hsp) length of 50% for the longest genes. A similar database was constructed to specifically identify TCS and LPxTG proteins within SAG. Heatmaps were generated in R (http://www.r-project.org) using a modified version of heatmap.2 as described above. Genome order for presentation of data was predefined by the in-house core-SNP pipeline (described above). Genes were ordered alphabetically, although genes within an operon were clustered together. Virulence genes that did not have hits to any genomes were eliminated from the heatmaps in Figure 5 showing SAG, S. sanguinis and S. gordonii.
This work was supported by a Genomics Research and Development Initiative project awarded to CRC, and by operating grants from the Canadian Institutes of Health Research (CIHR) and Cystic Fibrosis Canada awarded to MGS. MGS is supported as a Canada Research Chair in Interdisciplinary Microbiome Research. CDS was supported by an Alberta foundation for medical research studentship and a Canada Graduate Scholarship from CIHR. The authors would like to thank all members of PHAC Genomics Core facility for generation of sequence data and primer synthesis. The authors would also like to thank all members of PHAC Bioinformatics Core for input on bioinformatics questions. The views and opinions expressed herein are those of the authors only, and do not necessarily represent the views and opinions of the PHAC or the Government of Canada.
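The step from filtered BLAST hits to the presence/absence table behind a heatmap can be sketched in Python as follows. This is an illustration only, not the R code used for the figures: the function name `presence_matrix`, the gene and genome labels, and the tuple layout `(gene, genome, pid, coverage)` are hypothetical, and coverage is assumed to be the hsp length as a percentage of the gene length.

```python
def presence_matrix(hits, genomes, genes, min_pid=35.0, min_cov=50.0):
    """Build a gene-by-genome presence/absence table from BLAST hits.

    `hits` is an iterable of (gene, genome, pid, coverage) tuples with pid
    and coverage as percentages. Columns follow the order of `genomes`
    (e.g. a predefined phylogenetic order). Genes with no passing hit in
    any genome are dropped.
    """
    present = {
        (gene, genome)
        for gene, genome, pid, cov in hits
        if pid >= min_pid and cov >= min_cov
    }
    rows = {
        gene: [int((gene, genome) in present) for genome in genomes]
        for gene in genes
    }
    return {gene: row for gene, row in rows.items() if any(row)}


# Toy input with hypothetical labels, not results from the study.
genomes = ["genome1", "genome2", "genome3"]
genes = ["geneA", "geneB", "geneC"]
hits = [
    ("geneA", "genome3", 98.0, 100.0),
    ("geneB", "genome1", 60.0, 90.0),
    ("geneB", "genome2", 30.0, 95.0),   # fails the identity cut-off
    ("geneC", "genome2", 40.0, 45.0),   # fails the length cut-off
]
print(presence_matrix(hits, genomes, genes))
```

Keeping the genome order as an explicit argument mirrors the choice to fix the order by phylogeny rather than let the heatmap cluster it.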
- Facklam R: What happened to the streptococci: overview of taxonomic and nomenclature changes. Clin Microbiol Rev. 2002, 15 (4): 613-630. 10.1128/CMR.15.4.613-630.2002.
- Mitchell J: Streptococcus mitis: walking the line between commensalism and pathogenesis. Mol Oral Microbiol. 2011, 26 (2): 89-98. 10.1111/j.2041-1014.2010.00601.x.
- Tabata A, Nakano K, Ohkura K, Tomoyasu T, Kikuchi K, Whiley RA, Nagamune H: Novel twin streptolysin S-like peptides encoded in the sag operon homologue of beta-hemolytic Streptococcus anginosus. J Bacteriol. 2013, 195 (5): 1090-1099. 10.1128/JB.01344-12.
- Sukeno A, Nagamune H, Whiley RA, Jafar SI, Aduse-Opoku J, Ohkura K, Maeda T, Hirota K, Miyake Y, Kourai H: Intermedilysin is essential for the invasion of hepatoma HepG2 cells by Streptococcus intermedius. Microbiol Immunol. 2005, 49 (7): 681-694. 10.1111/j.1348-0421.2005.tb03647.x.
- Coykendall AL, Wesbecher PM, Gustafson KB: “Streptococcus milleri,” Streptococcus constellatus, and Streptococcus intermedius are later synonyms of Streptococcus anginosus. Int J Syst Bacteriol. 1987, 37 (3): 222-228. 10.1099/00207713-37-3-222.
- Whiley RA, Hall LM, Hardie JM, Beighton D: A study of small-colony, beta-haemolytic, Lancefield group C streptococci within the anginosus group: description of Streptococcus constellatus subsp. pharyngis subsp. nov., associated with the human throat and pharyngitis. Int J Syst Bacteriol. 1999, 49 (4): 1443-1449. 10.1099/00207713-49-4-1443.
- Jensen A, Hoshino T, Kilian M: Taxonomy of the Anginosus group of the genus Streptococcus and description of Streptococcus anginosus subsp. whileyi subsp. nov. and Streptococcus constellatus subsp. viborgensis subsp. nov. Int J Syst Evol Microbiol. 2013, 63 (Pt 7): 2506-2519. 10.1099/ijs.0.043232-0.
- Gossling J: Occurrence and pathogenicity of the Streptococcus milleri group. Rev Infect Dis. 1988, 10 (2): 257-285. 10.1093/clinids/10.2.257.
- Grinwis ME, Sibley CD, Parkins MD, Eshaghurshan CS, Rabin HR, Surette MG: Characterization of Streptococcus milleri group isolates from expectorated sputum of adult patients with cystic fibrosis. J Clin Microbiol. 2010, 48 (2): 395-401. 10.1128/JCM.01807-09.
- Whiley RA, Beighton D, Winstanley TG, Fraser HY, Hardie JM: Streptococcus intermedius, Streptococcus constellatus, and Streptococcus anginosus (the Streptococcus milleri group): association with different body sites and clinical infections. J Clin Microbiol. 1992, 30 (1): 243-244.
- Yassin M, Yadavalli GK, Alvarado N, Bonomo RA: Streptococcus anginosus (Streptococcus milleri group) pyomyositis in a 50-year-Old Man with acquired immunodeficiency syndrome: case report and review of literature. Infection. 2010, 38 (1): 65-68. 10.1007/s15010-009-6002-9.
- Prasad KN, Mishra AM, Gupta D, Husain N, Husain M, Gupta RK: Analysis of microbial etiology and mortality in patients with brain abscess. J Infect. 2006, 53 (4): 221-227. 10.1016/j.jinf.2005.12.002.
- Sibley CD, Church DL, Surette MG, Dowd SE, Parkins MD: Pyrosequencing reveals the complex polymicrobial nature of invasive pyogenic infections: microbial constituents of empyema, liver abscess, and intracerebral abscess. Eur J Clin Microbiol Infect Dis. 2012, 31 (10): 2679-2691. 10.1007/s10096-012-1614-x.
- Sibley CD, Grinwis ME, Field TR, Parkins MD, Norgaard JC, Gregson DB, Rabin HR, Surette MG: McKay agar enables routine quantification of the ‘Streptococcus milleri’ group in cystic fibrosis patients. J Med Microbiol. 2010, 59 (Pt 5): 534-540.
- Ahmed RA, Marrie TJ, Huang JQ: Thoracic empyema in patients with community-acquired pneumonia. Am J Med. 2006, 119 (10): 877-883. 10.1016/j.amjmed.2006.03.042.
- Sibley CD, Parkins MD, Rabin HR, Duan K, Norgaard JC, Surette MG: A polymicrobial perspective of pulmonary infections exposes an enigmatic pathogen in cystic fibrosis patients. Proc Natl Acad Sci USA. 2008, 105 (39): 15070-15075. 10.1073/pnas.0804326105.
- Parkins MD, Sibley CD, Surette MG, Rabin HR: The Streptococcus milleri group–an unrecognized cause of disease in cystic fibrosis: a case series and literature review. Pediatr Pulmonol. 2008, 43 (5): 490-497. 10.1002/ppul.20809.
- Mitchell AM, Mitchell TJ: Streptococcus pneumoniae: virulence factors and variation. Clin Microbiol Infect. 2010, 16 (5): 411-418. 10.1111/j.1469-0691.2010.03183.x.
- Maisey HC, Doran KS, Nizet V: Recent advances in understanding the molecular basis of group B Streptococcus virulence. Expert Rev Mol Med. 2008, 10: e27.
- Lindahl G, Stalhammar-Carlemalm M, Areschoug T: Surface proteins of Streptococcus agalactiae and related proteins in other bacterial pathogens. Clin Microbiol Rev. 2005, 18 (1): 102-127. 10.1128/CMR.18.1.102-127.2005.
- Nagamune H, Ohnishi C, Katsuura A, Taoka Y, Fushitani K, Whiley RA, Yamashita K, Tsuji A, Matsuda Y, Maeda T, Korai H, Kitamura S: Intermedilysin. A cytolytic toxin specific for human cells of a Streptococcus intermedius isolated from human liver abscess. Adv Exp Med Biol. 1997, 418: 773-775. 10.1007/978-1-4899-1825-3_182.
- Jacobs JA, Schot CS, Schouls LM: Haemolytic activity of the ‘Streptococcus milleri group’ and relationship between haemolysis restricted to human red blood cells and pathogenicity in S. intermedius. J Med Microbiol. 2000, 49 (1): 55-62.
- Pecharki D, Petersen FC, Scheie AA: LuxS and expression of virulence factors in Streptococcus intermedius. Oral Microbiol Immunol. 2008, 23 (1): 79-83.
- Toyoda K, Kusano N, Saito A: Pathogenicity of the Streptococcus milleri group in pulmonary infections–effect on phagocytic killing by human polymorphonuclear neutrophils. Kansenshogaku Zasshi. 1995, 69 (3): 308-315.
- Kanamori S, Kusano N, Shinzato T, Saito A: The role of the capsule of the Streptococcus milleri group in its pathogenicity. J Infect Chemother. 2004, 10 (2): 105-109. 10.1007/s10156-004-0305-7.
- Agren UM, Tammi M, Ryynanen M, Tammi R: Developmentally programmed expression of hyaluronan in human skin and its appendages. J Invest Dermatol. 1997, 109 (2): 219-224. 10.1111/1523-1747.ep12319412.
- Pecharki D, Petersen FC, Scheie AA: Role of hyaluronidase in Streptococcus intermedius biofilm. Microbiology. 2008, 154 (Pt 3): 932-938.
- Ajdic D, McShan WM, McLaughlin RE, Savic G, Chang J, Carson MB, Primeaux C, Tian R, Kenton S, Jia H, Lin S, Qian Y, Li S, Zhu H, Najar F, Lai H, White J, Roe BA, Ferretti JJ: Genome sequence of Streptococcus mutans UA159, a cariogenic dental pathogen. Proc Natl Acad Sci USA. 2002, 99 (22): 14434-14439. 10.1073/pnas.172501299.
- Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, Primeaux C, Sezate S, Suvorov AN, Kenton S, Lai HS, Lin SP, Qian Y, Jia HG, Najar FZ, Ren Q, Zhu H, Song L, White J, Yuan X, Clifton SW, Roe BA, McLaughlin R: Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA. 2001, 98 (8): 4658-4663. 10.1073/pnas.071559398.
- Holden MT, Hauser H, Sanders M, Ngo TH, Cherevach I, Cronin A, Goodhead I, Mungall K, Quail MA, Price C, Rabbinowitsch E, Sharp S, Croucher NJ, Chieu TB, Mai NT, Diep TS, Chinh NT, Kehoe M, Leigh JA, Ward PN, Dowson CG, Whatmore AM, Chanter N, Iversen P, Gottschalk M, Slater JD, Smith HE, Spratt BG, Xu J, Ye C, Bentley S, Barrell BG, Schultsz C, Maskell DJ, Parkhill J: Rapid evolution of virulence and drug resistance in the emerging zoonotic pathogen Streptococcus suis. PLoS One. 2009, 4 (7): e6072. 10.1371/journal.pone.0006072.
- Xu P, Alves JM, Kitten T, Brown A, Chen Z, Ozaki LS, Manque P, Ge X, Serrano MG, Puiu D, Hendricks S, Wang Y, Chaplin MD, Akan D, Paik S, Peterson DL, Macrina FL, Buck GA: Genome of the opportunistic pathogen Streptococcus sanguinis. J Bacteriol. 2007, 189 (8): 3166-3175. 10.1128/JB.01808-06.
- Beres SB, Sylva GL, Barbian KD, Lei B, Hoff JS, Mammarella ND, Liu MY, Smoot JC, Porcella SF, Parkins LD, Campbell DS, Smith TM, McCormick JK, Leung DY, Schlievert PM, Musser JM: Genome sequence of a serotype M3 strain of group A Streptococcus: phage-encoded toxins, the high-virulence phenotype, and clone emergence. Proc Natl Acad Sci USA. 2002, 99 (15): 10078-10083. 10.1073/pnas.152298499.
- Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, Sturdevant DE, Ricklefs SM, Porcella SF, Parkins LD, Beres SB, Campbell DS, Smith TM, Zhang Q, Kapur V, Daly JA, Veasy LG, Musser JM: Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA. 2002, 99 (7): 4668-4673. 10.1073/pnas.062526099.
- Nakagawa I, Kurokawa K, Yamashita A, Nakata M, Tomiyasu Y, Okahashi N, Kawabata S, Yamazaki K, Shiba T, Yasunaga T, Hayashi H, Hattori M, Hamada S: Genome sequence of an M3 strain of Streptococcus pyogenes reveals a large-scale genomic rearrangement in invasive strains and new insights into phage evolution. Genome Res. 2003, 13 (6A): 1042-1055.
- Guedon E, Delorme C, Pons N, Cruaud C, Loux V, Couloux A, Gautier C, Sanchez N, Layec S, Galleron N, Almeida M, van de Guchte M, Kennedy SP, Ehrlich SD, Gibrat JF, Wincker P, Renault P: Complete genome sequence of the commensal Streptococcus salivarius strain JIM8777. J Bacteriol. 2011, 193 (18): 5024-5025. 10.1128/JB.05390-11.
- Lerat E, Ochman H: Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Res. 2005, 33 (10): 3125-3132. 10.1093/nar/gki631.
- Wu M, Eisen JA: A simple, fast, and accurate method of phylogenomic inference. Genome Biol. 2008, 9 (10): R151. 10.1186/gb-2008-9-10-r151.
- Glazunova OO, Raoult D, Roux V: Streptococcus massiliensis sp. nov., isolated from a patient blood culture. Int J Syst Evol Microbiol. 2006, 56 (Pt 5): 1127-1131.
- Schouls LM, Schot CS, Jacobs JA: Horizontal transfer of segments of the 16S rRNA genes between species of the Streptococcus anginosus group. J Bacteriol. 2003, 185 (24): 7241-7246. 10.1128/JB.185.24.7241-7246.2003.
- Chen F, Mackey AJ, Stoeckert CJ, Roos DS: OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res. 2006, 34 (Database issue): D363-D368.
- Suzuki H, Lefebure T, Hubisz MJ, Pavinski Bitar P, Lang P, Siepel A, Stanhope MJ: Comparative genomic analysis of the Streptococcus dysgalactiae species group: gene content, molecular adaptation, and promoter evolution. Genome Biol Evol. 2011, 3: 168-185. 10.1093/gbe/evr006.PubMed CentralPubMedGoogle Scholar
- Lefebure T, Stanhope MJ: Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition. Genome Biol. 2007, 8 (5): R71-10.1186/gb-2007-8-5-r71.PubMed CentralPubMedGoogle Scholar
- Rasmussen TB, Danielsen M, Valina O, Garrigues C, Johansen E, Pedersen MB: Streptococcus thermophilus core genome: comparative genome hybridization study of 47 strains. Appl Environ Microbiol. 2008, 74 (15): 4703-4710. 10.1128/AEM.00132-08.PubMed CentralPubMedGoogle Scholar
- Ding F, Tang P, Hsu MH, Cui P, Hu S, Yu J, Chiu CH: Genome evolution driven by host adaptations results in a more virulent and antimicrobial-resistant Streptococcus pneumoniae serotype 14. BMC Genomics. 2009, 10: 158-10.1186/1471-2164-10-158.PubMed CentralPubMedGoogle Scholar
- Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, Angiuoli SV, Crabtree J, Jones AL, Durkin AS, Deboy RT, Davidsen TM, Mora M, Scarselli M, Margarit y Ros I, Peterson JD, Hauser CR, Sundaram JP, Nelson WC, Madupu R, Brinkac LM, Dodson RJ, Rosovitz MJ, Sullivan SA, Daugherty SC, Haft DH, Selengut J, Gwinn ML, Zhou L, Zafar N, et al: Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome”. Proc Natl Acad Sci USA. 2005, 102 (39): 13950-13955. 10.1073/pnas.0506758102.
- Zhang L, Foxman B, Drake DR, Srinivasan U, Henderson J, Olson B, Marrs CF, Warren JJ, Marazita ML: Comparative whole-genome analysis of Streptococcus mutans isolates within and among individuals of different caries status. Oral Microbiol Immunol. 2009, 24 (3): 197-203. 10.1111/j.1399-302X.2008.00495.x.
- Takao A, Nagamune H, Maeda N: Sialidase of Streptococcus intermedius: a putative virulence factor modifying sugar chains. Microbiol Immunol. 2010, 54 (10): 584-595.
- Christie J, McNab R, Jenkinson HF: Expression of fibronectin-binding protein FbpA modulates adhesion in Streptococcus gordonii. Microbiology. 2002, 148 (Pt 6): 1615-1625.
- Zhang Q, Ma Q, Su D, Li Q, Yao W, Wang C: Identification of horizontal gene transfer and recombination of PsaA gene in streptococcus mitis group. Microbiol Immunol. 2010, 54 (6): 313-319. 10.1111/j.1348-0421.2010.00216.x.
- Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R, Podbielski A: Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun. 1999, 67 (2): 871-878.
- Hytonen J, Haataja S, Finne J: Streptococcus pyogenes glycoprotein-binding strepadhesin activity is mediated by a surface-associated carbohydrate-degrading enzyme, pullulanase. Infect Immun. 2003, 71 (2): 784-793. 10.1128/IAI.71.2.784-793.2003.
- Pancholi V, Fischetti VA: Alpha-Enolase, a Novel Strong Plasmin(ogen) Binding Protein on the Surface of Pathogenic Streptococci. J Biol Chem. 1998, 273 (23): 14503-14515. 10.1074/jbc.273.23.14503.
- Hidalgo-Grass C, Ravins M, Dan-Goor M, Jaffe J, Moses AE, Hanski E: A locus of group A Streptococcus involved in invasive disease and DNA transfer. Mol Microbiol. 2002, 46 (1): 87-99. 10.1046/j.1365-2958.2002.03127.x.
- Morona JK, Morona R, Paton JC: Comparative genetics of capsular polysaccharide biosynthesis in Streptococcus pneumoniae types belonging to serogroup 19. J Bacteriol. 1999, 181 (17): 5355-5364.
- Spellerberg B, Pohl B, Haase G, Martin S, Weber-Heynemann J, Lutticken R: Identification of genetic determinants for the hemolytic activity of Streptococcus agalactiae by ISS1 transposition. J Bacteriol. 1999, 181 (10): 3212-3219.
- Crater DL, Dougherty BA, van de Rijn I: Molecular characterization of hasC from an operon required for hyaluronic acid synthesis in group A streptococci. Demonstration of UDP-glucose pyrophosphorylase activity. J Biol Chem. 1995, 270 (48): 28676-28680. 10.1074/jbc.270.48.28676.
- Jiang SM, Ishmael N, Dunning Hotopp J, Puliti M, Tissi L, Kumar N, Cieslewicz MJ, Tettelin H, Wessels MR: Variation in the group B Streptococcus CsrRS regulon and effects on pathogenicity. J Bacteriol. 2008, 190 (6): 1956-1965. 10.1128/JB.01677-07.
- Upton M, Tagg JR, Wescombe P, Jenkinson HF: Intra- and interspecies signaling between Streptococcus salivarius and Streptococcus pyogenes mediated by SalA and SalA1 lantibiotic peptides. J Bacteriol. 2001, 183 (13): 3931-3938. 10.1128/JB.183.13.3931-3938.2001.
- Madureira P, Baptista M, Vieira M, Magalhaes V, Camelo A, Oliveira L, Ribeiro A, Tavares D, Trieu-Cuot P, Vilanova M, Ferreira P: Streptococcus agalactiae GAPDH is a virulence-associated immunomodulatory protein. J Immunol. 2007, 178 (3): 1379-1387.
- Smith HE, de Vries R, van’t Slot R, Smits MA: The cps locus of Streptococcus suis serotype 2: genetic determinant for the synthesis of sialic acid. Microb Pathog. 2000, 29 (2): 127-134. 10.1006/mpat.2000.0372.
- Jensch I, Gamez G, Rothe M, Ebert S, Fulde M, Somplatzki D, Bergmann S, Petruschka L, Rohde M, Nau R, Hammerschmidt S: PavB is a surface-exposed adhesin of Streptococcus pneumoniae contributing to nasopharyngeal colonization and airways infections. Mol Microbiol. 2010, 77 (1): 22-43. 10.1111/j.1365-2958.2010.07189.x.
- Sumitomo T, Nakata M, Higashino M, Jin Y, Terao Y, Fujinaga Y, Kawabata S: Streptolysin S contributes to group A streptococcal translocation across an epithelial barrier. J Biol Chem. 2011, 286 (4): 2750-2761. 10.1074/jbc.M110.171504.
- Sawyer RT, Drevets DA, Campbell PA, Potter TA: Internalin A can mediate phagocytosis of Listeria monocytogenes by mouse macrophage cell lines. J Leukoc Biol. 1996, 60 (5): 603-610.
- Reid SD, Montgomery AG, Voyich JM, DeLeo FR, Lei B, Ireland RM, Green NM, Liu M, Lukomski S, Musser JM: Characterization of an extracellular virulence factor made by group A Streptococcus with homology to the Listeria monocytogenes internalin family of proteins. Infect Immun. 2003, 71 (12): 7043-7052. 10.1128/IAI.71.12.7043-7052.2003.
- Nightingale KK, Windham K, Martin KE, Yeung M, Wiedmann M: Select Listeria monocytogenes subtypes commonly found in foods carry distinct nonsense mutations in inlA, leading to expression of truncated and secreted internalin A, and are associated with a reduced invasion phenotype for human intestinal epithelial cells. Appl Environ Microbiol. 2005, 71 (12): 8764-8772. 10.1128/AEM.71.12.8764-8772.2005.
- Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP, Bentley SD: Rapid pneumococcal evolution in response to clinical interventions. Science. 2011, 331 (6016): 430-434. 10.1126/science.1198545.
- Navarre WW, Schneewind O: Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev. 1999, 63 (1): 174-229.
- Lalioui L, Pellegrini E, Dramsi S, Baptista M, Bourgeois N, Doucet-Populaire F, Rusniok C, Zouine M, Glaser P, Kunst F, Poyart C, Trieu-Cuot P: The SrtA Sortase of Streptococcus agalactiae is required for cell wall anchoring of proteins containing the LPXTG motif, for adhesion to epithelial cells, and for colonization of the mouse intestine. Infect Immun. 2005, 73 (6): 3342-3350. 10.1128/IAI.73.6.3342-3350.2005.
- Cozzi R, Prigozhin D, Rosini R, Abate F, Bottomley MJ, Grandi G, Telford JL, Rinaudo CD, Maione D, Alber T: Structural basis for group B streptococcus pilus 1 sortases C regulation and specificity. PLoS One. 2012, 7 (11): e49048. 10.1371/journal.pone.0049048.
- Tettelin H, Masignani V, Cieslewicz MJ, Eisen JA, Peterson S, Wessels MR, Paulsen IT, Nelson KE, Margarit I, Read TD, Madoff LC, Wolf AM, Beanan MJ, Brinkac LM, Daugherty SC, DeBoy RT, Durkin AS, Kolonay JF, Madupu R, Lewis MR, Radune D, Fedorova NB, Scanlan D, Khouri H, Mulligan S, Carty HA, Cline RT, Van Aken SE, Gill J, Scarselli M, Mora M, Iacobini ET, Brettoni C, Galli G, Mariani M, Vegni F, Maione D, Rinaudo D, Rappuoli R, Telford JL, Kasper DL, Grandi G, Fraser CM: Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae. Proc Natl Acad Sci USA. 2002, 99 (19): 12391-12396. 10.1073/pnas.182380799.
- Halfmann A, Schnorpfeil A, Muller M, Marx P, Gunzler U, Hakenbeck R, Bruckner R: Activity of the two-component regulatory system CiaRH in Streptococcus pneumoniae R6. J Mol Microbiol Biotechnol. 2011, 20 (2): 96-104. 10.1159/000324893.
- Hoch JA: Two-component and phosphorelay signal transduction. Curr Opin Microbiol. 2000, 3 (2): 165-170. 10.1016/S1369-5274(00)00070-9.
- Lange R, Wagner C, de Saizieu A, Flint N, Molnos J, Stieger M, Caspers P, Kamber M, Keck W, Amrein KE: Domain organization and molecular characterization of 13 two-component systems identified by genome sequencing of Streptococcus pneumoniae. Gene. 1999, 237 (1): 223-234. 10.1016/S0378-1119(99)00266-8.
- Paterson GK, Blue CE, Mitchell TJ: Role of two-component systems in the virulence of Streptococcus pneumoniae. J Med Microbiol. 2006, 55 (Pt 4): 355-363.
- Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, Heidelberg J, DeBoy RT, Haft DH, Dodson RJ, Durkin AS, Gwinn M, Kolonay JF, Nelson WC, Peterson JD, Umayam LA, White O, Salzberg SL, Lewis MR, Radune D, Holtzapple E, Khouri H, Wolf AM, Utterback TR, Hansen CL, McDonald LA, Feldblyum TV, Angiuoli S, Dickinson T, Hickey EK, Holt IE, Loftus BJ, Yang F, Smith HO, Venter JC, Dougherty BA, Morrison DA, Hollingshead SK, Fraser CM: Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science. 2001, 293 (5529): 498-506. 10.1126/science.1061217.
- Biswas I, Drake L, Erkina D, Biswas S: Involvement of sensor kinases in the stress tolerance response of Streptococcus mutans. J Bacteriol. 2008, 190 (1): 68-77. 10.1128/JB.00990-07.
- Tremblay YD, Lo H, Li YH, Halperin SA, Lee SF: Expression of the Streptococcus mutans essential two-component regulatory system VicRK is pH and growth-phase dependent and controlled by the LiaFSR three-component regulatory system. Microbiology. 2009, 155 (Pt 9): 2856-2865.
- Throup JP, Koretke KK, Bryant AP, Ingraham KA, Chalker AF, Ge Y, Marra A, Wallis NG, Brown JR, Holmes DJ, Rosenberg M, Burnham MK: A genomic analysis of two-component signal transduction in Streptococcus pneumoniae. Mol Microbiol. 2000, 35 (3): 566-576.
- McKessar SJ, Hakenbeck R: The two-component regulatory system TCS08 is involved in cellobiose metabolism of Streptococcus pneumoniae R6. J Bacteriol. 2007, 189 (4): 1342-1350. 10.1128/JB.01170-06.
- Haft DH, Selengut J, Mongodin EF, Nelson KE: A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. 2005, 1 (6): e60. 10.1371/journal.pcbi.0010060.
- Makarova KS, Haft DH, Barrangou R, Brouns SJ, Charpentier E, Horvath P, Moineau S, Mojica FJ, Wolf YI, Yakunin AF, van der Oost J, Koonin EV: Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol. 2011, 9 (6): 467-477. 10.1038/nrmicro2577.
- Horvath P, Coute-Monvoisin AC, Romero DA, Boyaval P, Fremaux C, Barrangou R: Comparative analysis of CRISPR loci in lactic acid bacteria genomes. Int J Food Microbiol. 2009, 131 (1): 62-70. 10.1016/j.ijfoodmicro.2008.05.030.
- Lopez-Sanchez MJ, Sauvage E, Da Cunha V, Clermont D, Ratsima Hariniaina E, Gonzalez-Zorn B, Poyart C, Rosinski-Chupin I, Glaser P: The highly dynamic CRISPR1 system of Streptococcus agalactiae controls the diversity of its mobilome. Mol Microbiol. 2012, 85 (6): 1057-1071. 10.1111/j.1365-2958.2012.08172.x.
- van der Ploeg JR: Analysis of CRISPR in Streptococcus mutans suggests frequent occurrence of acquired immunity against infection by M102-like bacteriophages. Microbiology. 2009, 155 (Pt 6): 1966-1976.
- Deveau H, Barrangou R, Garneau JE, Labonte J, Fremaux C, Boyaval P, Romero DA, Horvath P, Moineau S: Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J Bacteriol. 2008, 190 (4): 1390-1400. 10.1128/JB.01412-07.
- Croucher NJ, Vernikos GS, Parkhill J, Bentley SD: Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011, 12: 120. 10.1186/1471-2164-12-120.
- Hauser R, Sabri M, Moineau S, Uetz P: The proteome and interactome of Streptococcus pneumoniae phage Cp-1. J Bacteriol. 2011, 193 (12): 3135-3138. 10.1128/JB.01481-10.
- Diaz E, Lopez R, Garcia JL: EJ-1, a temperate bacteriophage of Streptococcus pneumoniae with a Myoviridae morphotype. J Bacteriol. 1992, 174 (17): 5516-5525.
- Ikebe T, Wada A, Inagaki Y, Sugama K, Suzuki R, Tanaka D, Tamaru A, Fujinaga Y, Abe Y, Shimizu Y, Watanabe H, Working Group for Group A Streptococcus in Japan: Dissemination of the phage-associated novel superantigen gene speL in recent invasive and noninvasive Streptococcus pyogenes M3/T3 isolates in Japan. Infect Immun. 2002, 70 (6): 3227-3233. 10.1128/IAI.70.6.3227-3233.2002.
- Nelson D, Schuch R, Zhu S, Tscherne DM, Fischetti VA: Genomic sequence of C1, the first streptococcal phage. J Bacteriol. 2003, 185 (11): 3325-3332. 10.1128/JB.185.11.3325-3332.2003.
- van der Ploeg JR: Genome sequence of the temperate bacteriophage PH10 from Streptococcus oralis. Virus Genes. 2010, 41 (3): 450-458. 10.1007/s11262-010-0525-0.
- van der Ploeg JR: Characterization of Streptococcus gordonii prophage PH15: complete genome sequence and functional analysis of phage-encoded integrase and endolysin. Microbiology. 2008, 154 (Pt 10): 2970-2978.
- Rusniok C, Couve E, Da Cunha V, El Gana R, Zidane N, Bouchier C, Poyart C, Leclercq R, Trieu-Cuot P, Glaser P: Genome sequence of Streptococcus gallolyticus: insights into its adaptation to the bovine rumen and its ability to cause endocarditis. J Bacteriol. 2010, 192 (8): 2266-2276. 10.1128/JB.01659-09.
- Johnsborg O, Havarstein LS: Pneumococcal LytR, a protein from the LytR-CpsA-Psr family, is essential for normal septum formation in Streptococcus pneumoniae. J Bacteriol. 2009, 191 (18): 5859-5864. 10.1128/JB.00724-09.
- Claverys JP, Martin B: Bacterial “competence” genes: signatures of active transformation, or only remnants?. Trends Microbiol. 2003, 11 (4): 161-165. 10.1016/S0966-842X(03)00064-7.
- Peterson SN, Sung CK, Cline R, Desai BV, Snesrud EC, Luo P, Walling J, Li H, Mintz M, Tsegaye G, Burr PC, Do Y, Ahn S, Gilbert J, Fleischmann RD, Morrison DA: Identification of competence pheromone responsive genes in Streptococcus pneumoniae by use of DNA microarrays. Mol Microbiol. 2004, 51 (4): 1051-1070. 10.1046/j.1365-2958.2003.03907.x.
- Beres SB, Sesso R, Pinto SW, Hoe NP, Porcella SF, Deleo FR, Musser JM: Genome sequence of a Lancefield group C Streptococcus zooepidemicus strain causing epidemic nephritis: new information about an old disease. PLoS One. 2008, 3 (8): e3026. 10.1371/journal.pone.0003026.
- Klein MI, Bang S, Florio FM, Hofling JF, Goncalves RB, Smith DJ, Mattos-Graner RO: Genetic diversity of competence gene loci in clinical genotypes of Streptococcus mutans. J Clin Microbiol. 2006, 44 (8): 3015-3020. 10.1128/JCM.02024-05.
- Canals R, Xia XQ, Fronick C, Clifton SW, Ahmer BM, Andrews-Polymenis HL, Porwollik S, McClelland M: High-throughput comparison of gene fitness among related bacteria. BMC Genomics. 2012, 13: 212. 10.1186/1471-2164-13-212.
- Hone DM, Harris AM, Chatfield S, Dougan G, Levine MM: Construction of genetically defined double aro mutants of Salmonella typhi. Vaccine. 1991, 9 (11): 810-816. 10.1016/0264-410X(91)90218-U.
- Sebkova A, Karasova D, Crhanova M, Budinska E, Rychlik I: aro mutations in Salmonella enterica cause defects in cell wall and outer membrane integrity. J Bacteriol. 2008, 190 (9): 3155-3160. 10.1128/JB.00053-08.
- Palmer GC, Palmer KL, Jorth PA, Whiteley M: Characterization of the Pseudomonas aeruginosa transcriptional response to phenylalanine and tyrosine. J Bacteriol. 2010, 192 (11): 2722-2728. 10.1128/JB.00112-10.
- Chen JY, Fung CP, Chang FY, Huang LY, Chang JC, Siu LK: Mutations of the rpoB gene in rifampicin-resistant Streptococcus pneumoniae in Taiwan. J Antimicrob Chemother. 2004, 53 (2): 375-378. 10.1093/jac/dkh073.
- Badger JH, Olsen GJ: CRITICA: coding region identification tool invoking comparative analysis. Mol Biol Evol. 1999, 16 (4): 512-524. 10.1093/oxfordjournals.molbev.a026133.
- Delcher AL, Bratke KA, Powers EC, Salzberg SL: Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007, 23 (6): 673-679. 10.1093/bioinformatics/btm009.
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW: RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007, 35 (9): 3100-3108. 10.1093/nar/gkm160.
- Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25 (5): 955-964.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinforma. 2009, 10: 421. 10.1186/1471-2105-10-421.
- Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Federhen S, Feolo M, Fingerman IM, Geer LY, Helmberg W, Kapustin Y, Landsman D, Lipman DJ, Lu Z, Madden TL, Madej T, Maglott DR, Marchler-Bauer A, Miller V, Mizrachi I, Ostell J, Panchenko A, Phan L, Pruitt KD, Schuler GD, et al: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2011, 39 (Database issue): D38-D51.
- Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999, 27 (1): 29-34. 10.1093/nar/27.1.29.
- UniProt Consortium: The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res. 2010, 38 (Database issue): D142-D148.
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA: The COG database: an updated version includes eukaryotes. BMC Bioinforma. 2003, 4: 41. 10.1186/1471-2105-4-41.
- Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Lu F, Marchler GH, Mullokandov M, Omelchenko MV, Robertson CL, Song JS, Thanki N, Yamashita RA, Zhang D, Zhang N, Zheng C, Bryant SH: CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 2011, 39 (Database issue): D225-D229.
- Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, Richter AR, White O: TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007, 35 (Database issue): D260-D264.
- Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A: The Pfam protein families database. Nucleic Acids Res. 2010, 38 (Database issue): D211-D222.
- Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000, 16 (6): 276-277. 10.1016/S0168-9525(00)02024-2.
- Moller S, Croning MD, Apweiler R: Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics. 2001, 17 (7): 646-653. 10.1093/bioinformatics/17.7.646.
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004, 340 (4): 783-795. 10.1016/j.jmb.2004.05.028.
- Petkau A, Stuart-Edwards M, Stothard P, Van Domselaar G: Interactive microbial genome visualization with GView. Bioinformatics. 2010, 26 (24): 3125-3126. 10.1093/bioinformatics/btq588.
- Darling AE, Mau B, Perna NT: progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010, 5 (6): e11147. 10.1371/journal.pone.0011147.
- Criscuolo A: morePhyML: Improving the phylogenetic tree space exploration with PhyML 3. Mol Phylogenet Evol. 2011, 61 (3): 944-948. 10.1016/j.ympev.2011.08.029.
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28 (10): 2731-2739. 10.1093/molbev/msr121.
- Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986, 3 (5): 418-426.
- Muller J, Szklarczyk D, Julien P, Letunic I, Roth A, Kuhn M, Powell S, von Mering C, Doerks T, Jensen LJ, Bork P: eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res. 2010, 38 (Database issue): D190-D195.
- Grissa I, Vergnaud G, Pourcel C: CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 2007, 35 (Web Server issue): W52-W57.
- Grissa I, Vergnaud G, Pourcel C: CRISPRcompar: a website to compare clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 2008, 36 (Web Server issue): W145-W148.
- Yang J, Chen L, Sun L, Yu J, Jin Q: VFDB 2008 release: an enhanced web-based resource for comparative pathogenomics. Nucleic Acids Res. 2008, 36 (Database issue): D539-D542.
- Li W, Godzik A: Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006, 22 (13): 1658-1659. 10.1093/bioinformatics/btl158.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.