
Pangenome analysis and virulence profiling of Streptococcus intermedius



Streptococcus intermedius, a member of the S. anginosus group, is a commensal bacterium present in the normal microbiota of human mucosal surfaces of the oral, gastrointestinal, and urogenital tracts. However, it has been associated with various infections such as liver and brain abscesses, bacteremia, osteo-articular infections, and endocarditis. Since 2005, high throughput genome sequencing methods enabled understanding the genetic landscape and diversity of bacteria as well as their pathogenic role. Here, in order to determine whether specific virulence genes could be related to specific clinical manifestations, we compared the genomes from 27 S. intermedius strains isolated from patients with various types of infections, including 13 that were sequenced in our institute and 14 available in GenBank.


We estimated the theoretical pangenome size to be 4,020 genes, including 1,355 core genes, 1,054 strain-specific genes and 1,611 accessory genes shared by 2 or more strains. The pangenome analysis demonstrated that the genomic diversity of S. intermedius represents an “open” pangenome model. We identified a core virulome of 70 genes and 78 unique virulence markers. The phylogenetic clusters based upon core-genome sequences and SNPs were independent from disease types and sample sources. However, using principal component analysis based on the presence/absence of virulence genes, we identified the sda histidine kinase, the adhesion protein LAP and the capsular polysaccharide biosynthesis protein cps4E as being associated with brain abscess or broncho-pulmonary infection. In contrast, liver and abdominal abscesses were associated with the presence of the fibronectin-binding protein fbp54 and the capsular polysaccharide biosynthesis proteins cap8D and cpsB.


Based on the virulence gene content of 27 S. intermedius strains causing various diseases, we identified putative disease-specific genetic profiles discriminating those causing brain abscess or broncho-pulmonary infection from those causing liver and abdominal abscess. These results provide insight into S. intermedius pathogenesis and highlight putative targets from a diagnostic perspective.



Streptococcus intermedius belongs to the S. anginosus group (SAG) that also includes S. constellatus and S. anginosus [1]. It is part of the normal flora of the oral cavity and upper respiratory tract, as well as of the gastrointestinal and female urogenital tracts [2,3,4,5]. This bacterium was first described by Guthof in 1956 after being isolated from dental abscesses [6]. S. intermedius may also cause human infections, usually monomicrobial, including purulent abscesses of the liver, lungs, psoas, spine and/or central nervous system, and infective endocarditis [7]. Over the years, the role of S. intermedius in human infections has increasingly been reported. Patients with invasive S. intermedius infections were reported to have significantly longer hospital stays and higher mortality than patients with other S. anginosus group infections, suggesting that identifying this species might be important for the management of patients [8].

Various putative virulence factors have been described for Streptococcus intermedius. These include the ability to form biofilms to protect itself from antibiotics and the host immune system [9]; the production of hydrolytic enzymes, including both glycosaminoglycan-degrading enzymes, such as hyaluronidase and chondroitin sulphate depolymerase, and glycosidases, such as α-N-acetylneuraminidase (sialidase), β-D-galactosidase, N-acetyl-β-D-glucosaminidase and N-acetyl-β-D-galactosaminidase, which allow S. intermedius to grow on macromolecules found in host tissue [10]; a cytotoxin, intermedilysin (ILY), that can directly damage host tissues and immune defense cells and participate in bacterial pathogenicity; and the surface protein antigens I/II that are involved in adhesion to fibronectin and laminin, which is an important step in the pathogenesis of endocarditis and abscess formation [11].

The development of high throughput nucleic acid sequencing technologies has made it possible to observe variations in the genetic repertoire among strains of a given bacterial species. The present study aimed to describe the genetic diversity and the genetic basis of pathogenesis of S. intermedius. Twenty-seven genomic sequences from S. intermedius strains, including 13 newly sequenced in our laboratory and 14 from public databases, were used for pan-genomic analysis. Predicted genes were compared among strains to determine the size of the core and dispensable gene pools, the pangenome, and the gain/loss of putative virulence determinants, and to identify genomic islands.

Accession numbers

The 13 genome sequences determined in this study were deposited in GenBank and their accession numbers are listed in Table 1.

Table 1 Genomic characteristics of the 27 studied S. intermedius strains

Materials and methods

Extraction and genome sequencing

The genomic DNA (gDNA) of each studied S. intermedius strain was extracted in two steps. A mechanical treatment was first performed using acid-washed glass beads (G4649-500 g, Sigma) and a FastPrep BIO 101 instrument (Qbiogene, Strasbourg, France) at maximum speed (6.5) for 90 s. Then, following a 2-hour lysozyme incubation at 37 °C, DNA was extracted using an EZ1 biorobot and the EZ1 DNA Tissue kit (Qiagen, Hilden, Germany). The elution volume was 50 µL. Genomic DNA was quantified using the Qubit assay (Life Technologies, Carlsbad, CA, USA).

The gDNAs were sequenced using a MiSeq sequencer with the Paired-End strategy and the Nextera XT library kit (Illumina, Inc, San Diego, CA, USA). The Paired-End library was prepared using 1 ng of gDNA as input. The gDNAs were fragmented at the tagmentation step. Then, limited cycle PCR amplification (12 cycles) completed the tag adapters and introduced dual-index barcodes. After purification on AMPure beads (Life Technologies, Carlsbad, CA, USA), the libraries were normalized according to the Nextera XT protocol (Illumina). Normalized libraries were pooled for sequencing on a MiSeq sequencer (Illumina). Automated cluster generation and paired-end sequencing with dual index reads were performed in a single 39-hour run in a 2 × 250 bp format. The numbers of paired-end reads are summarized in Table 2. The paired-end reads were trimmed and filtered according to read quality.

Table 2 Genome sequencing details of the 13 S. intermedius strains from our study

Genome assembly, annotation and comparison

After sequencing, the obtained reads were assembled using the A5 software [12] with default parameters, and contigs were then compared to the NCBI database using BLASTn to remove contaminants. Then, the online tool Fasta dataset joiner was used to merge sequences into a single molecule. The Mauve software was used for multiple genomic sequence alignment [13]. Genes were annotated using the Prokka software with default parameters [14], in which the similarity e-value cut-off is 0.000001 and the minimum contig size is 200 bp. This pipeline also includes several other tools such as Aragorn for tmRNA detection, Barrnap to count rRNAs and Prodigal to identify coding sequences. To estimate the mean level of sequence similarity at the genome level among studied strains, we used the OrthoANI [15] and Genome-to-Genome Distance Calculator (GGDC) [16] software, with respective threshold values of 95–96 % and 70 %.

Phylogenetic analysis

A 16S rRNA-based phylogenetic analysis of the 27 studied S. intermedius strains was performed using the MEGA 7 software [17]. For constructing the phylogenetic tree, the following options were used: Maximum Likelihood method; Kimura 2-parameter model for substitution model; uniform rates among sites; partial deletion option for gaps/missing data; 1000 bootstrap replicates.
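To make the Kimura 2-parameter model named above concrete, here is a minimal sketch of the pairwise distance it defines, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. This is only an illustration of the substitution model, not MEGA's maximum likelihood procedure; the function name and the per-pair handling of gaps are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with a gap or ambiguous base are skipped per pair.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A <-> G or C <-> T is a transition; anything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For nearly identical sequences, such as 16S rRNA genes within one species, the distance is close to the raw proportion of differing sites.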

Using genomic sequences and the Roary program [18], a clustered heatmap of core genes was drawn on the basis of the presence/absence approach [18]. We also detected SNPs with the snp-sites program [19] from the core genome alignment and drew a phylogenetic tree with CGEwebface [20].

Virulence factor analysis

Virulence-associated genes were detected by comparing studied genomic sequences with the virulence factor database (VFDB) [21] and sequences described in recent publications [22]. The BLASTp search was performed using the threshold scores reported by Olson et al.: 35 % identity and highest scoring pair length of 50 % [22]. Additionally, we reviewed the literature to identify the proteins involved in interactions with the host [10, 23, 24]. A principal component analysis was performed using the XLSTAT program (Data Analysis and Statistical Solution for Microsoft Excel, Addinsoft, Paris, France 2017) in which the Fisher’s least significant difference (LSD, α = 0.005) and Pearson’s correlation coefficients were used, to detect any association of virulence-associated genes with specific clinical conditions.
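The two BLASTp thresholds quoted above can be sketched as a simple hit filter. This is a hedged illustration: the `Hit` record and its field names are hypothetical, and reading the "highest scoring pair length of 50 %" criterion as alignment length divided by query length is an assumption, not something the text specifies.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTp hit (hypothetical record layout for this example)."""
    query: str
    subject: str
    identity_pct: float  # percent identity over the aligned region
    align_len: int       # length of the highest scoring pair
    query_len: int       # full length of the query protein


def passes_thresholds(hit, min_identity=35.0, min_hsp_fraction=0.50):
    """Keep hits with >= 35 % identity and an HSP spanning >= 50 % of the query."""
    hsp_fraction = hit.align_len / hit.query_len
    return hit.identity_pct >= min_identity and hsp_fraction >= min_hsp_fraction


def filter_hits(hits, **thresholds):
    """Return only the hits that satisfy both thresholds."""
    return [h for h in hits if passes_thresholds(h, **thresholds)]
```

Both conditions must hold, so a short but highly identical match is rejected just like a long but weakly similar one.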

Core and pan-genome analysis

GET_HOMOLOGUES [25] was used to identify orthologous genes among S. intermedius strains, using the following parameters: minimum coverage (-C) 40 %, minimum identity (-S) 50 %, maximum E-value (-E) 1e-05. Sequence similarity searches and clustering of coding sequences (CDSs) from the 27 genomes were performed using pairwise BLASTp and the OrthoMCL algorithm [26]. Sequential inclusion of all possible combinations of up to 27 strains was simulated and fitted by regression analysis [27] of the number of conserved genes and of strain-specific genes. This allowed us to estimate and extrapolate the sizes of the core- and pan-genomes. Roary [18] was also used, with default parameters, to confirm the reliability of the obtained pan-genome analysis results (identity percent ≥ 70 %, coverage ≥ 70 %) and to generate the core genome alignment.
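Once orthologous clusters are available, partitioning them into core, accessory and strain-specific genes is a counting exercise. The sketch below assumes a hypothetical input, a mapping from cluster identifier to the set of strains carrying it; it does not reproduce the clustering or the regression-based extrapolation performed by the tools cited above.

```python
def classify_clusters(presence, n_strains):
    """Split ortholog clusters into core, accessory and strain-specific lists.

    presence: dict mapping a cluster id to the set of strains that carry it
    n_strains: total number of genomes compared
    """
    core, accessory, unique = [], [], []
    for cluster, strains in presence.items():
        count = len(strains)
        if count == 0:
            raise ValueError(f"cluster {cluster!r} has no member strain")
        if count == n_strains:
            core.append(cluster)        # present in every genome
        elif count == 1:
            unique.append(cluster)      # strain-specific
        else:
            accessory.append(cluster)   # shared by 2 or more, but not all
    return core, accessory, unique


# Toy input with three hypothetical strains and four hypothetical clusters.
toy = {
    "c1": {"s1", "s2", "s3"},
    "c2": {"s1", "s2"},
    "c3": {"s3"},
    "c4": {"s1", "s2", "s3"},
}
core, accessory, unique = classify_clusters(toy, n_strains=3)
pangenome_size = len(core) + len(accessory) + len(unique)
```

The pangenome size is the sum of the three categories, which is the same bookkeeping used for the totals reported in the results.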

Functional classification of orthologous cluster analysis

The Clusters of Orthologous Groups (COGs) database was used to identify gene functions [28] using BLASTP (E-value 1e-03, coverage 0.7 and identity percent 30 %).

A circular comparison of genomes was obtained using the online GView Server with S. intermedius strain ATCC 27335 as reference genome [29]. ResFinder and the ARG-ANNOT database were used to search for antibiotic resistance-related markers [30, 31]. The presence of CRISPR repeats and prophages was predicted using the CRISPRFinder [32] and PHASTER [33] software, respectively.

Results and discussion

Strain characterization

The 27 studied S. intermedius strains originated from China, Canada, South Korea, US, Japan and France. Patient data were not available for some strains. The 13 French strains (G1552-G1557 and G1562-G1568, Tables 1 and 2) were isolated in our laboratory from patients with various infections (Table 2), from August 2014 to November 2016, on 5 % sheep blood-enriched Columbia agar (BioMérieux) at 37 °C in anaerobic atmosphere. Their identification was confirmed by the high scores (> 2) obtained using MALDI-TOF MS. In addition, 14 S. intermedius genome sequences were retrieved from GenBank. The 27 strains were divided into 8 groups according to their isolation source (Table 2). The genome sizes and gene numbers among S. intermedius strains were relatively similar. Each strain had a single chromosome ranging in size from 1.85 Mbp to 2.05 Mbp, and no plasmid was identified in any strain (Table 2).

A schematic view of all 27 studied genomes is provided in Fig. 1, showing an overall high degree of conservation. The general features of S. intermedius genomes are summarized in Table 2. The G + C content of S. intermedius ranged from 37.3 to 37.8 % (avg 37.641 %, n = 27). All 13 in-house sequenced S. intermedius strains contained at least 47 tRNA genes, and the number of rRNAs for all strains ranged from 3 to 6. Streptococcus intermedius genomes exhibited an average of 1,870 CDSs with a mean length of 907 bp, accounting for 87.3 % of the whole genome.
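As a quick consistency check on the averages reported above, the following sketch computes G + C content for a sequence and derives the genome size implied by the mean CDS count, mean CDS length and coding fraction. The function name is an illustrative choice, and the back-calculation assumes non-overlapping coding sequences.

```python
def gc_content(seq):
    """Percent G + C among unambiguous bases of a nucleotide sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Averages taken from the text: 1,870 CDSs of 907 bp covering 87.3 % of the genome.
mean_cds_count = 1870
mean_cds_length_bp = 907
coding_fraction = 0.873

coding_bp = mean_cds_count * mean_cds_length_bp   # total coding bases
implied_genome_bp = coding_bp / coding_fraction   # roughly 1.94 Mbp
```

The implied size of about 1.94 Mbp falls inside the 1.85 to 2.05 Mbp range reported for the 27 genomes.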

Fig. 1

Circular representation of the 27 studied S. intermedius genomes. Genomic sequences were aligned using strain ATCC 27335 as reference. The alignment gaps tend to coincide with the regions of low G + C contents. The rings, from the inside out, display the size in kbp; GC skew; G + C content; followed by genomes as listed in the left legend

Phylogenetic analysis

The 16S rRNA gene is widely used as a marker to differentiate Streptococcus species [34]. The 16S rRNA-based phylogenetic analysis (Fig. 2) demonstrated that all S. intermedius strains were grouped in a single cluster that was closely related to S. anginosus and S. constellatus within the S. anginosus group [22]. In the topology, S. intermedius, S. constellatus and S. anginosus strains clustered together with their subspecies. However, the heatmap obtained using Roary [18], based on the core genome, was more discriminatory within the species than the 16S rRNA-based analysis and identified 3 clusters that were independent from the strain source (Fig. 3).

Fig. 2

16S rRNA-based phylogenetic relationships of S. intermedius strains using the Maximum Likelihood method with Kimura 2-parameter. The scale bar indicates the evolutionary distance between the sequences determined by a 0.005 substitution per nucleotide position. Numbers at the nodes indicate bootstrap values obtained from 1,000 replicates

Fig. 3

Clustered gene presence/absence and accessory genome distribution calculated by pangenome analysis among the 27 studied S. intermedius strains. Left: core-genome phylogeny; the three clusters in the dendrogram are delineated by red lines; right: heatmap of core genes

The three clusters are as follows: strains G1557, G1556, LC4, G1562, SK54, AJKN01, ATCC27335 and JTH08 constituted the first group; strains 30309, G1563, G1564, 631_SCON and G1554 clustered in the second group; and the remaining strains clustered in a third group. There was no evidence of correlation between strain clusters and clinical forms, or between genomic types and the geographical origin of isolates.

To measure the divergence between all studied strains at a deeper level, we also analyzed their phylogenetic relationships on the basis of core genome SNPs, which demonstrated that strains G1562, G1566 and F0413 diverged from other strains and exhibited a higher tendency toward recombination. However, again no disease-specific clustering was observed (Fig. 4).

Fig. 4

Phylogenetic tree of S. intermedius strains based upon SNPs extracted from the core genome. Sequences were aligned using ClustalW with default parameters and phylogenetic inferences obtained using the Maximum likelihood method within the MEGA, version 7, software. Nodes indicate bootstrap support from 1000 replicates.

Genomic similarity

Digital DNA-DNA hybridization (dDDH) values ranged from 80.5 to 99.3 % between all 27 strains, thus confirming their classification within a single species. This was cross-validated by the OrthoANI program, which produced pairwise values ranging from 97.78 to 100 %, well above the consensus 95–96 % threshold for prokaryotic species demarcation [35]. This corresponded to 100 % 16S rDNA sequence identity across all studied isolates. The above data correlate with a strong degree of genome conservation and synteny.
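The species demarcation logic used here reduces to two threshold comparisons. The sketch below is a minimal illustration using the cut-offs quoted in the text (70 % for dDDH and the 95–96 % range for ANI); the function name and the choice of 95 % as the default ANI cut-off are assumptions for the example.

```python
DDDH_THRESHOLD = 70.0  # percent, dDDH species cut-off quoted in the methods
ANI_THRESHOLD = 95.0   # percent, lower bound of the 95-96 % consensus range


def same_species(ani_pct, dddh_pct,
                 ani_threshold=ANI_THRESHOLD, dddh_threshold=DDDH_THRESHOLD):
    """Return True when both genome similarity indices meet their cut-offs."""
    return ani_pct >= ani_threshold and dddh_pct >= dddh_threshold


# Lowest pairwise values reported for the 27 strains.
lowest_pair_ok = same_species(ani_pct=97.78, dddh_pct=80.5)
# Same check with the stricter end of the ANI range.
lowest_pair_ok_strict = same_species(97.78, 80.5, ani_threshold=96.0)
```

Because even the lowest reported pairwise values clear both cut-offs, every pair of strains does.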

Functional classification of orthologous cluster

The overall distribution of S. intermedius proteins in COG categories was quite similar in all 27 studied strains (Fig. 5). Previous studies of other Streptococcus species also suggested that, within a given species, the majority of strains had a similar COG profile [36,37,38]. Approximately 79.72 % of all proteins predicted in all strains were identified in COG superfamilies. The proportion of each category fluctuated within a very small range, showing almost similar percentages of distribution in all strains. The most abundant sub-categories were related to carbohydrate transport and metabolism (G) and translation, ribosomal structure and biogenesis (J) like their distribution in core genes.

Fig. 5

Differential distribution of COG functional categories in S. intermedius: a proportion of six classes of functional categories in strain-specific and core genes; b functional categories in strain-specific and core genes; c functional categories in the 27 S. intermedius strains. Category abbreviations are as follows: C, energy production and conversion; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; X, mobilome: prophages, transposons; A, RNA processing and modification; B, chromatin structure and dynamics; J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; D, cell cycle control, cell division, chromosomal partitioning; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, posttranslational modification, protein turnover, chaperones; T, signal transduction mechanisms; U, intracellular trafficking, secretion, and vesicular transport; V, defense mechanisms; W, extracellular structures; Z, cytoskeleton; R, general function predicted only; S, function unknown

Less than half of strain-specific genes, but more than 90 % of core genes, had a match in the COGs database. The most abundant functions in core genes were associated with metabolism (Fig. 5a). The overall proportion of metabolic functions in core genes was 32.47 %, whereas that in strain-specific genes was 9.58 %. More specifically, energy production and conversion (C), amino acid transport and metabolism (E), nucleotide transport and metabolism (F), carbohydrate transport and metabolism (G) and coenzyme transport and metabolism (H) were noticeably more abundant in core genes (p-value < 0.01) (Fig. 5b). No mobilome-related functions were detected in S. intermedius. The functional category of information storage and processing showed highly different proportions in sub-categories (Fig. 5b). The functions of translation, ribosomal structure and biogenesis (J) were significantly enhanced (p-value < 0.0001) in core genes, whereas the functions of replication, recombination and repair (L) were significantly enhanced (p-value < 0.01) in strain-specific genes. This trend was also observed in other bacteria [35]. In the cellular processing and signaling category, the function of defense mechanisms (V) was more abundant in strain-specific (p-value < 0.05) than in core genes (Fig. 5c).

Pan- and core-genome analyses

The average number of new genes added by a novel genome was 40 when the 27th genome was added (Fig. 6). The exponential decay model shown in Fig. 7a suggests that the number of conserved core genes approached an asymptote with the comparison of 27 genomes. A total of 1,355 core genes were identified in S. intermedius. The average proportion and sequence identity of core genes per strain were 72 and 97.79 %, respectively, indicating that core genes in S. intermedius are highly conserved and reflecting a low degree of intraspecies genomic variability too. Examination of the functional annotation of these core genes suggests, as expected, that they encode mostly core metabolic processes.

Fig. 6

Plot representing the numbers of new and unique genes found as each isolate of S. intermedius is added

Fig. 7

Fitted curves indicating the characteristics of core- and pan-genomes from 27 studied S. intermedius strains. a curve showing the relationship between the core genome and the number of genomes, b curve showing the relationship between the pan-genome and the number of genomes. As the number of genomes sequenced increased, the pan-genome size increased, whereas the core-genome size decreased, thus indicating an open pan-genome model for S. intermedius. The number of genes that each strain contains is given in Table 2

A total of 1,054 strain-specific genes were identified in S. intermedius and the average number of strain-specific genes per strain was 39 (Fig. 7b). Among strain-specific genes, 148 genes were found in strain G1562, 107 in strain TYG1620, 105 in strain BA1, 96 in strain 32811, 82 in strain G1557, 73 in strain C270, 69 in strain 631_SCON, 61 in strain G1555, 41 in strain G1554, 38 each in strains G1565 and F0413, 33 each in strains G1564 and LC4, 27 each in strains G1556 and ATCC27335, 20 in strain 30309, 16 in strain G1553, 13 in strain G1552, 11 in strain B196, 6 in strain KCOM1545, 5 in strain FDAARGOS_233 and 1 each in strains G1563, G1566, G1567, JTH08 and SK54AJKN01. The size of the pangenome increased steadily without reaching a plateau. The pangenome trend depicted in Fig. 7b shows a gradual expansion with the addition of new genomes; thus, the pangenome of S. intermedius may be considered open, which indicates a homogeneous pattern of genome evolution with similar rates of gene gain/loss across the whole population. In addition, a total of 1,611 accessory genes that were shared by two or more strains were identified. Overall, we identified a S. intermedius pangenome of 4,020 genes including 1,355 core genes, 1,054 strain-specific genes and 1,611 accessory genes.
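The totals in this paragraph can be checked with a few lines of arithmetic. The sketch below transcribes the per-strain counts listed above (the paragraph names 26 strains) and recomputes the strain-specific total, the mean over the 27 genomes, and the pangenome size; the variable names are illustrative.

```python
# Strain-specific gene counts as listed in the paragraph above.
strain_specific = {
    "G1562": 148, "TYG1620": 107, "BA1": 105, "32811": 96, "G1557": 82,
    "C270": 73, "631_SCON": 69, "G1555": 61, "G1554": 41,
    "G1565": 38, "F0413": 38, "G1564": 33, "LC4": 33,
    "G1556": 27, "ATCC27335": 27, "30309": 20, "G1553": 16, "G1552": 13,
    "B196": 11, "KCOM1545": 6, "FDAARGOS_233": 5,
    "G1563": 1, "G1566": 1, "G1567": 1, "JTH08": 1, "SK54AJKN01": 1,
}

n_genomes = 27
total_strain_specific = sum(strain_specific.values())       # 1054
mean_per_genome = total_strain_specific / n_genomes          # about 39

core_genes = 1355
accessory_genes = 1611
pangenome_size = core_genes + total_strain_specific + accessory_genes  # 4020
```

The listed counts add up to the reported 1,054 strain-specific genes, and the three categories sum to the reported pangenome of 4,020 genes.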

Virulence factors and principal component analysis

In the S. intermedius pangenome, 252 virulence factors were identified in total. Of these, 70 core virulence factors were shared by all strains and 78 unique virulence factors were present in one strain each (Table 3). Virulence-associated genes present in all studied genomes included homologous virulence genes that contribute to bacterial avoidance of the immune system, such as ily, which encodes an intermedilysin; the lmb, pspA, pavB/pfbB and fss3 genes coding surface proteins; the genes coding the polysaccharide capsule (cps4A, cps4B, cps4C, cps4D, cps8D); the auto-inducer LuxS (luxS); the binding proteins (pavA, hitC, fbpC, psaA, mntA, clpC, fss3); neuraminidase (nanA); hyaluronidase (hysA); heat shock protein B (htpB); genes from the sil locus known to play a role in quorum-sensing and virulence in S. pyogenes (silA, silD, silE, salX); genes associated with secretion systems (lem11, lem15, sdeC, ceg32, esxA, essC, lpg2372, lirB); and genes associated with Mg2+ transport proteins (mgtB, mgtC). The response regulator CsrR, the beta-hemolysin gene (cylG), the laminin-binding surface protein-like Pac and the invasion protein inlA were also present in all strains.

Table 3 Unique Virulence-associated genes detected in the 27 studied S. intermedius genomes

Among these core virulence genes, several have documented roles in pathogenesis. The surface protein antigen I/II was demonstrated to play a potential role in S. intermedius pathogenesis [39], and human fibronectin and laminin, which are thought to bind to this antigenic protein, induce IL-18 release from monocytes [39]. Genes from the streptococcal invasion locus (sil) are related to enhanced virulence in the SAG and may contribute to the invasive behavior of S. intermedius strains. The internalin (inlA), likely acquired from Listeria monocytogenes, increases the virulence of S. intermedius by playing a key role in attachment to host cells [40]. The hyaluronidase (hysA) acts in the liquefaction of tissues and is also involved in biofilm formation, which protects bacteria from host defenses and antibiotics, and plays a role in infection [9]. The ily-coded intermedilysin can directly damage host tissues and immune defense cells, causing human cell death by membrane bleb formation [23]. It has also been reported that intermedilysin helps in invasion and adhesion of bacteria to human liver cells, and in cytotoxicity [41]. The galE gene codes a UDP-galactose 4-epimerase that plays a role in biofilm formation and whose key residues are essential for epimerase activity [42]. The laminin-binding surface protein, homologous to that in Streptococcus agalactiae, is coded by the Pac gene and is essential in binding to and invasion of different host surfaces; it is present in almost all group B Streptococcus strains causing pneumonia, septicemia and meningitis [43, 44]. The psaA gene codes a surface lipoprotein that plays a role in Streptococcus pneumoniae systemic infections by interacting with monocytes [45]. We also identified the heat shock protein-coding gene htpB, which is known in Legionella pneumophila to act in adhesion to host fibronectin [46]. The clpC gene codes a heat shock protein with ATPase activity that is involved in the invasion of hepatocytes by Listeria monocytogenes [47]; ATPase proteins were shown to play a role in survival and virulence in Salmonella typhimurium and S. aureus [48]. The clpP gene codes an ATP-dependent caseinolytic protease that was proven in Streptococcus suis to play a role in colonization and bacterial adaptation to various environmental stresses [49]. The pavB gene codes a fibronectin-binding protein that mediates bacterial attachment to human epithelial and endothelial cells and also promotes transfer of bacteria to the bloodstream [50, 51]. Finally, nanA codes a highly conserved neuraminidase that possesses sialidase activity, catalyzing the cleavage of terminal sialic acid residues from glycoconjugates. In S. pneumoniae, it promotes biofilm formation and contributes significantly to broncho-pulmonary colonization [52].

Although most of the strains exhibited one to eight unique virulence genes, strains G1562 and BA1 possessed 14 and 10 specific virulence genes, respectively. Eight strains (G1563, G1566, G1567, G1568, BA1, KCOM1545, JTH08, SK54AJKN01) had no strain-specific virulence factor (Table 3).

Among unique virulence genes, sdcA, ybtE, lpbA, SalR, salK and VopT are secretory system-associated genes that are involved in iron-mediated transport across cellular membranes. Some of these genes are linked with bacterial growth and act as important anti-inflammatory effectors [42, 53,54,55,56,57,58]. Among other unique genes, the pilC gene is suspected to be essential for secretion and assembly of transcription factor P, important in pilus formation [59], while pilT helps in polymerization and depolymerization of pilin [60]. The brkA gene inhibits bactericidal activity and protects the bacterium from complement activation products [61]. Other unique genes are linked with bacterial adherence and colonization, such as hopH, toxB, mpn372 and stcE, which contribute significantly to actin organization and bacterial attachment to human surfactant proteins [62,63,64,65]. The iraAB gene utilizes iron-loaded peptides that promote iron assimilation [66], while lepA plays a role in bacterial growth and induces an inflammatory response. This gene also plays a key role in pathogenicity in Pseudomonas aeruginosa [67]. The fcrA gene codes a protein containing receptor domains for immunoglobulins similar to those of M-related proteins [68]. Another immunoglobulin-related gene, aga, plays a barrier function against mucosal antibodies by cleaving IgA1 [69]. IpsA controls transcriptional biogenesis of the cell wall in inositol-derived lipid formation in Corynebacterium and Mycobacterium species [70]. The vasL gene is considered to be a component of the vas genes, associated with the membrane type VI secretion system [71], and ravL is presumably activated at low oxygen levels and regulates virulence gene expression via the clp gene [72]. The lpg0365 gene codes a lipophosphoglycan that, together with other membrane polypeptides, is necessary for Leishmania pathogenesis [73]. The pvdJ gene is involved in the production of cyclodipeptides that may regulate the production of biofilm [74].
In addition, pvdL is associated with the biosynthesis or uptake of the siderophores pyoverdine and pyochelin, which act in the transport of heme and ferrous ions [75], while pvdD is involved in the biosynthesis of pyoverdine in Pseudomonas aeruginosa [76]. IpaJ codes a plasmid antigen involved in demyristoylation of proteins by inducing Golgi fragmentation and inhibiting hormone trafficking [77]. AliA is associated with nasopharyngeal colonization in Streptococcus pneumoniae [78]. The espN gene is reported in Mycobacterium tuberculosis to play a role in adding an acetyl group to the N-terminus of the ESAT-6 virulence factor [79]. Flagella-related unique genes found in different studied strains include flgG, flgI, flgJ and flgK, which play a major role in virulence, adhesion and motility. They are mostly involved in flagellum formation and also act as an interface with other flagellar proteins [80,81,82,83]. The inlK gene was reported in Listeria monocytogenes to help avoid autophagy, while virB8 localizes to the inner membrane and is related to the export of alkaline phosphatase to the periplasm [84]. Finally, sigA codes a sigma factor linked with galactosidase activity [85].

Using principal component analysis of differentially distributed virulence genes, three distinct clusters were visualized (Fig. 8). A clear separation of virulence genes associated with brain or broncho-pulmonary abscesses (cps4E, sda and lap) from those associated with liver or abdominal abscesses (cpsB, fbp54 and cap8D) was observed. The first component, which has maximum coverage and represents the largest variation, showed that brain abscess-causing strains were associated with genes coding ATP-dependent proteolytic enzymes, which indicates their potential role in abscess formation. Other virulence genes clustered independently, excluding any association with the previous two disease categories. Among virulence genes associated with brain and broncho-pulmonary infections, sda codes a histidine kinase that regulates sporulation initiation in Bacillus subtilis and mediates the expression of virulence-associated factors [86]; lap codes the Listeria adhesion protein (LAP), a host stress response protein responsible for adhesion and promotion of translocation across monolayers [87]; and cps4E codes the capsular polysaccharide biosynthesis protein that was demonstrated in S. pneumoniae to prevent phagocytosis by forming an inert shield essential for encapsulation [88, 89].
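To illustrate how a principal component analysis separates strains on a gene presence/absence matrix, here is a minimal pure-Python sketch that extracts the first component by power iteration on the covariance matrix. It is not the XLSTAT procedure used in the study; the strain labels and the 0/1 matrix are invented toy data that merely mimic two disease groups, and the function name is an assumption.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for a strains x genes 0/1 matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix between gene columns (m x m).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    # A non-uniform start vector avoids being orthogonal to contrast-type axes.
    start = [float(j + 1) for j in range(m)]
    norm = math.sqrt(sum(x * x for x in start))
    vec = [x / norm for x in start]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(m)) for c in centered]
    return vec, scores


genes = ["cps4E", "sda", "lap", "cpsB", "fbp54", "cap8D"]
# Hypothetical toy matrix: rows 0-2 mimic brain isolates, rows 3-5 liver isolates.
matrix = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
loadings, scores = first_principal_component(matrix)
```

In this toy case the two groups receive scores of opposite sign on the first component, and the genes of each group share the sign of their loadings, which is the kind of separation a PCA biplot displays. The overall sign of a principal component is arbitrary.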

Fig. 8

Principal component analysis based upon gene presence/absence showing the distribution of virulence genes which may contribute to the particular type of abscess. The green color represents the various clinical forms while virulence genes are represented in red and studied strains are in blue. BPA is Broncho Pulmonary abscess and Abd abscess denotes abdominal abscess

In S. pyogenes, fbp54 codes a fibronectin-binding protein that acts as an immunogen in humans. The amino acid sequence of fbp54 in S. intermedius is similar to that of S. pneumoniae. cap8D codes a dehydratase that is essential for the synthesis of the capsule precursor involved in adhesion. It has also been targeted as a component for vaccine development [85, 90]. cpsB codes a capsular polysaccharide biosynthesis protein that is essential for encapsulation in S. pneumoniae and is involved in the interaction of bacteria with their environment, notably their host organism [91].

In contrast to the above-mentioned genes, some were not found to be disease-specific. These included glf, cpsE, cpsI, cpsA, cps4C, cps8P and hasC. The glf gene is involved in the biosynthesis of the unusual monosaccharide galactofuranose [92]; cpsE codes a glycosyl transferase that is responsible for the addition of activated sugars to the lipid carriers in the bacterial membrane and is essential for encapsulation in S. pneumoniae [93]; cpsI is essential for the production of high molecular weight capsular polysaccharides [94]; cpsA and cps8P are necessary for normal cell wall integrity and composition [95]; cps4C codes a polysaccharide tyrosine kinase adaptor protein that plays a key role in the regulation of capsule biosynthesis [96]; finally, hasC encodes a glucose-1-phosphate uridylyltransferase involved in the biosynthesis of the hyaluronic acid capsule [97].

Resistance-related genes and prophages

The tetracycline resistance gene tetM was identified in strains G1552, C270, KCOM1545, G1555, LC4, 30309 and 32811, whereas tet32 was identified in strain 631_SCON (Table 4). The macrolide resistance gene ermB was detected in strains G1552, C270, G1555 and 30309. No antibiotic resistance gene was identified in the other strains.

Table 4 Antimicrobial resistance genes of studied S. intermedius strains
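As a small worked tally of the resistance genes listed in the text, the sketch below inverts the gene-to-strain lists into a per-strain view and counts carriers. The strain and gene names come from the paragraph above, with strain identifiers written as plain strings. The variable names are illustrative only.

```python
# Resistance genes and the strains in which the text reports them.
resistance = {
    "tetM": ["G1552", "C270", "KCOM1545", "G1555", "LC4", "30309", "32811"],
    "tet32": ["631_SCON"],
    "ermB": ["G1552", "C270", "G1555", "30309"],
}
TOTAL_STRAINS = 27  # number of genomes compared in the study

# Invert to a strain -> genes mapping.
by_strain = {}
for gene, strain_list in resistance.items():
    for strain in strain_list:
        by_strain.setdefault(strain, []).append(gene)

carriers = len(by_strain)
without_genes = TOTAL_STRAINS - carriers
tet_and_erm = sorted(
    strain for strain, found in by_strain.items() if {"tetM", "ermB"} <= set(found)
)

print(f"strains with at least one resistance gene: {carriers}")
print(f"strains with none detected: {without_genes}")
print(f"strains carrying both tetM and ermB: {tet_and_erm}")
```

On these lists, eight strains carry at least one resistance gene, and every strain carrying ermB also carries tetM.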

A set of prophage elements was identified in all 27 strains (Table 1). In addition, four prophage-like elements were detected in strain BA1, three in strain TYG1620, and two in strains G1562, G1564, G1553, G1557, F0413 and G1555. The major difference in genome size among the 27 studied strains of S. intermedius resided in the number of phages, and the presence of these phages also points to a contribution of horizontal gene transfer to the emergence of this species [98].

CRISPR identification analysis

The search for CRISPR elements showed that 14 of the 27 studied genomes contained CRISPRs. Three of these 14 strains (G1564, G1565 and 631_SCON) had more than one CRISPR, for a total of 17 CRISPR modules identified in the studied strains. The direct repeat (DR) length in the identified CRISPRs ranged from 24 to 36 bp, and the number of spacers varied from one CRISPR to another. CRISPRs also differed among strains, but the DR regions were similar within a given CRISPR subtype. Based on the type of Cas proteins, the CRISPRs of strains G1562, G1563, G1564, G1556, G1554, 631_SCON and 30309 were subtype I-C; those of strains FDAARGOS_233 and KCOM1545 were subtype II-A; finally, those of strains G1565, G1552, B196, G1555 and 32811 were subtype II-C [93] (Table 5).

Table 5 CRISPR elements found in studied S. intermedius strains
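The counts reported for the CRISPR analysis can be cross-checked with a few lines of arithmetic. The sketch below uses only the strain lists and totals given in the text. The variable names are illustrative, and the final figure is a derived consistency check, not a result stated by the authors.

```python
# CRISPR subtype assignments as listed in the text.
subtypes = {
    "I-C": ["G1562", "G1563", "G1564", "G1556", "G1554", "631_SCON", "30309"],
    "II-A": ["FDAARGOS_233", "KCOM1545"],
    "II-C": ["G1565", "G1552", "B196", "G1555", "32811"],
}
TOTAL_MODULES = 17  # CRISPR modules reported across all strains
MULTI_CRISPR_STRAINS = ["G1564", "G1565", "631_SCON"]  # more than one CRISPR

counts = {subtype: len(strain_list) for subtype, strain_list in subtypes.items()}
all_strains = [s for strain_list in subtypes.values() for s in strain_list]
n_strains = len(set(all_strains))

# Strains with exactly one CRISPR contribute one module each; the rest of
# the modules must sit in the strains reported to have more than one.
single_crispr_strains = n_strains - len(MULTI_CRISPR_STRAINS)
modules_in_multi = TOTAL_MODULES - single_crispr_strains

print("strains per subtype:", counts)
print("strains with a CRISPR:", n_strains)
print("modules carried by the multi-CRISPR strains:", modules_in_multi)
```

The subtype lists add up to the 14 CRISPR-positive strains reported, and the three multi-CRISPR strains together account for six of the 17 modules.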


In the present study, we report 13 new clinical isolates of S. intermedius and, using a combined approach of pangenomics, core-genomics and virulence profiling of 27 strains, attempted to identify disease-specific genetic profiles. The analysis revealed genomic variability across strains within the species, although synteny of the core genome was preserved. Our results highlight the importance in pathogenesis of surface proteins such as pavB, pspA and cps4 (polysaccharide-coding proteins) and of the binding proteins psaA and pavA, which are present in all studied strains. The PCA results suggest two distinct categories of virulence genes: the ATP-dependent proteolytic virulence genes cps4E, sda and lap, which are associated with brain and broncho-pulmonary abscesses, and the capsular polysaccharide protein-coding genes cpsB and cap8D, which are linked with liver and abdominal abscess formation. The fibronectin-binding protein coded by fbp54 also appears to be connected with liver and abdominal abscess formation. A recent study also attempted to determine the pangenome of S. intermedius [99]. Neither the SNP-based phylogenetic tree nor the core gene-based tree showed clustering related to any disease entity in S. intermedius strains. Overall, the study provides a genetic framework for assessing and understanding the molecular events contributing to S. intermedius pathogenesis. However, due to the limited number of studied strains, the role of these virulence factors will require experimental confirmation.

Availability of data and materials

All studied sequences are available in GenBank under the following accession numbers:

UENI00000000.1, UEND00000000.1, UENF00000000.1, UENG00000000.1, UICY00000000.1, UENA00000000.1, UENB00000000.1, AP014880.1, ANFT00000000.1, CP020433.2, UENJ00000000.1, NZ_UZBH00000000.1, UENK00000000.1, UENH00000000.1, CP003857.1, CP003858.1, UENC00000000.1, JUZI00000000.1, CP012718.1, AFXO00000000.1, UENE00000000.1, PNRP00000000.1, PNRI00000000.1, PNRH00000000.1, ATFK00000000.1, AP010969.1, AJKN00000000.1.


  1. Whiley RA, Fraser H, Hardie JM, Beighton D. Phenotypic differentiation of Streptococcus intermedius, Streptococcus constellatus, and Streptococcus anginosus strains within the “Streptococcus milleri group.” J Clin Microbiol. 1990;28:1497–501.

  2. Rabe LK, Winterscheid KK, Hillier SL. Association of viridans group streptococci from pregnant women with bacterial vaginosis and upper genital tract infection. J Clin Microbiol. 1988;26:1156–60.

  3. Whiley RA, Beighton D, Winstanley TG, Fraser HY, Hardie JM. Streptococcus intermedius, Streptococcus constellatus, and Streptococcus anginosus (the Streptococcus milleri group): association with different body sites and clinical infections. J Clin Microbiol. 1992;30:243–4.

  4. Claridge JE, Attorri S, Musher DM, Hebert J, Dunbar S. Streptococcus intermedius, Streptococcus constellatus, and Streptococcus anginosus (“Streptococcus milleri group”) are of different clinical importance and are not equally associated with abscess. Clin Infect Dis. 2001;32:1511–5.

  5. Tran MP, Caldwell-McMillan M, Khalife W, Young VB. Streptococcus intermedius causing infective endocarditis and abscesses: a report of three cases and review of the literature. BMC Infect Dis. 2008;8:154.

  6. Guthof O. [Pathogenic strains of Streptococcus viridans; streptocci found in dental abscesses and infiltrates in the region of the oral cavity]. Zentralbl Bakteriol Orig. 1956;166:553–64.

  7. Jacobs JA, Pietersen HG, Stobberingh EE, Soeters PB. Streptococcus anginosus, Streptococcus constellatus and Streptococcus intermedius. Clinical relevance, hemolytic and serologic characteristics. Am J Clin Pathol. 1995;104:547–53.

  8. Hasegawa N, Sekizuka T, Sugi Y, Kawakami N, Ogasawara Y, Kato K, et al. Characterization of the Pathogenicity of Streptococcus intermedius TYG1620 Isolated from a Human Brain Abscess Based on the Complete Genome Sequence with Transcriptome Analysis and Transposon Mutagenesis in a Murine Subcutaneous Abscess Model. Infect Immun. 2017;85:e00886-16.

  9. Pecharki D, Petersen FC, Scheie AA. Role of hyaluronidase in Streptococcus intermedius biofilm. Microbiology (Reading, Engl). 2008;154 Pt 3:932–8.

  10. Nagamune H, Whiley RA, Goto T, Inai Y, Maeda T, Hardie JM, et al. Distribution of the intermedilysin gene among the anginosus group streptococci and correlation between intermedilysin production and deep-seated infection with Streptococcus intermedius. J Clin Microbiol. 2000;38:220–6.

  11. Goto T, Nagamune H, Miyazaki A, Kawamura Y, Ohnishi O, Hattori K, et al. Rapid identification of Streptococcus intermedius by PCR with the ily gene as a species marker gene. J Med Microbiol. 2002;51:178–86.

  12. Coil D, Jospin G, Darling AE. A5-miseq: an updated pipeline to assemble microbial genomes from Illumina MiSeq data. Bioinformatics. 2015;31:587–9.

  13. Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE. 2010;5:e11147.

  14. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–9.

  15. Lee I, Ouk Kim Y, Park S-C, Chun J. OrthoANI: An improved algorithm and software for calculating average nucleotide identity. Int J Syst Evol Microbiol. 2016;66:1100–3.

  16. Meier-Kolthoff JP, Auch AF, Klenk H-P, Göker M. Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics. 2013;14:60.

  17. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016;33:1870–4.

  18. Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MTG, et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015;31:3691–3.

  19. Page AJ, Taylor B, Delaney AJ, Soares J, Seemann T, Keane JA, et al. SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. Microb Genom. 2016;2:e000056.

  20. Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O. Solving the problem of comparing whole bacterial genomes across different sequencing platforms. PLoS ONE. 2014;9:e104984.

  21. Chen L, Yang J, Yu J, Yao Z, Sun L, Shen Y, et al. VFDB: a reference database for bacterial virulence factors. Nucleic Acids Res. 2005;33(Database issue):D325-328.

  22. Olson AB, Kent H, Sibley CD, Grinwis ME, Mabon P, Ouellette C, et al. Phylogenetic relationship and virulence inference of Streptococcus Anginosus Group: curated annotation and whole-genome comparative analysis support distinct species designation. BMC Genomics. 2013;14:895.

  23. Nagamune H, Ohnishi C, Katsuura A, Fushitani K, Whiley RA, Tsuji A, et al. Intermedilysin, a novel cytotoxin specific for human cells secreted by Streptococcus intermedius UNS46 isolated from a human liver abscess. Infect Immun. 1996;64:3093–100.

  24. Mishra AK, Fournier P-E. The role of Streptococcus intermedius in brain abscess. Eur J Clin Microbiol Infect Dis. 2013;32:477–83.

  25. Contreras-Moreira B, Vinuesa P. GET_HOMOLOGUES, a versatile software package for scalable and robust microbial pangenome analysis. Appl Environ Microbiol. 2013;79:7696–701.

  26. Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.

  27. Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome.” Proc Natl Acad Sci USA. 2005;102:13950–5.

  28. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41.

  29. Petkau A, Stuart-Edwards M, Stothard P, Van Domselaar G. Interactive microbial genome visualization with GView. Bioinformatics. 2010;26:3125–6.

  30. Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, et al. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother. 2012;67:2640–4.

  31. Gupta SK, Padmanabhan BR, Diene SM, Lopez-Rojas R, Kempf M, Landraud L, et al. ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob Agents Chemother. 2014;58:212–20.

  32. Grissa I, Vergnaud G, Pourcel C. CRISPRcompar: a website to compare clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 2008;36(Web Server issue):W145-148.

  33. Arndt D, Marcu A, Liang Y, Wishart DS. PHAST, PHASTER and PHASTEST: Tools for finding prophage in bacterial genomes. Brief Bioinformatics. 2019;20:1560–7.

  34. Thompson CC, Emmel VE, Fonseca EL, Marin MA, Vicente ACP. Streptococcal taxonomy based on genome sequence analyses. F1000Res. 2013;2:67.

  35. Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57 Pt 1:81–91.

  36. Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA. 2001;98:4658–63.

  37. Holden MTG, Hauser H, Sanders M, Ngo TH, Cherevach I, Cronin A, et al. Rapid evolution of virulence and drug resistance in the emerging zoonotic pathogen Streptococcus suis. PLoS ONE. 2009;4:e6072.

  38. Xu P, Alves JM, Kitten T, Brown A, Chen Z, Ozaki LS, et al. Genome of the opportunistic pathogen Streptococcus sanguinis. J Bacteriol. 2007;189:3166–75.

  39. Petersen FC, Pasco S, Ogier J, Klein JP, Assev S, Scheie AA. Expression and functional properties of the Streptococcus intermedius surface protein antigen I/II. Infect Immun. 2001;69:4647–53.

  40. Sawyer RT, Drevets DA, Campbell PA, Potter TA. Internalin A can mediate phagocytosis of Listeria monocytogenes by mouse macrophage cell lines. J Leukoc Biol. 1996;60:603–10.

  41. Sukeno A, Nagamune H, Whiley RA, Jafar SI, Aduse-Opoku J, Ohkura K, et al. Intermedilysin is essential for the invasion of hepatoma HepG2 cells by Streptococcus intermedius. Microbiol Immunol. 2005;49:681–94.

  42. Li C-T, Liao C-T, Du S-C, Hsiao Y-P, Lo H-H, Hsiao Y-M. Functional characterization and transcriptional analysis of galE gene encoding a UDP-galactose 4-epimerase in Xanthomonas campestris pv. campestris. Microbiol Res. 2014;169:441–52.

  43. Farley MM. Group B streptococcal disease in nonpregnant adults. Clin Infect Dis. 2001;33:556–61.

  44. Al Safadi R, Amor S, Hery-Arnaud G, Spellerberg B, Lanotte P, Mereghetti L, et al. Enhanced expression of lmb gene encoding laminin-binding protein in Streptococcus agalactiae strains harboring IS1548 in scpB-lmb intergenic region. PLoS ONE. 2010;5:e10794.

  45. Hu D, Wang D, Liu Y, Liu C, Yu L, Qu Y, et al. Roles of virulence genes (PsaA and CpsA) on the invasion of Streptococcus pneumoniae into blood system. Eur J Med Res. 2013;18:14.

  46. Carkaci D, Højholt K, Nielsen XC, Dargis R, Rasmussen S, Skovgaard O, et al. Genomic characterization, phylogenetic analysis, and identification of virulence factors in Aerococcus sanguinicola and Aerococcus urinae strains isolated from infection episodes. Microb Pathog. 2017;112:327–40.

  47. Nair S, Milohanic E, Berche P. ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun. 2000;68:7061–8.

  48. Hensel M, Shea JE, Gleeson C, Jones MD, Dalton E, Holden DW. Simultaneous identification of bacterial virulence genes by negative selection. Science. 1995;269:400–3.

  49. Abeyta M, Hardy GG, Yother J. Genetic alteration of capsule type but not PspA type affects accessibility of surface-bound complement and surface antigens of Streptococcus pneumoniae. Infect Immun. 2003;71:218–25.

  50. Pracht D, Elm C, Gerber J, Bergmann S, Rohde M, Seiler M, et al. PavA of Streptococcus pneumoniae modulates adherence, invasion, and meningeal inflammation. Infect Immun. 2005;73:2680–9.

  51. Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL, et al. The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol. 2001;41:1395–408.

  52. Wren JT, Blevins LK, Pang B, Basu Roy A, Oliver MB, Reimche JL, et al. Pneumococcal Neuraminidase A (NanA) Promotes Biofilm Formation and Synergizes with Influenza A Virus in Nasal Colonization and Middle Ear Infection. Infect Immun. 2017;85:e01044-16.

  53. Papazisi L, Frasca S, Gladd M, Liao X, Yogev D, Geary SJ. GapA and CrmA coexpression is essential for Mycoplasma gallisepticum cytadherence and virulence. Infect Immun. 2002;70:6839–45.

  54. Buchrieser C, Brosch R, Bach S, Guiyoule A, Carniel E. The high-pathogenicity island of Yersinia pseudotuberculosis can be inserted into any of the three chromosomal asn tRNA genes. Mol Microbiol. 1998;30:965–78.

  55. Arutyunova E, Brooks CL, Beddek A, Mak MW, Schryvers AB, Lemieux MJ. Crystal structure of the N-lobe of lactoferrin binding protein B from Moraxella bovis. Biochem Cell Biol. 2012;90:351–61.

  56. Shappo MOE, Li Q, Lin Z, Hu M, Ren J, Xu Z, et al. SspH2 as anti-inflammatory candidate effector and its contribution in Salmonella Enteritidis virulence. Microbial Pathogenesis. 2020;142:104041.

  57. Li M, Wang C, Feng Y, Pan X, Cheng G, Wang J, et al. SalK/SalR, a Two-Component Signal Transduction System, Is Essential for Full Virulence of Highly Invasive Streptococcus suis Serotype 2. PLOS ONE. 2008;3:e2080.

  58. Kodama T, Rokuda M, Park K-S, Cantarelli VV, Matsuda S, Iida T, et al. Identification and characterization of VopT, a novel ADP-ribosyltransferase effector protein secreted via the Vibrio parahaemolyticus type III secretion system 2. Cell Microbiol. 2007;9:2598–609.

  59. Tønjum T, Freitag NE, Namork E, Koomey M. Identification and characterization of pilG, a highly conserved pilus-assembly gene in pathogenic Neisseria. Mol Microbiol. 1995;16:451–64.

  60. McCallum M, Benlekbir S, Nguyen S, Tammam S, Rubinstein JL, Burrows LL, et al. Multiple conformations facilitate PilT function in the type IV pilus. Nat Commun. 2019;10:5198.

  61. Barnes MG, Weiss AA. BrkA protein of Bordetella pertussis inhibits the classical pathway of complement after C1 deposition. Infect Immun. 2001;69:3067–72.

  62. Dossumbekova A, Prinz C, Mages J, Lang R, Kusters JG, Van Vliet AHM, et al. Helicobacter pylori HopH (OipA) and bacterial pathogenicity: genetic and functional genomic analysis of hopH gene polymorphisms. J Infect Dis. 2006;194:1346–55.

  63. Kannan TR, Provenzano D, Wright JR, Baseman JB. Identification and characterization of human surfactant protein A binding protein of Mycoplasma pneumoniae. Infect Immun. 2005;73:2828–34.

  64. Tozzoli R, Caprioli A, Morabito S. Detection of toxB, a plasmid virulence gene of Escherichia coli O157, in enterohemorrhagic and enteropathogenic E. coli. J Clin Microbiol. 2005;43:4052–6.

  65. Grys TE, Siegel MB, Lathem WW, Welch RA. The StcE protease contributes to intimate adherence of enterohemorrhagic Escherichia coli O157:H7 to host cells. Infect Immun. 2005;73:1295–303.

  66. Viswanathan VK, Edelstein PH, Pope CD, Cianciotto NP. The Legionella pneumophila iraAB Locus Is Required for Iron Assimilation, Intracellular Infection, and Virulence. Infection and Immunity. 2000;68:1069–79.

  67. Kida Y, Shimizu T, Kuwano K. Cooperation between LepA and PlcH Contributes to the In Vivo Virulence and Growth of Pseudomonas aeruginosa in Mice. Infect Immun. 2011;79:211–9.

  68. Podbielski A, Kaufhold A, Lütticken R. [The vir-regulon of Streptococcus pyogenes: coordinate expression of important virulence factors]. Immun Infekt. 1992;20:161–8.

  69. Pohlner J, Halter R, Beyreuther K, Meyer TF. Gene structure and extracellular secretion of Neisseria gonorrhoeae IgA protease. Nature. 1987;325:458–62.

  70. Baumgart M, Luder K, Grover S, Gätgens C, Besra GS, Frunzke J. IpsA, a novel LacI-type regulator, is required for inositol-derived lipid formation in Corynebacteria and Mycobacteria. BMC Biology. 2013;11:122.

  71. Ishikawa T, Sabharwal D, Bröms J, Milton DL, Sjöstedt A, Uhlin BE, et al. Pathoadaptive conditional regulation of the type VI secretion system in Vibrio cholerae O1 strains. Infect Immun. 2012;80:575–84.

  72. Büttner D, Bonas U. Regulation and secretion of Xanthomonas virulence factors. FEMS Microbiol Rev. 2010;34:107–33.

  73. Elhay M, Kelleher M, Bacic A, McConville MJ, Tolson DL, Pearson TW, et al. Lipophosphoglycan expression and virulence in ricin-resistant variants of Leishmania major. Mol Biochem Parasitol. 1990;40:255–67.

  74. Liu Y, Dai C, Zhou Y, Qiao J, Tang B, Yu W, et al. Pyoverdines are essential for the antibacterial activity of Pseudomonas chlororaphis YL-1 under low-iron conditions. Appl Environ Microbiol. 2021;87:e02840–20.

  75. Ochsner UA, Wilderman PJ, Vasil AI, Vasil ML. GeneChip expression analysis of the iron starvation response in Pseudomonas aeruginosa: identification of novel pyoverdine biosynthesis genes. Mol Microbiol. 2002;45:1277–87.

  76. Merriman TR, Merriman ME, Lamont IL. Nucleotide sequence of pvdD, a pyoverdine biosynthetic gene from Pseudomonas aeruginosa: PvdD has similarity to peptide synthetases. J Bacteriol. 1995;177:252–8.

  77. N B, Tg F, Da P, Jm E, Ba W, As S, et al. Proteolytic elimination of N-myristoyl modifications by the Shigella virulence factor IpaJ. Nature. 2013;496.

  78. Kerr AR, Adrian PV, Estevão S, de Groot R, Alloing G, Claverys J-P, et al. The Ami-AliA/AliB Permease of Streptococcus pneumoniae Is Involved in Nasopharyngeal Colonization but Not in Invasive Disease. Infect Immun. 2004;72:3902–6.

  79. Amézquita-López BA, Quiñones B, Lee BG, Chaidez C. Virulence profiling of Shiga toxin-producing Escherichia coli recovered from domestic farm animals in Northwestern Mexico. Front Cell Infect Microbiol. 2014;4:7.

  80. Ran Kim Y, Haeng Rhee J. Flagellar basal body flg operon as a virulence determinant of Vibrio vulnificus. Biochem Biophys Res Commun. 2003;304:405–10.

  81. Wu J-J, Sheu B-S, Huang A-H, Lin S-T, Yang H-B. Characterization of flgK gene and FlgK protein required for H. pylori colonization–from cloning to clinical relevance. World J Gastroenterol. 2006;12:3989–93.

  82. Coloma-Rivero RF, Gómez L, Alvarez F, Saitz W, Del Canto F, Céspedes S, et al. The Role of the Flagellar Protein FlgJ in the Virulence of Brucella abortus. Front Cell Infect Microbiol. 2020;10:178.

  83. Hizukuri Y, Kojima S, Yakushi T, Kawagishi I, Homma M. Systematic Cys mutagenesis of FlgI, the flagellar P-ring component of Escherichia coli. Microbiology (Reading). 2008;154 Pt 3:810–7.

  84. Thorstenson YR, Zambryski PC. The essential virulence protein VirB8 localizes to the inner membrane of Agrobacterium tumefaciens. J Bacteriol. 1994;176:1711–7.

  85. Cocchiaro JL, Gomez MI, Risley A, Solinga R, Sordelli DO, Lee JC. Molecular characterization of the capsule locus from non-typeable Staphylococcus aureus. Mol Microbiol. 2006;59:948–60.

  86. Rowland SL, Burkholder WF, Cunningham KA, Maciejewski MW, Grossman AD, King GF. Structure and mechanism of action of Sda, an inhibitor of the histidine kinases that regulate initiation of sporulation in Bacillus subtilis. Mol Cell. 2004;13:689–701.

  87. Burkholder KM, Bhunia AK. Listeria monocytogenes uses Listeria adhesion protein (LAP) to promote bacterial transepithelial translocation and induces expression of LAP receptor Hsp60. Infect Immun. 2010;78:5062–73.

  88. Hostetter MK. Serotypic variations among virulent pneumococci in deposition and degradation of covalently bound C3b: implications for phagocytosis and antibody production. J Infect Dis. 1986;153:682–93.

  89. Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science. 2001;293:498–506.

  90. Kawabata S, Kunitomo E, Terao Y, Nakagawa I, Kikuchi K, Totsuka K, et al. Systemic and Mucosal Immunizations with Fibronectin-Binding Protein FBP54 Induce Protective Immune Responses against Streptococcus pyogenes Challenge in Mice. Infect Immun. 2001;69:924–30.

  91. Skov Sørensen UB, Yao K, Yang Y, Tettelin H, Kilian M. Capsular Polysaccharide Expression in Commensal Streptococcus Species: Genetic and Antigenic Similarities to Streptococcus pneumoniae. mBio. 2016;7:e01844–16.

  92. Kleczka B, Lamerz AC, van Zandbergen G, Wiese M. Targeted gene deletion of Leishmania major UDP-galactopyranose mutase leads to attenuated virulence. J Biol Chem. 2007;282:10498–505.

  93. Amonov M, Simbak N, Wan Hassan WMR, Ismail S, A Rahman NI, Clarke SC, et al. Disruption of the cpsE and endA Genes Attenuates Streptococcus pneumoniae Virulence: Towards the Development of a Live Attenuated Vaccine Candidate. Vaccines (Basel). 2020;8:187.

  94. Thurlow LR, Thomas VC, Hancock LE. Capsular Polysaccharide Production in Enterococcus faecalis and Contribution of CpsF to Capsule Serospecificity. Journal of Bacteriology. 2009;191:6203–10.

  95. Nepal B, Myers R, Lohmar JM, Puel O, Thompson B, Van Cura M, et al. Characterization of the putative polysaccharide synthase CpsA and its effects on the virulence of the human pathogen Aspergillus fumigatus. PLoS One. 2019;14.

  96. Whittall JJ, Morona R, Standish AJ. Topology of Streptococcus pneumoniae CpsC, a Polysaccharide Copolymerase and Bacterial Protein Tyrosine Kinase Adaptor Protein. J Bacteriol. 2015;197:120–7.

  97. Ward PN, Field TR, Ditcham WGF, Maguin E, Leigh JA. Identification and Disruption of Two Discrete Loci Encoding Hyaluronic Acid Capsule Biosynthesis Genes hasA, hasB, and hasC in Streptococcus uberis. Infect Immun. 2001;69:392–9.

  98. Willner D, Furlan M, Schmieder R, Grasis JA, Pride DT, Relman DA, et al. Metagenomic detection of phage-encoded platelet-binding factors in the human oral cavity. Proc Natl Acad Sci U S A. 2011;108 Suppl 1:4547–53.

  99. Issa E, Salloum T, Panossian B, Ayoub D, Abboud E, Tokajian S. Genome Mining and Comparative Analysis of Streptococcus intermedius Causing Brain Abscess in a Child. Pathogens. 2019;8:22.


The study was supported by the Méditerranée Infection foundation, the National Research Agency under the program “Investissements d’avenir”, reference ANR-10-IAHU-03 and by Région Provence Alpes Côte d’Azur and European funding FEDER PRIMI.

Author information

Authors and Affiliations



D.S. and X.S. performed the genomic analysis; M.K. performed the PCA and helped prepare the figures and tables; M.D., D.Ra., P.E.F. and D.S. designed the study and wrote the paper. All authors reviewed the manuscript. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Pierre-Edouard Fournier.

Ethics declarations

Ethics approval and consent to participate

The study design was validated by the ethics committee of the Institut Fédératif de Recherche 48 under reference 13–035.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest in relation to this research.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sinha, D., Sun, X., Khare, M. et al. Pangenome analysis and virulence profiling of Streptococcus intermedius. BMC Genomics 22, 522 (2021).
