High-resolution profiles of the Streptococcus mitis CSP signaling pathway reveal core and strain-specific regulated genes

Background: In streptococci of the mitis group, competence for natural transformation is a transient physiological state triggered by competence stimulating peptides (CSPs). Although low transformation yields and the absence of a widespread functional competence system have been reported for Streptococcus mitis, recent studies revealed that, at least for some strains, high efficiencies can be achieved following optimization protocols. To gain a deeper insight into competence in this species, we used RNA-seq to map the global CSP response of two transformable strains: the type strain NCTC12261T and SK321.
Results: All known genes induced by ComE in Streptococcus pneumoniae, including sigX, were upregulated in the two strains. Likewise, all sets of streptococcal SigX core genes involved in extracellular DNA uptake, recombination, and fratricide were upregulated. No significant differences in the set of induced genes were observed when the type strain was grown in rich or semi-defined media. Five upregulated operons unique to S. mitis with a SigX-box in the promoter region were identified, including two specific to SK321, and one specific to NCTC12261T. Two of the strain-specific operons coded for different bacteriocins. Deletion of the unique S. mitis sigX regulated genes had no effect on transformation.
Conclusions: Overall, comparison of the global transcriptome in response to CSP shows the conservation of the ComE and SigX-core regulons in competent S. mitis isolates, as well as species and strain-specific genes. Although some S. mitis exhibit truncations in key competence genes, this study shows that in transformable strains, competence seems to depend on the same core genes previously identified in S. pneumoniae.
Electronic supplementary material: The online version of this article (10.1186/s12864-018-4802-y) contains supplementary material, which is available to authorized users.


Background
In several streptococci intercellular coordination of gene expression mediated by peptide pheromones is associated with development of competence for transformation [1]. The pheromones activate a signal transduction pathway that regulates natural transformation. This physiological feature provides a selective advantage by allowing competent cells to acquire new characteristics, such as antibiotic resistance, by incorporation of DNA from other cells. In Streptococcus pneumoniae, natural competence is a tightly controlled transient state: it spontaneously arises during the early exponential growth phase at a certain cell density and reaches its peak after approximately 20 min, before it quickly shuts down [1,2]. The regulatory cascade is induced via activation of comC, encoding the competence-stimulating peptide (CSP) [1], which is further cleaved and exported by its secretion apparatus ComAB [3]. CSP binds and activates its cognate receptor ComD, which, after phosphorylation, activates the response regulator ComE [4]. Phosphorylated ComE then specifically binds to a target conserved sequence, referred to as the ComE-box, in the promoter region of sigX, a global transcriptional modulator, in addition to comCDE itself and comAB, creating a positive feedback loop that coordinates an explosive spread of competence among nearby cells. These genes form the core of the quorum sensing apparatus for the induction of competence [5,6] and are known as 'early' genes. SigX then initiates the transcription of a number of late genes involved in DNA uptake, recombination and fratricide by recognizing a SigX-box consensus sequence (TACGAATA) in their promoter regions [7,8].
Many of the SigX-regulated genes contribute to various aspects of transformation. A dozen genes in four operons are required for assembly of the DNA transport machinery, while five genes in five operons are considered essential for efficient recombination of donor DNA strands [9]. Competent pneumococcal cells also upregulate at least six genes involved in the production of killing factors and their respective immunity proteins, including the early gene comM, and late genes cibABC, cbpD and lytA [10,11]. CbpD, more specifically, is a murein hydrolase that plays a key role in the fratricide phenomenon, in which competent cells are able to kill and lyse non-competent sibling cells in the neighboring milieu, in a predatory mechanism most likely wired to acquire their DNA [11]. Interestingly, fratricide is not unique to pneumococci, as it has also been demonstrated in closely related species such as Streptococcus mitis and Streptococcus oralis [10].
S. mitis is a pioneer colonizer of the oral cavity, residing on the teeth, tongue and mucous membranes, as well as tonsils and nasopharynx, where it may exist side-by-side in biofilms with S. pneumoniae [12][13][14]. Both S. mitis and S. pneumoniae are naturally competent for transformation, and the exchange of genetic material by homologous recombination between them is recognized as part of their parallel evolution [15]. In addition, transfer of both antimicrobial resistance and virulence genes has been described between the species, evidenced by the presence of mosaic structures in gene sequences [16][17][18]. Interestingly, it has been suggested that the acquisition of S. mitis genes by S. pneumoniae has contributed to its evolution, but the opposite has not been proven [15]. Both species share a common ancestor, and it has been suggested that S. mitis evolved from an S. pneumoniae-like precursor by genome reduction, losing virulence genes and developing mechanisms of adaptation to the human host [15]. S. mitis strains represent a wide range of very distinct lineages under the same name, and previous reports have demonstrated that the genetic variability among S. mitis strains can be greater than between S. mitis, S. pneumoniae and Streptococcus pseudopneumoniae [19,20]. Due to its ability to induce oral mucosal antibodies, S. mitis has been investigated for its potential as a vaccine vector [21]. Among 18 S. mitis strains that have been partially or completely sequenced, the majority of competence genes are widely conserved, with the exception of sigX, which appears in truncated forms in 44% of the strains analyzed [15,22]. However, the activity, regulation, and possible role of S. mitis competence in the flow of genetic information between commensal and pathogenic streptococci remain unknown [22,23].
Although low transformation yields and the absence of a widespread functional competence system have been reported for this species, recent studies reveal that, at least for some strains, high efficiencies can be achieved following optimization steps in current protocols [22]. Here, we applied such optimal conditions to gain the first detailed insight into the regulatory pathways of competence development and investigate the strain specificity of the CSP signaling response in two S. mitis isolates.

S. mitis transformation efficiency in response to CSP
In contrast to S. pneumoniae, most S. mitis strains produce different strain-specific CSPs but seem to transform with low efficiency under laboratory conditions. Besides responding to different strain-specific stimulating peptides [22], the S. mitis type strain and strain SK321 are located phylogenetically apart (Fig. 1a) [20]. Pairwise comparison of the two strains performed by orthologous clustering using OrthoVenn revealed that the type strain has 1636 annotated proteins, whereas SK321 has 1757. One thousand, three hundred and seventy-nine orthologous clusters are shared, while 203 predicted proteins in the type strain and 352 in SK321 are strain-specific. Altogether, these features make the two S. mitis strains an interesting sample of S. mitis diversity for initial study of competence gene regulation in this species.
Competence for genetic transformation in pneumococcus occurs during a brief period of highly specialized protein synthesis (10-20 min), coordinated among many or all cells of an actively growing culture. Early genes present strongly increased expression during the period between 5 and 10 min after CSP induction, decreasing nearly to original values by 20 min after initiation of exposure to CSP. "Late" genes display a similar expression pattern, but with a delay of approximately 5 min [9]. Consistent with these timings, transformation kinetics for DNA uptake and recombination in the pneumococcus peak between 10 and 15 min after CSP induction [24][25][26]. Evaluation of the temporal transformation patterns of the S. mitis type strain and SK321 showed remarkably similar profiles, with maximal transformation yields after 15 min of CSP induction in both strains (Fig. 1b). Interestingly, while transformation declined substantially after 30 min for the type strain, it declined more slowly in SK321. Based on these data, we chose 15 min after CSP induction as a suitable sampling time for evaluating competence-related gene expression.
Responses to CSP by the S. mitis type strain and strain SK321
The competence response in S. pneumoniae, which has been described in detail in the literature, comprises three phases, early, late and delayed, which vary in magnitude depending on environmental conditions such as pH, temperature, and presence of albumin. While the early and late responses depend on well-defined regulons, the delayed response is less well understood, with no specific regulatory mechanisms yet identified. Thus, to better characterize the competence response in S. mitis, we investigated the transcriptome profile of the type strain in response to CSP during growth in two contrasting media, the semi-defined medium C + Y YB, and the rich medium TSB. The medium C + Y YB is an optimal medium for competence development for S. mitis type strain, supporting higher levels of sigX expression when compared to rich medium (TSB) or semi-defined C + Y [22].
In TSB, exposure to CSP in the type strain resulted in the upregulation of 68 genes by > 2-fold, while downregulating 21 genes (Additional file 1: Table S1). When C + Y YB medium was used to examine the CSP response of S. mitis type strain, 79 genes were upregulated > 2-fold, whereas 19 were downregulated (Additional file 2: Table S2). There was a strong similarity in the response to CSP in different media with regard to the number of transcripts upregulated, with an overlap of 53 genes. Interestingly, among the downregulated genes, only two hypothetical proteins coincided between TSB and C + Y YB media. With the exception of two gene clusters corresponding to the yellow circles in Fig. 2, the CSP response profile of S. mitis was only slightly affected by the choice of growth medium (Fig. 2). Three genes in one cluster are orthologues of delayed genes in S. pneumoniae (SP0785, SP0786 and SP0787) [9], and the other 5-gene cluster highly upregulated in C + Y YB corresponds to a region that is not involved in competence in S. pneumoniae. The ComE regulon presented a modestly higher upregulation when exposed to TSB, whereas the SigX regulon seemed to respond similarly in both media. This difference indicates that environmental factors may be important for the regulatory processes in S. mitis competence.
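The cut-off and overlap comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and fold-change values are invented, and treating downregulation as a ratio below 1/2 is an assumption, since the text states the > 2-fold threshold without giving the exact criterion used for repressed genes.

```python
def classify_by_fold_change(fold_changes, threshold=2.0):
    """Split genes into up- and downregulated sets.

    fold_changes maps a gene identifier to the expression ratio
    CSP-treated / control. Genes with ratio > threshold count as
    upregulated; genes with ratio < 1/threshold count as downregulated
    (assumed symmetric cut-off).
    """
    up = {gene for gene, fc in fold_changes.items() if fc > threshold}
    down = {gene for gene, fc in fold_changes.items() if fc < 1.0 / threshold}
    return up, down


# Hypothetical fold-change values for the same strain in two media.
tsb = {"geneA": 35.0, "geneB": 4.2, "geneC": 0.3, "geneD": 1.1, "geneE": 2.5}
cyyb = {"geneA": 60.0, "geneB": 1.8, "geneC": 0.9, "geneD": 0.4, "geneE": 3.0}

up_tsb, down_tsb = classify_by_fold_change(tsb)
up_cyyb, down_cyyb = classify_by_fold_change(cyyb)

# Genes responding in the same direction in both media.
shared_up = up_tsb & up_cyyb
shared_down = down_tsb & down_cyyb

print("upregulated in both media:", sorted(shared_up))
print("downregulated in both media:", sorted(shared_down))
```

Set intersection keeps the comparison independent of gene order, so the same helper could be applied to any pair of conditions or strains once orthologous identifiers have been mapped.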
In SK321, a strain grouped in a different cluster than the type strain, evaluation of the CSP-induced response was performed using the same cut-off values (> 2-fold). One hundred and sixty-seven genes positively responded to CSP in strain SK321 when cultured in C + Y YB, while 69 showed significant downregulation (Additional file 3: Table S3). Overall, this isolate presented significantly higher expression levels than the type strain. This was not an isolated observation, since a similarly exacerbated response has been reported, as determined by means of luciferase reporter assays for the sigX gene [22]. For comparison of the two S. mitis isolates' expression patterns, we will refer below only to data collected in C + Y YB for both strains.
Since samples for RNA-seq were collected at a single time point (15 min post CSP treatment), it was not possible to classify gene expression by temporal response. However, given the genetic proximity of S. mitis to S. pneumoniae and similarity in temporal patterns of transformation, we searched for orthologues of differentially expressed genes in a competent S. pneumoniae strain derivative of R6 [27] and analyzed their promoter regions for conserved regulatory sequences. By doing so, it was possible to identify orthologous genes, their functions and the expression pattern obtained in this transcriptome analysis, and compare with the DNA microarray data already known for S. pneumoniae.

Early response: Conservation of the ComE regulon
Genes upregulated by CSP in the S. mitis type strain and SK321 are listed with their orthologous pneumococcal early genes in Table 1. Orthologues of all S. pneumoniae ComE-responsive genes acting in regulation of the competence cascade and fratricide were identified as upregulated in both strains. These included 9 transcriptionally activated regions (TARs) accounting for a total of 21 genes responsible for CSP processing and export (comAB), competence regulators (comCDE) and sigX (duplicate copy), and bacteriocin related genes. An early gene encoding an immunity protein involved in competence-regulated self-protection (comM) was also upregulated, as well as several other orthologues of S. pneumoniae early genes, such as comW, purA, and lytR [9]. The only observable difference in the early response between the two S. mitis strains was the upregulation of the orthologue of SP2156 in SK321 (SMSK321_1581), a gene encoding a membrane bound protein of unknown function.
Fig. 1 a Phylogenetic tree illustrating the separation of S. mitis NCTC12261T and SK321 in different branches (adapted from [20]). b Kinetics of transformation in the S. mitis type strain and strain SK321. Pre-cultures at OD600 0.5 were diluted 1:100 in C + Y YB medium and grown until OD600 0.04 at 37°C in 5% CO2. Cultures were treated with 300 nM cognate CSP and distributed into 200 μL aliquots that were further exposed to 1 μg ml−1 recombinant plasmid pVA838 at indicated times. 20 U ml−1 DNase I were added after 30 min of exposure to DNA and the culture was incubated in air at 37°C for additional 30 min. Transformants were recovered in blood agar plates supplemented with Erythromycin. Each line represents results of a single experiment
In S. pneumoniae, early genes are preceded by a conserved regulatory sequence consisting of two 9 bp imperfect direct repeats (DRs) separated by 12 nucleotides (aCAnTTcaG-12-aCAgTTgaG), which is recognized as the phosphorylated ComE-binding site or ComE-box [8]. Alignment of the regions immediately upstream of the orthologues of early genes in the S. mitis type strain and SK321 revealed the presence of consensus motifs in all cases, except for the orthologue of SP2156 in the type strain (SM12261_0729), which was not upregulated by CSP (Table 2). During the search for DRs among the upregulated sequences, we noticed a ComE-box upstream of SMSK321_0027, which encodes an uncharacterized Gly-Gly peptide. This gene is orthologous to SP0429, which has never been associated with competence in S. pneumoniae, most likely due to a defective regulatory sequence without the first 9 bp DR (Table 2). However, in our transcriptome analysis of S. pneumoniae D39 (data not shown), endogenously competent cells showed upregulation of SPD_0391, the orthologue of SMSK321_0027. We also detected a DR upstream of SPD_0391; interestingly, this gene has not been related to competence in D39 in a previous report [28]. Although carrying sequences slightly divergent from the consensus for DRs (Table 2), the gene encoding comW presented upregulation in both strains independent of medium composition. This suggests that this regulatory site is still responsive despite the presence of a less conserved DR.
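A search for the ComE-box architecture described above (two 9 bp direct repeats separated by 12 nucleotides) could be sketched as follows. This is a simplified illustration, not the alignment procedure used in the study: it assumes that the uppercase positions of the consensus aCAnTTcaG are strictly required and that lowercase positions accept any base, so divergent sites such as the one upstream of comW might be missed. The example sequence is invented.

```python
import re

# One 9 bp direct repeat from the consensus aCAnTTcaG.
# Assumption: uppercase positions (C, A, T, T, G) are required,
# lowercase positions match any nucleotide.
DIRECT_REPEAT = "[ACGT]CA[ACGT]TT[ACGT]{2}G"

# Two repeats separated by exactly 12 nucleotides. The lookahead lets
# overlapping candidate sites be reported as well.
COME_BOX = re.compile(f"(?=({DIRECT_REPEAT}[ACGT]{{12}}{DIRECT_REPEAT}))")


def find_come_boxes(upstream):
    """Return (start, site) tuples for candidate ComE-boxes in a DNA string."""
    return [(m.start(), m.group(1)) for m in COME_BOX.finditer(upstream.upper())]


# Hypothetical upstream region containing one site with a 12 nt spacer.
example = "TTTT" + "ACAATTCAG" + "T" * 12 + "ACAGTTGAG" + "CCCC"
for start, site in find_come_boxes(example):
    print(start, site)
```

A position weight matrix built from the aligned S. mitis sites would tolerate imperfect repeats better than this strict pattern; the regular expression is used here only because it makes the repeat-spacer-repeat layout explicit.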

Late response: conservation of the SigX regulon
Late CSP-induced genes make the largest group of competence-specific products involved in binding, uptake, processing and integration of exogenous DNA, and the production of killing factors. Table 3 shows information on upregulated sequences in the two S. mitis strains and their orthologous late genes in S. pneumoniae [9]. Core genes, under regulation of SigX in the majority of competent streptococcal species [29], are in bold. Overall, upregulated genes were organized in 15 TARs (Fig. 3). For both strains, orthologues of late genes involved in DNA uptake were grouped in two significantly upregulated operons (comGA-comGG and comEA-comEC). Both operons are composed of genes required for assembly of the DNA transport and uptake machinery and are known as indispensable for transformation in S. pneumoniae. In addition, comFA, involved in DNA transport, presented a strong induction in both isolates. Four upregulated operons accounted for 5 DNA recombination genes: ssbB (NCTC12261T, 98.6-fold; SK321, 697.9-fold), dprA (NCTC12261T, 63.8-fold; SK321, 339.3-fold), coiA (NCTC12261T, 58.6-fold; SK321, 139.6-fold), and cinA-recA (cinA: NCTC12261T, 21.7-fold; SK321, 55.3-fold; recA: NCTC12261T, 5.3-fold; SK321, 18.8-fold). These genes are essential for the efficient replacement of donor DNA strands. As mentioned, fratricide has been demonstrated not only for pneumococcus, but also for commensal streptococci [30]. In the present analysis, the gene encoding the murein hydrolase essential for the pneumococcal fratricide mechanism, cbpD, was significantly upregulated (NCTC12261T, 116-fold; SK321, 458.5-fold). SP0031, a non-core late gene annotated as hypothetical protein in S. pneumoniae TIGR4, was also induced in both S. mitis isolates (non-annotated gene) (Table 3) and in our observations of S. pneumoniae D39 endogenous competence response (data not shown).
Table footnotes: Gene number and product from [51]. c Mean fold-change induction of CSP in S. mitis type strain and SK321 obtained by transcriptome analysis. d Expression pattern during the response of S. pneumoniae to CSP pheromone [9]. e Gene number according to PROKKA annotation [52].
SigX, together with RNAP, recognizes a "cinbox" or "SigX-box" unique DNA element (−10 from the transcription start and featuring a T-rich region at −25) in the promoter regions of late genes, initiating their transcription [7]. To search for conserved promoter elements in S. mitis, the regions immediately upstream of the potential start sites of orthologues of S. pneumoniae late operons were aligned (Fig. 4). Table 4 displays SigX-box sequences with no more than one mismatch detected upstream of 13 SigX core genes; combined with upregulation of downstream genes, these account for a total of 26 SigX controlled genes. Among the nine SigX core genes with unknown functions in competence, eight were upregulated at least 2-fold in SK321 (ccs50, cbf1, yfiA, pepB, pilC, radC, ackA, cinA), and four in the type strain (yfiA, pilC, radC, cinA). In addition, a non-annotated gene upstream of ccs4 (SM12261_0853; SMSK321_1482) coding for a 46-aa long peptide was upregulated in both S. mitis strains. Although orthologues of this gene are upregulated during competence in S. sanguinis (gene SSA_2233) [31], S. mutans UA159 (gene SMU.2076) [29] and S. pneumoniae R6 (gene SPR_0181 annotated as orf47) [28], this peptide remains uncharacterized and its role in competence is still unknown. We identified five additional S. mitis genes with candidate sites matching the SigX-box consensus but without orthologues in S. pneumoniae (Table 4), which we discuss further below.
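The criterion of "no more than one mismatch" against the SigX-box consensus (TACGAATA) can be illustrated with a sliding-window scan. The sketch below is a minimal example and not the authors' pipeline: the function names and flanking sequences are invented, and it ignores the position of the site relative to the transcription start and the T-rich region, both of which a real promoter analysis would take into account.

```python
SIGX_BOX = "TACGAATA"  # consensus given in the text


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sigx_boxes(upstream, consensus=SIGX_BOX, max_mismatches=1):
    """Return (start, site, mismatches) for every window within the mismatch limit."""
    seq = upstream.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        distance = hamming(window, consensus)
        if distance <= max_mismatches:
            hits.append((i, window, distance))
    return hits


# Hypothetical upstream regions: one exact site, one single-mismatch site.
print(find_sigx_boxes("CCTACGAATACC"))
print(find_sigx_boxes("GGGTTCGAATAGGG"))
```

With an 8 bp consensus, single-mismatch hits occur fairly often by chance in an AT-rich genome, which is presumably why the text combines the motif search with upregulation of the downstream genes before assigning a gene to the SigX regulon.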

Downregulated genes during competence development
Genes downregulated during the state of competence in the mitis group are fewer in number than upregulated sequences and remain poorly characterized. The SK321 isolate presented 70 genes with a 2- to 10-fold decrease in gene expression (Additional file 3: Table S3), which mostly included genes involved in sugar and amino acid metabolism, alcohol dehydrogenases, and hypothetical proteins, and 10 orthologues of previously reported S. pneumoniae CSP-repressed genes [9]. The S. mitis type strain had fewer downregulated sequences, independently of the medium used (Additional file 1: Table S1 and Additional file 2: Table S2). Only two genes were commonly repressed in TSB and C + Y YB cultured cells, both uncharacterized hypothetical proteins, and the difference may be explained by the different growth conditions. Among the other downregulated sequences in the type strain, none were orthologues of genes reported at least once as being repressed by CSP in any other streptococci. This is not an isolated observation, since a previous transcriptome analysis of S. pneumoniae competent cells has shown strain-specific responses also for downregulated genes [28]. Furthermore, the downregulation of these genes can also be an indirect effect of a general stress response caused by CSP.

Identification of early and late genes in S. mitis with no orthologues in S. pneumoniae
Despite the close genetic kinship between these species, some S. mitis upregulated genes lack orthologues in S. pneumoniae strains (Fig. 5a, Additional file 4: Table S4).
Four to five small adjacent ORFs located upstream of the ABC transporter ComAB were found induced by CSP in both S. mitis strains. SM12261_0044-0045 and SMSK321_1305-1306 encode peptides with a GG-type leader sequence and a bacteriocin moiety featuring a GxxxG-like motif [32]. The processing of SM12261_0044-0045 at the Gly-Gly site would give mature peptides of 20 and 32 amino acid residues, while SMSK321_1305-1306 processing would result in 40- and 33-amino-acid mature peptides, respectively. Interestingly, in silico analysis revealed no similarity between active peptide sequences of the two strains. However, the leader sequence is identical for the peptides encoded by SM12261_0045 and SMSK321_1305, which suggests they might be processed and exported by the same bacteriocin transporter. The third gene in each operon codes for another GG-leader peptide, and may play a role as a signaling peptide involved in bacteriocin production. Finally, based on the typical transcriptional pattern of class IIb bacteriocins, the fourth gene possibly codes for an immunity protein. In the promoter region of SM12261_0044 and SMSK321_1305, we identified a sequence (TTCGAATA) that matches the SigX-box consensus, suggesting that these genes are regulated by SigX (Fig. 5a) [7,33]. We confirmed this prediction by RT-PCR in an S. mitis strain lacking the two copies of sigX (Fig. 5b). Thus, this operon appears to be involved in the production and export of competence-related bacteriocins and part of the late CSP response in S. mitis.
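The in silico processing step mentioned above, cleavage of a precursor at the Gly-Gly site to obtain the mature peptide, can be sketched as follows. The precursor used here is invented for illustration and is not one of the S. mitis sequences; cutting after the first Gly-Gly pair is a simplifying assumption, since real double-glycine leaders are recognized through a longer sequence context.

```python
import re


def cleave_gly_gly(precursor):
    """Split a precursor after its first Gly-Gly pair.

    Returns (leader, mature), or None when no Gly-Gly pair is present.
    Assumption: the first 'GG' marks the end of the leader peptide.
    """
    idx = precursor.find("GG")
    if idx == -1:
        return None
    cut = idx + 2
    return precursor[:cut], precursor[cut:]


def has_gxxxg_motif(peptide):
    """True if the peptide contains two glycines separated by three residues."""
    return re.search(r"G.{3}G", peptide) is not None


# Hypothetical precursor: a 21-residue leader ending in GG plus a 20-residue peptide.
leader_seq = "MKTLEQFNELSAEELSNVEGG"
mature_seq = "NWAGAIGTGLAGALTGAVAG"
leader, mature = cleave_gly_gly(leader_seq + mature_seq)

print("leader:", leader, len(leader))
print("mature:", mature, len(mature))
print("GxxxG-like motif in mature peptide:", has_gxxxg_motif(mature))
```

Comparing the leaders and mature parts separately, as done in the text for SM12261_0045 and SMSK321_1305, becomes a simple string comparison once the precursors are split this way.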
Both strains possess an upregulated operon (SM12261_0749-0750; SMSK321_1598-1599) that codes for a membrane protein and a lipoprotein, respectively. We identified orthologues of SM12261_0750 and SMSK321_1599 in several S. sanguinis strains, in addition to a dozen strains of S. mitis, while orthologues of SM12261_0749 and SMSK321_1598 were only detectable in other S. mitis strains. In S. sanguinis SK36 (SSA_2192), this gene is upregulated during competence, and a SigX-box site marks its promoter region [31]. We found the same SigX-box sequence upstream of SM12261_0750 and SMSK321_1599 (Fig. 5a), and confirmed its activity through RT-PCR (Fig. 5b).
Table footnotes: [...] [9]. e No upregulation detected by [9]. f Gene number according to PROKKA annotation [52].
Another region composed of two ORFs coding for two hypothetical proteins was upregulated by CSP in both media tested, but only in the type strain. BLAST analyses revealed orthologue sequences of SM12261_0241 in three other S. mitis strains (SK1073_0114, SK569_0006 and SK616_1612), and the gene annotation corresponds to a bacteriocin class II with double glycine leader peptide in all of them. Interestingly, the active part of the peptide encoded by SM12261_0241 is identical to the one encoded by isolate SK569. In turn, gene SM12261_0240 is annotated as a conserved hypothetical protein, but its orthologues in SK616_1615, SK569_0169 and SK579_0521 code for Enterocin A immunity proteins. We were not able to identify either ComE-box or SigX-box sequences in the promoter region of SM12261_0241. However, analysis of SM12261_0240-SM12261_0241 mRNA expression in a ΔsigX1ΔsigX2 mutant showed upregulation of this transcript (Fig. 5b), confirming its independence from the alternative sigma factor regulation. Indeed, this suggests that this region might be under indirect ComE regulation or may simply reflect a broader specificity of ComE than previously thought.
Gene SMSK321_1184, coding for a hypothetical protein, was upregulated more than 100-fold in SK321 but has no orthologue in the type strain. Using BLAST analyses against streptococcal genomes, three orthologues of this gene with at least 70% identity were identified in S. mitis SK1073, Streptococcus salivarius 57.I and S. sanguinis SK1056. A putative SigX-box sequence was found in its promoter region (Table 4), as well as in the promoter regions of all three orthologues. In fact, the presence of this gene in three different species might indicate the importance of this protein in competence throughout the evolution of salivarius and mitis groups of streptococci. Figure 5c demonstrates that deletion of these unique CSP-responsive genes of S. mitis type strain and SK321 did not affect transformation yields in any of the knockout strains. Thus, although regulated by CSP, these genes seem dispensable for DNA uptake and processing in these strains.
[...] with pentagons. SMI T*, S. mitis type strain upregulated sequences in TSB and C + Y YB media, gene locus tag as in GenBank (accession no. AEDX00000000); SK321 gene locus tag as in GenBank (accession no. AEDT00000000). SP 1, S. pneumoniae Rx [9]; gene locus tag as in GenBank (S. pneumoniae TIGR4, accession no. AE005672). SP 2, S. pneumoniae R6 [28]; gene locus tag as in GenBank (S. pneumoniae TIGR4, accession no. AE005672). SP 3, S. pneumoniae G54 [50]; gene locus tag as in GenBank (S. pneumoniae G54, accession no. CP001015). SGO 3, S. gordonii Challis [36]; gene locus tag as in GenBank (S. gordonii Challis, accession no. CP000725). SSA 4, S. sanguinis SK36 [31]; gene locus tag as in GenBank (accession no. CP000387.1). a, b, c, d Upregulation was not > 2-fold. The image was modified from [29]

Discussion
Commensal streptococci are pioneer colonizers of the oral cavity that attach to the tooth surfaces, and to which more pathogenic bacteria later adhere to establish a mature multispecies biofilm. This close attachment provides a genetic pool to competent oral streptococci, which become potential recipients for horizontally transferred DNA as well as latent reservoirs of important genetic elements, such as antibiotic resistance genes [15]. More importantly, this attribute may also increase the likelihood of survival for some members of the population during stress conditions. To date, the global transcriptome responses during competence have been studied in only three oral streptococci, S. mutans [29,34,35], S. sanguinis [31] and S. gordonii [36], and despite a few reports of competence in S. mitis [22,30,37], little or nothing is known about its regulation or response specificity in different strains. Recently, sequencing of the genomes of a range of S. mitis strains revealed that truncation in genes required for competence was a common feature, present in roughly 40% of the strains [20]. However, even when apparently possessing a complete intact competence apparatus, only a few of the remaining strains displayed transformability under laboratory conditions [22]. Interestingly, sigX is apparently the most commonly affected gene (absence and/or truncation), while DNA uptake and bacteriocin genes are largely conserved throughout S. mitis strains. Our transcriptome data revealed that the overall response of two competent S. mitis oral isolates to CSP resulted in significant upregulation of genes involved in the modulation of the competence cascade, DNA uptake and recombination, as well as lysis and bacteriocin production. Furthermore, we identified unique upregulated sequences, a fact that highlights the S. mitis singularity despite its close genetic resemblance to S. pneumoniae.
In the present study, the comparison between two S. mitis strains located in different evolutionary branches provided insight into the variation of the global transcriptomes of isolates of the same species under the same growth conditions. When cultured in the competence permissive medium C + Y YB, approximately 4% of the type strain genome positively responded to CSP, compared to almost 10% in SK321. All orthologues of ComE-responsive genes were upregulated at least 2-fold in SK321, while the type strain displayed a weaker response mainly for comW and comCDE (Table 1). However, this difference might be simply due to a smaller amplification of the CSP signal in this strain, since their transcriptomic maps show immediate expression downstream of their ComE binding sites. A recent study comprising transcriptomic data from at least five species from different phylogenetic groups showed that streptococci regulate a core of 27 to 30 pan genes under the control of the alternative sigma factor SigX [29]. In the present analysis, the 12 SigX core genes required for DNA uptake in S. pneumoniae were upregulated in both strains, as well as the six genes required for DNA recombination (Table 3 and Fig. 3). Additionally, the SigX regulon comprises genes involved in lytic attack, and the key protein in fratricide, CbpD, was strongly upregulated in both S. mitis isolates. Not surprisingly, among the orthologues of early genes and preceded by a ComE binding site was also comM, strongly implying that S. mitis employs a similar immunity mechanism to CbpD as does S. pneumoniae.
CSP-regulated bacteriocin production is a common feature among naturally competent streptococci. S. mutans, one of the most distinguished bacteriocin producers, coordinates the CSP-ComDE regulation with production of mutacins [38], while S. pneumoniae and S. gordonii carry a direct link between SigX regulation and bacteriocin production [39,40]. The most striking difference when comparing S. mitis orthologues of S. pneumoniae late genes was the absence of the cibABC bacteriocin locus in the two strains studied. Previous reports have shown that some S. mitis strains did not maintain this competence-induced locus throughout their evolution from S. pneumoniae, and that they probably acquired other genes associated with the production of killing factors [41,42]. Indeed, we identified a strongly upregulated bacteriocin locus in a single transcriptional unit located upstream of the comAB operon, with a conserved link to the late competence response by carrying two (in the type strain) or even three (in SK321) copies of the SigX box in their promoter regions (Fig. 5b).
Fig. 4 Similarity of transcriptome profiles in the transcriptional initiation regions of SigX regulon core genes [29] in the type strain and SK321. The SigX core genes are marked in green. The arrows highlighted in yellow show the region where sequences matching the SigX-box consensus were found. Line a. corresponds to the control culture and line b. to the culture treated with CSP. Comparison between lines a. and b. shows the higher expression found in samples treated with the pheromone. Stars followed by gene annotation represent non-annotated sequences in S. mitis genomes
Although there are no previous reports, nor clear explanations for the role of multiple SigX box sequences, we hypothesize that recurrent recombination events in this region might have either left or removed additional promoter sequences. Puzzlingly, in S. pneumoniae TIGR4, this region is occupied by the bacteriocin encoding gene blpU preceded by a Blp-box, while in S. pneumoniae D39 there is also a transposase gene transcribed in the opposite direction [41]. This suggests that a shuffling between Blp and competence-induced bacteriocins might have occurred during the parallel evolution of the two species. Besides this bacteriocin operon, we also detected other upregulated sequences without orthologues in S. pneumoniae. These accounted for hypothetical proteins, membrane proteins, lipoproteins and even another bacteriocin-like encoding operon. In fact, by comparative analyses of S. mitis type strain and S. pneumoniae TIGR4, D39 and G54, Kilian et al. [20] showed that 100 S. mitis proteins do not present any homology with any S. pneumoniae strain, and that 83 of the 100 S. mitis proteins without homologues in S. pneumoniae had homologues in S. sanguinis, S. gordonii, S. agalactiae and S. thermophilus and none lacked homologues in other bacteria. While there is no information on whether there are competence-related proteins among these, the presence of S. mitis strain specific genes suggests that they were acquired more recently in evolution, independently by individual S. mitis lineages.
Table footnotes: [...] [33]. Bases divergent from the consensus are represented by lower case letters. Only the first genes within each induced transcriptionally active region are shown

Conclusions
Data gathered throughout the last two decades have provided considerable insight into the regulation of competence in various groups of streptococci, reinforcing the view that competence for genetic transformation is a conserved trait in this genus. Overall, our results demonstrated that in two S. mitis isolates CSP induces a global change in gene expression that not only supports the maintenance of the competent state and the DNA uptake machinery, but also strongly induces expression of genes involved in lysis and bacteriocin production. Most of the competence-induced loci previously described in other streptococci were detected by our method, together with several novel genes whose functions remain to be elucidated. Furthermore, promoter analysis of the genes not previously known to be induced during S. mitis competence suggests that several of them belong to either the ComE or the SigX regulon. These findings reveal conservation of the competence system in transformable S. mitis strains and highlight the characteristics of strain-specific regulated regions. In particular, our findings are significant not only for the fundamental understanding of competence in streptococci, but also from a practical perspective, as transformation is an important tool to explore gene functions and to design S. mitis for potential applications such as vaccine development.

Bacterial strains and growth conditions
All bacterial strains and isogenic derivatives used in this study are listed in Additional file 5: Table S5. Bacterial stocks were stored at −80°C in Todd Hewitt Broth (THB, Becton Dickinson and Company, Le Pont de Claix, France) or Tryptic Soy Broth (TSB, Soybean-Casein Digest medium, Bacto™) supplemented with 30% glycerol. Pre-cultures were prepared from fresh liquid cultures grown in TSB at 37°C in 5% CO2 to an absorbance of 0.5 at 600 nm (optical density at 600 nm [OD600]; Biophotometer; Eppendorf), then supplemented with 15% glycerol and stored at −80°C. For transformation, RT-PCR and RNA sequencing assays, C+Y YB medium [43], assembled as described previously [44], was used.

Transformation
Transformation kinetics assays were carried out as previously described [22]. Briefly, pre-cultures of NCTC12261T and SK321 were diluted 100-fold in C+Y YB and grown at 37°C in 5% CO2 until an OD600 of 0.04 was reached. After the addition of 300 nM CSP, the recombinant plasmid pVA838 was added to 1 μg ml−1 at various time points. Cells were incubated at 37°C for 30 min before addition of 20 U ml−1 DNaseI (Roche, DNaseI recombinant, 10 U ml−1), followed by incubation at 37°C for a further 40 min to remove extracellular DNA. Transformants were selected on blood agar plates supplemented with erythromycin after 24 h of incubation at 37°C in 5% CO2.
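As a worked example of the additions above, the following Python sketch computes the stock volumes needed to reach the stated final concentrations (300 nM CSP, 1 μg ml−1 plasmid DNA) by simple C1V1 = C2V2 arithmetic. The stock concentrations (100 μM CSP, 100 μg ml−1 DNA) and the 1 ml culture volume are assumptions chosen for illustration; they are not reported in the protocol.

```python
def stock_volume_ul(final_conc, stock_conc, culture_volume_ml):
    """Volume of stock (in µl) to add to a culture so that it reaches
    final_conc. final_conc and stock_conc must be in the same units.
    The small volume increase caused by the addition itself is ignored."""
    return final_conc * culture_volume_ml * 1000.0 / stock_conc


# Assumed values (not from the protocol): 1 ml culture,
# 100 µM (= 100000 nM) CSP stock, 100 µg/ml plasmid stock.
csp_ul = stock_volume_ul(final_conc=300, stock_conc=100_000, culture_volume_ml=1.0)
dna_ul = stock_volume_ul(final_conc=1.0, stock_conc=100.0, culture_volume_ml=1.0)
print(f"CSP stock: {csp_ul:.1f} µl, plasmid stock: {dna_ul:.1f} µl")
```

With these assumed stocks, the additions amount to about 1% of the culture volume or less, so neglecting the dilution caused by the added volume is reasonable.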

Construction of mutants
Two techniques were used to construct the S. mitis ΔsigX1ΔsigX2 mutant (MI014). First, the standard PCR ligation mutagenesis strategy [45] was employed, with minor modifications, to delete sigX1. Briefly, the sigX1 flanking regions were amplified using primer pairs FP395-FP396 and FP397-FP398. The kanamycin resistance cassette (KmR) was amplified using the primer pair FP001-FP068. Ligation and purification of the PCR products were then performed using T4 DNA ligase (Fermentas) and the QIAquick PCR purification kit (Qiagen), respectively. The final product was transformed into S. mitis NCTC12261T. Second, specific insertional inactivation of sigX2 was performed with a recombinant integrational vector, pSF152 [46]. An internal fragment of sigX2 was amplified with primer pair FP451-FP452 and force-cloned, via the BamHI and EcoRI 5′ tags of the respective primers, into the corresponding sites of plasmid pSF152. The final products were transformed into the parent strains as previously described [22]. All primers used for mutant constructions are listed in Additional file 5: Table S5.

Real time PCR
Pre-cultures of MI014 were diluted 100-fold in C+Y YB to a final volume of 10 ml and incubated at 37°C in 5% CO2 until an OD600 of 0.04 was reached. The cultures were then divided in two, with one half treated with 150 nM synthetic CSP and the other half left untreated. Cultures were incubated for 15 min and cells were pelleted at 8000 × g and 4°C for 10 min. Total RNA was extracted using the High Pure Isolation Kit (Roche, Mannheim, Germany) and treated with Turbo DNase (AM2238, Ambion, Life Technologies, Carlsbad, California, USA) to remove any DNA contamination. Complementary DNA templates were prepared from RNA using the First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Fermentas) according to the manufacturer's protocol. The housekeeping gene gyrA was used to validate the results. The primers for the studied genes were designed using the Primer3Web platform for uniformity in size (80-150 bp) and melting temperature. The primer sequences are provided in Additional file 4: Table S4. PCR conditions included an initial denaturation at 95°C for 10 min, followed by 40 amplification cycles consisting of denaturation at 95°C for 30 s and annealing and extension at 55°C for 1 min. Data were collected and analyzed with the software MxPro (Stratagene).
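The text states only that gyrA was used to validate the results and does not specify how relative expression was calculated. One common approach, shown here as an assumption rather than as the authors' method, is the 2^−ΔΔCt calculation, with gyrA as the reference gene and the untreated culture as the calibrator. The function name and Ct values in this Python sketch are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (CSP-treated vs untreated) by the
    2^-ddCt method. Assumes roughly 100% amplification efficiency for
    both the target and the reference primer pair."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt, treated
    delta_control = ct_target_control - ct_ref_control   # dCt, control
    return 2 ** -(delta_treated - delta_control)


# Invented Ct values: target gene vs gyrA reference, with and without CSP
fold = relative_expression(
    ct_target_treated=20.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"Fold induction: {fold:.0f}")
```

If primer efficiencies differ markedly from 100%, an efficiency-corrected model would be more appropriate than this simplified form.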

RNA sequencing
Pre-cultures of NCTC12261T and SK321 at an OD600 of 0.5 were centrifuged at 8000 × g and 4°C for 10 min and resuspended at a 100-fold dilution in TSB or C+Y YB, and the diluted cultures were incubated at 37°C in 5% CO2 until an OD600 of 0.04 was reached. The culture was then divided between two tubes and grown in the presence or absence of 150 nM CSP. Cells were subsequently harvested for RNA extraction at 10000 × g and 4°C for 10 min. Procedures for RNA extraction, RNA enrichment and preparation of the DNA library for sequencing using Illumina® HiSeq were carried out as described elsewhere [47]. Following the sequencing run, a FASTQ file was derived from each sample. For differential expression analysis of genes, raw read counts for the S. mitis type strain and SK321 transcripts were generated using a Perl script based on the mapped read profiles of the two strains, as previously described [48]. The "DESeq" Bioconductor