CcpA is a global regulator of carbon catabolism that controls gene expression by binding to cognate operator sequences, cre, which are characterized by a weakly conserved consensus sequence [32–34]. Hence, it seems possible that CcpA binds some cre sites with higher affinity than others. So far, global studies of CcpA-dependent carbon catabolite repression have focused on identifying the members of the CcpA regulon [40, 42, 44], while analyses of cre boxes with respect to their sequence, position and CcpA-binding affinity have been limited to single examples [7, 17, 33, 34, 45]. A broader comparison of the sequences and function of 32 cre boxes was published by Miwa et al., who deduced that fewer mismatches of the cre sequence to the query sequence in the same direction as transcription of the target genes, and a more palindromic cre sequence, are desirable for better function. The goal of our study was to perform a genome-wide analysis of cre boxes in order to reveal those with high and low binding affinities by comparing the CcpA regulon under three distinct conditions, in which different amounts of CcpA were present in the cells, and to identify the cre features that determine this affinity.
Using a tetracycline-dependent gene regulation system, we achieved tightly controlled ccpA expression, leading to a wide range of CcpA amounts in the cells. B. subtilis cultures with relatively low, medium or high amounts of CcpA in the cells were subjected to transcriptome analyses. The cells were grown in the presence of glucose to ensure sufficient production of the low-molecular-weight modulators of CcpA activity (NADP, glucose-6-phosphate, fructose-1,6-bisphosphate). As expected, higher levels of CcpA protein led to more genes being significantly up- or downregulated. Most of the regulated genes, however, were affected indirectly, as they lacked a cre site. Genes regulated indirectly in a CcpA-dependent manner (no cre or a nonfunctional cre) have been observed before and were proposed to be grouped in class II, alongside class I, which includes genes regulated by CcpA directly [40, 46, 47]. In our analysis, only genes belonging to class I were taken into account, as the subject of this study was the nature of discriminating cre boxes. Many repressed genes are σA-dependent and do not need another inducing protein for their expression. However, the expression of some genes is controlled by more than one regulator. In those rare cases of multiple regulation, the full extent of regulation would not be observed in our transcriptome analysis, but this does not affect our analysis, since we are looking at the relative strength of repression at different CcpA concentrations.
The search for putative cre boxes in the B. subtilis genome, using a cre motif generated from the cre boxes known from DBTBS, T1G2A3A4A5R6C7G8Y9T10W11W12C13A14, resulted in 418 putative cre boxes. The majority of the predicted cre boxes were within ORFs far away from promoters and, although functional cre boxes located within coding sequences are present in the B. subtilis genome, many of the predicted cre sites seemed to be too far from the promoter to play a role in the regulation of gene expression. Therefore, cre boxes located between −500 and +100 nucleotides from the first nucleotide of the start codon of the first gene of an operon were extracted. Cre boxes that trigger gene regulation and are known from the literature, but were not predicted by our method, were also included in our analysis. The genes differentially expressed at least at a high CcpA production level and possessing cre box(es) known from the literature [1, 41] and/or predicted in this study were selected. Among the selected genes, 30 were downregulated and 3 were upregulated at a low CcpA induction level, while the other 37 genes were downregulated only when CcpA was produced at higher levels (medium and high CcpA induction levels). For all these genes, expression fold changes were calculated as ratios of the amounts of transcripts downstream of cre boxes as the microarray chip probes were synthesized upstream from them. Of the regulated first genes of operons possessing a known and/or predicted cre box, only kdgR and resA had chip probes located upstream from the kdgR cre and the second cre of resA (located 1709 bp downstream from the TSS). Therefore, these cre boxes were not included in the sequence and position analysis of cre boxes. Since regulation depends on CcpA-cre binding, cre boxes that cause significant regulation of downstream operons already when only a small amount of CcpA is available are assumed to have a high affinity for CcpA and to titrate CcpA away from low-affinity cre sites, which can exert regulation of other operons only when more CcpA is present. Notably, more than a dozen known cre boxes fell out of our data set, because the corresponding genes were not significantly regulated in any of the three microarray experiments. Although they could be considered very low-affinity sites, they were not included in the analysis, as the lack of differential expression might have been a false negative result due to, e.g., high background signal, bad spot quality on the microarray slides, mRNA degradation, growth conditions, more complex regulation or yet unidentified factors. Moreover, it should be noted that the division of cre boxes into two affinity groups is a simplification necessary for this analysis. Very likely, a gradient of cre site affinities occurs in nature, which would be difficult to assess.
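The motif search, the promoter-window filter and the two-group affinity assignment described above can be illustrated with a minimal Python sketch. It is not the method used in this study: it assumes exact matching of the degenerate consensus on both strands (the actual search tool and mismatch tolerance are not stated here), a forward-strand gene for the window test, and hypothetical function names throughout.

```python
import re

# IUPAC codes appearing in the cre consensus (R = A/G, Y = C/T, W = A/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}
# T1 G2 A3 A4 A5 R6 C7 G8 Y9 T10 W11 W12 C13 A14
CRE_CONSENSUS = "TGAAARCGYTWWCA"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cre_candidates(genome, consensus=CRE_CONSENSUS):
    """Return sorted (start, strand) pairs for exact consensus matches.

    Assumption: exact matching only; a lookahead is used so that
    overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus) + "))")
    n, k = len(genome), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(genome)]
    # Map reverse-strand matches back to forward-strand start coordinates.
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(genome))]
    return sorted(hits)


def in_promoter_window(cre_start, start_codon, upstream=500, downstream=100):
    """True if the cre lies within -500..+100 nt of the start codon.

    Assumption: the gene is on the forward strand.
    """
    return -upstream <= cre_start - start_codon <= downstream


def classify_affinity(regulated_low, regulated_medium, regulated_high):
    """Two-group simplification: regulated already at low CcpA -> 'high'
    affinity; regulated only at medium/high CcpA -> 'low'; otherwise None
    (such sites were left out of the analysis)."""
    if regulated_low:
        return "high"
    if regulated_medium or regulated_high:
        return "low"
    return None
```

A perfectly palindromic site is reported once per strand by this sketch, so a real pipeline would probably merge such duplicate hits before counting candidates.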
The detailed analysis of the sequences of high- and low-affinity cre boxes led to a few interesting observations. The G2 and the middle C7 and G8 residues (Figure 3), known to be highly conserved [32–34], are conserved in both high- and low-affinity cre boxes. Interestingly, the high-affinity cre boxes have more strongly conserved G6 and C9 residues surrounding the middle CpG, as well as C13 (palindromic to the conserved G2) and A14 (palindromic to T1), and their sequences are significantly more palindromic overall. It was observed before that a more palindromic sequence of cre sites contributes to better function. The more palindromic nature of the high-affinity cre sites (in comparison with low-affinity cre sites) might create a more symmetric DNA conformation, which is preferred for CcpA binding. Although the bases at positions 4 and 11 are more often palindromic to each other in the weak cre boxes, this is apparently less important for cre strength. In a previous study it was shown that CcpA binds with similar affinities to different cre boxes, which explains well the role of CcpA as a global regulator. However, the three cre boxes tested in that work differ very little around the middle CpG and in their symmetry (palindromic sequence), and they did not differ at the residues corresponding to our C13 or A14.
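One simple way to quantify how palindromic a 14-nt cre box is would be to count the positions i (1 to 7) whose base is complementary to the base at position 15 − i, so that T1 pairs with A14, G2 with C13, and so on. The sketch below assumes this pair-counting score; the text does not state which measure of palindromicity was actually used, and the function name is hypothetical.

```python
# Watson-Crick partners used to test the two half-sites against each other.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def palindromic_pairs(cre):
    """Count complementary position pairs (i, 15 - i) in a 14-nt cre box.

    Returns a value from 0 (no symmetry) to 7 (perfect palindrome).
    Assumption: this pair count is an illustrative score only.
    """
    cre = cre.upper()
    if len(cre) != 14:
        raise ValueError("a cre box is expected to be 14 nt long")
    return sum(1 for i in range(7) if PAIR.get(cre[i]) == cre[13 - i])


print(palindromic_pairs("TGAAAGCGCTTTCA"))  # perfect palindrome: 7
print(palindromic_pairs("TGAAAACGTTAACA"))  # positions 3/12 and 4/11 unpaired: 5
```

Both example sequences match the consensus T1G2A3A4A5R6C7G8Y9T10W11W12C13A14, which shows that the degenerate motif alone does not fix the degree of symmetry.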
Comparison of the locations of the high- and low-affinity cre boxes in relation to the TSS also shows some trends. While the low-affinity cre sites can be located at any position relative to the TSS, the high-affinity cre sites cluster around the TSS, 14 and 27 base pairs upstream from the TSS and 44 base pairs downstream from the TSS. Correspondingly, the strongest repression by CcpA was observed for the genes with cre sites located around the TSS (amyE, rbsR, gmuB) and at positions −27 (acoR, glpF), −14 (dctP), +230 (xynP) and +372 (treP) base pairs from the TSS, which are separated by approximately 10–11-nt increments (corresponding to a full helical turn). This observation is in agreement with previous findings that activation or repression by CcpA binding to cre boxes is helix-face-dependent [17, 45]. In Lactococcus lactis, too, the strongest repression by CcpA was shown to occur when the center of the cre box was located at −39, −26, −16, +5 and +15 from the TSS.
It was shown before that genes with cre boxes located further upstream from the −35 sequence of the promoter are subject to activation by the CcpA complex, as in the case of ackA, pta and ilvB [19, 20]. In our work, however, under the tested conditions, only three genes were activated: ilvB, opuE and ycbP (the two latter genes with cre sites predicted in this study). We did not observe activation of ackA in this study. This is probably because the very low basal expression of CcpA from the TetR-repressed promoter might already be high enough for binding of CcpA to the ackA cre box and for full activation of the ackA promoter. In this case, a further increase in CcpA does not result in an additional increase in ackA expression. Surprisingly, pta was downregulated. However, in this study both test and control cultures were grown in medium supplemented with glucose. The mechanism of pta regulation in this case is thus different from low glucose-dependent CCA. Based on our criteria, the cre boxes of all three activated genes are of the high-affinity type. Although the ycbP cre box appears to be downstream of the TSS (+30), neither the cre box nor the TSS has been experimentally confirmed in this case.
Some genes and operons possess multiple cre boxes. Since DNA microarray technology was used in this study to assess expression fold changes of genes and operons in the presence of different amounts of CcpA, we were not always able to judge whether an effect was due to one cre box (and which one) or to several. In our set (Table 2) there were only two operons with two cre boxes (the first genes of these operons are iolA and gntR). gntR was weakly regulated (low-affinity cre box), suggesting that the regulatory effects of the two cre boxes do not add up to exert strong regulation. In the case of the iolA operon, each of the two cre boxes is located within a different gene of the operon (cre-1 within iolA and cre-2 within the second gene of the operon, iolB). In this case, the regulatory effects of these cre boxes could be assessed independently. Based on the fold changes of iolA (cre-1) and iolB (cre-2), both cre-1 and cre-2 seem to be of high affinity. Multiple cre boxes could serve for fine-tuning of CcpA-regulated genes and operons.
For the genes with cre boxes located close to the TSS and downstream of it, distinct repression mechanisms were proposed. Elongation blockage (roadblock) was shown for the xyl, ara and gnt operons, as well as for sigL and acsA [49–53]. Prevention of RNAP binding to the promoter sequence was demonstrated for the acuABC and bglPH operons, which possess a cre partially overlapping with the promoter region [54, 55]. Transcription inhibition by direct interaction of CcpA with the σ-subunit of RNAP already bound to the promoter was shown in the case of the amyE gene and the xyl operon. The presence of high-affinity cre boxes in close vicinity to the TSS, shown in this study, suggests that repression by inhibition of RNAP binding is one of the most effective mechanisms of negative regulation by CcpA.