A generic approach to identify Transcription Factor-specific operator motifs; Inferences for LacI-family mediated regulation in Lactobacillus plantarum WCFS1
- Christof Francke†1, 2,
- Robert Kerkhoven†2,
- Michiel Wels1, 2, 3 and
- Roland J Siezen1, 2, 3
© Francke et al; licensee BioMed Central Ltd. 2008
Received: 21 December 2007
Accepted: 27 March 2008
Published: 27 March 2008
A key problem in the sequence-based reconstruction of regulatory networks in bacteria is the lack of specificity in operator predictions. The problem is especially prominent in the identification of transcription factor (TF)-specific binding sites. In particular, homologous TFs are abundant and, as they are structurally very similar, it proves difficult to distinguish the related operators by automated means. This also holds for the LacI-family, a well-studied family of TFs with many members that fulfill crucial roles in the control of carbohydrate catabolism in bacteria, including catabolite repression. To overcome the specificity problem, a comprehensive footprinting approach was formulated to identify TF-specific operator motifs and was applied to the LacI-family of TFs in the model Gram-positive organism Lactobacillus plantarum WCFS1. The main premise behind the approach is that only orthologous sequences that share orthologous genomic context will share equivalent regulatory sites.
When the approach was applied to the 12 LacI-family TFs of the model species, a specific operator motif was identified for each of them. With the TF-specific operator motifs, potential binding sites were found on the genome and putative minimal regulons could be defined. Moreover, specific inducers could in most cases be linked to the TFs through phylogeny, thereby unveiling the biological role of these regulons. The operator predictions indicated that the LacI-family TFs can be separated into two subfamilies with clearly distinct operator motifs. They also established that the operator related to the 'global' regulator CcpA is not inherently distinct from that of other LacI-family members, only more degenerate. Analysis of the chromosomal position of the identified putative binding sites confirmed that the LacI-family TFs are mostly auto-regulatory and relate mainly to carbohydrate uptake and catabolism.
Our approach to identify specific operator motifs for different TF-family members is specific and in essence generic. The data imply that, although the specific operator motifs can be used to identify minimal regulons, experimental knowledge of TF activity in particular is essential to determine complete regulons as well as to estimate the overlap between TF affinities.
Numerous studies have been devoted to the identification of Transcription Factor (TF)-binding sites or other regulatory elements in bacterial genomes. So far, most large-scale approaches relied heavily on statistics and the input of known binding motifs [1–7]. Unfortunately, purely statistical approaches are seriously hampered by the trade-off that exists between a high true-positive rate and a low false-positive rate of the prediction. Nonetheless, both rates can be considerably improved by taking advantage of additional data [2, 8] such as sequence data from related species [9–11], structural information or transcriptome data [13, 14]. Another way to enhance the accuracy is phylogenetic footprinting, which takes both 'phylogeny' and 'synteny' into account [8, 14–16].
We have recently developed a large-scale automated regulatory motif prediction method for prokaryotic genomes. It was applied with success in the identification of a relatively large number of regulatory motifs in genomes of the Firmicutes, a phylum that comprises many well-studied families like the Bacillaceae, Clostridiaceae, Lactobacillaceae, Staphylococcaceae and Streptococcaceae. The identified motifs included several new motifs besides known ones. Nevertheless, in many cases the method appeared less suited to couple a specific TF or signal to the regulatory motif in a straightforward manner. For example, although the characteristic T-box motif was easily identified (the T-box is a regulatory element that responds to uncharged tRNA and is found in all Firmicutes), the amino acid specificity of that element was not retrieved automatically for the individual instances (Wels et al. unpublished results). Likewise, the 'CRE-like' motif that was retrieved is very similar to known operator motifs of various TFs of the LacI-family, suggesting that the recovered motif is not specific.
The LacI-family of TFs plays a crucial role in many bacterial species, and certainly in those of the phylum Firmicutes, as these TFs mediate preferences in the utilization of certain carbohydrates over others. The prioritization involves both repression (or activation) of catabolic genes (i) in the absence (or presence) of a related substrate and (ii) in the presence (or absence) of a preferred substrate [19–21]. The latter process is referred to as carbon catabolite repression (CCR) and its main mediator in Firmicutes species is CcpA [21–25]. CcpA operators were called CREs (CcpA-responsive elements ) and a CRE consensus motif was defined on basis of experiments in various Firmicutes species [21–23, 25, 27–30]. The consensus motif is very similar to, and sometimes coincides with, operators related to other TFs of the LacI-family [30–33]. Most family members, however, interact with only a few operators on the genome, like LacI of Escherichia coli, which represses specifically the lac-operon in the absence of lactose . This raises the question how these bacteria coordinate 'local' (def: control of the expression of one or a few genes/operons) and 'global' (def: control of the expression of many genes/operons) regulatory effects using homologous TFs.
Thus, the lack of specificity of the current prediction methods is a key issue when one wants to disentangle complex regulatory relationships, such as those between the TFs of the LacI-family and the operons involved in carbohydrate catabolism. Therefore, we have formulated a comprehensive sequence-based comparative approach for the prediction of TF-specific operators in bacteria. Specificity is ensured by building upon a proper phylogenetic classification of each family of TFs (whose members can for instance be found in reference databases [35–37]) and very strict criteria to define synteny.
The value of the approach was put to the test on the well-described LacI-family of TFs, and more specifically, to uncover the regulatory connections of the 12 LacI-family TFs in L. plantarum WCFS1. This species was chosen as a representative of the phylum Firmicutes, as it is an industrially and medically relevant model organism that is encountered in very different environmental niches, i.e. in association with plants, fermenting food and feed, and in the animal and human gastrointestinal tract [38, 39]. The approach proved successful and each LacI-family TF of L. plantarum was linked to a putative operator motif and thereby to a putative regulon. In addition, several principles that should govern LacI-family TF mediated 'local' and 'global' transcription regulation could be inferred from the results. Ample experimental and structural information was used to evaluate and support the predictions and inferences.
I) A comprehensive approach to identify TF-specific operators
It has been observed consistently that orthologous protein sequences  are very likely to have molecular properties that are alike . Similarly, synteny – conserved gene order – was found to be a strong indicator of functional equivalency . Thus, genes that are orthologous and share 'gene context' can be assumed to be functionally more equivalent than orthologous genes that are not syntenous. Based on this premise we formulated a generic phylogenetic footprinting /shadowing  approach for the identification of TF-specific operator sequences in bacteria (description in Methods). High specificity in the motif prediction was achieved by properly classifying orthologous TFs into groups that share gene context to yield putative Groups of Orthologous Functional Equivalents or GOOFEs. To develop the approach, we chose the well-described LacI-family of transcriptional regulators (PFAM PF00356), limiting the analysis to Firmicutes and focusing specifically on the model organism Lactobacillus plantarum WCFS1, which has a high number of LacI-family TFs for which we have ample experimental and transcriptome data for validation.
Collect homologs: LacI-family TFs in the genomes of L. plantarum WCFS1 and other Firmicutes
Determine synteny: Identification of TF-specific binding motifs
Validation: Comparison of the predicted motifs with experimental functional data from literature
Table 1. The CRE consensus. For B. subtilis and species of the phylum Firmicutes in general, a consensus has been formulated by others on the basis of both experiment (exp) and predictions (pred). For the composition of the L. plantarum CRE consensus (bold, italics) we have used the two experimentally established CREs in L. plantarum [49,110] and the initial CcpA operator motif retrieved by us (Figure 2).
TG WNAN CG NTNW CA
TG NAAR CG NWWW CA
WG WAAR CG YTWW MA
WG NAAS CG NWWN CA
WG HWAD SG YWWD CA
pred/exp:  (b)
NK NWAN SG NWWN CA
pred/exp: [49, 110] and this work
Table 2. Known and predicted operators for ExuR in B. subtilis and MalI in E. coli. The operators determined by experiment are shown in normal print and the same operators as predicted using our new approach are shown in bold italics. O1 and O2 indicate the relative position of the operator sequences with respect to the translation start.
TG TTAA CG TTAA CA
TG TTAA CG TTAA CA
pred, this work
GT AAAA CG TTTT AT
GA AAAA CG TTTT AT
gT aAAA CG TTTT At
pred, this work
Table 3. Operators for various LacI-family TFs present in L. plantarum. The operators that were verified by experiment in several species of the phylum Firmicutes are listed in normal print, the operators predicted by us for the orthologous TFs in L. plantarum are in bold italics. O1 and O2 indicate the relative position of the operator sequences with respect to the translation start. * Transcription from O1 was 10 times stronger than from O2.
CcpA-like LacI-family TF
CG CAAA CG TTTT CC
CG CAAA CG TTTG CG
cG CAAa CG cTTG CA
pred, this work
gT AAAA CG TTTT Ac
gT AAAA CG TTTT Ac
.T AAAA CG TTTT Aa
pred, this work
EbgR-like LacI-family TF
TTG TTT ACT AAA AAT
TTG TTT AGT AAA CGG
aaa TTT AGT AAT t..
pred, this work
..T TTT AGT AAA A..
AAA TTT AGT AAA ATT
ATT TTT ACT AAA ATT
aat TTT AGT AAA a..
pred, this work
Validation: Comparison of the predicted motifs with 3D-structure information from literature
It also proved possible to use structural information on the binding of several LacI-family members to their respective operator [55–57] to validate predicted motifs. Differences in the conservation of certain amino acid residues in the DNA-binding domain of the TF were compared to the composition of the connected operator. Two clear correlations between protein sequence and operator sequence were found (see also the legend to Figure 2):
- Firstly, the structural data suggest that, in the case of CcpA and LacI, the conserved arginine located at position 24 is one of the few residues that hydrogen bonds directly with one of the nucleotide bases, a guanine at position -6 of the operator [56, 57]. In Lp_3661 (RbsR) and its orthologs, the arginine is replaced by a glutamine (or leucine) and correspondingly the otherwise 'conserved' guanine is replaced by a thymidine. In fact, such a replacement was observed for all other studied LacI-family TFs deviant at position 24 (see Additional file 7). These anomalous TFs include MalI from E. coli which was proven experimentally to indeed bind an operator with a thymidine at position -6  (Table 2).
- Secondly, the 'EbgR-like' TFs (i.e. Lp_3470 (LacR), Lp_3479 (GalR) and Lp_3488 (RafR)) are expected to have distinct DNA-binding features. Members of this subfamily lack the conserved leucine residue (position 60 in Figure 2) that according to the 3D-structure of operator-bound CcpA  intercalates between the central CG base pairs that are characteristic for 'CcpA-like' operators . Concordantly, the predicted 'EbgR-like' LacI-family TF operators lack the central CG nucleotide pair. Possibly, the conserved arginine at position 24 interacts with the conserved single C or G nucleotide in the operator (Table 3).
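The two operator features discussed above, the central CG pair and the palindromic character reported for LacI-family operators, can be checked on individual sequences with a few lines of code. The following is a minimal sketch, not the authors' procedure; the function names are illustrative, and the two example sequences are taken from the operator tables above with the spacing removed.

```python
# Illustrative helpers (names are assumptions, not from the paper).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_mismatches(seq: str) -> int:
    """Count positions where a sequence differs from its reverse complement.

    Zero means the site is a perfect inverted repeat.
    """
    s = seq.upper()
    return sum(a != b for a, b in zip(s, reverse_complement(s)))


def has_central_cg(seq: str) -> bool:
    """True if an even-length operator carries CG at its two central positions."""
    s = seq.upper()
    mid = len(s) // 2
    return len(s) % 2 == 0 and s[mid - 1:mid + 1] == "CG"


# 14-nt operator listed under 'CcpA-like LacI-family TF' (case ignored).
ccpa_like = "GTAAAACGTTTTAC"
# 15-nt operator listed under 'EbgR-like LacI-family TF'.
ebgr_like = "AAATTTAGTAAAATT"

print(has_central_cg(ccpa_like), palindrome_mismatches(ccpa_like))
print(has_central_cg(ebgr_like))
```

The odd length of the 'EbgR-like' example already rules out a central dinucleotide, which is consistent with the single central C or G noted in the text.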
II) Identification of the biological role of a TF through comparative genomics
The biological role of a transcription factor is to activate or repress the transcription of certain genes in response to the presence of a signal (e.g. a nutrient or metabolite). In principle, once the sequence of a TF-specific operator is known, a genome-wide search for the related motif could be used to find putative TF-binding sites on the genome and to establish the regulated functionalities (regulon). The signal that triggers the transcriptional response can be obtained by linking the specific TF to an ortholog that has experimentally verified 'inducer' specificities. Finally, the transcriptional effect (i.e. activation or repression) of the binding of the TF can be deduced from the relative position of the putative binding site with respect to the promoter [25, 58].
Regulon predictions for the LacI-family TF homologs in L. plantarum WCFS1
As expected, most potential binding sites were identified upstream of operons that encoded functionalities related to the catabolism of particular carbohydrates. In L. plantarum WCFS1, 11 out of 12 LacI-family TFs were found to be associated with active carbohydrate transport systems (driven by protons: GPH family; ATP: ABC transport systems; or phosphoenolpyruvate: PhosphoTransferaseSystems). Furthermore, the size of the putative regulons varied slightly. For instance, the putative regulon of Lp_3625 encompassed only one operon, whereas that of Lp_0172 encompassed five operons (Figure 3). Although the putative regulon of CcpA was the largest, it was still limited in size, which is slightly in contrast with the global role of CcpA [21, 24, 25]. The precise composition and functionality of most of the predicted regulons is discussed in some detail in Additional file 9.
The molecular function of LacI-family TFs and the connection with the predicted biological role
The identification of various additional sites whose presence should be expected (i.e. related to autoregulation or regulation of genomically associated operons) supported the view that the approach yielded genuine TF-specific binding sites. Other support for the validity of the identified sites and regulons was provided by a comparison of the functionalities encoded by the regulons and the molecules that induced the activity (or better: the in-activity) of the related TFs. Figures 3 and 4 show that in almost all cases a straightforward metabolic link existed between the predicted regulated functionality and the assigned 'inducer' of the TF. For example in L. plantarum, Lp_0188 (SacR) is predicted to respond to sucrose or oligofructose, a prediction that was derived from experimental evidence obtained for orthologous TFs [67–69]. Concordantly, its putative operators are found upstream of two operons that harbor the genes encoding an active oligofructose/sucrose uptake system  and enzymes that catalyze the conversion of the phosphorylated oligosaccharide into phosphorylated disaccharide and the phosphorylated disaccharide into glucose-6-P and fructose.
Some of the predicted regulatory connections could be substantiated directly by published transcription data for L. plantarum or related species. On the other hand, the predictions also could often not be extrapolated in a straightforward way. For instance, similar to the prediction for L. plantarum, the expression of the ribose utilization operon (rbsUDK) in L. sakei was shown to be controlled by RbsR and induced by ribose . Unfortunately, the induction of other operons was not studied. Another example is provided by transcriptome data for L. plantarum grown on short-chain fructooligosaccharides compared to glucose. As predicted, expression of the divergon associated with lp_0188 was induced under these conditions . Nevertheless, the maltase/sucrase encoding gene lp_0174 that was predicted to be controlled by Lp_0188 (SacR) was not induced. This observation could very well relate to additional factors that are involved in the regulation of the particular gene. An example of the subtle differences between species is found for the regulation of the gal operon (galK, galT and galE) and lac operon (lacS and lacZ). In S. mutans , S. thermophilus CNRZ 302  and S. salivarius  expression of the gal operon, as well as that of galM and the lac operon in S. thermophilus and S. salivarius was shown to be controlled by GalR and induced by galactose. Our predictions suggest that in L. plantarum the gal operon is similarly controlled by the GalR ortholog Lp_3479. The lac-operon in L. plantarum however, was predicted to be controlled by a paralogous LacI-family TF, Lp_3470 (LacR), that is absent from the Streptococci, but which is present in L. acidophilus where it was shown to regulate an integrated lac-gal operon . At the same time, in some strains of another Lactobacillus species (L. delbrueckii ) the lac-operon was shown again to be controlled by an ortholog of Lp_3479 (GalR).
The mode of action: repression or activation
Regulatory overlap between LacI-TFs
Another interesting aspect of the predicted regulatory connections that became apparent from inspection of the footprints in Figure 5 was that many operons appear to be preceded by multiple LacI-specific putative operators. For example, the two neighboring operons involved in sucrose/oligofructose transport and catabolism are preceded by two putative Lp_0188 (SacR) operators. It was shown that transcription of these operons is indeed induced simultaneously . This observation fits the assumption that one of the operators controls the transcription in one direction and the other in the opposite direction. Conversely, two divergently transcribed genes can in principle also be controlled by a single operator. The latter was shown to be the case in the transcriptional control of the gene levR and the operon levABCDX in Lactobacillus casei  and in the transcriptional control of the genes pepQ and ccpA in L. delbrueckii  and L. lactis . The genes pepQ and ccpA are similarly organized in L. plantarum. Furthermore, upstream of the ccpA gene three different putative promoter sites can be distinguished and every promoter seems to be connected to its own CcpA operator. This finding is in line with the experimental evidence provided by . Nevertheless, based on the relative positions of the putative CREs, the effect of CcpA on its own expression is extremely difficult to predict offhand. CcpA seems to act as activator as well as repressor depending on the actual promoter.
The role of TF concentration
A generic method to identify TF-specific operators
Every line of evidence sustains the validity of the approach we have formulated to identify LacI-family TF specific operator motifs. In all cases where a LacI-family TF operator has been characterized experimentally, our prediction is in full agreement (see Tables 1, 2 and 3). And likewise, several correlations between the TF sequence (protein) and operator sequence (DNA) that are anticipated on basis of structural information [56, 57] were retrieved perfectly. Moreover, the fact that a specific operator motif could be identified for every LacI-family TF and that relatively few proper hits for these operator motifs were found on the complete genome, is proof by itself. More non-specific methods inevitably would have yielded more degenerate motifs and more false-positive identifications.
In lactic acid bacteria many operons involved in carbohydrate catabolism are associated on the genome with the gene encoding the respective regulator . In fact, this observation may be generalized for all TFs that are considered 'local' regulators. Our results indicate that especially for these TFs, a distribution into Groups of Orthologous Functional Equivalents will reduce the noise in the motif prediction significantly. In contrast, as current automated methods generate more degenerate motifs , these methods are better suited for the recovery of binding sites for 'global' regulators.
Characteristic motifs and the implications of degeneracy
As the interaction between TF and DNA allows for a certain structural freedom, a TF-specific operator is not necessarily a unique sequence but merely a collection of sequences which can be represented as a motif or consensus sequence. The molecular nature of the interactions dictates a distinct relationship between the affinity of the TF protein for the operator DNA and their respective sequence. As a consequence, a more degenerate operator motif relates to reduced affinity. For example, the LacI-family TF CytR in E. coli exhibits a more versatile binding of operator sequences than LacI (i.e. has a higher motif degeneracy). At the same time it was observed that its affinity for the operator is much reduced when compared to LacI [76, 77]. Likewise, the affinity of the TF for the operator was shown to be affected significantly by subtle changes in the protein sequence  as well as in the nucleotide sequence of the operator (for LacI: [79–81]; for CcpA: ). Lehming et al. therefore  assumed explicitly that the interaction between TF and operator should be concentration dependent. Ultimately, it is the relation between the concentration (or better: activity) of active TF and the rate of expression that determines key features of the dynamics of the cellular response to internal and external signals .
The predicted specific operator motifs of the LacI-family TFs in L. plantarum exhibit relatively little degeneracy (>8 nucleotides fully conserved for the 'CcpA-like' subfamily; see Figure 2) with one exception: the operator motif of CcpA itself. Considering the above, and based on the fact that in the 3D-structures of CcpA and LacI bound to their respective operators the same residues are involved in the interaction of TF with DNA [56, 57], the degeneracy of the CcpA operator motif indicates it should act at relatively higher concentrations with respect to LacI and other relatives. Concomitantly, variable regulation of ccpA expression would represent a way to control the differential binding of CcpA to CREs .
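To make the notion of motif degeneracy concrete, one can count how many distinct sequences an IUPAC consensus admits and how many of its positions are fully conserved. The sketch below is only a crude proxy under the assumption that every allowed base at a position is equally acceptable; it says nothing about actual binding affinity. The consensus strings are taken from the CRE table above, and the function names are illustrative.

```python
import math

# Number of bases allowed by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def _codes(consensus: str) -> list:
    """Upper-case the consensus and drop the spacing used in the tables."""
    return [c for c in consensus.upper() if not c.isspace()]


def degeneracy(consensus: str) -> int:
    """Number of distinct sequences that match an IUPAC consensus."""
    return math.prod(IUPAC_SIZE[c] for c in _codes(consensus))


def conserved_positions(consensus: str) -> int:
    """Number of positions that allow exactly one base."""
    return sum(IUPAC_SIZE[c] == 1 for c in _codes(consensus))


for motif in ("TG TTAA CG TTAA CA", "TG WNAN CG NTNW CA", "NK NWAN SG NWWN CA"):
    print(motif, degeneracy(motif), conserved_positions(motif))
```

A fully specified operator admits a single sequence, whereas the more degenerate consensus strings admit thousands, which is the sense in which a degenerate motif yields more compliant sites in a genome search.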
'Local' versus 'global' regulation
The identified CcpA operator motif (CRE) of L. plantarum is very similar to the consensus CRE that was initially defined for B. subtilis on basis of a site-directed mutagenesis study  and later refined on basis of the experimental identification of additional CREs [22, 30] (CRE consensus sequences are summarized in Table 1). Remarkably, the DNA-binding domain of CcpA on the protein level is considerably more conserved compared to that of the other LacI-family TFs (see Figure 2 right panel), whereas in contrast, the operator motif is the most degenerate. Both facts reflect and emphasize the 'global' role of CcpA. We observed that the CcpA regulon that was defined on basis of a genome wide search with the specific operator motif was relatively small. The same observation was made by  when the genome of B. subtilis was searched for potential CREs for the first time. The authors concluded that this related to the lack of degeneracy in the search motif and they proved experimentally that this was indeed the case.
It is generally assumed that transcription and translation are connected processes in bacteria  and as a consequence proteins should be produced in the physical vicinity of where they are encoded. A major implication of an intended local role of a TF would then be that the number of TF molecules necessary to effectively control expression can be minimized in case the affinity for the operator is relatively high (signified by a less degenerate motif). As mentioned in the previous section, all but one of the predicted operators indeed show a relatively high degree of conservation over different, sometimes even distantly related, species. A low TF concentration will keep in check non-local interactions as the TF will be virtually absent in the rest of the cell and, as a result, even operators that are very similar will not be affected. In fact, it was shown for carbohydrate utilization by Lactobacillus acidophilus that induction of catabolic operons is highly specific for distinct sugars . Vice versa, a higher TF concentration, like anticipated for CcpA , would relax the sensitivity towards the composition of the operator and thus enable binding to sites for which the TF has less affinity. However, transcript levels that are observed in L. plantarum under different growth conditions are not completely conclusive (see Figure 6). Nevertheless, based on the observed transcript levels one should expect that Lp_0172 (MalR), Lp_0188 (SacR), Lp_3221, Lp_3531, Lp_3661 (RbsR), CcpB and CcpA in principle could regulate multiple and also distant operons.
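The qualitative argument about TF concentration and operator affinity can be illustrated with a simple one-site binding model. This is an assumption introduced here for illustration only (non-cooperative binding at equilibrium, with occupancy equal to concentration divided by concentration plus dissociation constant); the source does not state a quantitative model, and all numerical values below are hypothetical and in arbitrary units.

```python
def occupancy(tf_conc: float, kd: float) -> float:
    """Fraction of time a single site is bound, for a simple one-site model.

    tf_conc: concentration of active TF (arbitrary units).
    kd: dissociation constant of the TF for the site (same units);
        a lower value means higher affinity.
    """
    return tf_conc / (tf_conc + kd)


# Hypothetical dissociation constants for a well-conserved (high-affinity)
# operator and a degenerate (low-affinity) one.
KD_CONSERVED = 1.0
KD_DEGENERATE = 100.0

for conc in (1.0, 100.0):
    print(
        conc,
        round(occupancy(conc, KD_CONSERVED), 2),
        round(occupancy(conc, KD_DEGENERATE), 2),
    )
```

Under these assumed values, a low TF concentration leaves the low-affinity site essentially unoccupied while the high-affinity site is still substantially bound, and a hundredfold higher concentration brings the low-affinity site into play as well.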
Regulon boundaries and induced response
Searching the genome of L. plantarum with the identified specific operator motifs yielded a list of potential binding-sites for every LacI-family TF. To avoid many false predictions, we have used two conservative criteria to reduce the list of putative TF-specific binding sites. They related to the position of the site with respect to the translation start, as there is experimental data showing certain boundaries for that distance [21, 23], and to a maximum number of 2 deviating nucleotides. The genes/operons preceded by the putative binding sites thus should constitute putative minimal regulons. In principle, more degenerate motifs should lead to a longer list of compliant sites, as was indeed observed. This observation, which was earlier made by others , reveals a key point in regulon predictions based on operator motifs, namely that motif degeneracy complicates a straightforward decision about the authenticity of the recovered sites. Moreover, as described in the above sections, binding will by necessity be influenced by TF concentration (activity). Therefore, experimental data on gene expression and TF concentration (activity) will be essential to refine the predictions. At the same time, in most cases, a proper interpretation of experimental transcription data will require motif and regulon predictions because the activity of many TFs is intertwined and the number of conditions tested or testable is too limited to disentangle these. Although the extrapolation of the predictions to experimental data is non-trivial, several of the predicted associations could be confirmed on the basis of data obtained in L. plantarum and related species (see Results). Moreover, a comparison of the predicted regulons depicted in Figure 3 with the environmental signals that are expected to govern the specific LacI-family TF activities (see Figure 4) shows that the recovered connections make perfect biological sense. This finding strongly supports the assertion that the predictions provide a valid coupling between the LacI-family TFs and functionalities encoded by the putative regulons.
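The two filtering criteria described above (at most 2 deviating nucleotides, and a bounded distance to the translation start) can be sketched as a simple scan of an upstream region. This is an illustrative stand-in, not the MAST-based search used in the study: the distance boundary is not specified in the text, so `max_distance` is a hypothetical parameter, only the given strand is scanned, and the upstream sequence in the example is invented.

```python
# Bases allowed by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(motif: str, site: str) -> int:
    """Number of positions where the site deviates from the IUPAC motif."""
    return sum(base not in IUPAC[code] for code, base in zip(motif, site))


def scan_upstream(upstream, motif, max_mismatches=2, max_distance=None):
    """Scan an upstream region that ends immediately before the start codon.

    Returns (distance, site, mismatches) tuples, where distance is the number
    of bases between the end of the site and the translation start.
    max_distance is a hypothetical cut-off; None disables the position filter.
    """
    motif = "".join(motif.split()).upper()
    upstream = upstream.upper()
    hits = []
    for i in range(len(upstream) - len(motif) + 1):
        site = upstream[i:i + len(motif)]
        mm = mismatches(motif, site)
        distance = len(upstream) - (i + len(motif))
        if mm <= max_mismatches and (max_distance is None or distance <= max_distance):
            hits.append((distance, site, mm))
    return hits


# Invented upstream region containing one site that fits a consensus
# from the CRE table above.
upstream = "ATATAT" + "TGTAAACGTTTTCA" + "GGAGGATATT"
print(scan_upstream(upstream, "TG WNAN CG NTNW CA"))
```

Because the reported operators are largely palindromic, scanning one strand is a reasonable simplification for a sketch; a real search would score both strands.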
We have formulated a sequence-based approach that enables the identification of TF-specific binding motifs. One of the major advantages of the approach is that it is generic and thus, in principle, can be applied to any TF family without prior knowledge of the actual composition of the binding motif. In fact, we are in the process of performing similar analyses for various TF-families, including two component systems, and the preliminary results confirm the assertion. The method appears perfectly suited to identify binding sites on the genome connected to local regulators in contrast to current automated procedures that yield mostly sites connected to global regulators.
The presented data substantiate the successful identification of specific operator motifs related to the LacI-family TFs in the model organism L. plantarum. The recovered motifs differ in at least one position but at the same time their similarity is considerable. As the composition of the operator motif is tightly related to the affinity of the TF for the DNA, this finding implies that some of the LacI-family TFs could potentially bind to the operators of others. In fact, the competition between CcpA and CcpB in the repression of the gnt and xyl operon, observed in B. subtilis TF knock-outs, exemplifies this phenomenon. Simultaneously, higher TF (or binding site) concentration (activity) will result in regulation at degenerate sites (i.e. lower affinity) (see ), a conclusion that correlates well with the mechanism of control of TF-activity itself as this involves a change in affinity of the TF for the operator upon induction [34, 85]. An important corollary is that regulons, and especially those related to global regulators, will vary in size depending on the environmental conditions.
Finally, potential binding sites can be identified based on the operator motif predictions and from those the functionalities that are regulated in response to a given stimulus can be reconstructed. In principle, the coupling of putative regulons with potential TF inducers thus provides insight into the prioritization of the functionalities within a certain organism. Nevertheless, our data on LacI-family TFs in L. plantarum make perfectly clear that, in order to arrive at a complete reconstruction of the encoded transcriptional response to environmental stimuli, experimental data on transcription as well as TF and inducer concentration under different environmental conditions are essential.
Resources and tools
All genomic information was obtained from the ERGO genome analysis and discovery system  and updated until the 1st of July 2007. Nevertheless, the presented results do not depend on the use of this particular resource and the methods described in this paper can as well be applied using publicly accessible resources (like those at NCBI ). The genome sequence of L. plantarum WCFS1 and the functional annotation of its genes was taken from our in-house annotation database . Potentially homologous sequences were collected from the database using the BLAST algorithm , with a typical cut-off between 10^-2 and 10^-10. Multiple sequence alignments were made with MUSCLE  (default settings). Alignments were visually inspected and aberrant sequences were removed (characterized by many gaps and a distinctly different conservation pattern). BioEdit  and Jalview  were used to edit sequences, and ClustalW  was used to create (domain-) specific bootstrapped neighbor-joining trees (with 'correction for multiple substitutions' ). The resulting trees were analyzed using LOFT, a tool that automatically divides the sequences into orthologous groups based on the hierarchy of the tree and the duplication and speciation events implied by that hierarchy . Overrepresented DNA sequences in a selected set of upstream regions (300 bases) were identified automatically using MEME  and MAST  was used to detect other potential TF-binding sites on the genome (default cut-off p-value < 10^-5).
Identification of TF-specific operator motifs
- Selection of a TF family, the collection of homologs and the derivation of orthology (Figure 7 1–4)
Intra-species and inter-species homologs were collected from the database using BLAST and the search was iterated until no additional sequences were found. This search was not only performed on the level of the complete sequence but also with individual functional domains. The sequences were aligned, aberrant sequences were removed, a bootstrapped NJ-tree was generated, and the hierarchy of the branching together with the bootstrap support were considered to identify orthologs. In the case of LacI-family members, the complete sequence of CcpA from L. plantarum was used as a starting sequence, as well as the N-terminal (first 90 residues; DNA binding domain) and C-terminal (other residues; inducer binding domain) sequence. To restrict the size of the final collection, only Firmicutes genomes were analyzed. The examined species included well-studied organisms such as B. subtilis, L. lactis and S. thermophilus (see Additional file 1 for a complete list of analyzed genomes). To improve the potential for functional identification the genome sequences of several E. coli strains and Salmonella species were also included. A striking feature of the NJ tree of the Firmicutes LacI-family TF homologs was that the representation of the 'early' branching events came out very unreliable, as signified by the extremely low bootstrap support (several values were as low as 1). In contrast, most branches related to supposed more recent evolutionary events had high bootstrap values in the NJ-tree and, as a result, the LacI-family TF homologs could be separated reliably into groups of orthologous sequences (see Additional files 4 and 5). The set of homologs identified by us was compared to the entries in the PFAM database .
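The iterated homolog search ("until no additional sequences were found") amounts to repeating a similarity search with every newly found sequence until the collection stops growing. The sketch below shows that control flow only. The `search` callable is a hypothetical stand-in for one BLAST run that returns the identifiers of hits passing the chosen cut-off, and the toy hit table is invented for the example.

```python
from typing import Callable, Iterable, Set


def collect_homologs(seeds: Iterable[str],
                     search: Callable[[str], Iterable[str]]) -> Set[str]:
    """Iterate a similarity search until no new sequence identifiers appear.

    seeds: identifiers of the starting sequences.
    search: hypothetical function that runs one search for a query identifier
            and returns the identifiers of its significant hits.
    """
    found = set(seeds)
    frontier = set(seeds)
    while frontier:
        new_hits = set()
        for query in frontier:
            new_hits.update(search(query))
        # Only sequences not seen before are used as queries in the next round.
        frontier = new_hits - found
        found |= frontier
    return found


# Invented hit table standing in for real search results.
toy_hits = {
    "tf1": ["tf1", "tf2"],
    "tf2": ["tf1", "tf2", "tf3"],
    "tf3": ["tf3"],
}
print(sorted(collect_homologs(["tf1"], lambda q: toy_hits.get(q, []))))
```

The same loop could be run separately for the complete sequence and for each domain, as described above, and the resulting sets merged before alignment.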
- Definition of functional equivalents (Figure 7 4,5)
Orthologous clusters can often be further subdivided to obtain putative Groups of Orthologous Functional Equivalents or GOOFEs. The homogeneity of the sequence alignment (as indicated by conserved stretches of residues and the absence of large gaps or inserts), a high bootstrap-value at the branching point that separates the orthologous cluster from the other sequences (Figure 7 4), and most importantly, a clear difference in conserved gene-context within the group were used to evaluate the necessity of such sub-division (Figure 7 5). In the case of many of the LacI-family TFs of L. plantarum, the subdivision into GOOFEs resulted in clearly distinct operator motifs even within an orthologous group (as illustrated in Figure 7). The protein sequences, alignments and trees can be found in Additional files 1, 2, 3, 4, 5.
- Selection of upstream regions containing putative operators (Figure 7 5)
The observation that most genes encoding TFs seem to be associated on the genome with the genes whose transcription they control may guide the selection of upstream regions. The upstream regions of the conserved operons within a GOOFE were used to search for putative operator sites (for the selected regions, see Additional file 6). Only when the TF-encoding gene lay solitary on the genome were the upstream regions of the TF genes from one GOOFE used instead, based on the notion that autoregulation is a common feature of many TFs.
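Extracting a fixed-length upstream region requires attention to the strand of the gene. The sketch below illustrates one way to do this for the 300-base regions used here. It assumes 0-based, half-open gene coordinates on the forward strand and ignores wrap-around on a circular chromosome; the function names and the toy sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome: str, start: int, end: int, strand: str,
                    length: int = 300) -> str:
    """Return up to `length` bases upstream of a gene, read 5'->3' on its strand.

    Assumptions: `start` and `end` are 0-based, half-open coordinates on the
    forward strand; circular wrap-around is not handled.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        # For a reverse-strand gene the upstream region lies past `end`
        # on the forward strand and must be reverse-complemented.
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


toy_genome = "ACGTTTGACCATGAAA"  # invented sequence for illustration
plus_region = upstream_region(toy_genome, 10, 16, "+", length=4)
minus_region = upstream_region(toy_genome, 0, 4, "-", length=4)
```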
- Motif definition (Figure 7 6)
Potential TF binding regions on the DNA (i.e. operators) were searched automatically in the selected set of upstream regions (300 bases) using MEME. As motif prediction tools often produce multiple motifs including many false positives, an alignment of the regions was made and the observed conservations were compared to the automatically recovered motifs to remove most false positives. The final collection of motifs was then compared within the complete TF-family and the TF-specific motifs were defined based on conserved features, like characteristic residues, spacing and motif length. The LacI-family TFs are known to form functional dimers and, as a consequence, the reported binding sequence motifs for these proteins are palindromes of lengths varying between 10 and 16 base pairs [55, 57, 81]. Therefore, MEME was tuned to find inverted repeats (-pal option) with a maximum width of 20 bases, to detect 4 different motifs, and to assume zero or one occurrence per sequence (ZOOPS model). The resulting motifs were compared and for each set of upstream regions (related to a certain TF) an operator region of 16 ('CcpA-like') or 17 ('EbgR-like') bases was defined.
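Because dimeric TFs are expected to bind inverted repeats, a candidate operator can be checked for how closely it matches its own reverse complement. The helper below is an illustrative sketch and not part of MEME; the function name and the example sequences are invented. For odd-length sites, such as a 17-base region, the central base is ignored because no base is its own complement.

```python
_PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def palindrome_mismatches(site: str) -> int:
    """Count positions where a site deviates from a perfect inverted repeat.

    Position i is compared with the complement of position n-1-i; the
    central base of an odd-length site is skipped.
    """
    site = site.upper()
    n = len(site)
    return sum(1 for i in range(n // 2) if _PAIR[site[i]] != site[n - 1 - i])


# Invented example sequences (not operators reported in this study).
perfect_16 = "ATGTAAGCGCTTACAT"
imperfect_16 = "ATGTAAGCGCTTACAA"
perfect_17 = "ATGTAAGCAGCTTACAT"

scores = [palindrome_mismatches(s)
          for s in (perfect_16, imperfect_16, perfect_17)]
```

A count of zero indicates a perfect palindrome; small non-zero counts are what one would expect for naturally occurring, slightly degenerate operators.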
- Identification of putative TF binding sites
A specific operator and a position-specific scoring matrix were created for each TF by application of MEME to the defined operator regions. To avoid base preferences in the scoring, a background file was used in which the probability of finding an A, T, C or G at a certain position at random was set at 0.25. The final position-specific scoring matrices were used as input for an automated genome-wide motif search using MAST. Two additional criteria were used to filter out potential false positives. Firstly, the vast majority of LacI operators that have been identified to date can be found in the range of -250 to +50 nucleotides from the translation start, with no instances further upstream [21, 23]. Therefore, identified sites located more than 250 nucleotides upstream or more than 50 nucleotides downstream of the translation start site were not considered. Secondly, all sites that deviated at more than two positions in the central 14 nucleotides from the operators in the vicinity of the LacI-family TFs were not considered. The tables that resulted from the MAST search have been deposited in Additional file 8.
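The two filtering criteria can be expressed compactly in code. The following is a sketch under several assumptions that are not specified in the text: `rel_pos` is the position of the site relative to the translation start (negative means upstream), a site passes if it is within two mismatches of at least one reference operator of the same length, and for a 17-base operator the central 14 nucleotides are taken with an offset of one base from the 5' end. All names and sequences are hypothetical.

```python
from typing import Iterable


def central(seq: str, width: int = 14) -> str:
    """Return the central `width` bases (left-biased offset for odd excess)."""
    offset = (len(seq) - width) // 2
    return seq[offset:offset + width]


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def keep_site(site: str, rel_pos: int, references: Iterable[str],
              max_mismatch: int = 2, upstream: int = 250,
              downstream: int = 50) -> bool:
    """Apply the positional window and the central-14 mismatch criterion."""
    if not (-upstream <= rel_pos <= downstream):
        return False
    core = central(site)
    return any(
        mismatches(core, central(ref)) <= max_mismatch
        for ref in references
        if len(ref) == len(site)
    )


# Invented reference operator and candidate sites for illustration.
refs = ["ATGTAAGCGCTTACAT"]
two_core_changes = keep_site("ATGAAAGCGCTTTCAT", -100, refs)
three_core_changes = keep_site("ATGAAAGAGCTTTCAT", -100, refs)
flank_changes_only = keep_site("CTGTAAGCGCTTACAG", -100, refs)
too_far_upstream = keep_site("ATGTAAGCGCTTACAT", -300, refs)
```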
Prediction of the inducer of TF activity
A bootstrapped NJ-tree was generated on the basis of a multiple sequence alignment of all LacI-family TF homologs of L. plantarum, together with orthologous sequences for which experimental confirmation of the nature of the inducer could be retrieved. TFs were considered equivalent if they were clearly orthologous (strong bootstrap support) and syntenous, and provided the alignment was homogeneous (i.e. without gaps and with several clearly conserved positions).
Reconstruction of the mode of regulation
In principle, TFs can act both as transcriptional activator and as repressor, depending on whether the operator lies upstream of the promoter or inside/downstream of it, respectively [18, 25, 58, 97]. To resolve whether the TF acts as an activator or repressor, phylogenetic footprints were made for various upstream regions containing an operator, and the position of the operator relative to that of the potential promoter was determined. Where the alignment was not clear, the predicted operators were used as an anchor to realign the flanking regions for promoter detection.
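The positional rule stated above can be written as a simple classifier. This is a deliberately simplified sketch: it assumes that operator and promoter are given as half-open intervals relative to the translation start, with coordinates increasing toward the gene, and it treats any overlap with the promoter as 'inside'. The function name and the example coordinates are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), half-open, increasing toward the gene


def predict_mode(operator: Interval, promoter: Interval) -> str:
    """Predict the regulatory mode from operator position relative to promoter.

    An operator that ends before the promoter starts is taken to indicate
    activation; an operator overlapping or downstream of the promoter is
    taken to indicate repression.
    """
    op_start, op_end = operator
    pr_start, _pr_end = promoter
    if op_end <= pr_start:
        return "activator"
    return "repressor"


promoter = (-80, -50)  # invented coordinates for illustration
modes = [
    predict_mode((-120, -104), promoter),  # operator upstream of promoter
    predict_mode((-70, -54), promoter),    # operator inside promoter
    predict_mode((-40, -24), promoter),    # operator downstream of promoter
]
```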
Determination of relative mRNA levels for the LacI-family TFs
Absolute expression data were obtained from 35 independent micro-array experiments with custom Agilent oligo-based arrays of L. plantarum WCFS1 (this yielded 70 semi-independent datasets). The experimental conditions tested varied from stress to over-expression of certain metabolic genes to growth on different oligosaccharides (D. Molenaar, unpublished data; see also ). The raw data were adapted as follows. The absolute signals of the spots related to individual proteins were averaged and then the signals were ranked independently for the two individual channels. Per experiment and per channel, the 50 lowest signals were discarded and the signals of the 200 proteins ranked lowest in the remaining list were averaged. The average was interpreted as basal signal and subtracted from the signals related to the LacI-family TFs. Finally, the resulting signals were made relative by dividing all signals by the highest signal displayed by a LacI-family TF representative.
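The background correction and scaling steps can be illustrated with a short example. This is a sketch under assumptions: the signals of one channel of one experiment are given as a mapping from protein identifier to the spot-averaged signal, and the final scaling is done within that channel (the text does not state whether the maximum was taken per channel or over all datasets). All names and values are invented.

```python
from typing import Dict, Iterable


def basal_signal(signals: Iterable[float], n_discard: int = 50,
                 n_basal: int = 200) -> float:
    """Average the `n_basal` lowest signals after dropping the `n_discard` lowest."""
    ranked = sorted(signals)
    window = ranked[n_discard:n_discard + n_basal]
    if len(window) < n_basal:
        raise ValueError("not enough signals to estimate the basal level")
    return sum(window) / len(window)


def relative_tf_signals(channel: Dict[str, float],
                        tf_ids: Iterable[str]) -> Dict[str, float]:
    """Subtract the basal signal and scale by the highest TF signal."""
    basal = basal_signal(channel.values())
    corrected = {tf: channel[tf] - basal for tf in tf_ids}
    top = max(corrected.values())
    if top <= 0:
        raise ValueError("no TF signal exceeds the basal level")
    return {tf: value / top for tf, value in corrected.items()}


# Toy channel: 300 proteins with signals 1..300 (invented values).
toy_channel = {f"p{i}": float(i) for i in range(1, 301)}
toy_basal = basal_signal(toy_channel.values())
toy_relative = relative_tf_signals(toy_channel, ["p260", "p300"])
```

In the toy channel, the 50 lowest signals (1 to 50) are discarded and the next 200 (51 to 250) are averaged, so the basal level is 150.5; the two example TFs then scale to 109.5/149.5 and 1.0.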
CCR: Carbon Catabolite Repression
GOOFE: Group of Orthologous Functional Equivalents
RK acknowledges the support from the Netherlands Ministry of Economic Affairs via the IOP Program, grant IGE01018, and CF the support of NBIC/the Netherlands Genomics Initiative via the Kluyver Centre for Genomics of Industrial Fermentations and the BioRange program.
- Stormo GD: DNA binding sites: representation and discovery. Bioinformatics. 2000, 16 (1): 16-23. 10.1093/bioinformatics/16.1.16.
- Bulyk ML: Computational prediction of transcription-factor binding site locations. Genome Biol. 2003, 5 (1): 201. 10.1186/gb-2003-5-1-201.
- Thompson W, Rouchka EC, Lawrence CE: Gibbs Recursive Sampler: finding transcription factor binding sites. Nucleic Acids Res. 2003, 31 (13): 3580-3585. 10.1093/nar/gkg608.
- Kim JT, Gewehr JE, Martinetz T: Binding matrix: a novel approach for binding site recognition. J Bioinform Comput Biol. 2004, 2 (2): 289-307. 10.1142/S0219720004000569.
- Osada R, Zaslavsky E, Singh M: Comparative analysis of methods for representing and searching for transcription factor binding sites. Bioinformatics. 2004, 20 (18): 3516-3525. 10.1093/bioinformatics/bth438.
- Yellaboina S, Seshadri J, Kumar MS, Ranjan A: PredictRegulon: a web server for the prediction of the regulatory protein binding sites and operons in prokaryote genomes. Nucleic Acids Res. 2004, 32 (Web Server issue): W318-320. 10.1093/nar/gkh364.
- Yan B, Lovley DR, Krushkal J: Genome-wide similarity search for transcription factors and their binding sites in a metal-reducing prokaryote Geobacter sulfurreducens. Biosystems. 2006.
- Rodionov DA: Comparative genomic reconstruction of transcriptional regulatory networks in bacteria. Chem Rev. 2007, 107 (8): 3467-3497. 10.1021/cr068309+.
- Moses AM, Chiang DY, Pollard DA, Iyer VN, Eisen MB: MONKEY: identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model. Genome Biol. 2004, 5 (12): R98. 10.1186/gb-2004-5-12-r98.
- Carmack CS, McCue LA, Newberg LA, Lawrence CE: PhyloScan: identification of transcription factor binding sites using cross-species evidence. Algorithms Mol Biol. 2007, 2: 1. 10.1186/1748-7188-2-1.
- Okumura T, Makiguchi H, Makita Y, Yamashita R, Nakai K: Melina II: a web tool for comparisons among several predictive algorithms to find potential motifs from promoter regions. Nucleic Acids Res. 2007, 35 (Web Server issue): W227-231. 10.1093/nar/gkm362.
- Kaplan T, Friedman N, Margalit H: Ab initio prediction of transcription factor targets using structural knowledge. PLoS Comput Biol. 2005, 1 (1): e1. 10.1371/journal.pcbi.0010001.
- Yan B, Núñez C, Ueki T, Esteve-Núñez A, Puljic M, Adkins RM, Methé BA, Lovley DR, Krushkal J: Computational prediction of RpoS and RpoD regulatory sites in Geobacter sulfurreducens using sequence and gene expression information. Gene. 2006, 384: 73-95. 10.1016/j.gene.2006.06.025.
- Monsieurs P, Thijs G, Fadda AA, De Keersmaecker SC, Vanderleyden J, De Moor B, Marchal K: More robust detection of motifs in coexpressed genes by using phylogenetic information. BMC Bioinformatics. 2006, 7: 160. 10.1186/1471-2105-7-160.
- Alkema WB, Lenhard B, Wasserman WW: Regulog analysis: detection of conserved regulatory networks across bacteria: application to Staphylococcus aureus. Genome Res. 2004, 14 (7): 1362-1373. 10.1101/gr.2242604.
- Van Hellemont R, Monsieurs P, Thijs G, de Moor B, Van de Peer Y, Marchal K: A novel approach to identifying regulatory motifs in distantly related genomes. Genome Biol. 2005, 6 (13): R113. 10.1186/gb-2005-6-13-r113.
- Wels M, Francke C, Kerkhoven R, Kleerebezem M, Siezen RJ: Predicting cis-acting elements of Lactobacillus plantarum by comparative genomics with different taxonomic subgroups. Nucleic Acids Res. 2006, 34 (7): 1947-1958. 10.1093/nar/gkl138.
- Grundy FJ, Waters DA, Allen SH, Henkin TM: Regulation of the Bacillus subtilis acetate kinase gene by CcpA. J Bacteriol. 1993, 175 (22): 7348-7355.
- Stülke J, Hillen W: Carbon catabolite repression in bacteria. Curr Opin Microbiol. 1999, 2 (2): 195-201. 10.1016/S1369-5274(99)80034-4.
- Brückner R, Titgemeyer F: Carbon catabolite repression in bacteria: choice of the carbon source and autoregulatory limitation of sugar utilization. FEMS Microbiol Lett. 2002, 209 (2): 141-148.
- Deutscher J, Francke C, Postma PW: How phosphotransferase system-related protein phosphorylation regulates carbohydrate metabolism in bacteria. Microbiol Mol Biol Rev. 2006, 70 (4): 939-1031. 10.1128/MMBR.00024-06.
- Miwa Y, Nakata A, Ogiwara A, Yamamoto M, Fujita Y: Evaluation and characterization of catabolite-responsive elements (cre) of Bacillus subtilis. Nucleic Acids Res. 2000, 28 (5): 1206-1210. 10.1093/nar/28.5.1206.
- Deutscher J, Galinier A, Martin-Verstraete I: Carbohydrate uptake and metabolism. Bacillus subtilis and its closest relatives; From genes to cells. Edited by: Sonenshein AL, Hoch JA, Losick R. 2002, Washington DC, ASM Press, 129-150.
- Babu MM, Teichmann SA, Aravind L: Evolutionary dynamics of prokaryotic transcriptional regulatory networks. J Mol Biol. 2006, 358 (2): 614-633. 10.1016/j.jmb.2006.02.019.
- Zomer AL, Buist G, Larsen R, Kok J, Kuipers OP: Time-resolved determination of the CcpA regulon of Lactococcus lactis subsp. cremoris MG1363. J Bacteriol. 2007, 189 (4): 1366-1381. 10.1128/JB.01013-06.
- Kraus A, Hueck C, Gärtner D, Hillen W: Catabolite repression of the Bacillus subtilis xyl operon involves a cis element functional in the context of an unrelated sequence, and glucose exerts additional xylR-dependent repression. J Bacteriol. 1994, 176 (6): 1738-1745.
- Leboeuf C, Leblanc L, Auffray Y, Hartke A: Characterization of the ccpA gene of Enterococcus faecalis: identification of starvation-inducible proteins regulated by ccpA. J Bacteriol. 2000, 182 (20): 5799-5806. 10.1128/JB.182.20.5799-5806.2000.
- Yoshida K, Kobayashi K, Miwa Y, Kang CM, Matsunaga M, Yamaguchi H, Tojo S, Yamamoto M, Nishi R, Ogasawara N, Nakayama T, Fujita Y: Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in Bacillus subtilis. Nucleic Acids Res. 2001, 29 (3): 683-692. 10.1093/nar/29.3.683.
- Weickert MJ, Chambliss GH: Site-directed mutagenesis of a catabolite repression operator sequence in Bacillus subtilis. Proc Natl Acad Sci USA. 1990, 87 (16): 6238-6242. 10.1073/pnas.87.16.6238.
- Hueck CJ, Hillen W, Saier MH: Analysis of a cis-active sequence mediating catabolite repression in gram-positive bacteria. Res Microbiol. 1994, 145 (7): 503-518. 10.1016/0923-2508(94)90028-0.
- Reidl J, Römisch K, Ehrmann M, Boos W: MalI, a novel protein involved in regulation of the maltose system of Escherichia coli, is highly homologous to the repressor proteins GalR, CytR, and LacI. J Bacteriol. 1989, 171 (9): 4888-4899.
- Weickert MJ, Adhya S: A family of bacterial regulators homologous to Gal and Lac repressors. J Biol Chem. 1992, 267 (22): 15869-15874.
- Spronk CA, Bonvin AM, Radha PK, Melacini G, Boelens R, Kaptein R: The solution structure of Lac repressor headpiece 62 complexed to a symmetrical lac operator. Structure. 1999, 7 (12): 1483-1492. 10.1016/S0969-2126(00)88339-2.
- Lewis M: The lac repressor. C R Biol. 2005, 328 (6): 521-548. 10.1016/j.crvi.2005.04.004.
- Makita Y, Nakao M, Ogasawara N, Nakai K: DBTBS: database of transcriptional regulation in Bacillus subtilis and its contribution to comparative genomics. Nucleic Acids Res. 2004, 32 (Database issue): D75-77. 10.1093/nar/gkh074.
- Kummerfeld SK, Teichmann SA: DBD: a transcription factor prediction database. Nucleic Acids Res. 2006, 34 (Database issue): D74-81. 10.1093/nar/gkj131.
- Salgado H, Santos-Zavaleta A, Gama-Castro S, Peralta-Gil M, Peñaloza-Spínola MI, Martínez-Antonio A, Karp PD, Collado-Vides J: The comprehensive updated regulatory network of Escherichia coli K-12. BMC Bioinformatics. 2006, 7 (1): 5. 10.1186/1471-2105-7-5.
- Kleerebezem M, Boekhorst J, van Kranenburg R, Molenaar D, Kuipers OP, Leer R, Tarchini R, Peters SA, Sandbrink HM, Fiers MW, Stiekema W, Klein Lankhorst RM, Bron PA, Hoffer SM, Nierop Groot M, Kerkhoven R, de Vries M, Ursing B, de Vos WM, Siezen RJ: Complete genome sequence of Lactobacillus plantarum WCFS1. Proc Natl Acad Sci USA. 2003, 100 (4): 1990-1995. 10.1073/pnas.0337704100.
- Siezen R, Boekhorst J, Muscariello L, Molenaar D, Renckens B, Kleerebezem M: Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria. BMC Genomics. 2006, 7: 126. 10.1186/1471-2164-7-126.
- Fitch WM: Homology a personal view on some of the problems. Trends Genet. 2000, 16 (5): 227-231. 10.1016/S0168-9525(00)02005-9.
- Koonin EV: Orthologs, paralogs, and evolutionary genomics. Annu Rev Genet. 2005, 39: 309-338. 10.1146/annurev.genet.39.073003.114725.
- Huynen MA, Gabaldón T, Snel B: Variation and evolution of biomolecular systems: searching for functional relevance. FEBS Lett. 2005, 579 (8): 1839-1845. 10.1016/j.febslet.2005.02.004.
- Tagle DA, Koop BF, Goodman M, Slightom JL, Hess DL, Jones RT: Embryonic epsilon and gamma globin genes of a prosimian primate (Galago crassicaudatus). Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints. J Mol Biol. 1988, 203 (2): 439-455. 10.1016/0022-2836(88)90011-3.
- Boffelli D, McAuliffe J, Ovcharenko D, Lewis KD, Ovcharenko I, Pachter L, Rubin EM: Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science. 2003, 299 (5611): 1391-1394. 10.1126/science.1081331.
- Hall BG: The EBG system of E. coli: origin and evolution of a novel beta-galactosidase for the metabolism of lactose. Genetica. 2003, 118 (2-3): 143-156. 10.1023/A:1024149508376.
- Makarova K, Slesarev A, Wolf Y, Sorokin A, Mirkin B, Koonin E, Pavlov A, Pavlova N, Karamychev V, Polouchine N, Shakhova V, Grigoriev I, Lou Y, Rohksar D, Lucas S, Huang K, Goodstein DM, Hawkins T, Plengvidhya V, Welker D, Hughes J, Goh Y, Benson A, Baldwin K, Lee JH, Díaz-Muñiz I, Dosti B, Smeianov V, Wechter W, Barabote R, Lorca G, Altermann E, Barrangou R, Ganesan B, Xie Y, Rawsthorne H, Tamir D, Parker C, Breidt F, Broadbent J, Hutkins R, O'Sullivan D, Steele J, Unlu G, Saier M, Klaenhammer T, Richardson P, Kozyavkin S, Weimer B, Mills D: Comparative genomics of the lactic acid bacteria. Proc Natl Acad Sci USA. 2006, 103 (42): 15611-15616. 10.1073/pnas.0607117103.
- Molenaar D, Bringel F, Schuren FH, de Vos WM, Siezen RJ, Kleerebezem M: Exploring Lactobacillus plantarum genome diversity by using microarrays. J Bacteriol. 2005, 187 (17): 6119-6127. 10.1128/JB.187.17.6119-6127.2005.
- Mahr K, Hillen W, Titgemeyer F: Carbon catabolite repression in Lactobacillus pentosus: analysis of the ccpA region. Appl Environ Microbiol. 2000, 66 (1): 277-283.
- Muscariello L, Marasco R, De Felice M, Sacco M: The functional ccpA gene is required for carbon catabolite repression in Lactobacillus plantarum. Appl Environ Microbiol. 2001, 67 (7): 2903-2907. 10.1128/AEM.67.7.2903-2907.2001.
- Lapierre L, Mollet B, Germond JE: Regulation and adaptive evolution of lactose operon expression in Lactobacillus delbrueckii. J Bacteriol. 2002, 184 (4): 928-935. 10.1128/jb.184.4.928-935.2002.
- Vaughan EE, van den Bogaard PT, Catzeddu P, Kuipers OP, de Vos WM: Activation of silent gal genes in the lac-gal regulon of Streptococcus thermophilus. J Bacteriol. 2001, 183 (4): 1184-1194. 10.1128/JB.183.4.1184-1194.2001.
- Ajdić D, Ferretti JJ: Transcriptional regulation of the Streptococcus mutans gal operon by the GalR repressor. J Bacteriol. 1998, 180 (21): 5727-5732.
- Nieto C, Puyet A, Espinosa M: MalR-mediated regulation of the Streptococcus pneumoniae malMP operon at promoter PM. Influence of a proximal divergent promoter region and competition between MalR and RNA polymerase proteins. J Biol Chem. 2001, 276 (18): 14946-14954. 10.1074/jbc.M010911200.
- Mekjian KR, Bryan EM, Beall BW, Moran CP: Regulation of hexuronate utilization in Bacillus subtilis. J Bacteriol. 1999, 181 (2): 426-433.
- Bell CE, Lewis M: The Lac repressor: a second generation of structural and functional studies. Curr Opin Struct Biol. 2001, 11 (1): 19-25. 10.1016/S0959-440X(00)00180-9.
- Kalodimos CG, Biris N, Bonvin AM, Levandoski MM, Guennuegues M, Boelens R, Kaptein R: Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science. 2004, 305 (5682): 386-389. 10.1126/science.1097064.
- Schumacher MA, Allen GS, Diel M, Seidel G, Hillen W, Brennan RG: Structural basis for allosteric control of the transcription regulator CcpA by the phosphoprotein HPr-Ser46-P. Cell. 2004, 118 (6): 731-741. 10.1016/j.cell.2004.08.027.
- Kim JH, Yang YK, Chambliss GH: Evidence that Bacillus catabolite control protein CcpA interacts with RNA polymerase to inhibit transcription. Mol Microbiol. 2005, 56 (1): 155-162. 10.1111/j.1365-2958.2005.04496.x.
- Becskei A, Serrano L: Engineering stability in gene networks by autoregulation. Nature. 2000, 405 (6786): 590-593. 10.1038/35014651.
- Roy S, Sahu A, Adhya S: Evolution of DNA binding motifs and operators. Gene. 2002, 285 (1-2): 169-173. 10.1016/S0378-1119(02)00413-4.
- Maheshri N, O'Shea EK: Living with noisy genes: how cells function reliably with inherent variability in gene expression. Annu Rev Biophys Biomol Struct. 2007, 36: 413-434. 10.1146/annurev.biophys.36.040306.132705.
- van der Heijden RT, Snel B, van Noort V, Huynen MA: Orthology prediction at scalable resolution by phylogenetic tree analysis. BMC Bioinformatics. 2007, 8: 83. 10.1186/1471-2105-8-83.
- Quentin Y, Fichant G, Denizot F: Inventory, assembly and analysis of Bacillus subtilis ABC transport systems. J Mol Biol. 1999, 287 (3): 467-484. 10.1006/jmbi.1999.2624.
- Le Breton Y, Pichereau V, Sauvageot N, Auffray Y, Rincé A: Maltose utilization in Enterococcus faecalis. J Appl Microbiol. 2005, 98 (4): 806-813. 10.1111/j.1365-2672.2004.02468.x.
- Saulnier DM, Molenaar D, de Vos WM, Gibson GR, Kolida S: Identification of prebiotic fructooligosaccharide metabolism in Lactobacillus plantarum WCFS1 through microarrays. Appl Environ Microbiol. 2007, 73 (6): 1753-1765. 10.1128/AEM.01151-06.
- Stentz R, Cornet M, Chaillou S, Zagorec M: Adaptation of Lactobacillus sakei to meat: a new regulatory mechanism of ribose utilization?. Lait. 2001, 81: 131-138. 10.1051/lait:2001117.
- Hiratsuka K, Wang B, Sato Y, Kuramitsu H: Regulation of sucrose-6-phosphate hydrolase activity in Streptococcus mutans: characterization of the scrR gene. Infect Immun. 1998, 66 (8): 3736-3743.
- Luesink EJ, Marugg JD, Kuipers OP, de Vos WM: Characterization of the divergent sacBK and sacAR operons, involved in sucrose utilization by Lactococcus lactis. J Bacteriol. 1999, 181 (6): 1924-1926.
- Barrangou R, Altermann E, Hutkins R, Cano R, Klaenhammer TR: Functional and comparative genomic analyses of an operon involved in fructooligosaccharide utilization by Lactobacillus acidophilus. Proc Natl Acad Sci USA. 2003, 100 (15): 8957-8962. 10.1073/pnas.1332765100.
- Vaillancourt K, Moineau S, Frenette M, Lessard C, Vadeboncoeur C: Galactose and lactose genes from the galactose-positive bacterium Streptococcus salivarius and the phylogenetically related galactose-negative bacterium Streptococcus thermophilus: organization, sequence, transcription, and activity of the gal gene products. J Bacteriol. 2002, 184 (3): 785-793.
- Barrangou R, Azcarate-Peril MA, Duong T, Conners SB, Kelly RM, Klaenhammer TR: Global analysis of carbohydrate utilization by Lactobacillus acidophilus using cDNA microarrays. Proc Natl Acad Sci USA. 2006, 103 (10): 3816-3821. 10.1073/pnas.0511287103.
- Turinsky AJ, Grundy FJ, Kim JH, Chambliss GH, Henkin TM: Transcriptional activation of the Bacillus subtilis ackA gene requires sequences upstream of the promoter. J Bacteriol. 1998, 180 (22): 5961-5967.
- Maze A, Boel G, Poncet S, Mijakovic I, Le Breton Y, Benachour A, Monedero V, Deutscher J, Hartke A: The Lactobacillus casei ptsHI47T mutation causes overexpression of a LevR-regulated but RpoN-independent operon encoding a mannose class phosphotransferase system. J Bacteriol. 2004, 186 (14): 4543-4555. 10.1128/JB.186.14.4543-4555.2004.
- Schick J, Weber B, Klein JR, Henrich B: PepR1, a CcpA-like transcription regulator of Lactobacillus delbrueckii subsp. lactis. Microbiology. 1999, 145: 3147-3154.
- Andersson U, Molenaar D, Rådström P, de Vos WM: Unity in organisation and regulation of catabolic operons in Lactobacillus plantarum, Lactococcus lactis and Listeria monocytogenes. Syst Appl Microbiol. 2005, 28 (3): 187-195. 10.1016/j.syapm.2004.11.004.
- Pedersen H, Valentin-Hansen P: Protein-induced fit: the CRP activator protein changes sequence-specific DNA recognition by the CytR repressor, a highly flexible LacI member. EMBO J. 1997, 16 (8): 2108-2118. 10.1093/emboj/16.8.2108.
- Falcon CM, Matthews KS: Operator DNA sequence variation enhances high affinity binding by hinge helix mutants of lactose repressor protein. Biochemistry. 2000, 39 (36): 11074-11083. 10.1021/bi000924z.
- Lehming N, Sartorius J, Kisters-Woike B, von Wilcken-Bergmann B, Müller-Hill B: Mutant lac repressors with new specificities hint at rules for protein-DNA recognition. EMBO J. 1990, 9 (3): 615-621.
- Sadler JR, Sasmor H, Betz JL: A perfectly symmetric lac operator binds the lac repressor very tightly. Proc Natl Acad Sci USA. 1983, 80 (22): 6785-6789. 10.1073/pnas.80.22.6785.
- Betz JL, Sasmor HM, Buck F, Insley MY, Caruthers MH: Base substitution mutants of the lac operator: in vivo and in vitro affinities for lac repressor. Gene. 1986, 50 (1-3): 123-132. 10.1016/0378-1119(86)90317-3.
- Spronk CA, Folkers GE, Noordman AM, Wechselberger R, van den Brink N, Boelens R, Kaptein R: Hinge-helix formation and DNA bending in various lac repressor-operator complexes. EMBO J. 1999, 18 (22): 6472-6480. 10.1093/emboj/18.22.6472.
- Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB: Gene regulation at the single-cell level. Science. 2005, 307 (5717): 1962-1965. 10.1126/science.1106914.
- Robinow C, Kellenberger E: The bacterial nucleoid revisited. Microbiol Rev. 1994, 58 (2): 211-232.
- Chauvaux S, Paulsen IT, Saier MH: CcpB, a novel transcription factor implicated in catabolite repression in Bacillus subtilis. J Bacteriol. 1998, 180 (3): 491-497.
- Barkley MD, Bourgeois S: Repressor recognition of operator and effectors. The operon. Edited by: Miller JH, Reznikoff WS. 1978, Cold Spring Harbor, NY, Cold Spring Harbor Laboratory, 177-220.
- Overbeek R, Larsen N, Walunas T, D'Souza M, Pusch G, Selkov E, Liolios K, Joukov V, Kaznadzey D, Anderson I, Bhattacharyya A, Burd H, Gardner W, Hanke P, Kapatral V, Mikhailova N, Vasieva O, Osterman A, Vonstein V, Fonstein M, Ivanova N, Kyrpides N: The ERGO genome analysis and discovery system. Nucleic Acids Res. 2003, 31 (1): 164-171. 10.1093/nar/gkg148.
- Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Kapustin Y, Khovayko O, Landsman D, Lipman DJ, Madden TL, Maglott DR, Ostell J, Miller V, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Tatusov RL, Tatusova TA, Wagner L, Yaschenko E: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2007, 35 (Database issue): D5-12. 10.1093/nar/gkl1031.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.
- Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004, 5: 113. 10.1186/1471-2105-5-113.
- Tippmann HF: Analysis for free: comparing programs for sequence analysis. Brief Bioinform. 2004, 5 (1): 82-87. 10.1093/bib/5.1.82.
- Clamp M, Cuff J, Searle SM, Barton GJ: The Jalview Java alignment editor. Bioinformatics. 2004, 20 (3): 426-427. 10.1093/bioinformatics/btg430.
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25 (24): 4876-4882. 10.1093/nar/25.24.4876.
- Kimura M: Estimation of evolutionary distances between homologous nucleotide sequences. Proc Natl Acad Sci USA. 1981, 78 (1): 454-458. 10.1073/pnas.78.1.454.
- Bailey TL, Elkan C: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994, 2: 28-36.
- Bailey TL, Gribskov M: Combining evidence using p-values: application to sequence homology searches. Bioinformatics. 1998, 14 (1): 48-54. 10.1093/bioinformatics/14.1.48.
- Finn RD, Mistry J, Schuster-Böckler B, Griffiths-Jones S, Hollich V, Lassmann T, Moxon S, Marshall M, Khanna A, Durbin R, Eddy SR, Sonnhammer EL, Bateman A: Pfam: clans, web tools and services. Nucleic Acids Res. 2006, 34 (Database issue): D247-251. 10.1093/nar/gkj149.
- Tojo S, Satomura T, Morisaki K, Deutscher J, Hirooka K, Fujita Y: Elaborate transcription regulation of the Bacillus subtilis ilv-leu operon involved in the biosynthesis of branched-chain amino acids through global regulators of CcpA, CodY and TnrA. Mol Microbiol. 2005, 56 (6): 1560-1573.
- Wolf M, Müller T, Dandekar T, Pollack JD: Phylogeny of Firmicutes with special reference to Mycoplasma (Mollicutes) as inferred from phosphoglycerate kinase amino acid sequence data. Int J Syst Evol Microbiol. 2004, 54 (Pt 3): 871-875. 10.1099/ijs.0.02868-0.
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14 (6): 1188-1190. 10.1101/gr.849004.
- Andersson U, Rådström P: Physiological function of the maltose operon regulator, MalR, in Lactococcus lactis. BMC Microbiol. 2002, 2: 28. 10.1186/1471-2180-2-28.
- Daniel RA, Haiech J, Denizot F, Errington J: Isolation and characterization of the lacA gene encoding beta-galactosidase in Bacillus subtilis and a regulator gene, lacR. J Bacteriol. 1997, 179 (17): 5636-5638.
- Inaoka T, Takahashi K, Yada H, Yoshida M, Ochi K: RNA polymerase mutation activates the production of a dormant antibiotic 3,3'-neotrehalosadiamine via an autoinduction mechanism in Bacillus subtilis. J Biol Chem. 2004, 279 (5): 3885-3892. 10.1074/jbc.M309925200.
- Jobe A, Bourgeois S: lac Repressor-operator interaction. VI. The natural inducer of the lac operon. J Mol Biol. 1972, 69 (3): 397-408. 10.1016/0022-2836(72)90253-7.
- Müller W, Horstmann N, Hillen W, Sticht H: The transcription regulator RbsR represents a novel interaction partner of the phosphoprotein HPr-Ser46-P in Bacillus subtilis. FEBS J. 2006, 273 (6): 1251-1261. 10.1111/j.1742-4658.2006.05148.x.
- Nieto C, Espinosa M, Puyet A: The maltose/maltodextrin regulon of Streptococcus pneumoniae. Differential promoter regulation by the transcriptional repressor MalR. J Biol Chem. 1997, 272 (49): 30860-30865. 10.1074/jbc.272.49.30860.
- Schönert S, Seitz S, Krafft H, Feuerbaum EA, Andernach I, Witz G, Dahl MK: Maltose and maltodextrin utilization by Bacillus subtilis. J Bacteriol. 2006, 188 (11): 3911-3922. 10.1128/JB.00213-06.
- Silvestroni A, Connes C, Sesma F, Savoy de Giori G, Piard JC: Characterization of the melA locus for alpha-galactosidase in Lactobacillus plantarum. Appl Environ Microbiol. 2002, 68 (11): 5464-5471.
- Stentz R, Zagorec M: Ribose utilization in Lactobacillus sakei: analysis of the regulation of the rbs operon and putative involvement of a new transporter. J Mol Microbiol Biotechnol. 1999, 1 (1): 165-173.
- Woodson K, Devine KM: Analysis of a ribose transport operon from Bacillus subtilis. Microbiology. 1994, 140 (Pt 8): 1829-1838.
- Marasco R, Muscariello L, Rigano M, Sacco M: Mutational analysis of the bglH catabolite-responsive element (cre) in Lactobacillus plantarum. FEMS Microbiol Lett. 2002, 208 (1): 143-146. 10.1111/j.1574-6968.2002.tb11074.x.
- NC-IUB: Nomenclature for incompletely specified bases in nucleic acid sequences. Recommendations 1984. Eur J Biochem. 1985, 150 (1): 1-5. 10.1111/j.1432-1033.1985.tb08977.x.
- Rodionov DA, Mironov AA, Gelfand MS: Transcriptional regulation of pentose utilisation systems in the Bacillus/Clostridium group of bacteria. FEMS Microbiol Lett. 2001, 205 (2): 305-314. 10.1111/j.1574-6968.2001.tb10965.x.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.