The TF-specific operator motif identification workflow. 1) First a particular TF-family was selected and 2) a prominent representative of that family was chosen. 3) The related sequence was used to search the genome of a particular species for intra-species homologs. This search was iterated until no new sequences are recovered. A high e-value cut-off was employed to ensure the recovery of all homologs. The sequences were aligned and a NJ-tree was generated. Both the alignment and the NJ-tree were used to determine the family or sub-family boundaries. 4) The procedure was repeated to retrieve all inter-species homologs and the general features of the intra-species homologs were used to determine the sequences that were taken into consideration. Orthologous relations between sequences were established on basis of clustering in the NJ-tree and a sufficient bootstrap support (in green) for the clustering. In the case of Lp_0172 and Lp_0173 the orthologous clusters are color-coded in brown and orange, respectively, and the other TFs of L. plantarum are indicated in red. 5) The genomic context of the various orthologs was inspected (legend bottom left) and in case clear differences existed, the orthologous groups were sub-divided into different Groups of orthologous functional equivalents (GOOFEs), as illustrated. Then, upstream regions of the conserved gene(s) in context were selected and inspected for potential regulatory sequences (the selected regions are indicated by colored triangles). The potential regulatory sequences were compared and those that showed similar features were selected. In fact, only those sequences that showed the highest conservation were selected to determine a specific operator motif. In the case of Lp_0172 and Lp_0173, a 'CcpA-like' operator motif was found up to 3 times in the upstream regions. The sequences that were selected to determine the Lp_0172 and Lp_0173 specific operator motifs are displayed (Px indicates the relative position of the selected sequence with respect to other similar sequences and relative to the translation start). 6) The selected sequences were used to create a GOOFE specific operator motif. The thus identified specific motifs related to the orthologous groups containing Lp_0172 and Lp_0173 demonstrate that the division into GOOFEs was essential to arrive at highly specific operator motifs. Although the motifs within both orthologous groups are highly similar, they differ distinctly in one position depending on the GOOFE. In the case of the TFs orthologous to Lp_0172, the motifs are strikingly different at position +5, with a fully conserved guanine in the GOOFE containing Lp_0172 and a fully conserved thymidine in the other. And in the case of the TFs orthologous to Lp_0173 the motifs are strikingly different at position -5, with a fully conserved thymidine in the GOOFE containing Lp_0173 and a fully conserved adenine in the other. remark: The gene/protein identifiers in the figure are derived from the ERGO resource . A conversion to other identifiers can be found in [Additional file 2]. The functional annotation of the depicted genes were taken from the in-house annotation database of L. plantarum WCFS1 ( and C. Francke unpublished results) and the ERGO resource. See [Additional file 9] for a detailed description of the functional annotation in L. plantarum.