Sigma factors specify bacterial transcription by binding to a characteristic promoter and thereby recruiting the associated RNA polymerase to that promoter. Ordinarily, the expression of genes/operons is controlled by the so-called 'housekeeping' sigma factor, Sigma-70. However, most bacteria possess a larger repertoire of sigma factors of the Sigma-70 family, where each additional factor is associated with a specific programmed response . For instance, in Escherichia coli and related Gamma-proteobacteria the entry into stationary phase and the adaptation to starvation are associated with Sigma-S [2, 3], whereas the response to heat shock and similar stresses is mediated by Sigma-32 (e.g. [4, 5]). In Bacillus subtilis, sporulation is orchestrated by five sigma factors (Sigma-E, F, G, H and K) , whereas the general stress response is controlled by Sigma-B [7, 8]. In many species, particular extracellular signals are translated into an appropriate response by ECF sigma factors .
One sigma factor seemingly does not fit this picture, as it has been associated with a range of physiological phenomena rather than with a single response. Sigma-54 (gene rpoN in E. coli, sigL in B. subtilis) constitutes an evolutionarily separate protein family and is widely distributed across the bacterial kingdom, although some phyla lack the protein [10, 11]. It binds to a characteristic -24/-12 promoter [12–14] and absolutely requires the input of free energy (ATP) from an associated activator to initiate transcription [15, 16] (see [17, 18] for recent reviews on the mechanism). In most cases the activator binds to an enhancer element located upstream of the promoter and is therefore referred to as an Enhancer Binding Protein (EBP54). The EBP54s bind the DNA as inactive dimers, but upon reception of the appropriate signal they assemble into oligomeric rings [19, 20], with hexamers constituting the active oligomeric state . A large variety of EBP54s exists, and although some species possess only one, for instance Chlamydia trachomatis  and Lactobacillus plantarum , most species have several. B. subtilis and E. coli were reported to have five (see ) and twelve , respectively, and Myxococcus xanthus to have fifty-three . However, many of the reported numbers need correction (as described later) because the previous analyses included EBP54 paralogs that have lost the interaction with Sigma-54, like TyrR  and DhaR  in E. coli and HupR in Rhodobacter capsulatus [29, 30].
Historically, Sigma-54 has been linked to the regulation of nitrogen metabolism. The protein was discovered as a positive regulatory factor needed for the expression of enterobacterial glutamine synthetase , before it was recognized that the protein is actually a sigma factor . However, it was established soon afterwards that Sigma-54-mediated control of transcription is connected not only to nitrogen assimilation but also to a wider range of cellular processes and physiology in the enterobacteria [25, 33]. Since then, its role has been shown to extend to, for example: flagellar biosynthesis in E. coli ; carboxylate uptake, central metabolism and flagellar biosynthesis in Geobacter sulfurreducens ; phosphotransferase system (PTS)-mediated carbohydrate uptake in the Gram-positive species Lactobacillus plantarum  and Listeria monocytogenes ; PTS-mediated regulation in Gram-positive as well as Gram-negative organisms [37, 38]; osmotolerance in Listeria ; the utilization of compounds like gamma-aminobutyrate in Bacillus , and of the less familiar biphenyl in Ralstonia metallidurans  and toluene, xylene (see ) and choline  in Pseudomonas; Type III secretion system-mediated pathogenicity in Pseudomonas syringae  and Type VI secretion system-mediated toxin secretion in e.g. Aeromonas and Marinomonas ; the adaptation to cold shock in B. subtilis ; the control of Sigma-S , lipoprotein biosynthesis and virulence  in Borrelia burgdorferi; acid resistance of pathogenic E. coli O157 ; biofilm formation by Burkholderia ; and motility, biofilm formation, luminescence, and colonization in Vibrio fischeri [50, 51]. This plethora of associations has so far obscured any general underlying functional theme beyond the accepted associations with nitrogen metabolism and flagellar biosynthesis.
Several comparative studies have been performed for Sigma-54 and EBP54-mediated regulation [10, 15, 16, 52], but no unifying biological theme was identified. An in-depth comparative analysis was made for E. coli by . These authors concluded that nitrogen assimilation was one of the main processes connecting the Sigma-54 regulon. In addition, they found that a substantial fraction of the associated functions were seemingly unrelated. Some additional associations were proposed on the basis of a comparative analysis of Pseudomonas putida, including links to carbon metabolism and flagellar biosynthesis . Since the last comprehensive comparative study in 2003, a considerable number of genomes have been sequenced, allowing us to make a new overview of the presence of Sigma-54 and the EBP-activators. Surprisingly, we found a clear-cut connection between the presence of the system and characteristic morphological features. To improve the identification of true EBP54 activators and Sigma-54 promoters, we have tested and employed a straightforward motif search algorithm that directly relates to sequence similarity. Redefinition of the -24/-12 promoter and the similar motif search (SMS) approach allowed for the reliable identification of promoter sites in all species. Finally, we have analyzed the function annotations that were highly represented (intra-phylum) and conserved (inter-phylum) within the genomic context of all genes encoding Sigma-54, its activators and its promoters, to identify common functional traits.
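To make the notion of a -24/-12 promoter search concrete, the following minimal Python sketch scans a DNA sequence for candidate sites on both strands. It is not the SMS approach used in this study, nor the redefined promoter; it assumes only the commonly cited conserved core of Sigma-54 promoters, a GG dinucleotide near -24 and a GC dinucleotide near -12 separated by ten nucleotides. The function names `revcomp` and `scan`, and the example sequence, are hypothetical.

```python
import re

# Assumption: a bare GG-N10-GC core stands in for the -24/-12 promoter.
# The lookahead wrapper reports overlapping candidates as well.
CORE = re.compile(r"(?=(GG[ACGT]{10}GC))")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq):
    """Yield (strand, start, site) for every core match on either strand.

    start is the 0-based index of the leftmost base of the site on the
    forward strand; site is always written 5'->3' on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    for m in CORE.finditer(seq):
        yield "+", m.start(1), m.group(1)
    for m in CORE.finditer(revcomp(seq)):
        site = m.group(1)
        # Convert the reverse-strand coordinate back to the forward strand.
        yield "-", n - m.start(1) - len(site), site


# Invented toy sequence containing one forward-strand candidate.
example = "AAATGGCACGATTTTTGCATAA"
hits = list(scan(example))
```

Such a bare consensus scan produces many false positives in a real genome, which is why a similarity-based scoring of the full motif is preferable in practice.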
Conserved genome context, i.e. synteny, is a strong indicator of a functional relationship between genes [54, 55] and is therefore used broadly to guide function prediction. In principle, the observation that functions encoded in conserved genomic proximity are mostly related holds not only for genes but by necessity extends to genetic (regulatory) elements , and thereby also to the associated regulators (see e.g. ) and their (in)activating signals . As a consequence, a comparative analysis of the conserved genome context of regulators and regulatory elements should yield clues regarding the particular associated stimuli and responses. Although regulatory routes can vary between species much more than metabolic pathways, the functional associations at a higher hierarchical level (i.e. in terms of process, response and/or physiology) are far less variable. For instance, the bacterial PTS mediates the transport and phosphorylation of carbohydrates by means of phosphoenolpyruvate via the same phosphotransfer mechanism in all species and, at a higher hierarchical level, the system controls the same processes, such as catabolite repression and chemotaxis [37, 59]. Nevertheless, the precise regulatory interactions of the PTS and the intracellular signals that connect the organism's physiological state to the metabolic level differ significantly between groups of species (e.g. catabolite repression involves EIIAGlc and cAMP in E. coli, whereas it involves HPr and fructose-1,6-bisphosphate in B. subtilis). This implies that underlying functional themes that cannot be discovered directly, for instance by studying conserved gene associations of a particular regulator, may be discovered by mapping the associated functions at a higher hierarchical level (like pathways).
Absolute conservation will be relatively rare because of the variability in specific regulatory associations noted earlier. To take this variability into account, we included in our analysis those functional associations that are highly represented within a phylum/class and at the same time evolutionarily conserved, that is, present within several phyla/classes. Associations that fulfill this criterion can be viewed as cross-phylum (or cross-class) conserved function tendencies. By mapping the conserved annotations present in the genetic context of the genes encoding Sigma-54, its EBP54-activators and its promoters, we discovered that there is indeed a common functional theme related to Sigma-54-mediated regulation, namely the control of the transport and biosynthesis of the molecules that constitute the bacterial exterior, which encompass the extracellular polysaccharides (EPS), flagella, lipopolysaccharides (LPS), lipoproteins and the building blocks of the peptidoglycan cell wall.
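The selection criterion described above (highly represented within a phylum, and present in several phyla) can be sketched as a two-step filter. The Python example below is illustrative only: the function name `conserved_tendencies`, the cutoffs `min_fraction` and `min_phyla`, and the toy observations are assumptions made for this sketch and are not values or data from the analysis.

```python
from collections import defaultdict


def conserved_tendencies(observations, min_fraction=0.5, min_phyla=3):
    """Return annotations frequent within a phylum and recurring across phyla.

    observations: iterable of (phylum, genome_id, annotation) tuples, one per
    annotation found in the genomic context of a gene or promoter of interest.
    min_fraction and min_phyla are hypothetical cutoffs chosen for illustration.
    Only genomes with at least one observation count toward a phylum's total.
    """
    genomes_per_phylum = defaultdict(set)
    genomes_with_annotation = defaultdict(set)  # key: (phylum, annotation)
    for phylum, genome_id, annotation in observations:
        genomes_per_phylum[phylum].add(genome_id)
        genomes_with_annotation[(phylum, annotation)].add(genome_id)

    # Step 1: keep annotations that are highly represented within a phylum.
    phyla_supporting = defaultdict(set)
    for (phylum, annotation), genomes in genomes_with_annotation.items():
        fraction = len(genomes) / len(genomes_per_phylum[phylum])
        if fraction >= min_fraction:
            phyla_supporting[annotation].add(phylum)

    # Step 2: keep annotations supported by enough different phyla.
    return {
        annotation: sorted(phyla)
        for annotation, phyla in phyla_supporting.items()
        if len(phyla) >= min_phyla
    }


# Invented toy data, for demonstration only.
toy = [
    ("Proteobacteria", "g1", "flagellar biosynthesis"),
    ("Proteobacteria", "g2", "flagellar biosynthesis"),
    ("Proteobacteria", "g1", "nitrogen assimilation"),
    ("Firmicutes", "g3", "flagellar biosynthesis"),
    ("Firmicutes", "g4", "PTS transport"),
    ("Spirochaetes", "g5", "flagellar biosynthesis"),
]
result = conserved_tendencies(toy)
```

With these toy observations only "flagellar biosynthesis" passes both steps; the other two annotations are frequent in a single phylum and are therefore discarded.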