
High diversity in the regulatory region of Shiga toxin encoding bacteriophages



Enterohemorrhagic Escherichia coli (EHEC) is an emerging health challenge worldwide, and outbreaks caused by this pathogen pose a serious public health concern. Shiga toxin (Stx) is the major virulence factor of EHEC, and the stx genes are carried by temperate bacteriophages (Stx phages). The switch between the lysogenic and lytic life cycles of the phage, which is crucial for Stx production and for the severity of the disease, is regulated by the CI repressor, which maintains latency by preventing transcription of the genes encoding the replication proteins. Three EHEC phage replication units (Eru1-3), in addition to the classical lambdoid replication region, have been described previously, and Stx phages carrying the Eru1 replication region were associated with highly virulent EHEC strains.


In this study, we have classified the Eru replication region of 419 Stx phages. In addition to the lambdoid replication region and three already described Erus, ten novel Erus (Eru4 to Eru13) were detected. The lambdoid type, Eru1, Eru4 and Eru7 are widely distributed in Western Europe. Notably, EHEC strains involved in severe outbreaks in England and Norway carry Stx phages with Eru1, Eru2, Eru5 and Eru7 replication regions. Phylogenetic analysis of CI repressors from Stx phages revealed eight major clades that largely separate according to Eru type.


The classification of replication regions and CI proteins of Stx phages provides an important platform for further studies that aim to assess how characteristics of the replication region influence the regulation of the phage life cycle and, consequently, the virulence potential of the host EHEC strain.



Enterohemorrhagic Escherichia coli (EHEC) is an important foodborne pathogen, responsible for disease in humans ranging from uncomplicated diarrhea to severe conditions such as hemorrhagic colitis and hemolytic uremic syndrome (HUS) [1,2,3]. The WHO has estimated that 10% of patients with EHEC infection develop HUS and that the most important source of HUS cases was contaminated beef [4]. The first major EHEC outbreak took place in the USA in 1982 and was caused by hamburgers contaminated with E. coli O157:H7 strain EDL 933, which carries the Stx2 phage 933W. Since then, the world has experienced multiple outbreaks of EHEC disease involving serotypes other than O157:H7, and new variants are constantly emerging [5, 6]. Shiga toxin (Stx) is the major virulence factor of EHEC, and it exists in two distinct forms, Stx1 and Stx2. Each form comprises several subtypes [7]; some subtypes, such as Stx2a, are associated with severe disease, while Stx2c is considered less potent [8, 9].

The genes encoding Stx are carried by temperate bacteriophages (Stx phages) [10]. After infection, Stx phages follow either a lysogenic or a lytic pathway. The lysogenic pathway involves integration of phage DNA into the host genome and replication of the phage genetic material along with the chromosome of the host cell. The lytic pathway leads to proliferation of the Stx phage, death of the host bacterial cell and release of new phage particles [11]. Induction of the lytic pathway is also accompanied by production and release of substantial amounts of Stx. Experimental infections of microbiome-replete mice suggest that Stx prophage induction, but not production of phage particles, is required for development of lethal disease [12]. As the amount of Stx produced influences the severity of the disease, the mechanisms regulating the switch from the lysogenic to the lytic life cycle are highly relevant for the pathogenicity of the host E. coli strain.

Since the first sequenced Stx phages shared substantial genomic similarity with phage lambda, it has been assumed that they behave similarly [13, 14]. The increasing availability of whole genome sequences has revealed that Stx-encoding prophages are very diverse and sometimes exhibit only limited similarity to phage lambda [15]. We have previously reported Stx phages with non-lambdoid replication regions and named the regions Eru (EHEC phage replication unit) [15]. The non-lambdoid Stx phages completely lack the O and P genes, which encode proteins involved in replication initiation of the lambdoid phage genome, and instead carry genes that have not previously been described in connection with replication of Stx phages. Three non-lambdoid Stx phage replication regions, Eru1-3, have so far been described [15]. One of the Eru types, Eru1, is carried by the highly pathogenic EHEC strains that caused the Norwegian O103:H25 outbreak in 2006 and the large O104:H4 outbreak in Europe in 2011. It was also shown that Eru1 phages exhibited a less stable lysogenic state than the classical lambdoid Stx phages, which could increase the pathogenicity of the host E. coli strain [15]. The majority of EHEC strains carrying Eru1, Eru2 and Eru3 types of Stx phages were US isolates whose genome sequences were submitted to NCBI databases by the United States Department of Agriculture, the US Food and Drug Administration, and the Food-borne Pathogen Omics Research Center.

Despite the high genetic diversity among Stx phage genomes, the phage replication region and the lysis-lysogeny regulatory systems are always located upstream and in the vicinity of the stxA gene [16]. This region mediates the switch between repression and induction of the prophage, and the mechanisms regulating these events have been studied in detail in phage lambda. The key elements responsible for regulating the life cycle of phage lambda are the gene encoding the repressor CI (cI), the promoter bound by the CI repressor, and the adjacent upstream genes, which are transcribed in the opposite direction to cI (Fig. 1) [17, 18]. The lambda CI repressor downregulates expression of genes involved in production of new phage particles, i.e., the lytic cycle, by binding specifically to the promoter region of the adjacent genes encoding the O and P proteins, which initiate replication of the lambda genome [19]. The crystal structure of CI has been solved and revealed that the protein functions as a homodimer and that repression occurs when two dimers bind cooperatively to adjacent operator sites on the DNA [20]. The C-terminal domain mediates dimer formation, and dimer-dimer interactions enable CI to bind cooperatively to two or more operator sites [21, 22], while the N-terminal domain contains a helix-turn-helix DNA-binding domain [23, 24]. Upon DNA damage, the SOS-response protein RecA becomes activated and may, in lysogenic cells, stimulate autocleavage of CI [25]. Cleaved CI can no longer bind to DNA, and its repression of the promoters in the replication module is thus relieved. In lambdoid prophages, repression by CI ultimately controls Shiga toxin production, and relief of CI repression is required for Stx production [26, 27]. A lysogenic derivative of the Stx phage 933W, encoding a non-cleavable CI repressor, was found to be unable to produce Stx [26]. It has also been demonstrated that low lysogenic stability coincides with low intracellular levels of the CI repressor [28].

Fig. 1

A schematic overview of the genome of an Stx phage. The boxes labeled a and b indicate the replication genes which are represented by O and P in phage lambda and by other less characterized genes in Eru1-3 [15]

Some EHEC strains appear more virulent than others. The type of Stx produced is known to contribute significantly to the pathogenicity of the EHEC strain [9], but the amount of toxin produced should also be considered. The increasing number of outbreaks of gastrointestinal disease and HUS caused by EHEC has stimulated studies on the Stx phages to better understand their contribution to the pathogenicity of the host E. coli strain. However, there is still very limited knowledge of how the different types of replication regions seen among the Stx phages influence the stability of the lysogenic state and the switch to the lytic cycle. In this study, we have classified the CI repressor sequences of 260 Stx phages into clades I-VIII and their replication regions into 13 Eru types to provide a platform for further studies of how the genetic structure of the Stx prophages influences the virulence potential of the host EHEC strain.


Eru types were defined by the identity of the proteins encoded by the two genes located directly upstream of cI and transcribed in the opposite direction, regardless of their function. The identity of the remaining proteins encoded between cI and stx was not considered in this study. Four novel Eru phage types (Eru4-7) were detected among 120 Stx-converting phage genomes retrieved from the NCBI virus database (Fig. 2; Additional file 1), while an additional six novel Eru types, Eru8 to Eru13, were detected among 298 genome sequences obtained from ten examined NCBI BioProjects (Fig. 2; Additional file 2). These genome sequences comprise both EHEC isolated from patients and Shiga toxin-producing E. coli (STEC) isolated from other sources. The genomes are available as unfinished genome assemblies, and only sequences where both cI and stx were located on the same contig were included in this study. Eru2 and Eru3, described in a previous study, both carry genes encoding a protein of unknown function and a helicase directly upstream of cI [15]. However, since the two unknown proteins share a low sequence identity (10%), phages carrying these protein combinations were still assigned to different Eru types [15]. Reference phages representing each Eru type are listed in Table 1.
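The typing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this study (the sequences were inspected manually in pDRAW): the Gene record, the gene labels, the toy coordinates and the ERU_SIGNATURES lookup table are all hypothetical, and the sketch assumes that every gene on the contig is annotated with coordinates and strand.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    name: str    # gene or product label taken from the annotation (hypothetical)
    start: int   # leftmost coordinate on the contig
    end: int     # rightmost coordinate on the contig
    strand: str  # "+" or "-"


def genes_upstream_of_ci(genes: List[Gene], ci_name: str = "cI", n: int = 2) -> List[Gene]:
    """Return the n genes closest to the 5' side of cI that lie on the opposite strand."""
    ci = next((g for g in genes if g.name == ci_name), None)
    if ci is None:
        raise ValueError("no cI gene annotated on this contig")
    opposite = "-" if ci.strand == "+" else "+"
    if ci.strand == "+":
        # Upstream of a forward-strand gene lies at lower coordinates.
        candidates = [g for g in genes if g.strand == opposite and g.end < ci.start]
        candidates.sort(key=lambda g: ci.start - g.end)
    else:
        # Upstream of a reverse-strand gene lies at higher coordinates.
        candidates = [g for g in genes if g.strand == opposite and g.start > ci.end]
        candidates.sort(key=lambda g: g.start - ci.end)
    return candidates[:n]


# Illustrative lookup only; a real table would hold the protein pairs of Table 1.
ERU_SIGNATURES = {("O", "P"): "lambdoid"}


def assign_replication_type(genes: List[Gene]) -> Optional[str]:
    """Map the two genes upstream of cI to a replication type, or None if unknown."""
    key = tuple(g.name for g in genes_upstream_of_ci(genes))
    return ERU_SIGNATURES.get(key)


# Toy contig with invented coordinates, annotated as in the schematic of Fig. 1.
toy_contig = [
    Gene("cI", 1000, 1700, "-"),
    Gene("O", 2000, 2900, "+"),
    Gene("P", 3000, 3700, "+"),
    Gene("Q", 5000, 5400, "+"),
    Gene("stxA", 6000, 6950, "+"),
]
```

Calling assign_replication_type(toy_contig) returns "lambdoid" for this toy example. In practice, protein identity would have to be established by sequence comparison rather than by matching annotation labels.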

Fig. 2

Physical maps of the region between cI (green) and stx (pink). The color code also indicates the putative function of the proteins encoded by the genes directly upstream of cI. White arrows indicate open reading frames (ORFs) which are not discussed in this study

Table 1 Accession numbers of sequences representing each Eru type

The distribution of Eru types found among the 120 sequenced Stx phages is shown in Table 2.

Table 2 Number of Eru types in the data set of 120 Stx-converting phage genomes retrieved from the NCBI virus database (taxid:10239)

The relatively high number of phage genomes belonging to Eru2, Eru3 and Eru7 could be due to a bias related to the number of deposited sequences from different studies (Additional file 1).

Distribution of Eru types among Stx phages from Western Europe

The national distribution of Eru types found among 298 identified contigs carrying both stx and cI from ten European BioProjects is shown in Table 3.

Table 3 Distribution of the thirteen Eru types (1–13) and the lambdoid (L) type in ten European BioProjects

The distribution of Eru types indicates that the lambdoid type and the Eru1, 4, 5, 6 and 7 phage types are among the most common types of Stx phages in Europe, and that Eru7 appears to be particularly widespread (Table 3). The lambdoid, Eru4 and Eru5 phage types seem more inclined to carry genes encoding Stx1, while Eru1, Eru6 and Eru7 seem to be more often associated with genes encoding Stx2 (Table 3).

The Eru proteins

All Eru phages carry genes encoding different types of DNA binding proteins, such as helicases, primases, or other helix-turn-helix (HTH) motif proteins, in the first and/or second position directly upstream of cI (Fig. 2). Eru6, Eru7 and Eru9 phages carry genes encoding proteins of the Phage_pRha protein family (pfam09669) directly upstream of cI (Fig. 2). The Rha domain, which contains a winged helix-turn-helix DNA-binding motif, is also found in other temperate phages, where it has been suggested to have a phage regulatory function [30, 31]. Some of the Rha proteins also contain the Ash domain (PF10554), which is present in the ASH protein of bacteriophage P4. However, no function has so far been assigned to this domain [31]. Eru4, and the previously described Eru2 and Eru3 [15], encode proteins of unknown function directly upstream of cI (Fig. 2). However, there are no similarities between these proteins, and they do not share any previously described protein domains. The primases encoded by genes carried by Eru1, Eru5 and Eru10 phages do not share any sequence similarity (< 10% amino acid identity). The amino acid sequences of the putative helicases encoded by Eru4 and Eru12 are 97% identical, and they both share the AAA motif (PF13604) with the Eru1 helicase [15]. However, the overall sequence homology between the Eru4 and Eru12 helicases and the Eru1 helicase is low (< 10% amino acid identity).

Genes encoding HTH domain proteins are found in either the first or second position directly upstream of cI in Eru5, Eru6, Eru8, Eru11 and Eru13 (Fig. 2). The HTH proteins of Eru5 and Eru6 are 50% identical with a coverage of 66%, the HTH proteins of Eru8 and Eru13 are 59% identical over the total protein sequence, and all five proteins exhibit the HTH_36 motif (PF13730). The HTH proteins of Eru6 and Eru13 also share a motif (PF13814) which is found in protein families essential for relaxation and replication of plasmid DNA [32, 33]. Both Eru8 and Eru11 phages carry a gene encoding a protein with homology to the bacterial toxin YdaT (PF06254) directly upstream of cI. However, the two Eru-encoded toxin-like proteins share only 34% identity with each other. The shortest distance between cI and stx was displayed by Eru10, which only carried a bifunctional DNA primase-polymerase motif protein (PF09250) [34] and the Q antiterminator protein [35, 36] in this region. All other Eru phages also carried the gene encoding the antiterminator Q protein between cI and stx, indicating that this protein is essential for Stx phages.

Eru types in particularly virulent EHEC

To explore the distribution of Eru types carried by highly pathogenic EHECs within a country we examined the Stx phages from six highly pathogenic EHEC O157:H7 strains that have caused larger outbreaks in the UK [36]. Four different Eru phage types in addition to the lambdoid type were found among the six strains (Table 4).

Table 4 Eru type of Stx phages of highly pathogenic EHEC O157:H7 isolates from UK

Among this panel of phages, all phages carrying stx2c and one phage carrying stx2a are of the Eru2 type. Two stx2a-carrying phages are of the Eru5 type, while the two remaining stx2a phages are of types Eru1 and Eru7. The only stx1a-carrying phage among these isolates has a lambdoid replication region. Among the 97 Norwegian STEC strains in BioProject PRJEB6447, 15 strains caused HUS [42], and 13 of these strains carried stx2 phages of Eru types 1, 2 or 7 (Additional file 2).

The CI repressors

The CI repressor regulates transcription of the genes encoding the replication proteins that define the Eru type. To further examine the replication region of Stx phages, a total of 260 annotated CI sequences (Additional file 3) were therefore extracted from the phage genomes and used to build a phylogeny (Fig. 3). This analysis grouped the CI proteins into several distinct clades, of which the major clades, defined by less than 52% sequence identity, were named I to VIII. Despite the higher sequence identity between Clades III and IV, they were kept as two separate clades because of profound structural differences between them; for instance, Clade III completely lacks the HTH DNA-binding domain. The CI protein from phage lambda (NP_040628.1) was most closely related to the CI proteins from phages of Eru types 2 and 3, all belonging to Clade I. The CI proteins of Eru2 and Eru3 phages in this clade were all identical and showed an overall identity of 61% to lambda CI. Lambda CI contains two protein domains, an HTH_3 domain [43] and a peptidase_S24 domain, which executes the CI autolysis [44, 45]. The two domains are conserved within the CI proteins belonging to Clades I, II, IV, V, VI and VII (Fig. 4). However, the CI proteins of Clade III and YP_009907967.1 in Clade V lack the HTH domain, while Clade VIII CI proteins lack the peptidase domain and instead exhibit an additional HTH domain (Fig. 4).
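As an illustration of how a pairwise identity value such as the 52% clade threshold might be computed, the following minimal sketch calculates percent identity between two rows of an alignment. It rests on assumptions that are not taken from this study: the text does not state how identity was calculated, so the choice of denominator (only columns where neither sequence has a gap) and the function name are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns in which either sequence has a gap are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give different values for the same pair, so a threshold is only meaningful together with the convention used.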

Fig. 3

Maximum-likelihood phylogeny of 260 CI protein sequences. The tree was midpoint rooted and bootstrap values > 80% are indicated by grey circles. The Stx type is shown in the inner ring and the Eru type is shown in the outer ring. Clades that are discussed in the text are labelled with roman numerals

Fig. 4

Domain structures of Stx phage CI repressors of Clade I-VIII. HTH_3 domains (grey) and Peptidase_S24 domains (yellow) were assigned according to Pfam

In contrast to the observed high homology between CI proteins within a clade, the homology between the clades was low (Additional file 4). The highest CI homology was seen between Clades I and II (51%) and between Clades III and IV (60%). An amino acid sequence alignment of CI sequences from Clades I to VII is shown in Fig. 5. The alignment revealed six amino acids conserved throughout all clades, one of which was the lambda CI autocleavage residue S150 [21].
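Fully conserved positions of the kind reported above can be located by scanning the columns of a multiple sequence alignment. The sketch below is a minimal example under the assumption that the alignment is available as equal-length strings with "-" as the gap character; the function name and the toy input used in the usage note are hypothetical and unrelated to the sequences in Additional file 3.

```python
from typing import List, Tuple


def conserved_columns(alignment: List[str], gap: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based column number, residue) for columns identical in every sequence."""
    if not alignment:
        return []
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for index, column in enumerate(zip(*alignment), start=1):
        residues = set(column)
        # A column counts as conserved only if it holds one residue and no gap.
        if len(residues) == 1 and gap not in residues:
            conserved.append((index, column[0]))
    return conserved
```

For example, conserved_columns(["MSKA-", "MTKAG", "MSKCG"]) returns [(1, "M"), (3, "K")]. The column numbers refer to the alignment, so relating a conserved column to a residue number in a reference protein such as lambda CI requires counting the non-gap characters of that reference sequence up to the column.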

Fig. 5

Sequence alignment of Clade I-VII Stx phage CI sequences. CI protein from Clade VIII is not included in the alignment due to large structural differences (see Fig. 4). Red boxes indicate the six amino acids that were conserved throughout all clades and the black arrow indicates the CI autocleavage residue found in this type of repressors [21]

Strong correlation between CI Clades and Eru type

There is a strong correspondence between CI clades and Eru types, which is not unexpected in light of their neighboring location in the phage genome. CI proteins belonging to Clades III and V are almost exclusively co-present with Eru4 replication proteins, and the lambdoid replication type is mostly found in connection with Clades VIb and VII (Fig. 3). Similarly, the genes encoding CI proteins belonging to Clade II are almost exclusively located directly upstream of the genes defining Eru1, while those belonging to Clade I are located upstream of Eru2 and Eru3 (Fig. 3). However, a specific CI clade is not necessarily restricted to a specific Eru type and may regulate expression of different Eru types (Fig. 3). CI proteins of Clades III, V and VIb are linked to the lambdoid or Eru4 types and are mainly found in Stx1-producing phages (Fig. 3).


The present study shows that the replication regions of Stx phages are genetically much more diverse than previously anticipated. This finding is important, as differences in phage replication modules may influence the stability of the lysogenic state and the pathogenic potential of the host E. coli strain [15]. In the present study, the Eru type was based on the type of proteins encoded by the two genes located directly upstream of cI. This definition is less differentiating than the definition used by Llarena et al. [15], where the entire region between cI and stx was considered. Given the large variation among the genes located between cI and stx revealed in this study, we found that defining the Eru type by the identity of the two genes upstream of cI provided an appropriate level of discrimination. However, it is very likely that additional proteins encoded in the region between cI and stx are required for replication of the phage.

Stx phages have traditionally been classified into the group of lambdoid phages based on similarity in behavior, genetic structure, and regulatory system. In phage lambda and lambdoid Stx phages, the assembly of the replication complex has been studied in detail [46], but there is so far no knowledge about the proteins involved in the replication process of Eru phages. Eru7 seems to be the most widespread Eru type in Europe and, like Eru6 and Eru9, it encodes proteins containing Rha or Rha/Ash domains. Rha domain proteins are common among temperate bacteriophages and large eukaryotic DNA viruses and are suggested to function as regulatory proteins involved in controlling the switch between the lytic and lysogenic lifestyles [47]. Ash domain proteins are also common among bacteriophages, but little is known about their function [30, 31, 48]. However, none of these proteins have previously been associated with replication of Stx phages, and it is of great interest to examine this aspect, especially since Eru7 Stx phages seem to be among the most common Eru types.

In phage lambda and lambdoid Stx phages, the CI repressor regulates expression of the O and P replication genes [18]. The cI gene is also present in the genomes of Eru phages, suggesting that a similar regulatory mechanism is at play in non-lambdoid Stx phages. The genes located directly upstream of cI vary extensively between different Eru types, although most of them encode DNA binding proteins such as helicases, primases or other HTH motif proteins. When exploring the different Eru types, we observed that the amino acid sequence of the CI repressor differed substantially between Stx phages, but there were also homologies, which were used to group them into eight major clades (I-VIII). In phage lambda, CI represses expression of upstream genes by forming dimers that bind to specific operator sequences, and self-cleavage of CI relieves the repression [20, 25]. All CI proteins belonging to Clades I-VII exhibit the self-catalytic Peptidase_S24 domain and the lambda S150 autocleavage residue [21, 44, 49], which mediates the cleavage of CI, resulting in relief of repression of the promoters in the replication module.

However, CI sequences belonging to Clade VIII lack this domain, and it remains unexplored how this atypical CI protein is involved in regulating phage replication. Another atypical CI protein, lacking the HTH DNA-binding domain, was observed in Clade III, and the regulatory functions of this protein are also unknown. Considering the likelihood that CI is involved in regulation of upstream genes, the differences in amino acid sequence observed between CI repressors of different Eru types may reflect adaptation of binding specificities to match distinct target sequences. It is also likely that the differences observed between CI repressors influence their regulatory network, which, in turn, may influence the stability of the lysogenic state and the pathogenic potential of the host EHEC strain.

Stx phages are known to be highly mosaic and composed of gene segments with different evolutionary histories acquired through a variety of mechanisms, such as homologous recombination, transposition, and site-specific recombination [50,51,52]. The variation in CI protein sequences and Eru types, and the different combinations of these revealed in the present study, indicate that Stx phages change continuously and that their classification may be less restricted to specific serotypes than previously anticipated [15]. We have previously suggested that the Eru2 type may be restricted to serotype O157:H7 and is predominant among phages encoding the less potent subtype Stx2c [15]. However, we observed that among the 63 Eru2 phages detected in this study, fourteen were carried by E. coli of serotype O157:H7, while the remaining 49 phages (48 in Japanese EHEC strains (Additional file 1) and one in a Dutch EHEC strain (PRJNA285020, strain STEC 564; Additional file 2)) were carried by E. coli of serotype O121:H19. All Eru2 phages carried by O121:H19 strains encoded Stx2a, while all the O157:H7 strains carried Eru2 phages encoding Stx2c. We also observed that five of the six highly pathogenic strains of serotype O157:H7 that have caused large outbreaks in the UK carry Eru2 phages, and that four of these Eru2 phages encode Stx2c (Table 4). However, the UK outbreak strains also carry phages encoding the more potent Stx2a. All in all, this indicates that Eru2 phages are not restricted to hosts of serotype O157:H7, but that Eru2 phages carried by this serotype predominantly encode the Stx2c subtype.

Surprisingly, we did not observe any Eru3 type Stx phages among the European STEC strains examined during this study (Table 3). We have previously shown that Eru3 phages were carried by both serotype O157:H7 and O111 strains and often encode the potent subtype Stx2a [15]. A majority of the Eru3 type of Stx phages described in the previous work were isolated in the US, indicating that this phage type may be more widespread on the American continent than in Europe.

E. coli may carry multiple stx-negative prophages with similarities to Stx phages together with multiple Stx phages in its genome [53]. Therefore, identification of the Eru type requires that the stx genes and the phage replication region are present on the same contig or scaffold. Assessment of the Eru type from genome sequences generated by short-read sequencing technology is often impossible due to contig breaks in the region between cI and stx (ND in Additional file 2). Stx phages often carry repetitive tRNA-encoding genes immediately upstream of stx, which makes assembly of contigs difficult in this region.

In the present study, we observe that the Stx2a-encoding phages carried by highly virulent EHEC strains from the UK [39] and the HUS-causing strains from Norway [42] are of the Eru1, Eru2, Eru5 and Eru7 types. We have previously shown that the Eru1 type is carried by highly pathogenic EHEC strains and that Eru1 phages exhibit a less stable lysogenic state than the classical lambdoid Stx phages [15]. It is already well known that the outcome of EHEC disease is often more severe when the infection is caused by an E. coli strain producing Stx2 than by a strain producing Stx1 [7, 9]. We must, however, emphasize that the amount of toxin produced must also be taken into consideration. It is therefore of great importance to gain more knowledge about how the gene content of the replication region influences regulation of the phage life cycle and, consequently, the levels of Stx produced. More research is also needed to understand how different CI repressor types react to environmental stressors, such as the host immune system and antibiotic treatment, and how these factors affect Stx production. Importantly, this work highlights that our understanding of bacterial pathogens cannot be based solely on studies of a few model bacterial strains and/or phage types.


Some EHEC strains appear more virulent than others, and increased knowledge of how characteristics of the replication region influence the level of Stx produced is important for understanding the mechanisms behind their pathogenicity. The present study revealed ten novel Eru types encoding phage replication proteins as well as a broad variation in the amino acid sequences of the CI repressor proteins, which regulate the transcription of the genes encoding the replication proteins. This diversity has the potential to explain why certain EHEC strains are more pathogenic than others, and the study forms an important knowledge platform for further investigations of how characteristics of the Stx phage genome influence the virulence of the host EHEC strain.


A total of 120 Stx-converting phage genome sequences were retrieved from the NCBI virus database (taxid:10239) by Standard Nucleotide BLAST using the A subunit of stx1 (M19437.1) and stx2 (AF125520) as query sequences (August 2021) (Additional file 1).

In addition, ten different BioProjects comprising European STEC strains, one Dutch (PRJNA285020), one Norwegian (PRJEB6447), one French (PRJNA706995), three Swiss (PRJNA680568, PRJNA694525, PRJNA438214), one English (PRJNA248042), one Italian (PRJNA666781), one German (PRJNA715185) and one Portuguese (PRJNA643688), were examined for contigs containing stx using BLAST as described above (Additional file 2). The dataset contained more than 3000 STEC isolates. However, the majority of contigs were too short (< 8000 bp) to contain the cI and stx genes on the same contig, so only contigs larger than 8000 bp were examined. A total of 298 contigs containing the region between the CI-coding gene and the stx genes were identified in the dataset. The sequences were examined using pDRAW, and Eru types were defined by the proteins encoded by the two genes located directly upstream of cI. GenomeNet Motif Search (Kyoto University Bioinformatics Center) was used for detection of protein motifs [54]. Erus were numbered consecutively as they were detected.
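The contig selection step described above can be sketched as follows. This is a minimal illustration and not the workflow used in the study: the FASTA reader is a simplified stand-in for a proper parser, and the hits dictionary (contig name mapped to the set of genes detected on it, for example from BLAST results) is a hypothetical data structure introduced for the example.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def candidate_contigs(
    records: Iterable[Tuple[str, str]],
    hits: Dict[str, Set[str]],
    min_length: int = 8000,
    required: Tuple[str, ...] = ("cI", "stx"),
) -> List[str]:
    """Keep contigs longer than min_length that carry every required gene."""
    selected = []
    for name, sequence in records:
        if len(sequence) <= min_length:
            continue  # only contigs larger than 8000 bp were examined
        if set(required) <= hits.get(name, set()):
            selected.append(name)
    return selected
```

A contig that passes the length filter but carries only stx is rejected by the second test, which mirrors the requirement that cI and stx are located on the same contig.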

The 260 CI protein sequences (Additional file 3), mainly extracted from the abovementioned nucleotide sequences, were aligned using ClustalOmega [55]. A maximum likelihood tree was inferred from the alignment using IQ-TREE v1.6.12 [56]. Node support was evaluated using the option -bb for ultrafast bootstraps [57], and the VT + GT model was selected as the best evolutionary model using ModelFinder and the BIC criterion [58]. Interactive Tree Of Life (iTOL) v6.4 was used for visualization [59].
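For readers who wish to set up a comparable analysis, the alignment and tree inference steps might be invoked roughly as sketched below. The exact command lines used in this study are not given, so the file names, the number of bootstrap replicates (1000) and the use of -m MFP to trigger ModelFinder are assumptions; the helper functions only assemble the argument lists and do not run the programs.

```python
from typing import List


def clustalo_command(fasta_in: str, alignment_out: str) -> List[str]:
    """Argument list for aligning protein sequences with Clustal Omega."""
    return ["clustalo", "-i", fasta_in, "-o", alignment_out, "--outfmt=fasta"]


def iqtree_command(alignment: str, bootstraps: int = 1000) -> List[str]:
    """Argument list for IQ-TREE 1.6.x with model selection and ultrafast bootstrap.

    -m MFP asks ModelFinder to choose the substitution model before tree
    inference; -bb sets the number of ultrafast bootstrap replicates
    (the value 1000 is an assumption, not taken from the study).
    """
    return ["iqtree", "-s", alignment, "-m", "MFP", "-bb", str(bootstraps)]


if __name__ == "__main__":
    # Hypothetical file names; pass each list to subprocess.run(..., check=True)
    # on a machine where clustalo and iqtree are installed.
    print(" ".join(clustalo_command("ci_proteins.faa", "ci_proteins.aln")))
    print(" ".join(iqtree_command("ci_proteins.aln")))
```

Keeping the commands as lists rather than shell strings avoids quoting problems when file names contain spaces.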

Availability of data and materials

Data generated and analyzed throughout this study are included in this published article and in the additional information files. Stx-converting phage genome sequences were retrieved from the NCBI virus database (taxid:10239), and each accession number can be found in Additional file 1. Accession numbers of stx sequences from ten different BioProjects comprising European STEC strains are listed in Additional file 2. Accession numbers of protein sequences used for the phylogenetic analysis are listed in Additional file 3.



AAA: ATPases Associated with diverse cellular Activities

BLAST: Basic Local Alignment Search Tool

DNA: Deoxyribonucleic acid

EHEC: Enterohaemorrhagic E. coli

Eru: EHEC phage replication unit

HUS: Hemolytic uremic syndrome

NCBI: National Center for Biotechnology Information

STEC: Shiga toxin-producing E. coli

Stx: Shiga toxin


  1. Karch H, Tarr PI, Bielaszewska M. Enterohaemorrhagic Escherichia coli in human medicine. Int J Med Microbiol. 2005;295(6–7):405–18.


  2. Karmali MA. Emerging Public Health Challenges of Shiga Toxin-Producing Escherichia coli Related to Changes in the Pathogen, the Population, and the Environment. Clin Infect Dis. 2017;64(3):371–6.

  3. Nguyen Y, Sperandio V. Enterohemorrhagic E. coli (EHEC) pathogenesis. Front Cell Infect Microbiol. 2012;2:90.

  4. World Health Organization, Food and Agriculture Organization of the United Nations. Shiga toxin-producing Escherichia coli (STEC) and food: attribution, characterization, and monitoring: report. WHO; 2018.

  5. Bielaszewska M, Mellmann A, Zhang W, Köck R, Fruth A, Bauwens A, et al. Characterisation of the Escherichia coli strain associated with an outbreak of haemolytic uraemic syndrome in Germany, 2011: a microbiological study. Lancet Infect Dis. 2011;11(9):671–6.


  6. Rasko DA, Webster DR, Sahl JW, Bashir A, Boisen N, Scheutz F, et al. Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany. N Engl J Med. 2011;365(8):709–17.


  7. Scheutz F, Teel LD, Beutin L, Piérard D, Buvens G, Karch H, et al. Multicenter evaluation of a sequence-based protocol for subtyping Shiga toxins and standardizing Stx nomenclature. J Clin Microbiol. 2012;50(9):2951–63.


  8. Boerlin P, McEwen SA, Boerlin-Petzold F, Wilson JB, Johnson RP, Gyles CL. Associations between virulence factors of Shiga toxin-producing Escherichia coli and disease in humans. J Clin Microbiol. 1999;37(3):497–503.


  9. Fuller CA, Pellino CA, Flagler MJ, Strasser JE, Weiss AA. Shiga toxin subtypes display dramatic differences in potency. Infect Immun. 2011;79(3):1329–37.


  10. Allison HE. Stx-phages: drivers and mediators of the evolution of STEC and STEC-like pathogens. Future Microbiol. 2007;2(2):165–74.


  11. Zeng L, Skinner SO, Zong C, Sippy J, Feiss M, Golding I. Decision making at a subcellular level determines the outcome of bacteriophage infection. Cell. 2010;141(4):682–91.

  12. Balasubramanian S, Osburne MS, BrinJones H, Tai AK, Leong JM. Prophage induction, but not production of phage particles, is required for lethal disease in a microbiome-replete murine model of enterohemorrhagic E. coli infection. PLoS Pathog. 2019;15(1):e1007494.

  13. O’Brien AD, Marques LR, Kerry CF, Newland JW, Holmes RK. Shiga-like toxin converting phage of enterohemorrhagic Escherichia coli strain 933. Microb Pathog. 1989;6(5):381–90.

  14. Plunkett G 3rd, Rose DJ, Durfee TJ, Blattner FR. Sequence of Shiga toxin 2 phage 933W from Escherichia coli O157:H7: Shiga toxin as a phage late-gene product. J Bacteriol. 1999;181(6):1767–78.

  15. Llarena AK, Aspholm M, O’Sullivan K, Węgrzyn G, Lindbäck T. Replication Region Analysis Reveals Non-lambdoid Shiga Toxin Converting Bacteriophages. Front Microbiol. 2021;12:640945.

  16. Pinto G, Sampaio M, Dias O, Almeida C, Azeredo J, Oliveira H. Insights into the genome architecture and evolution of Shiga toxin encoding bacteriophages of Escherichia coli. BMC Genomics. 2021;22(1):366.

  17. Bednarz M, Halliday JA, Herman C, Golding I. Revisiting bistability in the lysis/lysogeny circuit of bacteriophage lambda. PLoS ONE. 2014;9(6):e100876.

  18. Casjens SR, Hendrix RW. Bacteriophage lambda: Early pioneer and still relevant. Virology. 2015;479–480:310–30.

  19. LeBowitz JH, Zylicz M, Georgopoulos C, McMacken R. Initiation of DNA replication on single-stranded DNA templates catalyzed by purified replication proteins of bacteriophage lambda and Escherichia coli. Proc Natl Acad Sci USA. 1985;82(12):3988–92.

  20. Stayrook S, Jaru-Ampornpan P, Ni J, Hochschild A, Lewis M. Crystal structure of the lambda repressor and a model for pairwise cooperative operator binding. Nature. 2008;452(7190):1022–5.

  21. Bell CE, Frescura P, Hochschild A, Lewis M. Crystal structure of the lambda repressor C-terminal domain provides a model for cooperative operator binding. Cell. 2000;101(7):801–11.

  22. Pabo CO, Sauer RT, Sturtevant JM, Ptashne M. The lambda repressor contains two domains. Proc Natl Acad Sci USA. 1979;76(4):1608–12.

  23. Nyíri K, Kőhegyi B, Micsonai A, Kardos J, Vertessy BG. Evidence-Based Structural Model of the Staphylococcal Repressor Protein: Separation of Functions into Different Domains. PLoS ONE. 2015;10(9):e0139086.

  24. Pabo CO, Lewis M. The operator-binding domain of lambda repressor: structure and DNA recognition. Nature. 1982;298(5873):443–7.

  25. Sauer RT, Ross MJ, Ptashne M. Cleavage of the lambda and P22 repressors by recA protein. J Biol Chem. 1982;257(8):4458–62.

  26. Tyler JS, Mills MJ, Friedman DI. The operator and early promoter region of the Shiga toxin type 2-encoding bacteriophage 933W and control of toxin expression. J Bacteriol. 2004;186(22):7670–9.

  27. Waldor MK, Friedman DI. Phage regulatory circuits and virulence gene expression. Curr Opin Microbiol. 2005;8(4):459–65.

  28. Colon MP, Chakraborty D, Pevzner Y, Koudelka GB. Mechanisms that Determine the Differential Stability of Stx+ and Stx(-) Lysogens. Toxins. 2016;8(4):96.

  29. Tóth I, Sváb D, Bálint B, Brown-Jaque M, Maróti G. Comparative analysis of the Shiga toxin converting bacteriophage first detected in Shigella sonnei. Infect Genet Evol. 2016;37:150–7.

  30. Henthorn KS, Friedman DI. Identification of related genes in phages phi 80 and P22 whose products are inhibitory for phage growth in Escherichia coli IHF mutants. J Bacteriol. 1995;177(11):3185–90.

  31. Iyer LM, Koonin EV, Aravind L. Extensive domain shuffling in transcription regulators of DNA viruses and implications for the origin of fungal APSES transcription factors. Genome Biol. 2002;3(3):Research0012.

  32. Núñez B, De La Cruz F. Two atypical mobilization proteins are involved in plasmid CloDF13 relaxation. Mol Microbiol. 2001;39(4):1088–99.

  33. Zou X, Caufield PW, Li Y, Qi F. Complete nucleotide sequence and characterization of pUA140, a cryptic plasmid from Streptococcus mutans. Plasmid. 2001;46(2):77–85.

  34. Lipps G, Weinzierl AO, von Scheven G, Buchen C, Cramer P. Structure of a bifunctional DNA primase-polymerase. Nat Struct Mol Biol. 2004;11(2):157–62.

  35. Goliger JA, Roberts JW. Bacteriophage 82 gene Q and Q protein. Sequence, overproduction, and activity as a transcription antiterminator in vitro. J Biol Chem. 1987;262(24):11721–5.

  36. Grayhack EJ, Roberts JW. The phage lambda Q gene product: activity of a transcription antiterminator in vitro. Cell. 1982;30(2):637–48.

  37. Scotland SM, Willshaw GA, Smith HR, Rowe B. Properties of strains of Escherichia coli belonging to serogroup O157 with special reference to production of Vero cytotoxins VT1 and VT2. Epidemiol Infect. 1987;99(3):613–24.

  38. Taylor CM, White RH, Winterborn MH, Rowe B. Haemolytic-uraemic syndrome: clinical experience of an outbreak in the West Midlands. Br Med J (Clin Res Ed). 1986;292(6534):1513–6.

  39. Yara DA, Greig DR, Gally DL, Dallman TJ, Jenkins C. Comparison of Shiga toxin-encoding bacteriophages in highly pathogenic strains of Shiga toxin-producing Escherichia coli O157:H7 in the UK. Microb Genom. 2020;6(3):e000334.

  40. Byrne L, Dallman TJ, Adams N, Mikhail AFW, McCarthy N, Jenkins C. Highly Pathogenic Clone of Shiga Toxin-Producing Escherichia coli O157:H7, England and Wales. Emerg Infect Dis. 2018;24(12):2303–8.

  41. Gobin M, Hawker J, Cleary P, Inns T, Gardiner D, Mikhail A, et al. National outbreak of Shiga toxin-producing Escherichia coli O157:H7 linked to mixed salad leaves, United Kingdom, 2016. Euro Surveill. 2018;23(18):17-00197.

  42. Haugum K, Johansen J, Gabrielsen C, Brandal LT, Bergh K, Ussery DW, et al. Comparative genomics to delineate pathogenic potential in non-O157 Shiga toxin-producing Escherichia coli (STEC) from patients with and without haemolytic uremic syndrome (HUS) in Norway. PLoS ONE. 2014;9(10):e111788.

  43. Durante-Rodríguez G, Mancheño JM, Díaz E, Carmona M. Refactoring the λ phage lytic/lysogenic decision with a synthetic regulator. Microbiologyopen. 2016;5(4):575–81.

  44. Kim B, Little JW. LexA and lambda CI repressors as enzymes: specific cleavage in an intermolecular reaction. Cell. 1993;73(6):1165–73.

  45. Mo CY, Birdwell LD, Kohli RM. Specificity determinants for autoproteolysis of LexA, a key regulator of bacterial SOS mutagenesis. Biochemistry. 2014;53(19):3158–68.

  46. Kozłowska K, Glinkowska M, Boss L, Gaffke L, Deptuła J, Węgrzyn G. Formation of Complexes Between O Proteins and Replication Origin Regions of Shiga Toxin-Converting Bacteriophages. Front Mol Biosci. 2020;7:207.

  47. Bochow S, Elliman J, Owens L. Bacteriophage adenine methyltransferase: a life cycle regulator? Modelled using Vibrio harveyi myovirus like. J Appl Microbiol. 2012;113(5):1001–13.

  48. Casjens S, Eppler K, Parr R, Poteete AR. Nucleotide sequence of the bacteriophage P22 gene 19 to 3 region: identification of a new gene required for lysis. Virology. 1989;171(2):588–98.

  49. Luo Y, Pfuetzner RA, Mosimann S, Paetzel M, Frey EA, Cherney M, et al. Crystal structure of LexA: a conformational switch for regulation of self-cleavage. Cell. 2001;106(5):585–94.

  50. Hatfull GF, Hendrix RW. Bacteriophages and their genomes. Curr Opin Virol. 2011;1(4):298–303.

  51. Johansen BK, Wasteson Y, Granum PE, Brynestad S. Mosaic structure of Shiga-toxin-2-encoding phages isolated from Escherichia coli O157:H7 indicates frequent gene exchange between lambdoid phage genomes. Microbiology (Reading). 2001;147(Pt 7):1929–36.

  52. Smith DL, Rooks DJ, Fogg PC, Darby AC, Thomson NR, McCarthy AJ, et al. Comparative genomics of Shiga toxin encoding bacteriophages. BMC Genomics. 2012;13:311.

  53. Zhang Y, Liao YT, Salvador A, Sun X, Wu VCH. Prediction, Diversity, and Genomic Analysis of Temperate Phages Induced From Shiga Toxin-Producing Escherichia coli Strains. Front Microbiol. 2019;10:3093.

  54. Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, et al. The Pfam protein families database. Nucleic Acids Res. 2002;30(1):276–80.

  55. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539.

  56. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.

  57. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol Biol Evol. 2018;35(2):518–22.

  58. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9.

  59. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293–6.


Acknowledgements

Not applicable.

Funding

This work is funded by the Norwegian University of Life Sciences.

Author information

Contributions

AF and TL designed the study. TL wrote the first draft of the manuscript. AF, MA, GW and TL contributed to data analysis and writing of the manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Toril Lindbäck.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be interpreted as a potential conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Stx-converting phage genomes with Eru type

Additional file 2.

BioProjects comprising European STEC strains with Eru type

Additional file 3.

Information about the 260 CI sequences used in phylogenetic analysis

Additional file 4.

Percent Identity Matrix (Clustal2.1) between clades of CI repressor sequences

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Fagerlund, A., Aspholm, M., Węgrzyn, G. et al. High diversity in the regulatory region of Shiga toxin encoding bacteriophages. BMC Genomics 23, 230 (2022).
