Comparative genomics and phylogenetics of rhs genes in E. coli/Shigella spp. a. Rhs loci are found at 11 unique positions in E. coli. These are marked along the K12_MG1655 genome (scale in base pairs, beginning at the thr operon). For each locus the following are noted: the existing gene name ('alias') where available, the clade to which it belongs (see Figure 2), the presence or absence of a contiguous vgrs gene, and its phylogenetic distribution across all strains (present: solid circle; pseudogene: crossed-through circle; relic: half circle; otherwise absent). Note that the two sequences labelled 'III' belong to Clade III, rather than Clade I. The maximum-likelihood (ML) phylogeny shown at left was estimated from concatenated MLST sequences (see Methods) and is labelled with bootstrap values. Coloured boxes denote the inferred origins of rhs loci. b. A phylogenetic network estimated from HKY distances using the Neighbour-Net algorithm. Sequence labels are shaded by locus, as in a. A key relates strain names to sequence codes. Clades are linked to their corresponding positions by arrows.