Pan genome tree. The tree was constructed from the presence or absence of 16,373 HGCs in the 186 E. coli genomes. MLST types are annotated to the far right of each genome name. Phylotypes are colored blue (A), red (B1), purple (B2), and green (D); the Shigella genomes are colored brown. Bootstrap values are annotated at each node as percentages between 0 and 100. A black circle at a node indicates a bootstrap value of 100, a grey circle indicates a value between 70 and 100, and a red circle indicates a value below 70. The original tree with all bootstrap values can be seen in Additional file 3.
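The node-marker scheme in this caption can be written as a small lookup. The following is only a minimal sketch: the function name `bootstrap_marker_color` is hypothetical, and the caption does not say whether a value of exactly 70 is drawn grey or red, so treating 70 as grey is an assumption.

```python
def bootstrap_marker_color(bootstrap: float) -> str:
    """Return the node-marker color for a bootstrap percentage (0-100).

    Hypothetical helper mirroring the caption's scheme:
    black for 100, grey for 70 up to (but not including) 100, red below 70.
    """
    if not 0 <= bootstrap <= 100:
        raise ValueError("bootstrap must be a percentage between 0 and 100")
    if bootstrap == 100:
        return "black"
    if bootstrap >= 70:  # assumption: exactly 70 is counted as grey
        return "grey"
    return "red"


if __name__ == "__main__":
    # Print the marker color for a few example support values.
    for value in (100, 85, 70, 42):
        print(value, bootstrap_marker_color(value))
```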