Pan-genome tree of the 16 IMPACT isolates and 25 publicly available pathogenic and non-pathogenic isolates. Pathogenic isolates are labeled in red, commensals in black, and K12MG1655 in green. Leaves marked with triangles represent strains whose genomes were sequenced as part of this study (Table 1). Leaves marked with circles represent publicly available E. coli genomes downloaded from GenBank (Table 3). The pan-genome tree was created by hierarchical clustering of a Manhattan distance matrix computed from the gene presence/absence matrix. The scale below the tree indicates Manhattan distance.
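
The clustering step described in the caption can be illustrated with a minimal sketch. It assumes a binary presence/absence matrix with one row per genome and one column per gene family, and it uses SciPy rather than the software used in this study. The genome names and matrix values are hypothetical toy data, and average linkage is an assumption, since the caption does not state the linkage method.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist, squareform

# Hypothetical toy data: rows are genomes, columns are gene families
# (1 = gene family present, 0 = absent). Not the matrix from this study.
genomes = ["isolate_A", "isolate_B", "isolate_C", "isolate_D"]
presence = np.array([
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
])

# Manhattan (city-block) distance between two binary rows equals the
# number of gene families present in one genome but not in the other.
condensed = pdist(presence, metric="cityblock")
distances = squareform(condensed)  # full symmetric matrix, for inspection

# Agglomerative hierarchical clustering of the distance matrix.
# Average linkage is an assumption made for this sketch.
tree = linkage(condensed, method="average")

# Leaf order of the resulting tree, mapped back to genome names.
leaf_order = [genomes[i] for i in leaves_list(tree)]
print(distances)
print(leaf_order)
```

In this sketch the merge heights in the third column of `tree` are on the same Manhattan distance scale as the axis drawn below the pan-genome tree.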