Table 2 Criteria used for gene enrichment analyses

From: Comparisons of infant Escherichia coli isolates link genomic profiles with adaptation to the ecological niche

| Sorting criteria | Focal group (no. of strains) | Gene presence in focal group | Gene absence in non-focal group |
|---|---|---|---|
| Criteria I, cladistic comparison | Clade1 (8) | ≥7 | ≥7 |
| | Clade2 (8) | ≥7 | ≥7 |
| Criteria II, pathogen/commensal comparison | Pathogen (4) | ≥3 | ≥9 |
| | Commensal (12) | ≥9 | ≥3 |
| Criteria III, growth rate comparison | Fast (2) | 2 | ≥6 |
| | Medium (4) | ≥3 | ≥4 |
| | Slow (4) | ≥3 | ≥4 |
| Criteria IV, colonization time comparison | Early (6) | ≥5 | ≥4 |
| | Late (6) | ≥4 | ≥5 |
| Criteria V, pathogen/commensal comparison | Pathogen (23) | ≥17 | ≥13 |
| | Commensal (17) | ≥13 | ≥17 |
| Criteria VI, pathogen/commensal comparison | Pathogen (5) | ≥4 | ≥13 |
| | Commensal (17) | ≥13 | ≥4 |

  1. Criteria I: Criteria used for discriminating cladistic gene content enrichments. Each of the two clades contained 8 strains and enrichment required a gene to be present in at least 7 strains of one clade (focal group) while being absent from at least 7 strains in the other clade (non-focal group).
  2. Criteria II: Criteria used for discriminating pathogen vs. commensal gene content enrichments. Since the two groups are of unequal size, a pathogen-enriched gene had to be present in at least 3 of 4 pathogenic strains and absent from at least 9 of 12 commensal strains. A commensal-enriched gene had to be present in at least 9 of 12 commensal strains and absent from at least 3 of 4 pathogenic strains.
  3. Criteria III: Criteria used for discriminating growth rate related gene content enrichments. The three growth rate categories (fast, medium and slow) contained 2, 4, and 4 strains, respectively. For a gene to be considered enriched in the fast category, it had to be present in both fast strains and absent from at least 6 of the 8 combined slow and medium strains. For a gene to be considered enriched in the medium category, it had to be present in at least 3 of 4 medium strains and absent from at least 4 of the 6 combined slow and fast strains. For a gene to be considered enriched in the slow category, it had to be present in at least 3 of 4 slow strains and absent from at least 4 of the 6 combined medium and fast strains.
  4. Criteria IV: Criteria used for discriminating early vs. late colonizer gene content enrichments. The two groups contain 6 strains each. Since one of the strains in the early group was also isolated in the late group (EDM123c), an asymmetrical enrichment profile was designed, which required an early-enriched gene to be present in at least 5 of 6 early strains and absent from at least 4 of 6 late strains. A gene enriched in the late colonizer group had to be present in at least 4 of 6 late strains and absent from at least 5 of 6 early strains.
  5. Criteria V: Criteria used for discriminating pathogen vs. commensal gene content enrichments including 24 additional published E. coli genomes (Table 3). Since the two groups are of unequal size, a pathogen-enriched gene had to be present in at least 17 of 23 pathogenic strains and absent from at least 13 of 17 commensal strains. A commensal-enriched gene had to be present in at least 13 of 17 commensal strains and absent from at least 17 of 23 pathogenic strains.
  6. Criteria VI: Criteria used for discriminating pathogen vs. commensal gene content enrichments including one additional published enteropathogenic E. coli (EPEC) genome and 5 additional published commensal isolate genomes (Table 3). Since the two groups are of unequal size, a pathogen-enriched gene had to be present in at least 4 of 5 pathogenic strains and absent from at least 13 of 17 commensal strains. A commensal-enriched gene had to be present in at least 13 of 17 commensal strains and absent from at least 4 of 5 pathogenic strains.
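
Each criterion above reduces to the same test: count the focal strains that carry a gene and the non-focal strains that lack it, then compare both counts with a pair of thresholds. The following is a minimal Python sketch of that test, not the authors' implementation. The function name `is_enriched`, the strain labels (`P1`, `C1`, and so on) and the two example genes are hypothetical; only the group sizes and thresholds (those of Criteria II) come from the table.

```python
from typing import Iterable, Set


def is_enriched(
    carriers: Set[str],
    focal: Iterable[str],
    non_focal: Iterable[str],
    min_present: int,
    min_absent: int,
) -> bool:
    """Return True if a gene passes a presence/absence threshold pair.

    carriers: strains in which the gene was detected (hypothetical input format).
    """
    focal_set = set(focal)
    non_focal_set = set(non_focal)
    # Focal strains that carry the gene.
    present_in_focal = len(carriers & focal_set)
    # Non-focal strains that lack the gene.
    absent_in_non_focal = len(non_focal_set - carriers)
    return present_in_focal >= min_present and absent_in_non_focal >= min_absent


# Hypothetical strain labels sized like Criteria II (4 pathogens, 12 commensals).
pathogens = [f"P{i}" for i in range(1, 5)]
commensals = [f"C{i}" for i in range(1, 13)]

# Hypothetical gene carried by 3 pathogens and 2 commensals.
gene_a = {"P1", "P2", "P3", "C1", "C2"}
# Hypothetical gene carried by all 4 pathogens but also by 4 commensals.
gene_b = {"P1", "P2", "P3", "P4", "C1", "C2", "C3", "C4"}

# Criteria II, pathogen as focal group: present in >=3 of 4, absent from >=9 of 12.
gene_a_pathogen_enriched = is_enriched(gene_a, pathogens, commensals, 3, 9)
gene_b_pathogen_enriched = is_enriched(gene_b, pathogens, commensals, 3, 9)
print(gene_a_pathogen_enriched, gene_b_pathogen_enriched)
```

In this sketch `gene_a` passes (present in 3 pathogens, absent from 10 commensals), while `gene_b` fails because it is absent from only 8 of the 12 commensals. Swapping the focal and non-focal arguments and using the thresholds from the matching table row gives the commensal-enriched test.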