Fig. 6 | BMC Genomics

Fig. 6

From: Comparative pangenomics: analysis of 12 microbial pathogen pangenomes reveals conserved global structures of genetic and functional diversity

Mutation enrichment in protein domains from 76 genes present in all 12 species’ core genomes. a Workflow for computing the extent of mutation enrichment in a domain relative to the full protein from a set of coding variants. Briefly, the entropy at each position of the gene’s multiple sequence alignment is computed, and the mean entropy across the length of the domain is compared to that of all same-length subsequences of the protein to compute a domain entropy percentile. b Species-specific mutation enrichment for 443 gene-domain pairs, sorted by domain entropy percentile averaged across 12 species. Domains with statistically significant multispecies enrichment or depletion are boxed (bootstrap test, FDR < 0.05, Benjamini-Hochberg correction). E. faecium is not shown because of low variability attributable to initial subtype imbalances in the genome set. c Species-specific mutation enrichment for gene-domain pairs with significant multispecies mutation enrichment. Domains related to aminoacyl-tRNA synthetases are labeled in purple. White cells correspond to domains that could not be annotated within the species’ consensus sequence for the parent protein.
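
The workflow in panel a can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Shannon entropy in bits per alignment column, treats a gap character as an ordinary symbol, and defines the percentile as the share of same-length windows whose mean entropy is less than or equal to that of the domain. The function names `column_entropy` and `domain_entropy_percentile` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_entropy_percentile(alignment, start, end):
    """Percentile of the domain's mean entropy among all same-length windows.

    alignment: list of equal-length aligned sequences (strings).
    start, end: 0-based, half-open domain coordinates in alignment columns.
    Assumption: percentile = % of windows with mean entropy <= the domain's.
    """
    length = len(alignment[0])
    # Per-position entropy across all aligned sequences.
    entropies = [
        column_entropy([seq[i] for seq in alignment]) for i in range(length)
    ]
    width = end - start
    # Mean entropy of every window with the same length as the domain.
    window_means = [
        sum(entropies[i:i + width]) / width
        for i in range(length - width + 1)
    ]
    domain_mean = window_means[start]
    at_or_below = sum(1 for m in window_means if m <= domain_mean)
    return 100.0 * at_or_below / len(window_means)


# Toy alignment: only the last two columns vary.
toy_alignment = ["AAAAAA", "AAAACC", "AAAAGG", "AAAATT"]
print(domain_entropy_percentile(toy_alignment, 4, 6))  # 100.0
```

Under these assumptions, a high percentile marks a domain that is more variable than most same-length stretches of its protein (mutation enrichment), and a low percentile marks a relatively conserved domain (depletion).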
