Table 1 Summary of statistical data for SEED data sets¹

From: Safety in numbers: multiple occurrences of highly similar homologs among Azotobacter vinelandii carbohydrate metabolism proteins probably confer adaptive benefits

| | All synolog pairs: Median ± MAD | All synolog pairs: Min | All synolog pairs: Max | ≥90% protein sequence identity between synolog pairs: Median ± MAD | ≥90% identity: Min | ≥90% identity: Max |
|---|---|---|---|---|---|---|
| Number of carbohydrate metabolism synologs | 49.0 ± 35.0 | 0 | 394 | 0.0 ± 0.0 | 0 | 47 |
| Number of carbohydrate metabolism synolog groups | 20.0 ± 13.0 | 0 | 128 | 0.0 ± 0.0 | 0 | 16 |
| Average protein sequence identity between synolog pairs² [%] | 36.8 ± 4.2 | 13.4 | 100.0 | 97.3 ± 1.8 | 90.0 | 100.0 |
| Synolog fraction of carbohydrate metabolism genes [%] | 30.0 ± 9.7 | 0.0 | 85.7 | 0.0 ± 0.0 | 0.0 | 34.6 |
  1. Median, minimum and maximum values for the carbohydrate metabolism gene set extracted from 943 prokaryote genomes in the SEED database [19], with no set cutoff and with a cutoff set at 90% protein sequence identity between synologs. Synologs are here defined as intra-genome sequences assigned to the same FIGfam (see text). The synolog fraction describes the ratio of the total number of synologs to the total number of genes in a genome. MAD is median absolute deviation. The median number of carbohydrate metabolism genes in the data set was 160.0 ± 74.0. The minimum and maximum numbers of carbohydrate metabolism genes observed among the included genomes were 4 and 585 genes, respectively.
  2. Calculated from the genomes containing carbohydrate metabolism synologs at the given cutoff.
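
The summary statistics named in the footnotes (median, median absolute deviation, and synolog fraction) can be illustrated with a minimal Python sketch. The per-genome counts below are hypothetical values chosen for illustration, not data from the SEED set, and the function names are assumptions. The sketch also assumes the unscaled MAD (no normal-consistency factor), since the table does not state which variant was used.

```python
from statistics import median


def median_abs_deviation(values):
    """Unscaled MAD: the median of absolute deviations from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def synolog_fraction(n_synologs, n_genes):
    """Synolog fraction in percent: synologs relative to all genes counted."""
    return 100.0 * n_synologs / n_genes


# Hypothetical synolog counts for five genomes (illustration only).
synolog_counts = [0, 12, 49, 60, 394]

summary = {
    "median": median(synolog_counts),
    "mad": median_abs_deviation(synolog_counts),
    "min": min(synolog_counts),
    "max": max(synolog_counts),
}
print(summary)

# Hypothetical genome with 48 synologs among 160 genes.
print(synolog_fraction(48, 160))
```

The median and MAD are reported here rather than the mean and standard deviation because they are less sensitive to outlying genomes, such as the single large count in the hypothetical list.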