
Table 1 Summary of statistical data for SEED data sets¹

From: Safety in numbers: multiple occurrences of highly similar homologs among Azotobacter vinelandii carbohydrate metabolism proteins probably confer adaptive benefits

| Statistic | All synolog pairs: Median ± MAD | All synolog pairs: Min | All synolog pairs: Max | ≥90% protein sequence identity between synolog pairs: Median ± MAD | ≥90%: Min | ≥90%: Max |
|---|---|---|---|---|---|---|
| Number of carbohydrate metabolism synologs | 49.0 ± 35.0 | 0 | 394 | 0.0 ± 0.0 | 0 | 47 |
| Number of carbohydrate metabolism synolog groups | 20.0 ± 13.0 | 0 | 128 | 0.0 ± 0.0 | 0 | 16 |
| Average protein sequence identity between synolog pairs² [%] | 36.8 ± 4.2 | 13.4 | 100.0 | 97.3 ± 1.8 | 90.0 | 100.0 |
| Synolog fraction of carbohydrate metabolism genes [%] | 30.0 ± 9.7 | 0.0 | 85.7 | 0.0 ± 0.0 | 0.0 | 34.6 |

  1. Median, minimum and maximum values for the carbohydrate metabolism gene set extracted from 943 prokaryote genomes in the SEED database [19], both with no cutoff and with a cutoff of 90% protein sequence identity between synologs. Synologs are here defined as intra-genome sequences assigned to the same FIGfam (see text). The synolog fraction is the ratio of the total number of synologs to the total number of genes in a genome. MAD is the median absolute deviation. The median number of carbohydrate metabolism genes in the data set was 160.0 ± 74.0. The minimum and maximum numbers of carbohydrate metabolism genes observed among the included genomes were 4 and 585, respectively.
  2. Calculated from the genomes containing carbohydrate metabolism synologs at the given cutoff.
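
The "Median ± MAD" and synolog fraction entries can be illustrated with a minimal Python sketch. It assumes the unscaled MAD, that is, the median of the absolute deviations from the median; the source does not say whether a scaling constant was applied. The function names and the example counts are hypothetical and are not taken from the SEED data set.

```python
from statistics import median


def median_mad(values):
    """Return (median, MAD), with MAD = median(|x - median(values)|).

    Assumption: unscaled MAD; no normal-consistency constant is applied.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(x - med) for x in values)
    return med, mad


def synolog_fraction(n_synologs, n_genes):
    """Synologs as a percentage of genes; 0.0 when the gene set is empty."""
    return 100.0 * n_synologs / n_genes if n_genes else 0.0


# Hypothetical per-genome synolog counts (illustrative only).
counts = [0, 12, 49, 60, 394]
med, mad = median_mad(counts)
print(f"{med:.1f} ± {mad:.1f}")  # prints "49.0 ± 37.0"

# Hypothetical genome with 48 synologs among 160 genes.
print(f"{synolog_fraction(48, 160):.1f} %")  # prints "30.0 %"
```

Because the MAD is built from medians, a single genome with an unusually large synolog count (394 in the example) barely shifts it, which is one reason to prefer it over the standard deviation for skewed counts like these.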