
Table 1 General and BUSCO benchmark statistics for homology grouping performed under settings D1 to D8

From: The Pectobacterium pangenome, with a focus on Pectobacterium brasiliense, shows a robust core and extensive exchange of genes from a shared gene pool

| Clustering setting | Minimum sequence similarity | Homology groups | Single copy groups | Correct groups a | True Positives | False Positives | False Negatives | Recall | Precision | F-score |
|---|---|---|---|---|---|---|---|---|---|---|
| D1 | 95% | 49,290 | 812 | 395 | 128,085 | 14 | 3905 | 0.9704 | 0.9999 | 0.9849 |
| D2 | 85% | 28,896 | 1615 | 629 | 131,795 | 24 | 195 | 0.9985 | 0.9998 | 0.9992 |
| D3 | 75% | 24,650 | 1690 | 638 | 131,952 | 35 | 38 | 0.9997 | 0.9997 | 0.9997 |
| D4 | 65% | 22,347 | 1699 | 640 | 131,975 | 38 | 15 | 0.9999 | 0.9997 | 0.9998 |
| D5 | 55% | 20,636 | 1683 | 639 | 131,975 | 44 | 15 | 0.9999 | 0.9997 | 0.9998 |
| D6 | 45% | 19,234 | 1653 | 633 | 131,985 | 245 | 5 | 1.0000 | 0.9981 | 0.9991 |
| D7 | 35% | 17,908 | 1612 | 623 | 131,985 | 508 | 5 | 1.0000 | 0.9962 | 0.9981 |
| D8 | 25% | 16,486 | 1486 | 607 | 131,986 | 7002 | 4 | 1.0000 | 0.9496 | 0.9741 |
a Correct groups are defined as the number of groups that each correctly organize one of the 670 ‘complete’ and ‘non-duplicated’ Enterobacteriaceae BUSCO genes. The calculations of recall, precision, and F-score are explained in Methods.
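
The three score columns can be recomputed from the true positive, false positive, and false negative counts in each row. The following is a minimal sketch that assumes the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), F-score as the harmonic mean of the two). The exact procedure is the one described in the article's Methods, and the function name `benchmark_scores` is hypothetical, chosen here only for illustration.

```python
def benchmark_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (recall, precision, F-score) from raw counts.

    Assumes the standard definitions; the article's Methods section
    is the authoritative description of how the table was computed.
    """
    recall = tp / (tp + fn)        # share of expected pairings that were recovered
    precision = tp / (tp + fp)     # share of reported pairings that are correct
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return recall, precision, f_score


# Counts taken from row D1 of the table (95% minimum sequence similarity).
recall, precision, f_score = benchmark_scores(tp=128_085, fp=14, fn=3_905)
print(f"{recall:.4f} {precision:.4f} {f_score:.4f}")  # prints "0.9704 0.9999 0.9849"
```

Under these assumed definitions, the D1 counts reproduce the values reported for that row to four decimal places.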