Species-species discrimination. Each value is the area under the ROC curve (AUC) for a classifier trained to distinguish proteins from one pair of species (median of five replicates). With the exception of H. ducreyi vs. S. agalactiae and H. ducreyi vs. M. genitalium, all comparisons yield excellent classification performance. Proteins from different source organisms can therefore be distinguished with surprising accuracy based solely on amino acid sequence features. The unrooted tree to the left shows the phylogenetic relationships of the seven bacterial species, based on 16S rRNA analysis.
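
As a rough illustration of how a per-pair summary of this kind could be computed, the sketch below calculates an AUC from classifier scores and then takes the median over replicates for each species pair. The caption does not specify the classifier, the features, or the scoring code, so the function names (`auc_from_scores`, `median_auc_table`), the species labels, and the replicate values are all hypothetical and serve only as an example.

```python
from statistics import median


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive example outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def median_auc_table(replicate_aucs):
    """Collapse {(species_a, species_b): [auc, ...]} to one median AUC per pair."""
    return {pair: median(values) for pair, values in replicate_aucs.items()}


# Hypothetical replicate AUCs for illustration only, not values from the figure.
example = {
    ("species_A", "species_B"): [0.90, 0.95, 0.92, 0.97, 0.91],
    ("species_A", "species_C"): [0.60, 0.55, 0.65, 0.58, 0.62],
}
table = median_auc_table(example)
```

Using the median rather than the mean keeps a single unusually good or bad replicate from dominating the value reported for a pair.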