Expectation score vs. confidence in protein interactions of yeast and fly, and statistics of training sets and predictions in Plasmodium. (a) Assigning each protein interaction the domain interaction with the highest expectation value, we observe that the confidence in the underlying protein interaction correlates with the expectation value of the highest-scoring interacting domain pair. In particular, the mean domain expectation value E in each bin of confidence values (cv) of yeast protein interactions follows a statistically significant exponential trend ($E \sim e^{2.73 \times cv}$, Pearson's r = 0.28, $P < 10^{-5}$; Spearman's ρ = 0.31, $P < 10^{-5}$). We obtain qualitatively similar results for fly protein interactions (inset) ($E \sim e^{1.75 \times cv}$, r = 0.19, $P < 10^{-5}$; ρ = 0.16, $P < 10^{-5}$), allowing us to conclude that modeling a protein interaction by its highest-scoring domain interaction is a sufficient approximation for determining the presence and quality of the underlying protein interaction. Error bars correspond to standard deviations in each bin. (b) To evaluate predicted interactions in P. falciparum, we used a logistic regression model trained on carefully selected sets of true positive and true negative interactions. Binning confidence values, we show the frequencies of the predicted protein interactions and of the positive (good) and negative (bad) training sets.
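The binning and exponential fit described for panel (a) can be illustrated with a minimal sketch. The caption does not state how the fit was performed, so the approach here (equal-width bins, ordinary least squares on the logarithm of the per-bin mean) is an assumption, and the function name `binned_mean_and_exp_fit` and its parameters are hypothetical.

```python
import numpy as np

def binned_mean_and_exp_fit(cv, e, n_bins=10):
    """Bin confidence values, average E per bin, and fit E ~ a * exp(b * cv).

    Assumptions (not taken from the caption): equal-width bins over the
    observed range of cv, and a log-linear least-squares fit, which
    requires every per-bin mean of E to be strictly positive.
    """
    cv = np.asarray(cv, dtype=float)
    e = np.asarray(e, dtype=float)
    edges = np.linspace(cv.min(), cv.max(), n_bins + 1)
    # Assign each interaction to a bin; clip so cv.max() lands in the last bin.
    idx = np.clip(np.digitize(cv, edges) - 1, 0, n_bins - 1)

    centers, means, stds = [], [], []
    for k in range(n_bins):
        in_bin = idx == k
        if in_bin.any():  # skip empty bins
            centers.append(0.5 * (edges[k] + edges[k + 1]))
            means.append(e[in_bin].mean())
            stds.append(e[in_bin].std())  # error bar: standard deviation per bin
    centers, means, stds = map(np.array, (centers, means, stds))

    # The slope of log(mean E) against the bin center is the exponent b.
    b, log_a = np.polyfit(centers, np.log(means), 1)
    return centers, means, stds, b, np.exp(log_a)
```

Given arrays of per-interaction confidence values and highest domain expectation values, the returned `b` plays the role of the exponent reported in the caption, and `stds` gives the per-bin standard deviations used for the error bars.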