Estimation of the false discovery rate. A) Validation of NCI39-derived gene:drug correlate pairs in the NCI7 cell lines. The percentage of pairs that validated increased with the correlation coefficient of the nominated gene:drug relationship in the NCI39 training set, and very few candidates had significant but oppositely signed correlations in the NCI7 data. When the percentage of validated correlates is adjusted for the statistical power of the correlation test, assuming a single-sided p value of 0.5 is considered significant with a sample size of six, it becomes apparent that a majority of the gene:drug correlate pairs with correlations greater than 0.7 would have validated in a replication of the original experiment. B) Comparison of q values and p values for the gene:drug correlates. To estimate the q value in these very large data sets, random subsets of the gene expression and GI50 data were iteratively compared and the distribution of p values was measured for ~150,000 correlations in total. A q value of 0.05 (i.e., for every 100 significant correlates, five false correlates are expected) is associated with a p value of approximately 1 × 10⁻⁵.
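The mapping between a q value cutoff and a p value threshold in panel B can be illustrated with a minimal Python sketch. It uses the Benjamini-Hochberg adjustment as a stand-in, which is an assumption and not the random-subset procedure described above. The function names `bh_q_values` and `p_threshold_for_q` and the example p values are hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values, one common estimate of q values.

    Assumption: this stands in for the resampling-based estimate in the
    caption; it is not the authors' procedure.
    """
    m = len(p_values)
    # Indices of the p values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that q never increases
    # as p decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def p_threshold_for_q(p_values, q_cutoff=0.05):
    """Largest observed p value whose q value is at or below q_cutoff."""
    q = bh_q_values(p_values)
    passing = [p for p, qv in zip(p_values, q) if qv <= q_cutoff]
    return max(passing) if passing else None


if __name__ == "__main__":
    example_p = [0.01, 0.04, 0.03, 0.005]  # made-up values for illustration
    print(bh_q_values(example_p))
    print(p_threshold_for_q(example_p, q_cutoff=0.03))
```

With many tests, the p value threshold that corresponds to a fixed q value cutoff is typically far below the cutoff itself, consistent with the small p value reported for q = 0.05 in panel B.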