The cellular response to drug perturbation is limited: comparison of large-scale chemogenomic fitness signatures

Background: Chemogenomic profiling is a powerful approach for understanding the genome-wide cellular response to small molecules. First developed in Saccharomyces cerevisiae, chemogenomic screens provide direct, unbiased identification of drug target candidates as well as genes required for drug resistance. While many laboratories have performed chemogenomic fitness assays, few have been assessed for reproducibility and accuracy. Here we analyze the two largest independent yeast chemogenomic datasets, comprising over 35 million gene-drug interactions and more than 6000 unique chemogenomic profiles: the first from our own academic laboratory (HIPLAB) and the second from the Novartis Institute for Biomedical Research (NIBR).

Results: Despite substantial differences in experimental and analytical pipelines, the combined datasets revealed robust chemogenomic response signatures, characterized by gene signatures, enrichment for biological processes and mechanisms of drug action. We previously reported that the cellular response to small molecules is limited and can be described by a network of 45 chemogenomic signatures. In the present study, we show that the majority of these signatures (66%) are also found in the companion dataset, providing further support for their biological relevance as conserved, systems-level small molecule response systems.

Conclusions: Our results demonstrate the robustness of chemogenomic fitness profiling in yeast, while offering guidelines for performing other high-dimensional comparisons, including parallel CRISPR screens in mammalian cells.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08395-x.


Background
A major, persistent challenge in drug discovery is validation of the molecular targets and the target pathways that can be modulated by bioactive small molecules in cellular assays. This is especially true for target-based approaches where drug candidates are selected based on high-throughput biochemical screens, because their behavior when tested in cells can be unpredictable. Drugs that fail in the clinic often do so because of incomplete characterization of their effects in vivo. Perhaps as a consequence, phenotypic, cell-based screens have seen renewed interest. Yet, despite advances in the complexity and sophistication of phenotypic screens, the unambiguous assessment of a drug's primary, secondary and tertiary effects in vivo remains a significant challenge. Successful implementation of chemogenomic assays and a computational framework with which to analyze them would help bridge the gap between bioactive compound discovery and drug target validation.
Chemogenomics integrates drug discovery and target identification through the detection and analysis of chemical-genetic interactions. Despite the increase in such studies (including an increasing number that are performed at single-cell resolution), most chemogenomic methods currently rely on correlation to infer drug-target interactions; i.e., few directly identify drug-target chemical-genetic interactions [1]. For example, genome-wide differential expression analysis (aka transcriptomics) is one strategy used to probe Mechanism of Action (MoA). In these studies, gene expression changes induced by chemical perturbation are typically compared to a compendium of profiles (derived from genetic perturbations and compounds of known mechanisms) to uncover "guilt-by-association", e.g., https://lincsproject.org/LINCS/. In practice, the profile with the best "match" is then used to infer the drug target through the assumption that the expression profile of a chemically induced knockdown of the drug target will mimic a genetic mutation of the drug target or cells treated with a compound of the same MoA. Such approaches, while having been greatly expanded in the past decade, still depend on the composition and quality of their reference database and are therefore prone to systematic bias and lab-to-lab variations. Further complicating differential expression approaches is the fact that a genetic knockdown or knockout often lacks a discrete phenotype but nevertheless results in the differential expression of hundreds or thousands of transcripts. In contrast, drug perturbation of the proteins encoded by a locus (or loci) of interest is consequential. By way of example, nocodazole treatment will depolymerize microtubules composed of multiple tubulin isoforms and thereby result in a phenotype, whereas a genetic perturbation of a single isoform may have no effect.
Finally, despite the impressive scale and scope of expression consortia such as LINCS, such approaches are challenging to compare across experimental platforms or between laboratories.
Encouragingly, assays that directly identify drug-target interactions such as gene-editing assays (including pathway-wide and genome-wide loss-of-function CRISPR-Cas9 screens) are being industrialized. In these assays, robustness and quality control issues are being addressed directly [1]. However, with a few notable exceptions [2], standardized protocols are still lacking, and laboratory-to-laboratory reproducibility remains challenging [3,4].
Chemogenomic profiling, first developed in yeast, identifies novel therapeutic targets in a cellular context and can therefore provide strong evidence for direct drug-target engagement [5,6]. These functional genetic screens provide mechanistic insight because they report all chemical-genetic interactions that are required for drug resistance. In yeast, the HaploInsufficiency Profiling and HOmozygous Profiling (HIPHOP) platform [5,6] employs the barcoded heterozygous and homozygous yeast knockout collections. HIP exploits drug-induced haploinsufficiency, a phenotype where strain-specific sensitivity (decreased growth rate) is observed in a heterozygous strain deleted for one copy of an essential gene upon exposure to a drug targeting the product of this gene. In HIP, the 20 bp molecular identifiers unique to each strain allow the ~1100 essential heterozygous deletion strains to be grown competitively in a single pool and fitness to be quantified by barcode sequencing. The resulting fitness defect (FD) scores report the relative abundance, and therefore the drug sensitivity, of each strain. Those heterozygous strains (deleted for essential genes) with the greatest FD scores identify the most likely drug target candidates. Similarly, the complementary HOP assay interrogates ~4800 non-essential homozygous deletion strains and identifies genes involved in the drug target biological pathway and those required for drug resistance. The combined HIPHOP chemogenomic profile, reporting drug-target candidates in the HIP assay and genes required for resistance in the HOP assay, provides a comprehensive genome-wide view of the cellular response to a specific compound [5,6].
Here we present a comparative analysis of the two largest independent yeast chemogenomic HIPHOP datasets published to date [6,7]. Specifically, we compare a dataset generated in our lab (aka HIPLAB) [6] to a dataset generated by a group at the Novartis Institute for Biomedical Research (NIBR) [7]. The datasets are distinct; they were obtained from two independent platforms, using different experimental designs and distinct analytic pipelines (Table 1). The primary aims of this study were 1) to assess the data concordance at different levels of analysis, 2) to assess their reproducibility and 3) to analyze the NIBR and HIPLAB datasets in parallel so that both datasets might be more broadly used by the research community. A secondary aim was to identify any biological themes in the combined data that were not obvious from either of the individual datasets.
Our comparison shows excellent agreement between chemogenomic profiles for established compounds and correlations between entirely novel compounds.
Our analysis revealed global properties common to both datasets, including specific drug targets, correlation between chemical profiles with similar mechanism and cofitness between genes with similar biological function. Unique features of each dataset were also uncovered. In our previous report, we identified 45 major cellular response signatures [6]. We also hypothesized that these 45 signatures were comprehensive because, in our simulations, we found that 80% of these clusters would have been identified after screening < 30% of the ~3200 compounds. In the new independent analysis presented here, we found that the majority of these signatures (66.7%) are also present in the NIBR dataset, an observation that supports the fundamental biological relevance of these 45 core drug responses. In addition, by combining the two datasets we were able to: 1) identify robust chemogenomic responses, both common and research site-specific, the majority (81%) enriched for GO biological processes and associated with gene signatures; 2) infer chemical diversity/structure; and 3) gauge screen-to-screen reproducibility within replicates and between compounds with similar MoA. We present the data on a website that provides a resource for the discovery of functional interactions between genes, compounds and biological processes (Comparative chemogenomics).

Results And Discussion
Overview of NIBR and HIPLAB screens
Because all our comparisons are based on the ability to compare both datasets, we describe each dataset in detail. The data processing strategies of the raw data were fundamentally different between the two research sites (Table 1). In the HIPLAB dataset, the raw data was normalized separately for the strain-specific uptags and downtags, independently for the heterozygous and homozygous strains, creating 4 sets of results: uptag/het, uptag/hom, downtag/het, downtag/hom. For each set, logged raw average intensities were normalized across all arrays using a variation of median polish that incorporates batch effect correction [6]. Because the performance of the two tags in each strain can vary significantly, a 'best tag' was identified for each strain, defined as the tag with the lowest robust coefficient of variation across all of the control microarrays. For each array, tags were removed if they did not pass the computed compound and control background thresholds, calculated from the median + 5 MADs of the raw signal from the unnormalized intensity values of the used (corresponding to strain tags) and unused (control) features on the array across all arrays. In contrast, in the NIBR dataset, arrays were normalized by "study id" (a set of ~40 compounds) but were not corrected for batch effects. Rather, tags that performed poorly, based on their correlation values of uptags and downtags across different intensity ranges in the control arrays, were removed and the remaining tags were averaged to obtain strain intensity values.
In the HIPLAB dataset, relative strain abundance was quantified for each strain as the log2 of the median signal in the control condition divided by the signal from the compound treatment. The final fitness defect (FD) score is expressed as a robust z-score, where the median of the log2 ratios for all strains in a given screen is subtracted from the log2 ratio of a specific strain and the result is divided by the MAD of the log2 ratios for all strains in that screen. In the NIBR dataset, the inverse of the HIPLAB log2 ratio was used, with three differences: 1) average intensities of controls were used (instead of median signals); 2) because NIBR used replicates for each compound, the average of the signals of the compound samples was used instead of a single value (Table 1); and 3) the final gene-wise z-score normalizes for the median and standard deviation of each strain across all experiments using quantile estimates (see Methods).
Both laboratories constructed pools of heterozygous and homozygous strains in a similar manner and collected samples robotically for both the HIP and HOP assays as previously described [8]. For NIBR experiments, samples were collected at fixed time points (which served as a proxy for the number of cell doublings), whereas in the HIPLAB experiments cells were collected based on actual doubling time. Notably, in the NIBR pools, ~300 fewer homozygous deletion strains were detectable compared to the HIPLAB pools. These strains correlate with strains known to be slow-growers in the absence of drug [9], and their absence is likely due to the fact that the pool was allowed to grow overnight (~16 h) in the NIBR assays, during which slow-growing strains drop out before the start of the experiment.
Another difference between protocols was that NIBR screened all heterozygous strains, deleted for both essential and nonessential genes, while the HIPLAB screened only the essential heterozygotes. We decided against screening nonessential heterozygotes based on the following logic: because the concept of the HIP assay relies on a fitness defect resulting from gene dosage being decreased from two copies to one in a heterozygous diploid deletion strain, it follows that such fitness defects should not be observed if that gene is not required for growth, as is the case for nonessential genes [10]. Indeed, we find in practice that the HIP profiles of the nonessential heterozygotes do not correlate with HOP profiles for the same drug, nor are these nonessential heterozygote profiles biologically informative. This is illustrated by the profiles for DNA damaging agents. For example, in HOP screens of nonessential deletion strains, RAD genes have high FD scores in the presence of a DNA damaging agent (mechlorethamine), but none of these strains were sensitive as heterozygotes (Figure S1). The exception to this is the small number of nonessential heterozygous strains that exhibit severe fitness defects as homozygotes. As these strains exhibit 'nearly essential' phenotypes as homozygotes, they would be expected to exhibit drug-induced haploinsufficiency as heterozygotes and therefore should be included in the HIP assay.
We next compared the depth and breadth of each screening dataset. The NIBR screening library included 1641 proprietary compounds and 135 reference compounds with known mechanisms of action. In total there were 2956 HIP and 2923 HOP experiments, for 1776 discrete chemical structures.
However, because NIBR HIP screens included heterozygous strains deleted for both essential and nonessential genes (as mentioned above), when we combined the HIP and HOP NIBR datasets for shared compounds, 2725 full HIPHOP screens spanning 1771 distinct compounds remained. Moreover, ~56% of the NIBR screening library (representing 596 compounds) could practically be considered replicate screens because they exhibit correlations on par with true replicates, even though they were screened at different concentrations. For example, we observe such "practical replicates" when a particular compound is screened at a different concentration, yet the level of inhibition is comparable. Supporting this observation, the majority of such "replicates" clustered together (~65%; those with more than one replicate are included if at least one pair is clustered together). Given these experimental caveats, the informative datapoints were reduced from ~30 million to ~15 million due to the nonessential heterozygotes, and to ~9 million unique datapoints if the replicates were excluded.
The HIPLAB screening library comprised 3356 screens and 3250 unique compounds selected from a set of > 50,000 maximally diverse small molecules (~20 million data points) with unknown mechanisms and ~characterized drugs or chemical probes.
The structural diversity of the screening libraries reflects the scale of a large screening effort. While NIBR did not provide the compound structures of their libraries, they reported that 50% of the pairwise comparisons between compounds had Tanimoto coefficients less than 0.1 [7]. In comparison, the HIPLAB compounds were of lower diversity; ~43% of the pairwise comparisons had Tanimoto coefficients less than 0.1. Because the NIBR structures were not provided (with the exception of 135 reference compounds and 15 novel inhibitors), however, this claim is not verifiable [7].

Coinhibition between chemogenomic profiles
To compare the HIPLAB and NIBR screens, we first compared ~150 chemogenomic profiles representing ~50 reference compounds with known MoA that were screened by both NIBR and HIPLAB (Table 2). For many of these compounds, the drug target is well-established in yeast. Chemogenomic profiles were compared individually using 'coinhibition' values, where coinhibition is defined as the degree of similarity between two chemogenomic profiles, i.e., the FD scores across all genes in each screen, using Pearson correlation as a metric. The HOP profiles for mechlorethamine, a DNA damaging agent, identified a similar set of DNA repair genes including RAD1, RAD2, RAD4, RAD5, RAD10, RAD14, RAD18, REV7, REV3, SRS2 and PSO2 (Figure 1A). Likewise, a pairwise comparison of four nocodazole chemogenomic profiles showed correlations of 0.48 and greater across the entire set of deletion strains (Figure 1B). These correlation values increased when comparing only those individual genes exhibiting significant FD scores in the NIBR and HIPLAB HIP nocodazole profiles (Figure 1C). In this case, the HIP genes identified are enriched for genes required for tubulin folding (CCT genes). Finally, based on a correlation value of > 0.5 with the nocodazole profiles, we highlight the HIP profiles of two novel compounds, NIBR 2667 and HIPLAB 5790901, both identifying a nearly identical set of genes (Figure 1D). It should be noted that because the screens were performed at different concentrations, a linear correlation of one is not expected.
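As an illustration of the coinhibition metric defined above, the following is a minimal Python sketch, not the pipeline used in this study. It assumes each profile is available as a mapping from strain name to FD score, and it restricts the comparison to strains present in both profiles; that restriction is our assumption for the sketch, motivated by the fact that the two pools do not contain identical strain sets. The function name `coinhibition` and the toy FD values are hypothetical.

```python
import numpy as np

def coinhibition(profile_a, profile_b):
    """Pearson correlation between two chemogenomic profiles.

    Each profile is a dict mapping strain name -> FD score. Only strains
    present in both profiles are compared (an assumption of this sketch).
    """
    shared = sorted(set(profile_a) & set(profile_b))
    if len(shared) < 3:
        raise ValueError("need at least three shared strains")
    a = np.array([profile_a[s] for s in shared], dtype=float)
    b = np.array([profile_b[s] for s in shared], dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal.
    return float(np.corrcoef(a, b)[0, 1])

# Toy FD scores (illustrative values only, not taken from either dataset).
hiplab_screen = {"RAD1": 5.0, "RAD5": 4.0, "TUB1": 0.0, "ERG11": 1.0}
nibr_screen = {"RAD1": 10.0, "RAD5": 8.0, "TUB1": 0.0, "ERG11": 2.0, "PSO2": 7.0}
r = coinhibition(hiplab_screen, nibr_screen)
```

Because Pearson correlation is invariant to linear rescaling, two screens of the same compound at different effective doses can still correlate strongly even when their absolute FD scores differ.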
In addition to measuring the correlation between chemical profiles, a valuable metric is the correlation of gene fitness scores across compounds and between datasets. For this comparison we employed 'cofitness': the degree of similarity between fitness profiles in which the FD scores between two genes are measured across all compounds, using Pearson correlation as a metric. Genes that exhibit a high degree of correlation or cofitness between fitness profiles across compounds are often functionally related. In this case, where we have two independent datasets, we expect the same gene to be cofit across the 50 compounds that were shared between the two datasets. Overall, we observed a correlation of ~0.15 between the same gene. Because most genes are not perturbed in any given experiment, we expect those with highly variable scores (and therefore more likely to be biologically informative) to exhibit greater correlation. This is indeed what we observe: when only significantly sensitive strains are considered (genes with standard deviations in the top 5%), the correlation between genes increases to ~0.5. As the correlation increases, the mechanistic similarity between drugs that significantly perturb a given deletion strain also increases. For example, while the IDP1 gene profiles exhibit a similar pattern of perturbation (R-value ~0.4) (Figure 2A), RAD5 and HMG1 exhibit higher correlations (R-value ~0.7 and ~0.9, respectively) and significant perturbations are seen in mechanistically related compounds such as 1) the DNA damaging agents hydroxyurea, mechlorethamine and methyl methanesulfonate (MMS) and 2) the sterol pathway inhibitors fluconazole and fluvastatin (Figure 2B, 2C). Similarly, in the case of TOR1 (R-value 0.85), outlier fitness deviations all arise from the same compound (rapamycin) (Figure 2D).
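The same-gene comparison described above can be sketched as a row-wise Pearson correlation between two aligned matrices. This is a hedged illustration rather than the code used for Figure 2: it assumes both datasets have been arranged as genes x compounds arrays with identical row and column order, and it computes the variability filter from the standard deviation of each gene pooled over both datasets, which is our assumption because the text does not state on which dataset the top 5% was defined. The function names are hypothetical.

```python
import numpy as np

def same_gene_correlation(fd_a, fd_b):
    """Row-wise Pearson correlation between two (genes x compounds) arrays.

    Rows (genes) and columns (shared compounds) must be in the same order.
    A gene with constant scores in either dataset yields NaN.
    """
    a = fd_a - fd_a.mean(axis=1, keepdims=True)
    b = fd_b - fd_b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    with np.errstate(invalid="ignore", divide="ignore"):
        return (a * b).sum(axis=1) / denom

def most_variable_genes(fd_a, fd_b, top_fraction=0.05):
    """Boolean mask of genes whose pooled standard deviation is in the top fraction."""
    sd = np.hstack([fd_a, fd_b]).std(axis=1)
    cutoff = np.quantile(sd, 1.0 - top_fraction)
    return sd >= cutoff

# Toy matrices: 2 genes x 4 shared compounds (illustrative values only).
fd_a = np.array([[1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.5]])
fd_b = np.array([[2.0, 4.0, 6.0, 8.0], [0.0, 1.0, 0.0, 0.5]])
r_per_gene = same_gene_correlation(fd_a, fd_b)
mask = most_variable_genes(fd_a, fd_b)
```

Averaging `r_per_gene` over all genes and then over `r_per_gene[mask]` would give the two kinds of summary contrasted in the text (all genes versus the most variable genes).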
When we examined the genes exhibiting cofitness within the shared 50 compounds, we observed enrichment for pairs in both sets that reflect the mechanistic enrichment of the compounds as a whole. For example, 6 of the 50 compounds were DNA damaging agents, and as a result, several of the top cofit genes were pairs where both genes were involved in DNA damage.
To examine the agreement between the NIBR and HIPLAB datasets at a more comprehensive level, we combined, and then hierarchically clustered, the two HIPHOP datasets together for a subset of mechanistically related compounds. In the first case, we clustered ~100 compounds representing 19 distinct mechanistic classes including TOR signaling, microtubule poisons, FAS1 inhibitors, cell wall inhibitors, statins, ionophores, ion channel blockers, azoles and morpholine antifungals (Figure 3A), and in the second case, ~40 DNA damaging agents representing eight mechanistic classes including the drug and tool compounds doxorubicin, camptothecin, hydroxyurea, mechlorethamine and MMS (Figure 3B). In the resulting heatmaps, in both cases, the two identical dendrograms reveal that the screens cluster primarily by the mechanism of drug action and not by the research institute. Screens from NIBR and HIPLAB were interspersed, and all replicates and compounds with the same mechanism clustered together. In the DNA damaging clustergram, one notable exception was observed for two aclarubicin profiles (one from each research site) that did not cluster with the other anthracycline compounds including doxorubicin, daunorubicin and epirubicin. These differences between specific anthracyclines likely reflect true mechanistic differences between these closely related compounds [11,12]. For example, the individual aclarubicin HIPHOP profiles implicate RPO31 (encoding an RNA polymerase III subunit) as a potential target. In select cases, compounds with similar mechanisms (i.e., part of the same pathway) also clustered together, including the morpholine antifungals (e.g., fenpropimorph and amorolfine, both targeting ERG2), the azoles (e.g., fluconazole and clotrimazole, targeting ERG11) and the statins (e.g., atorvastatin and fluvastatin, targeting HMG1), with all three targets in the sterol biosynthesis pathway.
Other examples include clustering of ion channel blockers next to ionophores, amiodarone and nigericin, respectively, and clustering of rapamycin next to caffeine, known to target the TOR pathway in Saccharomyces cerevisiae ( Figure 3A).

Common response signatures
Our previous global analysis of the HIPLAB dataset [6] revealed that, despite the complexities of pharmacological inhibition, the cellular response to small molecules is limited and can be described by a network of 45 major response signatures. These responses comprise chemogenomic profiles with: 1) a characteristic gene signature, 2) distinct GO enrichments and 3) enriched chemical sub-structures. In that 2014 study we found, by subsampling, that the majority of these signatures (~80%) could be identified after screening less than 30% of the compounds, suggesting that the cellular response to small molecules is limited. To test if these response signatures are also present in the NIBR dataset, we used the same methodology as in Lee et al. (2014) [6] to hierarchically cluster the NIBR screens, using coinhibition as a distance metric and a dynamic branch cutting method [13] to generate discrete clusters. 96 robust clusters were initially identified, covering ~41% of the profiles. Compared to the 45 major responses in the HIPLAB dataset covering ~36% of the profiles, the number of NIBR response signatures was two-fold greater. However, many of these signatures were redundant with respect to their GO enrichments and associated gene signatures. Gene signatures were also longer, as would be expected when clusters are small and when compounds within a cluster are replicates. While it is not entirely clear, one explanation for this observation is that the NIBR screening library contains a large number of replicates (56% of all screens), which would produce partially redundant clusters. To identify discrete clusters with minimal redundancy, dynamic branch cutting parameters were modified to be less sensitive to smaller clusters (see Methods), which resulted in a final set of 42 robust NIBR clusters, comparable to the 45 HIPLAB response signatures.
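The clustering step can be sketched in Python with SciPy as follows. This is a simplified stand-in, not the method used here: the distance is taken as 1 minus the Pearson correlation between screens, average linkage is assumed, and the tree is cut at a fixed height with `fcluster` instead of the dynamic branch cutting method [13], for which we do not assume a SciPy equivalent. The function name `cluster_screens` and the `max_distance` value are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_screens(fd, max_distance=0.5):
    """Cluster screens by profile similarity.

    fd is a (screens x strains) array of FD scores. Distance between two
    screens is 1 - Pearson r. Average linkage and a fixed-height cut are
    assumptions of this sketch, standing in for dynamic branch cutting.
    """
    dist = pdist(fd, metric="correlation")  # condensed matrix of 1 - r
    dist = np.clip(dist, 0.0, None)         # guard against tiny negative round-off
    tree = linkage(dist, method="average")
    return fcluster(tree, t=max_distance, criterion="distance")

# Toy data: two pairs of similar screens over six strains (illustrative only).
fd = np.array([
    [5.0, 4.0, 0.0, 0.0, 1.0, 0.0],
    [6.0, 4.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 6.0, 5.0, 0.0, 1.0],
    [0.0, 1.0, 5.0, 5.0, 0.0, 1.0],
])
labels = cluster_screens(fd)
```

A fixed-height cut tends to treat large and small branches alike, which is one reason an adaptive method is preferable for real data with many near-replicate screens.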
The median number of genes in the response signatures was similar between the final NIBR signatures (7 genes) and HIPLAB signatures (8 genes). Using the overlap between gene signatures to measure similarity between response types, we found that ~66.7% of the 45 major HIPLAB response types were detectable in the NIBR clusters. These common signatures include: iron & copper homeostasis, cell wall signaling, mitochondrial stress, and perturbation of the plasma membrane. More specific responses, often including drugs of known mechanism, included the responses: unfolded protein, anthracycline transcription coupled DNA repair, azoles and statins, ERAD & cell cycle, heme biosynthesis & mitochondrial translocase, NEO1-PIK1, tubulin folding & SWR complex, superoxide and DNA damage.
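To make the signature comparison concrete, the sketch below scores the overlap between gene signatures and reports the fraction of reference signatures that have a match. The exact overlap statistic and cutoff used in this study are not stated in this section, so the overlap coefficient (shared genes divided by the size of the smaller signature) and the `min_overlap` threshold of 0.5 are assumptions chosen only for illustration, as are the function names and the toy signatures.

```python
def overlap_coefficient(sig_a, sig_b):
    """Shared genes divided by the size of the smaller signature."""
    a, b = set(sig_a), set(sig_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def fraction_detected(reference, query, min_overlap=0.5):
    """Fraction of reference signatures matched by at least one query signature.

    reference and query map a signature name to a collection of gene names.
    min_overlap is a hypothetical cutoff, not a value from this study.
    """
    hits = sum(
        any(overlap_coefficient(ref, q) >= min_overlap for q in query.values())
        for ref in reference.values()
    )
    return hits / len(reference)

# Toy signatures (gene sets are illustrative, not the published signatures).
hiplab_sigs = {
    "tubulin folding": {"CCT2", "CCT4", "TUB1"},
    "DNA damage": {"RAD1", "RAD5"},
    "unmatched": {"NEO1"},
}
nibr_sigs = {
    "cluster 1": {"CCT2", "CCT4"},
    "cluster 2": {"RAD1", "REV3", "PSO2"},
}
detected = fraction_detected(hiplab_sigs, nibr_sigs)
```

With short signatures (a median of 7 to 8 genes), the choice of statistic and cutoff can change which responses count as shared, so any such threshold should be reported alongside the result.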
The majority of these conserved chemogenomic response signatures are enriched for biological processes, details of which can be visualized at the accompanying website (Comparative chemogenomics). Taken together, these results provide further support for the concept that the cellular response to small molecules is limited and that it can be defined by chemogenomic signatures.
Because the chemogenomic signature comparison may be impacted by biases in screening library composition, we asked which of the final 42 responses were unique to the NIBR dataset. Response signatures that were not detectable in the HIPLAB responses included those comprising the three TOR signaling clusters, the GPCR inhibitor response and the eukaryotic translation initiation factor (eIF) complex inhibitor signature. Other responses were also gene/target-specific and included inhibitors of VRG4, encoding a Golgi GDP-mannose transporter; RPL15A & SPP41, encoding a ribosomal protein and a regulator of spliceosome components, respectively; and FAS1, encoding fatty acid synthase. The finding that these 'missing' responses likely reflect NIBR screening a small number of target-focused compound sets, combined with our initial finding of many small and highly redundant clusters, suggests that the NIBR libraries are enriched for sub-libraries of mechanistically and/or structurally related molecules.
We used the same approach to compare the signatures of the combined dataset to the HIPLAB responses. In this case, of the resulting 47 chemogenomic signatures, ~84% of the original 45 HIPLAB signatures were detected. Of these 38 overlapping signatures, common signatures included all the DNA damage responses, as well as the azole & statin, superoxide, tubulin folding & SWR complex, unfolded protein and mitochondrial-specific stress responses. Interestingly, by combining the two datasets, some of the NIBR signatures that had not previously matched a HIPLAB response were merged into one of the 38 overlapping responses. Only three signatures were composed solely of HIPLAB profiles: the NEO1, ubiquinone biosynthesis & proteasome, and the RSC complex & mRNA processing signatures. Conversely, the signatures driven by NIBR profiles largely overlapped the target-specific responses unique to the NIBR dataset including: TIM54, RPL15A & SPP41, VRG4, eIF, and GPCR inhibitors as well as the major TOR signaling response.

Target frequency comparison
Compared to the HIPLAB dataset, which focused on screening diverse compounds with unknown mechanisms, NIBR clusters were highly enriched for screens identifying genes as potential drug targets. The most frequently identified targets that dominated specific clusters in the NIBR dataset include: 1) ERG11 (of the sterol biosynthesis pathway) and KOG1, AVO1, and TOR2, encoding subunits of the Targets of Rapamycin (TOR1 and TOR2) complexes, 2) FAS1, encoding fatty acid synthase, and 3) the mitochondrial transport gene TIM54. The coherence of these signatures suggests that the contributing compounds represent structural analogs. In the azole & statin and ERG11-GCN responses, ERG11 is identified as the target in 31% and 67% of the screens, respectively. In the three rapamycin clusters, the TOR1 and TOR2 subunits are identified as targets in over half (26) of the 51 screens. In 2012, NIBR published a study of novel Erg11 inhibitors, suggesting these published inhibitors may be present in the NIBR screening library [14]. Similarly, the high frequency of targeting mTOR (mammalian target of rapamycin) complexes (as evidenced by the three responses associated with TOR signaling) suggests an enrichment of rapamycin analogs in the NIBR compound library. This is consistent with the fact that rapamycin and aging are active areas of inquiry at NIBR [15][16][17].
HIPHOP profiles presented in studies previously published by NIBR researchers allowed us, in select cases, to infer the structure of blinded screens. For example, a Nature Chemical Biology study published by the group demonstrated TIM23-dependent mitochondrial import as the target of the natural product stendomycin [18]. A HIPHOP profile in the NIBR dataset was nearly identical to the published version, particularly after accounting for differences in concentration (Figure S2). Similarly, the NIBR group also published a novel geranylgeranyltransferase inhibitor (uncovering sensitivities of strains encoding subunits of the CDC43/RAM2 heterodimer) that was highly correlated to the HIPHOP profile for NIBR compound 5692 in the NIBR dataset [19] (Figure S3).

Compounds and mechanism of action inferred by clustering with reference compounds
One of the NIBR clusters revealed the mechanism of NIBR compounds by virtue of its correlation to reference compounds. The HIPLAB amphotericin B HIP screen was highly correlated with the NIBR 4247 and 1020 HIP screens (> 0.7, p-value < 1e-16) ( Figure 4A). In another example, the NIBR compounds 1208, 1209, 1210 and 1211 exhibited correlations of > 0.8 (p-value < 1e-16) with the hydroxyurea screens, a correlation value on par with that observed between replicates, suggesting these compounds are most likely structural analogs of hydroxyurea or closely related derivatives ( Figure 4B).

Conclusions
Our global analysis of the HIPLAB and NIBR datasets provides a systems-level view of the cellular response to small molecules. Despite the enormous complexity of the cell, the ~35 million chemical-genetic quantitative measurements reported here can be described by ~45 chemogenomic signatures, defined by chemical structure- and biological process-based properties. These drug signatures provide a framework for understanding drug action and, importantly, the impact of genetics on the in vivo response to small molecule perturbation. Because we observed saturation of the 45 major signatures in our previous dataset and also detected 66.7% of these responses in the NIBR dataset, we expect that these signatures represent fundamental, systems-level small-molecule responses. 60% of these responses were also detectable in an earlier large-scale screening campaign [5]. 40% of the responses were conserved in all three datasets. We suggest that the proteins encoded by the genes that comprise these shared, conserved signatures represent potential starting points for therapeutic intervention. The power of functional genetic screens to uncover drug targets and target pathways, and to delineate the mechanism of action of therapeutics, has been demonstrated both in the yeast model system and more recently in meta-analyses of mammalian-cell based CRISPR screens [20]. As the complexity of these screens increases (e.g., in vivo assays, applying combined perturbations, etc.), the ability to perform integrated analyses will grow in importance. Based on our analysis of the two largest comprehensive gene-drug datasets collected to date, we show, using standardized protocols and analytics, that yeast-based screens can be performed at scale across laboratories and that the resulting data are robust.

Source of Datasets
The NIBR dataset was downloaded from Hoepfner et al., 2014 [7] through the Dryad digital repository at: http://doi.org/10.5061/dryad.v5m8v. Gene-wise z-score data for the essential genes present in the heterozygous dataset were selected and combined with the nonessential homozygous dataset for the 2725 screens present in both datasets. The HIPLAB data consists of 5905 strains and 3356 screens [6]. For clustering the combined datasets, the two matrices were merged into a final matrix of 5894 strains x 6081 screens. 309 strains in this dataset were absent in the NIBR dataset.

Identification of significant chemical-genetic interactions
FD scores were calculated for both datasets using slightly different techniques. Specifically, for each HIPLAB strain, log2 ratios were calculated as follows:

$$\log_2 \mathrm{ratio}^{\mathrm{HIPLAB}} = \log_2\!\left[\frac{\text{median signal from control samples}}{\text{signal from chemical sample}}\right] \tag{1}$$

To facilitate comparisons between screens, log2 ratios were standardized (separately for heterozygous and homozygous strains).
The FD score of strain i in screen j was computed as follows:

$$\mathrm{FD}^{\mathrm{HIPLAB}}_{i,j} = \frac{\log_2 \mathrm{ratio}_{i,j} - \operatorname{median}_j(\log_2 \mathrm{ratio})}{\operatorname{MAD}_j(\log_2 \mathrm{ratio})} \tag{2}$$

where the median and MAD are taken over the log2 ratios for screen j. Because the FD scores follow a standard normal distribution, the probability that a given score is an outlier in this distribution was obtained using a one-tailed test. Scores with P < 0.001 were identified as significant chemical-genetic interactions. To identify outlier screens for a given deletion strain, FD scores were converted into gene-wise z-scores and P-values [6].
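The calculation in Equations (1) and (2), followed by the one-tailed test, can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than the authors' pipeline: the function names and the example signals are hypothetical, and the sketch assumes the unscaled MAD because the text does not state whether a scaling constant was applied.

```python
import math
import statistics


def fd_scores(control_median_signal, chemical_signal):
    """Robust z-scores (FD) for one screen.

    Both arguments are per-strain signal lists of equal length
    (hypothetical inputs for illustration).
    """
    # Equation (1): log2(control / chemical); positive values mean the
    # strain is depleted in the presence of the compound.
    log_ratios = [math.log2(c / t)
                  for c, t in zip(control_median_signal, chemical_signal)]
    # Equation (2): center on the screen median, scale by the screen MAD.
    med = statistics.median(log_ratios)
    mad = statistics.median(abs(x - med) for x in log_ratios)  # unscaled (assumption)
    return [(x - med) / mad for x in log_ratios]


def upper_tail_p(score):
    """One-tailed (upper) P-value of a score under a standard normal."""
    return 0.5 * math.erfc(score / math.sqrt(2))


# Toy screen with five strains; the last strain is strongly depleted.
fd = fd_scores([100, 100, 100, 100, 100], [100, 50, 200, 100, 25])
significant = [upper_tail_p(s) < 0.001 for s in fd]
```

With real data, the toy lists would be replaced by the per-strain signals of one screen, and the same functions would be applied to each screen in turn.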
As described [7], the NIBR dataset defined the log2 ratio with the ratio inverted relative to HIPLAB (chemical over control rather than control over chemical):

$$r_L = \log_2 \mathrm{ratio}^{\mathrm{NIBR}} = \log_2\!\left[\frac{\text{average signal from chemical sample}}{\text{average signal from control samples}}\right] \tag{1}$$

The normalized MADL score (FD) of strain i in screen j was computed as:

$$\mathrm{MADL}_{i,j} = \mathrm{FD}^{\mathrm{NIBR}}_{i,j} = \frac{\log_2 \mathrm{ratio}_{i,j} - \operatorname{median}_j(\log_2 \mathrm{ratio})}{\operatorname{MAD}_j(\log_2 \mathrm{ratio})} \tag{2}$$

The MADL, or FD^NIBR, is roughly equivalent to the negative of FD^HIPLAB.
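The sign relationship between the two scores can be made explicit with a short derivation. As a simplifying assumption that is not made in the source, suppose both pipelines summarize the control and chemical signals for strain i in screen j by the same values $c_{i,j}$ and $t_{i,j}$. In practice HIPLAB uses a control median and NIBR uses averages, which is one reason the equivalence is only approximate.

$$
\begin{aligned}
x_{i,j} &= \log_2\frac{c_{i,j}}{t_{i,j}}, \qquad y_{i,j} = \log_2\frac{t_{i,j}}{c_{i,j}} = -x_{i,j},\\
\operatorname{median}_j(y) &= -\operatorname{median}_j(x), \qquad \operatorname{MAD}_j(y) = \operatorname{MAD}_j(x),\\
\mathrm{MADL}_{i,j} &= \frac{y_{i,j} - \operatorname{median}_j(y)}{\operatorname{MAD}_j(y)} = -\frac{x_{i,j} - \operatorname{median}_j(x)}{\operatorname{MAD}_j(x)} = -\mathrm{FD}^{\mathrm{HIPLAB}}_{i,j}.
\end{aligned}
$$

Under this assumption a strain that is depleted by a compound has a positive FD^HIPLAB and a negative MADL of the same magnitude.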
Lastly, the MADL scores were multiplied by the t-test p-value between replicates and the controls to adjust for highly variable strains (aMADL). The gene-wise z-scores were further estimated using the aMADL of strain i over n experiments and the standard deviation (σ) obtained from the middle 70% of the quantiles:

$$z\text{-score}_i = \frac{\mathrm{aMADL}_{(i)}}{\sigma_i} \tag{3}$$

A z-score cutoff of −5 was used to define the significant chemical-genetic interactions [7].

Identification of HIP hits
'HIP hits' are defined as potential targets of profiled compounds with high specificity [6]. In the HIPLAB dataset, 'clearance' was defined as a measure of specificity that identifies significant hits among strains exhibiting FD scores greater than zero in a given HIP profile. Strains are ordered by FD score in descending order, where FD(i) is the i-th greatest FD score in the profile, and clearance′ is defined as the difference between FD scores. Clearance_max is the maximum clearance′ associated with the profile, and FD_max is the FD score of the strain with clearance′ = clearance_max. If any FD(i) ≥ FD_max, clearance = clearance_max; otherwise, clearance = clearance′. Clearance thresholds were optimized using the gold-standard compounds with known targets in the dataset, resulting in a threshold of 5.75. Therefore, strain(s) with significant FD scores (P < 0.001) and clearance_max ≥ 5.75 are designated HIP hits [6]. We used this clearance scoring system to identify hits in both datasets.

Hierarchical clustering
Our chemogenomic dataset is in a matrix format where each column is a screen and each row is a gene (corresponding to its homozygous or heterozygous deletion strain). To identify robust clusters in the NIBR dataset and to fairly compare the two datasets, we followed the same hierarchical clustering methodology used in Lee et al. 2014 [6]. We first replaced insignificant scores (standard normal P > 0.001) in the NIBR screening matrix with zero, to focus on the most significant cellular responses to chemical perturbation. We then computed coinhibition, the pairwise Pearson correlation between all screens, representing the similarity between the NIBR profiled compounds. Profiles were then hierarchically clustered using (1 − coinhibition) as the distance metric and the Ward agglomeration method. Discrete clusters were obtained using a dynamic branch cutting method [13]. For the full NIBR dataset we used the following parameters: deepSplit = 4, minClusterSize = 3, as was done in Lee et al. [6]. For the final version with 41 clusters we used deepSplit = 2, cutHeight = 20, minClusterSize = 3. For the combined HIPHOP dataset, we used minGap = 0.098, deepSplit = 2, minClusterSize = 3.
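The first two steps of this procedure, zeroing insignificant scores and converting coinhibition into a distance, can be sketched in Python with the standard library. The function names and the toy matrix are hypothetical. The sketch also makes two assumptions the text does not specify: significance is judged on the magnitude of a score, and a profile with no variation is treated as uncorrelated with every other profile. Ward agglomeration and dynamic branch cutting are not implemented here and would be applied to the resulting distance matrix by a dedicated clustering library.

```python
import math


def upper_tail_p(score):
    """One-tailed (upper) P-value of a score under a standard normal."""
    return 0.5 * math.erfc(score / math.sqrt(2))


def zero_insignificant(matrix, p_cutoff=0.001):
    """Replace scores with standard normal P > p_cutoff by zero.

    Assumption: significance is evaluated on |score|.
    """
    return [[v if upper_tail_p(abs(v)) <= p_cutoff else 0.0 for v in row]
            for row in matrix]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def coinhibition_distance(matrix):
    """Return the screens x screens matrix of (1 - coinhibition).

    matrix: rows are strains, columns are screens.
    """
    columns = list(zip(*matrix))
    return [[1.0 - pearson(a, b) for b in columns] for a in columns]


# Toy matrix: 4 strains x 3 screens; screens 0 and 1 share a response.
scores = [[6, 6, 0.5], [0, 0.2, 5], [-5, -6, 0], [1, 0, 0]]
distance = coinhibition_distance(zero_insignificant(scores))
```

In the toy example the first two screens end up close together, while the third screen, which affects a different strain, is far from both.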

Chemogenomic response signatures
In our previous study, we classified HIPHOP cellular response types into chemogenomic signatures defined by characteristic genes and associated biological processes [6]. To determine whether these major response signatures exist in the NIBR dataset, we used the same analytic methods. Specifically, an FD matrix was constructed with all profiled compounds as columns and all deletion strains (genes) as rows. Similarity between the cellular responses to the profiled compounds was measured using the Pearson correlation between the matrix columns (coinhibition). The functional similarity between two genes was measured using the Pearson correlation between the matrix rows (cofitness). To identify robust clusters in the NIBR dataset, profiles were hierarchically clustered using (1 − coinhibition) as the distance metric and the Ward agglomeration method. For each cluster, we calculated the median FD score of each deletion strain across all profiles in that cluster to generate a median profile. Strains with significantly positive FD scores (standard normal P < 0.001) identify the characteristic gene signatures that are an important part of the cellular responses. A threshold of standard normal P < 0.001 was used for comparing HIPLAB and NIBR signatures. For the signatures in the combined dataset, we used a threshold of standard normal P < 0.05.
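The median-profile step can be illustrated with a short Python sketch using the standard library. The function name, the gene labels and the toy FD values are hypothetical, and cluster membership is assumed to be available already as a list of column indices.

```python
import math
import statistics


def signature_genes(fd_matrix, strain_names, cluster_columns, p_cutoff=0.001):
    """Return (strain, median FD) pairs for a cluster's gene signature.

    fd_matrix: rows are deletion strains, columns are profiles.
    cluster_columns: indices of the profiles that belong to the cluster.
    """
    hits = []
    for name, row in zip(strain_names, fd_matrix):
        # Median FD of this strain across all profiles in the cluster.
        median_fd = statistics.median(row[j] for j in cluster_columns)
        # One-tailed (upper) P-value under a standard normal.
        p_value = 0.5 * math.erfc(median_fd / math.sqrt(2))
        if p_value < p_cutoff:
            hits.append((name, median_fd))
    # Strongest responders first.
    return sorted(hits, key=lambda hit: -hit[1])


# Toy data: 3 strains x 4 profiles; the cluster holds profiles 0, 1 and 2.
fd_matrix = [
    [5.0, 6.0, 4.0, 0.1],
    [0.2, -0.3, 0.1, 7.0],
    [3.5, 3.2, 0.0, 0.0],
]
signature = signature_genes(fd_matrix, ["geneA", "geneB", "geneC"], [0, 1, 2])
```

Passing `p_cutoff=0.05` instead of the default would correspond to the more permissive threshold described for the combined dataset.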
In the NIBR dataset, 49 clusters were identified and 41 were associated with characteristic genes or gene signatures. Response signatures with fewer than two genes that were not enriched for biological processes were omitted. We performed GO enrichment analysis on each response signature.

Hierarchical cluster analysis of reference compounds screened by both the HIPLAB and NIBR. To identify robust clusters, we generated the 'coinhibitory' square matrix, defined as the pairwise Pearson correlation between the selected screens, representing the similarity between profiled compounds. Profiles were then hierarchically clustered using (1 − the coinhibitory matrix) as the distance metric and Ward as the agglomeration method. Heatmaps of (A) drugs with established mechanism and (B) antimetabolites and DNA-damaging agents. Row dendrogram branches are colored by mechanism of drug action; column dendrogram branches are colored by research institute: NIBR and HIPLAB in navy and light blue, respectively. Drugs within each major cluster represent screens with highly correlated chemogenomic profiles, indicated by both the heatmap color scale and dendrogram height. This suggests that compounds within a cluster act by a similar mechanism.