### Data processing and normalization

Several sources of noise, including technical and procedural factors, may influence measurement quality, generating inferential errors. Usually normalization is done prior to data analysis in RNAi screening studies such that variations contributed by unequal amounts of cells and/or RNAi are substantially reduced. Within-plate normalization can be performed using the non-silencing RNAi controls in the plate as a reference to give a relative measurement of target-gene knockdown effect, often adjusting for the variance by dividing by the standard deviation (SD) or median absolute deviation (MAD). Some approaches use a positive control or both positive and negative controls [7], others do not use a control, including Z score/robust Z score and B score [8]. Across-plate normalization is the process that makes measurements comparable across culture plates by removing systematic plate-to-plate variation. Common approaches include mean- or median-centered normalization, and standardization. In drug sensitivity studies, however, it is important to realize that such approaches can conceal true differences between drug-treated and untreated plates and consequently produce false-negatives. Instead, the raw viability data can be centered using the global mean or median of untreated non-silencing (NS) siRNA controls. Therefore, we recommend including untreated NS controls in all culture plates for drug sensitivity studies.

### Statistical approaches

One major purpose of RNAi screens is to identify genes that mediate the effects on cells of certain conditions, such as treatment with a chemotherapeutic drug or endocrine therapy. Such experiments explore the effect of gene knockdown in treated versus untreated wells, aiming to find meaningful associations between genes and the treatment. Different rules have been used to identify hits. A commonly used parametric approach is mean ± kSD under the assumption of normality [8, 9] or its more robust version by replacing SD with MAD [10]. When distribution is skewed, a quartile-based method is an option [11]. Strictly standardized mean difference (SSMD) was first proposed to assess siRNA effect size, and modified later to balance false-negatives and false-positives in hit selection [12, 13]. In high-throughput RNAi screens designed for drug sensitivity studies, available statistical approaches are much fewer; most commonly used by biologists include fold-change methods (sometimes combined with percent cell viability), parametric two-sample tests such as the *t*-test and Z-factor and their variants, and sensitivity index (SI).

Under the assumption that most features are not significant in high-dimensional data analysis, feature selection methods like Lasso (L_{1} penalty) and Elastic Nets (both L_{1} and L_{2} penalty) and their variants are often found useful and efficient in dimension reduction and feature selection. However, it may be difficult to adopt similar strategies to RNAi screening studies for drug sensitivity evaluation because, firstly, our main interest focuses on testing the gene-drug interactions, therefore in addition to siRNAs whose gene-drug interaction effect showed significance, the final model also needs to include drug effect regardless of its statistical significance. This however cannot be guaranteed by the automatic variable selection methods mentioned above. Secondly, when the number of features (*p*) is larger than the number of samples (*n*), lasso methods can select at most *n* features. This may be problematic for RNAi screening studies where *p* > > *n* is usual (in practice such experiments are usually designed with 3, 6, or 9 -replicate). Thirdly, lasso methods generate a number of most important features. Nevertheless, for gene function and drug discovery purposes in a high-throughput screening experiment, finding features with a small effect can be substantively important and a ranked list of candidate features, based on their significance, are often helpful.

Based on above considerations, we conducted a simulation study to evaluate the performance of commonly applied methods: fold change, *t* test, SI. We also fit a linear model (LM) of the probability of being a hit for each gene with an interaction term of drug and RNAi effect on cell viability. Each method is described as below.

#### Fold change/ratio

Fold change is the most intuitive approach used to represent the relative cell viability between two conditions, which usually is calculated as average cell viability (measured, for example, through an Alamar blue dye assay) over all wells in a condition divided by average viability over all wells in another condition. Because most genes knocked down in a siRNA screen do not have a significant effect on cell viability/growth in the background of the treatment, the log2 viability ratios between treated and untreated wells will be around zero for most genes. An arbitrary cut-off level, such as two- or three-fold change, is typically used to select hits for further experiments and analysis.

#### Parametric tests/statistics

Many biologists favor tests of two-group comparison for their easy calculation and interpretation. One widely used test is Student's two-sample *t* test. For each siRNA, a *t* statistic, *Ti*, is computed, and an siRNA is considered significant if |*Ti* | exceeds some threshold. The Z-factor and Z'-factor have been used for similar purposes; however, such analysis is usually based on the difference of the averaged readings over replicates between treated and untreated groups. These methods are more sophisticated than fold-change in the sense that they not only consider the average ratios between the groups but also incorporate information on the variation of the measurements and the number of replicates in the experiment.

#### Sensitivity index (SI)

The SI method was developed to measure the influence of siRNA-induced gene knockdown on drug sensitivity, by estimating the difference between the expected and observed combined effects of RNAi and drug on cell viability. Different from the methods discussed above, the SI method estimates both the influence of siRNA-induced gene knockdown on drug sensitivity and the individual drug and RNAi effects. The SI index can be calculated for each siRNA as SI = (Rc/Cc*Cd/Cc)-(Rd/Cc), where Rc is the average viability in drug-untreated wells transfected with active siRNA, Rd is the average viability in drug-treated wells with active siRNA, Cc is the average viability in drug-untreated wells with control siRNA, and Cd is the average viability in drug-treated wells with control siRNA. The SI value ranges from -1 to 1, with positive values indicating a sensitizing effect and negative values indicating an antagonizing effect. The SI method makes no assumption about the underlying probability distribution and therefore no *p*-values can be calculated.

#### Linear models with an siRNA-drug interaction effect

The SI method attempts to estimate combined RNA and drug effect. Nevertheless, one major disadvantage of the SI method is that it ignores the cross-plate variation of a particular siRNA, as the calculation of sensitivity ratio (Rd/Cd)/(Rc/Cc) involves only averaged reading levels over the replicate plates. Model-based methods are often used for feature selection in other types of high-throughput genomic data, including gene-expression microarray data and single nucleotide polymorphism (SNP) data. In our study, we used a simple linear model with an interaction term to assess RNAi effect, drug effect, and their combined effect. Assuming normal distribution, a full linear model

*D*
_{2} of cell viability (Y) for each siRNA

*i* can be constructed based on the predictor variables: drug effect (

*x*
_{1i
}, yes/no), RNAi effect (

*x*
_{2i
}, yes/no), and their interaction term (

*x*
_{1i
}
*x*
_{2i
}).

This model not only allows for estimating the gene-drug effect but also takes into consideration the variance among the replicates in its estimations. A test based on the difference between the deviance of the null model (model with no explanatory variables)

*D*
_{0} and the deviance of the fitted full model

*D*
_{2} may yield significant result when the drug effect is significant, even if the siRNA does not have any effect on cell viability. Therefore, we calculated the difference between the residual deviance of the fitted full model

*D*
_{2} and the deviance of the reduced model

*D*
_{1} including only drug effect (

*x*
_{1}):

This statistic, *D*
_{1} -*D*
_{2}, follows a chi-square distribution with 2 degrees of freedom. The *p*-value based on this statistic reflects the combined effect of drug and RNAi as well as the RNAi effect alone of the given siRNA. The reason we did not include RNAi effect (*x*
_{2}) in *D*
_{1} is that a significant RNAi effect alone without a significant interaction effect with drug treatment also provides vital information about the gene that is silenced, which can be very useful in identifying novel therapeutic targets for future studies.

### Simulation of datasets

We evaluated the methods using datasets simulated to represent different scenarios corresponding to a given combination of parameters of number of true hits, the amount of noise, the skewness of the data, the strength of chemotherapeutic drug effect, and the RNAi effect. We focused on combined RNAi and drug effect on cell viability, control of false-positive and false-negative rates, and the influence of drug concentration on the statistical power.

Data for ten 96-well plates with three, six, nine, or twelve replicates were simulated. For each scenario, 500 simulations were carried out. For each simulation, a number of true hits were drawn randomly from the distribution Uniform{10, 11, ..., 60} with an average of 35 out of 900 siRNA wells being truly sensitizing or antagonizing. The viability measurements of the samples transfected with non-hits were generated from N(*μ*
_{
NH
}, σ^{2}), with *σ* at values of 0.2, 0.4, 0.6, or 0.8. The distribution of true hits is assumed to have a shifted mean relative to the distribution of non-hits: N(*μ*
_{
NH*}
*C*
_{
1
}, *σ*
^{
2
}) with *C*
_{
1
} > 1 for a sensitizing effect and N(*μ*
_{
NH*}
*C*
_{
2
}, *σ*
^{
2
}) with *C*
_{
2
} < 1 for an antagonizing effect (*C*
_{
1
} = 7, *C*
_{
2
} = 0.15; *C*
_{
1
} = 5, *C*
_{
2
} = 0.3; *C*
_{
1
} = 2, *C*
_{
2
} = 0.5; and *C*
_{
1
} = 1.25, *C*
_{
2
} = 0.8). The average cell viability in control wells is usually higher than that in siRNA-transfected wells. The parameter of the chemotherapeutic drug effect *D* was used to tune the strength of such effect. Specifically, the distribution of drug-treated samples (with active or control siRNA) has a shifted mean relative to untreated: *μ*
_{
trt
} = *μ*
_{
unt}**D*, *D* = 0.3, 0.6, and 0.8, where *μ*
_{
unt
}= *μ*
_{
NH
}, *μ*
_{
NH*}
*C*
_{
1
} , *μ*
_{
NH*}
*C*
_{
2
} , or *μ*
_{
NH*}
*K* as previously defined. In addition, parameter *K* is defined to be K = 1.05, 1.10, 1.15, or 1.2, such that control wells have a distribution with mean *μ*
_{
ctl
} = *μ*
_{
rna
}**K*, where *μ*
_{
rna
} = *μ*
_{
NH
}or *μ*
_{
NH*}
*D*. Parameters *μ*
_{
NH
}, *σ*, *C*
_{
1
}, *C*
_{
2
}, *D*, and *K* were chosen such that the simulated data would resemble data with different distributions and properties, similar to those we have observed in real siRNA screening experiments. In particular, *C*
_{
1
} and *C*
_{
2
} were chosen such that the sensitizing and antagonizing effects would be equal in magnitude in order to have roughly same number of true hits simulated in both directions of the effect.

To evaluate the robustness of the methods for skewed data, gamma distributions were used instead of normal. The shape and scale parameters of gamma distributions were calculated by solving *μ = rλ and σ*
^{
2
}
*= rλ*
^{
2
} based on previously used parameters of normal distributions. The skewness value (
) is taken to be 0.5, 1, 1.5, or 2. Two situations were considered: when a strong drug effect is present (D = 0.3) and when a weak drug effect is present (D = 0.8).

### Criteria for the evaluation of statistical approaches

In practice, RNAi screening studies often involve a great deal of variation and noise in the raw data. Moreover, due to cost constraints, the number of replicates is often very limited. Hypothesis-testing under such conditions is, therefore, error-prone, with errors falling into two types: type I error (false-positive, FP) and type II error (false-negative, FN). To evaluate the performance of the methods, we calculated the false-positive rate (FPR) and the false-negative rate (FNR) of each method and in each scenario:

The FPR corresponds to the portion of genes that, when silenced, have no influence on drug sensitivity among those identified as influential by the method. The FNR corresponds to the portion of genes influencing drug sensitivity among those claimed non-influential by the method.

### Real data analysis

Paclitaxel is a potent anti-microtubule agent used in the treatment of patients with locally advanced and metastatic breast cancer. Despite its wide use, paclitaxel-based chemotherapy results in full response in only a small portion of patients; many patients have an incomplete response or are resistant to treatment. We performed a loss-of-function RNAi screen to identify genes that modulate paclitaxel sensitivity [14]. We targeted a subset of genes (n = 428) frequently found to be "deregulated" in breast cancers and known to be associated with a targeted pharmacological agent (i.e., druggable), with the idea these could be analyzed in preclinical models for synergistic activity with paclitaxel. An shRNA screen was initially performed to identify druggable gene targets; we then validated the top high-confidence hits from the shRNA screen by designing two independent siRNAs for each gene, to be assayed in two representative breast cancer cell lines, MDA-MB-231 and MDA-MB-468. The two cell lines were reverse-transfected with siRNAs complexed with lipid reagent in each well of a 96-well plate for 48 h and subsequently split into six replicate plates. Following transfection of siRNAs, plates/cells then were treated for 24 h ± paclitaxel (i.e., 3-replicate plates +paclitaxel, 3-replicate plates -paclitaxel) and incubated for an additional 72 h to allow for changes in cell viability. To account for plate-to-plate variability and to control for the effects of siRNA transfection, data were normalized to non-silencing (NS) siRNA or shRNA controls, which do not target any human gene, for all plates. The full experiment (i.e., transfection, drug treatment, viability assay, and data normalization) was repeated, resulting in high reproducibility Pearson's correlation coefficients ~0.70-0.80.