GSK690693 was synthesized at GlaxoSmithKline. For all in vitro studies, GSK690693 was dissolved in DMSO at a concentration of 10 mM prior to use and subsequently diluted in aqueous medium. For the tumor xenograft studies, GSK690693 was formulated in 4% DMSO/40% hydroxypropyl-β-cyclodextrin in water, pH 6.0.
Female CD1 Swiss Nude mice were obtained from Taconic (Hudson, NY) and C.B-17 SCID mice were obtained from Charles River (Wilmington, MA). All animal studies were performed in compliance with federal requirements, GlaxoSmithKline policy on the Care and Use of Animals, and with related codes of practice.
Sample preparation for RNA and Phosphoprotein analysis
Human tumor cell lines BT474, T47D, MDA-MB-468, MDA-MB-453, LNCaP, and SKOV-3 were treated with 1 μM GSK690693 for 2, 8, 24, and 48 h (N = 3 or 4 replicates/treatment group) and lysates were prepared in Trizol for RNA expression analysis. Phosphorylation of various AKT substrates was analyzed by reverse-phase protein microarray in breast carcinoma cell lines treated for 10 min, 30 min, 2 h, or 24 h (N = 2 replicates/time point).
Tumor xenografts were initiated by subcutaneous injection of a tumor cell suspension (LNCaP) or implantation of tumor fragments (BT474, SKOV-3) in 8-12 week old CD1 Swiss Nude mice (LNCaP, SKOV-3) or SCID mice (BT474). When tumors reached a volume of 100-200 mm³, mice were randomized into groups of 8-12 mice/group. GSK690693 was administered once daily at 30 mg/kg by IP administration. Tumor tissues were harvested after dosing for 3, 7, or 21 days (n = 3 mice/group/time point) and homogenized in Trizol. Total RNA was isolated from each Trizol lysate using RNeasy reagents (Qiagen, Valencia, CA), quantified by spectroscopy, and assessed for quality using an Agilent Bioanalyzer. Five micrograms of high-quality total RNA was used to generate amplified cRNA probe material using the Eberwine protocol (Van Gelder et al., 1990), which was hybridized overnight to U133 Plus 2.0 GeneChips (Affymetrix, Santa Clara, CA). GeneChip washing and scanning were performed according to the manufacturer's instructions.
Reverse-phase Protein Array
Protein arrays were constructed as described previously. Briefly, serially diluted protein lysates were printed in duplicate onto nitrocellulose-coated glass slides. The lysate arrays were incubated for at least 5 hours in blocking solution [1 g I-block (Tropix, Bedford, MA), 0.1% Tween-20 in 500 mL PBS] at room temperature with constant rocking. Blocked arrays were stained with pGSK3a/b (S9), pFOXO (T24/32), pFOXO (S256), pmTOR (S2448), pBAD (S112), and pPRAS40 (T246) antibodies on an automated slide stainer. All antibodies were obtained from Cell Signaling Technology (Beverly, MA), except the phospho-PRAS40 antibody, which was purchased from Biosource (Carlsbad, CA). Stained slides were scanned individually on a UMAX PowerLook III scanner (UMAX, Dallas, TX, USA) at 600 dpi and saved as TIF files in Photoshop 6.0 (Adobe, San Jose, CA, USA). The TIF images of the antibody-stained and Sypro-stained slides were analyzed with MicroVigene image analysis software, version 2.200 (Vigenetech, North Billerica, MA) and Microsoft Excel 2000 software. Images were imported into MicroVigene, which performed spot finding, local background subtraction, replicate averaging, and total protein normalization, producing a single value for each sample at each endpoint.
Microarray data were analyzed using the R-based Bioconductor suite of analytical tools to identify genes that changed state across a variety of comparisons. RMA analysis was applied to generate expression values from the Affymetrix CEL files. Principal component analysis was applied to the expression values for each group of microarrays to determine whether any samples differed dramatically from the set of similar microarrays. For all comparisons except the LNCaP 3-day xenograft, significant gene expression changes were selected using an ANOVA p-value cutoff of 0.05 after Benjamini-Hochberg FDR correction for multiple testing, together with a fold change of at least 1.3. As the purpose of this analysis was to determine common mechanisms of action across diverse cell lines and xenografts, a conservative minimum fold change of 1.3 was appropriate in that it ensured that hypotheses were supported by clear experimental evidence. Because of the large number of fold changes in the LNCaP 3-day xenografts, an unadjusted p-value cutoff of 0.01 was used for that comparison without a fold change criterion.
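The gene selection rule described above can be illustrated with a minimal Python sketch. This is not the Bioconductor code used in the study: the function names, the inclusive comparison at the cutoffs, and the symmetric treatment of up- and down-regulation on a log2 scale are assumptions made for illustration.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_changed_genes(genes, pvalues, log2_ratios, alpha=0.05, min_fold=1.3):
    """Map each selected gene to 'increase' or 'decrease'.

    Assumption: a gene passes if its adjusted p-value is <= alpha and its
    fold change is at least min_fold in either direction.
    """
    adjusted = benjamini_hochberg(pvalues)
    threshold = math.log2(min_fold)
    selected = {}
    for gene, p_adj, ratio in zip(genes, adjusted, log2_ratios):
        if p_adj <= alpha and abs(ratio) >= threshold:
            selected[gene] = "increase" if ratio > 0 else "decrease"
    return selected
```

Passing `alpha=0.01` with raw p-values and `min_fold=1.0` would mimic the unadjusted rule used for the LNCaP 3-day xenograft comparison, provided the adjustment step is skipped.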
Phosphoproteomic values were determined using reverse-phase protein array analysis. The resulting values were analyzed as ratios of treated to vehicle samples for each cell line at 10 min, 30 min, 2 h, and 24 h. A change was considered significant if the treated value was at least 20 percent lower or at least 50 percent higher than the vehicle value. These cutoff thresholds were determined empirically to enable changes to contribute to the analysis across cell lines.
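As a small illustration of these thresholds, the following Python sketch classifies a treated/vehicle ratio. The function name and the inclusive handling of values exactly at a cutoff are assumptions, not details given in the text.

```python
def classify_phospho_change(treated, vehicle, decrease_cutoff=0.8, increase_cutoff=1.5):
    """Classify a normalized phosphoprotein signal relative to vehicle.

    A 20 percent decrease corresponds to a ratio of 0.8 and a 50 percent
    increase to a ratio of 1.5. Boundary values are treated as changes
    (an assumption).
    """
    ratio = treated / vehicle
    if ratio <= decrease_cutoff:
        return "decrease"
    if ratio >= increase_cutoff:
        return "increase"
    return "no change"
```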
Causal Reasoning Methodology
In this study, the activation or inhibition of specific biological signaling networks was identified as an explanation for statistically significant RNA gene expression changes observed in response to treatment with GSK690693 in multiple cell lines. These networks represent mechanistic hypotheses for molecular effects of GSK690693, and together they comprise a network called a Causal Network Model (CNM) that links GSK690693 treatment to a large fraction of the observed data in multiple experiments via common mechanisms. These networks were identified in a two-stage process: (1) Reverse Causal Analysis, an automated analysis of the experimental data using a large, literature-derived network of cause-and-effect relationships, the Genstruct Human Knowledge Assembly Model, and (2) a software-assisted methodology enabling scientists to vet the results of the automated analysis and to produce the explanatory networks. Note that this process is a means to explain observed data in the context of existing knowledge, distinct from approaches that attempt to infer novel causal relationships from observed data.
The Human Knowledge Assembly Model (KAM) is a set of human-specific causal assertions that has been augmented with orthologous causal assertions derived from either rat or mouse sources. Each causal assertion is the result of manual curation of the scientific literature and is supported by one or more specific scientific citations. An example causal assertion would be: increased transcriptional activity of the NF-κB complex causes an increase in the gene expression of insulin receptor substrate 1 (Irs1) (Ruan et al., 2002).
Reverse Causal Analysis (RCA) of experimental data evaluates each node in the KAM as a hypothesis, a potential cause for observed differential measurements in an experiment. By computing statistical figures of merit for each hypothesis, RCA enables each hypothesis to be ranked by multiple criteria (see below) and prioritized for inclusion in larger explanatory networks. RCA starts with the quantification of differential measurements as "state changes", reducing values to be one of "increase", "decrease" or "no change". While these differences are referred to as "changes", in fact they can be any differences in state between two biological systems, such as differential protein expression between drug-sensitive and insensitive cell lines, or differential RNA expression between tissues of knockout and wild-type animals. State changes are then assigned to nodes in the KAM that represent entities corresponding to the measurements. In the case of transcriptomic data, state changes are mapped to nodes representing RNA abundances. Finally, every node in the KAM is evaluated as a hypothesis, where a hypothesis is a potential explanation for some subset of the state changes. A node evaluated as a hypothesis is the "root" of the hypothesis and the hypothesis is composed by assuming that the root node has a value of "increased" and then searching a defined number of steps in the network of the KAM for all causal paths leading from the root node to a mapped state change. Each state change node found by this search is a "prediction" of the hypothesis and is assigned a polarity of either "increase", "decrease", or "ambiguous". The polarity assignment is based on the sequence of inverting and non-inverting causal relationships traversed in each path from the root node to the state change node. If the state change node can be reached by paths making contradictory polarity assignments, it is assigned a polarity of "ambiguous". 
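The construction of a hypothesis from a root node can be sketched in Python as a bounded path search over signed edges. This is an illustrative toy, not the Genstruct implementation: the edge representation, the function name, the restriction to paths without repeated nodes, and the choice to keep searching past a measured node are all assumptions.

```python
from collections import defaultdict


def predict_state_changes(edges, root, measured, max_steps=2):
    """Derive predictions for a hypothesis rooted at `root`.

    edges: iterable of (source, target, sign); sign is +1 for a
    non-inverting and -1 for an inverting causal relationship.
    measured: set of nodes that carry a mapped state change.
    The root is assumed to be "increased".
    """
    graph = defaultdict(list)
    for source, target, sign in edges:
        graph[source].append((target, sign))

    signs_seen = defaultdict(set)

    def walk(node, sign, steps_left, visited):
        if steps_left == 0:
            return
        for target, edge_sign in graph[node]:
            if target in visited:
                continue  # assumption: only paths without repeated nodes
            path_sign = sign * edge_sign
            if target in measured:
                signs_seen[target].add(path_sign)
            walk(target, path_sign, steps_left - 1, visited | {target})

    walk(root, 1, max_steps, {root})

    predictions = {}
    for node, signs in signs_seen.items():
        if len(signs) > 1:
            predictions[node] = "ambiguous"  # contradictory paths
        elif 1 in signs:
            predictions[node] = "increase"
        else:
            predictions[node] = "decrease"
    return predictions
```

For example, with hypothetical edges `A -> B (+)`, `A -| C (-)`, `B -> rna1 (+)`, `C -> rna1 (+)`, and `A -| rna2 (-)`, a two-step search from `A` reaches `rna1` with both polarities, so it is labeled ambiguous, while `rna2` is predicted to decrease.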
The set of predictions for each hypothesis is then evaluated with respect to the mapped state changes by calculating two figures of merit: Richness and Concordance p-values. Richness is a measure of the relevance of a hypothesis to the changes observed in the experiment, while Concordance is a measure of the accuracy of the predictions of a hypothesis. Both Richness and Concordance cast the predictions and the measurement data into canonical forms for probability analysis. Richness is a measure of the over-representation of observed state changes in the set of genes for which a hypothesis makes predictions. For example, observed state changes are overrepresented if only 1% of all genes measured show significant change but 10% of the genes for which a hypothesis makes predictions show significant change. Richness is the significance of the overrepresentation, calculated as a p-value based on the hypergeometric distribution, i.e. sampling without replacement. It is the likelihood of having Q state changes both predicted to change and observed to change, given N total measurements, M total significant changes and P predictions. Note that Richness does not depend on whether the observed direction of any change agrees with the predicted direction; hence ambiguous predictions may be included in the calculation. Concordance is a measure of the correctness of the predictions of the hypothesis, whether observed changes agree with unambiguous predictions. The direction of the hypothesis root is taken to be the direction that results in the higher number of successful predictions. The stronger the biases of the supporting evidence in favor of a hypothesis root direction, the more concordant the hypothesis. The significance of this bias is calculated as a p-value based on the binomial distribution. 
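The Richness calculation can be sketched in Python with the standard library alone. The symbols N, M, P, and Q follow the text; computing the upper tail (Q or more overlapping changes) is an assumption that matches common over-representation practice, and the function name is hypothetical.

```python
from math import comb


def richness_pvalue(N, M, P, Q):
    """Hypergeometric probability of observing Q or more changed genes
    among P predictions, given N measurements and M significant changes.
    """
    upper = min(M, P)
    # comb() returns 0 when more genes are requested than are available,
    # so impossible terms drop out of the sum.
    tail = sum(comb(M, q) * comb(N - M, P - q) for q in range(Q, upper + 1))
    return tail / comb(N, P)
```

In the scenario described above, 1% of 1000 measured genes changing (M = 10) with 5 of 50 predictions observed to change (10%), `richness_pvalue(1000, 10, 50, 5)` is well below 0.05.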
For hypothesis A, if K is the number of state changes supporting increased A and J is the number of state changes supporting decreased A, then the Concordance of A is the probability of making H or more (H = max(J, K)) correct predictions out of (J + K) total predicted and observed changes, given that the null probability that a prediction (increase or decrease) matches the observed change (increase or decrease) is 0.5. Richness and Concordance are metrics of the significance of a hypothesis: whether the hypothesis can explain more of the observed changes than would be expected by chance and whether its predictions are more consistent than would be expected by chance.
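The Concordance definition translates directly into a binomial tail sum. The sketch below uses the symbols K, J, and H from the text; the function name, the tie-breaking rule when K equals J, and the return value for a hypothesis with no unambiguous predictions are assumptions.

```python
from math import comb


def concordance(K, J):
    """Return (root_direction, p_value) for a hypothesis.

    K: state changes supporting an increased root.
    J: state changes supporting a decreased root.
    The p-value is P(X >= H) for X ~ Binomial(J + K, 0.5), H = max(J, K).
    """
    n = J + K
    direction = "increase" if K >= J else "decrease"  # ties: assumption
    if n == 0:
        return direction, 1.0  # nothing to test (assumption)
    H = max(J, K)
    p_value = sum(comb(n, h) for h in range(H, n + 1)) / 2 ** n
    return direction, p_value
```

For instance, five changes that all support an increased root give a p-value of 1/32, whereas three supporting and one contradicting give 5/16.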
In the second stage of the analysis, the hypotheses produced for each dataset were filtered for significance and presented in an analysis interface that facilitated the sorting of hypotheses by multiple criteria, comparison of hypotheses between experiments, and the investigation of the literature citations supporting each hypothesis. A hypothesis was considered to be statistically (although not necessarily biologically) significant if it met Richness and Concordance probability cutoffs of 0.05, and marginally significant if it met Richness and Concordance probability cutoffs of 0.1. Scientists using this analysis interface selected hypotheses for inclusion in the explanatory networks and eventually the CNM based on criteria including (1) whether the nodes in the hypothesis were causally linked to phenotypes and processes observed in the study, (2) whether the hypothesis node was causally downstream from GSK690693, (3) whether the root node was causally connected to other hypothesis root nodes, and (4) whether the root node itself was an increased or decreased state change. The four mechanisms presented in this report are the networks common to the Causal Network Models constructed for each treatment of sensitive cell lines in cell culture and xenografts. A summary overview of the Causal Reasoning methodology is shown in Additional file 13: Figure S5. References for all data in the additional files can be found in Additional file 14: Table S9.
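The significance filter applied before manual review can be sketched as follows. This covers only the automated cutoffs; the scientist-driven selection criteria are not modeled. The function names, the inclusive comparisons, and the ordering by the larger of the two p-values are assumptions, and the hypothesis names in the usage note are hypothetical.

```python
def significance_label(richness_p, concordance_p, strict=0.05, marginal=0.1):
    """Label a hypothesis by its Richness and Concordance p-values.

    Both p-values must meet a cutoff for the label to apply.
    """
    worst = max(richness_p, concordance_p)
    if worst <= strict:
        return "significant"
    if worst <= marginal:
        return "marginally significant"
    return "not significant"


def filter_hypotheses(hypotheses):
    """Keep significant and marginally significant hypotheses.

    hypotheses: dict mapping a name to (richness_p, concordance_p).
    Returns (name, label) pairs ordered by the larger p-value, which is
    one possible sort key among the several the interface supported.
    """
    kept = []
    for name, (richness_p, concordance_p) in hypotheses.items():
        label = significance_label(richness_p, concordance_p)
        if label != "not significant":
            kept.append((max(richness_p, concordance_p), name, label))
    kept.sort()
    return [(name, label) for _, name, label in kept]
```

Given `{"hyp_A": (0.001, 0.03), "hyp_B": (0.08, 0.02), "hyp_C": (0.2, 0.01)}`, the filter keeps `hyp_A` as significant and `hyp_B` as marginally significant, and drops `hyp_C` because its Richness p-value exceeds 0.1.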