Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy
© O’Reilly et al.; licensee BioMed Central. 2014
Received: 14 October 2014
Accepted: 11 December 2014
Published: 19 December 2014
Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach however is not well suited to identify interaction between genes whose protein products potentially influence each other, which limits its power to identify molecular wiring of tumour cells dictating response to a drug. Due to the fact that signal transduction pathways are not linear and highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression.
Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment “R” with backward elimination was used to identify the key predictors of TRAIL sensitivity and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operator curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC = 0 · 84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. Importantly, only 12% of the TRAIL-predictor genes were differentially expressed highlighting the importance of functional interactions in predicting the biological response.
The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct- and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.
KeywordsTRAIL Biomarker Gene expression Random forest
Tumor necrosis factor-related apoptosis-inducing ligand (TRAIL, TNFSF10), a member of the TNF cytokine family, is an emerging therapeutic option for various cancers. Due to the tumor-specific cytotoxicity of TRAIL, its recombinant version and agonistic antibodies against the death-inducing TRAIL receptors (TNFRSF10A/DR4, TNFRSF10B/DR5) are currently being tested in Phase I/II clinical trials. The basis for the tumor-specific action of TRAIL is that during malignant transformation cells become sensitive to TRAIL . During later progression however, tumors can re-acquire resistance to evade immune-mediated killing and thus, prediction of tumor TRAIL-responsiveness is critical . While TRAIL can be a very potent tumoricidal agent due to its ability to target both the tumor cells and the tumor vasculature , administration of TRAIL to resistant tumors may trigger invasiveness and promote metastasis [4–6] further highlighting the need for robust biomarkers predicting TRAIL-responsiveness. While the TRAIL-induced apoptotic machinery is well studied and a number of regulatory mechanisms have been identified, none of them have proven to be useful as a predictive marker.
Owing to its high sensitivity and full coverage of the human genome, transcriptomics is one of the most widely applied tools for biomarker research by selecting genes that are differentially expressed in the majority of the samples, or in a specific subgroup. This gene-by-gene approach however cannot address the question of how genes relate to each other in determining the biological outcome. Often the change in the expression of key regulatory genes is minor, or it is only significant in a small subset within a diverse sample cohort, or its effect on the phenotype is conditional on the expression level of other genes. Thus the relative expression of genes acting/participating in the same biological process and their combinatorial analysis may better describe the behavior of a cell and predict the response to a stimulus.
Here we show that it is possible to predict TRAIL sensitivity with high accuracy by using hierarchical decision tree analysis of transcriptomic data. We found that the identified predictor genes are not differentially expressed, but instead they have linked biological functions (including inhibition, activation, induction of expression etc.). We report here that these “co-acting” gene clusters can be identified from transcriptomic data and these co-acting genes could predict TRAIL-responsiveness with a much higher degree of sensitivity and specificity (AUC = 0.84) than the currently available best-performing gene signature predicting TRAIL-responsiveness (AUC of 0.72) .
The focus of the study was to identify genes that predict TRAIL-responsiveness and analyse their correlation and functional interactions, such as regulation by direct interaction, post-translational modification, induction or repression of expression, induction of degradation etc. (analysis workflow is shown in Additional file 1: Figure S1).
The percentage of genes which had at least one co-regulated gene partner
Biomarkers are pillars of diagnostic biology both for detection and prognosis. In the last number of years there has been a paradigm shift from the identification of prognostic and diagnostic biomarkers to biomarkers that can predict treatment efficacy. This refocusing has been facilitated by the advent of high-throughput technologies such as genome-wide association analysis, transcriptomics or metabolomics and already resulted theranostic diagnostics, such as the Oncotype-DX biomarker panel for treatment-identification for breast cancer patients . Differential gene expression patterns from transcriptome studies can be also used to identify drugs that have the potential to reverse an unfavorable phenotype, such as drug-resistance with the help of the computational algorithm Connectivity mapping (Broad Institute of Massachusetts Institute of Technology, Harvard University) .
Biomarkers to predict the functionality of the TRAIL apoptotic pathway in cancer cells are becoming increasingly important with emerging promising phase I trials and new pre-clinical studies showing the potency of TRAIL on the tumor vasculature and synergistic DR5-activation by the combination of TRAIL and agonistic DR5 antibody (AMG655, Amgen) in ovarian cancer [3, 13]. Some of the pathway regulators have been indicated as potential biomarkers, including GalNT14, c-FLIP, DcR1 or DcR2 as individual markers [14–19] or groups of differentially expressed genes . Currently the best classifier of TRAIL sensitivity is a 71-gene signature of genes differentially expressed between TRAIL-resistant and sensitive tumor cell lines with a prediction accuracy of AUC = 0 · 72 .
Here we show that by using machine-learning methods we could predict the TRAIL-responsiveness of cancer cell types with an accuracy superior to any of the current markers. Importantly, most of the identified genes were not differentially expressed; instead 40% of them were linked based on their biological function.
The advantage of the co-acting gene clusters in prediction can be due to their ability to capture the non-linear nature of signal transduction pathways, which one-dimensional analyses, such as differential expression poorly reflect. It is also well established that not all cells utilize the same mechanism to block apoptosis [1, 19]. Ensembles of decision tree models can follow and identify branching gene interactions and are able to simultaneously test a number of potential routes of co-acting gene linkages that describe the cell’s phenotype even in highly diverse sample sets. Finally, unlike differential expression analyses, co-acting gene clusters do not exclude genes which are differentially expressed only in a small subset of a diverse sample population.
In comparison, the well-documented components of the TRAIL pathway were poor predictors (AUC = 0.74). Complementing this, out of the 350 co-acting genes only six, namely DR4, DR5, DcR2, osteoprotegerin, Caspase-8, and heme-oxidized IRP2 ubiquitin ligase 1 (HOIL1) are well characterized components of TRAIL signaling (Additional file 2: Table S2). Another 14 genes linked to TRAIL-sensitivity by at least one study are present in the panel, including mixed-lineage kinase domain-like (MLKL), DNA-binding death effector domain-containing protein 2 (DEDD2), NADPH oxidase organizer 1, the serine/threonine kinase Ataxia telangiectasia mutated (ATM), polypeptide N-acetylgalactosaminyltransferase-14 and cyclin-dependent kinase 2 (please refer to Additional file 2: Table S2 for full list). The remaining 336 genes have not been associated with TRAIL. These genes may be co-regulated with other genes that regulate TRAIL sensitivity, but have no effect on the pathway (bystanders), although it is more likely that many of them directly or indirectly regulate TRAIL-sensitivity.
IPA functional pathway analysis revealed that 40% of the identified genes have already been reported to interact either directly or indirectly. Many co-acting genes are upstream regulators of well-documented regulators of the TRAIL signaling machinery, such as NF-κB, p53, or AKT. In addition to these known TRAIL-signaling regulators, other nodal points in the signaling networks were proteins that have established roles in cancer progression and/or resistance to cytotoxic drugs but have not been implicated in TRAIL-induced apoptosis previously. For example KDM5B (Histone demethylase JARID1B) is overexpressed in many cancers including breast, prostate, and lung cancer as well as melanoma where it confers resistance to apoptotic stimuli such as cisplatin and vemurafenib [20, 21]. Recent reports suggest that KDM5B functions through E2F1/2, which is known to be able to modulate sensitivity to TRAIL [22, 23]. Another example is NUPR1 (Nuclear protein 1), known to be highly expressed in a wide range of cancerous malignancies and its expression has been inversely correlated with the induction of apoptosis by various compounds such as doxorubicin [24, 25].
We show that a random forest classifier based on gene expression performs significantly better than previously reported biomarkers in predicting sensitivity to TRAIL in tumor cell lines perhaps because it allows a more flexible description of the molecular networks present in individual cells or cell types. These findings shed light on why previous studies failed to find a reliable marker of TRAIL sensitivity and also pinpoint potential novel regulators of the pathway.
Materials and methods
Analysis of microarray data
Raw transcriptome microarray data for 109 cell lines from Gene Expression Omnibus (GEO) accession number GSE8332 (training dataset)  and for an additional 40 tumor cell lines (NIH CellMiner, test dataset) was acquired in CEL file format. Background correction and normalization of the datasets has been performed using the RMA algorithm in the Affy package of Bioconductor  in the statistical environment, R (version 2.10.1). Noise in the analysis caused by multiple probesets per gene was reduced by identifying the probeset best representing each gene based on scoring and ranking the probesets for specificity, splice isoform coverage and robustness against degeneration using the JetSet algorithm .
Differentially expressed genes were identified using Genespring GX v11. After importing the CEL files, gene expression values were transformed (Log2) and normalized to the 75th percentile using the RMA algorithm. For each cell line the genes whose expression was above the expression value of the 20th percentile across all cell lines were selected. These genes were filtered by retaining only those which were present in at least 75% of the cell lines and/or exhibited a greater than 2 fold change in expression between TRAIL sensitive and resistant cell lines. One-way ANOVA unequal variance (Welch) was used to test for significance using a cut-off value of p ≤ 0.05. Multiple testing corrections were done by Benjamini-Hochberg.
Determining TRAIL sensitivity of cancer cell lines
TRAIL sensitivity of the cell lines was assessed and described by Wagner and colleagues using MTT assay . A cell line was defined as being sensitive if 1 ug/ml recombinant human TRAIL (rhTRAIL) reduced viability to 50% within 72 h or less and resistant if the reduction in viability was less than 50%.
Classification of tumor cell lines
Jetset-identified mRNA probesets for each gene were extracted from the training microarray data and were used for classifying the cell line samples into TRAIL-sensitive and resistant groups using random jungle, a parallel implementation of random forest (RF) modeling reducing computational time for high dimensional datasets (number of classification trees = 10,000, mtry = default). In order to avoid imbalance between the number of sensitive and resistant cell lines during training, the sample number to grow the trees was set to 30 sensitive and 30 resistant cell lines and the remaining samples were used to determine the OOB error rate (sampsize). The genes were ranked by mean decrease in Gini-importance. In order to determine the core co-acting gene-set, the lowest ranking genes were iteratively removed and the model’s performance tested by determining out of bag (OOB) error rate. Performance of the final model was measured by predicting TRAIL-responsiveness of the test dataset and calculating the area under the receiver operator curve (AUC) from 10 repeats. All statistical analyses were carried out using Random Jungle  and RandomForest  packages in the R (version 3 · 0 · 3) environment (http://www.r-project.org/).
Gene relationship analysis
The strength of the statistical interactions between genes was measured with q-order partial correlation using the qpgraph package in the Bioconductor project in R (version 3 · 0 · 3) . This analysis determines interaction between two genes while filtering out potential effects of other genes (number of genes for filtering is the q value). Non-rejection rates were calculated from the normalized gene expression data using q values ranging between 1–105 q-order partial correlations. Gene regulatory networks where nodes represent genes and edges represent the partial correlation between the genes were graphed using the igraph package in R (version 3 · 0 · 3).
Ingenuity pathway analysis (IPA)
Biological interactions and inter-relationship amongst the co-acting genes were analyzed using the Ingenuity IPA network analysis tool (Qiagen) by performing core analysis with the following settings: reference set: human genome U133 plus 2 · 0 Array, included both direct and indirect relationships experimentally observed limited to human with the maximum number of network components set to 140 genes. Statistical significance for canonical pathway and network analysis was determined by Fisher’s exact test.
We would like to thank Paul Korir for helpful discussion and application of the machine learning algorithms. This was work supported by Science foundation Ireland (SFI) SIRG grant 09/SIRG/B1575 and Irish Cancer Society Research Fellowship to ES (CRF09SZE).
- van Dijk M, Halpin-McCormick A, Sessler T, Samali A, Szegezdi E: Resistance to TRAIL in non-transformed cells is due to multiple redundant pathways. Cell death & disease. 2013, 4: e702-10.1038/cddis.2013.214.View ArticleGoogle Scholar
- Mahalingam D, Szegezdi E, Keane M, de Jong S, Samali A: TRAIL receptor signalling and modulation: Are we on the right TRAIL?. Cancer Treat Rev. 2009, 35 (3): 280-288. 10.1016/j.ctrv.2008.11.006.PubMedView ArticleGoogle Scholar
- Wilson NS, Yang A, Yang B, Couto S, Stern H, Gogineni A, Pitti R, Marsters S, Weimer RM, Singh M, Ashkenazi A: Proapoptotic activation of death receptor 5 on tumor endothelial cells disrupts the vasculature and reduces tumor growth. Cancer Cell. 2012, 22 (1): 80-90. 10.1016/j.ccr.2012.05.014.PubMedView ArticleGoogle Scholar
- Wang H, Xu C, Kong X, Li X, Kong X, Wang Y, Ding X, Yang Q: Trail Resistance Induces Epithelial-Mesenchymal Transition and Enhances Invasiveness by Suppressing PTEN via miR-221 in Breast Cancer. PLoS One. 2014, 9 (6): e99067-10.1371/journal.pone.0099067.PubMed CentralPubMedView ArticleGoogle Scholar
- Azijli K, Yuvaraj S, Peppelenbosch MP, Wurdinger T, Dekker H, Joore J, van Dijk E, Quax WJ, Peters GJ, de Jong S, Kruyt FA: Kinome profiling of non-canonical TRAIL signaling reveals RIP1-Src-STAT3-dependent invasion in resistant non-small cell lung cancer cells. J Cell Sci. 2012, 125 (Pt 19): 4651-4661.PubMedView ArticleGoogle Scholar
- Fingas CD, Blechacz BR, Smoot RL, Guicciardi ME, Mott J, Bronk SF, Werneburg NW, Sirica AE, Gores GJ: A smac mimetic reduces TNF related apoptosis inducing ligand (TRAIL)-induced invasion and metastasis of cholangiocarcinoma cells. Hepatology. 2010, 52 (2): 550-561. 10.1002/hep.23729.PubMed CentralPubMedView ArticleGoogle Scholar
- Chen JJ, Knudsen S, Mazin W, Dahlgaard J, Zhang B: A 71-gene signature of TRAIL sensitivity in cancer cells. Mol Cancer Ther. 2012, 11 (1): 34-44. 10.1158/1535-7163.MCT-11-0620.PubMedView ArticleGoogle Scholar
- Li Q, Birkbak NJ, Gyorffy B, Szallasi Z, Eklund AC: Jetset: selecting the optimal microarray probe set to represent a gene. BMC Bioinformatics. 2011, 12: 474-10.1186/1471-2105-12-474.PubMed CentralPubMedView ArticleGoogle Scholar
- Schwarz DF, Konig IR, Ziegler A: On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics. 2010, 26 (14): 1752-1758. 10.1093/bioinformatics/btq257.PubMed CentralPubMedView ArticleGoogle Scholar
- Wiener : Classification and Regression by randomForest. R News. 2002, 2: 18-22.Google Scholar
- Toole MJ, Kidwell KM, Van Poznak C: Oncotype dx results in multiple primary breast cancers. Breast Cancer (Auckl). 2014, 8: 1-6.Google Scholar
- Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander ES, Golub TR: The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006, 313 (5795): 1929-1935. 10.1126/science.1132939.PubMedView ArticleGoogle Scholar
- Lemke J, von Karstedt S, Zinngrebe J, Walczak H: Getting TRAIL back on track for cancer therapy. Cell Death Differ. 2014, 21 (9): 1350-1364. 10.1038/cdd.2014.81.PubMed CentralPubMedView ArticleGoogle Scholar
- Yoldas B, Ozer C, Ozen O, Canpolat T, Dogan I, Griffith TS, Sanlioglu S, Ozluoglu LN: Clinical significance of TRAIL and TRAIL receptors in patients with head and neck cancer. Head Neck. 2011, 33 (9): 1278-1284. 10.1002/hed.21598.PubMedView ArticleGoogle Scholar
- Ryu BK, Lee MG, Chi SG, Kim YW, Park JH: Increased expression of cFLIP(L) in colonic adenocarcinoma. J Pathol. 2001, 194 (1): 15-19. 10.1002/path.835.PubMedView ArticleGoogle Scholar
- Bullani RR, Huard B, Viard-Leveugle I, Byers HR, Irmler M, Saurat JH, Tschopp J, French LE: Selective expression of FLIP in malignant melanocytic skin lesions. J Invest Dermatol. 2001, 117 (2): 360-364. 10.1046/j.0022-202x.2001.01418.x.PubMedView ArticleGoogle Scholar
- Okano H, Shiraki K, Inoue H, Kawakita T, Yamanaka T, Deguchi M, Sugimoto K, Sakai T, Ohmori S, Fujikawa K, Murata K, Nakano T: Cellular FLICE/caspase-8-inhibitory protein as a principal regulator of cell death and survival in human hepatocellular carcinoma. Lab Invest. 2003, 83 (7): 1033-1043. 10.1097/01.LAB.0000079328.76631.28.PubMedView ArticleGoogle Scholar
- Tepper CG, Seldin MF: Modulation of caspase-8 and FLICE-inhibitory protein expression as a potential mechanism of Epstein-Barr virus tumorigenesis in Burkitt's lymphoma. Blood. 1999, 94 (5): 1727-1737.PubMedGoogle Scholar
- Mahalingam D, Oldenhuis CN, Szegezdi E, Giles FJ, de Vries EG, de Jong S, Nawrocki ST: Targeting TRAIL towards the clinic. Curr Drug Targets. 2011, 12 (14): 2079-2090. 10.2174/138945011798829357.PubMedView ArticleGoogle Scholar
- Sayegh J, Cao J, Zou MR, Morales A, Blair LP, Norcia M, Hoyer D, Tackett AJ, Merkel JS, Yan Q: Identification of small molecule inhibitors of Jumonji AT-rich interactive domain 1B (JARID1B) histone demethylase by a sensitive high throughput screen. J Biol Chem. 2013, 288 (13): 9408-9417. 10.1074/jbc.M112.419861.PubMed CentralPubMedView ArticleGoogle Scholar
- Roesch A, Vultur A, Bogeski I, Wang H, Zimmermann KM, Speicher D, Korbel C, Laschke MW, Gimotty PA, Philipp SE, Krause E, Patzold S, Villanueva J, Krepler C, Fukunaga-Kalabis M, Hoth M, Bastian BC, Vogt T, Herlyn M: Overcoming intrinsic multidrug resistance in melanoma by blocking the mitochondrial respiratory chain of slow-cycling JARID1B(high) cells. Cancer Cell. 2013, 23 (6): 811-825. 10.1016/j.ccr.2013.05.003.PubMedView ArticleGoogle Scholar
- Nijwening JH, Geutjes EJ, Bernards R, Beijersbergen RL: The histone demethylase Jarid1b (Kdm5b) is a novel component of the Rb pathway and associates with E2f-target genes in MEFs during senescence. PLoS One. 2011, 6 (9): e25235-10.1371/journal.pone.0025235.PubMed CentralPubMedView ArticleGoogle Scholar
- Yoo J, Lee YJ: Aspirin enhances tumor necrosis factor-related apoptosis-inducing ligand-mediated apoptosis in hormone-refractory prostate cancer cells through survivin down-regulation. Mol Pharmacol. 2007, 72 (6): 1586-1592. 10.1124/mol.107.039610.PubMedView ArticleGoogle Scholar
- Clark DW, Mitra A, Fillmore RA, Jiang WG, Samant RS, Fodstad O, Shevde LA: NUPR1 interacts with p53, transcriptionally regulates p21 and rescues breast epithelial cells from doxorubicin-induced genotoxic stress. Curr Cancer Drug Targets. 2008, 8 (5): 421-430. 10.2174/156800908785133196.PubMedView ArticleGoogle Scholar
- Cano CE, Hamidi T, Sandi MJ, Iovanna JL: Nupr1: the Swiss-knife of cancer. J Cell Physiol. 2011, 226 (6): 1439-1443. 10.1002/jcp.22324.PubMedView ArticleGoogle Scholar
- Wagner KW, Punnoose EA, Januario T, Lawrence DA, Pitti RM, Lancaster K, Lee D, von Goetz M, Yee SF, Totpal K, Huw L, Katta V, Cavet G, Hymowitz SG, Amler L, Ashkenazi A: Death-receptor O-glycosylation controls tumor-cell sensitivity to the proapoptotic ligand Apo2L/TRAIL. Nat Med. 2007, 13 (9): 1070-1077. 10.1038/nm1627.PubMedView ArticleGoogle Scholar
- Gautier L, Cope L, Bolstad BM, Irizarry RA: affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004, 20 (3): 307-315. 10.1093/bioinformatics/btg405.PubMedView ArticleGoogle Scholar
- Castelo R, Roverato A: Reverse engineering molecular regulatory networks from microarray data with qp-graphs. J Comput Biol. 2009, 16 (2): 213-227. 10.1089/cmb.2008.08TT.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.