Exploring the key genes and signaling transduction pathways related to the survival time of glioblastoma multiforme patients by a novel survival analysis model
BMC Genomics volume 18, Article number: 950 (2017)
Abstract
Background
This study explores the key genes and signaling transduction pathways related to the survival time of glioblastoma multiforme (GBM) patients.
Results
Our results not only showed that the GBM survival-time-related genes and signaling transduction pathways selected in common by all strategies are closely related to GBM, but also demonstrated that our constrained optimization algorithm (the CoxSisLasso strategy) performs better than the classical methods (the CoxLasso and CoxSis strategies).
Conclusion
We analyzed why the CoxSisLasso strategy can outperform the existing classical methods and discussed how to extend this research in the future.
Background
Glioblastoma multiforme (GBM) is the most common and malignant brain tumor [1–3]. Because GBM is highly invasive and intermingled with healthy brain tissue, it is almost impossible to remove the tumor without causing serious consequences [4]. Moreover, GBM relapses readily [5, 6]. The median survival and progression-free survival times of GBM patients are 14.6 and 6.9 months, respectively, and the 5-year survival rate is 9.8% [7]. Previous studies [8–10] indicated that gene mutation is one of the most important factors in GBM development. Therefore, gene expression analysis can be used not only to discover the abnormal gene expression associated with GBM gene mutations, but also to discover gene signatures that help us investigate the related signaling transduction pathways. Results from the pathway analysis can lay the foundation for future research on targeted drugs for GBM.
As one of the important survival analysis methods, the Cox proportional hazards model [11] is broadly employed to investigate the connections between various covariates and the length of life. However, the classical Cox proportional hazards model [12] can only process survival data in which the dimension of the factors (P) is smaller than the number of samples (N) [13] (we call this P << N data). It cannot handle survival data in which the dimension of the factors is greater than the number of samples, such as gene expression data [13] (we call this P >> N data). To process P >> N data, Tibshirani et al. [14] integrated the Lasso algorithm, one of the constrained optimization methods, into the classical Cox proportional hazards model [15] to select the key predictors. However, Fan et al. [16] pointed out that if the number of predictors is much greater than the sample size (P >> N), a precleaning step using a computationally expedient screening procedure is often preferred to increase the accuracy of the algorithm. Thus, Fan et al. [16] developed the sure independence screening (SIS) method, which fits a marginal Cox regression model for each covariate and screens out covariates using a prespecified threshold. Nevertheless, as reported by Hong et al. [17], marginal screening may have difficulty identifying hidden variables that are important jointly but not marginally, which leads to false negatives. Therefore, Hong et al. [17] proposed a conditional SIS method to explore the potential predictors for the regular linear system, but they did not consider survival data. On the other hand, developing a systematic approach to identify target generic drugs for cancer treatment has already become a popular research field [18, 19]. However, to our knowledge, no recent research discusses the inherent connection between survival time and the target generic drug in detail.
To overcome the shortcomings of this previous research, this study proposes a multiscale gene and signaling transduction pathway exploration platform (Fig. 1) with the following three innovations. Firstly, we analyzed the clinical GBM gene expression and survival time data [20] to investigate the inherent relation between gene signatures and the survival time of GBM patients. Secondly, we not only integrated a constrained optimization method, Lasso [15], into the classical Cox proportional hazards model [13] to explore survival-time-related key genes from P >> N data, but also used the SIS algorithm to improve the predictive accuracy. Thirdly, we employed the KOBAS database [21] and the hypergeometric test [22] to investigate the GBM signaling transduction pathways associated with the explored survival-time-related key genes. These survival-time-related signaling transduction pathways can then help us to bridge the relation between targeted drugs and the survival time of GBM patients.
The clinical GBM gene expression and survival data set used in this study was downloaded from the Georgetown Database of Cancer GDOC [20] and has 54,675 features (P) and 227 samples (N). To handle such P >> N data, we developed the CoxSisLasso strategy. Firstly, it integrates a constrained optimization method, Lasso, into the classical Cox regression model to select a prior set of genes with a potentially great impact on the patients' survival time. Secondly, conditioned on the genes selected by Lasso, the conditional SIS method [23] is used to reselect possible genes from those screened out in the first step. To bridge the relation between targeted drugs and the survival time of GBM patients, we employed the KOBAS [21] application and the explored GBM survival-time-related key genes to investigate which signaling transduction pathways closely correlate with GBM survival time.
In general, this study developed a multiscale gene and signaling transduction pathway exploration algorithm that can not only investigate the molecular mechanism linking the key genes and cancer patients' survival time, but also employ a database based on the hypergeometric distribution (KOBAS) to look for the related signaling pathways at the proteomics level for future targeted cancer therapy [24, 25]. Manually reviewed experimental evidence showed that the GBM survival-time-related genes [26–38] and signaling transduction pathways [39–52] selected in common by all strategies are closely related to GBM. In addition, the results demonstrate that our proposed CoxSisLasso strategy has the best predictive power and model fitting capacity compared with the CoxLasso and CoxSis strategies developed by Tibshirani et al. [14] and Fan et al. [16], respectively. Finally, we theoretically analyze why the CoxSisLasso strategy outperforms CoxLasso and CoxSis and discuss further research.
Methods
Materials
We used a multi-study microarray database of GBM expression profiles (n = 227) from the Georgetown Database of Cancer GDOC [20], based on the Affymetrix U133 Plus 2.0 GeneChip microarray platform. The GBM microarray datasets are listed in Table 1.
Data filtering
The original microarray datasets were normalized and preprocessed with an R software package [53]. After the preprocessing step, 227 samples and 54,675 genes remained in the data matrix. Next, an interquartile range (IQR) threshold [54] was employed to screen out the genes with small variance. After that, only 227 samples and 10,992 genes remained in the GBM gene expression and survival time data matrix.
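The IQR screening step can be illustrated with a small sketch. The probe names, the toy values and the threshold of 0.5 below are hypothetical; the source does not state the threshold that was actually used, and the original analysis was done in R rather than Python.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_by_iqr(expression, threshold):
    """Keep genes whose IQR across samples exceeds the threshold.

    `expression` maps a gene/probe name to its expression values,
    one value per sample.
    """
    return {gene: vals for gene, vals in expression.items()
            if iqr(vals) > threshold}


# Toy data: three hypothetical probes measured on five samples.
toy = {
    "probe_a": [5.0, 5.1, 5.0, 4.9, 5.0],   # nearly constant
    "probe_b": [2.0, 6.0, 4.0, 8.0, 10.0],  # highly variable
    "probe_c": [7.0, 7.2, 6.8, 7.1, 6.9],   # nearly constant
}
kept = filter_by_iqr(toy, threshold=0.5)  # hypothetical threshold
print(sorted(kept))
```

Only the highly variable probe survives the filter in this toy case, which mirrors how the screening reduces the matrix before model fitting.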
Cox proportional hazards model
Survival analysis [11, 55] deals with the analysis of the time until one or more events happen. As one of the most widely used survival analysis methods, the Cox proportional hazards model [13] is used to analyze time-to-event data with both censored observations and covariates. It assumes a semiparametric form for the hazard (Eq. 1).
$$ {h}_i(t)={h}_0(t) \exp \left({x}_i^T\beta \right) $$(1)
where \( {h}_i(t) \) is the hazard for patient i at time t, \( {h}_0(t) \) is a shared baseline hazard function, β is an unknown p-dimensional regression coefficient vector and \( {x}_i \) is a vector of potential predictors for the i-th individual. Based on the available samples, the estimator \( \widehat{\beta} \) of the unknown coefficients can be obtained by maximizing the log partial likelihood function (Eq. 2).
$$ \widehat{\beta}=\underset{\beta }{ \arg \max}\sum_{k\in D}\left[{x}_k^T\beta - \log \left(\sum_{j\in {R}_k} \exp \left({x}_j^T\beta \right)\right)\right] $$(2)
where D is the set of indices of the events and \( {R}_k \) denotes the set of indices of the individuals at risk at time \( {t}_k \).
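As a minimal sketch, the log partial likelihood described above (for each observed event, the linear predictor minus the log of the summed exponentiated linear predictors over the risk set) can be evaluated directly. The function name and the toy data are illustrative, and the sketch assumes there are no tied event times.

```python
import math


def cox_log_partial_likelihood(times, events, X, beta):
    """Log partial likelihood of a Cox model, assuming no tied event times.

    times  : observed follow-up time of each subject
    events : 1 if the event was observed, 0 if the subject was censored
    X      : list of covariate rows, one row per subject
    beta   : list of regression coefficients
    """
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for k, (t_k, d_k) in enumerate(zip(times, events)):
        if not d_k:
            continue  # censored subjects only enter through the risk sets
        # Risk set R_k: everyone still under observation at time t_k.
        risk_sum = sum(math.exp(eta[j])
                       for j, t_j in enumerate(times) if t_j >= t_k)
        total += eta[k] - math.log(risk_sum)
    return total


# Toy data: three subjects, one covariate, the last subject is censored.
times = [1.0, 2.0, 3.0]
events = [1, 1, 0]
X = [[0.5], [1.0], [-1.0]]
ll_zero = cox_log_partial_likelihood(times, events, X, [0.0])
print(ll_zero)
```

With β = 0 every subject in a risk set is equally likely to fail, so each event contributes minus the log of the risk-set size; this gives a quick sanity check of an implementation.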
Because this study deals with P >> N data, the classical Cox proportional hazards regression method [13] cannot be applied directly to the GBM gene expression data matrix. Therefore, the following sections propose three variable selection strategies to obtain a sparse regression coefficient vector.
Combined Cox and Lasso (CoxLasso) strategy
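To obtain a sparse solution for the parameter β in the Cox proportional hazards model (Eq. 1), we integrate a constrained optimization method, the Lasso proposed by Tibshirani et al. [14], into the classical Cox proportional hazards model. It minimizes the negative log partial likelihood subject to the sum of the absolute values of the parameters being bounded by a constant s (Eq. 3).
$$ \underset{\beta }{ \min}\left\{-\sum_{k\in D}\left[{x}_k^T\beta - \log \left(\sum_{j\in {R}_k} \exp \left({x}_j^T\beta \right)\right)\right]\right\}\kern1em \mathrm{subject}\ \mathrm{to}\kern0.5em \sum_{j=1}^p\left|{\beta}_j\right|\le s $$(3)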
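This is equivalent to the following optimization problem (Eq. 4)
$$ \underset{\beta }{ \min}\left\{-\sum_{k\in D}\left[{x}_k^T\beta - \log \left(\sum_{j\in {R}_k} \exp \left({x}_j^T\beta \right)\right)\right]+\lambda \sum_{j=1}^p\left|{\beta}_j\right|\right\} $$(4)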
where λ is the tuning parameter that controls the sparsity of the estimator. This research used the R package glmnet developed by Friedman et al. [56] to implement the combined Cox and Lasso (CoxLasso) strategy (Fig. 2a), using cross-validation to choose the tuning parameter.
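To make the penalized problem concrete, the following sketch minimizes the negative log partial likelihood plus λ times the sum of absolute coefficients with a plain proximal gradient (soft-thresholding) loop. This is not how glmnet works internally (glmnet uses coordinate descent and its own scaling of the likelihood); the step size, iteration count and toy data are assumptions chosen for illustration, and no tied event times are assumed.

```python
import math


def neg_log_pl_gradient(times, events, X, beta):
    """Gradient of the negative log partial likelihood (no tied event times)."""
    p = len(beta)
    w = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
    grad = [0.0] * p
    for k, (t_k, d_k) in enumerate(zip(times, events)):
        if not d_k:
            continue
        risk = [j for j, t_j in enumerate(times) if t_j >= t_k]
        denom = sum(w[j] for j in risk)
        for m in range(p):
            weighted_mean = sum(w[j] * X[j][m] for j in risk) / denom
            grad[m] -= X[k][m] - weighted_mean
    return grad


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def cox_lasso(times, events, X, lam, step=0.05, n_iter=2000):
    """Proximal gradient descent for the Lasso-penalized Cox objective."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = neg_log_pl_gradient(times, events, X, beta)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


# Toy data: a larger covariate value goes with an earlier event.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
X = [[2.0], [1.0], [0.0]]
beta_sparse = cox_lasso(times, events, X, lam=2.0)  # heavy penalty
beta_active = cox_lasso(times, events, X, lam=0.5)  # lighter penalty
print(beta_sparse, beta_active)
```

A large λ forces the coefficient to exactly zero, while a smaller λ lets it enter the model; in practice λ would be chosen by cross-validation, as stated above.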
Combined Cox and SIS (CoxSis) strategy
Although directly integrating the Lasso method into the Cox model can process P >> N data, it may encounter problems with speed, stability and accuracy once the dimension of the covariates is ultrahigh [23]. Therefore, it is often preferable to employ a simple and computationally efficient screening procedure to reduce the data dimensionality to a moderate size before using the Lasso method. The combined Cox and SIS (CoxSis) strategy consists of the following steps:

Step 1: Fit a marginal Cox regression model for each covariate \( {x}_m \) to obtain \( {\widehat{\beta}}_m \) by Eq. 5.1
$$ {\widehat{\beta}}_m=\underset{\beta_m}{ \arg \max}\sum_{k\in D}\left[{x}_{km}{\beta}_m- \log \left(\sum_{j\in {R}_k} \exp \left({x}_{jm}{\beta}_m\right)\right)\right] $$(5.1)
Step 2: Rank the magnitudes \( \left|{\widehat{\beta}}_m\right|,m=1,2,\dots, p \) in decreasing order and keep the d top-ranked covariates.

Step 3: Denote the index set of the selected covariates by Θ. Implement Lasso with the selected d covariates by minimizing Eq. 5.2
$$ \underset{\beta_{\Theta}}{ \min}\left\{-\sum_{k\in D}\left[{x}_{k,\Theta}^T{\beta}_{\Theta}- \log \left(\sum_{j\in {R}_k} \exp \left({x}_{j,\Theta}^T{\beta}_{\Theta}\right)\right)\right]+\lambda \sum_{j\in \Theta}\left|{\beta}_j\right|\right\} $$(5.2)
This study employs the R package SIS developed by Fan et al. [16] to implement the combined Cox and SIS (CoxSis) strategy (Fig. 2b).
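The screening part of the procedure (Steps 1 and 2) can be sketched as follows. Each marginal one-covariate Cox model is fitted by a damped Newton-Raphson iteration and covariates are ranked by the magnitude of the fitted coefficient. The function names, the step clipping, the gene names and the toy data are assumptions for illustration; this is not the SIS package's implementation, and it assumes comparably scaled covariates and no tied event times.

```python
import math


def marginal_cox_beta(times, events, x, n_iter=25):
    """Fit a one-covariate Cox model by damped Newton-Raphson."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for k, (t_k, d_k) in enumerate(zip(times, events)):
            if not d_k:
                continue
            risk = [j for j, t_j in enumerate(times) if t_j >= t_k]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            score += x[k] - s1 / s0               # first derivative
            info += s2 / s0 - (s1 / s0) ** 2      # minus second derivative
        if info <= 1e-12:
            break
        # Clip the Newton step to keep the iteration stable.
        beta += max(-1.0, min(1.0, score / info))
    return beta


def sis_screen(times, events, columns, d):
    """Rank covariates by |marginal coefficient| and keep the top d."""
    betas = {name: marginal_cox_beta(times, events, col)
             for name, col in columns.items()}
    ranked = sorted(betas, key=lambda name: abs(betas[name]), reverse=True)
    return ranked[:d]


# Toy data: six subjects, all events observed, two hypothetical genes.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 1, 1, 1]
columns = {
    "gene_a": [2.0, 3.0, 1.0, 1.5, 0.0, 0.5],  # high values fail early
    "gene_b": [0.0, 1.0, 1.0, 0.0, 0.0, 1.0],  # weak association
}
selected = sis_screen(times, events, columns, d=1)
print(selected)
```

Step 3 would then run a Lasso-penalized Cox fit restricted to the columns returned by `sis_screen`.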
Combined Cox, SIS and Lasso (CoxSisLasso) strategy
Recently, Barut et al. [57] proposed a conditional screening approach (conditional SIS) that enhances the accuracy of SIS by using prior knowledge of the key factors to select the predictors. Given our P >> N data and the limitations of the Lasso method in stability and accuracy, this study proposes a combined Cox, SIS and Lasso (CoxSisLasso) strategy (Fig. 2c) to increase the predictive accuracy of the model as follows:

Step 1: Implement Lasso for the data. Denote the index set of the covariates selected by Lasso by \( {C}_0 \).

Step 2: Conditioned on the selected subset of covariates \( {C}_0 \), for each covariate \( {x}_m,m\notin {C}_0 \), fit the following Cox regression model by maximizing Eq. 6
$$ {\widehat{\beta}}_m=\underset{\beta_m}{ \arg \max}\left\{\sum_{k\in D}\left[{x}_{k,{C}_0}^T{\beta}_{C_0}+{x}_{k,m}{\beta}_m- \log \left(\sum_{j\in {R}_k} \exp \left({x}_{j,{C}_0}^T{\beta}_{C_0}+{x}_{j,m}{\beta}_m\right)\right)\right]\right\} $$(6)
Step 3: For a given threshold γ, keep the variables \( {x}_m,m\notin {C}_0 \) if \( \left|{\widehat{\beta}}_m\right|\ge \gamma \). Denote \( {C}_1=\left\{m\notin {C}_0:\left|{\widehat{\beta}}_m\right|\ge \gamma \right\} \). Then the augmented set of selected predictors is \( {C}_0\cup {C}_1 \).

Step 4: Implement Lasso with the covariates in the set \( {C}_0\cup {C}_1 \) to select the final predictors.
For the threshold γ, Barut et al. [57] proposed two procedures, one controlling the FDR and one based on random decoupling, to choose a proper threshold level. Motivated by Zhao and Li [23], this study sets the threshold γ = 1/p, where p is the total number of covariates. If the p-value of the Z-test for the covariate \( {x}_m,m\notin {C}_0 \) is less than γ, we keep it as one of the important predictors.
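The selection rule with γ = 1/p can be sketched as below. The coefficient estimates and standard errors are hypothetical inputs that would come from the conditional Cox fits in Step 2; the use of a two-sided Z-test and the value p = 10,992 (the gene count after IQR filtering) are assumptions made for this illustration.

```python
import math


def wald_p_value(beta_hat, std_err):
    """Two-sided p-value of the Z statistic beta_hat / std_err."""
    z = abs(beta_hat / std_err)
    # 2 * (1 - Phi(z)) written with the complementary error function.
    return math.erfc(z / math.sqrt(2.0))


def conditional_screen(candidates, p_total):
    """Return C1: candidates whose p-value is below gamma = 1 / p_total.

    `candidates` maps a covariate name (outside C0) to a tuple
    (estimated coefficient, standard error).
    """
    gamma = 1.0 / p_total
    return [name for name, (b, se) in candidates.items()
            if wald_p_value(b, se) < gamma]


# Hypothetical conditional estimates for two genes not in C0.
candidates = {
    "gene_x": (1.2, 0.2),   # Z = 6.0, very small p-value
    "gene_y": (0.3, 0.25),  # Z = 1.2, not significant
}
c1 = conditional_screen(candidates, p_total=10992)
print(c1)
```

Because γ shrinks as the number of covariates grows, the rule behaves like a Bonferroni-style cutoff: only covariates with very strong conditional evidence are added to the prior set.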
Investigating potential signaling pathways associated with the candidate genes related to GBM survival time
After obtaining the GBM survival-time-related key genes with the previous strategies, we investigate which potential signaling pathways are closely related to these genes. These potential pathways may then be employed in targeted drug therapy for GBM in the future.
KOBAS is a signaling transduction pathway database that identifies statistically significantly enriched pathways using the hypergeometric test [11]. In statistics, the hypergeometric test uses the hypergeometric distribution (Eq. 7) to calculate the statistical significance.
$$ P\left(X=k\right)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}} $$(7)
where N is the population size, K is the number of success states in the population, n is the number of draws, k is the number of observed successes.
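A short sketch shows how a hypergeometric enrichment p-value can be computed with N, K, n and k as defined above. Reporting the upper tail P(X ≥ k) as the enrichment p-value is a common convention and is assumed here; the source does not spell out which tail KOBAS reports, and the numbers in the example are made up.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement,
    from a population of size N containing K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_value(k, N, K, n):
    """Upper-tail probability P(X >= k) used as the enrichment p-value."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


# Toy example: 10 background genes, 4 of them in the pathway,
# 3 selected genes, 2 of which fall in the pathway.
p_val = enrichment_p_value(k=2, N=10, K=4, n=3)
print(p_val)
```

In a pathway analysis, N would be the number of background genes, K the number of background genes annotated to the pathway, n the number of selected key genes and k the number of selected genes that fall in the pathway.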
Results
GBM survival-time-related key genes identified by the CoxLasso, CoxSis and CoxSisLasso strategies
Table 2 shows the GBM survival-time-related key genes identified by the CoxLasso, CoxSis and CoxSisLasso strategies, respectively. The Venn plot (Fig. 3) indicates that four genes (AEBP1, GDNF, IL17RC and EIF3A) were selected in common by the three strategies. Manually reviewed experimental evidence [26–38] confirms that these genes closely correlate with the survival time of GBM patients.
Firstly, AEBP1 (adipocyte enhancer binding protein 1) was discovered as a transcriptional repressor [26]. It is expressed at different levels in different organ and tissue types, with relatively strong expression in the brain [27], and it can interact with the tumor suppressor protein PTEN and inhibit its tumor-suppressing function [28]. AEBP1 can also negatively regulate IκB, resulting in the upregulation of NF-κB and an enhanced inflammatory response [29]. It is well known that both PTEN and NF-κB, which are closely related to AEBP1, are important players in GBM progression. Moreover, previous research identified several genomic targets of AEBP1 that play vital roles in the survival of glioma cells [30].
Secondly, GDNF is a glial cell derived neurotrophic factor that promotes the survival of neurons [31]. GDNF has been identified as an important factor in macrophage infiltration into GBM, contributing to GBM progression [32], and it can also promote glioma cell invasion through its receptors, which are present on invasive GBM cells [33].
Thirdly, EIF3A (eukaryotic translation initiation factor subunit 3A) is expressed in all tissue types in the human body, and its expression is upregulated in some types of cancer [34]. It is also important in regulating the expression of proteins involved in the DNA repair pathway, which is essential for drug sensitivity and resistance in cancer treatment [35, 36]. In particular, EIF3A is overexpressed in some glioma patients [37].
Fourthly, interleukin-17 receptor C (IL17RC) is a key molecule mediating interleukin-17 signaling. It is important in the immune response and inflammation, both of which are important in GBM progression [38].
Predictive performance comparison of survival time for each strategy
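This study employs the time-dependent receiver operating characteristic (ROC) curve for censored data and the area under the curve (AUC) [58, 59] to quantify the predictive accuracy of each strategy when the outcome of interest is the survival time. The ROC curve depicts the sensitivity (Eq. 8.1) versus 1-specificity (with the specificity given by Eq. 8.2) at each time t for any risk score function \( {x}^T\beta \)
$$ \mathrm{sensitivity}\left(c,t\right)=P\left({x}^T\beta >c\mid \delta (t)=1\right) $$(8.1)
$$ \mathrm{specificity}\left(c,t\right)=P\left({x}^T\beta \le c\mid \delta (t)=0\right) $$(8.2)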
where c is the cutoff value and δ(t) is the event indicator at time t.
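The quantities behind the ROC curve can be illustrated with a simplified sketch. It treats the event indicator at time t as fully known for every subject, so it ignores the handling of subjects censored before t that a proper time-dependent ROC estimator requires; the risk scores and statuses are made-up values.

```python
def sensitivity_specificity(scores, status, cutoff):
    """Sensitivity and specificity of the rule 'score > cutoff'.

    scores : risk scores x^T beta
    status : 1 if the subject has had the event by time t, else 0
    """
    cases = [s for s, d in zip(scores, status) if d == 1]
    controls = [s for s, d in zip(scores, status) if d == 0]
    sens = sum(1 for s in cases if s > cutoff) / len(cases)
    spec = sum(1 for s in controls if s <= cutoff) / len(controls)
    return sens, spec


def auc(scores, status):
    """Area under the ROC curve as the fraction of case/control pairs
    in which the case has the higher score (ties count one half)."""
    cases = [s for s, d in zip(scores, status) if d == 1]
    controls = [s for s, d in zip(scores, status) if d == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


# Toy risk scores and event status at a fixed time t.
scores = [2.0, 1.5, 0.3, -0.5, -1.0]
status = [1, 1, 0, 1, 0]
sens, spec = sensitivity_specificity(scores, status, cutoff=0.0)
area = auc(scores, status)
print(sens, spec, area)
```

Sweeping the cutoff over all observed scores traces out the ROC curve at time t, and repeating the calculation for a range of t values gives an AUC-over-time curve.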
Figures 4 and 5 depict the ROC curve at a specific prediction time of 30 and the AUC over a period of time, respectively, to quantify how well the three strategies predict the survival time of GBM patients. They show that the CoxLasso and CoxSis strategies have similar predictive performance, whereas our proposed CoxSisLasso strategy has the best predictive accuracy, since it has not only the most favorable sensitivity and 1-specificity values (Fig. 4) but also the largest AUC value (Fig. 5). Furthermore, to assess the generalization ability of the proposed model, we randomly selected 120 samples as the training samples and the remaining 68 samples as the test samples. Figure 6 shows the AUCs of the three strategies for the test samples. Both Figs. 5 and 6 show that our proposed CoxSisLasso method gives the largest AUC value and hence the best performance.
Model fitting performance comparison for each strategy
Table 3 summarizes the Cox regression results for the key genes selected by the three strategies. R^{2} is a goodness-of-fit statistic [60]. The concordance index [61] is a valuable measure of model discrimination in analyses involving survival time data. Greater R^{2} and concordance index values imply better model fitting performance. Table 3 shows that the R^{2} and concordance index values of the CoxSisLasso strategy (Table 3C) exceed those of the other two strategies (Table 3A and B). Moreover, by comparing the results in Table 3C (CoxSisLasso) with those in Table 3A (CoxLasso) and Table 3B (CoxSis), we found that CoxSisLasso not only preserves the genes selected by CoxLasso and CoxSis, but also introduces several statistically significant genes whose relationships with GBM are worth exploring in the future.
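For readers unfamiliar with the concordance index, a minimal sketch of Harrell's version is given below. A pair of subjects is comparable when the one with the shorter time had an observed event, and it is concordant when that subject also has the higher risk score. The function name and toy data are illustrative, and tied survival times are simply skipped in this sketch.

```python
def concordance_index(times, events, scores):
    """Harrell's concordance index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    scores : risk scores (higher means higher predicted risk)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must fail, and fail strictly before subject j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: four subjects, the second one is censored.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
c_perfect = concordance_index(times, events, [3.0, 1.0, 2.0, 0.5])
c_partial = concordance_index(times, events, [3.0, 1.0, 0.2, 0.5])
print(c_perfect, c_partial)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk scores against survival times, which is why larger values indicate better discrimination.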
GBM survival-time-related signaling transduction pathways identified by the CoxLasso, CoxSis and CoxSisLasso strategies
Table 4 lists the GBM-related signaling transduction pathways identified by the CoxLasso, CoxSis and CoxSisLasso strategies, respectively. The Venn plot (Fig. 7) indicates that three GBM-related signaling transduction pathways were selected in common by the three strategies.
We then used manually reviewed experimental evidence [39–52] to show that these commonly selected signaling transduction pathways closely correlate with the survival time of GBM patients, as follows.
Firstly, mTOR (mammalian target of rapamycin) is an important mediator of the phosphatidylinositol 3-kinase (PI3K) pathway. Previous research showed that constitutive activation of PI3K signaling is found in the majority of GBM patients [39]. Moreover, the PI3K-Akt-mTOR axis plays an essential role in cell growth and proliferation [40]. Signaling through the mTOR pathway is vital for cancer cell growth and survival in GBM patients [41]. Currently, mTOR pathway inhibitors are under active investigation in preclinical experiments and in clinical trials for GBM treatment [42].
Secondly, TGF-beta (transforming growth factor beta) is a secreted cytokine that signals through specific receptors and exerts its effect via intracellular Smad family proteins [43]. The TGF-beta pathway controls GBM cell proliferation [44]. Its signaling contributes to the maintenance of tumor-initiating cells in GBM [45]. The TGF-beta pathway is also involved in tumor invasion and metastasis in GBM patients [46]. Inhibition of TGF-beta pathway signaling reduced GBM cell proliferation and invasion in preclinical cell-based assays [47]. TGF-beta pathway inhibitors showed promising results in improving GBM patient survival in clinical trials [48].
Thirdly, the IRE (internal ribosomal entry) pathway is involved in the synthesis of some proteins, in which protein synthesis is initiated from a start codon near an IRE site rather than by scanning for the Kozak sequence. The IRE pathway is used in the translation of many eukaryotic genes, including growth factors such as VEGF, FGF2 and PDGF [49] and transcription factors such as c-myc and hypoxia-inducible factor [50, 51]. Indeed, upregulated expression of the proto-oncogene c-Jun in human GBM is mediated through a potent internal ribosomal entry site (IRES) in the 5′ UTR of the c-Jun mRNA, and the upregulation of c-Jun contributes to the malignant properties of GBM cells [52].
Discussion
This study developed a multiscale gene and signaling transduction pathway exploration platform based on the classical Cox proportional hazards model [12], constrained optimization methods [14–16] and the hypergeometric test to analyze P >> N GBM gene expression and survival time data (Table 1). Compared with previous research [14–16, 62], we proposed a novel CoxSisLasso strategy to investigate the relationship between genes and GBM patients' survival time at the molecular level, and we used the KOBAS database [21] to look for the survival-time-related signaling transduction pathways.
On the one hand, manually reviewed experimental evidence validates that both the key genes [26–38] (Table 2) and the signaling transduction pathways [39–52] (Table 4) selected in common by the three strategies are closely related to GBM. On the other hand, since the CoxLasso strategy may encounter problems with speed, stability and accuracy when processing high-dimensional data [23], the CoxSis strategy, based on the previous work of Fan et al. [16], employs a simple and computationally efficient screening procedure to reduce the dimensionality of the data to a moderate size before using the Lasso method. Although CoxSis, which is based on the classic marginal screening approach, is theoretically proved to be capable of selecting all important predictors [56], it has difficulty identifying hidden predictors that correlate with the response variable jointly but not marginally. For this reason, we proposed the CoxSisLasso strategy, which not only uses the CoxLasso strategy to obtain a prior set of important predictors, but also incorporates the SIS [16] approach to select further important predictors conditioned on these prior results. Figures 5 and 6 show that the CoxSisLasso strategy has better predictive power and model fitting capacity than both CoxLasso and CoxSis.
Conclusions
In general, this study developed a CoxSisLasso strategy to interrogate the connections between GBM gene expression and GBM patients' survival time, and employed the KOBAS database [21] and the hypergeometric test [21] to investigate the inherent relation between signaling transduction pathways and the survival time of GBM patients. Although the results demonstrate the advantages of our algorithm, the current research still has several shortcomings, such as the lack of a theoretical proof for the CoxSisLasso strategy and of a simulation study for the gene and pathway selection platform. In the future, we will not only need to improve our current CoxSisLasso algorithm, but will also employ the related pathway analysis theory [63] to explore the GBM survival-time-related proteins for targeted drug studies.
Abbreviations
AEBP1: Adipocyte enhancer binding protein 1
CoxLasso: Combined Cox and Lasso
CoxSis: Combined Cox and SIS
CoxSisLasso: Combined Cox, SIS and Lasso
EIF3A: Eukaryotic translation initiation factor subunit 3A
GBM: Glioblastoma multiforme
GDNF: Glial cell derived neurotrophic factor
IL17RC: Interleukin-17 receptor C
IQR: Interquartile range
IRE: Internal ribosomal entry
IRES: Internal ribosomal entry site
mTOR: Mammalian target of rapamycin
PI3K: Phosphatidylinositol 3-kinase
SIS: Sure independence screening
TGF-beta: Transforming growth factor beta
References
 1.
Templeton A, Hofer S, Töpfer M, Sommacal A, Fretz C, Cerny T, Gillessen S. Extraneural spread of glioblastomareport of two cases. Onkologie. 2008;31:192–4.
 2.
Tania A, Arkaitz C, Boris J, Guillermo V, Garry M, Raphael M, Luis A, Manuel G, Ismael GR. Cannabinoids induce glioma stemlike cell differentiation and inhibit gliomagenesis. J Biol Chem. 2007;282:6854–62.
 3.
Scott J, Rewcastle N, Brasher P, Fulton D, Hagen N, MacKinnon J, Sutherland G, Cairncross J, Forsyth P. Longterm glioblastoma multiforme survivors: a populationbased study. Can J Neurol Sci. 1998;25:197–201.
 4.
Sasayama T, Nishihara M, Kondoh T, Hosoda K, Kohmura E. MicroRNA10b is overexpressed in malignant glioma and associated with tumor invasive factors, uPAR and RhoC. Int J Cancer. 2009;125:1407–13.
 5.
Lassman AB, Iwamoto FM, Gutin PH, Abrey LE. Patterns of relapse and prognosis after bevacizumab (BEV) failure in recurrent glioblastoma (GBM). J Clin Oncol. 2008;26:431–6.
 6.
D’Amico A, Gabbani M, Dall’Oglio S, Cristofori L, Turazzi S, Sanzone E, Maluta S. Protracted administration of low doses of temozolomide (TMZ) in the treatment of relapse glioblastoma (GBM) enhances the antitumor activity of this agent. In: Asco Meeting. 2006. p. 810–3.
 7.
Gladson CL, Prayson RA, Liu WM. The pathobiology of glioma tumors. Ann Rev Pathol Mech Dis. 2010;5:33–50.
 8.
Yu L, Maximilian D, Nathan W, Bollen AW, Aldape KD, M Kelly N, Lamborn KR, Berger MS, David B, Brown PO. Gene expression profiling reveals molecularly and clinically distinct subtypes of glioblastoma multiforme. Proc Natl Acad Sci. 2005;102:5814–9.
 9.
Shumin D, Nutt CL, Betensky RA, StemmerRachamimov AO, Denko NC, Ligon KL, Rowitch DH, Louis DN. Histologybased expression profiling yields novel prognostic markers in human glioblastoma. J Neuropathol Exp Neurol. 2005;64:948–55.
 10.
Bertram JS. The molecular biology of cancer. Mol Aspects Med. 2000;21:167–223.
 11.
Richards SJ. A handbook of parametric survival models for actuarial use. Scand Actuar J. 2012;2012:233–57.
 12.
Cox DR. Regression models and lifetables. J R Stat Soc. 1972;34:527–41.
 13.
Crichton N. Cox proportional hazards model. J Clin Nurs. 2002;11:723.
 14.
Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc. 1996;58:267–88.
 15.
Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16:385–95.
 16.
Fan J, Feng Y, Wu Y. Highdimensional variable selection for Cox’s proportional hazards model. J Am Stat Assoc. 2010;105:205–17.
 17.
Hong HG, Wang L, He X. A datadriven approach to conditional screening of high dimensional variables. 2015. Manuscript.
 18.
Nelander S, Wang W, Nilsson B, She QB, Pratilas C, Rosen N, Gennemark P, Sander C. Models from experiments: combinatorial drug perturbations of cancer cells. Mol Syst Biol. 2008;4:1484–94.
 19.
Sergio Iadevaia YL, Morales FC, Mills GB, Ram PT. Identification of optimal drug combinations targeting cellular networks: integrating phosphoproteomics and computational network analysis. Cancer Res. 2010;70:6704–14.
 20.
The Georgetown Database of Cancer GDOC. https://gdoc.georgetown.edu/gdoc/. Accessed 28 Apr 2016.
 21.
Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, Kong L, Gao G, Li CY, Wei L. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. 2011;39:W316–22.
 22.
Rabinowitz L. Mathematical Statistics and data analysis. Elsevier; 2006.
 23.
Zhao SD, Li Y. Principled sure independence screening for Cox models with ultrahighdimensional covariates. J Multivar Anal. 2012;105:397–411.
 24.
Takashi O. Drug target validation and identification of secondary drug target effects using DNA microarrays. Tanpakushitsu Kakusan Koso. 2007;52:1808–9.
 25.
Behr MA, Wilson MA, Gill WP, Salamon H, Schoolnik GK, Rane S, Small PM. Comparative genomics of BCG vaccines by wholegenome DNA microarray. Science. 1999;284:1520–3.
 26.
He GP, Muise A, Li AW, Ro HS. A eukaryotic transcriptional represser with carboxypeptidase activity. Nature. 1995;378:92–6.
 27.
Ro HS, Kim SW, Wu D, Webber C, Nicholson TE. Gene structure and expression of the mouse adipocyte enhancerbinding protein. Gene. 2002;280:123–33.
 28.
Zhang L, Reidy SP, Nicholson TE, Lee HJ, Majdalawieh A, Webber C, Stewart BR, Dolphin P, Ro HS. The role of AEBP1 in sexspecific dietinduced obesity. Mol Med. 2005;11:39–47.
 29.
Majdalawieh A, Zhang L, Ro HS. Adipocyte enhancerbinding protein1 promotes macrophage inflammatory responsiveness by upregulating NFkappaB via IkappaBalpha negative regulation. Mol Biol Cell. 2007;18:930–42.
 30.
Ladha J, Sinha S, Bhat V, Donakonda S, Rao SM. Identification of genomic targets of transcription factor AEBP1 and its role in survival of glioma cells. Mol Cancer Res. 2012;10:25–35.
 31.
Yu T, Scully S, Yu Y, Fox GM, Jing S, Zhou R. Expression of GDNF family receptor components during development: implications in the mechanisms of interaction. Journal of Neurosci. 1998;18:4684–96.
 32.
Ku MC, Wolf SA, Respondek D, Matyash V, Pohlmann A, Waiczies S, Waiczies H, Niendorf T, Synowitz M, Glass R. GDNF mediates glioblastomainduced microglia attraction but not astrogliosis. Acta Neuropathol. 2013;125:609–20.
 33.
Hoelzinger DB, Tim D, Berens ME. Autocrine factors that sustain glioma invasion and paracrine biology in the brain microenvironment. J Natl Cancer Inst. 2007;99:1583–93.
 34.
Saletta F, Rahmanto YS, Richardson DR. The translational regulator eIF3a: the tricky eIF3 subunit! Biochim Biophys Acta. 1806;2010:275–86.
 35.
JiYe Y, Jie S, ZiZheng D, Qiong H, MeiZuo Z, DeYun F, HongHao Z, JianTing Z, ZhaoQian L. Effect of eIF3a on response of lung cancer patients to platinumbased chemotherapy by regulating DNA repair. Clin Cancer Res. 2011;17:4600–9.
 36.
RY L, Dong Z, Liu J, JY Y, Zhou L, Wu X, Yang Y, Mo W, Huang W, Khoo SK. Role of eIF3a in regulating cisplatin sensitivity and in translational control of nucleotide excision repair of nasopharyngeal carcinoma. Oncogene. 2011;30:4814–23.
 37.
Navani S. The human protein atlas. J Obstet Gynecol India. 2011;61:27–31.
 38.
Parajuli P, Mittal S. Role of IL17 in Glioma Progression. Journal of Spine & Neurosurgery. 2013; Suppl 1:s1–004.
 39.
McLendon R, Friedman A, Bigner D, Van Meir EG, Brat JD. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2013;455:1061–8.
 40.
Akhavan D, Mischel PS. mTOR Signaling in Glioblastoma: Lessons Learned from Bench to Bedside. Neuro Oncol. 2010;12:882–9.
 41.
Jhanwaruniyal M, Labagnara M, Friedman M, Kwasnicki A, Murali R. Glioblastoma: molecular pathways, stem cells and therapeutic targets. Cancers. 2015;7:538–55.
 42.
Arshawn S, Michael K. Targeting the PI3K/AKT/mTOR signaling pathway in glioblastoma: novel therapeutic agents and advances in understanding. Tumor Biol. 2013;34:1991–2002.
 43.
Zhang VE, Derynck R. Smaddependent and Smadindependent pathways in TGFß family signalling. Nature. 2003;425:577–84.
 44.
Joan S, HongVan L, Lijian S, Anderson SA, Joan M. Integration of Smad and forkhead pathways in the control of neuroepithelial and glioblastoma cell proliferation. Cell. 2004;117:211–23.
 45.
Hiroaki I, Tomoki T, Yasushi I, Masamichi T, Nobuhito S, Keiji M, Kohei M. Gliomainitiating cells retain their tumorigenicity through integration of the Sox axis and Oct4 protein. J Biol Chem. 2011;286:41434–41.
 46.
Han J, Alvarezbreckenridge CA, Wang QE, Yu J. TGFβ signaling and its targeting for glioma treatment. Am J Cancer Res. 2015;5:945–55.
 47.
Roy LO, Poirier MB, Fortin D. Chloroquine inhibits the malignant phenotype of glioblastoma partially by suppressing TGFbeta. Invest New Drugs. 2015;33:1020–31.
 48.
Bogdahn U, Hau P, Stockhammer G, Venkataramana NK, Mahapatra AK, Suri A, Balasubramaniam A, Nair S, Oliushine V, Parfenov V. Targeted therapy for highgrade glioma with the TGFβ2 inhibitor trabedersen: results of a randomized and controlled phase IIb study. Neuro Oncol. 2010;13:132–42.
 49.
Huez I, Créancier L, Audigier S, Gensac MC, Prats AC, Prats H. Two independent internal ribosome entry sites are involved in translation initiation of vascular endothelial growth factor mRNA. Mol Cell Biol. 1998;18:6178–90.
 50.
Stoneley M, Chappell S, Jopling CL, Dickens M, Macfarlane M, Willis A. c-Myc protein synthesis is initiated from the internal ribosome entry segment during apoptosis. Mol Cell Biol. 2000;20:1162–9.
 51.
Lang KJD, Andreas K, Goodall GJ. Hypoxia-inducible factor-1alpha mRNA contains an internal ribosome entry site that allows efficient translation during normoxia and hypoxia. Mol Biol Cell. 2002;13:1792–801.
 52.
Lior B, Revital K, Iris BD, Sivan O, Silke K, Peter H, Martin P, Anja-Katrin B, Lily V. Aberrant expression of c-Jun in glioblastoma by internal ribosome entry site (IRES)-mediated translational activation. Proc Natl Acad Sci U S A. 2012;109:E2875–84.
 53.
Bioconductor: open source software for bioinformatics. http://www.bioconductor.org/. Accessed 28 Apr 2016.
 54.
Miller FP, Vandome AF, McBrewster J. Interquartile: Interquartile Range. 2010.
 55.
Singh R, Mukhopadhyay K. Survival analysis in clinical trials: basics and must know areas. Perspect Clin Res. 2011;2:145–8.
 56.
Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33:1–22.
 57.
Barut E, Fan J, Verhasselt A. Conditional Sure Independence Screening. J Am Stat Assoc. 2016;111:1266–77.
 58.
McClish DK. Analyzing a portion of the ROC curve. Med Decis Making. 1989;9:190–5.
 59.
Pepe M. An interpretation for the ROC curve and inference using GLM procedures. Biometrics. 2000;56:352–9.
 60.
Myers SC, Jin L. R-squared around the world: new theory and new tests. Ssrn Electron J. 2004;79:257–92.
 61.
Kremers WK. Concordance for survival time data: fixed and time-dependent covariates and possible ties in predictor and time. Mayo Foundation. 2007. http://www.mayo.edu/research/documents/biostat80pdf/doc10027891.
 62.
Simon N, Friedman JH, Hastie T, Tibshirani R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw. 2011;39:1–13.
 63.
Peng H, Peng T, Wen J, Engler DA, Matsunami RK, Su J, Zhang L, Chang CC, Zhou X. Characterization of p38 MAPK isoforms for drug resistance study using systems biology approach. Bioinformatics. 2014;30:1899–907.
Acknowledgement
This work was supported by the National Science Foundation of China under Grant No. 61372138, the Chongqing excellent youth award, the Chinese Recruitment Program of Global Youth Experts, and the Fundamental Research Funds for the Central Universities of China under Grant Nos. XDJK2014B012 and XDJK2016A003. We also thank Dr. Romane M. Auvergne and Dr. Steven A. Goldman of the Center for Translational Neuromedicine, University of Rochester Medical Center, for helpful discussions.
Declaration
This article has been published as part of BMC Genomics Volume 18 Supplement 1, 2016: Proceedings of the 27th International Conference on Genome Informatics: genomics. The full contents of the supplement are available online at http://bmcgenomics.biomedcentral.com/articles/supplements/volume18supplement1.
Funding
This work was supported by the National Science Foundation of China under Grant No. 61372138, the Chongqing excellent youth award, the Chinese Recruitment Program of Global Youth Experts, and the Fundamental Research Funds for the Central Universities of China under Grant Nos. XDJK2014B012 and XDJK2016A003. Publication of this article was funded by the Chinese Recruitment Program of Global Youth Experts.
Availability of data and materials
Raw data (GBM expression profiles) are accessible through the Georgetown Database of Cancer (GDOC) [61].
Authors’ contributions
YX, LZ, XH and TL conceived the idea, implemented and tested the method, and wrote the manuscript. CY, NH and ZY carried out experimental work for the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Keywords
 Least absolute shrinkage and selection operator (Lasso)
 Sure independence screening (SIS)
 Cox proportional hazards model (Cox)
 Glioblastoma multiforme (GBM)
 Signaling transduction pathway