Visual gene-network analysis reveals the cancer gene co-expression in human endometrial cancer
© Chou et al.; licensee BioMed Central Ltd. 2014
Received: 4 November 2013
Accepted: 4 April 2014
Published: 23 April 2014
Skip to main content
© Chou et al.; licensee BioMed Central Ltd. 2014
Received: 4 November 2013
Accepted: 4 April 2014
Published: 23 April 2014
Endometrial cancers (ECs) are the most common form of gynecologic malignancy. Recent studies have reported that ECs reveal distinct markers for molecular pathogenesis, which in turn is linked to the various histological types of ECs. To understand further the molecular events contributing to ECs and endometrial tumorigenesis in general, a more precise identification of cancer-associated molecules and signaling networks would be useful for the detection and monitoring of malignancy, improving clinical cancer therapy, and personalization of treatments.
ECs-specific gene co-expression networks were constructed by differential expression analysis and weighted gene co-expression network analysis (WGCNA). Important pathways and putative cancer hub genes contribution to tumorigenesis of ECs were identified. An elastic-net regularized classification model was built using the cancer hub gene signatures to predict the phenotypic characteristics of ECs. The 19 cancer hub gene signatures had high predictive power to distinguish among three key principal features of ECs: grade, type, and stage. Intriguingly, these hub gene networks seem to contribute to ECs progression and malignancy via cell-cycle regulation, antigen processing and the citric acid (TCA) cycle.
The results of this study provide a powerful biomarker discovery platform to better understand the progression of ECs and to uncover potential therapeutic targets in the treatment of ECs. This information might lead to improved monitoring of ECs and resulting improvement of treatment of ECs, the 4th most common of cancer in women.
Endometrial cancers (ECs) are the most common type of uterine cancer. With more than 280,000 cases occurring annually worldwide, it has become the fourth most common cancer in women worldwide . The incidence rate of ECs is higher than uterine cervix cancer and ovarian cancer in the United State . These tumors have been broadly classified into two major subtypes I and II, based on the clinic-pathological characteristics such as prognosis and aggressiveness, as well as, molecular alterations that impact tumor response to therapies . Type I tumors are the most frequent subtype which have been linked to obesity, estrogen excess, and low-grade (differentiated) inflammation. Type II endometrial tumors are serous carcinoma that have been associated with older, non-obese, post-menopausal women, high-grade (undifferentiated), and also with worst outcomes. It has been suggested that the molecular-genetic alterations may be responsible for the distinct morphology and biologic behavior of the different subtypes of human ECs . For example, low-grade or early-stage Type I tumors may progress to high-grade or late-stage; nonetheless Type I and II cancers appear to be separate entities in most cases, and different molecular abnormalities would result in unique cellular functions and distinctive tumor morphology . Therefore, it is crucial to discover with more accuracy the putative molecular signatures of ECs, which should allow for improved detection and monitoring of endometrial tumorigenesis, since such knowledge could be beneficial for early diagnostic, enhanced prognostic, and more effective therapeutic strategies.
ECs are characterized by a variety of genetic alterations and significant gene expression modifications. Over the last decade, multiple studies have identified aberrant gene expression of several important genes in ECs, with the mutation frequency varying according to the histological classification . Type I tumors are frequently characterized by the loss or altered expression of phosphatase and tensin homolog (PTEN). PTEN modulates cell survival and proliferation through its effects on downstream factors, mainly phospholipid phosphatidylinositol (3, 4, 5)-triphosphate (PIP3) and protein kinase B (PKB, Akt). PTEN inactivation leads to a decrease of lipid and protein phosphatase activity and promotes cell cycle progression to the G1/S phase . Other genes are linked to abnormalities in Type I tumors includingβ-catenin, K-ras and DNA-mismatch repair genes [7–10]. In comparison, Type II tumors have been reported to be associated with abnormalities in TP53 and Her2/neu. The gene TP53 encodes a tumor suppressor p53, the most frequently mutated protein in cancer. P53 prevents cell cycle progression after DNA damage, inducing cell arrest and apoptosis through several regulator proteins such as p21, Cyclin D1, and RB1. TP53 mutations occur as an early event in Type II tumorigenesis and may occur as manifestations of late-stage molecular changes in Type I lesions. Overexpression of Her-2/neu observed in Type II carcinomas has been linked to coding alterations for a transmembrane receptor tyrosine kinase involved in cell signaling . Although these studies provide important insights into the molecular basis of endometrial cancers, a limited set of well-known cancer genes was obtained from these studies. In fact, until now, a large-scale screen of the gene expression analyses incorporating systematic methods to discover cancer subtypes and their molecular alterations in ECs has not been globally conducted and explored.
Recent advances in constructing genetic network approaches have enabled the unprecedented characterization of studying a variety of somatic alterations and gene expression in cancer genomes. Therefore, these advances allow connecting the existent gap of understanding the association of individual genes to complex diseases such as cancer by the systematic investigation of the observed relationship between gene products and tumorigenesis. A weighted gene co-expression network approach (WGCNA) has been proposed to reconstruct gene co-expression networks (modules) in terms of large-scale gene expression profiles and as well as for the distinction of centrally located genes (hub genes) driving key cellular signaling pathways [12, 13]. The WGCNA approach provides a functional interpretation in Systems Biology and leads to new insights into cancer pathophysiology [14–17].
Here, we aimed to establish a systematic framework for constructing for the first time, the ECs-associated gene co-expression networks and pin-pointing cancer hub genes contributing to endometrial tumorigenesis and progression. This study provides a novel and broad application platform for the identification of cancer gene signatures of ECs tumorigensis and for the discovery of potential new molecular targets for the development of more effective therapies for the treatment of ECs.
Module significance in ECs and GO analysis
Correlationa with phenotypic characteristics of ECs (p-value)
Gene ontology enrichment analysis
List of hub gene associated with grade, type and stage o ECs
Hub gene significance*
Hub genes for grade
Tumor protein 53
Aurora kinase B
Budding uninhibited by benzimidazoles 1
Centromere protein A
Hub genes for type
Aurora kinase B
Protein regulator of cytokinesis 1
Cell division cycle 6
E2F transcription factor 2
Kinesin family member 20A
Hub genes for stage (derived from module yellow)
Budding uninhibited by benzimidazoles 1
Flap structure-specific endonuclease 1
Kinesin family member 23
Cell division cycle 20
Protein regulator of cytokinesis 1
Hub genes for stage (derived from module blue)
Isocitrate dehydrogenase 3 (NAD+) gamma
NADH dehydrogenase (ubiquinone) flavoprotein 2
ATP synthase, H + transporting, mitochondrial F1 complex, beta polypeptide
Proteasome (prosome, macropain) subunit, beta type, 3
Proteasome (prosome, macropain) subunit, beta type, 7
This study reconstructed and identified the gene networks of cancer hub genes to search the potential key regulators of endometrial tumorigenesis and to identify regulatory relationships among cancer hub genes. Using functional enrichment analysis, we identified three major signaling pathways; cell-cycle regulated networks (mitotic, G2/M or G1/S phases), antigen processing (Class I MHC antigen processing) and the TCA cycle, as the main Gene Ontology biological processes overrepresented in the gene networks deriving from the cancer hub genes (Figure 4). In the cell-cycle network, BUB1, AURKB, CENPA, KIF20A, CDC20, CDC6, E2F2 and FEN1 seem to regulate distinct co-expression pathways that contributed to mitotic G2-G2/M phase, G2/M checkpoints, and kinesins in Type II ECs, whereas these genes regulate the M phase and mitotic G1-G1/S phase in Type I ECs (Figure 4A). In addition, the antigen processing and TCA cycle networks are regulated by the hub genes IDH3G, NDUFV2, ATP5B, PSMB3 and PSMB7, contributing to tumorigenesis in Type II ECs (Figure 4B).
In this study, we integrated a large-scale transcriptional profiling of 273 ECs datasets to identify appropriate cancer subtypes and biomarkers. Our long-term goal is to provide insights into disease biology and diagnostic classification, which may guide early-phase of clinical therapeutic applications. We also determined that co-expression networks reflect causative relationships between different gene-gene interactions. First, this study found that an EC-specific co-expression network majorly regulated the tumorigenesis of ECs. This network not only enriched genes known to play critical roles in cell cycle regulation but also showed significant correlation with grade, type, and stage of ECs. Second, 19 highly connected hub genes were defined into two EC-associated co-expression networks. Detailed and systematic network analysis revealed that these hub genes played roles as major regulators in cell-cycle regulation, kinase modulation, and the development of various tumors. Finally, for independent analysis of microarray data as validated data sets, these hub genes can provide excellent diagnostic power in distinguishing grade, type, and stage of ECs. Altogether these findings strongly suggest that the role of these hub genes in the EC-associated networks may provide new insights into the underlying biological mechanisms driving the tumorigenesis of ECs.
Tumor cell progression is typically associated with aberrant cell-cycle regulation, and this study identified several unique hub genes in the networks associated with the grade of ECs including TP53, BUB1, AURKB, CENPA. In agreement with previous reports [17–20], the disruption of these hub genes was observed in a variety of human tumors and often correlates with tumorigenesis. For example, BUB1 has been suggested to play a direct role in the suppression of p53-mediated cell death via physical interaction with p53 at kinetochores in response to mitotic spindle damage . Altered expression of BUB1 is associated with therapy failure and death in patients with multiple types of cancer . Fu et al. (2007)  reported a possible mechanism on the roles of aurora kinases in mitosis and tumorigenesis. They found that overexpression of AURKA and AURKB disruption overrides spindle checkpoint, resulting in the aneuploidy or polyploidy that occurs during cell division. When cells lose the normal function of p53, they enter mitosis and become aneuploid, which is considered a cancerous manifestation.
Recently, up-regulation of BUB1 gene has been linked to the inhibition of p53-dependent senescence, hyper-activation of AURKB and phosphorylation of CENP-A, and might be considered as a potential oncogene in driving the aneuploidiation and tumorigenesis . Interestingly, this study found BUB1 not only highly connected with other genes in the co-expression network, but also significantly correlated with the transformation of ECs. Therefore, BUB1 gene seems to modulate the expression of tens or hundreds of genes suggesting that it might allow undifferentiated cancer cells to overcome apoptotic checkpoints favoring aberrant progression through mitosis.
Previous reports suggested that PTEN mutations occur early in the neoplastic process of Type I ECs and co-exist frequently with other mutations in the PI(3)K/AKT pathway , yet Type II ECs showed an alteration of the E2F–retinoblastoma protein–p16 pathway by mutations of p53 and p16. In this study, several unique hub genes (AURKB, PRC1, CDC6, E2F2, KIF20A) were identified in the co-expression network associated with the subtype of ECs, which were linked to the PTEN and p16-associated pathway. The cell division gene, CDC6, is required for the cell cycle G1-to-S transition. Wu et al. (2009)  have identified that CDC6 is the putative PTEN target and demonstrated a causal linkage between CDC6 and PTEN in metastatic human prostate cancer. They indicated that the regulation of CDC6 expression by PTEN is mediated through the E2F transcriptional factor and the E2F2 protein. Furthermore, overexpression of AURKB has been suggested to compromise the tumor suppressor function of p53. The gene PRC1, encodes one of the polycomb-group proteins, and is involved in cytokinesis. Of note, the PRC1 and PRC2/3/4 proteins can induce the repression of the PTEN transcription via binding to the PTEN promoter region in leukemic cells , while KIF20A is known to be controlled by the E2F–retinoblastoma protein–p16 pathway, and is linked to tumor aggressiveness in human hepatocellular carcinomas . Taken together, it appears that this gene-signature is commonly linked to PTEN, PI(3)K/AKT and E2F–retinoblastoma protein–p16 pathway, and may be marginally associated with the subtype of ECs.
Cancer stage is the most import indicator for selecting an appropriate cancer treatment option for a patient. In this study, two co-expression networks were significantly correlated with the stage of ECs. These two network-regulated signaling events contributed to cell-cycle regulation, antigen processing and TCA cycle, respectively. From the cell-cycle regulation network, 5 hub genes were identified including BUB1, FEN1, KIF23, CDC20 and PRC1. BUB1 and CDC20 are involved in the M phase of mitotic cell cycle and DNA replication, and play critical roles in the cell-cycle regulation . Altered expression of cell-cycle associated protein like BUB1 and CDC20 may lead to chromosomal instability. BUB1 and CDC20 are important regulators of the anaphase-promoting complex (APC) . APC’s function is to trigger the transition from metaphase to anaphase by tagging critical cell-cycle proteins for degradation, and maintaining genomic stability in a normal manner. BUB1 can directly phosphorylate the APC’s co-activator CDC20, leading to the decreased activity of APC, which determines the metaphase-to-anaphase transition. The overexpression of BUB1 or CDC20 induces misregulation of APC, and is associated with the chromosomal instability and poor outcomes in breast cancer patients [30, 31]. KIF23 belongs to the kinesin family, and it is part of the central spindle in a complex that clusters PRC1 and AURKB together at the spindle midzone to enable anaphase in dividing cells and regulation of central spindle assembly . Abnormal regulation of these genes may lead to the chromosomal instability, thereby promoting tumor development and progression.
The pathway of MHC class I presentation is an important mechanism in determining whether tumors are able to evade immune response. Down-regulation of MHC Class I has been described in ovary and cervix malignancies. Abnormal expression of MHC Class I genes has been linked to the advanced stage of disease and poor survival in ovarian cancer . Recently, a large cohort study of patients with endometrial cancer indicated that down-regulation of MHC Class I expression in endometrial cancer patients are correlated to late-stage ECs . In this study, we identified a group of hub genes associated with the stage of ECs deriving from the MHC Class I co-expression network. These hub genes, PSMB7 and PSMB3, were found to significantly regulate the network of antigen processing and to contribute to Class I MHC mediated processing. Furthermore, we also discovered 3 hub genes (IDH3G, NDUFV2 and ATP5B) associated with the TCA cycle. No other study to date has reported that these hub genes might be associated with cancer, but recent studies have indicated that the alterations in the TCA cycle enzymes may favor tumorigenesis by impacting on cellular redox state and overall cell metabolism [35, 36]. Therefore, it is reasonable to hypothesize that these hub genes could be involved with tumor progression.
To identify the signaling pathway through which hub genes regulated the co-expression network in the progression of ECs, we compared the gene co-expression networks between Type I (early-stage, low-grade) and Type II ECs (late-stage, high-grade) of ECs. In the cell-cycle regulation network, Type II ECs displayed a distinct pathway when compared with Type I ECs. However, BUB1, AURKB, CDC6, CENPA and KIF20A seemed to be critical regulators for the co-expression network in Type II compared to Type I ECs. Among these genes, BUB1 may play a key role in regulating these genes to promote tumor formation. The overexpression of BUB1 regulates AURKB, CENPA, CDC6 and KIF20A in a variety of human cancers [37–39]. Furthermore, from the antigen processing and TCA cycle co-expression networks, we found that the network regulated TCA cycle through the co-regulation of IDH3G, NDUFV2 and ATP5B, and linked to the pathway of Cass I MHC antigen processing key players were PSMB3 and PSMB7 in Type II compared to Type I ECs. Expression of these genes in Type II ECs may facilitate the identification of signaling pathways contributing to tumor progression.
This study used a novel systematic framework to identify two co-expression networks associated with ECs tumorigenesis based on large-scale human microarray data. In addition, a number of novel hub genes in these two co-expression networks were identified contributing to three signaling pathways: cell-cycle regulation, antigen processing and TCA cycle, and presented a high predictive power in distinguishing grade, type and stage of ECs. Although Type I and Type II ECs shared similar genetic information, several critical hub genes were identified that may contribute to progression of ECs. Together, these findings provided a clearer and broader picture of the signaling pathways regulated by co-expression networks contributing to ECs. Furthermore, the characterization of these hub genes might infuse novel insights into the identification of novel clinical markers and potential therapeutic targets for ECs.
Characteristics of microarray datasets for platform, sample groups, and grade, type and stage in endometrial cancers
Training dataset (n = 273)
Validation set A (n = 40)
Validation set B (n = 90)
IC or late stage
We reviewed the sample profiles in each of the 273 microarray datasets. From these 273 datasets, at least four samples corresponding to both classes of one analysis of interest were selected for additional analyses. Thus, analyses of interest included cancer versus non-cancer patients, cancer grade, which was further divided into high grade (G3, poorly differentiated) versus low grade (G1, well differentiated), cancer types; Type I (estrogen dependent) versus Type II (estrogen independent), and cancer stage (higher than stage 2) versus early stage (lower than stage 2). After the assignment of samples to classes, we assessed the differential expression using Student’s t test to identify the significantly differential expression of gene profiles. False discovery rates (FDR) were used in these analyses for correcting for multiple comparisons . All differentially expressed analysis used the “limma” program in the R-based Bioconductor package to calculate the level of differential expression .
This study performed WGCNA analysis to construct the modules of co-expression gene for the EC-associated networks and their interactions. From the processed expression files, the networks were formed from the weighted correlation matrices following the protocols of WGCNA. Briefly, the WGCNA converts the gene expression profiles into connection weights that can be visualized as topology overlap measures (TOM) (Additional file 7: Figure S3). We chose expression profiles of 4,500 genes in the co-expression network analysis. These genes were either significantly differentially expressed between non-cancer and cancer samples (FDR < 0.05 and fold change > 1.5 between two groups) or showed a large variability in expression. We defined modules using a hierarchical cluster method, and used the topological overlap dissimilarity measure (1-TOM) as the distance measure with a height cutoff value of 0.95 and a minimum size (gene groups) cutoff value of 100 for the resulting dendrogram. All network analyses were implemented in the package WGCNA in the R environment as previously described .
We implemented for the first time an unique systematic framework that applies the elastic-net regularization-based approach and WGCNA to take the ECs-specific gene co-expression networks into account in the process of identification of cancer hub genes. Our approach contain two major steps. Firstly, we used WGCNA to identify the cancer hub genes that functionally contribute to the tumorigenesis of ECs. To distinguish centrally located genes (hub genes) of the co-expressed network, we calculated its scaled connectivity (K) and genes significance (GS) using WGCNA. This approach enabled us to determine the hub genes implicated by both the genetic marker and network connectivity information.
Secondly, to link these potential hub genes to the phenotypic characteristics of ECs, the elastic-net analysis was used to select which of these features were significantly associated with phenotypic characteristics of ECs across the gene co-expression networks. This approach is ideal for building the linear models in situations where the number of variables significantly outweighs the number of samples. In fact, this approach has been used as a powerful classification algorithm for large-scale microarray analysis [45, 46]. In addition to all these innovative but highly structured and systematized approaches and procedures, we alsoperformed bootstrap analysis, sampling the datasets with replacements 1,000 times, and calculating the frequency (f) of markers (genes) for inclusion in the model for each bootstrap sample. Only genes present in more than the fourth quartile (f > 750) of all bootstrap samples were selected as ECs-specific hub genes. All elastic-net analysis used the R package “glmnet” . Full methods are available in Support information.
Classification performance was assessed with areas under the receiving operating characteristic (AUC) curve. Using the penalized logistic regression via the elastic-net, a classification model was built, and its discriminatory capacity was first estimated with a strict 10-fold cross-validation methodology (as described in Additional file 1: Materials and Methods). The resulting model was next tested on independent datasets using the cancer hub genes as a model input to predict the classes of particular samples relevant to the process of neoplastic transformation and progression in ECs.
The network was visualized through Cytoscape Software 3.0.1 . This study used the Cluepedia plug-in in Cytoscape to identify potential association to pathways of cancer hub genes . CluePedia organizes a functionally grouped pathway with cancer hub genes by integrating heterogeneous expression data and functional network information.
This work was supported by Taiwan National Science Council NSC102-2918-I-007-011. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.