Identification of differential expressed RNAs
We identified the differentially expressed mRNAs and lncRNs in STAD, with P < 0.05 and |logFC| > 2 as the thresholds (Supplementary File 1, 2). A total of 424 differential expressed mRNAs (266 up- and 158 down-regulated) and 42 lncRNAs (17 up-, and 25 down-regulated) were detected between STAD and normal samples. Volcano plots displaying the distribution of the mRNAs, and lncRNAs were generated, as shown in Fig. 2.
STAD lncRNA-mRNA network
Differentially expressed lncRNAs and mRNAs were considered as initial nodes of the STAD lncRNA-mRNA network (Supplementary File 3). The network’s edges were defined based on Pearson correlation (Supplementary File 4). The final network formed four distinct modules (Fig. 3). Nodes of each module could be related to similar biological processes or the same cancer hallmarks. We will follow this question during the rest of paper. According to network analyses policies, those nodes with the highest degree and betweenness centrality might be more effective in diagnosis and progression of STAD, this will also follow during the rest of paper. Two nodes of module 1,” U2AF1” and “BCL1”, are oncogenes and “NAB2” is a tumor suppressor and “VTI1A” is a fusion gene. In module 3 also “HOXD13” is an oncogene. This evidence and the existence of most high degree nodes in module 1 declare the high importance of this module.
Correlation between STAD-specific lncRNAs and overall survival
To explore whether the expression of each lncRNA in the STAD network had prognostic significance for predicting overall survival, Kaplan–Meier and log-rank test was used to determine the association between the STAD network lncRNAs and OS. Statistical significance was set at p-value < 0.05. Finally, ten lncRNAs, SOX21-AS1, HCG22, C5orf66-AS1, TINCR, MIR205HG, PGM5-AS1, NALT1, ENSG00000241224.2, ENSG00000262756, and HNF4A-AS1 were found to be related to OS (Fig. 4). These ten lncRNAs could serve as prognostic biomarkers of STAD.
Topological features of the STAD networks
The STAD mRNA-lncRNA network includes 108 nodes with 88 mRNA and 20 lncRNA which form 4 distinct modules. Based on expression correlation the network consists of 317 edges. As observed, the degree distribution of nodes in STAD mRNA-lncRNA network closely followed a power law distribution (Fig. 5). Most nodes had relatively few interactions with others and only a small portion of nodes had a large number of interactions.
All nodes were sorted in descending order according to their degree and their betweenness centrality (Supplementary File 5).
STAD lncRNAs signature
All 20 lncRNAs of STAD network were fitted into the multivariate Cox regression model. The results indicated that only two lncRNAs, SOX21-AS1 and LINC02560, had a significant prognostic value (p-value < 0.05) in STAD (Supplementary File 6). Then, these two lncRNAs were used to develop a lncRNA signature. The two-lncRNA signature risk scores were calculated for each patient based on lncRNAs expressions. Finally, the patients were divided into the “Low-risk” (n = 153) and “High-risk” (n = 153) groups based on their risk scores. The mortality rate of the high-risk patients was significantly higher compared to the low-risk patients (60/13% vs 28/1%). Patients in the high-risk group had significantly shorter survival than those in the low-risk group (Fig. 6). Furthermore, we performed the time-dependent ROC curve analysis to evaluate sensitivity and specificity for survival prediction of two-lncRNA signature. The AUC of STAD signature was 0.6975 (Fig. 6B), demonstrating its utility as a prognostic model for predicting the survival status of STAD.
Identification of cancer hallmark associated lncRNAs in STAD network
Cancer develops from accumulation of mutations and epigenetic changes. Such events may enforce altered expression of mRNAs. These conversions may contribute to the development and progression of cancer. Therefore, it is essential to explore the interaction between the regulatory systems (such as RNAs) for understanding the function of cells and more complex systems.
STAD network was constructed based on differentially expressed RNAs among normal and tumor samples. Then the most expression-correlated RNAs were selected to identify STAD-related lncRNAs. Since there is no annotation for lncRNAs of our STAD network we investigated their function and effects on cancer initiation and progression, via their mRNA neighbors in the network. First, the STAD mRNAs were mapped to their related GO terms. Then, each GO term was mapped to its related cancer hallmarks. Figure 7 represents the results of this mapping. Those RNAs that were not connected to any hallmark were omitted from the network. Since the formation of 4 different modules in the STAD network could be associated with their biological mechanism, we construct the lncRNA-mRNA-hallmark charts for each module separately. Predicted hallmarks of each lncRNAs illustrated in Supplementary File 7.
In Fig. 7, lncRNAs of Module 1 are connected to 7 cancer hallmarks. Generally, hallmarks with higher degrees seem to be more significant than others. “sustaining proliferative signal” has the highest degree among other hallmarks of module one, which is the most fundamental hallmark of cancer. “Activating invasion and metastasis” is the second high-degree hallmark of this module which most leads to death. This could support the high proportion of STAD first module lncRNAs (5/7 = 0.71) that were associated with survival probability among other modules.
LINC02560 (the node with the highest degree, 33, in the STAD network), is connected to “Activating invasion and metastasis “via LAMA4 from three different GO terms which means there could be a strong relationship between LINC02560 and “Activating invasion and metastasis “. On the other hand, the results of cox proportional regression in the previous part demonstrated that the survival probability of patients with downregulation of LINC02560 is twice less than other patients. Furthermore, LINC02560 is connected to “sustaining proliferative signal” via NAB2 and ARAP2, while NAB2 is an oncogene.
The rest of module one lncRNAs are HCG22, C5orf66-AS1, TINCR, and MIR205HG whose expressions were significantly correlated with the patient’s overall survival. Among these, C5orf66-AS1(node with degree 32 in STAD network), and TINCR are also connected to “Activating invasion and metastasis “via three different GO terms from LAMA4.
In module two “sustaining proliferative signal” is also the highest degree and most significant hallmark. PGM5-AS1, NALT1, MIR1-1HG-AS1, and ENSG00000250734 are connected to this hallmark from 4 different paths. The connection between “Avoiding immune destruction” and lncRNAs of module two is a significant distinction among this module and others.
It seems that some lncRNAs of this module could disrupt immune system. Especially, PGM5-AS1 is the only lncRNA of this module that connects to this hallmark via two separate paths. PGM5-AS1 connects to “Avoiding immune destruction” from SNX30 via endocytosis and on the other hand connects to immune response via CD36. Furthermore, we demonstrate that PGM5-AS1 expression is associated with survival probability.
In module three, “sustaining proliferative signal” and “Activating invasion and metastasis” are high-degree hallmarks that both are associated with oncogenesis. HNF4A-AS1 is connected to “sustaining proliferative signal” via HOXD13 and BST2, while HOXD13 is an oncogene and BST2 involves the growth and development of B-cells. This lncRNA expression is also associated with overall survival. ENSG00000241224.2 is another lncRNA of this module that connects to three cancer hallmarks. It is connected to “sustaining proliferative signal” via HOXD13 oncogene.
The fourth module that consists of significant lncRNA, SOX21-AS1, connects to “deregulating cellular energetics” via glucose metabolic process. Malignant tumor cells exhibit fundamentally altered cellular energetics, such as increased aerobic glycolysis, which may contribute to tumorigenesis and malignancy. The connection of SOX21-AS1 to a notable and different cancer hallmark vindicates its separation from the other three STAD modules.
This kind of analysis for defining the connections among lncRNAs, mRNAs, GO terms, and cancer hallmarks could help to have a more precise and comprehensive view of cancer and direct us toward the detection of underlying cancer-related mechanisms. It also directs us in designing new drugs and therapies.
Validation of STAD key lncRNAs by PubMed literature review
LINC02560 was the highest degree node of the STAD network (degree = 33) and strongly correlated with mortality probability. After fitting the STAD lncRNAs into the multivariate Cox regression model, it had the highest hazard ratio (HR = 2.42) and highest p-value (0.004). It indicates that the probability of death in patients with low expression of LINC02560 is 2.42 times higher than others.
It also has a strong expression correlation with U2AF1oncogene, VTI1A fusion gene, and NAB2 tumor suppressor. According to Fig. 7, LINC02560 might be associated with “sustaining proliferative signal” and “Activating invasion and metastasis”. Zhou et al. have reported that dysregulation of LINC02560 expression was significantly associated with overall survival in squamous cell carcinoma of tongue [18].
C5orf66-AS1 is the second high-degree node of the STAD network (degree = 32) and its’ expression was found to be associated with OS. It also has a strong expression correlation with U2AF1oncogene and VTI1A fusion gene. According to Fig. 7, C5orf66-AS1 might be related to “sustaining proliferative signal” and “Activating invasion and metastasis”. We consider the hallmarks with degree> 2 from the specified lncRNA. C5orf66-AS1 is an antisense lncRNA, and growing evidence has implicated that a large number of antisense lncRNAs play crucial roles in cancer [19,20,21].
Expression levels of C5orf66-AS1 in gastric cancer tissues, serum, and cell lines were used for analysis in [22]. C5orf66-AS1 expression was downregulated in gastric cancer cells compared to that in adjacent normal tissues. Serum C5orf66-AS1 levels were significantly lower in gastric cancer patients than in superficial gastritis and atrophic gastritis patients. Low serum expression of C5orf66-AS1 was associated with an increased risk of gastric dysplasia and gastric cancer. They introduced C5orf66-AS1 a Potential Biomarker for Predicting Early Gastric Cancer [22].
Downregulation and aberrant hypermethylation of C5orf66-AS1 have been detected in limited several tumors. Guo et al. have investigated the expression status and function of C5orf66-AS1 in gastric cardia adenocarcinoma. C5orf66-AS1 was significantly downregulated in GCA tissues and cell lines, and the expression level was associated with TNM stage, pathological differentiation, lymph node metastasis, and distant metastasis or recurrence. C5orf66-AS1 expression was significantly increased in cancer cells after being treated with 5-Aza-dC. Further methylation analysis demonstrated that the aberrant hypermethylation of the regions around the transcription start site of C5orf66-AS1 was more tumor-specific and was associated with its expression. In addition, C5orf66-AS1 inhibited gastric cancer cell proliferation and invasion, and the dysregulation and hypermethylation of the regions around the transcription start site of C5orf66-AS1 were associated with poorer gastric cardia adenocarcinoma patients’ survival [23].
Lu et al. have also demonstrated that C5orf66-AS1 prevents oral squamous cell carcinoma (OSCC) through inhibiting cell growth and metastasis. Their findings revealed that the expression of lncRNA C5orf66-AS1 in OSCC tissues and cells was significantly decreased. Overexpression of C5orf66-AS1 significantly inhibited the proliferation, invasion and migration ability of OSCC cells, and promoted cell apoptosis, while C5orf66-AS1 downregulation presented the opposite effects [24].
Rui et al. have reported that C5orf66-AS1 was significantly upregulated in cervical cancer tissues and cells. Over-expression of C5orf66-AS1 promoted the proliferation of cervical cancer cells, while downregulation of C5orf66-AS1 promoted the apoptosis of cervical cancer cells [25]. Yu et al. have also demonstrated that C5orf66-AS1 is downregulated in pituitary null cell adenomas and is associated with their invasiveness [26]. Sailer et al. have illustrated that knockdown of C5orf66-AS1 significantly inhibits the proliferation, invasion, and migration of U2OS cells in Osteosarcoma, and stimulated cell apoptosis [27].
LINC02487 is the third high-degree node of the STAD network (degree = 31) and also has the highest betweenness centrality among the whole nodes. LINC02487 had a strong expression correlation with oncogenes U2AF1 and BCL2 and VTI1A fusion gene. According to Fig. 6, C5orf66-AS1 might be associated with “sustaining proliferative signal”. Feng et al. have reported tumor Suppressor LINC02487 downregulation in Oral Squamous Cell Carcinoma. Their functional studies demonstrate that overexpression of LINC02487 significantly suppresses cell migration and invasion and also inhibits cell proliferation. For the mechanism, they revealed that LINC02487 directly binds to USP17, a deubiquitinating enzyme, and regulates cell migration and invasion through the USP17-SNAI1 axis in a process that involves epithelial-mesenchymal transition (EMT) [28]. Li et al. also introduced LINC02487as Biomarker for Early Diagnosis of Oral Squamous Cell Carcinoma [29].
HCG22 is the fourth high degree node of the STAD network (degree = 29) and also has the second high-centrality among the whole nodes. HCG22 expression was found to be associated with OS. HCG22 had a strong expression correlation with the U2AF1 oncogene and VTI1A fusion gene. According to Fig. 7, HCG22 might be related to “sustaining proliferative signal” and “Activating invasion and metastasis “.
Wang et al. found that HCG22 was weakly expressed in oral squamous cell carcinoma (OSCC) cells. HCG22 overexpression inhibited cell proliferation and invasion and induced apoptosis in OSCC cells. The levels of PCNA, Vimentin, and Bcl2 were decreased and E-cadherin and Bax expression was elevated in OSCC cells after HCG22 overexpression. Additionally, HCG22 overexpression inhibited the Akt, mTOR, and Wnt/β-catenin pathways. Activation of Akt, mTOR and Wnt/β-catenin pathways attenuated the anti-tumor property of HCG22 in OSCC cells. Furthermore, HCG22 overexpression inhibited the growth of OSCC cells in vitro and in vivo. In conclusion, HCG22 exerted anti-tumor properties in OSCC by inhibiting the Akt, mTOR, and Wnt/β-catenin pathways [30]. Cui et al. also introduces HCG22 as a key lncRNA in competing for endogenous RNA network in laryngeal squamous cell carcinoma [31]. Totally, we obtain 28 research in PubMed that introduce HCG22 as a key lncRNA in diverse cancers, but none of them were investigated on STAD.
ENSG00000233850 is the fifth high degree node of STAD network (degree = 27). This lncRNA has a strong expression correlation with the U2AF1 oncogene and VTI1A fusion gene. According to Fig. 7, it might be associated with “Activating invasion and metastasis”. ENSG00000233850 is a novel lncRNA. It is only reported as a downregulated lncRNA in head-and-neck squamous cell carcinoma [32].
TINCR is a low degree lncRNA of the STAD network (degree = 7) but it has a strong expression correlation with oncogene U2AF1. HCG22 expression was found to be related to OS. According to Fig. 7, it might be related to “Activating invasion and metastasis”. Zheng et al. declare that mounting evidence has indicated that TINCR contributes to various cellular processes, such as proliferation, apoptosis, autophagy, migration, invasion, and metastasis [33].
Zheng et al. has reported that lncRNA TINCR was significantly upregulated in nasopharyngeal carcinoma (NPC) and was positively associated with poor survival. TINCR Silencing inhibited NPC progression and cisplatin resistance. TINCR bound ACLY and protected it from ubiquitin degradation to maintain total cellular acetyl-CoA levels. Accumulation of cellular acetyl-CoA promoted de novo lipid biosynthesis and histone H3K27 acetylation, which ultimately regulated the peptidyl arginine deiminase 1 (PADI1)-MAPK-iMMP2/9 pathway. These findings demonstrate that TINCR acts as a crucial driver of NPC progression and chemoresistance [34].
Trastuzumab resistance followed by metastasis is a major obstacle in improving the clinical outcome of patients with advanced HER-2+ breast cancer. Dong et al. prove that TINCR could promote trastuzumab resistance and the accompanied Epithelial-mesenchymal Transition process in breast cancer. Therefore, TINCR might be a potential indicator for prognosis and a therapeutic target to enhance the clinical efficacy of trastuzumab treatment [35]. Furthermore, Wang et al. demonstrated that serum lncRNA TINCR level was significantly increased in breast cancer, especially in triple-negative breast cancer (TNBC). High circulating lncRNA TINCR was significantly correlated with worse clinicopathological features and clinical outcome of TNBC. Finally, they declared that serum lncRNA TINCR might be a useful novel and non-invasive biomarker for the prognosis prediction of TNBC [36].
Xu et al. found that the expression of TINCR was significantly increased in bladder cancer tissues and cell lines. Moreover, the high expression of TINCR was associated with tumor metastasis and advanced tumor, metastasis stage, as well as reduced overall survival rates of patients with bladder cancer. Further investigation revealed that microRNA-7 was negatively mediated by TINCR in bladder cancer cells. Silencing of TINCR expression significantly increased miR-7 expression and reduced bladder cancer cell proliferation, migration, and invasion, while knockdown of miR-7 expression reversed the inhibitory effects of TINCR downregulation on bladder cancer cells [37]. Increasing number of research on TINCR lncRNA in diverse cancer from 2019 declares its impressive role.
MIR205HG is a low degree node in the STAD network, but it has a strong expression correlation with NAB2 tumor suppressor. On other hand, MIR205HG expression dysregulation was found to be associated with OS. According to Fig. 7, it might be related to “sustaining proliferative signal”. MIR205HG acts as a competing endogenous RNA to expedite cell proliferation and progression via targeting miR-299-3p in lung squamous cell carcinoma [38]. MIR205HG was upregulated in cervical cancer tissues and cell lines, and its depletion inhibited the proliferation and metastasis of cervical cancer cells [39]. Long Non-coding MIR205HG Depletes Hsa-miR-590-3p Leading to unrestrained proliferation in Head and Neck Squamous Cell Carcinoma [40]. Song et al. studied the involvement of the miR205HG in the development of Barrett’s esophagus (BE) and esophageal adenocarcinoma (EAC) to clarify the role of miR205HG in in-vitro and in-vivo experiments. They revealed that miR205HG was downregulated in EAC vs. normal esophageal epithelia (NE) as well as in EAC cell lines, and its forced overexpression inhibited EAC cell proliferation and cell cycle progression in vitro. Similarly, overexpression of MIR205HG inhibited xenograft tumor growth in mice in vivo. Finally, their findings declared that MIR205HG functions as a tumor suppressor in the development of BE and EAC, at least in part through its effect on the Hedgehog signaling pathway [41].
Up to here, we investigate the role of module 1 lncRNAs in the literature. In module 2, PGM5-AS1 is the node with the highest degree (degree = 18). This lncRNA expression dysregulation was found to be associated with OS. “sustaining proliferative signal” and “avoiding immune destruction” are the associated high degree hallmarks of this lncRNA.
Du et al. study the inhibitory role of human PGM5-AS1 in the proliferation and apoptosis of prostate cancer (PC) cells. PGM5-AS1 is expressed at low levels in PC cell lines. Forced overexpression of PGM5-AS1 restricted proliferation and facilitated apoptosis of PC cells, manifesting in suppressed xenograft tumor growth in nude mice. They validated that the anti-cancer role of PGM5-AS1 in PC cells was achieved by binding to miR-587 to promote the expression of GDF10 [42].
Zhou et al. found that PGM5-AS1 was low expressed in human colorectal cancer cells, which could promote tumor proliferation, migration, and invasion by serving as a molecular sponge. It modulated the inhibitory effect of miR-100-5p on tumor suppressor gene SMAD4 [43].
According to Qian et al.’s findings, the PGM5-AS1expression in clear cell renal cell carcinoma (CCRCC) specimens was lower than those in matched non-tumors. Downregulation of PGM5-AS1 was closely associated with more advanced clinical features, including lymph nodes metastasis and distant metastasis. A clinical study revealed that CCRCC patients with lower PGM5-AS1 expressions had substantially shorter overall survival (OS) and disease-free survival (DFS) than patients with higher PGM5-AS1 expressions. Further multivariate assays demonstrated that PGM5-AS1 was identified as an independent prognostic factor for patients with CCRCC. Finally, they concluded that Down-regulation of PGM5-AS1 in CCRCC tissues had a strong association with unfavorable outcomes and PGM5-AS1 might be a potential tumor suppressor [44].
NALT1 is the second high-degree node of module 2 of the STAD network (degree = 17). This lncRNA expression dysregulation was found to be associated with OS. According to Fig. 7, it might affect “sustaining proliferative signal”. Yet, no experimental study investigates its role in cancer initiation or progression.
The HAND2-AS1 degree in module 2 is 15. It also holds the highest betweenness centrality among whole nodes of module 2. Da et al. reviewed HAND2- AS1 related studies in late 2020. They reported that HAND2- AS1 is downregulated in common tumors such as hepatocellular carcinoma (HCC) and colorectal cancer (CRC). Low expression of HAND2-AS1 is associated with the early development and prognosis of many types of tumors. They believe that it can act as a prognostic marker and therapeutic target for many kinds of tumors. Regulation of HAND2-AS1 in numerous tumor systems can affect the proliferation and other phenotypes of cancer cells, such as osteosarcoma, esophageal squamous cell carcinoma, endometrioid carcinoma, and colorectal cancer [45].
MIR1-1HG-AS1, ENSG00000250734, ENSG00000223774 are other lncRNAs of module 2 that were differentially expressed among tumor and normal samples and had strong correlations with module 2 lncRNAs. These lncRNAs are novel and we hadn’t found their related study in the literature. According to Fig. 7, we believe that they might be associated with “sustaining proliferative signal”.
In module 3, lncRNA ENSG00000241224.2 is the sixth high-degree node of the STAD network (degree = 6) and the first high-degree node in the module. It is also the fourth high-betweenness node of the STAD network. This lncRNA expression dysregulation was found to be associated with OS. Though it is a novel lncRNA, it has a strong correlation expression with HOXD13 oncogene. Although there is no evidence about the role of ENSG00000241224.2 on cancer, its’ probable relationships with other mRNAs, GO terms, and cancer hallmarks are illustrated in Fig. 7.
HNF4A-AS1 is another lncRNA of module 3 that has a strong correlation expression with HOXD13 oncogene and its expression dysregulation was found to be associated with OS. HNF4A-AS1 possible relationship with other mRNAs, GO terms, and cancer hallmarks illustrated in Fig. 7. HNF4A-AS1 is introduced as a prognostic biomarker in lung adenocarcinoma and hepatocellular carcinoma [46, 47]. Song et al. also suggested that therapeutic targeting of HNF4A-AS1/hnRNPU/CTCF axis inhibits aerobic glycolysis and neuroblastoma progression [48].
SOX21-AS1 is the only lncRNA of module 4 and its loneliness could be associated with it’ distinct function. SOX21-AS1 expression dysregulation was found to be associated with OS. This significant lncRNA was selected for STAD lncRNA signature in previous parts. According to Fig. 7, SOX21-AS1 might associate with “deregulating cellular energetics”. Lack of sufficient predefined relations among GO terms and cancer hallmarks, leads to limited number of connections among this lncRNA and cancer hallmarks. Literature review declares that LncRNA SOX21-AS1 is dysregulated in many types of human cancers, and functions as a tumor suppressor or promoter depending on tumor types [49].
Gai et al.’s findings illustrated SOX21-AS1 upregulation in glioma tissues and cells. In vitro, SOX21-AS1 knockdown repressed proliferation, migration, invasion, and enhanced apoptosis in glioma cells. In vivo, SOX21-AS1 knockdown suppressed tumor growth in mice. Their results indicated that SOX21-AS1/miR-144-3p/PAK7 axis played an oncogenic role in glioma cells by regulating the Wnt/β-catenin pathway, which suggests a rational therapeutic strategy for glioma [50]. To the best of our knowledge, yet no experimental study investigates the SOX21-AS1 role in STAD, but our results confirm its key regulatory role in STAD.
This comprehensive literature review of significant lncRNAs of the STAD network provides a comprehensive view that could help to investigate STAD biomarkers and therapeutic targets.