A novel regulatory event-based gene set analysis method for exploring global functional changes in heterogeneous genomic data sets
© Tung et al; licensee BioMed Central Ltd. 2009
Received: 14 November 2008
Accepted: 16 January 2009
Published: 16 January 2009
Analyzing gene expression data by assessing the significance of pre-defined gene sets, rather than individual genes, has become a main approach in microarray data analysis and this has promisingly derive new biological interpretations of microarray data. However, the detection power of conventional gene list or gene set-based approaches is limited on highly heterogeneous samples, such as tumors.
We developed a novel method, the regulatory e vent-based G ene S et A nalysis (eGSA), which considers not only the consistently changed genes but also every gene regulation (event) of each sample to overcome the detection limit. In comparison with conventional methods, eGSA can detect functional changes in heterogeneous samples more precisely and robustly. Furthermore, by utilizing eGSA, we successfully revealed novel functional characteristics and potential mechanisms of very early hepatocellular carcinoma (HCC).
Our study creates a novel scheme to directly target the major cellular functional changes in heterogeneous samples. All potential regulatory routines of a functional change can be further analyzed by the regulatory event frequency. We also provide a case study on early HCCs and reveal a novel insight at the initial stage of hepatocarcinogenesis. eGSA therefore accelerates and refines the interpretation of heterogeneous genomic data sets in the absence of gene-phenotype correlations.
In the past decade microarray technology has become a popular tool for identifying differentially expressed genes (DEGs) associated with a given phenotype or sample classification. The biological interpretation of transcriptomic changes is commonly revealed by gene function annotation enrichment analysis based on a list of statistically selected DEGs. It is also called as Individual Gene Analysis (IGA) by Nam . However, since only the most significant portion of DEGs was taken into account, the small set of genes might not perfectly represent the whole transcriptomic changes [1–5]. Indeed, Pan et al. showed that the selections of DEGs significantly affected the IGA results . In addition, IGA usually tests functional changes by annotation enrichment methods such as hypergeometric test. These algorithms equally count each DEG so that the significance levels of gene-phenotype correlation are flattened .
Recently, gene set-based approaches have been proposed to overcome the drawbacks of IGA [1, 3, 7]. In principle, these approaches measure the gene-phenotype correlations (e.g. t-statistic) of every gene in a predefined gene set (GS), usually according to the Gene Ontology (GO) categories or the sets of genes representing biological pathways in the cell, and then give a GS score to represent its changes associated with the phenotype . The statistical significance of the GS score can be determined by two different null hypotheses: competitive (Q1) and self-contained (Q2). In Q1 test, it assumes the genes in a gene set have the same level of association with the phenotype compared with the rest of the genes; in Q2 test, it assumes no gene in the gene set is associated with the phenotype. Although the gene set-based approaches are promising in deriving new information, their limitations and the underlying hypotheses have been discussed intensively. For example, Q1 approaches are very sensitive to correlation structure in gene sets that tend to give false positives . Q2 approaches have low power of detection in highly varying samples such as clinical data sets . Moreover, in Q2 test, it can be detected as significant by chance when the size of the gene set is large. Another approach, called gene set enrichment analysis (GSEA), which is a combination of two hypotheses (mixed model, Q3) has become one of the most mentioned and used gene set-based approaches [5, 8]. However, a series of reports have also shown that GSEA inherited the limitations from both null hypotheses [3, 8, 9].
Here, we propose a novel method, called regulatory e vent-based G ene S et A nalysis (eGSA), to evaluate the significantly changed functions or pathways from diverse genomic data sets. eGSA is not based on the gene-phenotype correlations, but rather on gene expression regulatory events which are determined by comparing each gene expression level with the corresponding reference data set (Figure 1). Total regulatory events in a given gene set (GS) are counted as a GS score and then the significance of GS is tested by Q1 null hypothesis using hypergeometric test. Comparing with the approaches based on gene-phenotype correlation, we show in this study that eGSA can successfully derive new functional information from gene expression profiles that are very heterogeneous in nature.
Results and discussion
Heterogeneity of tumor transcriptome
The limitation of gene-phenotype approaches on tumor samples
The transcriptomes of cirrhosis samples (§ in Figure 2A) are homogeneous and their transcriptome profiles (blue circle) as well as that of normal livers (open circle) are closely self-clustered in MDS plot (Figure 2B). Both very early HCC (# in Figure 2A) and very advanced HCC samples (‡ in Figure 2A) are heterogeneous and their transcriptome profiles are dispersed (very early HCC: green circle, very advanced HCC: red circle; Figure 2B). We also added an extreme case, 4 normal liver and 4 normal spleen samples in the analysis. The gene expression patterns of normal tissues are tightly controlled by tissue-specific regulation. The high homogeneity is shown in Figure 2C and ¤ in Figure 2A. A huge transcriptome difference (0.120 ± 0.009, Sts: Tissue comparison in Figure 3F) can be detected between the two tissue types.
To estimate background variations for each data set, we compared the t-distributions of these four data sets (Sci, Sve, Sva and Sts) with their corresponding null data sets (Snull), which have the same sample number but their gene signals were randomly sampled from normal distribution (μ = 0, σ2 = 1). In such comparison, heavier tail distribution indicates more genes have significant gene-phenotype correlations, i.e., data sets with heavy tails are more readily being analyzed by conventional t value-based methods. As an extreme case, Sts showed the most heavy tail distributions (black line in Figure 3D) among all tested data sets. We then compared two data sets with similar transcriptomic change level (Sci and Sve, Figure 3F). The tail distributions of Sci(blue line, Figure 3A) are heavier than those of Sve(green line, Figure 3B). Even in Sva(red line, Figure 3C), which contains over 2-fold dissimilarity than Sci(Figure 3F), the tail distributions (red line, Figure 3D) were still smaller than those of Sci(blue line, Figure 3D). The variation of t value is less informative in heterogeneous data sets. Classical approaches are therefore more applicable to homogenous data sets, but not heterogeneous ones (such as clinical tumor samples).
In gene set-based analysis (GSA), the GS score, which is summarized from gene-phenotype correlations of all members in a gene set, is used to estimate functional changes (Figure 1). Because GSA bypasses the threshold selection step, it can avoid the dilemma mentioned above. However, we wondered if such algorithms, based on the gene-phenotype correlation, also decrease their detection power in heterogeneous samples. We calculated the average t-statistic for each gene set (GS) to get GS scores. The increasing or decreasing of GS score represents the degree of gene set correlation with phenotype. We compared scores distribution of all data sets in Figure 3E. Sts shows the most significant GS score variation (-5.5~22.7), which represents the obvious function difference between liver and spleen. The score variations of Sci, Sve and Sva are much smaller than Sts since they have similar tissue background when compared with reference samples, the normal liver. Although Sci has the smallest transcriptomic changes (Figure 3F), Sci shows a more significant GS score variation (-5.22~-6.20) than the other data sets do (Sve: -3.35~4.52, Sva: -4.75~4.09). The result implies that the heterogeneity of tumor samples reduces the detection power of GSA too.
Threshold insensitivity and robustness of eGSA
Recognizing the limitations of current gene- or gene set-based methods on heterogeneous samples, we alternatively analyzed transcriptomic changes by using regulatory events. Regulatory event is defined as a gene signal change of a test sample compared with a reference sample pool (see Materials and Methods). The distribution of regulatory event frequency in all data sets is clearly distinguishable from null data sets (Figure 4A) and the gradual increasing of tail distributions correlates with their transcriptomic change levels (Figure 3F). This result implies regulatory event is a better approach for analyzing heterogeneous samples. Based on this observation, we designed a novel regulatory event-based strategy, which we called the regulatory e vent-based G ene S et A nalysis (eGSA), for the exploration of significant gene sets and compensate for the insufficiency of traditional method in heterogeneous samples.
One might argue that the regulatory event determination is still under an arbitrary cut-off threshold and its results also suffer from the same dilemma of threshold selection as IGA does. This argument can be partially answered by the insensitivity of regulatory event detection in different threshold values (see Figure 4B). The number of differentially expressed genes (DEGs) is sensitive to the threshold selection and is dramatically decreased when a stringent threshold is applied. Under a strict threshold (q-value < 1.0E-3), 1204 genes could be detected in homogenous data set (Sci, Figure 4B). However, only a few differentially expressed genes could be detected in heterogeneous data sets (Sve: 1 and Sva: 229, Figure 4B). In contrast, even under the most stringent threshold in our test (q-value < 1.0E-7), thousands of regulatory events could be detected robustly. Moreover, the event numbers of those data sets also correlated with the level of transcriptomic changes (Figure 3F).
The threshold insensitivity of regulatory event detection also reflects the robustness of eGSA. We selected the lipid metabolic process (GO:0006629) category in the GO databases as an example because both liver cirrhosis and carcinogenesis cause the defective lipid metabolism [11, 12]. Obvious down-regulation of this functional category is expected in all Sci, Sve and Sva data sets (red line, Figure 4C–E). In IGA, hypergeometric test p-values increase heavily toward the background values of the opposite regulation when higher threshold stringency was used in heterogeneous data set (dash line, Figure 4D–E). IGA even loses the detection power due to no annotated de-regulated genes (dash line, Figure 4C–E). Although the homogeneous data set, Sci, showed higher tolerance to stringent thresholds, the p-values appeared unstable when different thresholds were selected (red dash line, Figure 4C). Nevertheless, in eGSA the p-values are more stable in both homogeneous and heterogeneous data set (red solid line in Figure 4C–E) and could be distinguished from the background values in all tested threshold values. Conclusively, measuring regulatory events is insensitive to threshold selection and can be performed more robustly in hypergeometric test.
Precise and broad biological interpretation of transcriptomechanges by eGSA
In Sts, the most distinctive data set (Figure 3F), the ranking orders of gene set are highly correlated between eGSA and IGA (Figure 5A), suggesting that eGSA has comparable detection power to that of IGA in homogeneous samples. This is due to the fact that most of the consistent gene regulations in Sts can be detected by both strategies. In Sve, the most indistinctive data set (Figure 3F), the top- and bottom-ranked gene sets of IGA and eGSA are correlated and are aligned in the diagonal corners (Figure 5B). This is because these gene sets represent the most consistent changes. However, eGSA detected more up-regulatory gene sets that were under-estimated (red nodes in the 2nd quadrant) or missed (green triangle) by IGA. This data demonstrates that eGSA has a broader scope of analysis than IGA. Moreover, several down-regulated GSs (blue nodes in the 4th quadrant) are misidentified as up-regulated GSs by IGA (Figure 5B).
We selected two extremely contradictory gene sets to demonstrate the different preferences between these two methods. The first gene set, the regulation of signal transduction (GO:0009966; indicated by a red arrow in Figure 5B), is highly-ranked in eGSA (170th) but not in IGA (1554th). The second gene set, the amino acid metabolic process (GO:0006520; blue arrow in Figure 5B), is highly ranked in IGA (94th) but not in eGSA (1359th). The details of both gene sets, including their differentially expressed genes, regulatory events and signal values, are shown in Figure 5C and 5D. In the first gene set, regulatory event-based analysis shows that 20% of genes (total regulatory events/(total genes × total test samples)) are up-regulated, which is in agreement with the observation in sample signal values. However, only 4% up-regulatory de-regulated genes can be detected due to the lack of consistency in gene expression level (Figure 5C). This also suggests that IGA may sometimes under-estimate the significance of a gene set. In the second GS, IGA detects a certain number of consistently up-regulatory de-regulated genes and conclude this gene set as a significantly up-regulated one (p = 7.99E-05). But this doesn't perfectly present the fact that there are actually more down-regulatory (24%) than up-regulatory (15%) expression changes in this gene set (Figure 5D). We conclude that for heterogeneous samples, since eGSA takes into account all REs in a gene set, this novel strategy can overcome certain limitations of IGA.
Functional patterns of very early HCC
To demonstrate the advantage of eGSA, we applied eGSA on the discovery of biological functions associated with liver cancer tumorigenesis. Hepatocellular carcinoma (HCC) is characterized as an obvious multistage process, just like many other types of human tumors. Understanding the de-regulated biological functions in early stage HCC will help to reveal the initial processes of hepatocarcinogenesis. The insights of early HCC functional changes and regulatory mechanisms are far from clear due to the lack of common oncogenes and tumor suppressors. One of the major difficulties in identifying common regulatory mechanisms is the genetic heterogeneity of HCCs [13, 14]. To overcome this heterogeneity issue, we applied eGSA on the analysis of the initial functional changes of HCV-induced HCCs (HCC2) .
In the eGSA results, the major functional changes are basically consistent with the original observations and previous descriptions on HCC [11, 15–17]. Moreover, eGSA provides new insights into the initial processes of HCC. Gene sets related to major hepatic metabolisms, including cholesterol, steroid, hormone and amino acid metabolisms, are significantly down-regulated (see section B of Additional file 2). The down-regulation of these biological functions may be due to the failure of normal hepatic functions. On the contrary, nucleoside and protein metabolisms required for rapid cell growth are significantly up-regulated. These include pyrimidine nucleoside metabolism (p value is 2.3E-5, ranking order is 89th), ubiquitin-mediated protein degradation (4.76E-5, 37th), tRNA aminoacrylation (3E-3, 98th) and ribosome biogenesis and assembly (9.52E-4, 76th) (see section A of Additional file 2).
The down-regulation of lipid metabolisms has been frequently reported in liver diseases [11, 12]. In the initial stage, fatty acid metabolism (2.6E-5, 22th), steroid biosynthesis (6.5E-3,77th) and sterol related metabolisms, e.g. hormone metabolism (6.4E-3,76th), are all down regulated since several common enzymes are shared within these metabolic processes (see B section in Additional file 2). Although lacking a consistent conclusion, several studies did report that a sterol hormone, androgen, was associated with HCC development: high level serum androgen was found as risk factor for HCC , and androgen receptor has also been reported to promote cell growth and anti-apoptosis in hepatic cell line model . Hence, the linkages between the down-regulations of sterol metabolisms and the abnormal hormone metabolisms, and their effects on HCC initiation could be interesting and worthy of further study.
Escaping and surviving from attacks by the immune system are two other significant functional changes in HCC initiation. Certain immune functions, such as the inflammatory response (8.0E-3, 81th), innate immune response (3.5E-3, 64th) and immune effector process (9.2E-3, 86th), are down-regulated while the survival pathway, the regulations of IκkB/NFκB cascade [20, 21] (2.6E-3, 93th), is up-regulated (Additional file 2). These observations are supported by many studies. For example, the number of different types of inflammatory cells are decreased in HCC tissue sections ; the suppression of T cell-mediated immune responses is found in HCC patients and is associated with poor prognosis [23, 24]. To understand the regulatory mechanisms, we analyzed the biological pathways involved in those immune-related gene sets. Genes with high regulatory event frequency (up + down > 0.8) were annotated with their involved KEGG pathways , and the most annotated pathways are listed in Additional file 3. Several biological signaling pathways are highlighted, including the Toll-like receptor, JAK-STAT, MAPK and T cell receptor signaling pathway. Besides, a number of immune-related processes are also highlighted, such as the cell adhesion molecules, cytokine-cytokine receptors, antigen-processing and presentation, and NK cell mediated cytotoxicity (see Additional file 3).
CCNA2 is considered as an important factor leading to cancer because of its dual roles in both S and M phase [27, 28]. The increased expressions of CCNA2 mRNA and protein were observed in many clinical HCC studies [27, 29, 30]. However, in cell model experiments, over-expression of CCNA2 alone could not promote cell cycle but delayed metaphase/anaphase onset . These findings suggest that the high expression of CCNA2 might simply reflect high tumor proliferation rate rather than promote tumorigenesis .
In our study, CCNA2 appears as a most frequent and earliest up-regulated cyclin at the initial stage. This strongly suggests that CCNA2 might play a crucial role in HCC initiation. Accompanying CCNA2, several mitosis-related regulators, such as CDC2, CCNB1/2 and CDC20, are synchronically up-regulated (Figure 7), indicating that all these mitosis genes are also required for HCC initiation. However, previous individual gene studies did not test the gene combination (see Additional file 4). Besides, an interesting supportive mechanism, the up-regulation of ubiquitin mediated proteolysis activity (4.76E-05, 37th), could also potentially contribute to the rapid turn-over of cyclin and other mitosis components, and then help to complete the cell cycle (see section A of Additional file 2).
The common transcriptomic heterogeneity of tumor reduces the detection power of gene-list or gene-set analysis to identify functional patterns of transcriptome profiles. We developed eGSA, a novel method based on regulatory events, to overcome such limitations. eGSA is insensitive to threshold bias and can provide more robust and precise results than IGA. These properties make eGSA an excellent approach to analyze complex tumor transcriptome profiles.
eGSA overlooks the highly heterogeneous regulatory mechanisms and directly targets the question "what happened on cellular functions?". Once functional issues are addressed, the regulatory event frequency can be applied to highlight potential regulatory routines in the gene network and to present their prevalent regulatory mechanisms. The identification of these regulatory routines will greatly accelerate the development of pharmaceutical targeting strategies and personalized therapy.
Along with this study, we also noticed a few limitations of eGSA. First, the determination of regulatory events is heavily dependent on a stable and highly correlated reference pool. This requirement cannot always be met in many cases, such as the comparison between two heterogeneous sample types. Second, the biological interpretation of eGSA relies heavily on the definition and composition of a gene set. Several unsolved problems, such as name-space issues, imprecise or incorrect annotations, will interfere with eGSA results. It still lacks an optimization process to avoid these problems so far. Finally, like most gene set-based analysis, eGSA counts the contribution of each gene in biological regulations equally. This equality is not always true in a cell because certain genes are more crucial in changing the whole cellular function than others. Hence, an advanced weighting algorithm of gene set definition remains to be developed for more precise biological function interpretation.
Data collections and preprocessing
Six independent data sets (Normal, HCC1, HCC2, Tumor1, Tumor2, Tumor3), including one normal tissue data set (GSE3526), two HCC data sets (E-TABM-36 and GSE6764), and data sets for other three tumor types: thyroid cancer (GSE3678), colon cancer(GSE4107) and lung cancer (GSE7670), were downloaded from two public archives (NCBI GEO  and ArrayExpress ). The sample accession numbers and tissue types are listed in Additional file 5. All data were implemented in Affymetrix U133A and U133 Plus 2.0 platform, and the raw data (.CEL) were normalized by RMA algorithm using statistic software Partek® Genomics Suite™ (Partek Inc., St. Louis, MO, USA). Affymetrix probe set IDs were converted to UniProt IDs before mapping onto the GO database. The expression level of each UniProt ID (gene) is the average of the corresponding probe set signals. In total, 26300 genes were summarized from 54675 probe sets of U133 plus 2.0 platform and 18033 genes were annotated in GO.
Gene set (GS) definition
GS is comprised of genes with the same GO term. By using the biological process category of GO, there are 4903 GSs encapsulated from the Affymetrix U133 Plus 2.0 platform. After excluding the GSs containing less than 10 genes, 1643 GSs were used for GS-based analysis. The GS mapping, gene ID converting and summarization were processed by Microsoft Access 2007 and VBA.
The measurement of sample heterogeneity
We measured the Pearson dissimilarity distance (defined as [1 - r]/2, where r is the Pearson correlation) to present the transcriptome difference between two samples. The heterogeneity of a sample type was presented by the mean of distances between paired samples. The transcriptomic change between two sample types was presented by the mean of all pairwise distances between members of the two groups concerned. Multidimensional scaling plot (MDS) was used to illustrate the distances among samples in a data set . The calculation of Pearson dissimilarity distance and MDS plotting were performed using the statistical software Partek Genomic Suit.
The measurement of gene-phenotype correlation
In homogeneous data sets, the gene-phenotype correlations were measured by the Student's t-test. Alternatively, Welch's t test was applied for the heterogeneous data set due to the possible unequal variance observed in the two samples types. The t- statistic of each gene was applied for GSA scoring (average t-statistics) and t-distribution plot. For detecting differentially expressed genes, the multiple-test correction of p-value was performed by positive false discovery rate (q-value)  and genes were filtered by a given q-value threshold. The t- statistic of genes and positive false discovery rate (q-value) were calculated by the Partek Genomic Suit.
In a data set, samples were classified into test and reference groups. The statistical significance of gene signal (x) in a test sample was estimated by the cumulative probability distribution of the normal distribution. The signal mean (μ) and standard deviation (σ) of reference sample pool were calculated and applied to cumulative probability distribution function in Microsoft Excel, NORMDIST(x, μ, σ, True). The multiple-test correction of p-values was performed by q-value. The significant signal change, as an expression regulatory event (RE), was determined by a given significance cut-off. According to the regulated direction, RE was defined as up-regulatory event (up-RE) or down-regulatory event (down-RE).
Event-based gene set analysis (eGSA)
eGSA contains two processes to estimate the significance level of a gene set (GS): (1) GS scoring and (2) Gene set statistic (Figure 1).
(1) GS scoring
In an event matrix, up-regulatory event (RE) frequency and down-RE frequency were calculated respectively as:
Gene RE frequency = ΣRE j /j,
where j is the sample number of a given test sample type. A GS score (k) is the sum of gene RE frequency in a GS and calculated as:
k = ΣRE ij /j,
where i is the gene number of a GS.
(2) Gene set statistic
where m is the total gene number in a GS, N is the total gene number in a microarray platform, and n is the average REs in a sample type. The calculations of hypergeometric test were performed by R packages http://www.r-project.org/.
Simulated null data set
We generated a null data set to mimic random variation of a testing data set. The null data set has the same sample number and gene number as the test data set but its signal values were randomly sampled from normal distribution. The simulated data set was generated by using a standard function, NORMSINV(RAND), in Microsoft Excel 2007.
differentially expressed gene
individual gene analysis
event-based gene set analysis
expression regulatory event
This work is supported by grants from the Nation Science Council (NSC), Taiwan (NSC95-3112-B-010-006, NSC 96-3112-B-010-012 and NSC 95-2321-B-010-005), in part by another grant from the Ministry of Education, Aim for the Top University Plan. We also acknowledge the efforts of Wurmbach, E., Roth, R.B., Boyault, S., Hong, Y., Ismael, R. and Su, L et. al. for their valuable microarray data.
- Nam D, Kim SY: Gene-set approach for expression pattern analysis. Brief Bioinform. 2008, 9 (3): 189-197.View ArticlePubMedGoogle Scholar
- Khatri P, Draghici S: Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics. 2005, 21 (18): 3587-3595.PubMed CentralView ArticlePubMedGoogle Scholar
- Goeman JJ, Buhlmann P: Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics. 2007, 23 (8): 980-987.View ArticlePubMedGoogle Scholar
- Barry WT, Nobel AB, Wright FA: Significance analysis of functional categories in gene expression studies: a structured permutation approach. Bioinformatics. 2005, 21 (9): 1943-1949.View ArticlePubMedGoogle Scholar
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102 (43): 15545-15550.PubMed CentralView ArticlePubMedGoogle Scholar
- Pan KH, Lih CJ, Cohen SN: Effects of threshold choice on biological conclusions reached during analysis of gene expression by DNA microarrays. Proc Natl Acad Sci USA. 2005, 102 (25): 8961-8965.PubMed CentralView ArticlePubMedGoogle Scholar
- Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ: Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci USA. 2005, 102 (38): 13544-13549.PubMed CentralView ArticlePubMedGoogle Scholar
- Damian D, Gorfine M: Statistical concerns about the GSEA procedure. Nat Genet. 2004, 36 (7): 663-author reply 663View ArticlePubMedGoogle Scholar
- Mlecnik B, Scheideler M, Hackl H, Hartler J, Sanchez-Cabo F, Trajanoski Z: PathwayExplorer: web service for visualizing high-throughput expression data on biological pathways. Nucleic Acids Res. 2005, W633-637. 33 Web Server
- Wang HW, Trotter MW, Lagos D, Bourboulia D, Henderson S, Makinen T, Elliman S, Flanagan AM, Alitalo K, Boshoff C: Kaposi sarcoma herpesvirus-induced cellular reprogramming contributes to the lymphatic endothelial gene expression in Kaposi sarcoma. Nat Genet. 2004, 36 (7): 687-693.View ArticlePubMedGoogle Scholar
- Liu Y, Zhu X, Zhu J, Liao S, Tang Q, Liu K, Guan X, Zhang J, Feng Z: Identification of differential expression of genes in hepatocellular carcinoma by suppression subtractive hybridization combined cDNA microarray. Oncol Rep. 2007, 18 (4): 943-951.PubMedGoogle Scholar
- Jiang J, Nilsson-Ehle P, Xu N: Influence of liver cancer on lipid and lipoprotein metabolism. Lipids Health Dis. 2006, 5: 4-PubMed CentralView ArticlePubMedGoogle Scholar
- Unsal H, Yakicier C, Marcais C, Kew M, Volkmann M, Zentgraf H, Isselbacher KJ, Ozturk M: Genetic heterogeneity of hepatocellular carcinoma. Proc Natl Acad Sci USA. 1994, 91 (2): 822-826.PubMed CentralView ArticlePubMedGoogle Scholar
- Thorgeirsson SS, Grisham JW: Molecular pathogenesis of human hepatocellular carcinoma. Nat Genet. 2002, 31 (4): 339-346.View ArticlePubMedGoogle Scholar
- Wurmbach E, Chen YB, Khitrov G, Zhang W, Roayaie S, Schwartz M, Fiel I, Thung S, Mazzaferro V, Bruix J, et al: Genome-wide molecular profiles of HCV-induced dysplasia and hepatocellular carcinoma. Hepatology. 2007, 45 (4): 938-947.View ArticlePubMedGoogle Scholar
- Choi JK, Choi JY, Kim DG, Choi DW, Kim BY, Lee KH, Yeom YI, Yoo HS, Yoo OJ, Kim S: Integrative analysis of multiple gene expression profiles applied to liver cancer study. FEBS Lett. 2004, 565 (1–3): 93-100.PubMedGoogle Scholar
- Boyault S, Rickman DS, de Reynies A, Balabaud C, Rebouissou S, Jeannot E, Herault A, Saric J, Belghiti J, Franco D, et al: Transcriptome classification of HCC is related to gene alterations and to new therapeutic targets. Hepatology. 2007, 45 (1): 42-52.View ArticlePubMedGoogle Scholar
- Tanaka K, Sakai H, Hashizume M, Hirohata T: Serum testosterone:estradiol ratio and the development of hepatocellular carcinoma among male cirrhotic patients. Cancer Res. 2000, 60 (18): 5106-5110.PubMedGoogle Scholar
- Jie X, Lang C, Jian Q, Chaoqun L, Dehua Y, Yi S, Yanping J, Luokun X, Qiuping Z, Hui W, et al: Androgen activates PEG10 to promote carcinogenesis in hepatic cancer cells. Oncogene. 2007, 26 (39): 5741-5751.View ArticlePubMedGoogle Scholar
- Vainer GW, Pikarsky E, Ben-Neriah Y: Contradictory functions of NF-kappaB in liver physiology and cancer. Cancer Lett. 2008Google Scholar
- Arsura M, Cavin LG: Nuclear factor-kappaB and liver carcinogenesis. Cancer Lett. 2005, 229 (2): 157-169.View ArticlePubMedGoogle Scholar
- Cervello M, Foderaa D, Florena AM, Soresi M, Tripodo C, D'Alessandro N, Montalto G: Correlation between expression of cyclooxygenase-2 and the presence of inflammatory cells in human primary hepatocellular carcinoma: possible role in tumor promotion and angiogenesis. World J Gastroenterol. 2005, 11 (30): 4638-4643.PubMed CentralPubMedGoogle Scholar
- Ormandy LA, Hillemann T, Wedemeyer H, Manns MP, Greten TF, Korangy F: Increased populations of regulatory T cells in peripheral blood of patients with hepatocellular carcinoma. Cancer Res. 2005, 65 (6): 2457-2464.View ArticlePubMedGoogle Scholar
- Fu J, Xu D, Liu Z, Shi M, Zhao P, Fu B, Zhang Z, Yang H, Zhang H, Zhou C, et al: Increased regulatory T cells correlate with CD8 T-cell impairment and poor survival in hepatocellular carcinoma patients. Gastroenterology. 2007, 132 (7): 2328-2339.View ArticlePubMedGoogle Scholar
- Masoudi-Nejad A, Goto S, Endo TR, Kanehisa M: KEGG Bioinformatics Resource for Plant Genomics Research. Methods Mol Biol. 2007, 406: 437-458.PubMedGoogle Scholar
- Masaki T, Shiratori Y, Rengifo W, Igarashi K, Yamagata M, Kurokohchi K, Uchida N, Miyauchi Y, Yoshiji H, Watanabe S, et al: Cyclins and cyclin-dependent kinases: comparative study of hepatocellular carcinoma versus cirrhosis. Hepatology. 2003, 37 (3): 534-543.View ArticlePubMedGoogle Scholar
- Yam CH, Fung TK, Poon RY: Cyclin A in cell cycle control and cancer. Cell Mol Life Sci. 2002, 59 (8): 1317-1326.View ArticlePubMedGoogle Scholar
- Pagano M, Pepperkok R, Verde F, Ansorge W, Draetta G: Cyclin A is required at two points in the human cell cycle. EMBO J. 1992, 11 (3): 961-971.PubMed CentralPubMedGoogle Scholar
- Zhang Y, Peng Z, Qiu G, Wang Z, Gu W: Overexpression of cyclin A in hepatocellular carcinoma and its relationship with HBx gene integration. Zhonghua Zhong Liu Za Zhi. 2002, 24 (4): 353-355.PubMedGoogle Scholar
- Desdouets C, Thoresen GH, Senamaud-Beaufort C, Christoffersen T, Brechot C, Sobczak-Thepot J: cAMP-dependent positive control of cyclin A2 expression during G1/S transition in primary hepatocytes. Biochem Biophys Res Commun. 1999, 261 (1): 118-122.View ArticlePubMedGoogle Scholar
- den Elzen N, Pines J: Cyclin A is destroyed in prometaphase and can delay chromosome alignment and anaphase. J Cell Biol. 2001, 153 (1): 121-136.PubMed CentralView ArticlePubMedGoogle Scholar
- Barrett T, Suzek TO, Troup DB, Wilhite SE, Ngau WC, Ledoux P, Rudnev D, Lash AE, Fujibuchi W, Edgar R: NCBI GEO: mining millions of expression profiles – database and tools. Nucleic Acids Res. 2005, D562-566. 33 Database
- Parkinson H, Kapushesky M, Shojatalab M, Abeygunawardena N, Coulson R, Farne A, Holloway E, Kolesnykov N, Lilja P, Lukk M: ArrayExpress – a public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 2007, D747-750. 35 Database
- Storey JD, Tibshirani R: Statistical methods for identifying differentially expressed genes in DNA microarrays. Methods Mol Biol. 2003, 224: 149-157.PubMedGoogle Scholar
- Hong Y, Ho KS, Eu KW, Cheah PY: A susceptibility gene set for early onset colorectal cancer that integrates diverse signaling pathways: implication for tumorigenesis. Clin Cancer Res. 2007, 13 (4): 1107-1114.View ArticlePubMedGoogle Scholar
- Su LJ, Chang CW, Wu YC, Chen KC, Lin CJ, Liang SC, Lin CH, Whang-Peng J, Hsu SL, Chen CH, et al: Selection of DDX5 as a novel internal control for Q-RT-PCR from microarray data using a block bootstrap re-sampling scheme. BMC Genomics. 2007, 8: 140-PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.