Reciprocal regulation of metabolic and signaling pathways
© Barth et al; licensee BioMed Central Ltd. 2010
Received: 16 November 2009
Accepted: 24 March 2010
Published: 24 March 2010
By studying genome-wide expression patterns in healthy and diseased tissues across a wide range of pathophysiological conditions, DNA microarrays have revealed unique insights into complex diseases. However, the high-dimensionality of microarray data makes interpretation of heterogeneous gene expression studies inherently difficult.
Using a large-scale analysis of more than 40 microarray studies encompassing ~2400 mammalian tissue samples, we identified a common theme across heterogeneous microarray studies, evidenced by a robust genome-wide inverse regulation of metabolic and cell signaling pathways: upregulation of cell signaling pathways was invariably accompanied by downregulation of cell metabolic transcriptional activity (and vice versa). Several findings suggest that this characteristic gene expression pattern represents a new principle of mammalian transcriptional regulation. First, this coordinated transcriptional pattern occurred in a wide variety of physiological and pathophysiological conditions and was identified across all 20 human and animal tissue types examined. Second, the differences in metabolic gene expression predicted the magnitude of differences for signaling and all other pathways, i.e. tissue samples with similar expression levels of metabolic transcripts did not show any differences in gene expression for all other pathways. Third, this transcriptional pattern predicted a profound effect on the proteome, evidenced by differences in structure, stability and post-translational modifications of proteins belonging to signaling and metabolic pathways, respectively.
Our data suggest that in a wide range of physiological and pathophysiological conditions, gene expression changes exhibit a recurring pattern along a transcriptional axis, characterized by an inverse regulation of major metabolic and cell signaling pathways. Given its widespread occurrence and its predicted effects on protein structure, protein stability and post-translational modifications, we propose a new principle for transcriptional regulation in mammalian biology.
Transcriptional profiling by DNA microarrays allows the simultaneous quantitative analysis of tens of thousands of transcripts in a single experiment. By applying transcriptional profiling technology to healthy and diseased tissues across a wide range of pathophysiological conditions, DNA microarrays have revealed unique insights into complex disease patterns. However, the high dimensionality of microarray data makes interpretation of heterogeneous gene expression studies inherently difficult. One of the main challenges in the analysis of microarray data is to identify common underlying biological themes by integrating multiple similar experiments. A frequent approach to this problem is to extract the genes shared between the lists of differentially expressed genes from each experiment and then subject these genes to enrichment analysis by grouping them into pathways.
In a previous study examining failing and non-diseased dog hearts, we observed an intriguing reciprocal transcriptional regulation of selected cell signaling and metabolic processes. To extend this initial observation beyond myocardial tissue and selected pathways, we used a systems biology approach based on KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways in a large collection of ~2400 mammalian tissue samples derived from more than 20 diseased and non-diseased tissues. As a result, we identified a robust genome-wide reciprocal regulation of metabolic and cell signaling pathways, which was present across all 20 different tissues examined.
Cells react to changes in their environment by a coordinated transcriptional response. Using a meta-analysis of more than 40 diverse microarray studies, which included different microarray platforms (long and short oligonucleotide arrays, cDNA and bead microarrays) and different normalization methods (MAS5, RMA, GC-RMA, VSN, LOWESS), we demonstrate a robust interaction between gene expression in signaling and metabolic pathways. While metabolic pathways were positively correlated with each other, they were negatively correlated with signal transduction pathways. Several findings suggest that this characteristic gene expression pattern represents a novel paradigm for mammalian transcriptional regulation. First, this coordinated transcriptional pattern occurred in a wide variety of physiological and pathophysiological conditions and was identified in all 20 different tissue types examined. Importantly, it occurred independently of the proliferative potential of the underlying tissue, as the inverse regulation of metabolism and signal transduction was observed in terminally differentiated organs like brain and heart, but also in more rapidly dividing malignant tumors. Second, and most strikingly, these changes in steady-state mRNA levels predict a profound effect on the proteome, as KEGG cell signaling pathways are characterized by a higher predicted content of intrinsically unstructured proteins (IUPs) than metabolic and biosynthetic pathways. The lack of a rigid 3D structure in IUPs is thought to provide several functional advantages, including conformational flexibility to interact with multiple targets, increased interaction surface area, and accessible post-translational modification sites [4, 5]. These and other properties are ideal for proteins that mediate signaling and transcription and that coordinate regulatory events, where binding to multiple partners and high-specificity/low-affinity interactions play a crucial role.
The critical role of IUPs in signaling is further supported by the finding that eukaryotic proteomes, characterized by their rich interaction networks, are highly enriched in IUPs compared to those of prokaryotes. An increase in IUPs has been associated with perturbed cellular signaling in a wide range of pathological conditions such as cancer, diabetes, and neurodegenerative diseases; thus, intracellular levels of IUPs need to be tightly controlled. Gsponer et al. demonstrated that IUPs as a class had a significantly shorter half-life and lower abundance compared to highly structured proteins in both unicellular and multicellular organisms, suggesting an evolutionarily conserved pattern. Consistent with its role as an ATP-consuming proteolytic system, gene expression of proteasomal degradation pathways was positively correlated with metabolic pathways (Figures 3B and 4B). In addition to D- and KEN-boxes, ubiquitin proteasome-dependent degradation is mediated by the N-end-rule and PEST-mediated degradation pathways. Consistent with the shorter protein half-life of IUPs compared to structured proteins, recent studies have found IUPs to contain a significantly greater fraction of PEST motifs (regions rich in proline, glutamic acid, serine, and threonine), while no differences were noted for the N-end-rule pathway [10, 12]. Importantly, the 20S proteasome can distinguish between intrinsically unstructured and other proteins, as it can digest IUPs under conditions in which native, and even molten globule states, are resistant to degradation. In line with this finding, it has been suggested that the 20S proteasome degradation assay provides a powerful system for the operational definition of IUPs.
While protein degradation is not determined by a single characteristic, but is a multi-factorial process that shows large protein-to-protein variations, it is tempting to speculate that an increased abundance of proteins belonging to metabolic pathways contributes to the down-regulation of signaling pathways via concurrent up-regulation of proteasomal degradation pathways.
In summary, proteins in signaling and metabolic pathways have fundamentally different properties, ranging from inversely regulated transcriptional patterns (Figures 1 and 3) and the abundance and stability of the respective mRNAs to underlying differences in translational rate, protein abundance and stability. Additionally, profound differences in post-translational modifications exist between signaling and metabolic pathways, as evidenced by differences in SUMOylation, mucin-type O-glycosylation, N-glycosylation and serine/threonine phosphorylation sites (Figure 6). Ultimately, this novel transcriptional pattern provides a unifying concept for the interpretation of heterogeneous and multi-dimensional microarray datasets, as the dynamic interaction between cellular signaling and metabolic pathways affects the quantity (Figure 2B) and pattern (Figures 1, 3 and 4) of the observed gene expression changes. Given the widespread occurrence of this transcriptional pattern and the predicted differences in IUPs, protein stability and post-translational modifications, we propose the reciprocal relationship between metabolic and signaling pathways as a new canonical principle for transcriptional regulation in mammalian biology.
In the present study, we noted a striking and robust reciprocal correlation of transcriptional changes between metabolic and signaling pathways. Importantly, correlations do not prove cause and effect. Therefore, we cannot determine whether transcriptional changes in metabolic activity precede changes in signaling pathways or vice versa. While this study was centered on pathway analysis, future studies will need to identify individual genes or hub nodes that connect metabolic and signaling pathways. In addition, the role of up- and down-stream regulatory events, e.g. transcription factors, miRNAs, splicing, 3' end termination and/or stability of mRNAs, needs to be examined.
Future studies will need to address the role of this transcriptional pattern in various disease processes. While the association of IUPs with various disease processes might suggest that down-regulation of metabolism and up-regulation of signaling pathways is a common theme in a wide range of disease processes, we found that this generalization does not hold universally. This could be related to different baseline levels of OXPHOS activity in various tissues and cancer specimens and/or differences in tissue handling. Clearly, future studies need to address whether this transcriptional pattern will help in refining the distinction between diseased and non-diseased tissue samples.
Gene Expression Data
Public datasets were obtained from the Gene Expression Omnibus (GEO) database. A detailed summary of all datasets used in the present meta-analysis is given in Additional File 2. The criteria for the selection of datasets were as follows: (1) whole-genome coverage of microarray platforms (covering ≥ 20,000 transcripts; the only exception was the comparison between human adult and fetal hearts, for which whole-genome microarray datasets were not publicly available), (2) quality of the normalization procedure: comparable levels of mean signal intensity and variance of signal intensity across experimental groups, (3) non-myocardial tissue datasets had to include at least 50 samples and (4) human myocardial datasets had to have more than ten non-failing samples.
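The four selection criteria can be read as a simple filter over candidate datasets. The sketch below is only an illustration of that reading, not the procedure actually used: the `Dataset` record and all of its field names are hypothetical, criterion (2) is reduced to a flag that a curator would set after inspecting signal intensities, and because the text states no sample-size rule for non-human myocardial datasets, the sketch accepts them once criteria (1) and (2) are met.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical record; field names only mirror the criteria in the text.
    transcripts_covered: int
    normalization_ok: bool            # criterion (2), judged by inspection
    is_myocardial: bool
    is_human: bool
    n_samples: int
    n_non_failing: int = 0
    coverage_exception: bool = False  # e.g. adult vs. fetal heart comparison


def meets_selection_criteria(ds: Dataset) -> bool:
    # (1) Whole-genome coverage, unless the documented exception applies.
    if ds.transcripts_covered < 20_000 and not ds.coverage_exception:
        return False
    # (2) Comparable mean and variance of signal intensity across groups.
    if not ds.normalization_ok:
        return False
    # (3) Non-myocardial datasets need at least 50 samples.
    if not ds.is_myocardial:
        return ds.n_samples >= 50
    # (4) Human myocardial datasets need more than ten non-failing samples.
    if ds.is_human:
        return ds.n_non_failing > 10
    # Assumption: no further rule is stated for non-human myocardial data.
    return True
```

For example, a hypothetical non-myocardial dataset with 22,000 transcripts and 48 samples would be rejected under criterion (3).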
To determine differentially expressed genes, unpaired two-class Significance Analysis of Microarrays (SAM) was used. Differences in gene expression were regarded as statistically significant if a false discovery rate (FDR) of q<0.05 was achieved. Functional annotation of differentially expressed genes was based on the KEGG pathways database. Overrepresentation of specific KEGG pathways in a gene set was statistically analyzed by the Database for Annotation, Visualization and Integrated Discovery (DAVID). The net regulation of a pathway was defined as the number of up- minus down-regulated transcripts of a KEGG pathway, expressed as a percentage of the total number of regulated genes within a study. Clustering of the expression of KEGG pathways and phosphorylation sites was done using Genesis.
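The net regulation measure defined above can be made concrete with a short calculation. The following is a minimal sketch under stated assumptions: the input format (tuples of gene identifier, direction and q-value) and the function name `net_regulation` are hypothetical, and "total number of regulated genes within a study" is read here as the number of transcripts passing the q<0.05 threshold.

```python
FDR_CUTOFF = 0.05  # q-value threshold stated in the text


def net_regulation(transcripts, pathway_members, fdr_cutoff=FDR_CUTOFF):
    """Return {pathway: net regulation in percent} for one study.

    transcripts: iterable of (gene_id, direction, q_value), where direction
        is +1 (up-regulated) or -1 (down-regulated). Hypothetical format.
    pathway_members: dict mapping a pathway name to a set of gene ids.
    """
    # Keep only transcripts that pass the FDR threshold.
    significant = [(g, d) for g, d, q in transcripts if q < fdr_cutoff]
    total_regulated = len(significant)
    if total_regulated == 0:
        return {name: 0.0 for name in pathway_members}

    net = {}
    for name, members in pathway_members.items():
        up = sum(1 for g, d in significant if g in members and d > 0)
        down = sum(1 for g, d in significant if g in members and d < 0)
        # Up minus down, as a percentage of all regulated genes in the study.
        net[name] = 100.0 * (up - down) / total_regulated
    return net


# Toy example with made-up gene ids and pathway assignments.
toy_transcripts = [
    ("A", +1, 0.01), ("B", +1, 0.02), ("C", -1, 0.03),
    ("D", -1, 0.001), ("E", -1, 0.04), ("F", +1, 0.20),
]
toy_pathways = {"signaling": {"A", "B", "C"}, "metabolism": {"D", "E", "F"}}
toy_result = net_regulation(toy_transcripts, toy_pathways)
```

In this toy study five transcripts pass the threshold, so the signaling pathway has a net regulation of +20% (two up, one down) and the metabolic pathway of -40% (two down, none up), which mimics the inverse pattern described in the text.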
Batch prediction of long disordered regions was carried out using the IUPforest-L software, based on the Moreau-Broto autocorrelation function of amino acid indices (AAIs) and other physicochemical features of the primary sequences. Non-parametric rank tests (Kolmogorov-Smirnov and Wilcoxon) incorporated into StatView (SAS Institute Inc., NC, USA) were used to determine statistical significance for the distribution of IUPs across metabolic and signaling pathways. Batch prediction of N-glycosylation, mucin-type O-glycosylation, SUMOylation and protein kinase phosphorylation sites was carried out using NetNGlyc 1.0 (http://www.cbs.dtu.dk/services/NetNGlyc), NetOGlyc 3.1, SUMOsp 2.0, and GPS 2.1, respectively.
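To illustrate what a Kolmogorov-Smirnov comparison of two distributions measures, the sketch below computes the two-sample statistic D, the largest gap between the empirical cumulative distributions of two samples. It is a stand-in written for illustration, not the StatView routine used in the study; the function name `ks_statistic` and the example scores are hypothetical, and no p-value is computed.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate their difference at every observed point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up disorder scores for proteins in two pathway classes.
signaling_scores = [0.6, 0.7, 0.8, 0.9]
metabolic_scores = [0.1, 0.2, 0.3, 0.4]
d_separated = ks_statistic(signaling_scores, metabolic_scores)
```

With these fully separated toy samples D equals 1.0, its maximum; two identical samples give D = 0.0. A significance test would additionally require the sample sizes and the distribution of D under the null hypothesis.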
IUPs: Intrinsically Unstructured Proteins
KEGG: Kyoto Encyclopedia of Genes and Genomes
SAM: Significance Analysis of Microarrays
DAVID: Database for Annotation, Visualization and Integrated Discovery
GEO: Gene Expression Omnibus
The work was supported by NIH P01 HL077180, HL072488, R33 HL087345 and RC1HL099892 to G.F.T., R01 AG17022 to K.B.M., R01 HL088577 and R21 HL092379 to T.P.C., and NIH T32 HL007227 to A.S.B. G.F.T. is the Michel Mirowski M.D. Professor of Cardiology.
- Barth AS, Aiba T, Halperin V, DiSilvestre D, Chakir K, Colantuoni C, Tunin RS, Dimaano VL, Yu W, Abraham TP: Cardiac resynchronization therapy corrects dyssynchrony-induced regional gene expression changes on a genomic level. Circ Cardiovasc Genet. 2009, 2 (4): 371-378. 10.1161/CIRCGENETICS.108.832345.
- Kanehisa M, Goto S: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000, 28 (1): 27-30. 10.1093/nar/28.1.27.
- Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 2001, 98 (9): 5116-5121. 10.1073/pnas.091062498.
- Kriwacki RW, Hengst L, Tennant L, Reed SI, Wright PE: Structural studies of p21Waf1/Cip1/Sdi1 in the free and Cdk2-bound state: conformational disorder mediates binding diversity. Proc Natl Acad Sci USA. 1996, 93 (21): 11504-11509. 10.1073/pnas.93.21.11504.
- Uversky VN, Oldfield CJ, Dunker AK: Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys. 2008, 37: 215-246. 10.1146/annurev.biophys.37.032807.125924.
- Chu I, Sun J, Arnaout A, Kahn H, Hanna W, Narod S, Sun P, Tan CK, Hengst L, Slingerland J: p27 phosphorylation by Src regulates inhibition of cyclin E-Cdk2. Cell. 2007, 128 (2): 281-294. 10.1016/j.cell.2006.11.049.
- Grimmler M, Wang Y, Mund T, Cilensek Z, Keidel EM, Waddell MB, Jakel H, Kullmann M, Kriwacki RW, Hengst L: Cdk-inhibitory activity and stability of p27Kip1 are directly regulated by oncogenic tyrosine kinases. Cell. 2007, 128 (2): 269-280. 10.1016/j.cell.2006.11.047.
- Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, Dunker AK: The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004, 32 (3): 1037-1049. 10.1093/nar/gkh253.
- Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT: Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004, 337 (3): 635-645. 10.1016/j.jmb.2004.02.002.
- Gsponer J, Futschik ME, Teichmann SA, Babu MM: Tight regulation of unstructured proteins: from transcript synthesis to protein degradation. Science. 2008, 322 (5906): 1365-1368. 10.1126/science.1163581.
- Hershko A, Ciechanover A, Rose IA: Resolution of the ATP-dependent proteolytic system from reticulocytes: a component that interacts with ATP. Proc Natl Acad Sci USA. 1979, 76 (7): 3107-3110. 10.1073/pnas.76.7.3107.
- Singh GP, Ganapathi M, Sandhu KS, Dash D: Intrinsic unstructuredness and abundance of PEST motifs in eukaryotic proteomes. Proteins. 2006, 62 (2): 309-315. 10.1002/prot.20746.
- Tsvetkov P, Asher G, Paz A, Reuven N, Sussman JL, Silman I, Shaul Y: Operational definition of intrinsically unstructured protein sequences based on susceptibility to the 20S proteasome. Proteins. 2008, 70 (4): 1357-1366. 10.1002/prot.21614.
- Tompa P, Prilusky J, Silman I, Sussman JL: Structural disorder serves as a weak signal for intracellular protein degradation. Proteins. 2008, 71 (2): 903-909. 10.1002/prot.21773.
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30 (1): 207-210. 10.1093/nar/30.1.207.
- Dennis G, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA: DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003, 4 (5): P3. 10.1186/gb-2003-4-5-p3.
- Sturn A, Quackenbush J, Trajanoski Z: Genesis: cluster analysis of microarray data. Bioinformatics. 2002, 18 (1): 207-208. 10.1093/bioinformatics/18.1.207.
- Han P, Zhang X, Norton RS, Feng ZP: Large-scale prediction of long disordered regions in proteins using random forests. BMC Bioinformatics. 2009, 10: 8. 10.1186/1471-2105-10-8.
- Julenius K, Molgaard A, Gupta R, Brunak S: Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology. 2005, 15 (2): 153-164. 10.1093/glycob/cwh151.
- Ren J, Gao X, Jin C, Zhu M, Wang X, Shaw A, Wen L, Yao X, Xue Y: Systematic study of protein sumoylation: Development of a site-specific predictor of SUMOsp 2.0. Proteomics. 2009, 9 (12): 3409-3412. 10.1002/pmic.200800646.
- Xue Y, Ren J, Gao X, Jin C, Wen L, Yao X: GPS 2.0, a tool to predict kinase-specific phosphorylation sites in hierarchy. Mol Cell Proteomics. 2008, 7 (9): 1598-1608. 10.1074/mcp.M700574-MCP200.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.