NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases
© Di Lena et al.; licensee BioMed Central Ltd. 2015
Published: 18 June 2015
Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions at the basis of phenotypes, for enlarging the dataset of possibly related genes/proteins and for helping interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based ones also include, in different ways, physical and functional relationships among different genes or proteins that can be extracted from the available biological networks of interactions.
Here we describe a novel procedure based on the extraction from the STRING interactome of sub-networks connecting proteins that share the same Gene Ontology (GO) terms for Biological Process (BP). Enrichment analysis is performed by mapping the protein set to be analyzed on the sub-networks, and then by collecting the corresponding annotations. We test the ability of our enrichment method to find annotation terms disregarded by other available enrichment methods. We benchmarked 244 sets of proteins associated with different Mendelian diseases, according to the OMIM web resource. In 143 cases (58%), the network-based procedure extracts GO terms neglected by the standard method, and in 86 cases (35%), some of the newly enriched GO terms are not included in the set of annotations characterizing the input proteins. We present in detail six cases where our network-based enrichment provides an insight into the biological basis of the diseases, outperforming other freely available network-based methods.
Considering a set of proteins in the context of their interaction network can help in better defining their functions. Our novel method exploits the information contained in the STRING database for building the minimal connecting network containing all the proteins annotated with the same GO term. The enrichment procedure is performed considering the GO-specific network modules and, when tested on the OMIM-derived benchmark sets, it is able to extract enrichment terms neglected by other methods. Our procedure is effective even when the size of the input protein set is small, requiring at least two input proteins.
Next Generation Sequencing (NGS) technologies enable the discovery of large sets of genetic variations characterizing individual variability. One common problem is to identify variations potentially related to different phenotypes, including susceptibility to diseases. A widely adopted procedure relies on the extraction of functional information from sets of genes or proteins already associated with the phenotype under investigation: this procedure allows extending the set of genes or proteins potentially associated with the phenotype and can therefore be useful for prioritizing large sets of experimental variations detected with NGS experiments. Functional association is routinely performed by means of statistical enrichment analysis over a gene/protein set of interest, and the different approaches have been comprehensively reviewed. Standard enrichment methods treat each gene/protein as an isolated object and completely neglect the different types of relations among molecules. However, the analysis of genes and proteins in the context of their physical interaction networks, gene regulatory networks, and metabolic and signaling pathways can help in extracting new biological information (the applications of interaction networks to the study of human diseases have also been comprehensively reviewed).
Several approaches exploiting the interaction networks for functional association analysis (network-based enrichment analysis) have emerged in the last few years. These network-based methods can be broadly classified into two main classes: A) methods that use the topology of the interaction network to infer how similar distinct sets of genes/proteins are (among them, EnrichNet, PWEA, THINKBack, NetPEA, PathNet, NetGSA, SANTA, SPIA, JEPETTO, PathwayExpress, DEGraph); B) methods that identify functionally related modules in interaction networks and then infer protein/gene biological roles from such modules (among them, FunMod, PINA, MetaCORE). In both classes, graph-theoretic measures and graph properties (such as shortest paths, degree, etc.) are commonly used to extract information from the interaction network. Most methods deal with pathway enrichment analysis, some of them with both pathway and Gene Ontology (GO) terms. Among the publicly available tools that perform GO enrichment analysis, EnrichNet and PINA are two of the most cited methods, representative of the A and B classes above, respectively.
PINA (Protein Interaction Network Analysis) is a web resource based on the integration of six protein-protein interaction databases (IntAct, MINT, BioGRID, DIP, HPRD and MIPS MPact). The core of PINA consists of a computational pre-analysis of the molecular interaction network aiming at identifying clusters of densely interconnected nodes, which are likely to represent sets of functionally related proteins. Each cluster is annotated, through a standard enrichment analysis, with terms derived from different biological databases (KEGG, PFAM, GO). The genes/proteins of an input dataset are mapped on the pre-computed clusters, and the overrepresented clusters are identified by means of a hypergeometric enrichment test. The input dataset is then characterized by the significantly enriched annotations of the overrepresented clusters. EnrichNet is a web platform for enrichment analysis based on a network integrating different information: molecular interactions (STRING), cellular pathways (KEGG, BioCarta, WikiPathways, REACTOME, PID), biological annotations (GO, InterPro) and tissue-specific gene expression data. EnrichNet introduces i) a network-based distance between sets of proteins, computed by means of a random walk with restart procedure; ii) a statistical framework for assessing the significance of the distance between two protein sets. These measures allow comparing an input protein set with all the sets of proteins that share the same annotation term on the network. Given an input set, its network-based distances are computed and the annotations corresponding to significantly close sets are retained.
Here we introduce a method for enrichment analysis that implements a novel computational strategy designed to mine and extract information from publicly available interactomics datasets. Our method falls within class B and, similarly to PINA, it is based on a preprocessing phase aimed at identifying interconnected and compact modules in a molecular interaction network. However, unlike all the other approaches in class B, the modules found by our method are function-specific by construction, since they are built starting from seed sets collecting all the proteins related to a specific biological annotation. We make use of graph-theoretic and information-theoretic measures to extend the seed sets into connected subgraphs of a molecular interaction network. Each subgraph represents a compact and function-specific module in the interaction network. Our enrichment pipeline consists of two independent analyses: a standard enrichment and a network-based enrichment. The network-based analysis is performed by mapping an input set of proteins into the pre-computed network modules and by collecting the corresponding annotations for an enrichment test. The network-based enrichment allows the detection of statistical associations not directly inferable from the annotations of the starting protein set, and thus not detectable through the standard enrichment. Here, we test the ability of our network-based approach to detect novel biological associations for sets of proteins related to 244 different Mendelian diseases that are associated with two or more proteins, according to the Online Catalog of Human Genes and Genetic Disorders of Mendelian Inheritance in Man (OMIM).
Interaction network and protein annotations
The human protein interaction network was downloaded from STRING (release 9.1). We retained all the links with documented action (file protein.actions.v9.1.txt.gz on the STRING website), irrespective of the STRING score and of the supporting evidence. The actions associated with the links are activation, binding, catalysis, expression, post-translational modification, and reaction. The resulting network, consisting of 16,958 nodes and 457,546 links, summarizes a large variety of interaction types and integrates different large datasets.
All the nodes in STRING were unambiguously mapped onto UniProtKB, using the UniProt id mapping data file. Human proteins in UniProtKB were annotated with Gene Ontology (GO) terms for Biological Process (BP) [26, 35], as retrieved from the UniProt-GOA web resource. Out of 138,517 human proteins included in UniProtKB, 37,743 are annotated with 12,785 different GO BP terms. A total of 14,056 annotated proteins are mapped on the STRING interactome and a total of 12,621 out of 12,785 GO BP terms are represented in the STRING network. For 8,098 terms, it is possible to extract specific modules from the STRING network, containing a total of 33,315 proteins (see "Module extraction" section for details).
General workflow of the enrichment pipeline
Given a set of input proteins, our pipeline implements the novel network-based enrichment and a standard one.
The standard enrichment is performed with a Bonferroni-corrected Fisher's exact test to highlight the overrepresented BP terms associated to the input proteins, as annotated in UniProtKB. All the human proteins in UniProtKB with at least one BP annotation are used as background for the Fisher's test (37,743 protein identifiers and 12,785 related BP terms).
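As an illustration only, the standard enrichment step can be sketched in Python with the standard library. The function names, the one-sided form of the test and the choice of Bonferroni factor (the number of terms annotating at least one input protein) are assumptions made for this sketch and are not taken from the NET-GE implementation.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher's exact test: P(X >= k) when drawing n proteins
    from a background of N, of which K carry the term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def standard_enrichment(input_proteins, term_to_proteins, background, alpha=0.05):
    """Return (term, corrected p-value) pairs with corrected p <= alpha, best first.

    Assumption: the Bonferroni factor is the number of terms tested, i.e. the
    terms annotating at least one input protein.
    """
    query = set(input_proteins) & background
    tested = {}
    for term, proteins in term_to_proteins.items():
        members = proteins & background
        if members & query:
            tested[term] = members
    enriched = []
    for term, members in tested.items():
        p = fisher_right_tail(len(members & query), len(query),
                              len(members), len(background))
        p_corrected = min(1.0, p * len(tested))
        if p_corrected <= alpha:
            enriched.append((term, p_corrected))
    return sorted(enriched, key=lambda pair: pair[1])
```

Under the same assumptions, the network-based test would reuse `standard_enrichment` with the module membership of each GO BP term in place of the direct annotations and with the proteins in the network modules as background.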
The network-based enrichment relies on a preprocessing phase aimed at extracting modules starting from seed sets of proteins sharing the same GO BP annotation. By construction, a module is a compact and connected subgraph of the molecular interaction network. Given a GO BP term (our reference GO term), the corresponding module contains all the proteins directly annotated with the same term in UniProtKB (seed nodes) and some of their interacting partners (connecting nodes). The module is determined by computing all the shortest paths among the seeds and by reducing the resulting network into the minimal connecting network preserving the distances among seeds. The minimal connecting network adds to the seeds a set of connecting nodes that are more reliably related to the reference GO term. The details of module extraction are given below and the algorithmic description is available in the Additional file 1. The enrichment procedure determines whether there are significant overlaps between the input proteins and the network modules built for each GO BP term. As in the standard enrichment, the Bonferroni-corrected Fisher's exact test is adopted. The whole set of human proteins in the network modules is used as background for the Fisher's test (33,315 protein identifiers and 8,098 related GO BP terms).
The output of the pipeline consists of a non-redundant ranking of GO BP terms overrepresented in the input set, ranked according to their Bonferroni-corrected p-values. It is important to notice that with a standard enrichment only GO terms already associated with the input proteins can result as overrepresented. In contrast, the network-based enrichment allows the detection of statistical associations with GO terms not included in the annotations of the input protein set. Such terms represent the added-value information of the network-based enrichment analysis.
Module extraction
Extraction of the shortest path network
We extract the sub-network of STRING consisting of all the shortest paths between the proteins in the seed set. Seed proteins not appearing in STRING are kept as isolated nodes in the shortest path network. For the shortest paths computation, we do not make use of the edge-scores provided in STRING, i.e. we treat STRING as an undirected and unweighted graph, without self-loops. The size of the shortest path networks extracted from STRING is usually large, even for relatively small input protein sets. On average, the shortest path networks extracted for the different GO BP terms contain 15 times more proteins than their seed sets.
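The extraction described above can be illustrated with a small breadth-first-search sketch. It is not the authors' implementation: the helper names are hypothetical, and the graph is assumed to be given as a list of undirected edges. An edge is kept when it lies on at least one shortest path between two seeds.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Undirected, unweighted adjacency map; self-loops are dropped."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_path_network(adj, seeds):
    """Nodes and edges lying on at least one shortest path between two seeds."""
    nodes = set(seeds)  # seeds missing from the graph remain as isolated nodes
    edges = set()
    dist = {s: bfs_distances(adj, s) for s in seeds if s in adj}
    for s, t in combinations(sorted(dist), 2):
        if t not in dist[s]:
            continue  # the two seeds are not connected
        d_st = dist[s][t]
        for u, d_su in dist[s].items():
            for v in adj[u]:
                # edge (u, v) is on a shortest s-t path if it closes the distance exactly
                if d_su + 1 + dist[t][v] == d_st:
                    nodes.update((u, v))
                    edges.add(frozenset((u, v)))
    return nodes, edges
```

One breadth-first search per seed is enough in this sketch, because the distance test `d(s,u) + 1 + d(v,t) == d(s,t)` identifies shortest-path edges without enumerating the paths themselves.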
Minimal connecting network
The shortest path network is reduced to the minimal connecting network through the following steps:
1. The nodes in the shortest path network are split into two disjoint groups: seed nodes (i.e. the nodes related to the seed proteins) and connecting nodes (i.e. the remaining nodes in the shortest path network).
2. The connecting nodes are ranked according to three predefined relevance criteria, detailed in the "Ranking scores" section.
3. The ranked list is processed iteratively, starting from the least important node.
4. The node under evaluation is removed from the shortest path network only if its deletion does not increase the shortest distance between any pair of seed nodes.
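The pruning steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the shortest path network is passed as an adjacency map, and the ranked list of connecting nodes (least important first) is assumed to be computed elsewhere.

```python
from collections import deque


def seed_distances(adj, seeds, allowed):
    """Hop distances between all ordered seed pairs, walking only through `allowed` nodes."""
    result = {}
    for s in seeds:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v in allowed and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t in seeds:
            if t != s:
                result[(s, t)] = dist.get(t)  # None when s and t are disconnected
    return result


def minimal_connecting_network(adj, seeds, ranked_connecting):
    """Prune connecting nodes, least important first, while seed distances are preserved."""
    kept = set(adj) | set(seeds)
    baseline = seed_distances(adj, seeds, kept)
    for node in ranked_connecting:
        trial = kept - {node}
        # removing nodes can only lengthen paths, so equality means "no increase"
        if seed_distances(adj, seeds, trial) == baseline:
            kept = trial
    return kept
```

In this sketch the result depends on the ranking: when two connecting nodes offer equivalent paths between the same seeds, the one ranked as less important is removed first and the other is then retained.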
Ranking scores
Seed centrality (sc). We say that a node connects two seed nodes if it appears in some shortest path connecting them. Thus, the seed centrality measure simply counts the number of distinct seed pairs connected by a node. This property implicitly assumes that the higher the number of seed pairs a node connects, the higher the probability that such a node appears in a minimal connecting network.
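A possible way to compute the seed centrality, given here only as a sketch with hypothetical function names, is to run one breadth-first search per seed and count, for each connecting node, the seed pairs whose distance it splits exactly.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def seed_centrality(adj, seeds):
    """Number of distinct seed pairs that each connecting node connects."""
    seed_set = set(seeds) & set(adj)
    dist = {s: bfs_distances(adj, s) for s in seed_set}
    counts = {}
    for s, t in combinations(sorted(seed_set), 2):
        if t not in dist[s]:
            continue  # disconnected pair: no node connects it
        d_st = dist[s][t]
        for u, d_su in dist[s].items():
            # u lies on a shortest s-t path when the two legs add up to d(s, t)
            if u not in seed_set and d_su + dist[t][u] == d_st:
                counts[u] = counts.get(u, 0) + 1
    return counts
```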
Maximum semantic similarity with the reference GO term (ss). The semantic similarity measures the extent to which the annotation terms of each connecting node are related to the reference GO term: a connecting node with a high semantic similarity score is more likely to be functionally related to the seed nodes. The semantic similarity is defined as Lin's information-theoretic metric. In detail, we define the maximum semantic similarity of a connecting node with respect to the reference GO term as the highest Lin's score between the GO terms associated with the connecting node/protein and the reference GO term. The background for the information content measure used in Lin's metric is given by the entire set of UniProt-GOA annotations for human proteins. The maximum semantic similarity property explicitly gives more importance to connecting proteins whose annotations are more closely related to the reference GO term (see Additional file 1 for further details).
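For reference, Lin's score between two terms a and b is commonly written as 2·IC(c) / (IC(a) + IC(b)), where IC(t) = -log p(t) and c is the most informative common ancestor of a and b. The sketch below illustrates this on a toy ontology; the data structures (`parents`, `direct_annotations`) and function names are assumptions for illustration, and every term passed to `lin_similarity` is assumed to have at least one annotated protein.

```python
from math import log


def ancestors(term, parents):
    """The term itself plus all of its ancestors in the ontology graph."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(direct_annotations, parents):
    """IC(t) = -log(fraction of proteins annotated with t or one of its descendants)."""
    proteins_per_term = {}
    for protein, terms in direct_annotations.items():
        for term in terms:
            for anc in ancestors(term, parents):
                proteins_per_term.setdefault(anc, set()).add(protein)
    total = len(direct_annotations)
    return {t: -log(len(ps) / total) for t, ps in proteins_per_term.items()}


def lin_similarity(a, b, ic, parents):
    """Lin's score based on the most informative common ancestor of a and b."""
    common = ancestors(a, parents) & ancestors(b, parents)
    denominator = ic[a] + ic[b]
    if not common or denominator == 0:
        return 0.0
    return 2 * max(ic[c] for c in common) / denominator


def max_semantic_similarity(node_terms, reference, ic, parents):
    """Highest Lin's score between a node's GO terms and the reference GO term."""
    return max((lin_similarity(t, reference, ic, parents) for t in node_terms),
               default=0.0)
```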
Betweenness centrality (bc). The betweenness centrality (with respect to the nodes in the seed set) is a measure of centrality of a node in a network. This property is mainly used to assess a local ranking for those connecting nodes that have exactly the same ranking with respect to the previous two properties. In large shortest path networks, this happens quite often, due to the limited range of values of the previous two properties.
A quality filtering procedure is applied to the minimal connecting networks built in the previous step. The idea is to filter out those networks for which the GO annotations of the connecting nodes are weakly related to the reference GO term. In particular, rare BP terms (i.e. BP terms with few related proteins) tend to produce minimal networks consisting uniquely of long paths. In most of such cases, the annotations of the connecting proteins are unrelated to the reference GO term, so the resulting minimal network is unlikely to include many proteins related to the reference GO term. Such network modules are discarded and not considered for the enrichment. The quality filtering procedure makes use of the maximum semantic similarity measure, as defined above. In particular, a minimal network is retained if, with respect to the reference GO term, the average maximum similarity of the connecting nodes is significantly higher than the average maximum similarity of all the nodes in STRING, as assessed by a Student's t-test with significance set to 5%. The quality test discards 1,205 networks out of 12,621 (with sizes ranging from 3 to 137 nodes, with an average of 13).
We also filter out minimal networks that do not contain any connecting node. The number of GO BP terms for which we extract a non-trivial network is then 8,098.
In order to benchmark the method, we extracted from the OMIM web resource a list of genetic diseases that have been associated with two or more genes. We filtered out all the diseases associated with genes ambiguously mapped on UniProtKB. For performance assessment, we retained only the diseases associated with at least two proteins present in the function-specific network modules. We ended up with a set of 244 genetic diseases. The number of proteins associated with each selected disease ranges from 2 to 29, with an average of 4.
The annotation pipeline retrieves enriched GO BP terms computed with a standard and a network-based procedure. Both are performed with Bonferroni-corrected Fisher tests, considering a significance level of 5%. On the OMIM-derived benchmark set, we assessed the level of annotation added by the network-based method from both a quantitative and a qualitative point of view. The quantitative analysis highlights the ability of the network-based method to recover new enriched functions. The qualitative analysis focuses on six cases for which the newly enriched terms add new biological insights, as confirmed by previously published experimental data.
Quantitative analysis on OMIM diseases
Functional annotation of 244 OMIM diseases with our pipeline.
No significant GO BP terms extracted by SE and NET-GE
Same significant terms extracted by SE and NET-GE
NET-GE enriches more terms already included in the annotation of the input proteins
NET-GE adds new terms not included in the annotation of the input proteins
Qualitative analysis on OMIM diseases
The newly enriched terms that are absent from the original annotations of the input genes are likely to provide new knowledge on the disease at hand. We focus the qualitative analysis on them and we detail here six case studies for which experimental validations are available for the annotations derived with our method. For all the reported cases, PINA does not return any significant association. EnrichNet enriches only terms that are already included in the annotations of the input proteins. However, EnrichNet is best suited to analyzing sets including at least 10 proteins, while four of our six case studies consist of input sets comprising two to four proteins.
OMIM #133100 ERYTHROCYTOSIS, FAMILIAL, 1
GO BP terms enriched with NET-GE for OMIM disease #133100 (FAMILIAL ERYTHROCYTOSIS 1).
Biological Process GO Term
Bonferroni corrected p-value
response to erythropoietin
cellular response to erythropoietin
negative regulation of myeloid cell apoptotic process
OMIM #143465 ATTENTION DEFICIT-HYPERACTIVITY DISORDER; ADHD
GO BP terms enriched with NET-GE for OMIM disease #143465 (ATTENTION DEFICIT-HYPERACTIVITY DISORDER; ADHD).
Biological Process GO Term
Bonferroni corrected p-value
regulation of gamma-aminobutyric acid secretion
response to histamine
positive regulation of amine transport
behavioral fear response
regulation of synaptic transmission, GABAergic
negative regulation of synaptic transmission
regulation of postsynaptic membrane potential
inorganic anion transmembrane transport
OMIM #188890 TOBACCO ADDICTION, SUSCEPTIBILITY TO
GO BP terms enriched with NET-GE for OMIM disease #188890 (SUSCEPTIBILITY TO TOBACCO ADDICTION).
Biological Process GO Term
Bonferroni corrected p-value
intraspecies interaction between organisms
response to cocaine
OMIM #188050 THROMBOPHILIA DUE TO THROMBIN DEFECT; THPH1
GO BP terms enriched with NET-GE for OMIM disease #188050 (THROMBOPHILIA DUE TO THROMBIN DEFECT; THPH1).
Biological Process GO Term
Bonferroni corrected p-value
OMIM #608446 SUSCEPTIBILITY TO MYOCARDIAL INFARCTION
GO BP terms enriched with NET-GE for OMIM disease #608446 (SUSCEPTIBILITY TO MYOCARDIAL INFARCTION).
Biological Process GO Term
Bonferroni corrected p-value
regulation of angiogenesis
regulation of vasculature development
OMIM #601665 OBESITY
GO BP terms enriched with NET-GE for OMIM disease #601665 (OBESITY).
Biological Process GO Term
Bonferroni corrected p-value
sodium ion homeostasis
monovalent inorganic cation homeostasis
CD4-positive, alpha-beta T cell differentiation
CD4-positive, alpha-beta T cell activation
negative regulation of bile acid biosynthetic process
regulation of adrenergic receptor signaling pathway
regulation of serotonin secretion
negative regulation of cAMP-mediated signalling
regulation of lipid catabolic process
We describe a novel computational method, NET-GE, for enrichment analysis, which exploits the information contained in molecular interaction networks. Given a set of input proteins, our method can detect functional associations not directly inferable from the annotations of the starting protein set, and thus not detectable through a standard enrichment. The method has been benchmarked on a set of 244 different Mendelian diseases associated with two or more proteins, as reported in the OMIM database. The lists of enriched terms for the benchmark examples are available in Additional file 3. NET-GE is able to enrich terms neglected by the standard method and, in a considerable number of cases, the terms are not even included in the annotation of the input set. For some diseases, it is possible to show that the new enrichment terms are coherent with the experimental information available for the diseases. Therefore, we propose our novel network-based enrichment as a procedure that helps in formulating new hypotheses on the biological processes underlying a particular phenotype for which a pool of associated proteins is known. Enriched GO terms can suggest pools of new proteins potentially associated with the phenotype at hand and can therefore help the prioritization of new variants to be discovered with sequencing techniques. One of the advantages of our method, with respect to other similar ones, is its ability to extract new information even from very small sets of input proteins. In the current version, the network-based method makes use of the STRING network of physical interactions and analyzes only the GO BP annotations. However, the method is quite general and does not rely on this specific interaction network or on these biological annotations. For future development, we plan to extend it to different networks and different biological annotations.
We acknowledge the following grants involved in publication of this work: PRIN 2010-2011 project 20108XYHJS (to P.L.M.) (Italian MIUR); COST BMBS Action TD1101 and BM1405 (European Union RTD Framework Program to R.C.); PON projects PON01_02249 and PAN Lab PONa3_00166 (Italian MIUR to R.C. and P.L.M.); FARB-UNIBO 2012 (to R.C.).
This article has been published as part of BMC Genomics Volume 16 Supplement 8, 2015: VarI-SIG 2014: Identification and annotation of genetic variants in the context of structure, function and disease. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S8.
1. Huang DW, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucl Acids Res. 2009, 37: 1-13. doi:10.1093/nar/gkn923.
2. Gonzalez MW, Kann MG: Chapter 4: Protein interactions and disease. PLoS Comput Biol. 2012, 8: 1002819. doi:10.1371/journal.pcbi.1002819.
3. Laukens K, Naulaerts S, Berghe WV: Bioinformatics approaches for the functional interpretation of protein lists: from ontology term enrichment to network analysis. Proteomics. 2015, 15: 981-996. doi:10.1002/pmic.201400296.
4. Glaab E, et al: EnrichNet: network-based gene set enrichment analysis. Bioinformatics. 2012, 28 (18): 451-457. doi:10.1093/bioinformatics/bts389.
5. Hung JH, Whitfield TW, Yang TH, Hu Z, Weng Z, DeLisi C: Identification of functional modules that correlate with phenotypic difference: the influence of network topology. Genome Biol. 2010, 11: R23. doi:10.1186/gb-2010-11-2-r23.
6. Farfán F, et al: THINK Back: KNowledge-based Interpretation of High Throughput data. BMC Bioinformatics. 2012, 13 (Suppl 2): S4. doi:10.1186/1471-2105-13-S2-S4.
7. Liu L, Ruan J: Network-based Pathway Enrichment Analysis. IEEE International Conference on Bioinformatics and Biomedicine. 2013, 218-221. doi:10.1109/BIBM.2013.6732493.
8. Dutta, et al: PathNet: a tool for pathway analysis using topological information. Source Code for Biology and Medicine. 2012, 7: 10. doi:10.1186/1751-0473-7-10.
9. Shojaie A, Michailidis G: Analysis of Gene Sets Based on the Underlying Regulatory Network. J Comp Biol. 2009, 16: 407-426. doi:10.1089/cmb.2008.0081.
10. Cornish AJ, Markowetz F: SANTA: Quantifying the Functional Content of Molecular Networks. PLoS Comput Biol. 2014, 10: e1003808. doi:10.1371/journal.pcbi.1003808.
11. Tarca AL, et al: A novel signaling pathway impact analysis. Bioinformatics. 2009, 25: 75-82. doi:10.1093/bioinformatics/btn577.
12. Winterhalter C, Widera P, Krasnogor N: JEPETTO: a Cytoscape plugin for gene set enrichment and topological analysis based on interaction networks. Bioinformatics. 2014, 30: 1029-1030. doi:10.1093/bioinformatics/btt732.
13. Draghici S, et al: A systems biology approach for pathway level analysis. Genome Res. 2007, 17: 1537-1545. doi:10.1101/gr.6202607.
14. Jacob L, Neuvial P, Dudoit S: More power via graph-structured tests for differential expression of gene networks. Ann Appl Stat. 2012, 6: 561-600. doi:10.1214/11-AOAS528.
15. Natale M, Benso A, Di Carlo S, Ficarra E: FunMod: A Cytoscape Plugin for Identifying Functional Modules in Undirected Protein-Protein Networks. Genomics, Proteomics & Bioinformatics. 2014, 12: 178-186. doi:10.1016/j.gpb.2014.05.002.
16. Cowley MJ, et al: PINA v2.0: mining interactome modules. Nucl Acids Res. 2012, 40: 862-865. doi:10.1093/nar/gkr967.
17. Bessarabova, et al: Knowledge-based analysis of proteomics data. BMC Bioinformatics. 2012, 13 (Suppl 16): S13.
18. Kerrien S, et al: IntAct - open source resource for molecular interaction data. Nucleic Acids Res. 2007, 35: 561-565.
19. Chatr-Aryamontri A, et al: MINT: the Molecular INTeraction database. Nucleic Acids Res. 2007, 35: 572-574. doi:10.1093/nar/gkl950.
20. Breitkreutz BJ, et al: The BioGRID interaction database: 2008 update. Nucleic Acids Res. 2008, 36: 637-640.
21. Salwinski L, et al: The database of interacting proteins: 2004 update. Nucleic Acids Res. 2004, 32: 449-451.
22. Peri S, et al: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res. 2003, 13: 2363-2371. doi:10.1101/gr.1680803.
23. Guldener U, et al: MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res. 2006, 34: 436-441. doi:10.1093/nar/gkj451.
24. Kanehisa M, et al: From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006, 34: 354-357. doi:10.1093/nar/gkj102.
25. Finn RD, et al: The Pfam protein families database. Nucleic Acids Res. 2014, 42: 222-230. doi:10.1093/nar/gkt1223.
26. Ashburner M, et al: Gene ontology: tool for the unification of biology. Nature Genetics. 2000, 25: 25-29. doi:10.1038/75556.
27. Franceschini A, et al: STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 2013, 41: 808-815. doi:10.1093/nar/gks1094.
28. Nishimura D: BioCarta. Biotech Software & Internet Report. 2001, 2 (3): 117-120. doi:10.1089/152791601750294344.
29. Pico A, et al: WikiPathways: pathway editing for the people. PLoS Biol. 2008, 6: e184. doi:10.1371/journal.pbio.0060184.
30. Joshi-Tope G, et al: Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. 2005, 33: D428.
31. Schaefer C, et al: PID: the pathway interaction database. Nucleic Acids Res. 2009, 37: D674. doi:10.1093/nar/gkn653.
32. Apweiler R, et al: The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res. 2001, 29: 37-40. doi:10.1093/nar/29.1.37.
33. Online Mendelian Inheritance in Man (OMIM). Retrieved on September 18, 2014, [http://omim.org]
34. UniProt id mapping data for human proteins. Retrieved on September 8, 2014, [ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/HUMAN_9606_idmapping.dat.gz]
35. Gene ontology hierarchy data file. Generated on September 6, 2014, [http://geneontology.org/ontology/go-basic.obo]
36. Gene Ontology Annotation database (UniProt-GOA). Generated on September 1, 2014, [http://www.ebi.ac.uk/GOA]
37. Hwang F, Richards D, Winter P: The Steiner Tree Problem. 1992, Elsevier, Amsterdam.
38. Sadeghi A, Fröhlich H: Steiner tree methods for optimal sub-network identification: an empirical study. BMC Bioinformatics. 2013, 14: 144. doi:10.1186/1471-2105-14-144.
39. Lin D: An information-theoretic definition of similarity. Proceedings of the 15th International Conference on Machine Learning. Edited by: Kaufmann M. 1998, 296-304.
40. Freeman L: A set of measures of centrality based on betweenness. Sociometry. 1977, 40: 35-41. doi:10.2307/3033543.
41. Testa U: Apoptotic mechanisms in the control of erythropoiesis. Leukemia. 2004, 18: 1176-1199. doi:10.1038/sj.leu.2403383.
42. Edden RA, et al: Reduced GABA concentration in attention-deficit/hyperactivity disorder. Arch Gen Psychiatry. 2012, 69: 750-753.
43. Johansson J, et al: Altered tryptophan and alanine transport in fibroblasts from boys with attention-deficit/hyperactivity disorder (ADHD): an in vitro study. Behav Brain Funct. 2011, 7: 40. doi:10.1186/1744-9081-7-40.
44. Levine A, et al: Molecular mechanism for a gateway drug: epigenetic changes initiated by nicotine prime gene expression by cocaine. Sci Transl Med. 2011, 3: 107ra109.
45. Yi SS, Kansagra SM: Associations of sodium intake with obesity, body mass index, waist circumference, and weight. Am J Prev Med. 2014, 46: e53-5.
46. Van der Weerd K, Dik WA, Schrijver B, et al: Morbidly Obese Human Subjects Have Increased Peripheral Blood CD4+ T Cells With Skewing Toward a Treg- and Th2-Dominated phenotype. Diabetes. 2012, 61: 401-408. doi:10.2337/db11-1065.
47. Ma H, Patti ME: Bile acids, obesity, and the metabolic syndrome. Best Pract Res Clin Gastroenterol. 2014, 28: 573-83. doi:10.1016/j.bpg.2014.07.004.
48. Lowell BB, Bachman ES: Beta-Adrenergic receptors, diet-induced thermogenesis, and obesity. J Biol Chem. 2003, 278: 29385-8. doi:10.1074/jbc.R300011200.
49. Wurtman RJ, Wurtman JJ: Brain Serotonin, Carbohydrate-craving, obesity and depression. Adv Exp Med Biol. 1996, 398: 35-41. doi:10.1007/978-1-4613-0381-7_4.
50. Gregor MF, Hotamisligil GS: Inflammatory mechanisms in obesity. Annu Rev Immunol. 2011, 29: 415-445. doi:10.1146/annurev-immunol-031210-101322.
51. McKnight GS, Cummings DE, Amieux PS, et al: Cyclic AMP, PKA, and the physiological regulation of adiposity. Recent Prog Horm Res. 1998, 53: 139-59.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.