PutidaNET: Interactome database service and network analysis of Pseudomonas putida KT2440
- Seong-Jin Park†1,
- Jong-Soon Choi†2, 3,
- Byoung-Chul Kim1,
- Seong-Woong Jho1,
- Jea-Woon Ryu1,
- Daeui Park1,
- Kyung-A Lee1,
- Jong Bhak1 (corresponding author) and
- Seung Il Kim2 (corresponding author)
© Park et al; licensee BioMed Central Ltd. 2009
Published: 3 December 2009
Pseudomonas putida KT2440 (P. putida KT2440) is a highly versatile saprophytic soil bacterium. It is a certified bio-safety host for transferring foreign genes. Therefore, the bacterium is used as a model organism for genetic and physiological studies and for the development of biotechnological applications. In order to provide a more systematic application of the organism, we have constructed a protein-protein interaction (PPI) network analysis system of P. putida KT2440.
PutidaNET is a comprehensive interaction database and server for P. putida KT2440, generated using three protein-protein interaction (PPI) prediction methods: PSIMAP (Protein Structural Interactome MAP), PEIMAP (Protein Experimental Interactome MAP), and domain-domain interactions using iPfam. PutidaNET contains 3,254 proteins and 82,019 possible interactions, consisting of 61,011 (PSIMAP), 4,293 (PEIMAP), and 30,043 (iPfam) interaction pairs, excluding self-interactions. We also performed a case study by integrating the protein interaction network with experimental 1-DE/MS-MS analysis data from P. putida. We found that 1) major functional modules are involved in various metabolic pathways and ribosomes, and 2) existing PPI sub-networks that are specific to succinate or benzoate metabolism are not in the center of the network as predicted.
We introduce PutidaNET, which provides predicted interaction partners and functional analyses such as physicochemical properties, KEGG pathway assignment, and Gene Ontology mapping for P. putida KT2440. PutidaNET is freely available at http://sequenceome.kobic.kr/PutidaNET.
P. putida KT2440 is a ubiquitous bacterium which can break down a variety of organic materials for food. Because of its versatile metabolic activities, P. putida KT2440 is thought to play a pivotal role in the recycling of organic wastes and the degradation of biogenic and xenobiotic pollutants in the environment [1, 2]. Because the organism can use various carbon sources, we wanted to know how its protein networks differ according to the substrate. To simplify the culture conditions, we selected succinate and benzoate as sole carbon sources. Succinate, an easily utilized carbon source, and benzoate, which requires biochemical degradation, were chosen for a comparative network analysis combined with the corresponding proteomic data. The interactome of a species provides important clues for interpreting the metabolic pathways of its constituent enzymes and its global protein networks, which facilitates understanding of the mechanisms responsible for cellular functions. Recently, the genome-scale identification of protein-protein interactions (PPIs) in model organisms such as Synechocystis sp. PCC 6803 and Xanthomonas oryzae has been published to map whole protein-protein interaction networks [3, 4]. Thanks to advanced high-throughput PPI experiments and information technology, many biologists can access large-scale species-specific PPI data on the web. Several web sites have been developed to disseminate PPI data, such as POINT, OPHID, and PIANA. The POINT and OPHID systems provide predicted PPI information using sequence homology. PIANA integrates several protein and interaction databases. However, these web services do not offer the following: structural domain or domain-domain interaction methods, a graphical viewer for interaction networks, functional annotation, localization, or the physicochemical properties of the proteins in the PPI data.
We constructed a web-based server, PutidaNET, specifically for P. putida using major PPI algorithms. Functional and physicochemical annotations are provided using KEGG, Gene Ontology, amino acid distribution, instability index, isoelectric point, GRAVY score, and sub-cellular localization. PutidaNET is designed to be user-friendly and easy to use.
Prediction of protein-protein interaction
The prediction of PPIs is based on PSIMAP [11, 12], PEIMAP, and iPfam (domain-domain interaction). PSIMAP predicts interactions among proteins by using the BLASTP algorithm with a common expectation value (E-value) cut-off of 0.0001. Interactions among domains or proteins in known PDB (Protein Data Bank, http://www.rcsb.org/pdb) structures are the basis of the predictions. PEIMAP integrates various experimental protein-protein interaction databases such as BIND, DIP, IntAct, MINT, HPRD, CYGD, and BioGRID. For proteins with no known interactions, PSIMAP and PEIMAP assume that the query tends to interact with the partners of its homologs. This widely used concept is known as 'homologous interaction' [21–23]. In this step, we recruited homologous sequences using PSI-BLAST with a cut-off of 40% sequence identity. Furthermore, we aligned the Pfam domains of all the P. putida KT2440 proteins with hmmpfam using an E-value cut-off of 0.01.
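The 'homologous interaction' assumption can be illustrated with a minimal sketch. It assumes that homology searching has already been done and that its result is available as a mapping from each query protein to the template proteins that passed the cut-off. The function name, the input format, and the identifiers in the usage note are hypothetical and are not part of the PutidaNET implementation.

```python
from itertools import product


def transfer_interactions(homologs, known_pairs):
    """Predict query-query pairs whose homologs are known to interact.

    homologs:    dict mapping a query protein to the set of template
                 proteins that passed the homology cut-off (assumed input).
    known_pairs: iterable of (template_a, template_b) known interactions.
    Returns a set of frozensets, each holding two distinct query proteins.
    """
    # Invert the homolog map: template protein -> query proteins.
    by_template = {}
    for query, templates in homologs.items():
        for template in templates:
            by_template.setdefault(template, set()).add(query)

    predicted = set()
    for a, b in known_pairs:
        queries_a = by_template.get(a, ())
        queries_b = by_template.get(b, ())
        for qa, qb in product(queries_a, queries_b):
            if qa != qb:  # self-interactions are excluded
                predicted.add(frozenset((qa, qb)))
    return predicted
```

For example, with `homologs = {"Q1": {"T1"}, "Q2": {"T2"}, "Q3": {"T1", "T3"}}` and `known_pairs = [("T1", "T2"), ("T3", "T3")]`, the sketch predicts the pairs Q1-Q2 and Q2-Q3 and drops the self-pair that Q3 would inherit from the T3 homodimer.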
In order to select more reliable PPIs, we developed and used a 'combined score' between any pair of proteins which were predicted by PEIMAP, PSIMAP, and iPfam algorithms. This scoring method is also used by the STRING server http://string.embl.de.
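The exact formula of the combined score is not given here. As an illustration only, the sketch below assumes the probabilistic combination commonly associated with STRING, in which per-method confidences in [0, 1] are treated as independent and combined as 1 - Π(1 - s_i). Any prior correction is omitted, and the function name is hypothetical.

```python
def combined_score(scores):
    """Combine per-method confidence scores as 1 - prod(1 - s_i).

    scores: iterable of confidences in [0, 1], one per prediction method
            (for example PSIMAP, PEIMAP, and iPfam) that supports the pair.
    Assumes the methods provide independent evidence.
    """
    remaining = 1.0  # probability that every method is wrong
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        remaining *= 1.0 - s
    return 1.0 - remaining
```

Under this scheme a pair supported by several methods always ranks at least as high as the same pair supported by any one of them alone, which matches the intent of selecting more reliable PPIs.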
Protein function annotation
In order to understand the biological functions of the P. putida KT2440 proteome, we computed physicochemical properties and searched cross-reference databases using KEGG and GO. We used Biopython modules to acquire physicochemical properties, including the hydropathy profile, GRAVY score (the average hydropathy score), molecular weight, amino acid distribution, isoelectric point, and protein instability index. In addition, we predicted trans-membrane helices and signal peptides using the Phobius and SignalP 3.0 programs for the sub-cellular localization prediction of the P. putida KT2440 proteome.
PutidaNET provides cross-references to public database information such as 1) KEGG pathways, 2) GO categories, and 3) GO-slim through protein ID mapping. In order to assess the statistical significance of KEGG and GO assignments, we added Fisher's exact test (P-value).
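Two of these properties are simple enough to sketch with the standard library alone. The example below re-implements the GRAVY score with the Kyte-Doolittle hydropathy scale and the amino acid distribution as residue fractions. It is an illustrative stand-in, not the Biopython code used for PutidaNET, and the function names are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy (mean Kyte-Doolittle value)."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def amino_acid_distribution(sequence):
    """Return the fraction of each residue type present in the sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    return {aa: counts[aa] / len(sequence) for aa in sorted(counts)}
```

A positive GRAVY value indicates a predominantly hydrophobic sequence and a negative value a hydrophilic one. This sketch raises `KeyError` on non-standard residue codes, so a production pipeline would need to decide how to treat them.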
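As a minimal sketch of such a test, the function below computes a one-sided Fisher's exact P-value for over-representation of a KEGG pathway or GO term in a protein list, using the hypergeometric distribution. The one-sided form and the parameter names are assumptions made for illustration. The text does not specify which variant PutidaNET uses.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for enrichment.

    k: proteins in the query list annotated with the term
    n: size of the query list
    K: proteins in the background annotated with the term
    N: size of the background (for example, the whole proteome)
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

A small P-value suggests that the term is annotated to more proteins in the list than expected by chance. When many terms are tested at once, a multiple-testing correction would normally be applied on top of this.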
Protein network analysis case study
Cell culture and MS/MS analysis
In order to find significant features, we integrated the PPI network with proteomic data which were produced as previously described. P. putida KT2440 was pre-cultured at 30°C with vigorous shaking in culture media (50 mM potassium phosphate buffer, pH 6.25, 3.4 mM MgSO4, 0.3 mM FeSO4, 0.2 mM CaCO3, 10 mM NH4Cl, and 10 mM sodium succinate) and then inoculated into 1 L culture media containing succinate (10 mM) or benzoate (5 mM) as a sole carbon source. The bacteria were harvested at the late exponential phase (absorbance at 600 nm, 0.7-0.8) and suspended in 20 mM Tris-HCl buffer (pH 8.0). Bacteria were disrupted by a French pressure cell (SLM AMINCO, Urbana, IL) at 20,000 lb/in2, and soluble protein mixtures were prepared by centrifugation (15,000 g, 45 min). The protein samples were fractionated by 12% SDS-PAGE. The gel lanes were divided into 42 fractions according to molecular weight, and the sliced gels were digested with trypsin (Promega, Madison, WI). The resulting peptide extracts were pooled and lyophilized in a vacuum concentrator. Tryptic peptides were dissolved in 0.5% TFA (trifluoroacetic acid) solution prior to further 2D-LC fractionation and analyzed by MS/MS using an LTQ linear ion trap MS (ThermoFinnigan, San Jose, CA). For the database search, the P. putida protein database was downloaded from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). Tryptic peptides were identified using SEQUEST (version 3.1 SR1, ThermoFinnigan).
For better accuracy of protein identification by MS/MS analysis, the P. putida protein database and the reversed protein database were searched together to exclude falsely identified proteins.
We acquired the lists of proteins expressed in culture media containing succinate or benzoate. In order to find sub-networks regulated by succinate and benzoate, we analyzed betweenness centrality (BC), the number of shortest paths going through a certain node, and degree, the number of interaction partners, using NetworkAnalyzer, a Cytoscape plugin. We used the R software with additional packages and the Welch two-sample t-test to calculate P-values. We also found potential functional modules using MCODE, a Cytoscape plugin that finds clusters (highly interconnected regions) in protein networks.
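The reversed-database strategy can be sketched as a target-decoy estimate of the false discovery rate. The example assumes that each identification carries a score and a flag saying whether it matched the reversed (decoy) database. The ratio of decoy hits to target hits used here is one common estimator, and neither the estimator nor the names below are taken from the original analysis.

```python
def decoy_fdr(identifications, threshold):
    """Estimate the false discovery rate at a score threshold.

    identifications: iterable of (score, is_decoy) tuples, where is_decoy
                     is True for matches to the reversed database
                     (assumed input format).
    threshold:       minimum score for an identification to be accepted.
    Returns decoy hits divided by target hits among accepted matches.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in identifications:
        if score < threshold:
            continue
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    if targets == 0:
        return 0.0  # nothing accepted from the target database
    return decoys / targets
```

Raising the threshold until the estimate falls below a chosen level, for example 1%, gives a score cut-off at which few false identifications are expected to remain.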
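The two topological measures can be reproduced outside Cytoscape with a short sketch. The function below computes degree and unnormalized betweenness centrality for an undirected, unweighted network using Brandes' algorithm. It is an illustrative re-implementation rather than the NetworkAnalyzer code, and tools may report a normalized variant of BC, so absolute values can differ.

```python
from collections import deque


def degree_and_betweenness(edges):
    """Return (degree, betweenness) dicts for an undirected network.

    edges: iterable of (protein_a, protein_b) pairs; self-loops are ignored.
    Betweenness is the unnormalized shortest-path count per node.
    """
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    betweenness = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        sigma = dict.fromkeys(adj, 0)
        sigma[source] = 1
        dist = {source: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                betweenness[w] += delta[w]

    # Each unordered pair was counted from both endpoints.
    betweenness = {v: c / 2.0 for v, c in betweenness.items()}
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    return degree, betweenness
```

On a three-protein chain A-B-C, B has degree 2 and betweenness 1 because the only shortest path between A and C passes through it, while A and C have degree 1 and betweenness 0.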
PutidaNET, a freely accessible database covering 3,254 proteins of P. putida KT2440, contains 82,019 predicted PPIs. Using the PPI algorithms, we obtained 61,011 (PSIMAP), 4,293 (PEIMAP), and 30,043 (iPfam) predicted PPIs, excluding self-interactions. These correspond to around 74.39% (PSIMAP), 5.23% (PEIMAP), and 36.62% (iPfam) of the total predicted interactions; the percentages sum to more than 100% because some pairs are predicted by more than one method. Although the total number of predicted interaction targets is very large, they are ranked by combined score, so experimentalists can select high-ranking (more probable) ones according to their functional interests.
Pathway analysis of five modules obtained from MCODE
rpsM, rpsD, rpsE, rplC
rplV, rplA, rpsT, rplD, rpsC, rplM, rpsB, rpsA
PP_2589, PP_3463, PP_2680
Urea cycle and metabolism of amino groups
PP_2589, PP_5278, PP_3463, PP_2680
3-Chloroacrylic acid degradation
PP_2589, PP_3463, PP_2680
Ascorbate and aldarate metabolism
PP_2589, PP_3463, PP_2680
PP_2589, PP_3463, PP_2680
PP_2589, PP_3463, PP_2680, ilvB
Bile acid biosynthesis
PP_2589, PP_3463, PP_2680
PP_2589, PP_3463, PP_2680
Limonene and pinene degradation
PP_2589, PP_3463, PP_2680
PP_2589, PP_3463, PP_2680
ABC transporters - General
PP_0225, aapP, PP_1068, PP_5022
rplI, rplB, rplQ, rplL, rpmB, rplP, rplE, rplX
aceE, eno, aceF, lpdG
rpsG, rpsN, rplR, rplS, rplW, rpsH
PutidaNET integrates mutually complementary protein-protein interaction information for the systematic analysis of Pseudomonas putida. The PutidaNET server is the first web server that provides various kinds of functional information for P. putida KT2440, such as a PPI viewer, physicochemical properties, biological pathways, gene ontology, and protein-protein interactions. It helps researchers access and obtain this information through automatic annotation of queried proteins. Using proteomics data from defined medium conditions, we analyzed the characteristics of P. putida KT2440 using PutidaNET. The proteomic data provided quantitative information on proteins induced under benzoate or succinate culture conditions, which supplements the database. PPI data combined with proteomic data can give users more specific information.
Other papers from the meeting have been published as part of BMC Bioinformatics Volume 10 Supplement 15, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Bioinformatics, available online at http://www.biomedcentral.com/1471-2105/10?issue=S15.
We thank our colleagues at KOBIC and Maryana Bhak for editing the article. This work was supported by a grant from the KRIBB Research Initiative Program of Korea and by a Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MOST) (No. M10508040002-07N0804-00210). In addition, this work was supported by a K-MeP grant (No. T28021) of Korea Basic Science Institute and the Marine and Extreme Genome Research Center Program of the Ministry of Land, Transportation and Maritime.
This article has been published as part of BMC Genomics Volume 10 Supplement 3, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/10?issue=S3.
- Kim YH, Cho K, Yun SH, Kim JY, Kwon KH, Yoo JS, Kim SI: Analysis of aromatic catabolic pathways in Pseudomonas putida KT 2440 using a combined proteomic approach: 2-DE/MS and cleavable isotope-coded affinity tag analysis. Proteomics. 2006, 6 (4): 1301-1318. 10.1002/pmic.200500329.
- Larimer FW, Chain P, Hauser L, Lamerdin J, Malfatti S, Do L, Land ML, Pelletier DA, Beatty JT, Lang AS, et al: Complete genome sequence of the metabolically versatile photosynthetic bacterium Rhodopseudomonas palustris. Nat Biotechnol. 2004, 22 (1): 55-61. 10.1038/nbt923.
- Kim WY, Kang S, Kim BC, Oh J, Cho S, Bhak J, Choi JS: SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803. BMC Bioinformatics. 2008, 9 (Suppl 1): S20. 10.1186/1471-2105-9-S1-S20.
- Kim JG, Park D, Kim BC, Cho SW, Kim YT, Park YJ, Cho HJ, Park H, Kim KB, Yoon KO, et al: Predicting the interactome of Xanthomonas oryzae pathovar oryzae for target selection and DB service. BMC Bioinformatics. 2008, 9: 41. 10.1186/1471-2105-9-41.
- Park D, Kim BC, Cho SW, Park SJ, Choi JS, Kim SI, Bhak J, Lee S: MassNet: a functional annotation service for protein mass spectrometry data. Nucleic Acids Res. 2008, 36 (Web Server issue): W491-495. 10.1093/nar/gkn241.
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 2004, 5 (5): R35. 10.1186/gb-2004-5-5-r35.
- Brown KR, Jurisica I: Online predicted human interaction database. Bioinformatics. 2005, 21 (9): 2076-2082. 10.1093/bioinformatics/bti273.
- Aragues R, Jaeggi D, Oliva B: PIANA: protein interactions and network analysis. Bioinformatics. 2006, 22 (8): 1015-1017. 10.1093/bioinformatics/btl072.
- Kanehisa M, Goto S: KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28 (1): 27-30. 10.1093/nar/28.1.27.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556.
- Park J, Lappe M, Teichmann SA: Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast. J Mol Biol. 2001, 307 (3): 929-938. 10.1006/jmbi.2001.4526.
- Gong S, Yoon G, Jang I, Bolser D, Dafas P, Schroeder M, Choi H, Cho Y, Han K, Lee S, et al: PSIbase: a database of Protein Structural Interactome map (PSIMAP). Bioinformatics. 2005, 21 (10): 2541-2543. 10.1093/bioinformatics/bti366.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.
- Bader GD, Hogue CW: BIND--a data specification for storing and describing biomolecular interactions, molecular complexes and pathways. Bioinformatics. 2000, 16 (5): 465-477. 10.1093/bioinformatics/16.5.465.
- Xenarios I, Rice DW, Salwinski L, Baron MK, Marcotte EM, Eisenberg D: DIP: the database of interacting proteins. Nucleic Acids Res. 2000, 28 (1): 289-291. 10.1093/nar/28.1.289.
- Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A: IntAct: an open source molecular interaction database. Nucleic Acids Res. 2004, 32 (Database issue): D452-455. 10.1093/nar/gkh052.
- Zanzoni A, Montecchi-Palazzi L, Quondam M, Ausiello G, Helmer-Citterich M, Cesareni G: MINT: a Molecular INTeraction database. FEBS Lett. 2002, 513 (1): 135-140. 10.1016/S0014-5793(01)03293-8.
- Peri S, Navarro JD, Kristiansen TZ, Amanchy R, Surendranath V, Muthusamy B, Gandhi TK, Chandrika KN, Deshpande N, Suresh S: Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res. 2004, 32 (Database issue): D497-501. 10.1093/nar/gkh070.
- Guldener U, Munsterkotter M, Kastenmuller G, Strack N, van Helden J, Lemer C, Richelles J, Wodak SJ, Garcia-Martinez J, Perez-Ortin JE: CYGD: the Comprehensive Yeast Genome Database. Nucleic Acids Res. 2005, 33 (Database issue): D364-368.
- Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006, 34 (Database issue): D535-539. 10.1093/nar/gkj109.
- Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D: A combined algorithm for genome-wide prediction of protein function. Nature. 1999, 402 (6757): 83-86. 10.1038/47048.
- Walhout AJ, Sordella R, Lu X, Hartley JL, Temple GF, Brasch MA, Thierry-Mieg N, Vidal M: Protein interaction mapping in C. elegans using proteins involved in vulval development. Science. 2000, 287 (5450): 116-122. 10.1126/science.287.5450.116.
- Deane CM, Salwinski L, Xenarios I, Eisenberg D: Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics. 2002, 1 (5): 349-356. 10.1074/mcp.M100037-MCP200.
- Sonnhammer EL, Eddy SR, Birney E, Bateman A, Durbin R: Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res. 1998, 26 (1): 320-322. 10.1093/nar/26.1.320.
- von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B: STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 2003, 31 (1): 258-261. 10.1093/nar/gkg034.
- Chapman B, Chang J: Biopython: Python tools for computational biology. ACM SIGBIO Newsletter. 2000, 20 (2): 15-19. 10.1145/360262.360268.
- Kall L, Krogh A, Sonnhammer EL: A combined transmembrane topology and signal peptide prediction method. J Mol Biol. 2004, 338 (5): 1027-1036. 10.1016/j.jmb.2004.03.016.
- Emanuelsson O, Brunak S, von Heijne G, Nielsen H: Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc. 2007, 2 (4): 953-971. 10.1038/nprot.2007.131.
- Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R, Apweiler R: The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res. 2004, 32 (Database issue): D262-266. 10.1093/nar/gkh021.
- Park GW, Kwon KH, Kim JY, Lee JH, Yun SH, Kim SI, Park YM, Cho SY, Paik YK, Yoo JS: Human plasma proteome analysis by reversed sequence database search and molecular weight correlation based on a bacterial proteome analysis. Proteomics. 2006, 6 (4): 1121-1132. 10.1002/pmic.200500318.
- Assenov Y, Ramírez F, Schelhorn SE, Lengauer T, Albrecht M: Computing topological parameters of biological networks. Bioinformatics. 2008, 24 (2): 282-284. 10.1093/bioinformatics/btm554.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5 (10): R80. 10.1186/gb-2004-5-10-r80.
- Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2. 10.1186/1471-2105-4-2.
- Joy MP, Brock A, Ingber DE, Huang S: High-betweenness proteins in the yeast protein interaction network. J Biomed Biotechnol. 2005, 2005 (2): 96-103. 10.1155/JBB.2005.96.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.