Comparative genomics of metabolic networks of free-living and parasitic eukaryotes
© Nerima et al; licensee BioMed Central Ltd. 2010
Received: 11 May 2009
Accepted: 31 March 2010
Published: 31 March 2010
Obligate endoparasites often lack particular metabolic pathways as compared to free-living organisms. This phenomenon comprises anabolic as well as catabolic reactions. Presumably, the corresponding enzymes were lost in adaptation to parasitism. Here we compare the predicted core metabolic graphs of obligate endoparasites and non-parasites (free-living organisms and facultative parasites) in order to analyze how the parasites' metabolic networks shrank in the course of evolution.
Core metabolic graphs comprising biochemical reactions present in the presumed ancestor of parasites and non-parasites were reconstructed from the Kyoto Encyclopedia of Genes and Genomes. While the parasites' networks had fewer nodes (metabolites) and edges (reactions), other parameters such as average connectivity, network diameter and number of isolated edges were similar in parasites and non-parasites. The parasites' networks contained a higher percentage of ATP-consuming reactions and a lower percentage of NAD-requiring reactions. Control networks, shrunk to the size of the parasites' by random deletion of edges, were scale-free but exhibited smaller diameters and more isolated edges.
The parasites' networks were smaller than those of the non-parasites regarding number of nodes or edges, but not regarding network diameters. Network integrity but not scale-freeness has acted as a selective principle during the evolutionary reduction of parasite metabolism. ATP-requiring reactions in particular have been retained in the parasites' core metabolism while NADH- or NADPH-requiring reactions were lost preferentially.
Unicellular endoparasites are the causative agents of a plethora of human diseases: malaria, sleeping sickness, Chagas' disease, toxoplasmosis, leishmaniasis, amoebic dysentery and many more, particularly in the tropics. Obligate endoparasitism among the protozoa is of polyphyletic origin. Nevertheless, the different parasites exhibit striking similarities regarding their metabolism, which is reduced - or streamlined - compared to free-living organisms. The parasites have lost metabolic functions in the course of evolution, presumably in adaptation to life within a foreign host organism. For instance, all obligate endoparasitic protozoa are incapable of purine de novo synthesis; they lack the corresponding genes and import exogenous purines from their hosts to synthesize nucleic acids [2, 3]. The phenomenon also includes catabolic pathways: the intestinal parasites Entamoeba histolytica and Giardia duodenalis lack mitochondria, and hence oxidative phosphorylation and respiration. Trypanosoma brucei possesses mitochondria, but these are only functional in the insect stages; the bloodstream stages rely on substrate-level phosphorylation to generate ATP; the same may apply to Plasmodium falciparum. The trend towards metabolic simplification is also apparent from endoparasitic bacteria such as Treponema pallidum or Mycoplasma genitalium [8, 9], which lack the genes for purine synthesis and those for pyrimidine synthesis (and the Krebs cycle as well). Thus the reduction of metabolic complexity is a convergent trait among endoparasites. Parasite metabolism is of interest not only to the evolutionary biologist but also to the pharmacologist. With the advent of large compound libraries and high-throughput screening facilities, the identification of suitable drug targets has become the bottleneck in antiparasitic hit discovery.
Comparison of host and parasite metabolism may reveal vulnerable points for chemotherapeutic intervention, such as enzymes that are essential for the parasite and do not have orthologues in the host (or whose orthologues in the host are regulated differently). Alternatively, prodrugs may be designed which are specifically activated by metabolic conversion in the parasite. Proof of principle for this strategy was obtained with purine antimetabolites targeted towards Toxoplasma gondii.
Traditionally, the metabolism of a cell has been represented as a network of interlinked pathways. More recently, representation of the metabolism as a graph, where metabolites are the nodes and enzymes are the edges (and each node appears exactly once), has provided novel insights. When the mathematical concepts of graph theory that had originally been developed for quantitative analysis of social networks were applied to metabolism, the resulting graphs shared a number of characteristics with other real-world networks, namely a short average path length (the 'small world') and a scale-free, i.e. power-law, frequency distribution of the number of edges per node [14, 15]. In addition, the representation of metabolic networks as hypergraphs, with different types of nodes to represent metabolites and biochemical reactions, has allowed the application of set algebra for quantitative comparison of reconstructed metabolic graphs. Power-law frequency distributions in the number of links are thought to result from the fact that new nodes joining the network preferentially attach to highly-linked ones ('the rich get richer'). Therefore, networks are generally studied in terms of their expansion. The shrinkage of networks has mainly been analyzed in the context of how the targeted removal of nodes or edges may lead to collapse. Hence there is a third reason - besides the study of convergent evolution among parasites and the quest for new drug targets - to study parasite metabolism: it provides an opportunity to study how networks shrink in a natural way that maintains functionality. Genome-scale reconstruction of metabolic networks is most advanced for bacteria [18–20], where comparative analyses have yielded novel insights into the evolution of metabolic modules and the adaptation of microorganisms to different habitats [21, 22].
Host-pathogen comparisons are contributing to antimicrobial drug discovery and to a deeper understanding of metabolic adaptations in parasitism [23, 24]. Thanks to the wealth of available data, a metabolic model of E. coli was reconstructed that integrates information on gene expression and metabolic flux. This in turn has allowed the simulation of the shrinkage of metabolic networks in bacteria by randomly deleting enzymes and calculating the effects on performance based on the predicted biomass production. However, it is notoriously difficult to classify bacteria as parasitic or free-living, since the versatility of species such as Legionella defies categorization. For eukaryotes, the distinction between obligate endoparasites, facultative parasites, and free-living organisms is more straightforward. With the completion of the genome projects for a number of obligate endoparasitic protozoa, their metabolic networks can be reconstructed in silico from the predicted proteomes [27–30]. Here we compare the predicted core metabolic graphs of protozoan endoparasites, their human host, and reference organisms, aiming to identify convergent trends in the reductive evolution of metabolic networks in parasites.
In silico reconstruction of core metabolic networks from parasites
Quantitative comparison of the core metabolic networks
Table 2. Basic graph properties

                                Non-parasites      Parasites          Randomly shrunk
Nodes                           483 ± 52           287 ± 67           333 ± 8.6
Edges                           539 ± 79           278 ± 80           279 ± 7.0
Density                         0.0046 ± 0.0003    0.0070 ± 0.001     0.0049 ± 0.0002
Avg. degree                     2.2 ± 0.1          1.9 ± 0.1          1.7 ± 0.0
[row label missing]             11.3 ± 1.5         8.3 ± 0.7          7.2 ± 1.5
Avg. Path length                2.9 ± 0.44         1.2 ± 0.89         0.12 ± 0.07
Max. Path length (diameter)     25.6 ± 1.1         22.7 ± 7.1         15.4 ± 3.9
Global clustering coefficient   0.059 ± 0.009      0.039 ± 0.014      0.015 ± 0.006
Isolated edges                  50 ± 9             51 ± 4             74 ± 6
How did the parasites' core metabolic networks shrink?
To study the selective forces that governed the loss of particular metabolic enzymes and pathways in obligate endoparasites, their reconstructed core metabolic graphs were compared to experimental graphs, generated by random removal of edges from the reference network until they reached the average size of the parasites' graphs. The resulting networks are unlikely to be functional and served only as a negative control. These negative control networks differed from the natural networks in that they were less coherent: the randomly shrunk graphs contained significantly (Kruskal-Wallis followed by Dunn's multiple comparison test, p = 0.0001) more isolated edges than those of either parasites or non-parasites (Table 2), and even after removal of these isolated edges, nodes of degree 1 were overrepresented in the random graphs (Figure 2, intersection with ordinate). The randomly shrunk graphs also exhibited a lower global clustering than the real metabolic networks (Table 2). The reference, the parasite, and the non-parasite graphs, while strongly differing in number of nodes and edges, all had a diameter of around 24, which is in agreement with previous studies on E. coli. In contrast, the randomly shrunk graphs had a significantly (Kruskal-Wallis followed by Dunn's multiple comparison test, p = 0.0024) smaller diameter of around 15 (Table 2), reflecting the decomposition of the network into medium-sized entities. This was also apparent from the very small average path length of the randomly shrunk graphs (Table 2).
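The negative control described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Perl script used in the study: the function names, the toy edge list and the definition of an isolated edge (an edge whose two end nodes both have degree 1) are hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def shrink_randomly(edges, target_size, seed=None):
    """Return a random subset of `target_size` edges.

    Sampling without replacement is equivalent to deleting randomly
    chosen edges one by one until `target_size` edges remain.
    """
    # Treat the graph as undirected and drop duplicate edges.
    unique = sorted({tuple(sorted(edge)) for edge in edges})
    if target_size > len(unique):
        raise ValueError("target_size exceeds the number of edges")
    rng = random.Random(seed)
    return rng.sample(unique, target_size)


def count_isolated_edges(edges):
    """Count edges whose two end nodes both have degree 1 (assumed definition)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(1 for u, v in edges if degree[u] == 1 and degree[v] == 1)


# Hypothetical toy reference graph: a linear chain of six metabolites.
reference = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
control = shrink_randomly(reference, target_size=3, seed=1)
n_isolated = count_isolated_edges(control)
```

Repeating the shrinkage with different seeds and collecting `n_isolated` for each run would give a distribution that can be compared with the values observed for the real networks.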
To further identify potential factors that shaped core metabolism in the parasites, we tested the hypothesis that the frequency distribution of the number of links per node may have exerted a selective pressure: possibly, scale-freeness had to be maintained in order to preserve the robustness of the metabolic networks.
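One simple way to inspect the frequency distribution of links per node is to tabulate node degrees and fit a straight line to the log-log plot. The sketch below assumes an undirected edge list and uses an ordinary least-squares fit, which is a crude estimator of a power-law exponent and is not necessarily the procedure used for the figures of this study; all names are hypothetical. The intercept of the fitted line corresponds to the (log) frequency of nodes of degree 1.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the number of nodes that have degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def loglog_fit(freq):
    """Least-squares line through (log10 k, log10 count).

    Returns (slope, intercept). For a power law, count ~ k**slope,
    and the intercept is log10 of the fitted count at k = 1.
    """
    points = [(math.log10(k), math.log10(n)) for k, n in freq.items()]
    if len(points) < 2:
        raise ValueError("need at least two distinct degrees")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy graph: one hub linked to three peripheral nodes.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
freq = degree_distribution(star)      # three nodes of degree 1, one of degree 3
slope, intercept = loglog_fit(freq)
```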
Discussion and Conclusion
To study in an evolutionary context the convergence in reduction of metabolic complexity among different parasites, we focused on metabolic pathways that must have been present in the free-living ancestor of all the organisms included (Table 1). Core metabolic graphs comprising glycolysis, gluconeogenesis, the Krebs cycle, pentose phosphate pathway, purine, pyrimidine, and amino acid metabolism were reconstructed from the KEGG collection of databases for obligate endoparasitic eukaryotes (referred to as 'parasites'), and free-living or facultative parasitic organisms (referred to as 'non-parasites'). As expected, the resulting core metabolic graphs of the parasites were significantly smaller than those of the non-parasites with regard to the number of nodes (metabolites) or edges (reactions). The validity of this finding depends on the status of functional annotation in the KEGG data: since the non-parasites are predominantly model organisms, the observed difference in network size may be caused by the higher accuracy of annotation in the predicted proteomes of the non-parasites compared to the parasites. However, the finding that the parasites possess reduced core metabolic networks is in agreement with other studies and with biochemical data. Furthermore, the facultative pathogen Candida albicans (which is not a model organism) clustered with the free-living eukaryotes, while the obligate endoparasite Trypanosoma brucei (which has a high quality of gene annotation) clustered with the parasites (Figure 1). Thus we do not think that the striking differences in network size between the parasites and the non-parasites were artefacts due to the different standards of functional annotation in the analyzed proteomes.
The parasites' networks did not exhibit smaller diameters than those of the non-parasites (Figure 1b) while control graphs, shrunk to the size of the parasites' by random deletion of edges from the core metabolic reference graph, were fragmented and had significantly smaller diameters (Table 2). The total number of reactions in a given organism negatively correlated with the percentage of ATP-consuming reactions (Figure 4a) and positively correlated with the percentage of NADH- or NADPH-utilizing reactions (Figure 4b). This indicates that ATP-requiring reactions, or rather the genes encoding the enzymes that catalyze them, have a higher propensity to be retained in the course of network evolution and that the retention of ATP-requiring reactions may be one of the selective forces acting on network evolution. A possible interpretation could be that ATP-consuming reactions are more likely to be essential than reactions that do not involve ATP, so loss of the corresponding enzymes would be more likely to be harmful. We tested this hypothesis on the results from the Saccharomyces Genome Deletion Project [39–41] and found a trend in the expected direction: of all S. cerevisiae genes annotated with EC number (n = 518), 18% were essential for growth on rich glucose medium. Of the genes encoding enzymes that catalyze ATP-consuming reactions (n = 87), 22% were essential while for reactions involving NADH or NADPH (n = 87), the fraction of essential genes was 14%. However, the differences were not statistically significant (p = 0.27, two-tailed Fisher's exact test).
In summary, focusing on core pathways for the reconstruction of metabolic graphs has permitted comparative genomics between obligate endoparasites and free-living (or facultative parasitic) eukaryotes and has identified the preferred retention of ATP-consuming reactions, and the enhanced loss of NADH- or NADPH-utilizing reactions, as potential selective forces which may have acted during the reductive evolution of parasitic metabolism.
For all of the selected core metabolic pathways, educt-product pairs of reactions and the corresponding enzymes were retrieved manually from the Reference pathway maps of the KEGG database and supplemented with data kindly provided by H. Ma and A.P. Zeng (German Research Center for Biotechnology, Braunschweig, Germany). Organism-specific enzyme lists were also obtained from KEGG, via LinkDB. The networks were reconstructed as directed graphs, but for all subsequent analyses treated as undirected. The basic network properties such as average connectivity or network diameter were determined with BioLayout, GraphCrunch, and Bioperl v1.5.2. The metabolites' connectivities and the connectivities' frequency distribution were determined with self-developed Perl scripts. Shrinkage of networks was performed with a Perl script that randomly removed educt-product pairs from metabolic maps until a given size was reached. These scripts are available upon request.
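For readers who wish to rerun this kind of comparison, a two-tailed Fisher's exact test on a 2x2 table can be written with the Python standard library alone. The sketch below is illustrative: the function name is hypothetical, and the counts (19 of 87 and 12 of 87 essential genes) are back-calculated from the rounded percentages given above, so the resulting p-value need not reproduce the reported one. Which groups were contrasted in the original test is likewise an assumption.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # tolerance for float rounding
    )
    return min(1.0, p_value)


# Counts reconstructed from rounded percentages (assumption, see above):
# rows = ATP-consuming vs. NAD(P)H-involving, columns = essential vs. not.
atp_essential = round(0.22 * 87)    # 19
nad_essential = round(0.14 * 87)    # 12
p = fisher_exact_two_tailed(
    atp_essential, 87 - atp_essential, nad_essential, 87 - nad_essential
)
```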
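The graph construction outlined above can be illustrated with a short Python sketch. It is not the Perl, BioLayout or GraphCrunch workflow used in the study; the function names and the toy educt-product pairs are hypothetical, and the formulas for density and average degree are the standard ones for simple undirected graphs, which may differ in detail from the definitions used by those tools.

```python
from collections import defaultdict, deque


def build_undirected(pairs):
    """Build an undirected adjacency map from (educt, product) pairs."""
    adjacency = defaultdict(set)
    for educt, product in pairs:
        if educt == product:
            continue  # ignore self-loops
        adjacency[educt].add(product)
        adjacency[product].add(educt)
    return adjacency


def basic_properties(adjacency):
    """Number of nodes and edges, density and average degree."""
    n = len(adjacency)
    m = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_degree": 2 * m / n if n else 0.0,
    }


def diameter(adjacency):
    """Longest shortest path between any two connected nodes (BFS from each node)."""
    longest = 0
    for start in adjacency:
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        longest = max(longest, max(distance.values()))
    return longest


# Hypothetical toy input, not taken from the KEGG maps used in the study.
pairs = [
    ("glucose", "G6P"),
    ("G6P", "F6P"),
    ("F6P", "FBP"),
    ("G6P", "6PG"),
]
graph = build_undirected(pairs)
properties = basic_properties(graph)
longest_path = diameter(graph)
```

Because the diameter is computed per connected component, a fragmented graph yields the longest shortest path found within any single component.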
We wish to thank H. Ma and A.P. Zeng for providing reaction files. This work was supported by the Swiss National Science Foundation.
- Fairlamb AH: Novel biochemical pathways in parasitic protozoa. Parasitology. 1989, 99: 93-112. 10.1017/S003118200008344X.
- Hassan HF, Coombs GH: Purine and pyrimidine metabolism in parasitic protozoa. FEMS Microbiol Rev. 1988, 4: 47-83.
- de Koning HP, Bridges D, Burchmore RJ: Purine and pyrimidine transport in pathogenic protozoa: from biology to therapy. FEMS Microbiol Rev. 2005, 29: 987-1020. 10.1016/j.femsre.2005.03.004.
- Vanacova S, Liston DR, Tachezy J, Johnson PJ: Molecular biology of the amitochondriate parasites, Giardia intestinalis, Entamoeba histolytica and Trichomonas vaginalis. Int J Parasitol. 2003, 33: 235-255. 10.1016/S0020-7519(02)00267-9.
- van Hellemond JJ, Opperdoes FR, Tielens AG: The extraordinary mitochondrion and unusual citric acid cycle in Trypanosoma brucei. Biochem Soc Trans. 2005, 33: 967-971. 10.1042/BST20050967.
- Painter HJ, Morrisey JM, Mather MW, Vaidya AB: Specific role of mitochondrial electron transport in blood-stage Plasmodium falciparum. Nature. 2007, 446: 88-91. 10.1038/nature05572.
- Fraser CM, Norris SJ, Weinstock GM, White O, Sutton GG, Dodson R, Gwinn M, Hickey EK, Clayton R, Ketchum KA: Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science. 1998, 281: 375-388. 10.1126/science.281.5375.375.
- Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, Fleischmann RD, Bult CJ, Kerlavage AR, Sutton G, Kelley JM: The minimal gene complement of Mycoplasma genitalium. Science. 1995, 270: 397-403. 10.1126/science.270.5235.397.
- Glass JI, Assad-Garcia N, Alperovich N, Yooseph S, Lewis MR, Maruf M, Hutchison CA, Smith HO, Venter JC: Essential genes of a minimal bacterium. Proc Natl Acad Sci USA. 2006, 103: 425-430. 10.1073/pnas.0510013103.
- Lüscher A, de Koning HP, Mäser P: Chemotherapeutic strategies against Trypanosoma brucei: Drug targets vs. drug targeting. Current Drug Targets. 2006, 13: 555-567.
- el Kouni MH, Guarcello V, Al Safarjalani ON, Naguib FN: Metabolism and selective toxicity of 6-nitrobenzylthioinosine in Toxoplasma gondii. Antimicrob Agents Chemother. 1999, 43: 2437-2443.
- Papin JA, Price ND, Wiback SJ, Fell DA, Palsson BO: Metabolic pathways in the post-genome era. Trends Biochem Sci. 2003, 28: 250-258. 10.1016/S0968-0004(03)00064-1.
- Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL: The large-scale organization of metabolic networks. Nature. 2000, 407: 651-654. 10.1038/35036627.
- Barabasi AL: Linked - The New Science of Networks. 2002, Cambridge, MA: Perseus Publishing
- Forst CV, Flamm C, Hofacker IL, Stadler PF: Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation. BMC Bioinformatics. 2006, 7: 67. 10.1186/1471-2105-7-67.
- Barabasi AL, Albert R: Emergence of scaling in random networks. Science. 1999, 286: 509-512. 10.1126/science.286.5439.509.
- Karp PD, Riley M, Paley SM, Pellegrini-Toole A: The MetaCyc Database. Nucleic Acids Res. 2002, 30: 59-61. 10.1093/nar/30.1.59.
- Becker SA, Palsson BO: Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol. 2005, 5: 8. 10.1186/1471-2180-5-8.
- Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121. 10.1038/msb4100155.
- Kreimer A, Borenstein E, Gophna U, Ruppin E: The evolution of modularity in bacterial metabolic networks. Proc Natl Acad Sci USA. 2008, 105: 6976-6981. 10.1073/pnas.0712149105.
- Borenstein E, Kupiec M, Feldman MW, Ruppin E: Large-scale reconstruction and phylogenetic analysis of metabolic environments. Proc Natl Acad Sci USA. 2008, 105: 14482-14487. 10.1073/pnas.0806162105.
- Forst CV: Host-pathogen systems biology. Drug Discov Today. 2006, 11: 220-227. 10.1016/S1359-6446(05)03735-9.
- Borenstein E, Feldman MW: Topological signatures of species interactions in metabolic networks. J Comput Biol. 2009, 16: 191-200. 10.1089/cmb.2008.06TT.
- Reed JL, Vo TD, Schilling CH, Palsson BO: An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 2003, 4: R54. 10.1186/gb-2003-4-9-r54.
- Pal C, Papp B, Lercher MJ, Csermely P, Oliver SG, Hurst LD: Chance and necessity in the evolution of minimal metabolic networks. Nature. 2006, 440: 667-670. 10.1038/nature04568.
- Ma H, Zeng AP: Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics. 2003, 19: 270-277. 10.1093/bioinformatics/19.2.270.
- Pinney JW, Shirley MW, McConkey GA, Westhead DR: metaSHARK: software for automated metabolic network prediction from DNA sequence and its application to the genomes of Plasmodium falciparum and Eimeria tenella. Nucleic Acids Res. 2005, 33: 1399-1409. 10.1093/nar/gki285.
- Pinney JW, Papp B, Hyland C, Wambua L, Westhead DR, McConkey GA: Metabolic reconstruction and analysis for parasite genomes. Trends Parasitol. 2007, 23: 548-554. 10.1016/j.pt.2007.08.013.
- Aoki-Kinoshita KF, Kanehisa M: Gene annotation and pathway mapping in KEGG. Methods Mol Biol. 2007, 396: 71-91.
- Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M: The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004, 32: D277-280. 10.1093/nar/gkh063.
- Arita M: The metabolic world of Escherichia coli is not small. Proc Natl Acad Sci USA. 2004, 101: 1543-1547. 10.1073/pnas.0306458101.
- Verkhedkar KD, Raman K, Chandra NR, Vishveshwara S: Metabolome based reaction graphs of M. tuberculosis and M. leprae: a comparative network analysis. PLoS ONE. 2007, 2: e881. 10.1371/journal.pone.0000881.
- Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S: Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 2002, 419: 498-511. 10.1038/nature01097.
- Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B: The genome of the African trypanosome Trypanosoma brucei. Science. 2005, 309: 416-422. 10.1126/science.1112642.
- Gaskell EA, Smith JE, Pinney JW, Westhead DR, McConkey GA: A unique dual activity amino acid hydroxylase in Toxoplasma gondii. PLoS One. 2009, 4: e4801. 10.1371/journal.pone.0004801.
- Cross GA, Klein RA, Linstead DJ: Utilization of amino acids by Trypanosoma brucei in culture: L-threonine as a precursor for acetate. Parasitology. 1975, 71: 311-326. 10.1017/S0031182000046758.
- Klein RA, Linstead DJ: Threonine as a perferred source of 2-carbon units for lipid synthesis in Trypanosoma brucei. Biochem Soc Trans. 1976, 4: 48-50.
- Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418: 387-391. 10.1038/nature00935.
- Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H: Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999, 285: 901-906. 10.1126/science.285.5429.901.
- Literature Curation. [ftp://genome-ftp.stanford.edu/pub/yeast/data_download/literature_curation/]
- Kyoto Encyclopedia of Genes and Genomes. [http://www.genome.jp/kegg]
- BioLayout Express 3D. [http://www.biolayout.org/]
- Milenkovic T, Lai J, Przulj N: GraphCrunch: a tool for large network analyses. BMC Bioinformatics. 2008, 9: 70. 10.1186/1471-2105-9-70.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.