Global genome analysis of the shikimic acid pathway reveals greater gene loss in host-associated than in free-living bacteria
© Zucko et al; licensee BioMed Central Ltd. 2010
Received: 9 February 2010
Accepted: 11 November 2010
Published: 11 November 2010
A central tenet in biochemistry for over 50 years has held that microorganisms, plants and, more recently, certain apicomplexan parasites synthesize essential aromatic compounds via elaboration of a complete shikimic acid pathway, whereas metazoans lacking this pathway require a dietary source of these compounds. The large number of sequenced bacterial and archaean genomes now available for comparative genomic analyses allows the fundamentals of this contention to be tested in prokaryotes. Using Hidden Markov Model profiles (HMM profiles) to identify all known enzymes of the pathway, we report the presence of genes encoding shikimate pathway enzymes in the hypothetical proteomes constructed from the genomes of 488 sequenced prokaryotes.
Amongst free-living prokaryotes most Bacteria possess, as expected, genes encoding a complete shikimic acid pathway, whereas of the culturable Archaea, only one was found to have a complete complement of recognisable enzymes in its predicted proteome. It may be that in the Archaea, the primary amino-acid sequences of enzymes of the pathway are highly divergent and so are not detected by HMM profiles. Alternatively, structurally unrelated (non-orthologous) proteins might be performing the same biochemical functions as the products of recognized genes of the shikimate pathway. Most surprisingly, 30% of host-associated (mutualistic, commensal and pathogenic) bacteria likewise do not possess a complete shikimic acid pathway. Many of these microbes show some degree of genome reduction, suggesting that these host-associated bacteria might sequester essential aromatic compounds from a parasitised host, as a 'shared metabolic adaptation' in mutualistic symbiosis, or obtain them from other consorts having the complete biosynthetic pathway. The HMM results gave 84% agreement when compared against data in the highly curated BioCyc reference database of genomes and metabolic pathways.
These results challenge the conventional belief that the shikimic acid pathway is universal and essential in prokaryotes. The possibilities that non-orthologous enzymes catalyse reactions in this pathway (especially in the Archaea), or that there exist specific uptake mechanisms for the acquisition of shikimate intermediates or essential pathway products, warrant further examination to better understand the precise metabolic attributes of host-beneficial and pathogenic bacteria.
In a comparative genomic study of four thermophilic microorganisms (Aquifex aeolicus, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum and Methanococcus jannaschii), genes encoding DAHP synthase and DHQ synthase (the first two steps in the pathway) appear missing from these archaeans. The apparent absence of genes from an essential biosynthetic pathway might be accounted for by the presence of genes having low sequence similarity to the known genes, or by substitution with alternative enzymes having an analogous function. Another possibility is that steps of a biosynthetic pathway may be bypassed if substrates and end-products are readily obtained from the surrounding environment. Evolutionary pressures would then lead to loss of genes encoding discrete elements of the pathway, which is an underlying cause of extreme genome reduction and instability as observed in the small genomes of some intracellular pathogens and symbionts, e.g. Mycobacterium leprae, Rickettsia, Bartonella and Buchnera [6–8]. The recent surge in microbial genome sequencing has produced a wealth of genetic data for comparative genomic analyses, making it possible to identify diverse essential enzymes in critical metabolic pathways within the Archaea and Bacteria. Here we interrogate the hypothetical proteomes of prokaryotes, constructed from their published genomes, to profile the universality of the shikimic acid pathway with a view to understanding a key metabolic process of free-living and host-associated bacteria.
Templates used to interrogate shikimic acid pathway structure in the sequenced genomes of prokaryotes.
Shikimate Pathway Step
Enzymes and Enzyme Isoforms
Source and Genetic/Protein Templates
3-Deoxy-D-arabino-heptulosonate-7-phosphate (DAHP)
DAHP synthase EC 2.5.1.54 (aroF), (aroG), (aroH)
In E. coli there are three DAHP synthase isoforms, each specifically inhibited by one of the three aromatic amino acids.
KDPGal aldolase EC 4.1.2.21
DHQ synthase EC 4.2.3.4 (aroB)
DHQ synthase exists as type 1 and 2 enzymes (previously EC 4.6.1.3)
Shikimate dehydrase EC 4.2.1.10 (aroD)
shikimate dehydrase and shikimate dehydrogenase are often a bifunctional enzyme
Shikimic acid (shikimate)
Shikimate dehydrogenase EC 1.1.1.25 (aroE)
E. coli has the putative enzyme YdiB paralog 
Shikimate kinase II EC 2.7.1.71 (aroL)
monofunctional shikimate kinase
Archaeal GHMP shikimate kinase
EPSP synthase EC 2.5.1.19 (aroA)
The AroA gene, coding for the E. coli EPSP synthase, was first isolated from a lambda transducing phage (lambda-serC) found to contain a portion of the E. coli chromosome
Shikimate kinase I
EC 184.108.40.206 (aroM)
Pentafunctional gene consisting of aroB, aroD, aroE, aroL and aroA
Chorismic acid (chorismate)
Chorismate synthase EC 4.2.3.5 (aroC)
previously EC 4.6.1.4. Chorismate synthase from various sources shows a high degree of sequence conservation.
The presence or absence of a complete shikimic acid pathway deduced from enzymes detected by HMM analysis in the predicted proteomes of 488 prokaryotes.
The presence or absence of a complete shikimic acid pathway deduced from enzymes detected by HMM analysis in the predicted proteomes of 488 Bacteria.
Results were also checked against the BioCyc curated reference database of bacterial genomes and metabolic pathways. Overall, the results were in agreement in 84% of cases (Additional Files 7, 8, 9). Of the 85 non-host associated bacteria represented in the database, 75 were predicted to contain a complete complement of shikimic acid pathway enzymes when compared to the HMM models. An additional 4 complete pathways were recorded in BioCyc that were missed by the rapid, automated HMM search. The HMM search did, however, predict 4 complete pathways that were missing in the BioCyc database. Similarly, of the 204 host-associated bacteria that were evaluated using the BioCyc database, 168 were found to contain a complete shikimic acid pathway using both BioCyc and HMM searches, with 23 additional complete pathways detected using BioCyc and 12 additional complete pathways detected by the HMM search (Additional File 8). For the Archaea, 33 proteomes could be evaluated, of which 27 were common to both searches, with 5 additional complete pathways found in the BioCyc database and 1 using the HMM search (Additional File 9). In searching the human genome for genes encoding enzymes of the shikimic acid pathway, only one previously annotated gene was found when interrogated using HMM profiles. This alignment gave a score of 149 with an e-value of 10^-44, but corresponded only to the ATP-binding domain of shikimate kinase encoded on chromosome 10 (accession number NT_008705.15).
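The comparison above can be tallied with a few lines of Python. This is only a sketch under a stated assumption: a proteome counts as discordant when a complete pathway was reported by one method but not the other, so the "additional" pathways quoted in the text are treated as the disagreements. The dictionary and function names are illustrative and are not taken from the authors' scripts.

```python
# Counts quoted in the text, per group:
# (proteomes evaluated, complete only in BioCyc, complete only by HMM search)
COUNTS = {
    "non-host associated bacteria": (85, 4, 4),
    "host-associated bacteria": (204, 23, 12),
    "archaea": (33, 5, 1),
}


def agreement(counts):
    """Return (total proteomes, discordant calls, fraction in agreement).

    Assumption: every 'additional' complete pathway found by only one
    method is a disagreement; everything else is counted as agreement.
    """
    total = sum(n for n, _, _ in counts.values())
    discordant = sum(biocyc_only + hmm_only
                     for _, biocyc_only, hmm_only in counts.values())
    return total, discordant, (total - discordant) / total


total, discordant, fraction = agreement(COUNTS)
print(f"{total} proteomes, {discordant} discordant, agreement = {fraction:.1%}")
```

Under this reading the three groups sum to the 322 proteomes compared against BioCyc, and the computed agreement is close to the 84% reported.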
The shikimic acid pathway is well recognized in classical biochemistry to be essential for the synthesis of aromatic compounds in prokaryotes, fungi, certain apicomplexans and plants. The lack of the shikimic acid pathway in metazoans, most notably humans as evinced by our dietary requirement for shikimate-derived aromatic compounds, has stimulated much study of this pathway as a possible target for antimicrobial chemotherapy. The emergence of microbial pathogens resistant to many drugs in our current pharmacopeia has prompted widespread efforts to identify suitable novel targets for the design of antimicrobial drugs that lack untoward side effects; because the pathway is lacking in humans, it forms a rational basis for drug selectivity in lead target identification [13, 14]. Accordingly, the structure and evolution of this pathway in eukaryotes has been comprehensively investigated, although conservation of the pathway in prokaryotes has not been subjected to widespread comparative genomic analysis. A pan-genomic bioinformatics evaluation of the conservation of enzymes forming the shikimic acid pathway in prokaryotes was therefore undertaken using automated HMM searching, leading to the unexpected result that nearly one-third of all prokaryotes examined lack a complete, recognizable pathway [Table 2]. Our results were comparable to data in the comprehensively curated BioCyc database (Additional Files 7, 8, 9). Data had to be extracted from this database manually, which proved labour intensive and time consuming, but was nevertheless useful for method validation.
Expression of the shikimic acid pathway is regulated via feedback inhibition by pathway intermediates and downstream products. Although a variety of functional, biophysical, and fitness-related variables influence the evolutionary rates of proteins, the level of gene expression is one of the major determinants. If a protein is highly expressed, its overall indispensability to the organism is greater than if it were expressed only at low levels, so that the functionally active amino acid residues of the protein would be under strong purifying selection. Such selection on a large number of these protein residues leads to an overall reduced evolutionary rate and overall conservation of the metabolic pathway, since mutations in essential proteins are apt to be deleterious. Regulation of the shikimic acid pathway can, therefore, be coupled to the exogenous availability of products of its component enzymes, giving a positive selective force leading to the loss of pathway genes. This follows from the surprising result that large numbers of host-associated bacteria lack a complete shikimic acid pathway.
Many of these bacteria are associated with the human microbiome, but no enzymes of the shikimic acid pathway could be detected using the HMM profiles on the translated human genome. This supports current dogma that the human host does not synthesize shikimate-derived aromatic compounds de novo, and leads to the strong inference that human-associated heterotrophic bacteria having genomes that encode an incomplete shikimic acid pathway may have evolved highly efficient means of extracting essential shikimate-related metabolites from their microbial environment. In symbiosis this could be from trophically derived metabolites assimilated by the host or from metabolites produced by other bacterial consorts having a complete and functional shikimic acid pathway. Uptake mechanisms for intermediates in the shikimic acid pathway and for some of the products of chorismate-utilizing enzymes are known in bacteria. For example, the shikimate permease ShiA, various aromatic amino acid permeases, and transporters for vitamins and folic acid are known, but the full phylogenetic distribution of these uptake systems and their relevance in complementing shikimic acid pathway-depleted prokaryotes are yet to be determined. Whether other shikimate-derived metabolites, for example ubiquinones, menaquinones, iron-chelating siderophores and vitamins, are sequestered remains unknown. Substituting these essential metabolites into synthetic growth media might be one approach to successfully culturing those symbionts so far refractory to laboratory culture ex hospite.
On examination of the 91 host-associated bacteria lacking a complete pathway in detail [Additional File 1], most (67/91) had lost five or more of the genes encoding enzymes of the shikimic acid pathway, whereas nearly all of the rest (20/91) have only lost a single enzyme. In the entire set of host-associated bacteria, genes encoding the seven different enzymes are lost rather uniformly [Figure 4], with the first enzyme of the pathway accounting for only 67/440 of lost genes. However, in the 20 host-associated bacteria that have only lost one gene, the majority (16/20) have lost the gene encoding the first enzyme of the pathway. This pattern would be expected if selection were occurring under conditions in which the pathway was induced, because a later block might result in the accumulation of redundant intermediates of the pathway, which would likely be deleterious for the bacterium. A possible scenario is that functional loss of the shikimic acid pathway could be an early step toward sustaining a host-associated life style in which bacteria are prevented from outgrowing their hosts in times of nutritional stress.
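The proportions quoted in this paragraph can be checked with a short calculation. The sketch below uses only the counts given in the text; the comparison against an even split across the seven enzymes (one seventh of all lost genes each) is an illustrative baseline added here, not a statistical test reported by the authors, and all variable names are hypothetical.

```python
# Counts quoted in the text for host-associated bacteria lacking a complete pathway
N_INCOMPLETE = 91            # bacteria lacking a complete pathway
LOST_FIVE_OR_MORE = 67       # of those, lost five or more pathway genes
LOST_ONE = 20                # of those, lost exactly one gene
LOST_ONE_FIRST_ENZYME = 16   # single-gene losers missing the first enzyme
TOTAL_LOST_GENES = 440       # lost genes summed over the whole set
LOST_FIRST_ENZYME = 67       # lost genes that encode the first enzyme
N_ENZYMES = 7                # enzymes in the pathway

summary = {
    "lost >= 5 genes": LOST_FIVE_OR_MORE / N_INCOMPLETE,
    "lost exactly 1 gene": LOST_ONE / N_INCOMPLETE,
    "first enzyme, all losses": LOST_FIRST_ENZYME / TOTAL_LOST_GENES,
    "first enzyme, single losses": LOST_ONE_FIRST_ENZYME / LOST_ONE,
    "even split across enzymes": 1 / N_ENZYMES,  # illustrative baseline
}

for label, value in summary.items():
    print(f"{label:30s} {value:.3f}")
```

Across all losses the first enzyme accounts for roughly 15% of lost genes, near the one-in-seven share expected from an even split, whereas among single-gene losses it accounts for 80%.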
The phylogenetically widespread and differential lack of orthologous genes encoding shikimic acid pathway enzymes in free-living Archaea [Figures 6, 7] seems unlikely to be circumvented by evolving specific uptake mechanisms for essential aromatic compounds since metabolites derived from the shikimic acid pathway are known to be limiting in natural environments. Indeed, the presence of these compounds secreted by bacteria can act as predatory chemoattractants for soil amoebae. Given the variability in which particular enzymes are missing from such a wide sample of the Archaea and host-associated Bacteria, cultivable or not, there is no easy genetic explanation for this loss, since the genes encoding individual enzymes of the shikimate pathway are not clustered in these prokaryotes [Additional File 10]. In bacteria there is evidence for the intriguing possibility that in pathway equilibrium, lost intermediates from "missing" enzymic reactions could be supplied by reverse biosynthesis, as was demonstrated in E. coli for quinic and dehydroquinic acids derived from shikimic acid uptake.
Reductive evolution is the process whereby host-associated consorts decrease their genome size by abandoning genes that are needed by free-living microorganisms but that are dispensable when living in association if essential gene products are readily available from the host or from other symbiotic partners. A domino effect would follow: the more enzymes that are lost, the less likely bacteria are to survive without the provision of shikimate pathway intermediates or end-products, driving survival toward obligate symbiotic associations and loss of the metabolic independence needed for culture ex hospite. This scenario applies especially to the accelerated evolution of endosymbiotic lineages, as expected from the combined effects of the accumulation of irreversible mutations (Muller's ratchet) and mutational bias.
The most obvious explanation for "missing" enzymes is the existence of functionally equivalent proteins that lack homology to the HMM models used in this study. Examples of non-orthologous gene replacements encoding enzymes that catalyze the same reaction indeed are known for the shikimic acid pathway and were tested in this study. These include the first step, catalyzed by DAHP synthase, and the fifth step, catalyzed by shikimate kinase. However, in our study there was no evidence based on HMM profiling using the non-orthologous proteins as models to indicate that such non-orthologous enzymes replaced those missing in the prokaryotes studied, which strongly suggests that other enzymes that have yet to be identified may fill these gaps [Additional File 4, 5, 6]. This would suggest that, at least in the Archaea, these prokaryotes can synthesize aromatic compounds by a novel biochemical pathway that is yet to be discovered. Indeed, examining the nutritional requirements of the Archaea, as evinced by a survey of the growth media recommended by the DSMZ culture collection http://www.dsmz.de/, reveals that most of the favoured media are minimal, lacking exogenous aromatic amino acids.
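The logic of scoring a proteome as complete or incomplete, while allowing for the non-orthologous replacements discussed above, can be sketched as follows. This is a minimal illustration, not the authors' Bioperl script: the step table, the enzyme labels (taken loosely from Table 1) and the function name are assumptions, and pairing KDPGal aldolase with step 1 and the archaeal GHMP kinase with step 5 follows the replacements named in the text.

```python
# Hypothetical step table: each step is satisfied by any one of the listed
# enzyme families (labels are illustrative, loosely following Table 1).
PATHWAY_STEPS = {
    1: {"DAHP synthase", "KDPGal aldolase"},           # non-orthologous alternative
    2: {"DHQ synthase"},
    3: {"shikimate dehydrase"},                        # aroD, label as in Table 1
    4: {"shikimate dehydrogenase"},
    5: {"shikimate kinase", "GHMP shikimate kinase"},  # archaeal alternative
    6: {"EPSP synthase"},
    7: {"chorismate synthase"},
}


def missing_steps(detected):
    """Return the pathway steps for which no listed enzyme family was detected."""
    detected = set(detected)
    return [step for step, alternatives in PATHWAY_STEPS.items()
            if not alternatives & detected]


# Made-up hit list for one proteome: the canonical kinase is absent but the
# GHMP-type kinase is present, and no step-1 enzyme was detected.
hits = {"DHQ synthase", "shikimate dehydrase", "shikimate dehydrogenase",
        "GHMP shikimate kinase", "EPSP synthase", "chorismate synthase"}
gaps = missing_steps(hits)
print("complete" if not gaps else f"incomplete, missing steps: {gaps}")
```

Treating each step as a set of acceptable families keeps the non-orthologous alternatives explicit, so a proteome is only scored as missing a step when neither the canonical nor the replacement profile yields a hit.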
This comparative bioinformatics analysis of genes of the shikimic acid pathway in prokaryotes provides essential details that should help guide the choice of key organisms for future studies designed to reveal new metabolic processes in shikimic acid biosynthesis or to validate the loss of key biosynthetic genes. Such studies will undoubtedly provide new insight into the evolutionary history of the shikimic acid pathway that is particularly important in understanding how pathogenic bacteria synthesize or acquire shikimate-derived products and thus to identify new targets for antibiotic treatment.
This study analysed the completed genome sequences of organisms on the NCBI Microbial Genome Projects page http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi listed on June 3, 2009. The genomes were grouped using the 'all bacteria' and 'all archaea' selection tool and then filtered using the 'organism info' page to select for symbiotic prokaryotes using the key words 'disease', 'pathogenic in' and 'habitat - host associated'. Free-living prokaryotes were taken as all other entries with the key words 'habitat - multiple, aquatic, specialized or terrestrial'. The predicted proteomes from these genomes for each of the prokaryotes were downloaded from the FTP server at the NCBI ftp://ftp.ncbi.nih.gov/genomes/Bacteria/. The DNA sequences of all 23 haploid chromosomes and mitochondrial DNA of the human genome http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genomeprj&cmd=ShowDetailView&TermToSearch=9558 were translated into all six reading frames using Transeq http://www.ebi.ac.uk/emboss/transeq. For profile analyses, HMMER version 2.3.2 http://hmmer.janelia.org and release 20 of the Pfam database http://www.sanger.ac.uk/Software/Pfam were used. HMM profiles for each of the enzymes of the shikimic acid pathway, KDPGal aldolase, GHMP kinase [see also Table 1], the hypothetical proteomes of the 486 prokaryotes analyzed, and a script written in Bioperl to manipulate output from HMMER analyses are available as flat files at the server http://bioserv2.pbf.hr/bmc/. Results were compared against existing metabolic pathway annotations available for 322 prokaryote proteomes in the BioCyc database http://biocyc.org/.
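For readers unfamiliar with six-frame translation, the step performed here with Transeq can be illustrated in plain Python. The sketch assumes the standard genetic code and an input of unambiguous A, C, G, T bases; frame labels and the handling of incomplete or ambiguous codons (shown as "X") are choices made for this example and may differ from Transeq's own conventions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame(dna):
    """Return the three forward and three reverse-complement translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


for name, protein in sorted(six_frame("ATGGCCTAA").items()):
    print(name, protein)
```

Each translated frame can then be searched with protein profiles in the same way as a predicted proteome, which is how a genome without gene predictions can still be screened.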
The authors would like to thank Gus Ronngren for his technical assistance. Financial support for this work has come from The School of Pharmacy, University of London (to BA, TL, JW and PFL) and the European Commission's Lifelong Learning (Erasmus) Programme (to FC, LH and PFL); a cooperation grant of the German Academic Exchange Service (DAAD) and the Ministry of Science, Education and Sports, Republic of Croatia (to DH and JC), by a stipendium of the DAAD (to JZ). Additional support for this work has come from the Australian Institute of Marine Science (to WCD) and the University of Maine, USA (to JMS).
- Knaggs AR: The biosynthesis of shikimate metabolites. Nat Prod Rep. 2003, 20: 119-136. 10.1039/b100399m.
- Starcevic A, Akthar S, Dunlap WC, Shick JM, Hranueli D, Cullum J, Long PF: Enzymes of the shikimic acid pathway encoded in the genome of a basal metazoan, Nematostella vectensis, have microbial origins. Proc Natl Acad Sci USA. 2008, 105: 2533-7. 10.1073/pnas.0707388105.
- Nikoh N, Nakabachi A: Aphids acquired symbiotic genes via lateral gene transfer. BMC Biol. 2009, 7: 12. 10.1186/1741-7007-7-12.
- Woodard RW: Unique biosynthesis of dehydroquinic acid. Bioorg Chem. 2004, 32: 309-315. 10.1016/j.bioorg.2004.06.003.
- Cordwell SJ: Microbial genomes and "missing" enzymes: redefining biochemical pathways. Arch Microbiol. 1999, 172: 269-279. 10.1007/s002030050780.
- Andersson SG, Alsmark C, Canbäck B, Davids W, Frank C, Karlberg O, Klasson L, Antoine-Legault B, Mira A, Tamas I: Comparative genomics of microbial pathogens and symbionts. Bioinformatics. 2002, 18 (Suppl 2): S17.
- Dagan T, Blekhman R, Graur D: The "domino theory" of gene death: gradual and mass gene extinction events in three lineages of obligate symbiotic bacterial pathogens. Mol Biol Evol. 2006, 23: 310-316. 10.1093/molbev/msj036.
- Moran NA, McLaughlin HJ, Sorek R: The dynamics and time scale of ongoing genomic erosion in symbiotic bacteria. Science. 2009, 323: 379-382. 10.1126/science.1167140.
- Daugherty M, Vonstein V, Overbeek R, Osterman A: Archaeal shikimate kinase, a new member of the GHMP-kinase family. J Bacteriol. 2001, 183: 292-300. 10.1128/JB.183.1.292-300.2001.
- Ran N, Draths KM, Frost JW: Creation of a shikimate pathway variant. J Am Chem Soc. 2004, 126: 6856-7. 10.1021/ja049730n.
- Caspi R, Foerster H, Fulcher CA, Kaipa P, Krummenacker M, Latendresse M, Paley S, Rhee SY, Shearer AG, Tissier C, Walk TC, Zhang P, Karp PD: The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 2008, 36 (Database issue): D623-31.
- Ducati RG, Basso LA, Santos DS: Mycobacterial shikimate pathway enzymes as targets for drug design. Curr Drug Targets. 2007, 8: 423-35. 10.2174/138945007780059004.
- Dias MV, Ely F, Palma MS, de Azevedo WF, Basso LA, Santos DS: Chorismate synthase: an attractive target for drug development against orphan diseases. Curr Drug Targets. 2007, 8: 437-444. 10.2174/138945007780058924.
- McConkey GA: Targeting the shikimate pathway in the malaria parasite Plasmodium falciparum. Antimicrob Agents Chemother. 1999, 43: 175-177.
- Richards TA, Dacks JB, Campbell SA, Blanchard JL, Foster PG, McLeod R, Roberts CW: Evolutionary origins of the eukaryotic shikimate pathway: gene fusions, horizontal gene transfer, and endosymbiotic replacements. Eukaryot Cell. 2006, 5: 1517-1531. 10.1128/EC.00106-06.
- Krämer M, Bongaerts J, Bovenberg R, Kremer S, Müller U, Orf S, Wubbolts M, Raeven L: Metabolic engineering for microbial production of shikimic acid. Metab Eng. 2003, 5: 277-283. 10.1016/j.ymben.2003.09.001.
- Rocha EPC, Danchin A: An analysis of determinants of protein substitution rates in Bacteria. Mol Biol Evol. 2004, 21: 108-116. 10.1093/molbev/msh004.
- Drummond DA, Raval A, Wilke CO: A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol. 2006, 23: 327-337. 10.1093/molbev/msj038.
- Wilson A, Carlson SS, White TJ: Biochemical evolution. Annu Rev Biochem. 1977, 46: 573-639. 10.1146/annurev.bi.46.070177.003041.
- Ohta T: Slightly deleterious mutant substitutions in evolution. Nature. 1973, 246: 96-98. 10.1038/246096a0.
- Whipp MJ, Camakaris H, Pittard AJ: Cloning and analysis of the shiA gene, which encodes the shikimate transport system of Escherichia coli K-12. Gene. 1998, 209: 185-192. 10.1016/S0378-1119(98)00043-2.
- Reizer J, Finley K, Kakuda D, MacLeod CL, Reizer A, Saier MH: Mammalian integral membrane receptors are homologous to facilitators and antiporters of yeast, fungi, and eubacteria. Protein Sci. 1993, 2: 20-30. 10.1002/pro.5560020103.
- Rodionov DA, Hebbeln P, Eudes A, ter Beek J, Rodionova IA, Erkens GB, Slotboom DJ, Gelfand MS, Osterman AL, Hanson AD, Eitinger T: A novel class of modular transporters for vitamins in prokaryotes. J Bacteriol. 2009, 191: 42-51. 10.1128/JB.01208-08.
- Maeda Y, Mayanagi T, Amagai A: Folic acid is a potent chemoattractant of free-living amoebae in a new and amazing species of protist, Vahlkampfia sp. Zoolog Sci. 2009, 26: 179-186. 10.2108/zsj.26.179.
- Knop DR, Draths KM, Chandran SS, Barker JL, von Daeniken R, Weber W, Frost JW: Hydroaromatic equilibrium during biosynthesis of shikimic acid. J Am Chem Soc. 2001, 123: 10173-10182. 10.1021/ja0109444.
- Moran NA: Accelerated evolution and Muller's ratchet in endosymbiotic bacteria. Proc Natl Acad Sci USA. 1996, 93: 2873-2878. 10.1073/pnas.93.7.2873.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.