The shikimic acid pathway is well recognized in classical biochemistry as essential for the synthesis of aromatic compounds in prokaryotes, fungi, certain apicomplexans and plants. The lack of the shikimic acid pathway in metazoans, most notably humans, as evinced by our dietary requirement for shikimate-derived aromatic compounds, has stimulated much study of this pathway as a possible target for antimicrobial chemotherapy. The emergence of microbial pathogens resistant to many drugs in our current pharmacopeia has prompted widespread efforts to identify suitable novel targets for the design of antimicrobial drugs that lack untoward side effects; because the pathway is lacking in humans, it forms a rational basis for drug selectivity in lead target identification [13, 14]. Accordingly, the structure and evolution of this pathway in eukaryotes have been comprehensively investigated, although conservation of the pathway in prokaryotes has not been subjected to widespread comparative genomic analysis. A pan-genomic bioinformatics evaluation of the conservation of enzymes forming the shikimic acid pathway in prokaryotes was therefore undertaken using automated HMM searching, leading to the unexpected result that nearly one-third of all prokaryotes examined lack a complete, recognizable pathway [Table 2]. Our results were comparable to data in the comprehensively curated BioCyc database (Additional Files 7, 8, 9). Data had to be extracted from this database manually, however, which proved labour-intensive and time-consuming but was nevertheless useful for method validation.
Expression of the shikimic acid pathway is regulated via feedback inhibition by pathway intermediates and downstream products. Although a variety of functional, biophysical, and fitness-related variables influence the evolutionary rates of proteins, the level of gene expression is one of the major determinants. If a protein is highly expressed, its overall indispensability to the organism is greater than if it were expressed only at low levels, so that the functionally active amino acid residues of the protein would be under strong purifying selection. Such selection on a large number of these protein residues leads to an overall reduced evolutionary rate and overall conservation of the metabolic pathway, since mutations in essential proteins are apt to be deleterious. Regulation of the shikimic acid pathway can therefore be coupled to the exogenous availability of the products of its component enzymes: where these products are supplied externally, pathway expression would be held low and purifying selection relaxed, giving a positive selective force leading to the loss of pathway genes. This inference follows from the surprising result that large numbers of host-associated bacteria lack a complete shikimic acid pathway.
Many of these bacteria are associated with the human microbiome, but no enzymes of the shikimic acid pathway could be detected using the HMM profiles on the translated human genome. This supports current dogma that the human host does not synthesize shikimate-derived aromatic compounds de novo, and leads to the strong inference that human-associated heterotrophic bacteria having genomes that encode an incomplete shikimic acid pathway may have evolved highly efficient means of extracting essential shikimate-related metabolites from their microbial environment. In symbiosis this could be from trophically derived metabolites assimilated by the host or from metabolites produced by other bacterial consorts having a complete and functional shikimic acid pathway. Uptake mechanisms for intermediates in the shikimic acid pathway and for some of the products of chorismate-utilizing enzymes are known in bacteria. For example, the shikimate permease ShiA, various aromatic amino acid permeases, and transporters for vitamins and folic acid have been described, but the full phylogenetic distribution of these uptake systems and their relevance in complementing shikimic acid pathway-depleted prokaryotes are yet to be determined. Whether other shikimate-derived metabolites (for example, ubiquinones, menaquinones, iron-chelating siderophores and vitamins) are sequestered remains unknown. Substituting these essential metabolites into synthetic growth media might be one approach to successfully culturing those symbionts so far refractory to laboratory culture ex hospite.
On detailed examination of the 91 host-associated bacteria lacking a complete pathway [Additional File 1], most (67/91) had lost five or more of the genes encoding enzymes of the shikimic acid pathway, whereas nearly all of the rest (20/91) had lost only a single enzyme. In the entire set of host-associated bacteria, genes encoding the seven different enzymes are lost rather uniformly [Figure 4], with the first enzyme of the pathway accounting for only 67/440 of lost genes (about 15%, close to the one-seventh share expected if losses were spread evenly across the seven enzymes). However, in the 20 host-associated bacteria that have lost only one gene, the majority (16/20) have lost the gene encoding the first enzyme of the pathway. This pattern would be expected if selection were occurring under conditions in which the pathway was induced, because a later block might result in the accumulation of redundant intermediates of the pathway, which would likely be deleterious for the bacterium. A possible scenario is that functional loss of the shikimic acid pathway could be an early step toward sustaining a host-associated lifestyle in which bacteria are prevented from outgrowing their hosts in times of nutritional stress.
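The contrast between the two proportions quoted above can be made explicit with a little arithmetic. The following is only an illustrative sketch, not the analysis performed in this study: it assumes a null model in which each of the seven enzymes is equally likely to be the one lost (probability 1/7) and treats genomes as independent observations, which ignores their shared phylogeny. The helper name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts quoted in the text.
total_lost_genes = 440          # lost genes across all host-associated bacteria
first_enzyme_lost_overall = 67  # of which encode the first enzyme
single_loss_genomes = 20        # genomes missing exactly one gene
first_enzyme_lost_single = 16   # of which lack the first enzyme

# Assumption (not from the study): under uniform loss, each of the
# seven enzymes accounts for one-seventh of losses.
uniform_share = 1 / 7

share_overall = first_enzyme_lost_overall / total_lost_genes
share_single = first_enzyme_lost_single / single_loss_genomes

# Probability of seeing at least 16 of 20 single losses fall on the
# first enzyme if every enzyme were equally likely to be lost.
p_single = binomial_tail(first_enzyme_lost_single, single_loss_genomes, uniform_share)

print(f"uniform expectation:     {uniform_share:.3f}")
print(f"first enzyme, all losses: {share_overall:.3f}")
print(f"first enzyme, single loss: {share_single:.3f}")
print(f"P(>=16 of 20 | uniform):  {p_single:.2e}")
```

Under these assumptions the overall share sits close to one-seventh, whereas the single-loss share is far above it, which is the asymmetry the argument relies on.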
The phylogenetically widespread and differential lack of orthologous genes encoding shikimic acid pathway enzymes in free-living Archaea [Figures 6, 7] seems unlikely to be circumvented by evolving specific uptake mechanisms for essential aromatic compounds, since metabolites derived from the shikimic acid pathway are known to be limiting in natural environments. Indeed, the presence of these compounds secreted by bacteria can act as predatory chemoattractants for soil amoebae. Given the variability in which particular enzymes are missing from such a wide sample of the Archaea and host-associated Bacteria, cultivable or not, there is no easy genetic explanation for this loss, since the genes encoding individual enzymes of the shikimate pathway are not clustered in these prokaryotes [Additional File 10]. In bacteria there is evidence for the intriguing possibility that, at pathway equilibrium, lost intermediates from "missing" enzymic reactions could be supplied by reverse biosynthesis, as was demonstrated in E. coli for quinic and dehydroquinic acids derived from shikimic acid uptake.
Reductive evolution is the process whereby host-associated consorts decrease their genome size by abandoning genes that are needed by free-living microorganisms but that are dispensable when living in association if essential gene products are readily available from the host or from other symbiotic partners. A domino effect would follow: the more enzymes that are lost, the less likely bacteria are to survive without the provision of shikimate pathway intermediates or end-products, driving survival toward obligate symbiotic associations and loss of the metabolic independence needed for culture ex hospite. This scenario applies especially to the accelerated evolution of endosymbiotic lineages, as expected from the combined effects of the accumulation of irreversible mutations (Muller's ratchet) and mutational bias.
The most obvious explanation for "missing" enzymes is the existence of functionally equivalent proteins that lack homology to the HMM models used in this study. Examples of non-orthologous gene replacements encoding enzymes that catalyze the same reaction are indeed known for the shikimic acid pathway and were tested in this study. These include the first step, catalyzed by DAHP synthase, and the fifth step, catalyzed by shikimate kinase. However, HMM profiling using the non-orthologous proteins as models gave no evidence that such non-orthologous enzymes replaced those missing in the prokaryotes studied, which strongly suggests that other enzymes that have yet to be identified may fill these gaps [Additional Files 4, 5, 6]. This would suggest that, at least in the Archaea, these prokaryotes can synthesize aromatic compounds by a novel biochemical pathway that is yet to be discovered. Indeed, examining the nutritional requirements of the Archaea, as evinced by a survey of the growth media recommended by the DSMZ culture collection (http://www.dsmz.de/), reveals that most of the favoured media are minimal, lacking exogenous aromatic amino acids.