The eukaryotic MEP-pathway genes are evolutionarily conserved and originated from Chlamydia and cyanobacteria
BMC Genomics volume 22, Article number: 137 (2021)
Isoprenoids are the most ancient and essential class of metabolites produced in all organisms, via the mevalonate (MVA)- and/or methylerythritol phosphate (MEP)-pathways. The MEP-pathway is present in all plastid-bearing organisms and most eubacteria. However, no comprehensive study has revealed the origin and evolutionary characteristics of MEP-pathway genes in eukaryotes.
Here, detailed bioinformatics analyses of the MEP-pathway provide an in-depth understanding of the evolutionary history of this indispensable biochemical route, and offer a basis for the co-existence of the cytosolic MVA- and plastidial MEP-pathways in plants, given the established exchange of end products between the two isoprenoid-biosynthesis pathways. Phylogenetic analyses establish the contributions of both cyanobacterial and Chlamydiae sequences to the plant MEP-pathway genes. Moreover, phylogenetic and inter-species syntenic block analyses demonstrate that six of the seven MEP-pathway genes have predominantly remained single-copy in land plants in spite of multiple whole-genome duplication events (WGDs). Substitution rate and domain studies display the evolutionary conservation of these genes, reinforced by their high expression levels. Distinct phenotypic variation among plants with reduced expression levels of individual MEP-pathway genes confirms the indispensable function of each nuclear-encoded plastid-targeted MEP-pathway enzyme in plant growth and development.
Collectively, these findings reveal the polyphyletic origin and strict conservation of MEP-pathway genes, and reinforce the potential function of the individual enzymes beyond production of isoprenoid intermediates.
With over 55,000 molecules, isoprenoids are the most ancient group of structurally and functionally diverse metabolites essential for all kingdoms of life. Isoprenoid-derived compounds in free-living organisms include hormones, lipids, pigments, vitamins, electron transport chain components and defense compounds, and as such are of industrial interest for drugs, agrochemicals, rubber and fragrances. Despite their diversity, isoprenoids are derived from two universal five-carbon precursors, isopentenyl diphosphate (IPP) and its isomer dimethylallyl diphosphate (DMAPP). These precursors are synthesized either by the mevalonate (MVA)-pathway and/or by the alternative methylerythritol phosphate (MEP)-pathway. Almost all eukaryotes, archaea and some gram-positive bacteria employ the MVA-pathway, whereas most gram-negative bacteria, cyanobacteria and green algae exclusively use the MEP-pathway (Fig. 1a). Plastid-bearing eukaryotes, however, are unique in that they have retained both pathways, compartmentalized in the cytosol (MVA) and plastids (MEP). It is suggested that retention of the two pathways in two different subcellular compartments of the plastid-bearing eukaryotic cell serves to regulate isoprenoid biosynthesis according to the availability of carbon and energy currencies, and acts as a strategy to balance resource allocation between growth and adaptive responses to unfavorable environmental inputs. Given the established metabolic exchanges between the MVA- and MEP-pathways in higher plants [8,9,10], the biological grounds for the indispensable function of each of these pathways in plants have remained an enigma.
One of the most profound outcomes of evolution is the emergence of plastids through a single endosymbiotic event accompanied by a complex mix of loss, movement and replacement in the ancestor of eukaryotes. The endosymbiotic event that led to the origin of plastids was followed by the transfer of genetic material from the endosymbiont to the nuclear genome of the host, then by the evolution of protein import machinery for transferring nuclear-encoded plastid-targeted proteins, and by extension the inevitable establishment of plastid-to-nucleus (retrograde) signaling cascades [12, 13]. The retrograde signaling cascade is instrumental for coordination of vital activities between the two subcellular genomes in plastid-bearing eukaryotes.
One essential plastidial biochemical route is the MEP-pathway, responsible for converting glyceraldehyde 3-phosphate and pyruvate into isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), the central intermediates in the biosynthesis of isoprenoids (Fig. 1b). The MEP-pathway comprises seven nuclear genes encoding plastid-localized enzymes. Intriguingly, one of the MEP-pathway intermediates, MEcPP (2-C-methyl-D-erythritol-2,4-cyclopyrophosphate), is found to be a bi-functional entity serving as a precursor of isoprenoids and as a stress-specific retrograde signaling metabolite coordinating expression of selected stress-response nuclear genes [14, 15].
Given the antiquity and essential function of isoprenoids, the evolutionary history of the MVA-pathway in eukaryotes has been extensively examined [2, 6], whereas the characterization of MEP-pathway genes is thus far restricted to a limited number of species and as such is incomplete. Understanding the evolutionary history of the MEP-pathway across a wide range of species offers novel insight into its contribution to the evolution of primary plastids.
Here, extensive and integrated phylogenetic analyses identify alpha-proteobacteria, cyanobacteria and Chlamydiae as the bacterial lineages that have contributed to the evolution of MEP-pathway genes in plastid-bearing eukaryotes. Syntenic analyses establish the predominant presence of MEP-pathway genes as single-copy, resulting from the loss of duplicated copies after whole genome duplication (WGD) events in land plants. Inter-species syntenic block and substitution rate analyses reveal the evolutionary conservation of the MEP-pathway genes. Moreover, genetic analyses establish essential but differential functions of the MEP-pathway enzymes in plant growth and development.
In summary, these findings uncover the evolutionary history and characteristics of plastidial isoprenoid biosynthesis-pathway genes, and reinforce the uniqueness of the MEP-pathway for unmasking the origins and evolution of plastids.
MEP-pathway genes in plastid-bearing eukaryotes are derived from different bacterial lineages
To gain insight into the evolution of the MEP-pathway genes, we constructed phylogenetic trees for individual genes using protein sequences of a wide range of species from eukaryotes, cyanobacteria, PVC (Planctomycetes, Verrucomicrobia and Chlamydiae) group bacteria, and other non-cyanobacteria and non-PVC group bacteria (hereafter referred to as ‘other-eubacteria’). These analyses reveal the multiple origins of MEP-pathway genes in plastid-bearing eukaryotes (Figs. 2 a-f, 3 and S1-S7).
The phylogenetic tree analyses show that DXS and MDS in plastid-bearing eukaryotes and other-eubacteria are sister groups. It is of note that while DXS in plastid-bearing eukaryotes is clearly derived from alpha-proteobacteria (Figs. 2a and S1), the specific inheritance source for MDS remains unclear (Figs. 2b and S2). The maximum likelihood tree of MDS moderately supports Aquifex aeolicus and Leptospira interrogans as the closest relatives of plastid-bearing eukaryotes (Fig. S2A), whereas the Bayesian tree clusters Deinococcus-Thermus bacteria (Thermus thermophilus, Deinococcus radiodurans) with plastid-bearing eukaryotes (Fig. S2B).
The phylogenetic trees of DXR and HDR group eukaryotic sequences with cyanobacteria (Figs. 2 c-d, and S3-S4), whereas eukaryotic CMS and CMK cluster with Chlamydiae (Figs. 2 e-f and S5-S6).
Interestingly, the phylogenetic tree of HDS separates the plastid-bearing eukaryotes into two clades; one clade clusters with Chlamydiae and the other is closest to the cyanobacterial homologue (Figs. 3a and S7). Moreover, protein structure analyses show that, depending on the organism, HDS enzymes have two different types of gcpE domains. Eubacterial HDS contains the type I gcpE domain comprised of N- and C-terminal parts, whereas the type II domain present in plants contains an additional domain between the N- and C-terminal parts of the protein, thought to enable the enzyme to function as a monomer (Fig. 3b). Domain analyses identify red algal HDS as a eubacteria-like type I enzyme rather than the type II enzyme expected in eukaryotes, while Chlamydiae HDS possesses the type II domain instead of the expected eubacterial type I domain (Fig. 3b).
Collectively the results display contributions of different bacterial lineages to the origins of MEP-pathway genes in plastid-bearing eukaryotes.
Duplicated DXS copies are not functionally redundant
Among the seven MEP-pathway enzymes, DXS catalyzes the first step in isoprenoid biosynthesis  (Fig. 1b). Phylogenetic analyses of DXS show the presence of one gene copy in examined algae and eubacteria, and its expansion into three subfamilies (I to III) in land plants (Figs. 2a and S1).
In the subfamily I, gene duplications in each common ancestor of Brassicaceae (Cruciferae, e.g. cabbage and turnip) and Fabaceae (legume, e.g. soybean) resulted in the presence of two genes (DXS1 and DXS2). In the subfamily II, there is only one DXS copy, designated DXS3 in Arabidopsis thaliana, Brassicaceae and the ancestor of Fabaceae. Moreover, the subfamily II is absent in gymnosperms, but duplicated copies are present in several moss and lycophyte species. Strikingly, the subfamily III branch is lost in Brassicaceae family, whereas Fabaceae and grape display species-specific duplication(s), and gymnosperms maintain two copies of subfamily III in their ancestor.
Interestingly, despite the presence of three DXS subfamilies in land plants, only one is reported to function as a housekeeping MEP-pathway gene, such as DXS1 in A. thaliana, which encodes the functional MEP-pathway enzyme. This is supported by plastidial localization of DXS1 in the Fabaceae species Medicago and soybean, in line with the function of the enzyme catalyzing the first step of the MEP-pathway [20, 21]. DXS2, which has no DXS activity, is assumed to synthesize specific isoprenoids related to mycorrhiza in Medicago. DXS3, the most divergent member of the family, has the expected DXS enzyme activity but is expressed at very low levels, prompting the idea of its involvement in the synthesis of phytohormones in maize. It is of note that in Escherichia coli DXS is responsible for production of vitamin B6, but this synthesis in plants utilizes intermediates of the glycolytic and pentose phosphate pathways rather than the product of DXS. This information eliminates a possible function of the additional plant DXSs in vitamin B6 production.
In summary, despite the preservation of duplicated DXS copies in land plants, only one copy has retained the ancestral function of catalyzing the first step of the MEP-pathway, alluding to a possible loss or neo-functionalization of additional copies.
MEP-pathway genes are predominantly single copies
WGD events were most prevalent during the diversification of angiosperms and are also found in the common ancestors of seed plants. However, in spite of gene duplication events there are ~3124 nuclear-encoded single-copy genes, comprising ~11% of the Arabidopsis genome, that are shared by other angiosperms. Excluding DXS, and with the exception of a few species that experienced a recent WGD, such as soybean and Brassica oleracea (Figs. S2-S7), the remaining six MEP-pathway genes are among the single-copy genes in all algae and most land plants. Even in the exceptional cases of soybean and Brassica oleracea, some MEP-pathway genes, such as CMS, remained single-copy (Fig. S5).
Despite multiple WGD events, the predominant presence of MEP-pathway genes as single-copy in most land plants leads to the question of when the duplicated copies were lost. To address this, we constructed intra-species conserved syntenic blocks of MEP-pathway genes for A. thaliana and Oryza sativa, the model eudicot and monocot species, respectively. Both species are currently diploid even though their most recent common ancestors experienced two WGDs. Except for DXS in A. thaliana, the MEP-pathway genes in both species are single-copy, each positioned in a syntenic block surrounded by pairs of paralogs retained after WGD(s) (Fig. 4 and Data S1). These data support the notion that duplicated MEP-pathway genes were lost after WGD.
To gain insight into the fate of the ‘lost’ copy of MEP-pathway genes, we searched for remnants of duplication events, but found no evidence such as the presence of a pseudogene for any of the MEP-pathway genes in A. thaliana genome.
MEP-pathway genes are evolutionarily conserved
To investigate the evolutionary characteristics of MEP-pathway genes, we examined the evolutionary rate, and domain architectures of the encoded proteins.
The evolutionary divergence of DNA can be estimated by the ratio of substitution rates at non-synonymous (dN; amino acid altering) and synonymous (dS; silent) sites, a measure of the dynamics of molecular evolution. That is, a significantly low dN/dS ratio marks slow evolution and, as such, the conserved nature of the protein. To investigate the evolutionary rates of the MEP-pathway genes, we examined their respective dN/dS ratios in selected species from representative lineages of plastid-bearing eukaryotes. The markedly low median dN/dS values, ranging from 0.04 to 0.14, suggest strong purifying selection on all seven MEP-pathway genes, thereby supporting their evolutionary conservation (Fig. 5a and Table S1).
Moreover, analyses of the protein domain structure of MEP-pathway enzymes establish that, with the exception of HDS, an enzyme with two different types of gcpE domains (Fig. 3b), the protein structures of the remaining MEP-pathway enzymes are universally conserved.
MEP-pathway genes are highly expressed
There are two theories relating gene conservation to the evolutionary rate of proteins: i) an inverse relationship between expression level and evolutionary rate; and ii) slower evolution of functionally critical genes as opposed to less critical ones. To test the potential contribution of these two scenarios to the high conservation of the MEP-pathway genes, we obtained and ranked expression levels of all MEP-pathway genes by analyzing publicly available genome-wide transcriptomic datasets of representative land plant species: eudicots (A. thaliana and soybean), monocots (O. sativa and Zea mays), a gymnosperm (Picea abies), a moss (P. patens) and a lycophyte (S. moellendorffii). The data illustrate high expression levels for most MEP-pathway genes, with the exception of the three duplicated copies of DXS and two duplicates of HDS in P. patens, and one duplicated copy of CMK in soybean (Fig. 5b and Table S2). Notably, in most species, the expression-ranking data place the first two genes (DXS and DXR) and the last three genes (MDS, HDS and HDR) amongst the top 5–10% most abundant transcripts.
To compensate for the absence of transcriptomic datasets for several lineages, we used a widely applied quantitative method, the Codon Adaptation Index (CAI), to predict the expression level of a gene from its codon sequence. The rationale of CAI rests on codon degeneracy and on the observation that highly expressed genes are biased towards the codons decoded by the most abundant tRNA species. We therefore calculated CAIs of all MEP-pathway genes from representative species, with and without transcriptomic datasets, in all lineages of life. In most analyzed species, the MEP-pathway genes have a CAI value higher than 0.7 (Table S3). The median CAI values for the MEP-pathway genes (0.76–0.80) denote their high expression levels in all lineages analyzed (Fig. 5c).
In summary, the high expression levels of the MEP-pathway genes support their evolutionary conservation.
MEP-pathway genes are indispensable for plant growth
Except for green algae, plants possess both the cytosolic MVA- and the plastidial MEP-pathways, despite the established exchange of end products between the two isoprenoid-producing routes. Given the indispensable function of MEP-pathway genes in eubacteria [32, 33], we employed genetic approaches to test the essentiality of the MEP-pathway genes in plants.
The unavailability of T-DNA insertion lines for the MEP-pathway genes led us to employ previously generated RNAi lines that were maintained as segregating populations for individual MEP-pathway genes in A. thaliana. Homozygous RNAi lines, each with 92–95% reduced expression levels of the corresponding MEP-pathway gene, displayed seedling size and leaf variegation phenotypes distinct from each other and from those of wild type plants transformed with an empty vector (EV). These visibly altered phenotypes include the dwarfed stature of the asDXR, asMDS and asHDS lines, in concert with the pale-yellow leaf phenotype of the asMDS seedlings and an albino phenotype of true leaves in all the other six RNAi lines (Fig. 6).
In summary, the phenotypes of the RNAi lines confirm the indispensable function of MEP-pathway enzymes in plant growth and development, and the markedly different size and phenotypic characteristics of each line suggest the involvement of these enzymes in distinct functions in addition to their role in the isoprenoid biosynthesis pathway.
The MEP-pathway is comprised of seven nuclear-encoded plastid-localized enzymes, essential for plant growth and key to stress-specific retrograde signaling, as evidenced by the function of the MEP-pathway intermediate methylerythritol cyclodiphosphate (MEcPP) as a retrograde signaling metabolite. The retrograde signaling function of MEcPP offers an exciting justification for the necessity of the MEP-pathway, not only for the production of isoprenoids but also for a potential retrograde signaling function of each of its intermediates, essential for coordinated action of the two organelles. This possibility could also explain the coexistence of the MVA- and MEP-pathways, the two isoprenoid-producing pathways in plants.
MEP-pathway genes are resistant to duplication
In land plants, all the MEP-pathway genes, with the exception of DXS, are present as single-copy in all the analyzed diploid plants in spite of ancient WGD events. In fact, although DXS experienced duplications, only one copy maintained the MEP-pathway-based enzyme activity [19,20,21]. The critical nature of gene duplication as a source of evolutionary innovation and adaptation raises the question of why the MEP-pathway genes have remained single-copy. One explanation might be that under relaxed selective pressure the duplicated copy is inclined to accumulate deleterious mutations, which in turn could result in dominant negative inhibition of the other, functional copy. This is in stark contrast with the existence of multiple copies of the cytosolic MVA-pathway genes, such as the functionally redundant AACT1 and AACT2 (acetoacetyl-CoA thiolase), both of which encode the initial enzyme of the MVA-pathway, HMG1 and HMG2 encoding HMGR (3-hydroxy-3-methylglutaryl CoA reductase), and MVD1 and MVD2 encoding MVD (mevalonate diphosphate decarboxylase). Plants lacking AACT1 or HMG2 are viable with no apparent phenotypes, in contrast to the indispensability of MEP-pathway genes.
The polyphyletic origin of MEP-pathway genes
Among the seven MEP-pathway genes, DXS and MDS originated from ‘other-eubacteria’. The closest sister clade of plastid-bearing eukaryotic DXS is alpha-proteobacteria, also the known ancestor of the mitochondrion. This suggests that DXS of plastid-bearing eukaryotes might have originated directly from alpha-proteobacteria via horizontal gene transfer (HGT), or indirectly via endosymbiotic gene transfer (EGT) from the mitochondrial genome.
We were unable to place the origin of eukaryotic MDS, but through expanded phylogenetic analyses we determined notable homology between Chlamydiae and eukaryotic sequences for three (CMS, CMK, and HDS) of the seven MEP-pathway genes. Specifically, the phylogenetic trees of CMS and CMK show that the eukaryotic lineages form a sister cluster with the corresponding Chlamydiae gene, suggestive of HGTs between Chlamydiae and the common ancestor of eukaryotes. In addition, phylogenetic analyses of HDS depict clustering of red algae with cyanobacteria, as opposed to other plastid-bearing eukaryotes that form a sister group with Chlamydiae. One potential explanation for this bifurcated clustering is that the ancestral plastid-bearing eukaryotes acquired HDS from Chlamydiae, but in the red algal ancestor the chlamydial HDS was lost as the result of two major phases of genome reduction, and was later replaced through a second HGT event from cyanobacteria.
The necessity of a Chlamydiae-like HDS enzyme in plastid-bearing organisms could potentially be justified as a response to changing environmental conditions over time. Based on the oldest eukaryotic algal fossil findings in conjunction with molecular clock data, plastids are predicted to have originated in the Mesoproterozoic era, ~0.9–1.7 billion years ago. During the Proterozoic eon, oxygen began to rise and built up to above 10% of the levels that existed in the atmosphere in the Mesoproterozoic era [42, 43]. Simultaneously, the earth entered a warm period, ending glaciations and raising the tropical mean sea surface temperatures from ~19.4 to 28.7 °C.
It is well established that HDS, a [4Fe-4S]-protein reactive to oxygen species, is hypersensitive to high radiation and supra-optimal temperatures. Under these unfavorable conditions inhibition of HDS results in accumulation of its substrate, MEcPP, that in turn protects MEP-pathway activity by restricting oxidative stress [45,46,47]. Plastid-bearing eukaryotes are frequently and simultaneously exposed to reactive oxygen species, high light irradiance and hot temperatures, and one could consider HDS enzyme as the gatekeeper maintaining the MEP-pathway’s functionality.
Accordingly, we propose that the evolutionary pressures resulting from high oxygen and higher temperatures in the era of plastid establishment may have led to the acquisition of a monomeric Chlamydiae-like HDS in plastid-bearing organisms. The presence of a middle domain in the monomeric enzyme would have provided a higher ratio of protein to labile [Fe-S] cluster, and thereby a functionally more efficient enzyme than the dimeric form in a high-oxygen and high-temperature environment. As such, plastid-bearing eukaryotes acquired the more efficient monomeric HDS donated by Chlamydiae.
Our overall finding poses the question of how multiple donors could have contributed to the MEP-pathway. The simplified schematic (Fig. 7) depicts the three potential scenarios addressing the question. Scenario I proposes inherited chimerism by EGT, and that the occurrence of prokaryotic HGT to the cyanobacteria genome happened prior to endosymbiosis event leading to plastid formation . If so, cyanobacteria must have acquired chlamydial MEP-pathway genes through HGT before EGT in plants (Fig. 7). In such a case, one would expect the presence of chlamydial type CMS, CMK and HDS sequences in Gloeomargarita lithophora genome, the prime candidate for extant relative of the cyanobacterial plastid progenitor . However, clustering of the three genes in G. lithophora with cyanobacteria and not with Chlamydiae (Figs. S3-S4, and S6), diminishes the probability of scenario I.
Scenario II suggests that CMS, CMK and HDS in eukaryotes are the result of HGT from Chlamydiae after the endosymbiosis (Fig. 7). But, the inability of Chlamydiae to infect current photosynthetic eukaryotes or plastid-containing organisms renders this scenario less plausible.
Scenario III supports co-contribution of cyanobacteria and Chlamydiae to the origin of the primary plastid (Fig. 7), once proposed as the ‘tripartite’ (ménage-à-trois, ‘household of three’) symbiotic relationship between the extant order Chlamydiales, a cyanobacterium, and a eukaryotic host for the establishment of the eukaryotic lineages [50,51,52,53]. The tripartite endosymbiosis, supported by phylogenomic analyses of a considerable number of eukaryotic nuclear genes related to chlamydial homologues, proposes that the chlamydial partner injected effector proteins into the ancestral eukaryotes as a strategy to manipulate host cell carbohydrate metabolism to the parasite’s advantage [50, 51, 54]. However, counter-arguments question the correctness of the evolutionary models used in phylogenomic analyses and point to the high frequency of HGTs among prokaryotes and from prokaryotes to eukaryotes [55,56,57,58]. Our analyses, based on the best-fitting evolutionary models for constructing phylogenetic trees of individual MEP-pathway genes, support a chlamydial origin for three of the seven MEP-pathway genes in plastid-bearing eukaryotes, even though the evolutionary pressure(s) that led plastid-bearing eukaryotes to harbor a chimeric MEP-pathway remain an enigma.
Our data clearly show the contribution of both cyanobacteria and Chlamydiae to the MEP-pathway of plastid-bearing eukaryotes, and by extension to the origin of the primary plastid.
The MEP-pathway genes are highly conserved and are essential for the survival of plastid-bearing eukaryotes. The MEP-pathway genes of plastid-bearing eukaryotes originated from both cyanobacteria and Chlamydiae, indicating their co-contributions to the evolution of primary plastids. The nuclear-encoded plastid-destined MEP-pathway enzymes enable the host eukaryotes to control plastids in a stable endosymbiosis system, while in return MEcPP, the plastid-produced intermediate of the MEP-pathway, coordinates expression of selected nuclear stress-response genes and the corresponding physiological ramifications. These bilateral controls mediated by the MEP-pathway may also shed light on the basis of the co-existence of the cytosolic and plastidial isoprenoid biosynthesis pathways in eukaryotes.
In summary, these findings uncover the evolutionary history and characteristics of the plastidial isoprenoid biosynthesis-pathway genes and their implications for the origin and evolution of the primary plastid.
Identification of the homologues of MEP-pathway genes
Plant genome sequences were downloaded from Phytozome v12 (https://phytozome.jgi.doe.gov/pz/portal.html), the Amborella Genome Database (http://amborella.huck.psu.edu/data), the Spruce Genome Project (http://congenie.org/start) and the JGI Genome portal (https://genome.jgi.doe.gov/). Algal genomes were downloaded from Phytozome and Greenhouse (https://greenhouse.lanl.gov/greenhouse/organisms/). Annotated genome sequences of Chara braunii and Klebsormidium nitens were also downloaded. Genome sequences of selected eubacteria were downloaded from Ensembl (release 90) (ftp://ftp.ensembl.org/pub/).
The names and IDs of MEP-pathway genes in A. thaliana are DXS (1-deoxy-D-xylulose-5-phosphate synthase): AT4G15560; DXR (1-deoxy-D-xylulose 5-phosphate reductoisomerase): AT5G62790; CMS (4-diphosphocytidyl-2C-methyl-D-erythritol synthase): AT2G02500; CMK (4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinase): AT2G26930; MDS (2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase): AT1G63970; HDS (4-hydroxy-3-methylbut-2-enyl diphosphate synthase): AT5G60600; and HDR (4-hydroxy-3-methylbut-2-en-1-yl diphosphate reductase): AT4G34350. The protein domain information for each MEP-pathway gene in A. thaliana was obtained from Phytozome v12: 1-deoxy-D-xylulose-5-phosphate synthase (PF13292) for DXS; 1-deoxy-D-xylulose-5-phosphate reductoisomerase (PF02670) and 1-deoxy-D-xylulose-5-phosphate reductoisomerase C-terminal (PF08436) for DXR; MobA-like NTP transferase (PF12804) for CMS; GHMP kinases N terminal (PF00288) and GHMP kinases C terminal (PF08544) for CMK; YgbB (PF02542) for MDS; GcpE (PF04551) for HDS; and LytB protein (PF02401) for HDR. The Hidden Markov Model (HMM) profile representing each domain of the MEP-pathway enzymes was downloaded from Pfam (https://pfam.xfam.org/). Then, hmmsearch and fastacmd were used to obtain protein sequences from the selected whole-genome sequenced species. Protein sequences of MEP-pathway genes in PVC bacteria were retrieved from PVCbase (http://pvcbacteria.org/pvcbase/) using BLASTP. All the identified proteins were examined on the Pfam website to confirm the presence of the corresponding protein domain.
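As an illustration of the domain-based retrieval step, the following minimal Python sketch filters hits from an hmmsearch tabular report. It assumes the search was run with the `--tblout` option, which the text does not state; the E-value cutoff of 1e-5, the function name `parse_tblout` and the sequence identifiers `seqA` and `seqB` are hypothetical and are not taken from the study.

```python
# Hypothetical two-hit excerpt of an hmmsearch --tblout report for the GcpE
# profile (PF04551). Columns: target, accession, query, accession, E-value, ...
EXAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias
seqA - GcpE PF04551 1.2e-120 400.1 0.0 1.5e-120 399.8 0.0 1.0 1 0 0 1 1 1 1 -
seqB - GcpE PF04551 0.5 10.2 0.1 0.7 9.8 0.1 1.0 1 0 0 1 1 1 0 -
"""


def parse_tblout(text, max_evalue=1e-5):
    """Return (target, query, full-sequence E-value) for hits passing the cutoff.

    max_evalue is an assumed threshold, not one reported in the article.
    """
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines of the report.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query, evalue = fields[0], fields[2], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, query, evalue))
    return hits


if __name__ == "__main__":
    for target, query, evalue in parse_tblout(EXAMPLE_TBLOUT):
        print(target, query, evalue)
```

The retained identifiers would then be passed to a sequence-extraction tool (fastacmd in the original workflow) and re-checked against Pfam, as described above.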
Multiple sequence alignment and phylogenetic tree construction
The MEP-pathway protein sequences were aligned using MUSCLE v3.8.31 with default parameters. ProtTest was used to select the best-fitting evolutionary model for each aligned protein matrix of the MEP-pathway genes. The evolutionary model WAG + G was specified for DXS, LG + I + G for DXR, HDS and HDR, and VT + I + G for CMS, CMK and MDS. Phylogenetic trees were constructed with RAxML v7.1.0 and MrBayes v3.2.7. Unlike the other MEP-pathway genes, CMK belongs to the GHMP gene family, which has 13 copies in A. thaliana. All protein sequences of this family in each species were first retrieved, and a preliminary ML tree was constructed from the aligned sequences. Members of CMK and MVK (mevalonate kinase in the MVA-pathway) were then selected for constructing the final phylogenetic tree. The MVK branch was set as the outgroup.
The Locus search function in PGDD (http://chibba.agtec.uga.edu/duplication/), a public database for cluster identification of plant genes based on intra- or cross-genome syntenic relationships, was used to identify the intra-species duplication blocks in the 500 kb region around each MEP-pathway gene in A. thaliana and O. sativa.
Nucleotide sequences for representative species in each lineage, namely A. thaliana (a eudicot), O. sativa (a monocot), A. trichopoda (an early-diverging angiosperm), P. abies (a gymnosperm), P. patens (a moss), S. moellendorffii (a lycophyte), Volvox carteri (a green alga) and Cyanidioschyzon merolae (a red alga), were retrieved to calculate the nonsynonymous to synonymous rate ratio (ω = dN/dS) between A. thaliana and all other species. The ω was calculated by yn00 in PAML v4.5 using the Yang & Nielsen method, where 0 < ω < 1 indicates purifying selection, ω = 1 corresponds to neutral evolution, and ω > 1 indicates positive selection. The distributions of all dN/dS values for each MEP-pathway gene were drawn with the boxplot function in R.
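The interpretation of ω given above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text, applied to dN and dS values that would come from yn00; the function names, the numerical tolerance for ω = 1 and the example numbers are assumptions, not values from the study.

```python
from statistics import median


def classify_omega(dn, ds, tol=1e-9):
    """Label one pairwise comparison by omega = dN/dS.

    tol is an assumed tolerance for treating omega as exactly 1.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if abs(omega - 1.0) <= tol:
        return "neutral"
    # omega below 1 is read as purifying selection, above 1 as positive.
    return "purifying" if omega < 1.0 else "positive"


def median_omega(pairs):
    """Median omega over (dN, dS) pairs, e.g. one gene across several species."""
    return median(dn / ds for dn, ds in pairs)


if __name__ == "__main__":
    # Hypothetical (dN, dS) pairs for one gene compared against A. thaliana.
    example = [(0.02, 0.5), (0.05, 0.5), (0.07, 0.5)]
    print(median_omega(example), [classify_omega(dn, ds) for dn, ds in example])
```

A median computed this way per gene corresponds to the center line of each box in the boxplots described above.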
Expression levels, codon adaptation index (CAI)
The sources of the RPKM values of all expressed genes in wild type plants of the representative species are as follows: A. thaliana, Z. mays (2 replicates of mock-treated wild type), G. max (SoyBase Soybean Genome Annotation Page: https://soybase.org/soyseq/tables_lists/index.php), O. sativa (Rice Genome Annotation Project: http://rice.plantbiology.msu.edu/expression.shtml, using four libraries of four-leaf stage seedlings), P. abies (Spruce Genome Project: ftp://plantgenie.org/Data/ConGenIE/Picea_abies/v1.0/Expression/), P. patens and S. moellendorffii. Genes with RPKM ≥ 1 were retained for further analyses. The relative expression ranking of each gene was calculated using the formula: 1 - (order of the gene) / (total number of all expressed genes). The relative expression rankings of each gene across all representative species were presented as scatter-boxplots in R.
Nucleotide sequences for each MEP-pathway gene in selected species were retrieved from the corresponding datasets. The codon usage table for each selected species was obtained from the Codon Usage Database (http://www.kazusa.or.jp/codon/). Lastly, each nucleotide sequence in FASTA format and the codon usage table of the corresponding species were used as input to calculate the CAI on the CAIcal server (http://ppuigbo.me/programs/CAIcal/).
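The ranking formula above can be illustrated with a short Python sketch. It assumes that the order of a gene is its position when expressed genes are sorted from highest to lowest RPKM (order 1 being the most abundant), which is consistent with highly expressed genes receiving rankings close to 1; the function name, tie handling and gene labels are hypothetical.

```python
from typing import Dict


def relative_expression_ranking(rpkm: Dict[str, float],
                                min_rpkm: float = 1.0) -> Dict[str, float]:
    """Compute 1 - order / total for every gene with RPKM >= min_rpkm.

    Assumption: order 1 is the gene with the highest RPKM. Ties keep the
    input order because Python's sort is stable.
    """
    expressed = {gene: value for gene, value in rpkm.items() if value >= min_rpkm}
    ordered = sorted(expressed, key=expressed.get, reverse=True)
    total = len(ordered)
    return {gene: 1 - order / total
            for order, gene in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Hypothetical RPKM values; geneE falls below the RPKM >= 1 cutoff.
    example = {"geneA": 100.0, "geneB": 10.0, "geneC": 2.0,
               "geneD": 1.0, "geneE": 0.5}
    print(relative_expression_ranking(example))
```

Under this convention a gene among the top 5–10% most abundant transcripts has a ranking of roughly 0.90–0.95 or higher.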
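For readers unfamiliar with the index, the following Python sketch shows the textbook definition of CAI: the geometric mean of the relative adaptiveness of each codon, where relative adaptiveness is the usage of a codon divided by the usage of the most frequent synonymous codon in the reference table. It is not the CAIcal implementation; the standard genetic code, the exclusion of Met, Trp and stop codons, the skipping of codons absent from the reference table, and the toy usage counts are all assumptions made for illustration.

```python
from itertools import product
from math import exp, log

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_adaptiveness(codon_usage):
    """w = usage of a codon / usage of the most used synonymous codon."""
    best = {}
    for codon, count in codon_usage.items():
        aa = CODON_TO_AA[codon]
        best[aa] = max(best.get(aa, 0.0), count)
    return {codon: count / best[CODON_TO_AA[codon]]
            for codon, count in codon_usage.items()
            if best[CODON_TO_AA[codon]] > 0}


def cai(cds, codon_usage):
    """Geometric mean of w over the codons of a coding sequence."""
    w = relative_adaptiveness(codon_usage)
    seq = cds.upper().replace("U", "T")
    logs = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TO_AA.get(codon)
        # Skip unknown codons and the uninformative Met, Trp and stop codons.
        if aa is None or aa in "MW*":
            continue
        weight = w.get(codon, 0.0)
        if weight <= 0:
            continue  # assumption: ignore codons absent from the reference
        logs.append(log(weight))
    if not logs:
        raise ValueError("no informative codons in sequence")
    return exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Toy reference counts, not a real codon usage table.
    usage = {"GCT": 40, "GCC": 20, "GCA": 20, "GCG": 20,
             "AAA": 30, "AAG": 10, "ATG": 10}
    print(cai("ATGGCTAAA", usage), cai("ATGGCCAAG", usage))
```

A sequence built only from the most frequent synonymous codons scores 1.0, and values fall towards 0 as rarer codons are used, which is why CAI values above 0.7 are read as a sign of high expression.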
Plant material and growth conditions
We employed previously generated RNAi lines for all MEP-pathway genes in A. thaliana. Sterilized seeds sown on 1/2 MS medium were maintained for 48 h at 4 °C. Two-week-old seedlings were grown at 22 °C under a 16/8 h light/dark period.
Availability of data and materials
The datasets of phylogenetic matrices analyzed in the study are available in the figshare:
WGD: Whole genome duplication
PVC: Planctomycetes, Verrucomicrobia and Chlamydiae
HGT: Horizontal gene transfer
Brocks JJ, Logan GA, Buick R, Summons RE. Archean molecular fossils and the early rise of eukaryotes. Science. 1999;285(5430):1033–6.
Hoshino Y, Gaucher EA. On the origin of isoprenoid biosynthesis. Mol Biol Evol. 2018;35(9):2185–97.
Sacchettini JC, Poulter CD. Creating isoprenoid diversity. Science. 1997;277(5333):1788–9.
Wright LD. Biosynthesis of isoprenoid compounds. Annu Rev Biochem. 1961;30(1):525–48.
Rohmer M, Knani M, Simonin P, Sutter B, Sahm H. Isoprenoid biosynthesis in bacteria: a novel pathway for the early steps leading to isopentenyl diphosphate. Biochem J. 1993;295(Pt 2):517.
Lombard J, Moreira D. Origins and early evolution of the mevalonate pathway of isoprenoid biosynthesis in the three domains of life. Mol Biol Evol. 2010;28(1):87–99.
Ruiz-Sola MÁ, Barja MV, Manzano D, et al. A single Arabidopsis gene encodes two differentially targeted geranylgeranyl diphosphate synthase isoforms. Plant Physiol. 2016;172(3):1393–402.
Hemmerlin A, Hoeffler J-F, Meyer O, et al. Cross-talk between the cytosolic mevalonate and the plastidial methylerythritol phosphate pathways in tobacco bright yellow-2 cells. J Biol Chem. 2003;278(29):26666–76.
Bick JA, Lange BM. Metabolic cross talk between cytosolic and plastidial pathways of isoprenoid biosynthesis: unidirectional transport of intermediates across the chloroplast envelope membrane. Arch Biochem Biophys. 2003;415(2):146–54.
Laule O, Fürholz A, Chang H-S, et al. Crosstalk between cytosolic and plastidial pathways of isoprenoid biosynthesis in Arabidopsis thaliana. Proc Natl Acad Sci. 2003;100(11):6866–71.
Keeling PJ. The endosymbiotic origin, diversification and fate of plastids. Philos Trans R Soc B Biol Sci. 2010;365(1541):729–48.
Woodson JD, Chory J. Coordination of gene expression between organellar and nuclear genomes. Nat Rev Genet. 2008;9(5):383–95.
Archibald JM. Genomic perspectives on the birth and spread of plastids. Proc Natl Acad Sci. 2015;112(33):10147–53.
Walley J, Xiao Y, Wang JZ, et al. Plastid-produced interorgannellar stress signal MEcPP potentiates induction of the unfolded protein response in endoplasmic reticulum. Proc Natl Acad Sci U S A. 2015;112(19):6212–7.
Xiao YM, Savchenko T, Baidoo EEK, et al. Retrograde signaling by the Plastidial metabolite MEcPP regulates expression of nuclear stress-response genes. Cell. 2012;149(7):1525–35.
Lange BM, Rujan T, Martin W, Croteau R. Isoprenoid biosynthesis: the evolution of two ancient and distinct pathways across genomes. Proc Natl Acad Sci. 2000;97(24):13172–7.
Liu YL, Guerra F, Wang K, et al. Structure, function and inhibition of the two- and three-domain 4Fe-4S IspG proteins (vol 109, pg 8558, 2012). Proc Natl Acad Sci U S A. 2012;109(26):10605.
Kuzuyama T, Takagi M, Takahashi S, Seto H. Cloning and characterization of 1-deoxy-D-xylulose 5-phosphate synthase from Streptomyces sp. strain CL190, which uses both the mevalonate and nonmevalonate pathways for isopentenyl diphosphate biosynthesis. J Bacteriol. 2000;182(4):891–7.
Carretero-Paulet L, Cairo A, Talavera D, et al. Functional and evolutionary analysis of DXL1, a non-essential gene encoding a 1-deoxy-D-xylulose 5-phosphate synthase like protein in Arabidopsis thaliana. Gene. 2013;524(1):40–53.
Zhang M, Li K, Zhang C, Gai J, Yu D. Identification and characterization of class 1 DXS gene encoding 1-deoxy-D-xylulose-5-phosphate synthase, the first committed enzyme of the MEP pathway from soybean. Mol Biol Rep. 2009;36(5):879–87.
Walter MH, Hans J, Strack D. Two distantly related genes encoding 1-deoxy-d-xylulose 5-phosphate synthases: differential regulation in shoots and apocarotenoid-accumulating mycorrhizal roots. Plant J. 2002;31(3):243–54.
Floß DS, Hause B, Lange PR, Küster H, Strack D, Walter MH. Knock-down of the MEP pathway isogene 1-deoxy-d-xylulose 5-phosphate synthase 2 inhibits formation of arbuscular mycorrhiza-induced apocarotenoids, and abolishes normal expression of mycorrhiza-specific plant marker genes. Plant J. 2008;56(1):86–100.
Cordoba E, Porta H, Arroyo A, et al. Functional characterization of the three genes encoding 1-deoxy-D-xylulose 5-phosphate synthase in maize. J Exp Bot. 2011;62(6):2023–38.
Tambasco-Studart M, Titiz O, Raschle T, Forster G, Amrhein N, Fitzpatrick TB. Vitamin B6 biosynthesis in higher plants. Proc Natl Acad Sci. 2005;102(38):13687–92.
Ren R, Wang H, Guo C, et al. Widespread whole genome duplications contribute to genome complexity and species diversity in angiosperms. Mol Plant. 2018;11(3):414–28.
De Smet R, Adams KL, Vandepoele K, Van Montagu MC, Maere S, Van de Peer Y. Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants. Proc Natl Acad Sci. 2013;110(8):2898–903.
Yang Z, Nielsen R. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J Mol Evol. 1998;46(4):409–18.
Jin J, Xie XY, Chen C, et al. Eukaryotic Protein Domains as Functional Units of Cellular Evolution. Sci Signal. 2009;2(98):76.
Zhang J, Yang J-R. Determinants of the rate of protein sequence evolution. Nat Rev Genet. 2015;16(7):409.
Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 2002;12(6):962–8.
Jansen R, Bussemaker HJ, Gerstein M. Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models. Nucleic Acids Res. 2003;31(8):2242–51.
Rubin BE, Wetmore KM, Price MN, et al. The essential gene set of a photosynthetic organism. Proc Natl Acad Sci. 2015;112(48):E6634–43.
Goodall EC, Robinson A, Johnston IG, et al. The essential genome of Escherichia coli K-12. MBio. 2018;9(1):e02096–17.
Rensing SA. Gene duplication as a driver of plant morphogenetic evolution. Curr Opin Plant Biol. 2014;17:43–8.
Zhang J. Evolution by gene duplication: an update. Trends Ecol Evol. 2003;18(6):292–8.
Jin H, Song Z, Nikolau BJ. Reverse genetic characterization of two paralogous acetoacetyl CoA thiolase genes in Arabidopsis reveals their importance in plant growth and development. Plant J. 2012;70(6):1015–32.
Suzuki M, Kamide Y, Nagata N, et al. Loss of function of 3-hydroxy-3-methylglutaryl coenzyme a reductase 1 (HMG1) in Arabidopsis leads to dwarfing, early senescence and male sterility, and reduced sterol levels. Plant J. 2004;37(5):750–61.
Pulido P, Perello C, Rodriguez-Concepcion M. New insights into plant isoprenoid metabolism. Mol Plant. 2012;5(5):964–7.
Martijn J, Vosseberg J, Guy L, Offre P, Ettema TJ. Deep mitochondrial origin outside the sampled alphaproteobacteria. Nature. 2018;557(7703):101–5.
Qiu H, Price DC, Yang EC, Yoon HS, Bhattacharya D. Evidence of ancient genome reduction in red algae (Rhodophyta). J Phycol. 2015;51(4):624–36.
McFadden GI. Origin and evolution of plastids and photosynthesis in eukaryotes. Cold Spring Harb Perspect Biol. 2014;6(4):a016105.
Zhang K, Zhu X, Wood R, Shi Y, Gao Z, Poulton S. Oxygenation of the Mesoproterozoic Ocean and the evolution of complex eukaryotes. Nat Geosci. 2018;11:345–50.
Harada M, Tajika E, Sekine Y. Transition to an oxygen-rich atmosphere with an extensive overshoot triggered by the Paleoproterozoic snowball earth. Earth Planet Sc Lett. 2015;419:178–86.
Fiorella RP, Sheldon ND. Equable end Mesoproterozoic climate in the absence of high CO2. Geology. 2017;45(3):231–4.
Ostrovsky D, Diomina G, Lysak E, Matveeva E, Ogrel O, Trutko S. Effect of oxidative stress on the biosynthesis of 2-C-methyl-D-erythritol-2, 4-cyclopyrophosphate and isoprenoids by several bacterial strains. Arch Microbiol. 1998;171(1):69–72.
Rivasseau C, Seemann M, Boisson A, et al. Accumulation of 2-C-methyl-D-erythritol 2, 4-cyclopyrophosphate in illuminated plant leaves at supraoptimal temperatures reveals a bottleneck of the prokaryotic methylerythritol 4-phosphate pathway of isoprenoid biosynthesis. Plant Cell Environ. 2009;32(1):82–92.
Ostrovsky DN, Dyomina GR, Deryabina YI, et al. Properties of 2-C-methyl-D-erythritol 2,4-cyclopyrophosphate, an intermediate in nonmevalonate isoprenoid biosynthesis. Appl Biochem Microbiol. 2003;39(5):497–502.
Ku C, Nelson-Sathi S, Roettger M, Garg S, Hazkani-Covo E, Martin WF. Endosymbiotic gene transfer from prokaryotic pangenomes: inherited chimerism in eukaryotes. Proc Natl Acad Sci. 2015;112(33):10139–46.
Ponce-Toledo RI, Deschamps P, López-García P, Zivanovic Y, Benzerara K, Moreira D. An early-branching freshwater cyanobacterium at the origin of plastids. Curr Biol. 2017;27(3):386–91.
Huang J, Gogarten JP. Did an ancient chlamydial endosymbiosis facilitate the establishment of primary plastids? Genome Biol. 2007;8(6):R99.
Ball SG, Subtil A, Bhattacharya D, et al. Metabolic effectors secreted by bacterial pathogens: essential facilitators of plastid endosymbiosis? Plant Cell. 2013;25(1):7–21.
Ball SG, Bhattacharya D, Weber AP. Infection and the first eukaryotes—response. Science. 2016;352(6289):1065–6.
Cenci U, Ducatez M, Kadouche D, Colleoni C, Ball SG. Was the chlamydial Adaptative strategy to tryptophan starvation an early determinant of plastid endosymbiosis? Front Cell Infect Mi. 2016;6:67.
Cenci U, Bhattacharya D, Weber AP, Colleoni C, Subtil A, Ball SG. Biotic host–pathogen interactions as major drivers of plastid endosymbiosis. Trends Plant Sci. 2017;22(4):316–28.
Domman D, Horn M, Embley TM, Williams TA. Plastid establishment did not require a chlamydial partner. Nat Commun. 2015;6:6421.
Gould SB. Infection and the first eukaryotes. Science. 2016;352(6289):1065.
Soucy SM, Huang J, Gogarten JP. Horizontal gene transfer: building the web of life. Nat Rev Genet. 2015;16(8):472.
Ku C, Nelson-Sathi S, Roettger M, et al. Endosymbiotic origin and differential loss of eukaryotic genes. Nature. 2015;524(7566):427.
Nishiyama T, Sakayama H, de Vries J, et al. The Chara genome: secondary complexity and implications for plant terrestrialization. Cell. 2018;174(2):448–464. e424.
Hori K, Maruyama F, Fujisawa T, et al. Klebsormidium flaccidum genome reveals primary factors for plant terrestrial adaptation. Nat Commun. 2014;5:3978.
Eddy SR. Hidden markov models. Curr Opin Struct Biol. 1996;6(3):361–5.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics. 2011;27(8):1164–5.
Silvestro D, Michalak I. raxmlGUI: a graphical front-end for RAxML. Org Divers Evol. 2012;12(4):335–7.
Ronquist F, Teslenko M, Van Der Mark P, et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61(3):539–42.
Lee T-H, Tang H, Wang X, Paterson AH. PGDD: a database of gene and genome duplication in plants. Nucleic Acids Res. 2012;41(D1):D1152–8.
Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91.
R Core Team. R: a language and environment for statistical computing; 2013.
Bjornson M, Balcke GU, Xiao Y, et al. Integrated omics analyses of retrograde signaling mutant delineate interrelated stress-response strata. Plant J. 2017;91(1):70–84.
Kebede AZ, Johnston A, Schneiderman D, Bosnich W, Harris LJ. Transcriptome profiling of two maize inbreds with distinct responses to Gibberella ear rot disease to identify candidate resistance genes. BMC Genomics. 2018;19(1):131.
Khraiwesh B, Qudeimat E, Thimma M, et al. Genome-wide expression analysis offers new insights into the origin and evolution of Physcomitrella patens stress response. Sci Rep. 2015;5:17434.
You C, Cui J, Wang H, et al. Conservation and divergence of small RNA pathways and microRNAs in land plants. Genome Biol. 2017;18(1):158.
Puigbò P, Bravo IG, Garcia-Vallve S. CAIcal: a combined set of tools to assess codon usage adaptation. Biol Direct. 2008;3(1):38.
This work was supported by the U.S. National Science Foundation (NSF) grant IOS-1352478 to KD, and the U.S. National Institutes of Health (NIH) grant R01GM107311 to KD. The funders had no role in design of the study, or in the collection, analysis, and interpretation of data or in the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Additional file 1: Figure S1.
Phylogenetic tree of DXS. Figure S2. Phylogenetic tree of MDS. Figure S3. Phylogenetic tree of DXR. Figure S4. Phylogenetic tree of HDR. Figure S5. Phylogenetic tree of CMS. Figure S6. Phylogenetic tree of CMK. Figure S7. Phylogenetic tree of HDS.
Additional file 2: Table S1.
Nonsynonymous (dN) and synonymous (dS) substitution rates estimated by PAML. Table S2. The relative expression ranking of the MEP-pathway genes in the representative species. Table S3. The codon adaptation index of the MEP-pathway genes.
Additional file 3: Data S1.
Paralogous syntenic blocks display loss of duplicated copies of MEP-pathway genes.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Zeng, L., Dehesh, K. The eukaryotic MEP-pathway genes are evolutionarily conserved and originated from Chlaymidia and cyanobacteria. BMC Genomics 22, 137 (2021). https://doi.org/10.1186/s12864-021-07448-x
- Plastid-bearing eukaryotes