AlgaGEM – a genome-scale metabolic reconstruction of algae based on the Chlamydomonas reinhardtii genome
© Gomes de Oliveira Dal'Molin et al; licensee BioMed Central Ltd. 2011
Published: 22 December 2011
Microalgae have the potential to deliver biofuels without the associated competition for land resources. In order to realise the rates and titres necessary for commercial production, however, system-level metabolic engineering will be required. Genome-scale metabolic reconstructions have revolutionized microbial metabolic engineering and are used routinely for in silico analysis and design. While genome-scale metabolic reconstructions have been developed for many prokaryotes and model eukaryotes, the application to less well characterized eukaryotes such as algae is challenging, not least due to a lack of compartmentalization data.
We have developed a genome-scale metabolic network model (named AlgaGEM) covering the metabolism of a compartmentalized algal cell based on the Chlamydomonas reinhardtii genome. AlgaGEM is a comprehensive literature-based genome-scale metabolic reconstruction that accounts for the functions of 866 unique ORFs, 1862 metabolites, 2249 gene-enzyme-reaction-association entries, and 1725 unique reactions. The reconstruction was compartmentalized into the cytoplasm, mitochondrion, plastid and microbody using available data for algae complemented with compartmentalisation data for Arabidopsis thaliana. AlgaGEM describes a functional primary metabolism of Chlamydomonas and accurately predicts distinct algal behaviours such as the catabolism or secretion rather than recycling of phosphoglycolate in photorespiration. AlgaGEM was validated through the simulation of growth and algal metabolic functions inferred from literature. Using efficient resource utilisation as the optimality criterion, AlgaGEM predicted observed metabolic effects under autotrophic, heterotrophic and mixotrophic conditions. AlgaGEM predicts increased hydrogen production when cyclic electron flow is disrupted, as seen in a high-producing mutant identified in mutational studies. The model also predicted the physiological pathway for H2 production and identified new targets to further improve H2 yield.
AlgaGEM is a viable and comprehensive framework for in silico functional analysis and can be used to derive new, non-trivial hypotheses for exploring this metabolically versatile organism. Flux balance analysis can be used to identify bottlenecks and new targets to metabolically engineer microalgae for production of biofuels.
Microalgae are receiving increased attention as the search for sustainable and profitable biofuel feedstocks progresses. Algae-derived hydrogen, methane, triacylglycerols, and ethanol could all serve as potential biofuels [1–3], but many challenges remain to be addressed [4, 5]. In order to realise the rates and titres necessary for commercial production, system-level metabolic engineering will be required.
In modern, system-level microbial metabolic engineering, genome scale metabolic reconstructions (GEMs) are used to integrate and analyse large ‘omics datasets as well as to evaluate designs in silico. A GEM maps annotated metabolic genes and proteins to reactions based on the current best understanding of a given organism. A growing number of metabolic engineering studies have demonstrated the use of well-curated GEMs to generate strain designs that are neither intuitive nor obvious [7–12].
Currently there is no genome-scale reconstruction available for algae. The first attempt at a large metabolic reconstruction of algae (based on Chlamydomonas reinhardtii) featured 484 reactions and 458 metabolites located in the chloroplast, cytosol and mitochondria. An independent model featured 259 reactions and 267 metabolites localized to the cytosol, mitochondria, chloroplast, glyoxysome, and flagellum. Despite the importance of these models, curation of cellular compartmentalization and genomic information was limited to central metabolism. Furthermore, in their current format, such models do not allow for the integration of other omics data (proteome, transcriptome and metabolome) for a system-level assessment of C. reinhardtii. For this, a full GEM is required.
GEMs have been developed for several model eukaryotes: yeast, mouse, human and Arabidopsis. For algae and other less extensively studied eukaryotes, a major challenge is the scarcity of data regarding compartmentalisation. One approach to overcome this shortfall is to use the compartmentalisation data for related organisms (here Arabidopsis) where no biochemical data for algae exist. We recently used this approach for the metabolic reconstruction of GEMs for the C4 grasses maize, sorghum and sugarcane, and the resultant model was able to predict differential protein expression between mesophyll and bundle sheath, a unique C4 phenomenon. The metabolism of unicellular C. reinhardtii, however, has several features distinct from plants, including the presence of fermentative pathways, an inability to utilize sugars and a distinct mechanism for photorespiration.
In this paper, we develop the first compartmentalized, genome-scale model of algal metabolism (named AlgaGEM) based on the C. reinhardtii genome and a comprehensive evaluation of biochemical evidence found in literature, complemented with missing compartmentalisation data derived from the GEM for Arabidopsis, AraGEM. AlgaGEM captures the unique algal phenotypes, identifies pathways known to be important during anaerobic growth and accurately predicts the effect of a known mutation on hydrogen production. This success highlights the potential of using chimeric models to access the immensely powerful tools available for analysing GEMs when working with biochemically less characterized eukaryotes.
Table 1. Online resources for the reconstruction of the metabolic network of Chlamydomonas reinhardtii
DOE Joint Genome Institute (JGI); Chlamydomonas reinhardtii v4.0
An Online Informatics Resource for Chlamydomonas (Chlamy Center)
Kyoto Encyclopedia of Genes and Genomes (KEGG)
ExPASy Biochemical Pathways
ExPASy Enzyme Database
Enzyme/protein localization and other databases*:
AraPerox (Arabidopsis Protein from Plant Peroxisomes)
SUBA (Arabidopsis subcellular database)
PPDB (Plant proteome database)
AlgaGEM was compiled and curated in Excel (Microsoft Corporation) for ease of annotation and commenting (Figure 1). From this gene-centric database, a 2D reaction-centric SBML (Systems Biology Markup Language, http://www.sbml.org) representation was generated using an in-house Java (Oracle Corporation) application. As there is currently no specific element in SBML allocated to store the gene–protein–reaction associations (e.g. splice variants, isozymes, protein complexes), these were added as notes to the reaction elements. Constraint-based reconstruction and analysis was performed using the COBRA toolbox (http://opencobra.sourceforge.net/), a set of MATLAB scripts for constraint-based modelling run from within the MATLAB environment (Version 7.3, The MathWorks, Inc.). Simulated flux distributions were visualized on a metabolic flux map (for the visualization of overall changes in the central metabolism of a compartmentalized algal cell) drawn in Excel.
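To illustrate how a gene–protein–reaction association might be carried as a note on a reaction element, the following Python sketch builds a minimal SBML-like reaction using only the standard library. It is not the in-house Java application described above. The reaction ID, the gene names and the "GENE_ASSOCIATION: " note prefix are assumptions made for this example and are not taken from the AlgaGEM files.

```python
import xml.etree.ElementTree as ET

# Assumed note prefix for this sketch; not confirmed for the AlgaGEM SBML file.
PREFIX = "GENE_ASSOCIATION: "

def reaction_with_gpr(reaction_id, gpr_rule):
    """Build an SBML-like <reaction> element carrying a GPR rule as a note."""
    reaction = ET.Element("reaction", {"id": reaction_id, "reversible": "false"})
    notes = ET.SubElement(reaction, "notes")
    note = ET.SubElement(notes, "p")
    note.text = PREFIX + gpr_rule
    return reaction

def read_gpr(reaction):
    """Recover the GPR rule string from the notes, or None if absent."""
    for p in reaction.iter("p"):
        if p.text and p.text.startswith(PREFIX):
            return p.text[len(PREFIX):]
    return None

# Hypothetical rule: a two-subunit complex (AND) with an isozyme (OR).
rxn = reaction_with_gpr("R_example_c", "(geneA and geneB) or geneC")
xml_text = ET.tostring(rxn, encoding="unicode")
print(xml_text)
print(read_gpr(ET.fromstring(xml_text)))
```

Storing the rule as plain text keeps the file readable by tools that ignore notes, at the cost of needing a small parser on the reading side.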
Table 2. List of biomass components
Carbohydrates and sugars: Starch, sucrose, fructose, glucose, maltose
Protein (amino acids): Alanine, arginine, aspartate, asparagine, cysteine, lysine, leucine, isoleucine, glutamate, glutamine, histidine, methionine, phenylalanine, proline, serine, tyrosine, tryptophan, valine
Nucleotides: ATP, GTP, CTP, UTP, dATP, dGTP, dCTP, dTTP
Fatty acids: C16:0 (palmitic acid)
Vitamins and cofactors: Biotin, coenzyme A, riboflavin, folate, chlorophyll, nicotinamide, thiamine, ubiquinone
Each biomass component was tested individually by maximizing v_i, where v_i is the flux through the corresponding biomass drain reaction. Where the maximum production rate of a biomass component was zero, gap analysis was performed. Some gaps were readily filled based on inspection of the corresponding pathways in KEGG, ChlamyCyc and other available databases (Table 1). Others, such as inconsistent irreversibility constraints, stoichiometry errors, inconsistent compound names, compartmentalization errors or missing transporters, required sequential tracing through the model to identify breakpoints and careful evaluation of the possible causes.
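The idea of tracing a blocked biomass component back to a breakpoint can be sketched with a simple reachability (network expansion) check. This is a qualitative stand-in for the flux-based test, not the procedure used for AlgaGEM: it ignores stoichiometry, reversibility and cofactor balancing. The toy network and all metabolite and reaction names are hypothetical.

```python
def producible(reactions, seeds):
    """Return the set of metabolites reachable from the seed metabolites.

    reactions: dict name -> (set of substrates, set of products)
    A reaction can fire once all of its substrates are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available

# Hypothetical toy network; the suffixes _ext, _c and _p stand for
# extracellular, cytoplasm and plastid.
toy = {
    "uptake": ({"substrate_ext"}, {"substrate_c"}),
    "conversion": ({"substrate_c"}, {"intermediate_c"}),
    # The cytoplasm-to-plastid transporter is deliberately missing here.
    "synthesis": ({"intermediate_p"}, {"precursor_p"}),
}
seeds = {"substrate_ext"}

blocked = "precursor_p" not in producible(toy, seeds)
print("blocked before gap filling:", blocked)

# Adding the missing transporter closes the gap.
toy["transport"] = ({"intermediate_c"}, {"intermediate_p"})
fixed = "precursor_p" in producible(toy, seeds)
print("producible after gap filling:", fixed)
```

Comparing the reachable set with the substrates of each blocked reaction points to the first metabolite that cannot be supplied, which is where manual curation would start.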
Once network gaps were closed, the individual biomass accumulation terms were combined into an overall biomass synthesis equation, with the appropriate coefficients assigned to each precursor to define the composition of biomass. The overall biomass synthesis equation depends on growth conditions and was designed to represent autotrophic, heterotrophic and mixotrophic conditions based on literature data. Trace elements were not included in the biomass equation, since their contribution to overall flux is trivial.
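As a minimal sketch of how individual biomass terms could be combined into one condition-specific biomass reaction, the Python example below turns a composition table into a single stoichiometry. The coefficients are placeholders chosen for illustration only; they are not the values used in AlgaGEM, and the function and variable names are hypothetical.

```python
def biomass_reaction(composition):
    """Combine per-component demands into one biomass reaction.

    composition: dict metabolite -> amount consumed per unit of biomass
    Returns a stoichiometry dict in which precursors carry negative
    coefficients and one unit of 'biomass' is produced.
    """
    stoich = {met: -amount for met, amount in composition.items() if amount > 0}
    stoich["biomass"] = 1.0
    return stoich

# Placeholder compositions (NOT measured values), one per growth condition.
compositions = {
    "autotrophic": {"starch": 0.30, "alanine": 0.05, "ATP": 0.01, "palmitate": 0.10},
    "heterotrophic": {"starch": 0.20, "alanine": 0.06, "ATP": 0.01, "palmitate": 0.15},
}

equations = {cond: biomass_reaction(comp) for cond, comp in compositions.items()}
for cond, stoich in equations.items():
    print(cond, stoich)
```

Keeping one composition per growth condition makes it straightforward to switch the objective reaction when simulating autotrophic, heterotrophic or mixotrophic growth.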
Table 3. Minimal set of constraints imposed to represent different growth conditions
Inputs, outputs and constraints
C source: CO2 uptake
C source: Acetate uptake
Photons uptake (free flux)
Optimization 1: minimize uptake of the key energy substrate (photons or acetate)
Optimization 2: maximize product (H2)
Biomass rate (fixed) *
Optimization 1 gives the flux distributions that minimize the use of the key energy substrate (photons or acetate), while achieving a specified growth rate.
Optimization 2 gives the flux distributions that maximize H2 synthesis, while achieving a specified growth rate under autotrophic, mixotrophic or heterotrophic conditions.
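The two optimizations can be written as linear programs in the usual flux balance form. The notation below is assumed for illustration and does not come from the original article: S is the stoichiometric matrix, v the flux vector with bounds v^min and v^max, μ the fixed growth rate, v_uptake the photon or acetate uptake flux, and v_H2 the hydrogen export flux.

```latex
\begin{align}
\textbf{Optimization 1:}\quad & \min_{v}\; v_{\text{uptake}} \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}, \\
& v_{\text{biomass}} = \mu . \\[1ex]
\textbf{Optimization 2:}\quad & \max_{v}\; v_{\mathrm{H_2}} \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}, \\
& v_{\text{biomass}} = \mu .
\end{align}
```

Both problems share the same feasible set, defined by steady-state mass balance, the flux bounds and the fixed growth rate. Only the objective changes between them.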
Table 4. Characteristics of the reconstructed genome-scale model (AlgaGEM)
Unique metabolic reactions
Gaps (non-enzymatic reactions)
Forty-two (42) biomass drain equations describe the accumulation of carbohydrates, sugars, amino acids and fatty acids, representing the major biomass drains for an algal cell (Table 2). At present, fatty acid biosynthesis is limited to palmitic acid biosynthesis in chloroplasts. The biosynthetic pathways of a limited number of vitamins and co-factors have been curated to date. Twenty-four (24) intercellular exchange reactions (cytoplasm–extracellular) have been included to describe the uptake of light (absorbed photons), the uptake/secretion of inorganic compounds (CO2, H2O, HCO3-, O2, NO3, NH3, H2S, SO42–, PO43–), and the translocation of fermentative products (such as acetate, glycolate, lactate and ethanol), H2 and amino acids (glutamine, glutamate, aspartate, alanine and serine). Together with biomass drains (39), the intercellular exchangers define the broad physiological domain of the model, that is, the curated aspects of C. reinhardtii primary metabolism captured by AlgaGEM. Inter-organelle transporters were added based on the biochemical information available for algae (see Additional file 1: Table S1). When this was not available, we used information that supports inter-organelle transporters for Arabidopsis (e.g. Transport DB; Table 1). A total of 79 inter-organelle transporters were introduced in the model to achieve metabolic functionality. Apart from nomenclature and cellular compartmentalization issues, only three additional reactions without gene associations (non-enzymatic steps) were added during model curation before the model was able to simulate growth in silico.
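The fallback rule described here (use algal evidence where it exists, otherwise Arabidopsis evidence) can be sketched as a two-level lookup that also records where each assignment came from. The transporter names and compartment labels below are hypothetical and serve only to show the logic.

```python
def pick_evidence(item, algal, arabidopsis):
    """Return (value, source), preferring algal evidence over Arabidopsis."""
    if item in algal:
        return algal[item], "algae"
    if item in arabidopsis:
        return arabidopsis[item], "Arabidopsis"
    return None, "unresolved"

# Hypothetical evidence tables: transporter -> pair of connected compartments.
algal_evidence = {"transporter_1": ("cytoplasm", "plastid")}
arabidopsis_evidence = {
    "transporter_1": ("cytoplasm", "mitochondrion"),  # overridden by algal data
    "transporter_2": ("cytoplasm", "microbody"),
}

assignments = {
    name: pick_evidence(name, algal_evidence, arabidopsis_evidence)
    for name in ["transporter_1", "transporter_2", "transporter_3"]
}
for name, (compartments, source) in assignments.items():
    print(name, compartments, source)
```

Recording the source alongside each assignment makes it easy to list the entries that rest on Arabidopsis data and revisit them as algal evidence becomes available.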
The reconstruction of metabolic models for eukaryotes is challenging due to the scarcity of biochemical information at the subcellular level required for cellular compartmentalisation. AlgaGEM covers our current understanding of metabolic functionality and connectivity through different organelles for C. reinhardtii. It does rely, however, on AraGEM for the compartmentalisation of reactions for which no data exist for C. reinhardtii. Given that there is a substantial, natural overlap between AlgaGEM and AraGEM, with approximately 85% of all reactions present in both models, it is important to establish that AlgaGEM indeed predicts algal behaviour rather than slightly modified Arabidopsis behaviour.
Heterotrophic growth in AlgaGEM differs from AraGEM in that the former can metabolize acetate and glycolate, but lacks glucose and sucrose transporters and is unable to utilize these carbon sources from the media. Moreover, AlgaGEM has fermentative reactions and produces a range of fermentative products (such as H2, glycolate, acetate, formate and lactate). These differences are the direct result of added reactions and transporters.
A more interesting difference is the way algae and plants handle the RuBisCO oxygenation reaction. Assuming that plants have evolved to optimise photon efficiency, AraGEM accurately predicts that phosphoglycolate is recycled using the classical photorespiration cycle involving reactions in plastids, peroxisomes and mitochondria. Moreover, it accurately predicts that photon requirements increase by 30-40% if the ratio of oxygenation to carboxylation is 1:4.
Photorespiration in C. reinhardtii deviates from classical plant photorespiration in that, instead of oxidizing glycolate to glyoxylate via glycolate oxidase in peroxisomes, C. reinhardtii and many other microalgae utilize glycolate in mitochondria [25, 39, 40]. In addition, because molecular O2 is not an electron acceptor for glycolate dehydrogenase, glycolate oxidation catalysed by this enzyme does not produce H2O2, so catalase should not be required for the photorespiration cycle in algae. Instead, glycolate dehydrogenase is expected to contribute electrons to mitochondrial respiratory electron transport through reduction of the ubiquinone pool. AlgaGEM accurately predicts that algae will catabolise rather than recycle phosphoglycolate if sufficient oxygen is available and energy is needed, or alternatively secrete glycolate to the environment, as has also been observed [41, 42].
C. reinhardtii produces hydrogen under heterotrophic and mixotrophic conditions (Figure 3). The rates are generally low, and metabolic engineering strategies are being explored to improve production rates. Mutational studies identified a high-producing state transition mutant, Stm6, which is blocked in state 1 photosynthesis and hence has greatly inhibited cyclic electron flow around photosystem I.
Table 5. Upregulation of key target enzymes when H2 is produced under mixotrophic conditions, highlighted by AlgaGEM
Cyclic electron flow (PSI)
Linear electron flow (PSII)
Ferredoxin-NADP+ reductase (FNR)
Pyruvate ferredoxin oxidoreductase
Calvin cycle/CO2 assimilation
Pentose phosphate pathway (cytosolic)
Pentose phosphate pathway (plastidic)
Beta oxidation/glyoxylate cycle
Glutamate synthase (ferredoxin); GS/GOGAT
Glutamine synthetase; GS/GOGAT
Also in line with observations for the Stm6 mutant, the model predicted increased linear electron flow and reduced mitochondrial TCA cycle activity (Table 5). Furthermore, the model predictions agreed with observations made in other studies regarding the correlation of hydrogen production with expression of various genes, including increased activity of reactions directly involved in hydrogen production (Fe-hydrogenases, ferredoxin-NADP+ reductase, glyceraldehyde-3-phosphate dehydrogenase, and pyruvate ferredoxin oxidoreductase) and a shift in carbon assimilation from the Calvin cycle/CO2 assimilation to acetate assimilation (Table 5). The model also suggested a number of possible further targets for investigation, including gluconeogenesis, the pentose phosphate pathway, beta oxidation, the glyoxylate cycle and GS/GOGAT.
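A comparison of this kind between two simulated flux distributions can be sketched as follows. The flux values are hypothetical placeholders chosen only to mirror the directions of change described in the text; they are not AlgaGEM outputs, and the reaction labels are illustrative.

```python
def classify_changes(reference, perturbed, tol=1e-6):
    """Label each reaction 'up', 'down' or 'unchanged' by absolute flux."""
    labels = {}
    for rxn in sorted(set(reference) | set(perturbed)):
        before = abs(reference.get(rxn, 0.0))
        after = abs(perturbed.get(rxn, 0.0))
        if after > before + tol:
            labels[rxn] = "up"
        elif after < before - tol:
            labels[rxn] = "down"
        else:
            labels[rxn] = "unchanged"
    return labels

# Hypothetical fluxes (arbitrary units), NOT model results.
wild_type = {"cyclic_electron_flow": 2.0, "linear_electron_flow": 5.0,
             "hydrogenase": 0.1, "tca_cycle": 1.5}
cef_blocked = {"cyclic_electron_flow": 0.0, "linear_electron_flow": 6.0,
               "hydrogenase": 0.8, "tca_cycle": 1.0}

changes = classify_changes(wild_type, cef_blocked)
for rxn, label in changes.items():
    print(rxn, label)
```

Using absolute fluxes treats a reversible reaction that runs faster in either direction as more active; a direction-aware comparison would need the sign as well.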
AlgaGEM is a curated, compartmentalized genome scale model of algal cell primary metabolism. Continued curation efforts will focus on closing gaps, especially in the secondary metabolism and alternative fermentative pathways (not well understood or described at organelle level for algae) and the resolution of gene product targeting where this is yet to be established. In its current version, the model covers the primary metabolism including some of the fermentative pathways. Importantly, while the model shares 85% of the reactions with AraGEM and while AraGEM was used to compartmentalise many genes, the model predicts distinct algal behaviours such as the catabolism or secretion rather than recycling of phosphoglycolate in photorespiration.
The use of AlgaGEM for in silico flux predictions illustrates the potential of using genome-scale models to explore complex, compartmentalized networks and develop non-trivial hypotheses. The metabolic changes highlighted by AlgaGEM to increase H2 yield agree with evidence found in the literature, and the model predicted the magnitude of change observed in a state transition mutant, Stm6. Further experimental investigations are suggested to test the new targets. Such results support the potential use of this framework for algal metabolic engineering.
We thank Professor Ben Hankamer, Dr. Ian Ross and Ms Winne Waudo (Institute for Molecular Biosciences/The University of Queensland) for the insightful discussion and for information about the H2 mutant (Stm6).
This article has been published as part of BMC Genomics Volume 12 Supplement 4, 2011: Proceedings of the 6th International Conference of the Brazilian Association for Bioinformatics and Computational Biology (X-meeting 2010). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/12?issue=S4
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.