- Research article
- Open Access
De novo sequencing, assembly and characterisation of Aloe vera transcriptome and analysis of expression profiles of genes related to saponin and anthraquinone metabolism
© The Author(s). 2018
- Received: 18 May 2017
- Accepted: 22 May 2018
- Published: 1 June 2018
Aloe vera is a perennial, succulent, drought-resistant plant that exhibits many pharmacological characteristics such as wound healing ability against skin burns, anti-ulcer, anti-inflammatory, anti-tumor, anti-viral, anti-hypercholesterolemic, anti-hyperglycemic, anti-asthmatic and much more. Despite great medicinal worth, little genomic information is available on Aloe vera. This study is an initiative to explore the full-scale functional genomics of Aloe vera by generating whole transcriptome sequence database, using Illumina HiSeq technology and its progressive annotation specifically with respect to the metabolic specificity of the plant.
Transcriptome sequencing of root and leaf tissue of Aloe vera was performed using Illumina paired-end sequencing technology. De novo assembly of high quality paired-end reads, resulted into 1,61,733 and 2,21,792 transcripts with mean length of 709 and 714 nucleotides for root and leaf respectively. The non-redundant transcripts were clustered using CD-HIT-EST, yielding a total of 1,13,063 and 1,41,310 unigenes for root and leaf respectively. A total of 6114 and 6527 CDS for root and leaf tissue were enriched into 24 different biological pathway categories using KEGG pathway database. DGE profile prepared by calculating FPKM values was analyzed for differential expression of specific gene encoding enzymes involved in secondary metabolite biosynthesis. Sixteen putative genes related to saponin, lignin, anthraquinone, and carotenoid biosynthesis were selected for quantitative expression by real-time PCR. DGE as well as qRT PCR expression analysis represented up-regulation of secondary metabolic genes in root as compared to leaf. Furthermore maximum number of genes was found to be up-regulated after the induction of methyl jasmonate, which stipulates the association of secondary metabolite synthesis with the plant’s defense mechanism during stress. Various transcription factors including bHLH, NAC, MYB were identified by searching predicted CDS against PlantTFdb.
This is the first transcriptome database of Aloe vera and can be potentially utilized to characterize the genes involved in the biosynthesis of important secondary metabolites, metabolic regulation, signal transduction mechanism, understanding function of a particular gene in the biology and physiology of plant of this species as well as other species of Aloe genus.
- Aloe vera
- Next generation sequencing
- De novo assembly
- Secondary metabolism
- Differential gene expression
The genus Aloe belongs to the family Xanthorrhoeaceae, subfamily Asphodeloideae  consisting of many shrubby tropical/subtropical plant species with succulent and elongated leaves. The genus contains more than 360 species, out of which only four have been reported to exhibit medicinal properties: Aloe vera, Aloe arborescens, Aloe ferox, and Aloe perryi. Among all the species, Aloe vera Linne Synonym Aloe barbadensis Miller is considered to be medicinally most potent, therefore, it is most popular [2, 3]. Aloe vera shows innumerable medicinal attributes such as skin burn healing , anti-inflammatory , hepatoprotective , anti-tumor , anti-ulcer , antihypercholesterolemic , anti-hyperglycemic , anti-asthmatic and antioxidant activities . Aloe vera extracts have been demonstrated to be advantageous in the treatment of even AIDS . It has similar anti-aging effects as exhibited by vitamin A derivatives .
Aloe vera contains numerous active ingredients including anthraquinones, polysaccharides, alkylbenzenes, dehydrabietic acid derivatives, salicylic acid, lectin, carotenoids, lignin, saponins etc. that attribute for its high therapeutic value . Many of the medicinal properties of Aloe vera are ascribed to secondary metabolites though they are relatively minor in their concentration in the plant (< 1% dry weight) . Aloe vera exhibits antibacterial property because of anthraquinones that behave like tetracycline by blocking the ribosomal A site, thus, interrupting bacterial protein synthesis . Saponins are another important group of secondary metabolites synthesized as defensive compounds against pathogenic microbes and herbivores [18–20]. Saponins present in Aloe vera also act as an effective anti-microbial agent against various bacteria, viruses, fungi, and yeasts . Aloe vera extract act as an antioxidant due to carotenoids present in it . Lignins present in Aloe vera penetrate deep into the skin and help in introducing other medicinal ingredients to penetrate into the skin. Therapeutic effects of Aloe vera have not been correlated well with individual metabolite and it remains unknown whether the biological activities of the plant are due to a single component or the collaborative efforts of many components .
An increasing trend where consumers tend towards a healthy lifestyle, coupled with the increased use of Aloe vera extracts as an ingredient in food, pharmaceutical and cosmetics products, increases its market growth across the globe. International Aloe Science Council (IASC) estimated that the global consumption of Aloe vera will surpass 60,720.4 tonnes in 2016, accounting for revenues worth US$ 1.6 billion . Presently, a large industrial sector is attempting to exploit the great biological potential of Aloe vera. The biological potential is determined by the various complex interactions between the genome, gene products, and metabolites. Various functional genomics approaches are now emerging as powerful tools to accelerate the comprehensive understanding of the molecular basis of biological functions. Genomics approaches have also proved to be significant tools in identifying transcription factors, and candidate genes involved in the plant’s secondary metabolism . Next-generation sequencing technology has revolutionized the genomics field by providing a rapid, cost- effective and efficient gene sequencing data, enabling the identification of genes related to metabolic pathways, especially in non-model plants for which no reference genome is available; for instance; American ginseng,  Eucalyptus,  rubber tree  and many more. Transcriptome sequencing using Illumina mRNA-Seq reads has proved to be an efficient approach in many plants for which transcriptome data is validated and analyzed as well, including Eucalyptus,  blackberry,  Uncaria,  Lolium,  orphan,  sesame,  alfalfa,  sweet potato,  Centella,  Ocimum  pear,  rose-scented geranium,  Withania somnifera . The present study used NGS technology for whole transcriptome sequencing of Aloe vera root and leaf tissues, performed using Illumina Hi-Seq technology. It provides valuable sequence information from Aloe vera root and leaf tissues with special emphasis on genes related to secondary metabolic pathways. Genes related to secondary metabolic pathways were examined for differential expression in root and leaf tissue by quantitative real-time PCR. Real-time PCR comprises a dynamic range, remarkable sensitivity, and sequence-specificity that enables additional independent confirmation of NGS based data  Methyl jasmonate treatment was given to the experimental plant to check the relative expression of genes under stress conditions at different time intervals. The transcriptome sequencing and analysis provides a strong platform of genomic sequences, to serve biosynthetic pathway and metabolic engineering programs of basic as well as applied research on Aloe vera and catalyze path of its change of use from extracts to molecules.
De novo transcriptome assembly and validation
Summary of Transcript, Unigene and CDS Statistics
Transcription factor analysis
List of Top 15 transcription factors enriched in Root Sample
Transcription factors name
# of Hits
List of Top 15 transcription factors enriched in Leaf Sample
Name Of Transcription Factor
# of Hits
KEGG pathway mapping of CDS
KEGG Pathway classification of Root and Leaf Sample CDS
Amino acid metabolism
Metabolism of other amino acids
Glycan biosynthesis and metabolism
Metabolism of cofactors and vitamins
Metabolism of terpenoids and polyketides
Biosynthesis of other secondary metabolites
Xenobiotics biodegradation and metabolism
Genetic Information Processing
Folding, sorting and degradation
Replication and repair
Environmental Information Processing
Signaling molecules and interaction
Transport and catabolism
Cell growth and death
Potential gene identification related to secondary metabolism from KEGG pathway mapping
In our research, several putative genes related to saponin, lignin, anthraquinone, carotenoid and phenylpropanoid biosynthesis have been identified from Aloe vera root and leaf tissue by KEGG Pathway functional annotation. A total of 171 CDS from root encoding 80 enzymes and 165 CDS from leaf encoding 77 enzymes were identified which are involved in different secondary metabolites biosynthesis comprising saponin (root 12; leaf 15), anthraquinone (root 31; leaf 43), lignin (root 25; leaf 23), carotenoid (root 31; leaf 32) and phenylpropanoid pathways (root 72; leaf 52).
Sequencing of Aloe vera transcriptome reveals many putative genes (number of different transcripts obtained is given in bracket with each gene) related to saponin biosynthesis pathway including acetyl-CoA acetyltransferase (ACT: root 3; leaf 9), HMG-CoA synthase (HMGS: root 1; leaf 2), HMG-CoA reductase (HMGR: root 2; leaf 4), mevalonate kinase (MVK: root 9; leaf 14), phosphomevalonate kinase (PMVK: root 1; leaf 2), mevalonate-5-diphosphate decarboxylase (MDD: root 1; leaf 1), isopentenyl-PP isomerase (IPP isomerase: root 2; leaf 1), farnesyl diphosphate synthase (FDS: root 1; leaf 1), squalene synthase (SQS: root 1; leaf 1), squalene epoxidase (SQE: root 4; leaf 1), cycloartenol synthase (CAS: root 6; leaf 6), 1-deoxy-D-xylulose-5-phosphate synthase (DOXP: root 9; leaf 10) and 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DOXPR: root 1; leaf 3). From transcriptome sequencing, 61 different transcripts from root and 52 transcripts from leaf that encode UDP glycosyltransferase (UGT) have been obtained. Anthraquinones are another important secondary metabolites present in Aloe vera. De novo sequencing of Aloe vera in this study has provided many unigenes encoding octaketide/polyketide synthase (8 from root and 12 from leaf), aldoketoreductase (25 from root and 22 from leaf) and UDP-glycosyltransferase (61 from root and 52 from leaf) which play a key role in anthraquinone biosynthesis. Enzymes related to lignin biosynthesis viz. L-phenylalanine ammonia-lyase (PAL), caffeoyl CoA O-methyltransferase (CCoAOMT), caffeic acid O-methyltransferase (COMT), 4-coumarate:coenzyme A(CoA) ligase (4CL), cinnamoyl-CoA reductase (CCR), hydroxycinnamoyl transferase (HCT), cinnamate-4-hydroxylase (C4H), 4-coumarate 3-hydroxylase (C3H), ferulate 5-hydroxylase (F5H), cinnamyl alcohol dehydrogenase (CAD) were identified from the Aloe vera database by KEGG pathway mapping and the results were then further validated by differential gene expression and real-time expression analysis.
CDS present in specific pathways
Name Of Pathway
No. of Enzyme
Name Of Pathway
No. of Enzyme
Differential gene expression analysis and real-time expression analysis of transcripts involved in secondary metabolism
Representing fold change in gene expression after 6,12and 24 h of methyl jasmonate treatment
Fold Change after 6 h of methyl jasmonate treatment
Fold Change after 12 h methyl jasmonate treatment
Fold Change after 24 h methyl jasmonate treatment
Saponin biosynthesis Pathway
Anthraquinone biosynthesis pathway
Lignin biosynthesis pathway
Candidate genes related to secondary metabolism identified in root by KEGG Pathway Mapping
Root Gene ID
Phenylalanine ammonia lyase
Caffeoyl CoA O-methyltransferase
Caffeic acid O-methyltransferase
4-Coumarate:coenzyme A(CoA) ligase
0E + 00
0.00E + 00
Cinnamyl alcohol dehydrogenase
Farnesyl diphosphate synthase
DOX Phosphate synthase
Candidate genes related to secondary metabolism identified in leaf by KEGG Pathway Mapping
Leaf Gene ID
CCoAOMT: Caffeoyl CoA O-methyltransferase
COMT: Caffeic acid O-methyltransferase
4CL: 4-Coumarate:coenzyme A(CoA) ligase
CCR: Cinnamoyl-CoA reductase
HCT: Hydroxycinnamoyl transferase
C3H: 4-Coumarate 3-hydroxylase
F5H: Ferulate 5-hydroxylase
CAD: Cinnamyl alcohol dehydrogenase
Farnesyl diphosphate synthase
As the secondary metabolites are considered to be produced as a defense mechanism of plants under various biotic and abiotic stresses,  here stress condition was provided to the plant by spraying a chemical elicitor, methyl jasmonate at time interval of 6, 12 and 24 h to stimulate the gene expression. Relative expression of genes using real-time quantitative PCR were calculated with 2− ΔΔCT method as described by Livak and Schmittgen, 2001  [Here ΔΔCT = (CTtarget -CTGAPDH) time x - (CTtarget - CTGAPDH) time 0].
Aloe vera is one of the most popular medicinal plant used worldwide nevertheless, hardly any work has been conducted on its functional genomics. Only a few nucleotide sequences encoding complete or partial gene sequences are available in public databases like NCBI. No ESTs or genome survey sequences (GSS) from Aloe vera have been deposited in the GenBank . Therefore, results of this investigation on whole transcriptome sequencing of Aloe vera root as well as leaf tissues are important in the perspective of functional genomics of the plant.
De novo transcriptome assembly and functional annotation
A number of high quality reads, fine transcripts and CDS length obtained from Aloe vera next generation transcriptome sequencing indicates the functionality of Aloe vera sequencing at first sight. CDS were functionally annotated and most of the identified CDS resulted into the significant blast hits for root and leaf samples respectively, which represented a big coverage of Aloe vera genome. The majority of hits were found to be against Elaeis guineensis followed by Phoenix dactylifera for both the samples, showing closer relationship with Aloe vera genotypically.
GO mapping was carried out to assign the functions of identified CDS and grouped into three main domains: Biological processes, molecular function and cellular component. From functional GO distribution it was concluded that in the biological processes maximum 5890 genes in root and 6329 genes in leaf were only limited to metabolism and even majority of top BLAST hits i.e. 980 hits in root sample and 922 in leaf sample were found to be involved in metabolic processes, proves high metabolic activity of our research plant.
Transcription factor analysis
The transcription factors (TFs) are sequence specific DNA binding proteins that interact with the promoter regions of target genes to modulate their expression. TFs play a significant role in the regulation of plant development, reproduction, intercellular signaling, cell cycle, response to the environment as well as in modulation of secondary metabolite biosynthesis. Transcription factors like AP2/ERF, bHLH and MYB and NAC were found to be involved in regulating secondary metabolism [46, 47]. TFs family WRKYs have eminent role in regulating secondary metabolism during stress conditions . Identified MYB and NAC transcriptional factors may function as switches in lignin biosynthesis pathways as observed in other plants . Transcription factors identified in Aloe vera database may play a crucial role in modifying the levels of secondary metabolites, to stimulate the production of valuable secondary metabolites and to reduce the level of an undesirable metabolite that have an adverse effect on the quality of Aloe vera.
KEGG pathway mapping and potential gene identification related to secondary metabolism
KEGG is the most widely used biological database in the world for biological interpretation of genome sequences using server KASS [50, 51]. The identified CDS were mapped to reference canonical pathways in KEGG and predicted genes were found potentially involved in 24 different KEGG pathway. Five different functional categories were allotted to isolated CDS which encompass metabolism, genetic information processing, environmental information processing, cellular processes and organismal systems. It was found that 194 genes in root and 192 genes in leaf were related to environmental adaptation which proves well adapted feature of Aloe vera in diverse conditions.
The major secondary metabolite related constituents of Aloe vera include anthraquinones, saponin, lignin, sterols, polysaccharides, alkylbenzenes, dehydrabietic acid derivatives, lactin and salicylic acid [52, 53] which attribute for its pharmacological activity. Biosynthetic pathway of secondary metabolites in Aloe vera remains undiscovered till date. Several putative genes related to saponin, lignin, anthraquinone, carotenoid and phenylpropanoid biosynthesis have been identified from Aloe vera root and leaf tissue by KEGG Pathway functional annotation. Saponins are the major secondary metabolites of Aloe vera, well known for pharmaceutical and cosmetic properties . Saponins are a complex and chemically varied group of compounds consisting of triterpenoid or steroidal aglycones linked to oligosaccharide moieties. Steroidal saponins are hypothesized to share a common route with triterpene saponin from C5 isoprenoids, isopentenyl diphosphate (IPP) to the formation of the C30 unit squalene and 2, 3 oxidosqualene. The cytosolic MVA pathway was accepted as the only biosynthetic route to IPP until the plastid bound MEP pathway was elucidated in bacteria and plants. Now, it is generally suggested that sesquiterpenes, triterpenes and steroid are preferentially formed via MVA pathway, whereas, monoterpenes, diterpenes, and carotenoid are formed predominantly via the MEP pathway  There is no information as of today that which of the pathways (MVA or MEP) or both are involved in the biosynthesis of saponins precursors and subsequently the number of genes that are involved in the final biosynthesis of saponins in planta / Aloe vera. Squalene synthase (SQS) catalyzes the condensation of two farnesyl pyrophosphate (C15 unit) to 30 carbons compound squalene, the first committed intermediate in sterol and triterpene saponin biosynthesis pathway . Squalene synthase is well characterized in various plants like Chlorophytum borivilianum,  Dioscorea zingiberensis,  and Siraitia grosvenorii . Squalene epoxidase (SQE) with the cofactors O2 and NADPH catalyzes the conversion of squalene to 2,3-oxidosqualene that acts as a substrate for various oxidosqualene cyclases . Oxidosqualene cyclases (OSC) catalyze the cyclization of 2, 3-oxidosqualene which is a branching point for the sterol and triterpenoids saponin synthesis . Different OSCs have been characterized in the past few years and named after their respective products like lanosterol synthases, cycloartenol synthases,  lupeol synthases  and β-amyrin synthases (BASs) [64, 65]. Glycosylation is the final step in steroidal saponin biosynthesis, regulating the biological activities of saponins, catalyzed by glycosyltransferases . Several unigenes encoding acetyl-CoA acetyltransferase, HMG-CoA synthase, HMG-CoA reductase, mevalonate kinase, phosphomevalonate kinase, mevalonate-5-diphosphate decarboxylase, isopentenyl-PP isomerase, farnesyl diphosphate synthase, squalene synthase, squalene epoxidase, cycloartenol synthase, 1-deoxy-D-xylulose-5-phosphate synthase, 1-deoxy-D-xylulose-5-phosphate reductoisomerase and UDP glycosyltransferases which may be potentially involved in saponin biosynthesis pathway, have been identified from transcriptome sequencing. From Aloe vera transcriptome sequencing data no unigene encoding β-amyrin synthase was found, it denotes the lack of triterpenoid saponin in our research plant. However downward steps towards saponin biosynthesis are still unknown for Aloe vera.
Anthraquinones, another important group of secondary metabolites in Aloe vera are tricyclic aromatic quinines having strong antibacterial, antiviral, antifungal activity. Aloin, an anthraquinone glycoside is one of the most active metabolite of this group present in Aloe vera . The conjugation of a metabolite with a sugar moiety eases its entry into the target cells leading to the enhancement of the pharmacological activity [68, 69]. To date, very little is known about the biosynthetic steps leading to the formation of aloin/anthraquinones in Aloe vera. Abe et al.  have identified from Aloe arborescens a plant-specific polyketide synthase of type-III called octaketide synthase (OKS), that shares 50% amino acid sequence identity with other plant enzymes belonging to chalcone synthase superfamily. OKS catalyzes the iterative condensation of eight molecules of malonyl-CoA, might be involved in the biosynthesis of the octaketide anthrone aloin in Aloe vera. However, recombinant OKS expressed in E. coli has been reported to produce unnatural octaketide SEK4 / SEK4b as derailed shunt products either due to misfolding or glycosylation of heterologously expressed recombinant protein or the absence of interactions with an unidentified tailoring enzyme possibly ketoreductases. The physiological role of OKS in planta is yet to be identified. Earlier Grun and Franz  studied the in vitro biosynthesis of aloin from aloe-emodin anthrone and reported that the enzyme responsible for the C- glycosylation of aloe-emodin anthrone is specific for UDP-Glc. The transfer of activated sugar like UDP-glucose to aglycone acceptor molecule is catalyzed by the enzyme UDP-glycosyltransferase. Various genes like octaketide/polyketide synthase, aldoketoreductase, and UDP-glycosyltransferase have been identified from transcriptome sequencing which can be useful for unravelling the aloin biosynthesis in Aloe vera.
Enzymes related to lignin biosynthesis viz. L-phenylalanine ammonia-lyase (PAL), caffeoyl CoA O-methyltransferase (CCoAOMT), caffeic acid O-methyltransferase (COMT), 4-coumarate:coenzyme A(CoA) ligase (4CL), cinnamoyl-CoA reductase (CCR), hydroxycinnamoyl transferase (HCT), cinnamate-4-hydroxylase (C4H), 4-coumarate 3-hydroxylase (C3H), ferulate 5-hydroxylase (F5H), cinnamyl alcohol dehydrogenase (CAD), meticulously investigated for their roles in plant development, have been identified from the Aloe vera sequencing database which may be helpful to reveal the steps toward the lignin biosynthesis in Aloe vera.
Aloe vera has been reported to have anti-aging effect similar to vitamin A derivatives. Carotenoids in Aloe vera, serve as precursors of vitamin A in human diet, and are of interest as potential anti-cancer agents . In the past few years, genes encoding enzymes involved in carotenoid biosynthesis in plants have been identified and characterized at the molecular level. Briefly, geranylgeranyl diphosphate (GGPP) which acts as a precursor for carotenoids is synthesized by recruiting MEP pathway of isoprenogenesis. Further, condensation of two GGPP molecules is catalyzed by phytoene synthase (PSY). Subsequent desaturation by phytoene desaturase produces lycopene while the next tier of modifications is catalyzed by cyclases, hydroxylases, and ketolases, resulting in the production of different carotenoids . Various genes belonging to carotenoid biosynthesis including phytoene synthase, 15-cis-phytoene desaturase, zeta-carotene isomerase, prolycopene isomerase, lycopene beta-cyclase, lycopene epsilon-cyclase, zeaxanthin epoxidase, violaxanthin de-epoxidaseand abscisic-aldehyde oxidase have been isolated from sequencing, may pave the way of carotenoid biosynthesis in Aloe vera. In short, the unigenes related to different secondary metabolites biosynthesis, identified from transcriptome sequencing may unravel the secondary metabolism which is most responsible for potent applications of Aloe vera and still undiscovered.
Differential gene expression analysis and real-time expression analysis of transcripts involved in secondary metabolism
Differential Gene Expression (DGE) enables quick and thorough analysis of the gene expression under various conditions for a variety of tissues as a comparative landscape . Based on FPKM values, it was revealed that most of the genes related to secondary metabolism were highly expressed in root as compared to leaf tissue. Most of the root specific transcripts for saponin biosynthesis pathway were found up regulated in DGE data, same was also reported in Asparagus racemosus for steroidal saponin biosynthesis .
Several putative genes involved in saponin biosynthesis like acetyl-CoA acetyltransferase (ACT), mevalonate kinase (MVK), phosphomevalonate kinase (PMVK), mevalonate-5-diphosphate decarboxylase (MDD) and cycloartenol synthase (CAS) were up-regulated in root tissue while DOXP synthase and DOXP reductoisomerase genes were found to be up-regulated in leaf tissue indicating that plastid bound MEP pathway for saponin biosynthesis is dominant in leaf tissue. UDP glycosyltransferase (UGT) and octaketide synthase genes (OKS) were highly expressed in both root and leaf tissue stipulate that aloin is synthesized in both root and leaf tissue in Aloe vera.
The results obtained from quantitative real-time PCR assay of selected genes of saponin biosynthesis including HMG-CoA synthase, HMG-CoA reductase, mevalonate kinase, mevalonate-5 diphosphate decarboxylase, isopentenyl-PP isomerase and cycloartenol synthase reveal that mostly genes were found more expressed in root than in leaf tissue as given in DGE expression data. Exceptionally, HMG-CoA synthase showed higher expression in root tissue which was revert to DGE expression value. From quantitative real time expression results it was UDP-glycosyltransferase, found highly expressed in both root and leaf tissue with somewhat more expression in root indicating biosynthesis of anthraquinones in leaf as well as root. Putative genes encoding lignin biosynthesis pathway were also substantially better expressed in Aloe vera root as compared to leaf tissue as presented in DGE data indicating its reliability.
The secondary metabolites are considered to be produced as a defense mechanism of plants against various undesirable environmental encounters including biotic and abiotic stresses . Several studies have demonstrated that chemical elicitors like methyl jasmonate mediate such metabolic responses to environment through an extensive transcriptional reprogramming of the plant metabolism. In this study it was observed that most of the genes of saponin pathway were up-regulated on exposure of leaf tissue to methyl jasmonate, similar to that reported in Asparagus racemosus, a saponin rich medicinal plant . Expression of HMG-CoA reductase, mevalonate kinase, mevalonate-5 diphosphate decarboxylase and isopentenyl-PP isomerase genes was maximum after 12 h of treatment and declined after 24 h of treatment while HMG-CoA synthase and cycloartenol synthase showed maximum expression after 24 h of treatment. The aglycone moiety of saponins is a triterpene derivative. Triterpenes are usually synthesized predominantly via mevalonate pathway of isoprenogenesis wherein HMGS catalyzes a step that not only holds the second degree of regulatory position but also produces a precursor for the primary regulatory step, the HMGR catalyzed reaction. The observed high expression of CAS is in alignment with the elevation in saponin biosynthesis through improved production of triterpene alcohols in preference to triterpene hydrocarbon like β-amyrin through carbon flux control at this branch point in favour of sterols. Failure to have a detectable level of expression of β-amyrin synthase is an avowal of this postulation. Aloin biosynthesis pathway related genes were also upregulated and showed maximum expression after 24 h of methyl jasmonate induction, indicating the correlation of both aloin and saponin with defense mechanism of Aloe vera. There was a little fold change in expression was noticed for the putative genes encoding lignin biosynthesis, showing no significant role of lignin in plant defense mechanism during stress conditions.
De novo assembly of transcriptome data in conjugation with DGE analysis served as a powerful approach for the identification of genes involved in the biosynthesis of important secondary metabolites pertaining to different chemical classes in Aloe vera - a non-model plant. The transcriptome database generated by this study will provide an important resource that may aid in identification and characterization of gene related to the specialized metabolism in the plant as well as understanding the function of gene set(s) in the biology and physiology of plant, metabolic pathways and their regulations, signal transduction mechanism, and marker-assisted breeding particularly for chemotype development in this species as well as other species of genus Aloe.
Aloe vera is well known plant used worldwide for its medicinal and cosmetic properties due to its specialized metabolic competence. However, despite significant knowledge on chemical composition and healthful properties, any significant information about its genomics is completely lacking. Therefore, in the present study, transcriptome sequence data for Aloe vera root and shoot was generated using NGS technology. The transcriptome sequences have been assembled, annotated and analyzed with special emphasis on secondary metabolism. The assembly and genes have been validated by gene expression analysis. The potential genes isolated can be exploited for characterization of metabolic understanding and modulation of saponin, aloin, carotenoid and lignin biosynthesis in Aloe vera. Identified transcription factors may be recruited to understand their relative regulatory significance across different metabolic processes in the plant and undertake metabolic engineering studies. To our knowledge, this would be the first transcriptome sequencing study of Aloe vera until now.
Sample preparation and Total RNA isolation
Aloe vera was grown in the herbal nursery at Guru Jambheshwar University of Science and Technology, Hisar, India. Young leaves and roots were collected from the healthy plant, snap freezed in liquid nitrogen and stored at − 80 °C for further use. Total RNA was isolated from each tissue using RNeasy Plant Mini Kit (Qiagen) according to manufacturer’s instructions. The quality of the isolated RNA was checked on 1% denaturing agarose gel for the presence of 28S and 18S bands. Further, the RNA quality and quantity was analysed by using Qubit fluorometer.
Total RNA isolated from the plant samples was used for the preparation of RNA-Seq paired end sequencing libraries with the help of TrueSeq® Stranded mRNA sample preparation kit (Illumina). Enrichment of mRNA from the total RNA was done with the help of poly-T attached magnetic beads which was followed by enzymatic fragmentation and 1st strand cDNA conversion. The second strand was then synthesised form the 1st strand using second strand mix and Act-D mix to facilitate RNA dependent synthesis. Then the double stranded cDNA samples were purified using AMPure XP beads (Agencourt Biosciences). These beads selectively binds larger double stranded cDNA samples and excess of primers, nucleotides, salts and enzymes were removed making the products free from any kind of contaminants. It was then followed by adapter ligation, A-tailing and enrichment by limited number of PCR cycles. The PCR amplified library was analyzed in Tape Station 4200 (Agilent Technologies) using High Sensitivity (HS) D5000 Screen Tape assay kit as per manufacturer instructions.
Sequencing and quality control
The cDNA library was then used for paired end sequencing using Illumina Hi-Seq platform (2 × 150 bp chemistry) to generate the raw data for both the samples. In paired end sequencing the template fragment is sequenced in both forward and reverse directions. The samples were allowed to bind with complementary adapter oligos on paired-end flow cell with the help of kit reagents. The adapters were designed in order to allow selective cleavage of the forward strands after re-synthesis of the reverse strand during sequencing. The opposite end of the fragment was then sequenced from the copied reverse strand. Prior to the assembly the raw data obtained was processed to obtain high quality reads. Trimmomatic v0.35 was used to remove adapter sequences, ambiguous reads (reads with unknown nucleotides “N” larger than 5%), and low-quality sequences (reads with more than 10% quality threshold (QV) < 20 phred score). A minimum length of 100 nt (nucleotide) after trimming was applied. After removing the adapter and low quality sequences from the raw data high quality reads were retained for root and leaf sample respectively. This high quality (QV > 20), paired-end reads were used for de-novo assembly.
De-novo transcriptome assembly, validation and CDS prediction
The high quality reads for both the samples were then assembled into transcripts using RNA-Seq assembler Trinity . While assembling the transcripts there is a chance that large amounts of misassembled transcripts, erroneous and poorly supported transcripts may arise, therefore, all high quality reads were assembled with their respective assembled transcripts using Burrows-Wheeler Aligner . The non-redundant transcripts were further clustered together using CD-HIT-EST-454  at 95% identity and query coverage. After the assembly and clustering of transcripts, sequences were obtained that could not be extended further, these sequences were termed as unigenes. The unigenes were then used to predict coding sequences within them using TransDecoder.
Gene ontology analysis
For the annotation, the predicted CDS were searched against NCBI non redundant(Nr) protein database (http://www.ncbi.nlm.nih.gov) using Basic local alignment search tool (BLASTx) with a common significance threshold cut-off of E-value ≤1e-05. Gene ontology (GO) annotations of the CDS were carried out with the help of Blast2GO program . The BLASTx result accession IDs were searched directly in the gene product table (dbxref) of GO database. The GO mapping differentiated the predicted CDS into three major domains representing gene product properties namely: Biological process, Molecular function and Cellular component. Each predicted CDS may have more than one GO term assigned either in the same domain or in different domains i.e. biological process, molecular function and cellular component .
Functional annotation of KEGG pathway
For the identification of possible involvement of the predicted CDS in various biological pathways, the CDS were mapped to the reference canonical pathways in Kyoto Encylopedia of Genes and Genomes (KEGG) database . Five major divisions under which the CDS were distributed included metabolism, genetic information processing, environmental information processing, cellular processes and organismal systems. The information obtained upon KEGG analysis included KEGG Orthology (KO) assignments, their corresponding enzyme commission (EC) number and prediction of metabolic pathway using KEGG automated annotation server KASS .
Abundance estimation and differential gene expression analysis (DGE)
FPKM (Fragments Per Kilobase of transcript per million mapped reads) values were calculated to measure the expression level of each assembled transcript sequence. For FPKM measurement a reference transcriptome was first generated by clustering both the samples unigenes i.e. leaf and root. The high-quality cleaned reads from each sample were aligned separately on reference transcriptome (clustered unigene of both the samples) using burrows wheeler aligner (bwa). The read count profile from the output file (.sam) of bwa alignment was generated by using SAMtools . Differential gene expression (DGE) analysis was performed employing a negative binomial distribution model (DESeqv1.8.1 package http://www-huber.embl.de/users/anders/DESeq/) . Dispersion values were calculated using following parameters: method = blind, sharing mode = fit-only and fit type = local. On the basis of log fold change (FC) the transcripts were further classified as up and down regulated. The log fold change value was calculated by using the formula: FC = Log2 (Treated/Control). Transcripts having FC value greater than zero were considered up-regulated whereas less than zero, were down-regulated. To obtain statistically significant results P value threshold of 0.05 was used. With the help of Multiple Experiment Viewer (MEV v4.9.0), a complete linkage hierarchical cluster analysis was performed on top 100 differentially expressed genes. A heat map (cluster) depicts the level of transcript abundance. Levels of expression are represented as log2 ratio of transcript abundance between leaf and root samples. A heat map was constructed employing the log-transformed and the normalized value of genes based on Pearson correlation distance as well as based on complete linkage method.
Transcription factor analysis
The predicted CDS were searched against Plant transcription factor database (PlantTFdb)  to obtain the transcription factors from both root and leaf CDS.
Gene validation with qRT-PCR
The transcripts obtained by sequencing were further validated by qRT-PCR. Sixteen unigenes involved in anthraquinone, saponin and lignin biosynthesis were selected for quantitative real-time expression. RNA was isolated with the help of CIA-PCIA method from the root and leaf samples as well as from the plant which was treated externally with methyl jasmonate (250 μM) at time interval of 6, 12 and 24 h. Isolated RNA was further reverse transcribed with the help of Revert Aid First Strand cDNA Synthesis Kit (Thermo Scientific) using oligo dT(18) primers. Specific primers were designed for sixteen unigenes and two housekeeping genes (GAPDH and beta tubulin) with the help of Primer Express software v3.0.1 (List of primers is given in Additional file 3). The qRT-PCR was carried out in triplicates using SYBR® Green Jump Start™ Taq Ready Mix™ (Sigma) on Applied Biosystems’ Step One™ Real Time PCR System. The reaction mixture used included 10 μl of SYBR green master mix, 20 pmol/μl forward and reverse primers and 2 μl of cDNA for a reaction volume of 20 μl. The thermal cycle used was as follows: initial denaturation at 95 °C for 20 s followed by denaturation and annealing at 95 °C for 3 s and 60 °C for 30 s for 40 cycles followed by melt curve analysis: 95 °C for 15 s, 60 °C for 1 min and 95 °C for 15 s. The relative expression levels were determined using 2-∆∆Ct method .
The author’s acknowledge Eurofins Genomics India Pvt. Ltd., Bengaluru, India for illumina sequencing and bioinformatics analysis. MR acknowledge the support of Haryana State Council of Science and Technology, Panchkula in the form of Junior Research Fellowship.
This work was supported by University Grants Commission, New Delhi, in the form of Major Research Project vide Grant No. F: 41–586/2012(SR).
Availability of data and materials
The dataset generated and /or analysed during the current study are available in the NCBI short sequence read archive (Accession number: SRR5167034 and SRR5161731) and Bioproject number: PRJNA359629.
VC conceived the research plan and designed the project. RK and AK helped to sample plant material and contributed in RNA extraction. PC and MR performed research and drafted manuscript. PC, MR, VC and RS participated in data analysis. VC and RS drew inferences from data, and critically improved and edited the manuscript. PC and MR contributed equally as first author to this manuscript. All authors have read and approved the final manuscript.
The authors declare that they have no competing interests.
Ethics approval and consent to participate
Aloe vera was grown in the herbal nursery at Guru Jambheshwar University of Science and Technology, Hisar, India under natural conditions according to institutional and national guidelines.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Fox LT, Gerber M, Preez JL, Plessis JD, Hamman JH. Skin permeation enhancement effects of the gel and whole-leaf materials of Aloe vera, Aloe marlothii and Aloe ferox. J Pharm Pharmacol. 2015;67(1):96–106.View ArticlePubMedGoogle Scholar
- Pugh N, Ross SA, ElSohly MA, Pasco DS. Characterization of Aloeride, a new high-molecular-weight polysaccharide from Aloe vera with potent immunostimulatory activity. J Agric Food Chem. 2001;49(2):1030–4.View ArticlePubMedGoogle Scholar
- Chen W, Van Wyk BE, Vermaak I, Viljoen AM. Cape aloes—a review of the phytochemistry, pharmacology and commercialisation of Aloe ferox. Phytochem Lett. 2012;5(1):1–2.View ArticleGoogle Scholar
- Tarameshloo M, Norouzian M, Zarein-Dolab S, Dadpay M, Mohsenifar J, Gazor R. Aloe vera gel and thyroid hormone cream may improve wound healing in Wistar rats. Anat Cell Biol. 2012;45(3):170–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Langmead L, Makins RJ, Rampton DS. Anti-inflammatory effects of aloe vera gel in human colorectal mucosa in vitro. Aliment Pharmacol Ther. 2004;19(5):521–7.View ArticlePubMedGoogle Scholar
- Chandan BK, Saxena AK, Shukla S, Sharma N, Gupta DK, Suri KA, Suri J, Bhadauria M, Singh B. Hepatoprotective potential of Aloe barbadensis mill. Against carbon tetrachloride induced hepatotoxicity. J Ethnopharmacol. 2007;111(3):560–6.View ArticlePubMedGoogle Scholar
- Lin JG, Chen GW, Li TM, Chouh ST, Tan TW, Chung JG. Aloe-emodin induces apoptosis in T24 human bladder cancer cells through the p53 dependent apoptotic pathway. J Urol. 2006;175(1):343–7.View ArticlePubMedGoogle Scholar
- Borrelli F, Izzo AA. The plant kingdom as a source of anti-ulcer remedies. Phytother Res. 2000;14(8):581–91.View ArticlePubMedGoogle Scholar
- Kumar M, Rakesh S, Nagpal R, Hemalatha R, Ramakrishna A, Sudarshan V, Ramagoni R, Shujauddin M, Verma V, Kumar A, Tiwari A. Probiotic lactobacillus rhamnosus GG and Aloe vera gel improve lipid profiles in hypercholesterolemic rats. Nutrition. 2013;29(3):574–9.View ArticlePubMedGoogle Scholar
- Huseini HF, Kianbakht S, Hajiaghaee R, Dabaghian FH. Anti-hyperglycemic and anti-hypercholesterolemic effects of Aloe vera leaf gel in hyperlipidemic type 2 diabetic patients: a randomized double-blind placebo-controlled clinical trial. Planta Med. 2012;78(04):311–6.View ArticlePubMedGoogle Scholar
- Qadir MI. Medicinal and cosmetological importance of Aloe vera. Int J Nat Ther. 2009;2:21–6.Google Scholar
- Miladi S, Damak M. In vitro antioxidant activities of Aloe vera leaf skin extracts. J Soc Chim Tunisie. 2008;10(10):101–9.Google Scholar
- Yamaguchi I, Mega N, Sanada H. Components of the gel of Aloe vera (L.) Bunn. F. Biosci Biotechnol Biochem. 1993;57(8):1350–2.View ArticlePubMedGoogle Scholar
- Danhof IE. Potential reversal of chronological and photo-aging of the skin by topical application of natural substances. Phytother Res. 1993;7(7)Google Scholar
- Wynn RL. Aloe vera gel: Update for dentistry. Gen Dent. 2005;53(1):6–9.PubMedGoogle Scholar
- Dixon RA. Natural products and plant disease resistance. Nature. 2001;411(6839):843–7.View ArticlePubMedGoogle Scholar
- Radha MH, Laxmipriya NP. Evaluation of biological properties and clinical effectiveness of Aloe vera: a systematic review. Journal of traditional and complementary medicine. 2015;5(1):21–6.View ArticlePubMedGoogle Scholar
- Osbourn AE. Preformed antimicrobial compounds and plant defense against fungal attack. Plant Cell. 1996;8(10):1821.View ArticlePubMedPubMed CentralGoogle Scholar
- Kuzina V, Ekstrøm CT, Andersen SB, Nielsen JK, Olsen CE, Bak S. Identification of defense compounds in Barbarea vulgaris against the herbivore Phyllotreta nemorum by an ecometabolomic approach. Plant Physiol. 2009;151(4):1977–90.View ArticlePubMedPubMed CentralGoogle Scholar
- Szakiel A, Pączkowski C, Henry M. Influence of environmental abiotic factors on the content of saponins in plants. Phytochem Rev. 2011;10(4):471–91.View ArticleGoogle Scholar
- Atherton DP. Aloe Vera myth or medicine. Ramsgate: Positive Health Publications; 2002.Google Scholar
- Hamman JH. Composition and applications of Aloe vera leaf gel. Molecules. 2008;13(8):1599–616.View ArticlePubMedGoogle Scholar
- Habeeb F, Shakir E, Bradbury F, Cameron P, Taravati MR, Drummond AJ, Gray AI, Ferro VA. Screening methods used to determine the anti-microbial properties of Aloe vera inner gel. Methods. 2007;42(4):315–20.View ArticlePubMedGoogle Scholar
- Global demand of Aloe vera extracts to reach 60,720 tonnes in 2016; emergence of innovative, high quality and cost effective products ramping up adoption. http://www.futuremarketinsights.com/press-release/aloe-vera-extracts-market. Accessed 22 Dec 2016.
- Sawada Y, Toyooka K, Kuwahara A, Sakata A, Nagano M, Saito K, Hirai MY. Arabidopsis bile acid: sodium symporter family protein 5 is involved in methionine-derived glucosinolate biosynthesis. Plant Cell Physiol. 2009;50(9):1579–86.View ArticlePubMedPubMed CentralGoogle Scholar
- Sun C, Li Y, Wu Q, Luo H, Sun Y, Song J, Lui EM, Chen S. De novo sequencing and analysis of the American ginseng root transcriptome using a GS FLX titanium platform to discover putative genes involved in ginsenoside biosynthesis. BMC Genomics. 2010;11(1):262.View ArticlePubMedPubMed CentralGoogle Scholar
- Mizrachi E, Hefer CA, Ranik M, Joubert F, Myburg AA. De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq. BMC Genomics 2010 1;11(1):681.Google Scholar
- Liu JP, Xia ZQ, Tian XY, Li YJ. Transcriptome sequencing and analysis of rubber tree (Hevea brasiliensis Muell.) to discover putative genes associated with tapping panel dryness (TPD). BMC Genomics. 2015;16(1):398.View ArticlePubMedPubMed CentralGoogle Scholar
- Garcia-Seco D, Zhang Y, Gutierrez-Mañero FJ, Martin C, Ramos-Solano B. RNA-Seq analysis and transcriptome assembly for blackberry (Rubus sp. Var. Lochness) fruit. BMC Genomics. 2015;16(1):5.View ArticlePubMedPubMed CentralGoogle Scholar
- Guo Q, Ma X, Wei S, Qiu D, Wilson IW, Wu P, Tang Q, Liu L, Dong S, Zu W. De novo transcriptome sequencing and digital gene expression analysis predict biosynthetic pathway of rhynchophylline and isorhynchophylline from Uncaria rhynchophylla, a non-model plant with potent anti-alzheimer’s properties. BMC Genomics. 2014;15(1):676.View ArticlePubMedPubMed CentralGoogle Scholar
- Czaban A, Sharma S, Byrne SL, Spannagl M, Mayer KF, Asp T. Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation. BMC Genomics. 2015;16(1):249.View ArticlePubMedPubMed CentralGoogle Scholar
- Cannarozzi G, Plaza-Wüthrich S, Esfeld K, Larti S, Wilson YS, Girma D, de Castro E, Chanyalew S, Blösch R, Farinelli L, Lyons E. Genome and transcriptome sequencing identifies breeding targets in the orphan crop tef (Eragrostis tef). BMC Genomics. 2014;15(1):581.View ArticlePubMedPubMed CentralGoogle Scholar
- Wei W, Qi X, Wang L, Zhang Y, Hua W, Li D, Lv H, Zhang X. Characterization of the sesame (Sesamum indicum L.) global transcriptome using Illumina paired-end sequencing and development of EST-SSR markers. BMC Genomics. 2011;12(1):451.View ArticlePubMedPubMed CentralGoogle Scholar
- Liu Z, Chen T, Ma L, Zhao Z, Zhao PX, Nan Z, Wang Y. Global transcriptome sequencing using the Illumina platform and the development of EST-SSR markers in autotetraploid alfalfa. PLoS One. 2013;8(12):e83549.View ArticlePubMedPubMed CentralGoogle Scholar
- Wang Z, Fang B, Chen J, Zhang X, Luo Z, Huang L, Chen X, Li Y. De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweetpotato (Ipomoea batatas). BMC Genomics. 2010;11(1):726.View ArticlePubMedPubMed CentralGoogle Scholar
- Sangwan RS, Tripathi S, Singh J, Narnoliya LK, Sangwan NS. De novo sequencing and assembly of Centella asiatica leaf transcriptome for mapping of structural, functional and regulatory genes with special reference to secondary metabolism. Gene. 2013;525(1):58–76.View ArticlePubMedGoogle Scholar
- Rastogi S, Meena S, Bhattacharya A, Ghosh S, Shukla RK, Sangwan NS, Lal RK, Gupta MM, Lavania UC, Gupta V, Nagegowda DA. De novo sequencing and comparative analysis of holy and sweet basil transcriptomes. BMC Genomics. 2014;15(1):588.View ArticlePubMedPubMed CentralGoogle Scholar
- Xu Y, Li X, Lin J, Wang Z, Yang Q, Chang Y. Transcriptome sequencing and analysis of major genes involved in calcium signaling pathways in pear plants (Pyrus calleryana Decne.). BMC Genomics. 2015;16(1):738.View ArticlePubMedPubMed CentralGoogle Scholar
- Narnoliya LK, Kaushal G, Singh SP, Sangwan RS. De novo transcriptome analysis of rose-scented geranium provides insights into the metabolic specificity of terpene and tartaric acid biosynthesis. BMC Genomics. 2017;18(1):74.View ArticlePubMedPubMed CentralGoogle Scholar
- Gupta P, Goel R, Pathak S, Srivastava A, Singh SP, Sangwan RS, Asif MH, Trivedi PK. De novo assembly, functional annotation and comparative analysis of Withania somnifera leaf and root transcriptomes to identify putative genes involved in the withanolides biosynthesis. PLoS One. 2013;8(5):e62714.View ArticlePubMedPubMed CentralGoogle Scholar
- Jozefczuk J, Adjaye J. 6 quantitative real-time PCR-based analysis of gene expression. Methods Enzymol. 2011;500:99.View ArticlePubMedGoogle Scholar
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–52.View ArticlePubMedPubMed CentralGoogle Scholar
- Mazid M, Khan TA, Mohammad F. Role of secondary metabolites in defense mechanisms of plants. Biology and medicine. 2011;3(2):232–49.Google Scholar
- Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2− ΔΔCT method. Methods. 2001;25(4):402–8.View ArticlePubMedGoogle Scholar
- Search NCBI databases. https://www.ncbi.nlm.nih.gov/gquery/?term=Aloe+vera. Accessed 20 Feb 2017.
- Li CF, Zhu Y, Yu Y, Zhao QY, Wang SJ, Wang XC, Yao MZ, Luo D, Li X, Chen L, Yang YJ. Global transcriptome and gene regulation network for secondary metabolite biosynthesis of tea plant (Camellia sinensis). BMC Genomics. 2015;16(1):560.View ArticlePubMedPubMed CentralGoogle Scholar
- Yang CQ, Fang X, Wu XM, Mao YB, Wang LJ, Chen XY. Transcriptional regulation of plant secondary metabolismF. J Integr Plant Biol. 2012;54(10):703–12.View ArticlePubMedGoogle Scholar
- Schluttenhofer C, Yuan L. Regulation of specialized metabolism by WRKY transcription factors. Plant Physiol. 2015;167(2):295–306.View ArticlePubMedGoogle Scholar
- Yoon J, Choi H, An G. Roles of lignin biosynthesis and regulatory genes in plant development. J Integr Plant Biol. 2015;57(11):902–12.View ArticlePubMedPubMed CentralGoogle Scholar
- Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2015;44(D1):D457–62.Google Scholar
- Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35(suppl 2):W182–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Eshun K, He Q. Aloe vera: a valuable ingredient for the food, pharmaceutical and cosmetic industries—a review. Crit Rev Food Sci Nutr. 2004;44(2):91–6.View ArticlePubMedGoogle Scholar
- Luta G, McAnalley B. Aloe vera: chemical composition and methods used to determine its presence in commercial products. GlycoScience Nutrition. 2005;6:1–2.Google Scholar
- Sparg S, Light ME, Van Staden J. Biological activities and distribution of plant saponins. J Ethnopharmacol. 2004;94(2):219–43.View ArticlePubMedGoogle Scholar
- Moses T, Pollier J, Almagro L, Buyst D, Van Montagu M, Pedreño MA, Martins JC, Thevelein JM, Goossens A. Combinatorial biosynthesis of sapogenins and saponins in Saccharomyces cerevisiae using a C-16α hydroxylase from Bupleurum falcatum. Proc Natl Acad Sci. 2014;111(4):1634–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Su H, Liu Y, Xiao Y, Tan Y, Gu Y, Liang B, Huang H, Wu Y. Molecular and biochemical characterization of squalene synthase from Siraitia grosvenorii. Biotechnol Lett 2017 28:1–0.Google Scholar
- Kalra S, Kumar S, Lakhanpal N, Kaur J, Singh K. Characterization of squalene synthase gene from Chlorophytum borivilianum (Sant. And Fernand.). Mol Biotechnol. 2013;54(3):944–53.View ArticlePubMedGoogle Scholar
- Ye Y, Wang R, Jin L, Shen J, Li X, Yang T, Zhou M, Yang Z, Chen Y. Molecular cloning and differential expression analysis of a squalene synthase gene from Dioscorea zingiberensis, an important pharmaceutical plant. Mol Biol Rep. 2014;41(9):6097–104.View ArticlePubMedGoogle Scholar
- Abe I, Abe T, Lou W, Masuoka T, Noguchi H. Site-directed mutagenesis of conserved aromatic residues in rat squalene epoxidase. Biochem Biophys Res Commun. 2007;352(1):259–63.View ArticlePubMedGoogle Scholar
- Kumar V, Kumar CS, Hari G, Venugopal NK, Vijendra PD, Basappa G. Homology modeling and docking studies on oxidosqualene cyclases associated with primary and secondary metabolism of Centella asiatica. Springerplus. 2013;2(1):189.View ArticlePubMedPubMed CentralGoogle Scholar
- Joffrion TM, Collins MS, Sesterhenn T, Cushion MT. Functional characterization and localization of pneumocystis carinii lanosterol synthase. Eukaryot Cell. 2010;9(1):107–15.View ArticlePubMedGoogle Scholar
- Zhu ZH, Liu WH, Ge LJ, Yu Q, Zhao WC, Yang JH. Molecular cloning and characterization of a cDNA encoding cycloartenol synthase from Fritillaria thunbergii Miq. Afr J Biotechnol. 2012;11(26):6896–903.Google Scholar
- Hayashi H, Huang P, Takada S, Obinata M, Inoue K, Shibuya M, Ebizuka Y. Differential expression of three oxidosqualene cyclase mRNAs in Glycyrrhiza glabra. Biol Pharm Bull. 2004;27(7):1086–92.View ArticlePubMedGoogle Scholar
- Kajikawa M, Yamato KT, Fukuzawa H, Sakai Y, Uchida H, Ohyama K. Cloning and characterization of a cDNA encoding β-amyrin synthase from petroleum plant Euphorbia tirucalli L. Phytochemistry. 2005;66(15):1759–66.View ArticlePubMedGoogle Scholar
- Zheng X, Luo X, Ye G, Chen Y, Ji X, Wen L, Xu Y, Xu H, Zhan R, Chen W. Characterisation of two oxidosqualene cyclases responsible for triterpenoid biosynthesis in Ilex asprella. Int J Mol Sci. 2015;16(2):3564–78.View ArticlePubMedPubMed CentralGoogle Scholar
- Ross J, Li Y, Lim EK, Bowles DJ. Higher plant glycosyltransferases. Genome Biol. 2001;2(2):reviews3004.1–3004.6.Google Scholar
- Kumar R, Naik PK, Kumar A, Aggarwal H, Kumar A, Chhokar VA. Combined approach using RAPD, ISSR and bioactive compound for the assessment of genetic diversity in Aloe vera (L.) Burm. F. Indian J Biotechnol. 2016;15:538–49.Google Scholar
- Chung JH, Cheong JC, Lee JY, Roh HK, Cha YN. Acceleration of the alcohol oxidation rate in rats with aloin, a quinone derivative of Aloe. Biochem Pharmacol. 1996;52(9):1461–8.View ArticlePubMedGoogle Scholar
- Tian B, Hua YJ, Ma XQ, Wang GL. Relationship between antibacterial activity of aloe and its anthaquinone compounds. Zhongguo Zhong yao za zhi. 2003;28(11):1034–7.PubMedGoogle Scholar
- Abe I, Oguro S, Utsumi Y, Sano Y, Noguchi H. Engineered biosynthesis of plant polyketides: chain length control in an octaketide-producing plant type III polyketide synthase. J Am Chem Soc. 2005;127(36):12709–16.View ArticlePubMedGoogle Scholar
- Grün M, Franz G. In vitro biosynthesis of the C-glycosidic bond in aloin. Planta. 1981;152(6):562–4.View ArticlePubMedGoogle Scholar
- Cunningham FX Jr, Gantt E. Genes and enzymes of carotenoid biosynthesis in plants. Annu Rev Plant Biol. 1998;49(1):557–83.View ArticleGoogle Scholar
- Cazzonelli CI, Pogson BJ. Source to sink: regulation of carotenoid biosynthesis in plants. Trends Plant Sci. 2010;15(5):266–74.View ArticlePubMedGoogle Scholar
- Eveland AL, Satoh-Nagasawa N, Goldshmidt A, Meyer S, Beatty M, Sakai H, Ware D, Jackson D. Digital gene expression signatures for maize development. Plant Physiol. 2010;154(3):1024–39.View ArticlePubMedPubMed CentralGoogle Scholar
- Upadhyay S, Phukan UJ, Mishra S, Shukla RK. De novo leaf and root transcriptome analysis identified novel genes involved in steroidal sapogenin biosynthesis in Asparagus racemosus. BMC Genomics. 2014;15(1):746.View ArticlePubMedPubMed CentralGoogle Scholar
- Li H, Durbin R. Fast and accurate long-read alignment with burrows–wheeler transform. Bioinformatics. 2010;26(5):589–95.View ArticlePubMedPubMed CentralGoogle Scholar
- Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28(23):3150–2.View ArticlePubMedPubMed CentralGoogle Scholar
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.View ArticlePubMedGoogle Scholar
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004;32(suppl 1):D277–80.View ArticlePubMedPubMed CentralGoogle Scholar
- KAAS Job Request. http://www.genome.jp/kaas-bin/kaas_main. Accessed 5 June 2016.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106.View ArticlePubMedPubMed CentralGoogle Scholar
- Jin J, Zhang H, Kong L, Gao G, Luo J. PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Res. 2013;42(D1):D1182–7.View ArticlePubMedPubMed CentralGoogle Scholar