Global transcriptome analysis of Huperzia serrata and identification of critical genes involved in the biosynthesis of huperzine A
© The Author(s). 2017
Received: 3 January 2017
Accepted: 10 March 2017
Published: 22 March 2017
Huperzia serrata (H. serrata) is an economically important traditional Chinese herb with notable medicinal value. As a representative member of the Lycopodiaceae family, H. serrata produces various types of bioactive lycopodium alkaloids, especially huperzine A (HupA), which is a promising drug for Alzheimer’s disease. Despite its medicinal importance, public genomic and transcriptomic resources are very limited and the biosynthesis of HupA is largely unknown. A previous comparison of 454-ESTs from H. serrata and Phlegmariurus carinatus predicted putative genes involved in lycopodium alkaloid biosynthesis, such as a lysine decarboxylase-like (LDC-like) protein and some CYP450s. However, these gene annotations were not followed up with biochemical characterization. To understand the biosynthesis of HupA and its regulation in H. serrata, a global transcriptome analysis of H. serrata tissues was performed.
In this study, we used the Illumina HiSeq4000 platform to generate a substantial RNA sequencing dataset of H. serrata. A total of 40.1 Gb of clean data was generated from four different tissues (root, stem, leaf, and sporangia) and assembled into 181,141 unigenes. The total length, average length, N50 and GC content of the unigenes were 219,520,611 bp, 1,211 bp, 2,488 bp and 42.51%, respectively. Among them, 105,516 unigenes (58.25%) were annotated using seven public databases (NR, NT, Swiss-Prot, KEGG, COG, InterPro, GO), and 54 GO terms and 3,391 transcription factors (TFs) were functionally classified. KEGG pathway analysis revealed that 72,230 unigenes were classified into 21 functional pathways. Three types of candidate enzymes responsible for the biosynthesis of HupA precursors, LDC, CAO and PKS, were all identified in the transcripts. Four hundred and fifty-seven CYP450 genes in H. serrata were also analyzed and compared with respect to tissue-specific expression. Moreover, two key classes of CYP450 genes, BBE and SLS, with 23 members in total, were further evaluated for their role in modifying the lycopodium alkaloid scaffold in the two late stages of HupA biosynthesis.
This study is the first report of a global transcriptome analysis of all tissues of H. serrata, in which critical genes involved in the biosynthesis of precursors and in scaffold modifications of HupA were discovered and predicted. The transcriptome data from this work not only provide an important resource for further investigation of metabolic pathways in H. serrata, but also shed light on synthetic biology studies of HupA.
Keywords: Transcriptome; Huperzia serrata; Biosynthetic pathway; Huperzine A; Lycopodium alkaloid
Huperzia serrata (H. serrata) is a model member of the Huperzia (Phlegmariurus) genus, which belongs to the plant family Lycopodiaceae, with a total of about 500 species worldwide [1, 2]. The whole plant of H. serrata, named Qian Ceng Ta (in Chinese), is one of the oldest medicinally important traditional Chinese herbs, in use since 739 (during the Tang Dynasty), and has been extensively used for the treatment of a number of ailments, including contusions, strains, swellings, schizophrenia, myasthenia gravis and organophosphate poisoning. These pharmaceutical applications of H. serrata are mainly due to its biologically active lycopodium alkaloids. Four classes of lycopodium alkaloids with diverse chemical structures, namely lycopodine-type, lycodine-type, fawcettimine-type, and a set of miscellaneous-type compounds, have been isolated from H. serrata, which makes it a unique model plant for studying the biosynthesis of lycopodium alkaloids.
The limited transcriptomic data hamper the biosynthetic study of active lycopodium alkaloids in H. serrata, and the biosynthetic pathway of HupA remains to be elucidated. The Illumina HiSeq4000 platform provides efficient high-throughput next-generation RNA-Seq and has been used for transcriptomic profiling of non-model organisms with no available genomic data. In the current study, a global transcriptome analysis was designed to investigate the full gene content of H. serrata and characterize expression profiles in differentiated tissues (root, stem, leaf, and sporangia). Our work produced a total of 300 million sequence reads (40.1 Gb of clean data), resulting in 181,141 assembled unigenes. Genes involved in the precursor and late stages of HupA biosynthesis were identified and predicted. The results of the current work will serve as a valuable public resource facilitating synthetic biology research on bioactive lycopodium alkaloids of traditional herbs.
Results and discussion
Sample preparation and Illumina sequencing
Table: Overview of the sequencing and assembly of the transcriptome of H. serrata. Columns: No. of reads (bytes); No. of bases (bp). Rows: Leaf-1, Stem-1, Root-1 and Sporangia-1 (from Hs-1); Leaf-2 and Stem-2 (from Hs-2); total raw data; total high-quality data; average length of unigenes; unigenes ≥ 300 bp.
De novo transcriptome assembly, gene expression comparison among tissues, and comparison with previous 454-ESTs report
The total of 267,314,012 high-quality reads is three orders of magnitude greater than previously reported for H. serrata, where 140,930 high-quality reads (57 Mb of data) were generated using the Roche 454 GS FLX Titanium system (Additional file 2) [8, 9]. The reads from the 454 platform were assembled into 14,085 contigs with an average length of 608 bp using GS De Novo Assembler software v2.0.01 (454 Life Sciences, Roche). With the higher throughput of the Illumina HiSeq platform, more and longer contigs were produced: our clean reads were assembled into 830,623 contigs with an average length of 812 bp. In total, 181,141 unigenes were generated and 58.25% of them were annotated, suggesting that the disadvantage of short-read assembly was, to some extent, offset by the larger amount of sequence data.
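As an illustration of how assembly summary statistics such as the total length, average length and N50 quoted for the unigenes are conventionally derived from a list of sequence lengths, a minimal Python sketch follows. The function name `assembly_stats` and the toy lengths are hypothetical and are not taken from the authors' pipeline. N50 is used here in its usual sense: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length.

```python
def assembly_stats(lengths):
    """Return (total_length, mean_length, n50) for a list of sequence lengths in bp."""
    if not lengths:
        raise ValueError("no sequences given")
    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the cumulative sum to >= 50% of the total.
        if 2 * running >= total:
            n50 = length
            break
    return total, total / len(ordered), n50


# Toy example with made-up lengths (not data from the study).
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
total, mean_len, n50 = assembly_stats(toy_lengths)
print(total, round(mean_len, 1), n50)
```

Applied to the reported figures, 219,520,611 bp over 181,141 unigenes gives an average of about 1,211.9 bp, in line with the reported average length of 1,211 bp.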
Function annotation of transcriptome for H. serrata
Summary of unigenes annotations
GO and KEGG classification
These annotations and classifications provide a resource for investigating specific pathways in H. serrata, such as the lycopodium alkaloid biosynthetic pathway. Because lycopodium alkaloids are lysine-derived alkaloids, the 3,768, 1,858, and 1,333 unigenes clustered into “amino acid metabolism”, “biosynthesis of other secondary metabolites”, and “metabolism of terpenoids and polyketides”, respectively, might be involved in the biosynthesis and metabolism of lycopodium alkaloids.
Transcription factor (TF)
Genes involved in the biosynthesis of lycopodium alkaloids and HupA
To identify active biological pathways in H. serrata, the sequences were mapped to the reference pathways in KEGG. Most metabolism-related genes were involved in primary metabolism, such as carbohydrate metabolism, lipid metabolism, amino acid metabolism, and energy metabolism. A total of 1,858 genes were grouped in the “Biosynthesis of other secondary metabolites” category, which are important contributors to the biosynthesis of highly valuable secondary metabolites. In particular, 140 genes were mapped to “tropane, piperidine and pyridine alkaloid biosynthesis” and 119 genes were mapped to “isoquinoline alkaloid biosynthesis”; these would be useful for defining the metabolic pathways and genes for lycopodium alkaloid synthesis in H. serrata (Fig. 6b).
Possible unigenes and encoded enzymes involved in biosynthesis of the HupA precursor
Degenerate primers method
Unigenes from transcriptomic data
AB915696.1, AB915697.1 and HsLDC-X1 to -X6
unigene96617 and unigene94988
CL4248.1, CL4248.2, CL4248.3
HsPKS1 (ABI94386.1), HsPKS2, and HsPKS3
unigene393, CL2724.2, and unigene394
Validation of RNA-Seq results by qRT-PCR
qRT-PCR was performed to validate four types of differentially expressed genes identified by RNA-Seq in the four tissues of H. serrata. With tubulin (Unigene50132) as the internal control, seven selected genes involved in HupA biosynthesis, including HsLDC, HsCAO, HsPKS and CYP450s (SLS and BBE classes), were evaluated. The validation results were consistent with the gene expression patterns identified by RNA-Seq (Additional file 10), although the expression levels of Unigene25121 were slightly higher by qRT-PCR than by RNA-Seq. These results support the fidelity and reproducibility of the RNA-Seq analysis used in the present study.
We conducted deep RNA sequencing of four tissues of H. serrata and generated a total of 300 million reads. In total, 181,141 unigenes were assembled, of which nearly 60% were successfully annotated. The data offer comprehensive coverage of the H. serrata transcriptome and pave the way for elucidating the biosynthetic pathways of lycopodium alkaloids such as HupA. The three types of biochemically confirmed enzymes in the biosynthesis of the precursors, LDC, CAO and PKS, were all identified in this study. Moreover, a large number of CYP450s involved in secondary metabolic pathways were evaluated. We predict that the BBE and SLS types of CYP450s are involved in ring closure and ring cleavage in the biosynthesis of HupA. Further studies are needed to elucidate the CYP450s involved in ring formation and oxidative modification in the biosynthesis of HupA. The study provides valuable resources for bioengineering and synthetic biology studies of the lycopodium alkaloids.
Plant materials and treatments
Two independent H. serrata plants (Hs-1 and Hs-2) were collected from Xiangxi, Hunan, China, in December 2015 and identified by Dr. Zhu Mulan. The plants were carefully rinsed in running tap water and soil was removed by hand. Root, stem, leaf, and sporangia were placed in collection tubes immediately after being separated from the plant, immersed in liquid nitrogen, and then stored at -80 °C until further use.
RNA isolation, cDNA library construction and Illumina sequencing
Total RNA was extracted from four different tissues of H. serrata (root, stem, leaf and sporangia) with the TIANGEN RNAprep Pure Plant Kit. DNase I was used to digest contaminating DNA. The purified total RNA was quantified using Nanodrop, Agilent 2100, and agarose gel electrophoresis. Oligo(dT) was used to isolate mRNA, which was then fragmented. cDNA was synthesized using the mRNA fragments as templates. Short fragments were purified and resuspended in EB buffer for end repair and addition of a single adenine (A) nucleotide, followed by adapter ligation. Suitable fragments were selected for PCR amplification. The Agilent 2100 Bioanalyzer and the ABI StepOnePlus Real-Time PCR System were used for quantification and qualification of the sample library. The cDNA library was sequenced from both the 5’ and 3’ ends on the Illumina HiSeq4000 platform with a paired-end sequencing length of 150 bp according to the manufacturer’s instructions.
De novo assembly and mapping of sequencing reads and analysis
Trinity was used to perform de novo assembly with clean reads from which PCR duplicates had been removed (in order to improve efficiency), and Tgicl was used to cluster transcripts into unigenes. After assembly, clean reads were mapped to the unigenes using Bowtie2, and gene expression levels were then calculated with RSEM. To assess gene expression abundance, the expression levels of unigenes in the four tissues were measured as FPKM values, with FPKM ≥ 0.5 used as a cut-off.
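The FPKM measure and the cut-off mentioned above can be sketched in a few lines of Python. This is only the textbook definition (fragments per kilobase of transcript per million mapped fragments); RSEM itself estimates abundances with a statistical model and effective transcript lengths, so its values will generally differ from this simple ratio. The function names and the example numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expressed_tissues(fpkm_by_tissue, cutoff=0.5):
    """Return the tissues in which a unigene reaches the FPKM cut-off (0.5 in the text)."""
    return [tissue for tissue, value in fpkm_by_tissue.items() if value >= cutoff]


# Made-up example: 500 fragments on a 2,000 bp unigene in a library of 20 million mapped fragments.
print(fpkm(500, 2000, 20_000_000))  # 12.5

# Made-up FPKM values for one unigene across the four tissues.
print(expressed_tissues({"root": 0.2, "stem": 0.5, "leaf": 3.1, "sporangia": 0.0}))
```

Normalizing by both transcript length and library size is what allows expression values to be compared between unigenes of different lengths and between tissues sequenced to different depths.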
Functional annotation and classification
All assembled unigenes were searched against the Nr and Swiss-Prot databases using BLASTX. Unigenes were also compared with COG and KEGG using BLASTX. InterPro domains were annotated with InterProScan5, and functional assignments were mapped onto the GO database.
Amino acid sequences were aligned using the CLUSTAL W program, evolutionary distances were computed using the Poisson correction method, and a Neighbor-Joining (NJ) tree was constructed with MEGA6. Bootstrap values, expressed as percentages obtained from 1000 replications, are given on the branches.
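For readers unfamiliar with the Poisson correction, the distance between two aligned protein sequences is d = -ln(1 - p), where p is the proportion of aligned sites at which the residues differ. The Python sketch below illustrates this formula only; it is not MEGA6's implementation, the handling of gaps (sites with a gap in either sequence are skipped) is an assumption, and the two short sequences are invented.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences, skipping gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Invented aligned peptides differing at 2 of 10 sites (p = 0.2).
print(round(poisson_distance("MKTAYIAKQR", "MKTGYIAKHR"), 4))
```

The correction accounts for multiple substitutions at the same site, so d is always at least as large as p and grows faster as sequences diverge. A matrix of such pairwise distances is the input to Neighbor-Joining tree construction.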
Quantitative real-time PCR
Quantitative real-time PCR (qRT-PCR) amplification was performed to validate the RNA-Seq data with the designed primers (Additional file 10). The experiment was conducted on a CFX Real-Time PCR Detection System (Bio-Rad, USA) using the SYBR® Premix Ex Taq™ kit (TaKaRa) and was repeated three times. The mean value of the three replicates was normalized using tubulin (Unigene50132) as the internal control. PCR mixtures (final volume, 25.0 μL) contained 200 ng of cDNA, 0.200 μM of each primer, 8.00 μL of sterile water, and 12.5 μL of SYBR Green Premix Ex Taq (TaKaRa). The amplification conditions were as follows: denaturation at 95 °C for 10 min, followed by 40 cycles of 95 °C for 10 s, 57 °C for 20 s, and 72 °C for 20 s. Melting curves were determined from 60 °C to 95 °C at 0.5 °C/min.
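The text states that the mean of three replicates was normalized to tubulin but does not give the formula used. One widely used option, assumed here purely for illustration, is the 2^(-ΔCt) method, in which ΔCt is the mean threshold cycle of the target gene minus that of the reference gene. The function name and all Ct values in this Python sketch are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct):
    """Relative expression by the 2^(-dCt) method (an assumed method, not stated in the text).

    target_ct and reference_ct are lists of replicate threshold-cycle (Ct) values
    for the gene of interest and for the internal control (tubulin in the text).
    """
    if not target_ct or not reference_ct:
        raise ValueError("at least one Ct value is needed for each gene")
    delta_ct = mean(target_ct) - mean(reference_ct)
    return 2.0 ** (-delta_ct)


# Invented Ct values for three technical replicates.
target = [24.0, 24.2, 23.8]      # mean 24.0
reference = [20.0, 20.1, 19.9]   # mean 20.0
print(round(relative_expression(target, reference), 4))
```

With a ΔCt of 4 cycles, the target transcript is estimated at roughly one sixteenth of the reference level, which assumes close to 100% amplification efficiency for both primer pairs.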
Abbreviations
CAO: Copper amine oxidase
CDS: Coding DNA sequence
COG: Clusters of orthologous groups of proteins database
FPKM: Fragments per kilobase per million mapped reads
GO: Gene ontology databases
KEGG: Kyoto encyclopedia of genes and genomes
NCBI: National center for biotechnology information
NT: NCBI non-redundant nucleotide database
NR: NCBI non-redundant protein database
SIB: Swiss institute of bioinformatics
We thank Dr. Lei Lei for RNA sample preparation assistance.
This work was financially supported by the Science and Technology Commission of Shanghai Municipality (Grant 15JC1400402), CAS-JIC Centre of Excellence in Plant and Microbial Sciences (CEPAMS) funding, the “Thousand Talents Program” young investigator award, and the National Natural Science Foundation of China (Grant 21572243).
Availability of data and materials
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive in the BIG Data Center (http://bigd.big.ac.cn/gsa), Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number PRJCA000351, and are publicly accessible at http://gsa.big.ac.cn:80/preview/preview.action?code=0KDhCN95.
The authors declare that they have no competing interests.
YX conceived the study and wrote the manuscript. MY performed most of the experiments and data analysis. WY and SW performed qRT-PCR. ZF, BX and XL participated in data analysis and coordination of experiments. MZ collected the samples. All authors read, commented on, and approved the final manuscript.
Consent for publication
Huperzia serrata is a wild plant that cannot be artificially cultivated in the field. Huperzia serrata is not listed in Appendices I, II and III of the Convention on International Trade in Endangered Species of Wild Fauna and Flora, valid from 2 January 2017 (website: https://cites.org/eng/app/appendices.php). The collection of the plant material complied strictly with Chinese and international guidelines. The Huperzia serrata plants in this study were collected from Xiangxi, Hunan, China. A specimen of Huperzia serrata was deposited in the public Chinese Virtual Herbarium (CVH) in 2008 (Herbarium: HUST; Barcode ID: 00019690; website: http://www.cvh.ac.cn/en/spm/HUST/00019690).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Christenhusz M, Zhang X, Schneider H. A linear sequence of extant families and genera of lycophytes and ferns. Phytotaxa. 2011;19(1):7–54.
- Kitajima M, Takayama H. Lycopodium alkaloids: isolation and asymmetric synthesis. In: Alkaloid Synthesis. Berlin: Springer; 2011;1–31.
- Ma X, Tan C, Zhu D, Gang DR, Xiao P. Huperzine A from Huperzia species-an ethnopharmacolgical review. J Ethnopharmacol. 2007;113(1):15–34.
- Ferreira A, Rodrigues M, Fortuna A, Falcão A, Alves G. Huperzine A from Huperzia serrata: a review of its sources, chemistry, pharmacology and toxicology. Phytochem Rev. 2016;15(1):51–85.
- Liu J, Zhu Y, Yu C, Zhou Y, Han Y, Wu F, Qi B. The structures of huperzine A and B, two new alkaloids exhibiting marked anticholinesterase activity. Can J Chem. 1986;64(4):837–9.
- Tang X, Han Y, Chen X, Zhu X. Effects of huperzine A on learning and the retrieval process of discrimination performance in rats. Acta Pharmacol Sin. 1986;7(6):507.
- Tang X, De Sarno P, Sugaya K, Giacobini E. Effect of huperzine A, a new cholinesterase inhibitor, on the central cholinergic system of the rat. J Neurosci Res. 1989;24(2):276–85.
- Luo H, Sun C, Li Y, Wu Q, Song J, Wang D, Jia X, Li R, Chen S. Analysis of expressed sequence tags from the Huperzia serrata leaf for gene discovery in the areas of secondary metabolite biosynthesis and development regulation. Physiol Plantarum. 2010;139(1):1–2.
- Luo H, Li Y, Sun C, Wu Q, Song J, Sun Y, Steinmetz A, Chen S. Comparison of 454-ESTs from Huperzia serrata and Phlegmariurus carinatus reveals putative genes involved in lycopodium alkaloid biosynthesis and developmental regulation. BMC Plant Biol. 2010;10(1):1.
- Bunsupa S, Hanada K, Maruyama A, Aoyagi K, Komatsu K, Ueno H, Yamashita M, Sasaki R, Oikawa A, Saito K, Yamazaki M. Molecular evolution and functional characterization of a bifunctional decarboxylase involved in lycopodium alkaloid biosynthesis. Plant Physiol. 2016;171(4):2432–44.
- Xu B, Lei L, Zhu X, Zhou Y, Xiao Y. Identification and characterization of L-lysine decarboxylase from Huperzia serrata and its role in the metabolic pathway of lycopodium alkaloid. Phytochemistry. 2017. doi:10.1016/j.phytochem.2016.12.022.
- Sun J, Morita H, Chen G, Noguchi H, Abe I. Molecular cloning and characterization of copper amine oxidase from Huperzia serrata. Bioorg Med Chem Lett. 2012;22(18):5784–90.
- Wanibuchi K, Zhang P, Abe T, Morita H, Kohno T, Chen G, Noguchi H, Abe I. An acridone‐producing novel multifunctional type III polyketide synthase from Huperzia serrata. FEBS J. 2007;274(4):1073–82.
- Morita H, Kondo S, Kato R, Wanibuchi K, Noguchi H, Sugio S, Abe I, Kohno T. Crystallization and preliminary crystallographic analysis of an acridone-producing novel multifunctional type III polyketide synthase from Huperzia serrata. Acta Crystallogr F. 2007;63(7):576–8.
- Wang J, Wang X, Liu X, Li J, Shi X, Song Y, Zeng K, Zhang L, Tu P, Shi S. Synthesis of unnatural 2-substituted quinolones and 1, 3-diketones by a member of type III polyketide synthases from Huperzia serrata. Org Lett. 2016;18(15):3550–3.
- Hemscheidt T, Spenser ID. Biosynthesis of lycopodine: Incorporation of acetate via an intermediate with C2v symmetry. J Am Chem Soc. 1993;115(7):3020–1.
- Ma X, Gang DR. The lycopodium alkaloids. Nat Prod Rep. 2004;21(6):752–72.
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–52.
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.
- Katiyar A, Smita S, Lenka SK, Rajwanshi R, Chinnusamy V, Bansal KC. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis. BMC Genomics. 2012;13(1):1.
- Hartmann U, Sagasser M, Mehrtens F, Stracke R, Weisshaar B. Differential combinatorial interactions of cis-acting elements recognized by R2R3-MYB, BZIP, and BHLH factors control light-responsive and tissue-specific activation of phenylpropanoid biosynthesis genes. Plant Mol Biol. 2005;57(2):155–71.
- Rueffer M, Zenk MH. Canadine synthase from Thalictrum tuberosum cell cultures catalyses the formation of the methylenedioxy bridge in berberine synthesis. Phytochemistry. 1994;36(5):1219–23.
- Yamamoto H, Katano N, Ooi A, Inoue K. Secologanin synthase which catalyzes the oxidative cleavage of loganin into secologanin is a cytochrome P450. Phytochemistry. 2000;53(1):7–12.
- Ma X, Tan C, Zhu D, Gang DR. Is there a better source of huperzine A than Huperzia serrata? Huperzine A content of Huperziaceae species in China. J Agric Food Chem. 2005;53(5):1393–8.
- Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinf. 2011;12(1):1.