Putative genes in alkaloid biosynthesis identified in Dendrobium officinale by correlating the contents of major bioactive metabolites with genes expression between Protocorm-like bodies and leaves

Background Dendrobium officinale, an endangered Chinese herb, possesses extensive therapeutic effects and contains bioactive ingredients such as major polysaccharides, alkaloids, and minimal flavonoids. We first obtained the protocorm-like bodies (PLBs) of this plant through tissue culture in order to determine the distribution of the main secondary metabolites in each organelle and the PLBs. We then analyzed the correlation between gene expression level from comparative transcriptome sequencing and metabolite content in different organs to identify putative genes encoding enzymes involved in the biosynthesis of polysaccharides, alkaloids, and flavonoids. Results We used seeds as explants for protocorm induction and PLB propagation of D. officinale. The optimal medium formula for PLB propagation was 1/2 MS + α-NAA 0.5 mg·L− 1 + 6-BA 1.0 mg·L− 1 + 2, 4-D 1.5–2.0 mg·L− 1 + potato juice 100 g·L− 1. Stems, PLBs and leaves of D. officinale had the highest content of polysaccharides, alkaloids and flavonoids, respectively. Naringenin was only produced in stem; however, PLBs with high alkaloid content can replace other organs producing alkaloids. The hot water extraction method outperformed the ultrasound-assisted extraction method for extracting polysaccharides from D. officinale. A comparative transcriptome analysis of PLBs and leaves of D. officinale revealed differential expression of genes encoding enzymes involved in polysaccharide, alkaloid and flavonoid biosynthetic pathways. Putative genes encoding enzymes involved in these biosynthetic pathways were identified. Notably, we identified genes encoding the alkaloid biosynthesis enzymes strictosidine β-D-Glucosidase, geissoschizine synthase and vinorine synthase in D. officinale. Conclusions The identification of candidate genes encoding enzymes involved in metabolite biosynthesis will help to explore and protect this endangered species and facilitate further analysis of the molecular mechanism of secondary metabolite biosynthesis in D. officinale. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07887-6.


Background
Dendrobium officinale Kimura et Migo, a precious, perennial, epiphytic herb in China and other Asian countries [1], is endangered owing to overexploitation and habitat destruction because of its high medicinal value. The stem of D. officinale has been used as a Traditional Chinese Medicine (TCM) since the Tang dynasty about 1300 years ago, its effects are tonifying the stomach, nourishing body fluids, clearing heat, nourishing "yin" and improving immunological function [2]. Furthermore, D. officinale plays an important role in treating atrophic gastritis, diabetes, cancer, cardiovascular disease and cataract disease, and in delaying aging [3]. Its bioactive components include major D. officinale polysaccharides (DOPs) [4] and alkaloids [5], and minimal flavonoids [6]. DOPs cure hypoglycemia [7], and have antioxidation [8], immunomodulation [9] and antitumor effects. Alkaloids of D. officinale show antioxidant, anticancer, and neuroprotective activities [10], while flavonoids of D. officinale possess good anticytotoxicity and antioxidant functions [11].
Polysaccharides are the major medicinal components of D. officinale [12], and their biosynthesis includes three main stages in medicinal plants [13,14]. Firstly, sucrose produced by photosynthesis is converts into two types of monosaccharides-glucose 1-phosphate (Glc-1P) and fructose-6-phosphate (Fru-6P)-via a series of enzymatic reactions. Secondly, these two monosaccharides are extended into various nucleotide-diphospho-sugars (NDPsugars, with NDP including UDP and GDP) by enzymatic reactions. Finally, all of the NDP-sugars are transformed into macromolecular polysaccharides through successive catalysis by glycosyltransferases (GTs) [15]. The biosynthetic pathway of DOPs was predicted using the KEGG database as shown in Additional file 1 [16,17]. NDP-sugars can be turned into many types of DOP in the D. officinale Golgi complex, catalyzed by different kinds of glycosyltransferases such as glucosyltransferases, fucosyltransferases, mannosytransferases and xylosyltransferases [12,18]. Water-soluble polysaccharides, mostly mannose and to a lesser extent glucose, are the dominant DOPs in D. officinale stems [19]. Mannan polysaccharide is catalyzed by mannan synthases, which are encoded by members of the Cellulose synthase-like A (CslA) family [20,21] (Additional file 1).
Alkaloids are another major medicinal component of D. officinale, most of which are terpenoid indole alkaloids (TIAs) [22]. Many types of TIAs are formed from a strictosidine backbone precursor [23], representing a conserved upstream biosynthetic pathway for TIAs in plants producing various specific downstream alkaloids [24]. Moreover, there might be a series of P450 monooxygenases and aminotransferases following strictosidine in the downstream biosynthetic pathway of TIAs [25].
Cytochrome P450s are involved in oxidation and hydroxylation reactions, while aminotransferases transfer amino acids to form alkaloids, which are nitrogenous amino acid derivatives [26]. A polyneuridine-aldehyde esterase (PNAE) gene participates in 16-epivellosimine biosynthesis in D. officinale [12]. Therefore, based on the KEGG database and chemical structure of the compounds, we proposed a putative alkaloid biosynthetic pathway of D. officinale (Additional file 2) [18,22,27]. In this downstream alkaloid biosynthetic pathway, strictosidine is catalyzed to strictosidine aglycone by β-D-glucosidase (SG). The conversion of strictosidine aglycone to 4,21-dehydro-geissoschizine is catalyzed by successive unknown enzymes, and 4,21-dehydro-geissoschizine is reciprocally catalyzed to geissoschizine by another unknown enzyme. We predict that vinorine might be one of the alkaloids in D. officinale, biosynthesized from 16-epivellosimine via the catalysis of vinorine synthase (VS).
Flavonoids all originated from two intermediates (Ltyrosine and L-phenylalanine) produced by the shikimate pathway in plants, which generates pcoumarinyl-CoA via enzyme catalysis of p-cinnamic acid and cinnamoyl-CoA, respectively [28]. Different types of flavonoids, such as flavone, flavanone, flavonol and anthocyanin, are then produced by various enzymatic reactions from p-coumaryl-CoA [29]. Among these flavonoids, naringenin is important and has many clear effects on the human body, such as inhibiting the inflammatory response of many cell types [30] and having hypotensive [31], anti-fibrotic and anti-cancer as well as hepatoprotective effects [32]. Chalcone isomerase (CHI) catalyzes naringin chalcone to naringin, which is condensed from three malonyl-CoAs and one pcoumaryl-CoA via chalcone synthase (CHS) [33]. The putative biosynthesis pathway of flavonoids in D. officinale is shown in Additional file 3 [34].
Wild D. officinale is endangered and on the IUCN Red List of Threatened Species [34]. Therefore, it is imperative to conserve wild resources of D. officinale and seek alternative ways to propagate it. Tissue culture technology may be a suitable method for D. officinale breeding via the induction of protocorms from embryogenic tissues. Protocorm-like bodies (PLBs) from nonembryogenic tissues of D. officinale have been induced [35]; however, the content of the main bioactive compounds in D. officinale PLBs varies with medium formulae and the elicitors used as additives [36].
This study aimed to establish a system for protocorm and PLB induction and proliferation for D. officinale. We compared the contents of main metabolites such as polysaccharides, alkaloids, flavonoids and naringenin (one of the special flavonoids) in PLBs and organs from the original plant. There were significant differences in contents of these metabolites between PLBs and plant organs, suggesting that PLBs and leaves were good for analyzing the expression levels of candidate enzymeencoding genes involved in the specific biosynthetic pathways of these metabolites in D. officinale. Based on the differential expression of genes (DEGs) from transcriptome sequencing, we identified or verified putative genes encoding enzymes involved in biosynthetic pathways of the main metabolites. This study will help to conserve D. officinale resources and more deeply understand and study the biosynthesis and regulation of its main metabolites.

Protocorm and PLB induction of D. officinale
We screened the optimal explants for PLB induction and the optimum formula for PLB propagation in order to construct a comprehensive system for PLB proliferation and protocorm/PLB induction for D. officinale. Induction rate and the growth status were determined among the five explants: capsules approaching maturity, leaves, stem tips, stem fragments and stems with nodes. Seeds was the most suitable material for protocorm of PLB induction, and capsules approaching maturity were collected and punctured to release seeds under sterile condition because of the tiny seeds (Table 1).
Total content of polysaccharides, alkaloids and flavonoids in PLBs and organs of D. officinale We determined contents of polysaccharides, flavonoids and alkaloids in various organs of D. officinale, including the whole plant, root, stem, leaf and PLB (Fig. 1 b). Significant differences in contents of these three components among samples were detected by variance analysis and multiple comparisons (Fig. 2).

Differential contents of Total polysaccharides (DOPs)
The regression equations for the D-glucose standard curves obtained using ultrasound-assisted extraction (UAE) and hot water extraction (HWE) methods of polysaccharide extraction were Y = 0.0491X -0.0035 (R 2 = 0.9997) and Y = 0.0684X -0.0323 (R 2 = 0.9998) at 488 nm, respectively (Additional file 4). Contents of polysaccharides in PLBs and organs measured using two different extraction methods showed the same content distribution, but the extraction rate using the HWE method was higher than that using the UAE method ( Fig. 2 a and b). We therefore concluded that the HWE method was better for DOP extraction.
Distributions of DOPs in plant organs were as follows: stem > whole plant > leaf > root. The stem of D. officinale had the highest content of polysaccharides among all organs at 402.93 mg·g − 1 DW (Dried weight). While, PLBs had a lower content of polysaccharides than plant organs with average amount of 75.30 mg·g − 1 DW (Fig. 2).

Differential contents of Total alkaloids
The equation for the standard curve using dendrobine as reference was Y = 0.1154X-0.0048 (R 2 = 0.9990), with absorbance detected at 620 nm (Additional file 4). We observed no significant differences in total alkaloid contents among different parts of D. officinale by multiple comparisons [all around 0.3 mg·g − 1 DW]. Meanwhile, the total alkaloids contents in PLBs ranged up to 0.56 mg·g − 1 DW, which was almost twice as high as that of D. officinale plant organs (Fig. 2 c). PLBs can therefore be used as alternative materials for dendrobine extraction.

Differential contents of Total flavonoids and Naringenin
The regression equation for the standard curve for total flavonoids measured using rutin as the reference was Y = 0.0114X-0.0009, with R 2 = 0.9992 and absorbance is at 510 nm. D. officinale stems had the lowest content of total flavonoid, with only 4.07 mg·g − 1 DW, which was consistent with most findings reported previously [34]. Leaf had the highest content of total flavonoids at 11.74 mg·g − 1 DW followed by root (Additional file 4). The total flavonoids content in PLBs was higher than that in the whole plant and stem, but lower than that in the leaf and root (Fig. 2 d). Considering the low biomass of leaf and root in D. officinale, PLBs would be the most   The content of naringenin, an important flavonoid was determined by HPLC-UV, and the chromatogram showed a retention time (RT) was at 44.8 min (Fig. 3 a). The regression equation of the standard curve for naringenin was Y = 2624X + 0.6714, R 2 = 0.9999 (Fig. 3 b). Naringenin could only be quantified in the stems of D. officinale, which is consistent with a previous study in D. officinale [37], and the average content was approximately 0.029 mg·g − 1 DW. No naringenin was detected in the PLBs of D. officinale, indicating that PLBs cannot be used as an alternative material for naringenin extraction (Fig. 3 c). Based on similarity in naringenin content between the whole plant (0.021 mg·g − 1 DW) and the stem (approx 0.029 mg·g − 1 DW), we concluded that naringenin derives mainly from the stem. We also established that the biomass of D. officinale was mainly from stems (Additional file 4).

Sequencing, assembly and Unigenes functional annotation
A total of 39.11 Gb transcriptomic data were generated using the BGISEQ-500 platform. Clean reads were assembled using Trinity software and then clustered to remove redundancy and obtain unigenes using Tgicl. We obtained 157,901 unigenes (Additional file 5). The total length, average length, N50, and GC content were 202,415, 413 bp, 1281 bp, 2194 bp, and 40.39%, respectively.
We mapped all unigenes to the major functional databases for annotation. There  Among these DEGs, 39,597 unigenes were up-regulated and 31,710 unigenes were down-regulated in PLBs compared with leaf ( Fig. 4 a). A Poisson distribution revealed that 27,242 unigenes were specifically expressed in leaves, 11,976 unigenes were expressed only in PLBs, and 98,096 unigenes were expressed in both of these samples ( Fig. 4 b, Additional file 7). The large number of DEGs identified demonstrated significant differences between leaves and PLBs, which directly reflected the critical difference in the appearance of the different tissue types.

Differential expression of genes involved in polysaccharide biosynthetic pathways
Polysaccharides were much more abundant in leaves than in PLBs. We therefore studied the differences in expression of genes encoding enzymes involved in polysaccharide biosynthesis between PLBs and leaves (Fig. 5). Expression levels of some genes were much higher in leaves than in PLBs, such as genes encoding INV, CslA, FBPase, PFP, MPI, scrK, KK, USP, UGPA, PMM, GMPP and UXS1. A higher level of gene expression was positively correlated with a higher content of polysaccharide in leaves. Compared with PLBs, the INV gene was highly expressed while SuS gene expression was increased mildly in leaves, indicating that more sucrose was required to biosynthesize more downstream polysaccharides. Lower expression of a gene encoding CslA was positively correlated with the lower polysaccharides (mainly mannose monosaccharides) content in PLBs (Fig. 5 a).
Meanwhile, there was no significant difference in the expression of genes encoding enzymes such as PGM, HK, TK, UXE, PFK, TSTA3 and UER1 between leaf and PLB, suggesting that the catalysis reactions performed by these enzymes are not restrictive steps for DOP biosynthesis. Genes encoding enzymes such as UAE, UGD, XK, AR and GMDS were negatively correlated with DOP content, these enzymes are all bifunctional and in closed-loop pathways (Fig. 5 a). This might indicate that these bifunctional enzymes have greater ability for degradation than polysaccharide biosynthesis.
Glycosyltransferases are a very widespread group of carbohydrate-active enzymes participating in glycan and glycoside biosynthesis in higher plants [38]. We identified 553 glycosyltransferase genes, including 305 glucosyltransferase genes, 29 fucosyltransferase genes, 110 mannosyltransferase genes and 106 xylosyltransferase genes, using BLASTX in the transcriptomes. From these, 22 full-length glucosyltransferases genes, 5 fucosyltransferase genes, 9 mannosyltransferase genes and 8 xylosyltransferase genes were identified. The expression levels in leaf and PLBs of genes encoding enzymes such as CGT, UGT76C1_2, SGT1 and ALG11 were in line with the patterns of DOP content, suggesting these could be candidate genes encoding enzymes involved in DOP biosynthesis in D. officinale (Fig. 5 b, Additional file 9).

Analysis of differential gene expression in alkaloids biosynthesis
A previous study showed that the alkaloids in D. officinale are mainly terpenoid indole alkaloids (TIAs). We analyzed differences in expression of genes encoding enzymes upstream of MVA, MEP and Shikimate pathway between leaves and PLBs. There were significant differences between leaves and PLBs, but regulatory patterns among these genes were not consistent (Fig. 6 a). Dis-order of gene expression indicated that the intermediates upstream of TIA biosynthesis are not restrictive products, which was further verified by the expression of two genes encoding LAMT and TDC involved in biosynthesizing loganin and tryptamine, respectively. The content of alkaloids was higher in PLBs than in leaves, but the expression levels of genes encoding LAMT and TDC enzymes were lower in PLBs than in leaves. There was no significant difference in expression of the gene encoding SCS between PLBs and leaves; SCS catalyzes the conversion of loganin into secologanin. In conclusion, most intermediates, especially loganin and secologanin, as direct substrates for strictosidine, were sufficient for biosynthesis of specific downstream alkaloids. The expression level of the gene encoding enzyme STR was much higher in PLBs than in leaves, meaning a higher content of precursor strictosidine is necessary for producing additional alkaloids in PLBs.
In the downstream biosynthesis pathway of specific alkaloids in D. officinale, expression level of putative genes encoding enzymes SG, GS and PNAE were much higher in PLBs than in leaves, and the expression level of the gene encoding VS was slightly higher in PLBs (Fig. 6 a). Genes encoding SG, GS and VS were therefore predicted to be involved in D. officinale alkaloids biosythesis. Moreover, these predicted genes involved in TIAs biosynthesis in D. officinale indicate that alkaloids of D. officinale might include strictosidine aglycone, geissoschizine and vinorine.
Some aminotransferases and P450 superfamily enzymes are necessary for alkaloid biosynthesis. By comparing expression levels between PLBs and leaves, we found that some aminotransferases and P450s were more highly expressed in PLBs than in leaves (Fig. 6 b, c, Additional file 9). We therefore speculated that some highly expressed aminotransferases and P450s might participate in alkaloid biosynthesis in D. officinale.

Differential expression of genes in the flavone biosynthetic pathway
Expression levels of genes encoding enzymes involved in flavones biosynthesis were different between PLBs and leaves (Fig. 7). TAL and PAL were expressed more highly in leaves than in PLBs, which corresponded with the higher content of flavonoids in leaves. ADT and PDT were expressed much more highly than PAL or TAL in PLBs, suggesting that PLBs need to produce more L-phenylalanine as a substrate for downstream flavonoid biosynthesis. The lower expression levels of ANS and ANR in leaves compared with PLBs suggested low contents of cyanidin and epicatechin in leaves of D. officinale.
Naringenin could not be quantified in PLBs. Based on low expression of CHI, high expression of DFR, and relatively high expression of F3'H and F3'5'H in PLBs, we concluded that naringenin is an intermediate product converted to downstream metabolites and not stored in PLBs. The naringenin content in leaves of D. officinale was too low to be detected. However, high expression of CHI, F3'H and F3'5'H, and lower expression of DFR indicated that naringenin might be transferred to stem and converted to downstream flavonoids. Naringenin could be detected only in the stem of D. officinale. There is a possibility that a special organelle exists in the stem, functioning to accumulate naringenin and transfer it from the leaf to the stem in D. officinale. HCT and HST were more highly expressed in PLBs than in leaves, suggesting that most of the eriodictyol in PLBs comes from caffeoyl-CoA, which is consistent with relatively lower expression of F3'H and F3'5'H, and low naringenin content in PLBs.

Validation of genes expression levels by real-time fluorescent quantitative PCR (RT-qPCR)
We verified the expression levels of genes encoding enzymes such as STR, SG, PNAE, VS, CslA, CHI, INV and ANS using RT-qPCR analysis (Fig. 8 a, b, c). Expression levels of genes encoding enzymes involved in alkaloids biosynthesis were higher in PLBs than in leaves (Fig. 8  a). Meanwhile, expression levels of those genes encoding enzymes in polysaccharide and flavonoid biosynthesis were higher in leaves than in PLBs (Fig. 8 b, c). These results from RT-qPCR analysis were consistent with the data from RNA-seq analysis.

Detection of transcription factor families
Transcriptional factors (TFs) play an important role in the regulation of secondary metabolites by coordinating the expression of biosynthetic genes and controlling the production of secondary metabolites in plants [39]. We identified 1113 unique putative transcripts encoding TFs differentially expressed between PLBs and leaves, which  Fig. 8 d, and a striking number of differentially expressed TFs were identified between PLBs and leaves.

Discussion
Extraction method is reported to have little effect on the types of monosaccharide, but to influence the monosaccharide composition of heterogeneous polysaccharides [40]. The diffusion rate of polysaccharides in the HWE method night increase dramatically owing to the raised temperature of the water, improving the extraction efficiency [41]. Our results indicates that the HWE method is more suitable than the UAE method for DOPs extraction, consistent with this conclusion.
Among organs of D. officinale, the stem had the highest content of polysaccharides-up to 402.93 mg·g − 1 DW. This is consistent with the findings of previous researchers in D. officinale [22]. By contrast, PLBs had a lower content of polysaccharides, indicating that the PLBs cannot be used as a substitute for plant organs in producing DOPs. In spite of the relatively low content of total alkaloids in D. officinale, more and more attention has been paid to their broad pharmacological activities, such as antipyretic, analgesic and anti-tumor effects [12]. PLBs were much more suitable for producing alkaloids in our study, providing a new strategy for future large-scale production of alkaloids in D. officinale.
KEGG enrichment analysis indicated that comparative RNA-Seq between leaves and PLBs was consistent with the contents of DOPs, alkaloids and flavonoids between PLBs and leaves of D. officinale. It also demonstrated our reasonable prediction of the putative biosynthesis pathway of DOPs, alkaloids and flavonoids in D. officinale.
Characterization of the correlation between the content of DOPs, alkaloids and flavonoids and comparative transcriptome sequence results allowed discovery of many putative genes encoding enzymes involved in biosynthesis of DOPs, alkaloids and flavonoids in D. officinale. Next, we will clone of the genes encoding enzymes such as SG, GS, PNAE and VS from D. officinale according to orthologous sequences from RNA-seq and construct the expression vectors to express recombinant proteins. The enzyme activities of these recombinant proteins in vitro will be assessed to validate their function in alkaloid biosynthesis. The expression patterns and substrate specificity of these putative enzymes and their subcellular localization will also need to be studied.
Previous reports indicate that TFs play important roles in regulating alkaloid biosynthesis, such as members of the apetala2/ethylene response factor (AP2/ERFs) [42,43], C2H2 zinc fingers [44], WRKY [45] and basic helixloop-helix (bHLH) [46] families that are involved in regulating TIA biosynthesis. ORCA3, an AP2/ERF TF, induces the expression of a number of TIA biosynthetic genes such as G10H, CPR, STR, AS, TDC, and DXS in the vindoline or tryptamine branch pathway [47,48]. CrMYC, belonging to the bHLH family, activates ORCA3 and then induces expression of several TIA biosynthetic genes, such as TDC and CPR [49]. CrBPF1 is a MYB transcription factor that regulates the expression of STR in C. roseus [50]. In our study, we identified TFs such as MYB, AP2/EREBP, bHLH and NAC to be involved in alkaloids biosynthesis in D. officinale.
The distribution of polysaccharides, alkaloids and flavonoids among the organs of D. officinale plant was clarified in this study. Contents of polysaccharides, alkaloids and flavonoids are highest in stems, PLBs and leaves of D. officinale, respectively. Hot water extraction (HWE) is a better method than UAE for polysaccharides extraction from D. officinale. PLBs can be used as an alternative for producing alkaloids in D. officinale. However, naringenin is produced exclusively in stems, with no detectable amount in PLBs.
Comparative RNA-Seq analysis of D. officinale PLBs and leaves enabled identification of candidate genes encoding enzymes involved in polysaccharide, alkaloid and flavonoid biosynthesis. We report on genes encoding enzymes such as SG, GS and VS in alkaloids biosynthesis of D. officinale for the first time to our knowledge. Moreover, some TFs involved in regulating alkaloid biosynthesis are differentially expressed in leaves and PLBs of D. officinale.

Plant materials and reagents
Plants were from Huoshan County, Anhui Province, China, identified as Dendrobium officinale Kimura et Migo. They were grown at 25 ± 2°C during the day and 23 ± 2°C at night with 60-70% relative humidity and a light/dark cycle of 14/10 h in the greenhouse of Anhui University of Chinese Medicine, Hefei, China. A specimen of D. officinale used in this study has been deposited in the Herbarium of Anhui University of Chinese Medicine with the depository no. is ACM025738.
Standards of (+)-glucose, dendrobine, rutin and naringenin (> 98% purity) were purchased from Chengdu Push Bio-technology CO., Ltd. Methanol (HPLC grade) was purchased from Oceanpak, and CTAB-pBIOZOL reagent used for total RNA extraction was purchased from Bioflux (Beijing, China). HPLC-grade water was prepared using a water purification system from Pall Filter Co., Ltd. (Beijing, China). Plant growth regulators (PGRs) including 6-BA, α-NAA and 2, 4-D were purchased from Solarbio. Agents for induction and propagation of PLBs were purchased from Sinopharm group, and all of the other agents were analytical grade.

Induction and propagation of D. officinale Protocorms and PLBs
D. officinale capsules approaching maturity, leaves, stem tips, stem fragments, and stems with nodes were collected and sterilized to induce protocorms and PLBs (capsules were punctured to release seeds as explants on medium under sterile condition); each explant was prepared with more than 30 replicates. Explants were placed on 1/2 MS medium + 2, 4-D 0.5 mg·L − 1 + 30 g·L − 1 sucrose + 7 g·L − 1 agar at pH 5.6 to 5.8, and placed in a tissue culture room at 70% relative humidity, 1600 lx illumination and 16/8 h light/dark for 60 days to identify the optimal explants for protocorms and PLBs induction.
Protocorms and PLBs induced as above were collected as explants to be cut into pieces of approximately 0.5 cm × 0.5 cm to proliferate more PBLs, and the optimum culture protocol for proliferation of D. officinale PLBs was screened. An L 16 (4 5 ) orthogonal experiment without regarding interactions among factors was performed; the factors and their levels were selected on the basis of previous experiments and reports [51]. Four different concentrations of MS medium were used as test media (Factor A): 1/4MS, 1/2MS, MS and 2MS. Different concentrations of plant growth regulators were arranged in different combinations to screen the optimal combination: α-NAA (Factor B) (0.1, 0.2, 0.5 and 1.0 mg·L − 1 ), 6-BA (Factor C) (0.5, 1.0, 1.5 and 2.0 mg·L − 1 ) and 2,4-D (Factor D) (1.5, 2.0, 2.5 and 3.0 mg·L − 1 ). Explants in tissue culture with natural additives and fruit juice grew much better than those without natural additives and fruit juice [52,53]. Therefore, for the PLBs proliferation process, potato juice was selected as an additive to accelerate PLB growth and propagation. To prepare this additive, 200 g fresh potato was chopped into small pieces, mixed with purified water to 1 L and boiled to a mush. The mushy potato was passed through gauze to collect the juice as additive. Concentrations of potato juice (Factor E) used were 50, 100, 150 and 200 g·L − 1 . All test media for PLB proliferation included basic medium, plant growth regulators, potato juice, sucrose (30 g·L − 1 ) and agar (7 g·L − 1 ) at pH 5.6 to 5.8; the design of the orthogonal experiment table head is shown in Additional file 11. Each group comprised 10 bottles with 4 explants, and culture conditions were light/dark for 16/ 8 h per day, at 1600 lx and 25 (± 1)°C. Observations and statistical analysis were performed to analyze the propagation coefficient 2 months after inoculation. The propagation coefficient was calculated as follows: Propagation coefficient (%) = (FW 1 -FW 0 )/ FW 0 × 100%; FW0 and FW1 are the fresh weight before inoculation and after culture, respectively.

Ultrasound-assisted extraction (UAE) and hot water extraction (HWE) of Total polysaccharides
Stems, leaves, roots and whole plants from approximately 3-years-old D. officinale and PLBs were harvested, cleaned and dried in an oven at 50°C to a constant weight. Samples were ground to a fine powder using a pulverizer and sieved through an aperture of 40 mesh sieve. The water-soluble polysaccharides in D. officinale were extracted using the UAE method [41] and HWE method [8] with a few modifications. This allowed selection of the better technique for extraction of polysaccharides from D officinale.
The modified UAE method was as follows: 10 mL of double-distilled water was added to each 0.2 g pulverized sample, and the samples were homogenized in an ultrasonic cleaner at 40°C, 40 Hz for 0.5 h and filtered through filter-paper to obtain the filtrate. The above steps were repeated once, and all filtrate was collected. The filtrate was concentrated at 55°C until its volume was reduced to 10 mL. Next, 40 mL of ethanol was added into the concentrated filtrate, which was centrifuged at 4000 g for 5 min, and the supernatant was discarded. The resulting samples were dissolved in 25 mL of pure water and 25 mL of Savage reagent was added to remove impurities such as protein and nucleic acid. Sample were centrifuged at 1000 g for 5 min, and 20 mL of the supernatant was transferred into a 50 mL volumetric flask and mixed with pure water to a constant volume of 50 mL for polysaccharide extraction.
The HWE method with a few modifications was as follows: 0.3 g sample was placed in a round-bottomed flask, and 200 mL water was added. The sample solution was then heated and refluxed for 3 h. The reflux extract was cooled to room temperature before filtering. The filtrate was transferred into a 250 mL volumetric flask and mixed with double-distilled water to a constant volume of 250 mL. Then, 2 mL of sample solution was taken into a 15 mL centrifuge tube and 10 mL of solute ethanol was added, shaken and refrigerated for 1 h followed by centrifugation at 1000×g for 20 min. The supernatant was discarded, and the precipitate was washed twice with 8 mL of 80% ethanol. The final precipitate was dissolved in heated water and transferred to a 25 mL volumetric flask. The solution was cooled before mixing it with pure water to a constant volume of 25 mL for polysaccharide extraction.

Determination of Total polysaccharides
A sample (1.0 mL) of each polysaccharide extract was transferred into a 10.0 mL test tube, then 1.0 mL of 5% phenol was added and vortexed quickly. The solution was mixed thoroughly before adding 5.0 mL of concentrated sulfuric acid, shaking and placing in a water bath at 100°C for 20 min. The solution was then placed in an ice bath to cool for 5 min. The absorbance of the sample solution was measured at a wavelength of 488 nm using an ultraviolet/visible spectrophotometer with 1 mL of water as a blank; parallel detection was conducted three times. A standard curve was prepared from the D-glucose reference. Each sample was assayed three times, and the polysaccharide content was expressed as the weight of polysaccharides to the dried weight of starting materials.

Extraction and determination of Total alkaloids
Sample powder (0.5 g) was placed into a 125 mL ground-mouth flask and mixed with 30 mL of petroleum ether. The sample mixture was placed in a constant temperature water bath (35°C) to degrease for 30 min, then the supernatant was removed and the petroleum ether evaporated. The pH was adjusted with ammonium hydroxide, and 10.0 mL of chloroform was added. The solution was then refluxed for 2 h in a water bath at 80°C and allowed to cool for 20 min at room temperature before filtering. The filtrate was obtained as total alkaloids solution.
Total alkaloids solution (5.0 mL) was mixed with 5 mL of chloroform, then 5.0 mL of pH 4.5 buffer and 2.0 mL 0.04% bromocresol green solution were added sequentially. The mixture was shaken vigorously for 3 min and let stand for 30 min before filtering. The filtrate (5.0 mL) was mixed with 1.0 mL 0.01 M NaOH-ethanol solution and vortexed. The absorbance of the samples was measured at 620 nm using a UV-visible spectrophotometer with chloroform as blank. A standard curve was prepared using dendrobine as reference (> 98% purity; bought from Chengdu Push Bio-technology Co., Ltd). (0.857, 1.714, 2.571, 3.429, 4.286, and 5.143 μg·mL − 1 ). Parallel detection was carried out three times, and the alkaloid content was shown as the weight of total alkaloids to the dried weight of starting materials.

Extraction and determination of Total flavonoids
Sample powder (1.00 g) was dissolved in 50 mL of 70% ethanol and refluxed at 60°C for 2 h, then filtered. The filtrate was mixed with 70% ethanol again to a constant volume of 50 mL to yield the total flavonoid extracts.
Total flavonoid extract (1.0 mL) was mixed with 5.0 mL of 70% ethanol, then 1.0 mL of 5% NaNO 2 was added and mixed well. After standing for 6 min, 1.0 mL of 10% Al (NO 3 ) 3 was added and mixed well. The solution was allowed to stand for a further 6 min before adding 10 mL of 1 M NaOH and then mixing with 70% ethanol to 25 mL and standing for 15 min. The absorbance of the sample solution was measured at 510 nm using an ultraviolet/visible spectrophotometer. A standard curve was prepared from rutin as reference (2.208, 4.416, 6.624, 8.832, and 11.040 μg·mL − 1 ). Each sample was assayed three times, and the flavonoid content was shown as the weight of total flavonoids to the dried weight of starting materials.

Extraction and quantification of Naringenin
Naringenin was extracted using optimized methods: methanol (20 mL) mixed with 20% hydrochloric acid (5.0 mL) was used as solvent; particle size of dry powder of D. officinale was < 0.355 mm; temperature was 80°C; condensation reflux extraction time was 90 min; supernatant (20 mL) was evaporated using a rotating evaporator until dry. Methanol (5.0 mL) was added to dissolve naringenin, and the solution was filtered through a 0.22 mm nylon membrane filter before HPLC analysis.
HPLC analysis was performed on an Agilent HPLC system, including quaternary solvent management, sampler manager, separation system and detection systems. An Agilent C 18 column (4.6 mm × 250 mm, 5 μm) was used at a column temperature of 30°C. Standards and samples were separated using a gradient mobile phase consisting of 0.2% phosphate buffer (A) and methanol (B). The gradient elution program was as follows: 0. Naringenin stock solution was prepared and diluted to an appropriate concentration for preparation of a calibration curve. The calibration curve was prepared according to linear plots of naringenin concentration versus the corresponding chromato-graphic peak area.

Total RNA extraction, construction of cDNA libraries and RNA-Seq
The ethanol precipitation protocol and CTAB-pBIOZOL reagent were used to purify total RNA from PLBs and leaves according to the manufacturer's instructions (Bioflux), and each sample was prepared with three biological replicates. Tissue samples (80 mg) were ground into powder in liquid nitrogen and the powdered samples were transferred into 1.5 mL of preheated 65°C CTAB-pBIOZOL reagent. The sample solution was incubated in a thermostatic mixer at 65°C for 15 min to completely dissociate the nucleoprotein complexes. The supernatant was obtained by centrifuging the solution at 12,000×g, at 4°C for 5 min. The supernatant was mixed with 400 μL of chloroform (per 1.5 mL of added CTAB-pBIOZOL reagent), and the sample mixture was centrifuged at 12,000×g, 4°C for 10 min. The supernatant was transferred to a new 2 mL tube, and 700 μL of acidic phenol and 200 μL of chloroform were added followed by centrifugation at 12,000×g, 4°C for 10 min. The aqueous supernatant was collected, and an equal volume of chloroform was added into it followed by centrifugation at 12,000×g, 4°C for 10 min. An equal volume of isopropyl alcohol was added into the supernatant, and the mixture was placed at − 20°C for 2 h to precipitate before centrifuging at 12000×g, 4°C for 20 min; and the supernatant was removed. The RNA pellet was washed with 1 mL of 75% ethanol before being air-dried in a biosafety cabinet and dissolved in 50 μL of DEPC-treated water. Subsequently, quality control and quantification of total RNA were performed using a Nano Drop and Agilent 2100 bioanalyzer (Thermo Fisher Scientific, MA, USA).
Oligo (dT)-attached magnetic beads were used to purify mRNA. The purified mRNA was split into small pieces using fragment buffer at an appropriate temperature. Random hexamer-primed reverse transcription was performed to synthesize first-strand cDNA, followed by second-strand cDNA synthesis. A-Tail Mix and RNA Index Adapter were then added to repair the ends. The cDNA fragments obtained were amplified by PCR. The products were purified using Ampure XP Beads and dissolved in EB solution. Products were verified for quality control using an Agilent 2100 bioanalyzer. Double-stranded PCR products were denatured and cycled using the splint oligo-nucleotide sequences to obtain the final library in single-stranded circular DNA (ssCir DNA) format. The library was amplified with phi29 to obtain DNA nanoballs (DNBs), each molecule of which had over 300 copies. The DNBs were added into the patterned nanoarray, and 100 base pairs of reads was generated on the BGIseq500 platform.

Functional annotation of Unigenes using a reference genome
Original sequencing data (original polymerase reads) produced from raw data generated by the Pacific Biosciences Sequel system were processed using the SMRT analysis package version 2.3.0 according to the IsoSeq protocol (Pacific Biosciences, https://www.pacb.com/ products-and-services/analytical-software/smrt-analysis). ROIs (reads of insert) were generated from original polymerase reads having full passes > 0 and the predicted consensus accuracy > 0.75. According to whether the primers were 5′ primers, 3′ primers or poly-A tails, ROIs with a minimum length of 300 bp were divided into non-full-length and full-length transcribed sequences. The full-length sequences were processed to de novo consensus isoforms using the ICE (Iterative Clustering for Error Correction) algorithm and then polished using the Quiver quality-aware algorithm. High quality de novo consensus isoforms (expected Quiver accuracy ≥0.95) from each library were combined, and then redundancy was removed using CD-HIT [54] founded on the sequence similarity to get final, unique, full-length isoforms.
Final, complete subtypes were mapped to SwissProt (manually annotated and reviewed protein sequence database), NR (NCBI non-redundant protein sequence), KEGG (Kyoto Encyclopedia of Genes and Genomes), NT (NCBI non-redundant nucleotide sequence) and KOG (Clusters of Eukaryotic Orthologous Groups) databases using BLAST software (version 2.2.23) [55] with default parameters (under a threshold E-value ≤10-5) to obtain isoform annotations. GO (Gene Ontology) annotations and functional classifications were acquired using the Blast2GO program (version 2.5.0, E-value ≤10-5) [56] according to NR annotations. InterProScan5 software (version 5.11-51.0) [57] was used to acquire annotations from the InterPro database.

Identification of differentially expressed genes (DEGs)
All clean reads were mapped to the full-length reference transcriptome using BLAST software. Gene expression levels were determined by the number of full-length transcripts belonging to a cluster after ice clustering and the CD-HIT process. Total isoforms counts were used to normalize the counts in each sample. DEGs were acquired using DEseq2 with Q value (adjusted P value) < 0.001 and fold change (FC) ≥ 2 or ≤ − 2 [58]. These DEGs were then carried into GO and KEGG enrichment with Phyper in the R package using Q value ≤0.05 as default.

Comparative analysis of putative gene expression in metabolite biosynthesis pathways
The amino acid sequences of putative genes involved in the polysaccharide, alkaloid and flavonoid biosynthetic pathways were identified in the National Center for Biotechnology Information (NCBI) database (Additional file 12). All amino acid sequences of putative genes were mapped to the protein information from RNA-Seq using BLAST software to obtain the average FPKM value of each gene in each sample. Heatmaps of putative genes involved in polysaccharide, alkaloid and flavonoid biosynthesis were drawn based on FPKM value. Differences in expression levels of putative genes encoding enzymes involved in alkaloid, polysaccharide and flavone biosynthesis between PLBs and leaves were measured using the value of diffexp_log2fc_Cas-vs -Ctr (Cas is PLB, Ctr is leaf) for unigenes with high amino acid sequence identity from the RNA-seq database.

Real-time fluorescent quantitative PCR (RT-qPCR) analysis
To verify RNA-Seq data, RT-qPCR analysis was conducted using a SuperReal PreMix Plus SyBr Green PCR kit (Qiagen) on a Cobas z480 Real-Time PCR System. Randomly selected candidate primers were designed using Primer Premier (version 5.0) (Additional file 13). Reactions contained 2.0 μl of diluted cDNA, 0.6 μl of each primer, 10 μl of 2 × SuperReal PreMix Plus, and 6.8 μl of RNase-free double-distilled water (ddH 2 O). All RT-qPCRs were performed as follows: denaturation at 95°C for 15 min, followed by 45 cycles of 95°C for 10 s, 58°C for 20 s and 72°C for 30 s. Successive RT-qPCR assays were performed using three biological replicates; to verify product specificity, melting curve analysis was performed after each amplification. A housekeeping gene (5.8S) was used as a reference, and the relative expression level of each gene was calculated using the 2 −ΔΔCt approach.
Statistical analysis Experimental data are expressed as the mean ± standard deviation of three independent biological replicates. Statistical differences between samples were analyzed using two-way analysis of variance by SPSS (version 17.0). Values at P < 0.01 were considered statistically significant.