Integrative analysis of the steroidal alkaloids distribution and biosynthesis of bulbs Fritillariae Cirrhosae through metabolome and transcriptome analyses

Bulbus Fritillariae Cirrhosae (BFC) is an endangered high-altitude medicine and food homology plant with anti-tumor, anti-asthmatic, and antitussive activities as it contains a variety of active ingredients, especially steroidal alkaloids. Bulbus Fritillariae Thunbergia (BFT) is another species of Fritillaria that grows at lower altitude areas. Production of plant-derived active ingredients through a synthetic biology strategy is one of the current hot topics in biological research, which requires a complete understanding of the related molecular pathways. Our knowledge of the steroidal alkaloid biosynthesis in Fritillaria species is still very limited. To promote our understanding of these pathways, we performed non-target metabolomics and transcriptome analysis of BFC and BFT. Metabolomics analysis identified 1288 metabolites in BFC and BFT in total. Steroidal alkaloids, including the proposed active ingredients of Fritillaria species peimine, peimisine, peiminine, etc., were the most abundant alkaloids detected. Our metabolomics data also showed that the contents of the majority of the steroidal alkaloids in BFC were higher than in BFT. Further, our comparative transcriptome analyses between BFC and BFT identified differentially expressed gene sets among these species, which are potentially involved in the alkaloids biosynthesis of BFC. These findings promote our understanding of the mechanism of steroidal alkaloids biosynthesis in Fritillaria species.


Background
Fritillaria (Liliaceae) is a perennial herb genus with more than 130 species found in North America, the Mediterranean, and Asia [1]. Many species of Fritillaria have been used as herbal remedies in many countries' folk medicines for a long history [2,3]. Steroidal alkaloids are specialized metabolites found mainly in the families of Liliaceae [4]. In China, the dried bulbs derived from various Fritillaria species (called "Beimu" in Chinese) are exploited as medicine and food homology plants because of their anti-tumor, anti-asthmatic, and antitussive activities [5][6][7]. Among the different species of Fritillaria, Bulbus Fritillariae cirrhosae (BFC) and Fritillariae Thunbergia (BFT), which are called "Chuan-Bei-Mu" and "Zhe-Bei-Mu" in Chinese, and distributed at different altitudes, are considered as two most valuable species for their superior therapeutic effects on cough and asthma, compared with other Fritillaria species [8] (More than 100 kinds of Chinese patented medicines use BFC as the raw material [9]. BFC can be derived from six original plants, Fritillaria [5,10]. Wild BFC resources are endangered due to long-term excessive mining [11,12]. The estimated annual market demand for BFC is about 2,000 tons, with a gap of 1,900 tons [3]. BFT is a member of the geo-authentic crude drugs in Zhejiang Province, its production ranks first in the commodity Bulbus Fritillaria and has been approved as a national health functional food for relieving cough and reducing sputum [13]. Steroidal alkaloids are reported to be the main parts of BFC that give it its health benefitsBFC [5,14]. Dissecting the pathways responsible for the BFC alkaloids synthesis is conducive to producing these active ingredients in biotechnological manner, especially synthetic biology. Until now, many efforts have been made to identify each of the enzymes involved in synthesizing the steroidal alkaloids [15]. However, our understanding of the whole picture of alkaloid synthesis pathways in BFC is still limited. Recently, a few studies have been carried out to investigate the molecular mechanisms that control the steroidal alkaloid biosynthesis in Fritillaria species. For example, Zhao et al. performed transcriptome analysis to map the gene expression profile of in vitro regenerated BFC [16]. Kumar et al. identified the essential genes and regulatory transcription factors (TFs) of imperialine biosynthesis in Fritillaria roylei Hook using comparative de novo transcriptome analysis [17]. The molecular basis of organ-specific expression of isosteroidal alkaloids biosynthesis in F. roylei Hook was investigated by transcriptome analysis [18]. In the current study, we conducted the first metabolomics and transcriptomics combined analysis to systematically investigate the characteristics of steroidal alkaloids synthesis in the five species of BFC and BFT, another Fritillaria species distributed in the lower altitude areas. Our findings from the current study promoted our understanding of the mechanism of steroidal alkaloids biosynthesis in Fritillaria species.

Metabolome data quality analysis
As shown in Fig. 1A-B, while the biological replicates of each species clustered in different areas in the PCA analysis, there was an evident separation between BFC and BFT, indicating significant differences in terms of their metabolites. A higher PC1 value showed more genetic variation among varieties. PC1 and PC2, two principal components, were 16.77% and 12.42% in the positive ion mode and 17.69% and 11.23% in the negative ion mode, respectively.

Comparison of the metabolites between BFC and BFT
A total of 1288 metabolites were detected from BFC and BFT by conducting liquid chromatography-tandem mass spectrometry (LC − MS/MS) (File S1 and File S2). Alkaloids, saponins, phenylpropanoids, carboxylic acids and their derivatives, flavonoids, and organic acids and their derivatives were the most common metabolites.

Identification of differentially produced alkaloids in BFC compared to BFT
To find the significant differential metabolites, the top 20 differentially (up and down) produced metabolites in BFC and BFT were analyzed. The results indicated that the primary differential metabolites in six species were steroidal alkaloids, tropine alkaloids, pyridines alkaloids, indole alkaloids, isoquinoline alkaloids, organic amines alkaloids, etc. (Table S1). Steroidal alkaloids contained peimine, peimisine, peiminine, jervine, tomatidine, solanine, solasonine, solamargine, solasodine, and veratramine, as well as two undescribed compounds, i.e.edpetiline, and khasianine. As shown in Fig. 1E, the content of solamargine, jervine, solanine, solasonine, solasodine, edpetiline, khasianine, veratramine, and tomatidine in BFC is significantly higher than that in BFT. Moreover, the highest contents of these compounds were observed in F. delavayi (SS). Nevertheless, peimine and peimisine were found to be higher in BFT. There was little difference in the content of peiminine between BFC and BFT (Table S2).

De novo assembly and functional annotation of BFC and BFT unigenes
Eighteen RNA-Seq libraries were constructed and sequenced for the six species, designated JY_S1-S3, GS_ S1-S3, SS_S1-S3, TB_S1-S3, WB_S1-S3, and ZB_S1-S3, respectively. Overall, each library had an average of 23.43 million raw reads, ranging from 20.64 million to 26.25 million (Table S3). Adapters, empty reads, and low-quality sequences were removed from the raw reads before being analyzed for their genetic content. Consequently, each library generated an average of 23.02 million clean reads, totaling 6.91 Gb in size and approximately 49.71% GC content. This study used NCBI-NR, NCBI-NT, PFAM, KOG/COG, GO, and KEGG to annotate gene function (Fig. S2A). A total of 31,342 genes were annotated via GO clustering analysis. These genes were clustered into three major categories and 43 subcategories, including 5 cellular components, 12 molecular functions, and 26 biological processes. In the biological process, "metabolic process," "biological regulation," and "cellular process" are the main functional categories ( Fig. S2B and File S4). Twenty-six thousand nine hundred seventy genes were annotated for 295 KEGG metabolic pathways in the five F. cirrhosa species and F.thunbergii. Among them, "Steroid biosynthesis" (55 unigenes), "Ubiquinone and other terpenoid-quinone biosynthesis" (102 unigenes), "Terpenoid backbone biosynthesis" (131 unigenes), "Diterpenoid biosynthesis" (91 unigenes), and "Sesquiterpenoid and triterpenoid biosynthesis" (13 unigenes) are related to the regulation of alkaloids biosynthesis ( Fig.  S2C and File S4).

Comparative functional annotation of identified DEGs in BFC and BFT
To further understand the biological activities of these genes involved in alkaloid biosynthesis, functional classifications of DEGs found among bulbs in BFC, and BFT were carried out. First, GO clustering analyses were adopted to annotate the DEGs function between the bulbs of BFC and BFT. We found 3597 GO terms enriched in at least one DEG group, and 3208 shared GO terms were enriched in all four DEG groups (File S5). We compared the enriched biological processes of these five DEG groups since we were primarily interested in producing and accumulating alkaloids. As shown in Fig. 2A, alkaloid biosynthetic process, specialized metabolite biosynthetic process, C21-steroid hormone metabolic process, sterol biosynthetic process, seed development, and reproductive system development were enriched in BFC and BFT. In addition, steroid metabolic process, steroid biosynthetic process, photosynthesis, and isoprenoid metabolic process are also the significantly enriched biological processes with the most significant number of DEGs.
Next, KEGG enrichment analysis showed 6 shared pathways enriched in all five DEG groups in the top 20 KEGG pathways (Fig. 2B, File S6). In addition, 10 enriched KEGG pathways (Fig. 2C, and Fig. S4), such as ubiquinone and other terpenoid-quinone biosynthesis, terpenoid backbone biosynthesis, isoquinoline alkaloid biosynthesis, tyrosine metabolism, and tropane, piperidine, and pyridine alkaloid biosynthesis significantly enriched in BFC. In addition, steroid biosynthesis, diterpenoid biosynthesis, and brassinosteroid biosynthesis were also enriched in BFC. These pathways are involved in the biosynthesis of the precursor substances of alkaloids. Furthermore, photosynthesis is also mainly enriched in BFC. One of the most significant environmental elements in plant existence is photosynthesis. It is not only a source of energy for plant growth and development, but it also serves as a signal to control plant growth and development. In most plants, photosynthesis has an impact on specialized metabolite synthesis. The production of alkaloids is influenced by photosynthesis [19,20].

Identification of transcription factors (TFs) possibly involved in the biosynthesis of steroidal alkaloids by BFC
We further screened for the transcription factors that might be associated with the synthesis of steroid alkaloids in Fritillaria. In total, 1957 transcripts from RNAseq were identified with transcription factor domains, which were further categorized into 75 transcription factor families, including MYB families (151)

qRT-PCR analysis of gene expression
To validate the transcriptome data, we performed qRT-PCR to analyze the expression of 9 structural genes and 3 regulatory genes involved in the steroidal alkaloid synthesis pathway in BFC. The results indicated that the expression patterns of 12 genes were consistent with their transcriptome expression profiles (Fig. S5).

Discussion
Because of its antitussive, anti-inflammatory, and antitumor properties, BFC has long been used as a medicine and food homology plant [27,28]. Its health benefits have been attributed to the active ingredients, especially steroidal alkaloids [10,29]. The resources of wild BFC are becoming scarce. Bioengineering might be a feasible way to solve this problem [30][31][32]. However, we must first clarify the biosynthetic pathways and genes involved in BFC steroidal alkaloids biosynthesis.
Saponins are less studied metabolites of BFC. A recent report suggested that saponins contribute to the pharmacological activities of BFC [13]. This study found that BFC and BFT contained abundant steroid saponins and triterpenoid saponins, including timosaponin BII, protoClusterscin, astragaloside IV, chikusetsusponin Iva, polyphyllin II, polyphyllin VI, timosaponin A-III, timosaponin A1, trillin, and liriopemuscaribaily saponins. Moreover, their contents in BFC were significantly higher in F. thunbergii. The discovery of new metabolites is helpful to elucidate further the material basis of the effect of BFC and BFT. Our data provided new insights into understanding the active substances of BFC.
Further differential metabolite analysis showed that steroidal alkaloids were the major differentially produced alkaloids between and BFC and BFT. We found that imperialine, peimine, and edpetiline were rich in BFC [39,40]. In contrast, peimisine and peiminine were higher in F. thunbergii than in BFC, consistent with the previously reported data [41]. According to our data, most of the steroidal alkaloid content in BFC was higher than in F. thunbergii, which might be why BFC is more effective than BFT. In plants, the terpenoid skeletons are synthesized by DMAPP/IPP via the MVA/MEP pathway [42]. In this study, six genes predicted to be involved in DMAPP/IPP synthesis via the MVA pathway were found to be highly expressed in BFC, while another set of genes predicted to be involved in the MEP pathway was expressed higher in F. thunbergii. Thus, we inferred that the MVA pathway is the primary route responsible for DMAPP/IPP synthesis in BFC, which is consistent with the results from another recent study [17]. On the other hand, the enzymes in the MEP pathway are more likely to control how DMAPP/ IPP is synthesized in BFT. We don't know much about the genes involved in the steps after cholesterol that lead to the production of steroidal alkaloids in BFC. Steroidal alkaloids can be biosynthesized from cholesterol via a series of hydroxylation or oxidation reactions of cholesterol at C-22 and C-26, hydroxylation at C-23, and transamination at C-26 [43]. CYP90B1 from Arabidopsis thaliana has been reported to catalyze steroid hydroxylation at the C-22 position [44]. CYP90A4 can catalyze hydroxylation at C-23 in Oryza sativa L. [45]. CYP734A6 was reported to mediate a C-26 hydroxylation reaction in Solanum lycopersicum and A. thaliana [46]. The steroid C-26 hydroxylase/oxidase (CYP94N1) activity was previously confirmed in Veratrum californicum [15]. AOP2, GAME12, and GAME 17 have been shown to introduce the amine group at C-26 of cholesterol, participate in the biosynthesis of steroidal glycoalkaloids (SGA) [44]. In the present study, we identified three candidate genes (Cluster-1034.31647, Cluster-1034.23901, Cluster-1034.31945) with C-22, C23, and C-24 hydroxylase activity. The candidate with C-26 hydroxylase/oxidase activity pointed to one contig (Cluster-1034.19824). These genes were significantly upregulated in BFC. Thus, we hypothesized that the higher expression of CYP90B1, CYP734A6, CYP94N2, and CYP90A4 contributes to the accumulation of steroid alkaloids in BFC.
AP2/ERF superfamily has been shown to regulate the transcription of genes involved in alkaloid synthesis by recognizing various GC-rich boxes in their promoters in tobacco and Catharanthus roseus [4]. In this study, we identified a couple of candidate transcription factors, belonging to AP2/ERF, MYB, C2H2, and bHLH families, that were highly expressed in BFC. These transcription factors have been shown to directly or indirectly regulate the biosynthesis of alkaloids in Fritillaria roylei Hook [17,18,47,48].

Conclusion
In summary, we systematically analyzed the primary specialized metabolites and the metabolic pathways of steroidal alkaloids biosynthesis in BFC by comparing them with BFT. Our metabolomics data also showed that the contents of the majority of the steroidal alkaloids in BFC were higher than in BFT. Our comparative transcriptome analyses between BFC and BFT identified differentially expressed gene sets among these species. We identified a series of genes that might be associated with alkaloids biosynthesis regulation in Fritillaria species. The results from this study would be valuable to promote our understanding of the steroidal alkaloids biosynthesis pathways.

Plant materials
The bulbs of the five species of BFC were collected in the alpine regions of northwestern China: The morphology of bulbs from six varieties is shown in Fig. 5. For each species, the bulbs of six individuals were randomly collected for metabolic profiling. Three biological replicates of bulb samples from each species were frozen in liquid nitrogen and stored at − 80℃ for transcriptome sequencing and reverse transcription-quantitative polymerase chain reaction (RT-qPCR) analyses. Metabolic and transcriptional profiling analyses were performed by Novogene Co., LTD. (Beijing, China).

Metabolite extraction
Bulb homogenate was reconstituted from 100 mg of each sample with prechilled 80% methanol and 0.1% formic acid by vortexing after being ground in liquid nitrogen. Incubation of the samples on ice for five minutes followed by centrifugation for 20 min at 15,000 g (4℃). 53% methanol was used to dilute the supernatant, and the supernatant was centrifuged once more for 20 min at 15000 g (4 °C). Analyses with the LC-MS/MS system were performed on the resultant supernatant.

UHPLC-MS/MS analysis
Novogene Co., Ltd. (Beijing, China) performed the UHPLC-MS/MS analyses with a Vanquish UHPLC system and an Orbitrap Q Exactive TM HF mass spectrometer (Germany, Thermo Fisher). A 17-min linear gradient was used with a flow rate of 0.2 mL/min to inject samples onto a 100 × 2.1 mm Hypesil Goldcolumn (1.9 μm). Eluent A (0.1% formic acid in Water) and Eluent B (Methanol) were used in the positive polarity mode. Eluent A with 5 mM ammonium acetate (pH 9.0) and B with methanol was used in the negative polarity mode. It was decided that the following gradient of solvent concentrations would be used: 2% B, 1.5 min; 2% -100%B, 12.0 min; 100% B, 14.0 min; 100-2% B, 14.1 min; 2% B, 17 min. We used the Q Exactive TM HF mass spectrometer for this experiment, set it to spray voltage mode, and ran at a capillary temperature of 320 °C with a sheath gas flow rate of 40 arb. We also used an aux gas flow rate of 10 arb.

Metabolome analysis
To evaluate and validate the differences and reliability of metabolites in the samples, principal component analysis (PCA) was used. To search for differential metabolites, significant difference criteria [variable importance in projection (VIP) ≥ 1 and t-test p < 0.05] were employed. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database was used to investigate the function of the differential metabolites and metabolic pathways. Differentially accumulated metabolites (DAMs) were screened based on fold-change (FC) ≥ 2 or ≤ 0.5 among the metabolites with a VIP value > 1 and a P-value < 0.05. The top 20 differentially (up and down) produced metabolites in BFC and BFT were analyzed to find the major differential metabolites and steroidal alkaloids. Heat maps were used to figure out how the main steroid alkaloids in BFC and BFT were different.

RNA Sequencing
Six species' bulbs were used to extract the total mRNA. Each sample's mRNA library was constructed. The index-coded samples were clustered using the TruSeq PE Cluster Kit v3-cBot-HS (Illumia) on a cBot Cluster Generation System. Each sample had six biological replicates. Raw fastq data (raw reads) was first processed by in-house Perl scripts. This step removed adapter, ploy-N, and low-quality reads from raw data. Simultaneously, Q20, Q30, and GC content were calculated. All downstream analyses used clean, high-quality data.

Functional annotation and screening of differentially expressed genes (DEGs)
These databases used to annotate the gene function included the Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein Family), KO (KEGG Ortholog database), and GO (Gene Ontology). The DESeq2R package was used to compare differential expression between two conditions/groups (two biological replicates each) (1.20.0). DESeq2 classified genes as differentially expressed if their adjusted P-value was less than 0.05. In the cluster Profiler R package, gene length bias was corrected before performing GO enrichment analysis on differentially expressed genes. Derived from the GO database, differentially expressed genes were considered to be significant. We used the R package cluster Profiler to perform KEGG pathway enrichment of differentially expressed genes. We found the GO terms or KEGG pathways involved alkaloids biosynthesis that was significantly enriched in DEGs compared to BFC and BFT. The biosynthetic route for BFC steroidal alkaloids was hypothesized based on an analysis of the chemical structures of several kinds of steroid alkaloids and the role of their biocatalytic enzymes in functionalizing the steroid skeleton. Structural genes associated with alkaloid synthesis were identified by comparing significantly different genes. TFs genes upregulated in two or more of the five BFC species were selected to regulate the alkaloid synthesis.

Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR) for Verification of the RNA-Seq Data
qRT-PCR was used to validate the expression levels of the candidate genes in the proposed biosynthetic pathway of steroidal alkaloids. The primer sequences were shown in Supplemental Table S5. Ct values were calculated from 3 biological replicates and 3 technical replicates. The Ct value determined the expression level of the reference genes, and the expression level of the reference gene was calculated by 2 − △△Ct.