Comparative transcriptome analyses of genes involved in sulforaphane metabolism at different treatment in Chinese kale using full-length transcriptome sequencing

Background Sulforaphane is a natural isothiocyanate available from cruciferous vegetables with multiple characteristics including antioxidant, antitumor and anti-inflammatory effect. Single-molecule real-time (SMRT) sequencing has been used for long-read de novo assembly of plant genome. Here, we investigated the molecular mechanism related to glucosinolates biosynthesis in Chinese kale using combined NGS and SMRT sequencing. Results SMRT sequencing produced 185,134 unigenes, higher than 129,325 in next-generation sequencing (NGS). NaCl (75 mM), methyl jasmonate (MeJA, 40 μM), selenate (Se, sodium selenite 100 μM), and brassinolide (BR, 1.5 μM) treatment induced 6893, 13,287, 13,659 and 11,041 differentially expressed genes (DEGs) in Chinese kale seedlings comparing with control. These genes were associated with pathways of glucosinolates biosynthesis, including phenylalanine, tyrosine and tryptophan biosynthesis, cysteine and methionine metabolism, and glucosinolate biosynthesis. We found NaCl decreased sulforaphane and glucosinolates (indolic and aliphatic) contents and downregulated expression of cytochrome P45083b1 (CYP83b1), S-alkyl-thiohydroximatelyase or carbon–sulfur lyase (SUR1) and UDP-glycosyltransferase 74B1 (UGT74b1). MeJA increased sulforaphane and glucosinolates contents and upregulated the expression of CYP83b1, SUR1 and UGT74b1; Se increased sulforaphane; BR increased expression of CYP83b1, SUR1 and UGT74b1, and increased glucosinolates contents. The desulfoglucosinolate sulfotransferases ST5a_b_c were decreased by all treatments. Conclusions We confirmed that NaCl inhibited the biosynthesis of both indolic and aliphatic glucosinolates, while MeJA and BR increased them. MeJA and BR treatments, conferred the biosynthesis of glucosinolates, and Se and MeJA contributed to sulforaphane in Chinese kale via regulating the expression of CYP83b1, SUR1 and UGT74b1. Electronic supplementary material The online version of this article (10.1186/s12864-019-5758-2) contains supplementary material, which is available to authorized users.


Background
Sulforaphane is a natural isothiocyanate available from widely consumed cruciferous vegetables, including broccoli, radish, and kale. Multiple characteristics of sulforaphane have been widely reported during the past two decades. Sulforaphane has various effects including antioxidant, antitumor formation and anti-inflammatory effect [1][2][3][4][5]. This compound has been shown to suppress or prevent tumor formation and development [2,5].
One reason of for the great interest on sulforaphane is that it is an inducer of phase 2 enzymes, including phase 2 detoxification enzymes glucuronyltransferase (GT) and NAD(P)H:quinone reductase 1 (NQO1) [2,6,7]. Moreover, sulforaphane could stimulate the accumulation of glutathione, which plays important roles in cells including antioxidation, immune response, detoxification, and cancer pathogenesis [8]. For instance, sulforaphane has been reported to induce apoptosis and cell cycle arrest of human colon cancer cells by increasing cyclins A and B1 expression [2]; the administration of sulforaphane shows antitumor formation [9]; sulforaphane inhibits the Aβ1-42-induced excessive secretion of interleukin-1β (IL-1β) in human microglia-like cells via activation of nuclear factor erythroid 2-related factor 2 (Nrf2), a bZIP transcription factor and an inducer of phase 2 genes [10].
They confirmed that NaCl and SA co-treatment induced sulforaphane and isothiocyanates accumulation was independent of the biosynthetic genes of cytochrome P45083b1 (CYP83b1), CYP83a1 and CYP79b2, but associated with the increased hydrolytic enzymes of myrosinase (MYO) and epithiospecifier moifier 1 (ESM1) [17]. Other researches showed the indole-3-acetic acid, methyl jasmonate (MeJA) and brassinosteroids treatments upregulated the production of glucosinolates [19,20],whereas the contents of glucosinolates could be decreased under selenium (Se) stress [21,22]. Tian et al. showed the expression of CYP79b2, CYP83b1, and CYP83a1 were significantly downregulated in B. oleracea following Se treatment [22]. Ku et al. performed a comparative transcriptome analysis in two broccoli cultivars under MeJA treatment and identified that the glucosinolates' core structure biosynthesis genes (including CYP79b2, UDP-glycosyltransferase 74B1, UGT74b1, SUR1) were upregulated by MeJA [23]. Several comparative transcriptome analyses in Chinese kale (B. oleracea var. alboglabra Bailey) [24,25] and watercress [26] showed that the expression patterns of aforementioned genes related to glucosinolates biosynthesis were changed in different tissues. These reports demonstrate the complex and diversity glucosinolates biosynthesis and metabolism by different stresses and in different plants. It has been reported that the influence of stress-induced biosynthesis of sulforaphane is dependent on plant variety and growth conditions [17,27]. However, the causality of abiotic stresses-induced dysregulation of sulforaphane and glucosinolates biosynthesis has yet to be clearly explored, and the underlying mechanisms should be dissected.
In comparison with the next (second)-generation sequencing (NGS), the third-generation transcriptome (single-molecule real-time, SMRT) sequencing produced more complete transcriptome data with longer reads. SMRT sequencing has been widely used for plant genome assembly [28]. The problem of higher error rate in SMRT sequencing data could be resulted by correcting with NGS-generated short reads. Here we investigated the molecular mechanisms related to glucosinolates biosynthesis under different abiotic stresses in Chinese kale using combined NGS and SMRT sequencing data. Chine kale is a native Chinese brassica vegetable with good flavor and high content of vitamin C, total phenolics, and glucosinolates. The market demand of Chinese kale sprouts is increasing due to the rich in glucosinolates. Transcriptome with complete and full-length reads would be to dissect molecular mechanisms of glucosinolates biosynthesis in Chinese kale.
We confirmed that the contents of sulforaphane (Fig.  1b) and glucosinolates (Fig. 1c, total, aliphatic and indolic) in Chinese kale seedlings were significantly reduced by NaCl treatment. Se and MeJA treatments obviously increased the contents of sulforaphane in kale seedlings in comparison with control (p < 0.01; Fig. 3a), and MeJA and BR treatments significantly increased the contents of total and aliphatic glucosinolates (p < 0.01, Fig. 1c). The content of aromatic glucosinolates was only increased by Se treatment (p < 0.01) compared with Fig. 1 The content of sulforaphane and glucosinolates in Chinese kale seedlings in response to different treatments. a the content of sulforaphane in Chinese kale seedlings under different treatments with series concentrations. b the content of sulforaphane under different treatments with optimal contents. c the total content of 18, aliphatic, aromatic and indolic glucosinolates in Chinese kale seedlings (dry weight, g). d the content of 18 glucosinolates by ULPC-MS methods. MeJA, methyl jasmonate. Se, sodium selenite. BR, brassinolide. GLs, glucosinolates. In Fig. 3a and b, * and ** notes p < 0.05 and 0.01 vs. control, respectively, while # and ## notes p < 0.05 and p < 0.01 vs. NaCl, respectively. In Fig. C, * notes p < 0.05 vs. control, and # notes p < 0.05 vs. NaCl, respectively control, while MeJA increased both aliphatic and indolic glucosinolates, but decreased aromatic glucosinolates (p < 0.01, Fig. 1c). Figure 1d shows the influence of four treatments on the accumulation of 18 glucosinolates. We observed that BR showed slight influence on indolic glucosinolates, followed by aromatic glucosinolates and significantly obvious influence on aliphatic glucosinolates, including gluconapin, glucobrassicanapin, and glucoraphasatin (Fig. 1d); and MeJA showed obvious influence on most glucosinolates, including glucobrassicanapin, glucoraphanin, glucoraphasatin, n-hexyl-glucosinolates and glucobrassicin (p < 0.05).  Table 2). PacBio RSII platform generated a total of 617,063 circular consensus sequences (CCSs) with a full length of 467,672 bp. The full-length non chimera (FLNC) reads number was 434,967, with an average length of 1720 bp ( Table 2). PacBio RSII platform produced a total of 199,106 consensus reads and 14,267,901 subreads (13.46 G bases, with an average length of 944 bp, and an N50 of 1572 bp), which were then corrected using the Illumina reads (Additional file 1: Figure S1). The length distributions of gene (Additional file 2: Figure S2A and B), corrected subreads and consensus reads (Additional file 2: Figure S2C), lncRNA number (Additional file 2: Figure S2 S2D) and simple sequence repeat (SSR) motifs (Additional file 2: Figure S2E) are shown in Additional file 2: Figure S2. There was similar distribution of CDS of these unigenes produced by PacBio RSII platform and Illumina Hiseq 2500 platform, and most CDSs were less than 1800 bp (Additional file 2: Figure  S2A and B). A total of 6369 long non-coding RNAs (lncRNAs) were identified from PacBio transcriptome (Additional file 2: Figure S2D).

Summary of Illumina
In comparative analysis, we observed that the unigene number originated from PacBio RSII transcriptome (185,134 genes) was higher than that of Illumina transcriptome (130,553 genes; Table 1 and Fig. 2a and b), including 130,553 overlapping genes; the number of transfection factors (TFs) from PacBio RSII transcriptome was 9407, which was higher than 8006 from Illumina (Table 1; Fig. 2c and d).

Functional annotation of unigenes
The GO, KEGG, and KOG functional annotations of unigenes in PacBio RSII transcriptome were identical to unigenes in Illumina transcriptome (Additional file 3 Figure S3). GO classification showed most unigenes were associated with cellular and metabolic processes with the molecular functions of binding and catalytic activity (Additional file 3 Figure S3A). KEGG and KOG classification showed that the top clusters involved unigenes were energy and carbohydrate metabolism, transport, catabolism and signal transduction (Additional file 3 Figure S3B and C). Annotation against NR database showed that unigenes in PacBio transcriptome were identical to Brassica oleracea (48.4%), followed with Brassica napus (38.0%) and Brassica rapa (3.0%), while  Figure S3D).  Fig. 3b). The heatmap clustering of these DEGs is shown in Fig. 3c.

GO and KEGG pathway enrichment analysis
Functional enrichment analysis showed NaCl, BR, MeJA, and Se-induced DEGs were associated with 26 135 GO terms including biological process terms of indole-containing compound metabolic process, cellular glucan metabolic process, carbohydrate metabolic process, and polyol biosynthetic process; cellular component terms of Holliday junction resolvase complex and external encapsulating structure; and molecular function terms of hydrolase activity of hydrolyzing O-glycosyl compounds, calcium ion binding, xyloglucan-xyloglucosyltransferase activity, and polysaccharide metabolic process (Additional file 4: Table  S1). These DEGs were significantly involved in KEGG pathways including starch and sucrose metabolism, limonene and pinene degradation, stilbenoid, diarylheptanoid and gingerol biosynthesis, phenylalanine, tyrosine and tryptophan biosynthesis, valine, leucine and isoleucine biosynthesis, cysteine and methionine metabolism, and glucosinolate biosynthesis (Additional file 5: Table S2). The DEGs induced by MeJA, BR, and Se treatments were associated with plant-pathogen interaction pathway, and Se-induced DEGs were associated with the biosynthesis of glucosinolates (Additional file 5: Table S2 and Additional file 6: Figure S4). The KEGG pathways associated with DEGs induced by four stimuli are shown in Additional file 5: Table S2.

Discussion
Glucosinolates are a group of sulfur-rich anionic products naturally generated in widely consumed cruciferous vegetables. We performed the combined transcriptome analysis of NGS and SMRT sequencing and investigated the molecular mechanism of sulforaphane and glucosinolate biosynthesis in Chinese kale seedlings in response to abiotic stresses. The high error rate in long reads generated from PacBio platform was corrected by short reads produced by Illumina sequencing. We found SMRT transcriptome contained 185,134 unigenes, that was higher than 130,553 unigenes from Illumina transcriptome, suggesting higher performance of SMRT sequencing compared with NGS.
The combination of NGS short reads and SMRT long reads identified 6893, 13,287, 13,659 and 11,041 DEGs in kale seedlings in response to NaCl (75 mM), MeJA (40 μM), Se (100 μM) and BR (1.5 μM) treatment, respectively. KEGG pathway enrichment analysis showed these genes were associated with energy metabolism, biosynthesis of amino acids including phenylalanine, tyrosine, tryptophan, phenylalanine, cysteine, methionine, valine, leucine and isoleucine, and glucosinolate (Additional file 5: Table S2). These pathways were core pathways for the synthesis of glucosinolates (Additional file 6: Figure S4). These facts showed that these biosynthetic pathways were influenced by NaCl, MeJA, Se and BR treatments in Chinese kale seedlings, and the glucosinolates might take important roles in the plant defense or resistance to these abiotic stresses.
In stressed or specific conditions, glucosinolates are released from vacuoles and are hydrolyzed by MYO to unstable thiohydroximate-O-sulfonate, which spontaneously converts to isothiocyanates including sulforaphane [17,29]. Glucosinolates are grouped into three groups, including aliphatic, aromatic and indolic glucosinolates. The tryptophan amino acid is the substrate of indolic glucosinolates, and methionine is the substrate of aliphatic glucosinolates [13,17]. The subsequently conversion of tryptophan-derived indolic glucosinolates and methionine-derived aliphatic glucosinolates to S-alkylthiohydroximates (SAM) is catalyzed by CYP83b1 and CYP83a1 [30]. Next, SAMs are conversed into Fig. 4 The FKPM values and relative levels of 23 genes related to glucosinolate biosynthesis. a the FKPM values of 23 genes in samples treated with different treatments by sequencing. b the relative expression levels of 23 genes determined by qRT-PCR. * notes p < 0.05 vs. control, and # notes p < 0.05 vs. NaCl, respectively. MeJA, methyl jasmonate; Se, sodium selenite, BR, brassinolide thiohydroximates, desulfoglucosinolates and sulfated products by SUR1, UGT74B1 and ST5b_c, respectively, in step-by-step ways [30][31][32]. We confirmed that the content of aliphatic glucosinolates was highest in Chinese kale seedlings, followed with indolic glucosinolates and then aromatic glucosinolates. In addition, we confirmed that the content of aliphatic glucosinolates was increased by MeJA and BR treatments, while aromatic glucosinolates was increased by Se and BR treatments (Fig. 3). Transcriptome analysis showed that the expression of genes in core pathways of aliphatic (including CYP83a1, SUR1, and UGT74b1) and aromatic glucosinolates (including CYP83b1, SUR1 and UGT74b1), were increased by MeJA and BR treatments ( Fig. 4 and Additional file 6: Figure S4). In addition, we found the contents of sulforaphane and total glucosinolates were increased by MeJA and BR treatments compared with control. The expressions of CYP83b1, SUR1 and UGT74b1 were upregulated followed MeJA treatment (Fig. 4). Although NaCl treated decreased the content of aliphatic and indolic glucosinolates and sulforaphane, the expression of the aforementioned core genes did not decrease by NaCl treatment. This might due to the decreased expression of ST5a_b_c by NaCl treatment (Additional file 7: Figure S5). We also observed the decrement in ST5a_b_c expression by Se, MeJA and BR. It has been reported that the synthesis of aliphatic glucosinolates could be controlled by ST5b_c, and the aromatic glucosinolates is controlled by ST5a (Additional file 6: Figure S4). These demonstrated the complex and ordinal mechanism of glucosinolates biosynthesis in plant [33].
ST5a_b_c, are three desulfoglucosinolate SOTs that catalyze the final step in glucosinolate biosynthesis by catalyzing different substrate specificities and producing S-containing products [34,35]. The most important glucosinolate, sulforaphane, is a sulfur-rich hydrolysis product of aliphatic (methylseleno)glucosinolate, the methionine-derived glucosinolates [36][37][38]. Our current study confirmed that the contents of all glucosinolates and sulforaphane in Chinese kale were decreased by NaCl stress (75 mM). This was consistent with the results from the study by Esfandiari et al. [17]. Esfandiari et al. showed that NaCl stress (80 and 160 mM) decreased the content of glucosinolates in broccoli via suppressing the expression of CYP83a1 and CYP83b1 [17]. They also suggested that the increased contents of sulforaphane by NaCl stress (80 and 160 mM) were associated with upregulation of MYO and ESM1. Guo et al. showed the higher expression of MYO was found in Chinese kale roots with low-glucosinolates [25], which might responsible for the increased sulforaphane [17]. Our study demonstrated that the expression of CYP83a1 and CYP83b1 were not significantly influenced by 75 mM NaCl stress. In addition, the treatment with 75 mM NaCl decreased the contents of both sulforaphane and glucosinolates. Although total and the major aliphatic glucosinolates were increased by MeJA and BR treatments, the FPKM of ST5a (crucial for aromactic glucosinolates) and ST5b_c (crucial for aliphatic  Figure S4) coding sequences were decreased by MeJA and BR compared with control (Additional file 7: Figure S5). These differences might due to the plant variety, tissues or growth conditions [17,27,39].
Other studies of Chinese kale transcriptome showed that the expression of CYP83a1 and CYP83b1 showed high expression level in root [24], and in plants with low contents of glucosinolates [25]. It has been reported that the biosynthesis of glucosinolates in susceptible varieties of Chinese cabbage were active than that in resistant varieties, and glucosinolates were increased by infection of Plasmodiophora [39], while that in resistant varieties was not influenced by treatments. They also showed that the two susceptible varieties had different contents of indolic and aliphatic glucosinolates in leaves or roots, while aromatic glucosinolates were the major glucosinolates in resistant varieties in response to Plasmodiophora, which were differentially influenced by the addition of SA and JA [39]. Our study was performed with 7-day old (post germination) Chinese kale seedlings, and the whole plant contents of glucosinolates were detected. The distribution of glucosinolates was averaged then, and these might also be the reasons for the differences between our results with others from Ludwig-Müller et al. [40], Esfandiari et al. [17] and others.

Conclusion
In summary, we performed the de novo assembly of Chinese kale genomes using SMRT sequencing combined with NGS sequencing. DEGs induced by NaCl, MeJA, BR and Se treatment were identified using combination of NGS and SMRT transcriptome. MeJA and BR treatments increased the content of total glucosinolates and/or sulforaphane by upregulated the expression of genes related to glucosinolate core pathways, including CYP83b1, CYP83b1, SUR1 and/or UGT74b1. We also confirmed that NaCl (75 mM) inhibited the biosynthesis of sulforaphane, tryptophan-derived indolic glucosinolates and methionine-derived aliphatic glucosinolates, and MeJA treatment enhanced them. We concluded that the treatment of MeJA, Se and BR, especially MeJA, confer the formation and biosynthesis of glucosinolates and sulforaphane.

Extraction and measurements of sulforaphane and glucosinolate
Sulforaphane was extracted from frozen samples of kale seedlings. Samples were grinded into powder and then immersed into purified water for 2 h at room temperature, followed with dichloromethane extraction for 10 min. Sodiumfulfate was used for drying. Extractions were then firstly filtered, vacuum dried, eluted using acetonitrile, and finally filtered (0.22 μm). Sulforaphane standard was purchased from Shanghai yuanye Bio-Technology, China, (CAS No. 4478-93-7, Art. No.G18O8C45982). The content of sulforaphane was determined using high performance liquid chromatography (HPLC: Shimadzu, Kyoto, Japan) methods with conditions as follows: C18 column (250 mm × 4.6 mm, 5 μm, at 30°C), mobile phase (20% acetonitrile-80% water) at speed of 1 mL/min, retention time 20 min, and wavelength 202 nm.
Glucosinolate were methanol-extracted from kale seedlings. Grinded powder was extracted with 70% methanol at 75°C for 20 min, with addition of sinigrin (in water, 0.5 mg, internal standard). After the glucosinolate extraction became cold, barium acetate was added, followed with centrifugation. The supernatants were then extracted with 70% methanol and centrifuged for supernatants. Desulfoglucosinolates were isolated using anion-exchange chromatography. DEAESephadex A-25 was pretreated with acetic acid and then filled with resin. Crude extract was eluted with water and desulfoglucosinolates were eluted using sulfatase solution at room temperature overnight. Desulfitation glucosinolates were collected using water-elution. Desulfoglucosinolates were separated using chromatography with gradually increased methanol concentrations [5% methanol for 1 min, linear gradient methanol (5%~100%) for 8 min, 100% methanol for 2 min, and 5% methanol for 2 min]. The contents of desulfoglucosinolates were determined using ULPC-MS technology [diode array detector (229 nm), C18 column (100 × 2.1 mm), mobile phase of 75% methanol-25% water] and mass spectrometer.

RNA extraction and transcriptome sequencing
Altered genomic expression profiles in response to different stimulation (NaCl, MeJA, SE and BR) were identified using the NGS. Total RNA was isolated from kale seedlings using TRIzol (Invitrogen, USA) and treated with RNase-free DNase I (Takara, Japan). After quality assessment using Agilent 2100 Bioanalyzer (Agilent Technologies, USA) and NanoDrop2000c spectrophotometer (NanoDrop products, USA), RNA samples were used for the synthesis of the first strand cDNA using a SMARTer PCR cDNA Synthesis Kit (Clontech Laboratories, USA). PCR optimization followed with large scale PCR were conducted for Illumina sequencing (the NGS, Hiseq PE 150) and the PacBio RSII sequencing (the third generation sequencing). Fifteen libraries were constructed and subjected to the NGS using Illumina sequencing platform and third-generation sequencing using the Pacific Biosciences RSII sequencing platform.

Data processing
The NGS data were collected from Illumina sequencing platform and processed as usual. Data quality estimators, including base error rate (%), phred score (Q 30 ), GC content (%), and clean data rate (%), was evaluated. Sequence data (CCS) containing full length and non-full length reads were generated from PacBio RSII platform and were processed using the SMRTlink 4.0 software (min Passes = 1, min Predicted Accuracy =0.8). Non-full length and full-length fasta files were then fed into the cluster step (isoform-level clustering, ICE). Hiseq reads in full-length transcripts were adjusted using proovread software [41] and used as the reference sequences in this study. Redundancy in the adjusted transcripts was removed using CD-HIT-EST program and unigenes were obtained subsequently.

QRT-PCR analysis
Isolated total RNA from kale seedlings were reversely transcribed into the first-strand cDNA using Bestarq PCR RT Kit (DBI Bioscience, Germany). Primers were synthesized by Sangon (Shanghai, China) and were listed in Table 2. DNA template (1 μg) was used for PCR amplification using specific primers synthesized by Sangon (Shanghai, China, Table 1). Amplification (20 μL) was performed using Sybr Green qPCR master Mix (GEN-Eray, GK8020) on an ABI 7500 fast real time PCR machine (ABI, USA). The reaction conditions were as follows: 94°C for 3 min, 40 cycles of 94°C 20 s, 58°C 20 s, 72°C 20 s. Relative expression levels of genes were analyzed using 2 -△△Ct methods. The 18 s gene was used as internal normal control gene in this study.

Statistical analysis
All data of qRT-PCR analysis and contents of sulforaphane and glucosinolate were presented as mean ± standard deviation. All experiments were repeated at least three times. Statistics were performed using GraphPad Prism 6.0. One-way ANOVA was used to compare the data between groups with Holm-Sidak post hoc test. P < 0.05 was considered statistically significant.