Transcriptomes analysis reveals novel insight into the molecular mechanisms of somatic embryogenesis in Hevea brasiliensis

Background Somatic embryogenesis (SE) is a promising technology for plant vegetative propagation, which has an important role in tree breeding. Though rubber tree (Hevea brasiliensis Muell. Arg.) SE has been founded, few late SE-related genes have been identified and the molecular regulation mechanisms of late SE are still not well understood. Results In this study, the transcriptomes of embryogenic callus (EC), primary embryo (PE), cotyledonary embryo (CE), abnormal embryo (AE), mature cotyledonary embryo (MCE) and withered abnormal embryo (WAE) were analyzed. A total of 887,852,416 clean reads were generated, 85.92% of them were mapped to the rubber tree genome. The de novo assembly generated 36,937 unigenes. The differentially expressed genes (DEGs) were identified in the pairwise comparisons of CE vs. AE and MCE vs. WAE, respectively. The specific common DEGs were mainly involved in the phytohormones signaling pathway, biosynthesis of phenylpropanoid and starch and sucrose metabolism. Among them, hormone signal transduction related genes were significantly enriched, especially the auxin signaling factors (AUX-like1, GH3.1, SAUR32-like, IAA9-like, IAA14-like, IAA27-like, IAA28-like and ARF5-like). The transcription factors including WRKY40, WRKY70, MYBS3-like, MYB1R1-like, AIL6 and bHLH93-like were characterized as molecular markers for rubber tree late SE. CML13, CML36, CAM-7, SERK1 and LEAD-29-like were also related to rubber tree late SE. In addition, histone modification had crucial roles during rubber tree late SE. Conclusions This study provides important information to elucidate the molecular regulation during rubber tree late SE. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07501-9.


Background
Rubber tree (Hevea brasiliensis Muell. Arg.), a tropical rubber-producing tree within the Euphorbiaceae family which is native to the great Amazonian basin of South America, is now widely cultivated to produce natural rubber in Southeast Asia [1]. Rubber tree is a perennial cross-pollination tree with a long juvenile period, which makes low efficiency of hybrid breeding [2]. Rubber tree is still propagated mostly by grafting, although the interaction between scion and rootstock of the grafted tree affects the growth, and natural rubber yield [3,4].
Somatic embryogenesis (SE) is a promising and rapid vegetative propagation technique for plant regeneration. Plant regeneration via SE process in rubber tree had been established using different kinds of explants including immature anthers, internal integuments of immature fruits, inflorescence, as well as root [5][6][7][8]. The regenerated plants have juvenile characters and their own roots, which are called self-rooted juvenile clones (SRJCs). Compared with donor clones, SRJCs is superior in growth, rubber yield and stress resistance [9][10][11], which is a promising new rubber tree planting material in the future. There are two pathways (indirect primary SE, direct primary SE) to obtain primary somatic embryos [11]. Secondary SE allows to produce an unlimited number of secondary somatic embryos in a cyclic routine [10]. At present, the SE process is limited by irregular germination of the somatic embryos and low efficiency of plantlet recovery from somatic embryos [11], only a limited number of rubber tree genotypes can obtain regeneration plant [11][12][13][14][15].
SE of rubber tree can only be obtained for a limited number of genotypes [12][13][14]. Few studies have reported the molecular regulation mechanism of rubber tree SE. For example, Charbit et al. found that five cDNAs were differentially expressed in the embryogenic regenerating line could be enable an early diagnosis of friable rubber tree callus embryogenic potential, but the functions of these cDNAs haven't been identified [12]. Li et al. [36] found that three MADS-box genes (genes encode transcription factors that promote SE in many plant species [37][38][39]), were differentially expressed during rubber tree early embryogenesis , suggesting MADS-box genes involved rubber tree SE. Piyatraku et al. reported that 11 AP2/ERF genes might act as expression markers linked to the different stages of the somatic embryogenesis process in rubber tree [14]. Some studies have also shown that AP2/ERF genes play important roles in somatic embryogenesis as these genes involved in SE regulation [40][41][42]. However, the molecular regulation mechanisms of the late stage of rubber tree SE are still not well understood. To clarify whether the regenerate competence of different embryos depend on the genes during late SE, we investigated the expression profiling using RNA-seq technology. This study will offer valuable information for the molecular regulation mechanisms of rubber tree late SE.

Induction of somatic embryogenesis
The procedure of somatic embryogenesis and regeneration in H. brasiliensis was established ( Fig. 1) as described previously [5]. The immature anthers were cultured in solid Murashige and Skoog (MS) medium supplemented with 2, 4-dichlorophenoxyacetic acid (2, 4 -D), kinetin (KT) and naphthylacetic acid (NAA) for 50 days. At the end of the period, the embryogenic calluses (ECs) were obtained. ECs were placed in the MS medium containing indole-3-acetic acid (IAA) and gibberellic acid (GA 3 ) for embryo induction. After 40 days, primary embryos (PEs) were collected. The PEs were transferred to MS medium containing 6-benzyl aminopurine (6-BA) and AgNO 3 for growing. After 40 days, two different embryos based on their phenotype (cotyledonary embryo (CE), abnormal embryo (AE)) were observed in the culture medium.. We observed a significant difference between CEs and AEs in phenotype. The CEs and AEs were placed on half-strength MS medium containing IAA and BA. The CEs turned stronger into the mature cotyledonary embryo (MCE) 20 days later, whereas the AEs turned brown and grown up into withered abnormal embryo (WAE). After 30 days, the MCEs grew into complete seedlings, whereas the WAEs turned black and died. Based on the above phenotypic observation, six different samples during SE were selected for further study.

Transcriptome analysis of rubber tree SE
High-throughput sequencing generated 915,535,874 raw reads in EC, PE, CE, AE, MCE and WAE samples. A total of 887,852,416 clean reads were retained by filtering the reads with adaptor sequences and ambiguous "N" base. The percentage of quality score above 30 (Q30) was 97.92% and the GC percentage was 43% (Table 1). On average, 85.92% of the clean reads were mapped to H. brasiliensis genome.

Global analysis of gene expression during rubber tree
A Venn diagram was created to find the overlapped genes in the four different developmental stages of H. brasiliensis SE (Fig. 3a). A total of 25,841 genes overlapped in the four stages. Among them, 155 genes overlapped between EC and PE; 290 genes overlapped between PE and CE; 193 genes overlapped between CE and MCE. A total of 388, 297, 152 and 582 genes were   (Fig. 4).

GO analysis of DEGs
To further demonstrate the unigenes functions, GO assignments were carried out using the Blast2GO program. In AE vs. CE, 843 DEGs were classified into three major categories: biological processes (BP), cellular components (CC) and molecular function (MF). A total of 41 GO subcategories were enriched over three major functional categories. The main subcategories are shown in Fig. 5a. The six major subcategories of BP were metabolic process, cellular process, single-organism process, biological regulation, localization and response to stimulus. The five major subcategories of CC were membrane,  (Fig. 5b). The major subcategories of three categories were consistent with the result in AE vs. CE.

Differential expression of hormone signal transduction related genes between CE and AE
Various phytohormones induced SE and regeneration in several plants have already been reported. For instance, auxin was used alone or in combination with other plant growth regulators on plant SE induction [43,44]. To further understand hormone regulation, FPKMs of hormonal signal transduction related genes were analyzed ( Fig. 7a and Table S1). Among auxin signal transduction related genes, AUX-like5, IAA9-like, IAA28-like and GH3.1 were up-regulated in CE. SAUR71-like were highly expressed in AE than in CE. AUX22D-like, AUX28like, AUX-like1, AUX-like2, SAUR32-like, IAA14-like and IAA27-like were highly expressed in MCE. ARF5-like was lowly expressed in CE but highly expressed in MCE. These genes participated in the auxin signaling pathway, which was important for cell enlargement and plant growth (Fig. 7b).
Among abscisic acid (ABA) signal transduction related genes, PYL2-like was down-regulated in CE. PYL4-like was down-regulated in AE. Among jasmonic acid (JA) signal transduction related genes, JAZ7 was highly expressed in CE than in AE. JAZ5 was upregulated in AE. Among ethylene (ET) signal transduction related genes, RAP2-3 was up-regulated in CE and in AE. RAP2-12-like and WRI1-like were highly expressed in CE. ERF4-like was up-regulated in MCE. ERF018-like was only up-regulated in AE. All the genes involved in the hormones signaling transduction pathways, including auxin, ABA, JA, ET, suggested that these hormones had an indispensable role in their complicated crosstalk process during H. brasiliensis late SE. In vitro studies have suggested the role of various regulatory genes in embryogenic transition that are triggered by plant hormones [44]. The dynamic changes of these genes expression were critical for development of SEs.   Fig. 8a and Table S2. WRKY40 and WRKY70 were up-regulated in CE and down-regulated in AE. WRKY23 were highly expressed in AE than in CE. MYB26-like and MYB98-like were up-regulated in AE. MYBS3-like and MYB1R1-like were up-regulated in MCE. AGL11 and AGL15 were up-regulated in AE. BBM2 was highly expressed in AE. AIL6 was highly expressed in CE than in AE. bHLH93-like was highly expressed in CE. The  [46], SERK [47,48], LEA [49,50], have been identified to play a vital role during plant embryogenesis. CML13 and CML36 were up-regulated in CE but down-regulated in AE. CAM-5like and CAM (LOC110641724) were up-regulated in AE but had not changed in CE. CAM-7 was upregulated in CE but down-regulated in AE. SERK1 was up-regulated in CE. LEAD-34-like and SERK2-like showed higher expression in AE than in CE. LEAD-29like was up-regulated in MCE. The dynamic variation of the FPKM of these somatic embryogenesis-related genes suggested that they were critical for H. brasiliensis late SE.

Differential expression of histone modifications related genes between CE and AE
The plant growth regulators and abiotic stress can contribute SE. In the meantime, these factors may induce epigenetic modifications [51]. Histone modification is one of the most important epigenetic modifications and plays a key role in the regulation of gene expression [52]. Therefore, the expression levels of histone modifiers were analyzed and shown in Fig. 8b and Table S3. CURLY LEAF (CLF), encoding one of polycomb repressive complex 2 (PRC2) catalytic subunit that repress gene expression through trimethylating histone H3 at lysine 27 (H3K27me3), was higher expression in AE than in CE. The histone H3 lysine 9 methyltransferase genes (SUVH1-like, SUVH3-like, SUVH4-like and SUVH9), SUVR3-like, EZA1-like and ASHH3-like were expressed at a higher level in CE. In addition, histone demethylation related genes, LSD1-homolog 1-like was highly expressed in CE. LSD1-homolog 2 was up-regulated in MCE. The increased expression of genes in CE or MCE suggested that it is likely to have a function during late SE.
The acetylation of histones is believed to promote open chromatin state and activate gene transcription.  Ten of the eleven genes related to histone acetylation showed significant differential expression in CE vs. AE. HAG6, HAC12-like, MCC1 and GCN5-like were upregulated in CE. HAG11, HAG16, HAG18 and HATBlike were up-regulated in AE. 7 of the 13 genes related to histone deacetylation showed an obvious difference of expression in CE vs. AE. HDAC15-like and HDAC19 were highly expressed in CE. HDAC6-like, HDAC9 and SAP18-like were up-regulated in AE.
The histone phosphorylation related genes (Aurora-1, Aurora-2 like, Aurora-3 and Aurora-3 like) were highly expressed in AE than in CE. Plant Auroras can be divided into two categories according to the functions of Auroras. The alpha Auroras (Aurora 1 and Aurora 2) involve in controlling formative divisions throughout plant development. The beta Aurora (Aurora 3) associate with chromosome separation [53]. These genes highly expressed in AE can be used as candidate genes for indepth study in vitro embryogenesis.

qPCR verification of selected DEGs
To verify the reliability of transcriptome data, twenty genes related to SE were selected to carry out expression level analysis using qRT-PCR across 6 different tissues of H. brasiliensis (Fig. 9). Based on the transcriptome data analysis of H. brasiliensis SE, ARF4-like, GST, I2'Hlike, PRX5-like, RBX1a-like, WRKY40 and WRKY70 were highly expressed in CE than in AE. E2 20-like, two EP3likes, ERF9-like, FLC-like, five H3.2 genes, H3.2-like, MYB98-like and U17-like were lowly expressed in CE than in AE. The qPCR results validated the expression levels of 19 genes which were highly consistent with transcriptome data.

Discussion
SE is a promising and rapid vegetative propagation technique for plant regeneration. However, the process of SE remains poorly understood and many factors impact upon competence for SE. Many problems need to be resolved and one of these could be a deep understanding of the molecular mechanisms involved either negatively o positively in the generation of the somatic embryos. The transcriptome analysis of plant SE revealed a large number of potential key factors of embryogenesis [25,26,[54][55][56]. In longan early SE, 27 SE specific or preferential genes and 28 NEC (Non-embryogenic callus) preferential genes were characterized as molecular markers genes for longan early SE. The NEC-specific marker genes maybe the key inhibitor of the transition from NEC to EC, while the SE markers may function on SE development [26]. In this study, we obtained the transcriptome analysis of rubber tree SE derived from EC, PE, CE, AE, MCE and WAE. The de novo assembly generated 36,937 unigenes. We found the regenerate competence of CE and AE had obvious differences during late SE. Therefore, this study mainly focused on DEGs in CE vs. AE and MCE vs. WAE.
In CE vs. AE, 376 DEGs were identified and assigned to 46 KEGG pathways. The 771 DEGs were also assigned to 57 KEGG pathways in MCE vs. WAE. The most representative pathways were phytohormones signaling pathway, biosynthesis of phenylpropanoid, and sucrose and starch metabolism in CE vs. AE and MCE vs. WAE. The significant role of phenylpropanoid biosynthesis in plant SE development has been studied, this pathway is associated with the tolerance of stress responses, probably through the reinforcement of the cell wall [57]. The phenylpropanoid biosynthesis-related genes were significantly enriched in papaya embryogenic callus [25] and in strawberry embryogenic callus [19]. In addition, external stimuli and plant hormones related genes played a key role in the SE process [58,59]. In longan early SE, plant hormones related genes were enriched, especially the cytokinin and auxin signaling components [26]. Moreover, signaling involved in sucrose and starch accumulation is essential for somatic embryogenetic development [60]. The nature of the carbohydrate supply can reflect the signaling networks that control development, including somatic embryogenesis [61]. Sucrose was added to the culture medium as exogenous carbon sources in conifers SE [62,63]. The germination of Norway spruce (Picea abies) somatic embryos was affected by carbohydrates [64]. Endogenous carbohydrate status varies throughout the somatic embryogenesis of conifers [65], and can be used to identify cell lines with high-quality embryos [66,67]. Genes involved in the three pathways can play important role in H. brasiliensis late SE.
Auxin is critical regulator in different developmental stages of SEs [68][69][70]. The addition of exogenous auxin can affect the expression level of endogenous IAA [59,[71][72][73]. Dynamic change of endogenous IAA has been proved to induce plant SE and improve SE competency [74]. Auxin/Indole-3-Acetic Acids (Aux/IAAs), Gretchen Hagen3s (GH3s), small auxin upregulated RNAs (SAURs) and auxin response factor (ARF) have been identified as auxin-responsive genes in auxin signaling and homeostasis [75][76][77], which can regulate downstream genes precisely and rapidly, and further regulated plant growth and developmental processes. Aux/IAA family plays a key role in inhibiting the expression levels of genes transcriptional activated by ARFs [78,79]. In low auxin levels, Aux/IAA proteins interacted with ARFs and inhibited activation of auxin-responsive genes. In high auxin levels, these proteins can interact with transport inhibitor response 1/auxin signaling F-box (TIR1/AFB) receptors to be ubiquitinated and subsequently resolved by the 26S proteasome [80][81][82]. The liberated ARFs regulated the expression of auxin-responsive genes (Fig.  7b). There were 29 Aux/IAA family members in Arabidopsis, but not all of them were induced by auxin [83]. Many Aux/IAA genes have also been identified in other plants, such as, Eucalyptus grandis [84], Solanum Lycopersicon [85], Cucumis sativus [86], Populus trichocarpa [87], Zea mays [88] and Oryza sativa [89,90]. SAUR genes were consisted of a large multigene family, played crucial roles in regulating plant growth and development [91,92]. GH3 family participated in a series of hormonedependent processes in plant, including root growth, and flowering [93,94]. In this study, high concentration of IAA and 2, 4-D were added in MS medium for inducing EC from immature male flowers. The concentration of IAA and 2, 4-D were reduced and withdrawal in the medium to trigger SE. This helps to slow down callus growth to induce embryo formation [95]. The transition was connected with changes in gene expression. Some AUX/IAA family genes were highly expressed in CE or MCE. GH3.1 was up-regulated in CE. SAUR32-like and ARF5-like were up-regulated in MCE. These genes could be good gene expression markers and play a key role in the embryogenesis development process. In addition, JA and ET have also been reported to play a role in SE induction [96]. JAZ7, RAP2-12-like and WRI1-like were highly expressed in CE. The phytohormones signaling pathway related genes displayed intricate regulation during H. brasiliensis late SE. The regulatory mechanisms of these genes in H. brasiliensis late SE will be confirmed in the future study.
Transcription factors are key factors in plant embryogenesis and development. Many studies on SE development showed that complicated transcription regulation networks maintaining embryogenic competency, and embryogenic callus formation [63,97]. Some members of the WRKY TFs family genes can be stimulated by stress and are involved in carpel and ovule as well as in embryogenesis development [98,99]. Some WRKY genes have also been reported to be upregulated in embryogenic callus formation of bread wheat [54]. Transcriptome analysis showed that some WRKY genes are inducible in papaya and Arabidopsis thaliana embryogenic callus [25,100]. In Panax ginseng, the expression of PgWRKY6 increased in SE process in response to 2, 4-D inducing. PgWRKY6 functioned in the development of embryogenic callus possibly through the signaling cross-link of auxins with reactive oxygen species in somatic embryogenesis [101]. These finds indicates WRKY TFs have crucial role in the process of somatic embryogenesis. To our best knowledge, there is no report on WRKY TFs regulating genes associated with SE. MYB family was also involved in plant development and growth [102][103][104][105], hormone signal transduction [106,107]. In this study, WRKY40, WRKY70, MYBS3-like and MYB1R1-like were highly expressed in CE, suggesting that they might be used as marker genes for H. brasiliensis late SE. WRKY23, MYB26-like and MYB98-like were up-regulated in AE, indicating that these genes might act as negative modulators of SE. In addition, AtEMK, a member of the AP2/ERF family, was ectopically expressed and promotes the initiation of somatic embryos in Arabidopsis and H. brasiliensis [14,108]. BBM had been reported as a marker in Brassica napus SE [109]. The over-expression of BBM can enhance SE and regeneration ability in tobacco, sweet pepper, cacao [40,110,111]. The bHLH family is involved in developmental, growth, abiotic stress responses [112], and axillary meristem formation [63]. They also participate in abscisic acid and brassinosteroid signaling in Arabidopsis and rice [113]. A member of bHLH protein BIM1 regulated Arabidopsis SE and be involved in auxin and BR signaling pathways [114]. In this study, AIL6 and bHLH93-like were highly expressed in CE, suggesting that they might play a key role in H. brasiliensis late SE. AGL11, AGL15, BBM2 and bHLH94-like were up-regulated in AE, indicating that they have a negative regulatory role in late SE. To our knowledge, few transcription factors have been identified as negative modulators of plant SE. It will be of great interest to elucidate the function of these genes as negative modulators of SE. SERK has been proved as a key factor in plant SE. AtSERK1 was higher expression during Arabidopsis embryogenic formation [115]. SERK was abundant in embryogenic tissues in Dactylis glomerate [116]. However, SERKs were also tested in nonembryogenic tissues in maize, rice and wheat [47,117,118]. Ca 2+ has been identified to play a mediating role during plant SE [46,119]. LEA5, a late embryogenesis abundant proteins gene, was highly expressed in late embryogenesis [120]. In this study, SERK1, CML13, CML36 and CAM-7 were up-regulated in CE. LEAD-29-like were up-regulated in MCE. These genes can have various regulatory functions in H. brasiliensis late SE. LEAD-34-like and SERK2-like werehighly expressed in AE than CE, implying that they acted as negative modulators of SE. Further investigation of regulatory machinery of these genes will be important in improving natural rubber SE.
The histone modifications played important roles in gene expression, DNA replication and transcription, chromatin compaction [121,122]. Histone lysine methylations possessed the function of activating or derepressing transcription. H3K4, H3K36 and H3K79 methylations are associated with active transcription, whereas, H3K9, H3K27 and H4K20 methylations are involved in gene silencing [123]. H3K27me3 and H3K4me3 are the most frequent histone methylation marks. H3K27me3 is catalyzed by the trithorax-group (TrxG) and polycomb-group (PcG) proteins, of documented roles in regulating plant responses to environmental cues, cellular reprogramming, and plant stem cell maintenance [124]. The PcG proteins (PRC1 and PRC2), which cooperate to repress the genes via histone methylation during plant development [125]. In this study, CLF was higher expression in AE, suggesting H3K27me3 might inhibit the expression of genes associated with SE. Seven histone methylation related genes (SUVH1-like, SUVH3-like, SUVH4-like, SUVH9, SUVR3like, EZA1-like and ASHH3-like) were expressed at a higher level in CE. In addition, histone demethylation related genes, LSD1-homolog 1-like were highly expressed in CE. LSD1-homolog 2 were only up-regulated in MCE. KRYTONITE (KYP), encoding a histone H3 lysine 9 methyltransferase, also showed a higher expression level in Arabidopsis somatic embryos [16]. Some HATs including HAG1, HAF2, HAC1, HAC2, HAC4, HAC5 and HAC12 have been identified in Arabidopsis [16,126,127]. HAC2, HAG2 and HAG3 showed more accumulation in somatic embryos as compared to leaf tissues [16]. Similarly, in this study, histone acetylation related genes (HAG6, HAC12-like, MCC1 and GCN5-like) and histone deacetylation related genes (HDAC15-like, HDAC19) showed higher expression in CE. HDAC6-like, HDAC9 and SAP18-like were highly expressed in AE. HAC1 have been identified its function in reproductive and vegetative development [127]. HbHDA3 have been identified to interact with HbWRKY14 to regulate the expression of HbSRPP [128]. It is possible that those histone modifications related genes may also have an important function in embryogenesis. However, detection of changed transcript levels for key genes involved in histone modification provides an indirect indication of changed histone modifications during SE. It is not clear whether the expression changes we observed are due to in vitro conditions (i.e. externally supplied auxin, stress responses) or changed histone modification signatures. Therefore, it will be of great interest to perform a global analysis of the epigenome architecture of somatic embryos in order to underlying the relationship of the expression of genes associated with SE and histone modification.

Conclusions
In this study, the transcriptome data for rubber tree SE were generated. A comparative analysis of gene expression profiles during rubber tree late SE identified a series of DEGs that regulated late SE in rubber tree. We revealed the expression level of some genes related to phytohormones signaling pathway such as auxin, JA and ET signaling pathway, implying their important roles in rubber tree embryogenesis development process. The transcription factors such as WRKY, MYB, AP2 and bHLH, as well as CAM, SERK and LEA that were related to rubber tree late SE, might play a key role and become potential molecular marker genes in late SE. Histone modification might have crucial roles during late SE. This study provides novel insights into the molecular regulation mechanisms during rubber tree late SE.

Plant material and induction of somatic embryogenesis
Plants of Hevea brasiliensis Muell. Arg. cultivar (reyan 7-33-97) were planted in National Rubber Tree Varieties Resource Garden of the Chinese Academy of Tropical Agriculture Sciences, Danzhou, Hainan, China.
Immature male flowers were gathered from the rubber tree of reyan7-33-37. Immature male flowers were surface-sterilized with 75% (v/v) ethanol for 30 s, and followed to immerse in 0.2% (v/v) mercuric chloride solution for 10 min, and then washed four times with distilled water. The immature anthers were cultured in solid MS medium containing 1 mg l − 1 2,4-D, 1 mg l − 1 KT and 0.5 mg l − 1 NAA. After an additional 5-6 weeks of growth, EC were obtained in the darkness and 26-28°C. These samples of PE, CE, AE, MCE and WAE were collected successively. All samples were rapidly frozen in liquid nitrogen, and stored at − 80°C until RNA extraction. Three biological replicates were prepared for each sample.

Construction of cDNA library and sequencing
Total RNA was extracted according to the instructions of RNAprep pure plant Kit (Polysaccharides and Poly phenolics-rich, QIAGEN). RNA degradation and contamination were monitored on 1% agarose gels. The quality of RNA was detected by using the NanoDrop 2000 spectrophotometer (IMPLEN, CA, USA). The mRNA was enriched from total RNA using magnetic beads containing Oligo (dT) and broken into small fragments. Transcriptome libraries were constructed according to the instructions of the Truseq™ RNA sample preparation kit from Illumina (San Diego, CA). The library quality was examined using the Qsep100 Analyzer (BIOptic Inc., Taiwan, China). The cDNA libraries were deep sequenced on the Illumina novaseq6000 cDNA sequencing platform.

Analysis of differentially expressed genes (DEGs)
The expression level of the unigenes was calculated by FPKM. The FC represented the ratio of FPKM between two samples. The Benjamini-Hochberg correction method was adopted to correct the significance P-value obtained from the original hypothesis test. FDR was obtained by correcting the P-value of different significance. The genes with − 1.0 ≥ Log 2 [FC] ≥ 1.0 and the threshold of FDR < 0.01 were regarded as DEGs. A Venn diagram was created to find the overlapped DEGs in different developmental stages of H. brasiliensis SE using VennMaster as described previously [129].
Expression profiles of genes in H. brasiliensis SE FPKM was applied to analyze the gene expression level. The heat map was created using log 2 [FPKM] with the pheatmap package [130].

Quantitative PCR (qPCR)
Twenty genes were chosen for validation by qPCR. The samples of EC, PE, CE, AE, MCE and WAE were used for RNA extraction, and then reverse transcribed into cDNA as template. Each sample included three biological replicates. qPCR specific primers for the twenty genes were designed by using Primer Premier software 6.0 (Table S4). HbACT7 was amplified as a standard control as described previously [131]. qPCR was performed on a Mx3005P Real-Time PCR system using a SYBR Premix EXTaq II™ Kit (TaKaRa, China). All reactions were performed at 95°C for 30 s, 40 cycles at 95°C for 10 s, 58°C for 20 s, and 72°C for 25 s. The 2 -ΔΔCt method was used to calculate the relative expression levels of genes [132]. The statistical differences were analyzed by ANOVA (One-way analysis of variance) based on Fisher's LSD test (P < 0.05 and P < 0.01) [133].  Table 3 and Tables S1, S2, S3.