Transcriptome Analysis Identi es Genes Involved in the Somatic Embryogenesis of Eucalyptus

Background Eucalyptus, a highly diverse genus of the Myrtaceae family, is the most widely planted hardwood in the world due to its increasing importance for fiber and energy. Somatic embryogenesis (SE) is one large-scale method to provide commercial use of the vegetative propagation of Eucalyptus and dedifferentiation is a key step for plant cells to become meristematic. However, little is known about the molecular changes during the Eucalyptus SE. Results We compared the transcriptome profiles of the differentiated and dedifferentiated tissues of two Eucalyptus species - E. camaldulensis (high embryogenetic potential) and E. grandis x urophylla (low embryogenetic potential). Initially, we identified 18,777 to 20,240 genes in all samples. Compared to the differentiated tissues, we identified 9229 and 8989 differentially expressed genes (DEGs) in the dedifferentiated tissues of E. camaldulensis and E. grandis x urophylla, respectively, and 2687 up-regulated and 2581 down-regulated genes shared. Next, we identified 2003 up-regulated and 1958 down-regulated genes only in E. camaldulensis, including 6 somatic embryogenesis receptor kinase, 17 ethylene, 12 auxin, 83 ribosomal protein, 28 zinc finger protein, 10 heat shock protein, 9 histone, 122 cell wall related and 98 transcription factor genes. Genes from other families like ABA, arabinogalactan protein and late embryogenesis abundant protein were also found to be specifically dysregulated in the dedifferentiation process of E. camaldulensis. Further, we identified 48,447 variants (SNPs and small indels) specific to E. camaldulensis, including 13,434 exonic variants from 4723 genes (e.g., annexin, GN, ARF and AP2-like ethylene-responsive transcription factor). qRT-PCR was used to confirm the gene expression patterns in both E. camaldulensis and E. grandis x urophylla. Conclusions This is the first time to study the somatic embryogenesis of Eucalyptus using transcriptome sequencing. It will improve our understanding of the molecular mechanisms of somatic embryogenesis and dedifferentiation in Eucalyptus. Our results provide a valuable resource for future studies in the field of Eucalyptus and will benefit the Eucalyptus breeding program.


Background
Eucalyptus is a highly diverse genus of the Myrtaceae family and is widely planted across the world for its increasing importance for timber and pulp [1]. Natural regeneration of Eucalyptus mainly replies on seeds, however, their breeding is always a slow process due to the length of their juvenile phase [1]. Thus, vegetative propagation becomes an alternative option for Eucalyptus to breed and to gain new characteristics. Vegetative propagation consists of methods of both micropropagation (e.g., air layering, grafting, rooting of cuttings) and micropropagation using in vitro tissue culture techniques, including adventitious budding, axillary shoot tips, somatic embryogenesis (SE) [2]. Among them, in vitro tissue culture-induced SE, in which whole fertile plants are regenerated under proper culture conditions, is widely used to propagate selected genotypes for commercial purposes because it can provide large-scale regenerations for plants with relative lower cost [2].
In forest trees SE is a complex process involving many factors in several different steps [2]. The rst step is the callus induction in which differentiated somatic cells (e.g., seed, leaf, stem) acquire embryogenetic competence with or without a dedifferentiation step [3]. In Eucalyptus, Ouyang et al. for the rst time reported SE from callus of seedlings in 1980 [4]. Pinto and colleagues reviewed the status and future perspectives of SE in Eucalyptus and some tissues (e.g., hypocotyls, cotyledons, leaves, shoots) from both young seedlings and old trees have been reported for successful induction of callus in multiple Eucalyptus species, such as E. botryoides, E. dunnii, E. grandis, E. globulus and E. rudis [5]. Including the culture medium, some other factors have been reported to affect the SE induction, such as the addition of antioxidants, the carbon source, the effect of genotype and the morphogenic pathway [5]. It is clear that the capacity of SE in Eucalyptus species varies, however, our knowledge about the genomic and transcriptomic information controlling the SE and vegetative propagation is still poor.
Some studies have been demonstrated to study the genetics related to the capacity of vegetative propagation in Eucalyptus. For example, Grattapaglia and colleagues used a pseudo-testcross strategy and RAPD markers to study the quantitative trait loci (QTLs) controlling vegetative propagation in E. grandis and E. urophylla [6]. Marques and colleagues identi ed some QTLs related to adventitious rooting, sprouting ability and the stability of adventitious rooting [7]. Thumma studied identi ed some QTLs related to the wood quality in Eucalyptus nitens [8]. In a review study, Ohtani summarized the impacts of the regulation of RNA metabolism for in vitro dedifferentiation of plant cells, such as rRNA biosynthesis, pre-mRNA splicing, and miRNA-based RNA decay [9]. Some studies have been demonstrated to investigate the gene regulation in the SE in plants, such as Arabidopsis [10], alfalfa [11], camphor tree [12], carrot [13], cotton [14][15][16], orange [17], potato [16] and soybean [18][19][20]. Many genes have been reported to be associated with the SE, such as auxin response factor19 (ARF19), somatic embryogenesis receptor-like kinase (SERK), leafy cotyledon (LEC), baby boom (BBM), and wuschel (WUS) [21,22]. However, it is still unknown about the gene changes and gene regulations in the SE and the dedifferentiation of Eucalyptus.
In the present study we used transcriptome sequencing to study the gene expression pro les and gene variants in the dedifferentiation process of two Eucalyptus cultivars -E. camaldulensis (high embryogenetic potential) and E. grandis x urophylla (low embryogenetic potential). We identi ed some genes related to SE and dedifferentiation of Eucalyptus, such as auxin related genes, embryogenesis related genes, ethylene related genes, ribosomal protein (RP) genes, zinc nger protein (ZFP) genes, histone genes, heat shock protein (HSP) genes and various transcription factors (TFs). Further, we identi ed the variants in these two Eucalyptus species that might control the callus induction and development. This is the rst time to study the SE transcriptomes in Eucalyptus. Our results will improve our understanding about the molecular mechanisms in the SE of Eucalyptus. The output of this study will also provide a valuable resource for future studies and will bene t the research in this eld and the breeding program of Eucalyptus.

Results
Transcriptome sequencing and gene expression pro les To study the gene expression changes in the dedifferentiation of Eucalyptus, we analyzed the transcriptome data of differentiated (stem) and dedifferentiated (callus) tissues from two E. camaldulensis and E. grandis (stem of E. camaldulensis: A1, callus of E. camaldulensis: A2, stem of E. grandis x urophylla: B1, callus of E. grandis x urophylla: B2). Table 1 showed the overview of transcriptome sequencing and the numbers of genes identi ed in this study. Initially, we obtained 528 million reads for all 12 samples (in triplicates, n=3) and each individual replicate sample had 44 million reads on average. Then, we aligned the reads to the E. grandis genome. It showed 64.66% to 74.84% of the total reads were mapped in the samples with 31.3 million mapped reads on average. Out of the total 36,349 E. grandis genes, we identi ed 18,777 to 20,240 genes with TPM > 1 in the samples (Additional le 1). If we set the TPM cut-off to 5, only 12,650 to 14,464 genes were identi ed. Distribution of genes with different expression levels ( Figure 1A) showed that more than 60% of the genes were lowly expressed (TPM < 5) and only 0.165% to 0.272% of the Eucalyptus genes were expressed higher than 1,000 TPM. Interestingly, highly expressed genes varied in differentiated and dedifferentiated tissues of Eucalyptus. Figure 1B showed comparison of top 10 genes in all samples according to the average TPM. Dedifferentiated tissues of both E. camaldulensis and E. grandis x urophylla shared 8 genes including Eucgr.H01085 (ethylene-responsive transcription factor ERF071), Eucgr.G03106 (wound-induced protein 1) and Eucgr.F00114 (zinc nger protein ZAT10). We next compared the genes identi ed in the samples (TPM > 1) and found 10,374 genes shared ( Figure 1C). Also, 628, 459, 671 and 580 genes were speci cally identi ed in A1, A2, B1 and B2, respectively. Based on the gene expression, we analyzed the correlation between samples. It was revealed that replicates were performed well, and differentiated tissues were distinguishable from dedifferentiated tissues in E. camaldulensis and E. grandis x urophylla ( Figure 1D).
Then, we analyzed some gene groups might be involved in the dedifferentiation process of E. camaldulensis. We identi ed 27 up-regulated and 22 down-regulated genes related to ethylene (Table 2) (Table 2). Notably, 8 up-regulated and 3 down-regulated embryogenesis related genes were identi ed in the dedifferentiated tissue of E. camaldulensis compared to the differentiated tissue, such as late embryogenesis abundant protein, somatic embryogenesis receptor kinas 1 and embryogenesisassociated protein EMB8 (Additional le 2).

DEGs in E. grandis x urophylla
We next analyzed the 4,200 up-regulated and 4,708 down-regulated genes in the dedifferentiated tissue of E. grandis x urophylla compared to the differentiated tissue ( Figure 2C). According to the FDR, top 5 DEGs include Eucgr.B03016 (LOB domain-containing protein 40), Eucgr.H02264 (probable indole-3-acetic acidamido synthetase GH3.1), Eucgr.I02271 (endochitinase), Eucgr.L02894 and Eucgr.L02534 (Additional le 2). KEGG pathway enrichment analysis showed "Glycolysis / Gluconeogenesis" (ko00010), "Carbon metabolism" (ko01200) and "Biosynthesis of secondary metabolites" (ko01110) were the top three pathways involved by the DEGs. Also, we found "Plant hormone signal transduction" (ko04075) signi cantly enriched by 260 DEGs (q-value: 8.36E-14). Like E. camaldulensis, GO analysis of these DEGs ( Figure 2D) revealed that the numbers of DEGs had a similar trend but varied across all the GO terms, such as biological adhesion, virion and nucleoid. Table 2 showed the numbers of DEGs from the seven gene groups identi ed in the dedifferentiation process of E. grandis x urophylla. Interestingly, we found 41 up-regulated and 8 down-regulated genes encoding ribosomal proteins in E. grandis. It showed that more ribosomal protein DEGs were up-regulated in E. grandis x urophylla but down-regulated in E. camaldulensis. Moreover, we found in E. grandis x urophylla more down-regulated ethylene related genes and up-regulated genes encoding HSP and histone ( Table 2). Figure 2E showed an overview of all the DEGs identi ed in the dedifferentiated tissues compared to the differentiated tissues in both E. camaldulensis and E. grandis x urophylla. It was revealed that they shared some DEGs and some other DEGs were speci cally identi ed in E. camaldulensis or E. grandis x urophylla, which might be related to the regenerative ability in Eucalyptus.
Interestingly, we found 6, 17, 12, 83, 28, 10, 9 and 98 DEGs related to embryogenesis, ethylene, auxin, RP, ZFP, HSP, histone and TF, respectively, speci cally dysregulated in the dedifferentiated tissue of E. camaldulensis ( Table 2, Additional le 2). The 6 embryogenesis related genes include 5 up-regulated genes (2 SE receptor kinase, 3 LEA) and 1 down-regulated gene encoding embryogenesis-associated protein EMB8. Among the 17 DH201-2 speci c DEGs encoding auxin related products ( Figure 3B), Eucgr.K02992 (auxin transporter-like protein 4) and Eucgr.C02984 (auxin-responsive protein IAA26) were down-regulated and Eucgr.H03171 (auxin-induced protein 22D) was up-regulated in E. camaldulensis but down-regulated in E. grandis x urophylla (Additional le 2). All of the ethylene related DEGs speci cally in E. camaldulensis were found to encode ER TFs and 7 had reverse regulation in E. camaldulensis and E. grandis x urophylla ( Figure 3C). Interestingly, it is shown in the heat maps ( Figure 3D and Figure 3E) that most of the E. camaldulensis speci c DEGs related to HSP and RP were down-regulated in the dedifferentiated tissue compared to the differentiated tissue. Although some of the DEGs encoding histone were more or less changed in E. grandis x urophylla, the difference of their expression was not signi cant as that in E. camaldulensis ( Figure 3F). Importantly, we found another 4 TF subfamilies including ASIL2, bHLH, MYB and WRKY were dysregulated only in E. camaldulensis ( Figure 3G). Further, the MYB and WRKY TF genes were up-regulated in the callus of E. camaldulensis compared to the stem. In addition, some other gene families were identi ed to be speci cally differentially expressed in E. camaldulensis (Table 2, Additional le 3), including 3 ABA related genes, 3 arabinogalactan protein genes, 4 ABC transporter genes and 21 abscisic stress-ripening protein genes.

qRT-PCR validation
We used qRT-PCR to con rm the expression patterns of DEGs identi ed in the dedifferentiation process of E. camaldulensis and E. grandis x urophylla. We randomly selected 8 genes (Eucgr.A00971, Eucgr.A01091, Eucgr.B03715, Eucgr.C03048, Eucgr.D01811, Eucgr.F00490, Eucgr.F01164 and Eucgr.H03077) for the qRT-PCR experiment and the H2B gene was used as internal control. The primers for each gene can be accessed in the Additional le 5. Each gene was replicated three time in each sample, so we performed 9 reactions in total for the differentiated and dedifferentiated tissues. We used log2FC (log2 fold change) and log2RNE (log2 relative normalized expression) to present the gene changes detected by transcriptome sequencing and qRT-PCR, respectively. Overall, 13 (81.25%) out of 16 events were agreed by both qRT-PCR and deep sequencing (Figure 4). It is notable that the elevation of WRKY TF (Eucgr.D01811) and the decrease of RP gene (Eucgr.A00971) in E. camaldulensis were con rmed by qRT-PCR. The dysregulation of Eucgr.B03715, Eucgr.C03048 and Eucgr.F00490 in E. grandis x urophylla were also con rmed. High agreement of gene expression patterns in transcriptome sequencing and qRT-PCR indicate that the genes identi ed in this study might be associated with the regenerative ability and the SE in Eucalyptus, which requires future experiments to be explored.

SNPs and indels
We next called the variants including SNPs and small indels in E. camaldulensis and E. grandis x urophylla using the transcriptome sequencing data. Table 3 shows the numbers of variants identi ed in the samples. Initially, we obtained 97,504 and 75,582 variants in the differentiated and dedifferentiated tissues of E. camaldulensis, respectively. After the variants with less than 100 reads were ltered, we identi ed 97,974 SNPs and indels for E. camaldulensis. Likewise, 72,208 and 66,311 variants were called in the differentiated and dedifferentiated tissues of E. grandis x urophylla, respectively, and they produced 78,977 highly covered variants (Table 3). Comparison showed 49,527 variants shared by E. camaldulensis and E. grandis x urophylla, and 48,447 were speci cally identi ed in E. camaldulensis. Then, we annotated the E. camaldulensis speci c variants using ANNOVAR and found that 13,434 variants were functional, such as nonsynonymous, frameshift insertion, frameshift deletion, stop gain and stop loss variants (Table 3, Additional le 6). Interestingly, these 13,434 variants were produced from 4,723 Eucalyptus genes, such as annexin (Eucgr.F02423, Eucgr.H00564), ARF guanine-nucleotide exchange factor GNOM (Eucgr.B03196), AP2-like ER TF (Eucgr.I00278, Eucgr.J02113), auxin response factors (Eucgr.G00076, Eucgr.C02178, Eucgr.C03293, Eucgr.J00923, Eucgr.F02090, Eucgr.D00264, Eucgr.E00888) and wall-associated receptor kinas-like (Eucgr.I01022). KEGG pathway enrichment analysis showed that 85 pathways including "longevity regulating pathway" (ko04211) and "Plant hormone signal transduction" (ko04075) were enriched by the of the DH201-2 speci c mutated genes (Additional le 7). Discussions SE has become an e cient way for the propagation of Eucalyptus and SERK genes have been reported to play a key role during the SE in many plants, such as Arabidopsis [23], cacao [24], rice [25], sun ower [26], maize [27], grape [28] and pineapple [29]. In Arabidopsis, SERK1 was highly expressed during the formation of embryogenic cells in vitro culture and can be detectable in all cells of the developing embryo during early SE up to the formation of the heart stage after which expression ceases [23]. SERK2 is signi cantly increased in the embryogenic callus and maturation stage compared to the nonembryogenic callus [30]. In the present study we identi ed the up-regulation of some SERK genes in the dedifferentiated tissue compared to differentiated tissue in both E. camaldulensis and E. grandis x urophylla (Additional le 2). Further, we found that 2 SERK gene members were speci cally up-regulated in E. camaldulensis. These results suggested that different members of the SERK gene family may have unique functions. Singh and Khurana also observed the different expression of SERK genes in wheat and reported that SERK genes were differentially expressed in response to different plant growth regulators [31]. For example, SERK2 and SERK3 can elicit the auxin-speci c responses while SERK1 and SERK5 may be mediated by the signaling pathway of brassinosteroids. The functions of SERK genes in the SE of Eucalyptus require further experiments to be explored.
Méndez-Hernández reviewed the interactions between the different plant growth regulators, mainly auxins, cytokinins, ethylene and abscisic acid (ABA), during the induction of SE [32]. Although the how cells initiate embryo formation is not clear, an irregular distribution of auxins must be established to initiate embryo formation. In present study, we identi ed three genes (Eucgr.B00948, Eucgr.G02187, Eucgr.A02229) encoding auxin e ux carrier up-regulated speci cally in the dedifferentiated tissue of E. camaldulensis (Table 2, Additional le 3). In addition, three genes encoding auxin response factors were speci cally up-regulated in E. camaldulensis. We also identi ed DEGs related to other plant hormones, like late embryogenesis abundant protein (LEA) and abscisic acid (ABA). For example, twelve ABA related genes were dysregulated in this study, including seven ABA receptor and ve 8'-hydroxylase genes (Additional le 3). Among them, one ABA receptor and two ABA 8'-hydroxylase genes were speci cally down-regulated in the dedifferentiated tissue of E. camaldulensis. Chen et al, reported that the ABA transcripts peaked at 8 h after ABA treatment and then signi cantly decreased at latter time points [33].
Although we lack multiple time points of the callus development, it can be speculated that the downregulation of ABA genes might be required for the callus development of Eucalyptus. Interestingly, we found seven genes encoding LEA in the callus compared to the stem (Additional le 3). However, the expression level of down-regulated LEA gene (Eucgr.E00787) was much higher than the ve up-regulated LEA genes. Among them, three (Eucgr.K01312, Eucgr.I01292, Eucgr.A02687) were speci cally upregulated in the callus of E. camaldulensis. The LEA genes play a pivotal role during the plant somatic embryogenesis process [34] and have been studied in some plant species like cotton [35], white spruce [36] and sweet orange [37].
In Eucalyptus, Marques identi ed QTLs for adventitious rooting, one QTL for sprouting ability and four QTLs for the stability of adventitious rooting [7]. Their results indicated that the phenotypic variation in these traits had a meaningful genetic component that relate to the capacity of vegetative propagation. We identi ed 13,434 variants in 4,723 Eucalyptus genes speci c in E. camaldulensis (Additional le 6), including two genes encoding SERK1 and SERK2, 25 auxin related genes (e.g., auxin transporter like protein, auxin-responsive protein, auxin e ux carrier component, auxin responsive factor) and 8 genes encoding ABA receptors (e.g., PYR1, PYL2, PYL4, PYL8, PYL9). Also, we found the E. camaldulensis speci c mutated gene Eucgr.B03196 (the ARF guanine-nucleotide exchange factor GNOM, EMB30 or GN) has the potential to mediate the endosomal recycling, auxin transport and auxin-dependent plant growth [38]. Mutant EMB30 has the capacity of altering the cell wall in Arabidopsis [39]. It is also involved in the speci cation of apical-basal pattern formation in the early embryo and during root formation [40,41]. Although our study lacks the fundamental experiments, the pathways involved by the mutated genes in E. camaldulensis indicate that they might be associated with the SE and the regenerative ability of Eucalyptus.
TFs have been reported to function in cell division, cell growth, cell death, cell migration and organization during embryonic development, and respond to biotic and abiotic stresses. They can be another big category that affect the gene expression during the SE induction [32]. Some TFs have been reported to be associated with the SE, including ABI3, WOX9-1, LEC, WUSCHEL, TALE, BBR-BPC and AP2 [32,42]. In this study we identi ed some dysregulated TF genes speci c in the callus of E. camaldulensis (Table 2, Additional le 2). Among them, ER, bHLH, MYB and WRKY TFs were up-regulated. It is interesting that QTL analysis identi ed a WRKY TF gene related to the adventitious rooting from apple hardwood cuttings [43]. In Eucalyptus, WRKY33 was up-regulated in response to the fungal affection of Chrysoporthe austroafricana [44] and Calonectria pseudoreteaudii [45]. Also, WRKY TF was up-regulated in the Eucalyptus camaldulensis seeding subjected to the water stress [46]. Furthermore, some other genes and proteins have been studied to relate to the vegetative propagation, like RP [47], HSP [48], and histone [49,50]. While very few studies have been demonstrated to investigate their functions and associations in the SE of Eucalyptus, our results indicate that the dysregulation of these genes or the known pathways like metabolisms involved by these genes might be related to the SE of Eucalyptus.

Conclusions
In conclusion, we studied the transcriptome pro les of the SE of two Eucalyptus species -E. camaldulensis and E. grandis x urophylla. Our results showed dysregulated genes from some gene families like auxin, ethylene, SERK, RP, ZFP, HSP, histone, ABA, LEA and TF might be related to the SE and the regenerative ability of Eucalyptus. Also, we called SNPs and small indels in E. camaldulensis and E. grandis x urophylla, and it was revealed that genetic variants may also contribute to the SE and the regenerative ability of Eucalyptus. This is the rst time to study the SE and the dedifferentiation in Eucalyptus using transcriptome sequencing. Our results will improve our understanding of the molecular mechanisms during the SE in Eucalyptus. We provide a valuable resource for future studies in the eld of Eucalyptus and, more importantly, will bene t the Eucalyptus breeding program.

Plants
The original seeds of E. camaldulensis (high regenerative ability, voucher ID: c0009) and E. grandis x urophylla (low regenerative ability, voucher ID: j0017) were obtained from the wide in 1984 and no permissions were required to collect these plants. Then, they were maintained in the experimental elds of Guangxi Forestry Research Institute and con rmed by a senior botanist Prof. Dongyun Xiang. The second generation of in vitro tissue-culture induced seedlings of E. camaldulensis and E. grandis x urophylla were maintained on the MS medium supplemented with 20mg/L Ca(NO 3 ) 2 , 0.5 mg/L 6-BA and 0.1 mg/L IAA until 2 to 3 cm height for this project. The second to the third stems from the stem tip of the seedlings were obtained and cut into 0.3 ~ 0.5 cm segments. About 60 segments of each Eucalyptus species were then transferred onto the induction MS medium (supplemented with 20mg/L Ca(NO 3 ) 2 , 1 mg/L KT and 0.5 mg/L 2,4-D) and maintained in dark at 28 2 C for 10 days. Successful callus samples were used as dedifferentiated tissues. The induction experiment was replicated three times.
RNA isolation, library construction and transcriptome sequencing Differentiated and dedifferentiated tissues (100 mg) were collected for the total RNA isolation using the TRIzol reagent (Invitrogen) according to the manufacturer's protocol, as previously described [51,52]. Then, the quantity and quality of the total RNA were determined by the Agilent 2100 Bioanalyzer and equal amount of total RNA (1 μg) was used to construct the libraries for BGISEQ-500 RNA-Seq. Brie y, magnetic oligo (dT) beads were used to enrich the ploy(A) mRNAs, which were then fragmented intõ 200 bp. Then, the fragments were used for the dscDNA library construction by random hexamer (N6) primers, followed by the end repair using phosphate at the 5' end and sticky 'A' at the 3' end. Next, sequencing primers were ligated to build the nal cDNA library, which was then sequenced on the BGISEQ-500 RS platform at BGI-Shenzhen with paired-end 150 strategy.

Genome mapping and gene expression pro ling
The raw data of each sample was cleaned by trim_galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) to remove sequencing adaptors, low quality reads and contamination reads. Then, the clean data was quality controlled by fastqc (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and then was aligned to the E. grandis genome (v2.0, https://plantgenie.org) using Hisat2 with default parameters [53]. Then, Stringtie (v1.3.4d) was used to pro le the gene expression in each sample [53]. TPM (transcripts per million reads mapped) method was used to normalize the gene expression in each sample. Numbers of reads aligned to the Eucalyptus genes were calculated by htseq-count, as described [54].

Differential expression analysis
We used edgeR to identify differentially expressed genes in stem and callus of the two Eucalyptus cultivars. Stringent cut-offs were employed to select the DEGs, including log2 fold change (log2FC) > 1 or < -1, p-value < 0.05 and false discovery rate (FDR) < 0.1 [48].

Variant calling
We used strelka (2.9.10-4-gd737744) to call the variants, including SNPs and small indels, in the stem and callus tissues of both E. camaldulensis and E. grandis x urophylla [55]. Then, variants passed the program lter were maintained for each sample, and variants identi ed in all three replicates were kept for downstream analysis. We next ltered the variants by the read depth (> 100) and obtained the nal variants for E. camaldulensis and E. grandis x urophylla samples. The variants were annotated by ANNOVAR, as previously described [54], and variants from noncoding regions and synonymous variants were discarded based on the annotation.

Functional analysis
To analyze the potential pathways and biological processes involved by the DEGs, we rst annotated the E. grandis genes using the KEGG pathway and Gene Ontology databases, as previously described [52]. Then, enriched KEGG pathways and GO terms by the DEGs were identi ed using two statistical values: pvalue (calculated by Fisher's exact test, < 0.05) and q-value (calculated by the R package 'qvalue', < 0.05), as previously described [52].

qRT-PCR
We performed the quantitative real-time PCR to con rm the expression changes of candidate genes. A total of eight genes (Eucgr.A00971, Eucgr.A01091, Eucgr.B03715, Eucgr.C03048, Eucgr.D01811, Eucgr.F00490, Eucgr.F01164 and Eucgr.H03077) were randomly selected and the H2B gene was used as the internal control. Primers were predicted using the Primer3 and synthesized at BGI-Shenzhen. The procedure of qRT-PCR was same as a previous study [56]. We performed three times for each gene in every replicate (n=9). After the Ct values were obtained, we used Ct value to present the gene expression. Then, Ct was used to show the difference of a gene in the callus compared to the stem. Relative normalized expression (RNE) was used to show the gene expression change: .

Declarations
Ethics approval and consent to participate No speci c permits were required for the described eld studies. The location is not privately-owned or protected in any way, and the eld studies did not involve endangered or protected species.

Consent for publication
Not applicable.

Availability of data and materials
The raw sequencing data can be accessed from the NCBI Sequence Read Archive (SRA) platform (http://trace.ncbi.nlm.nih.gov/Traces/sra/) under the accession number PRJNA634476.

Competing interests
The authors declare that there is no con ict of interest.