High-throughput profiling of diapause regulated genes from Trichogramma dendrolimi, an important egg parasitoid

Background The parasitoid insect Trichogramma dendrolimi can successfully enter diapause at the prepupal stage. Thus, diapause is an efficient preservation method during the mass production of T. dendrolimi. Previous studies on diapause have focused mostly on ecological characteristics, so the molecular mechanism of diapause in T. dendrolimi is mostly unknown. In this study, we compared transcriptomes of diapause and non-diapause T. dendrolimi to identify genes involved in the development of diapause. Results Transcriptome sequencing was performed using different samples, including diapause prepupae, pupae after diapause, normal prepupae, and pupae. This initially yielded a total of 87,022 unigenes with an average length of 1,604 bp. By removing redundant sequences and those without significant BLAST hits, a non-redundant dataset was generated, containing 7,593 sequences with an average length of 3351 bp. Among the differentially expressed genes, 1,204 were specifically expressed during the diapause stage, and 820 were significantly up-regulated. The result of GO enrichment analysis revealed that binding, oxidation−reduction process, and ribosome biogenesis were significantly affected. Ten genes were selected for validation using a quantitative real-time PCR (qPCR). The changes showed the same trend between the qPCR and RNA-Seq results. Based on previous studies and our data, several genes were identified to be involved in diapause, including UDP-glucuronosyltransferase; Glutathione-S-transferase, ATGs, zinc finger protein genes, cell cycle-related genes, Trehalase, trehalose transporter, hsp68, p53, and DNA damage-regulated gene 1 (pdrg1), and genes related to lipid metabolism were also included. Conclusions In this study, we generated a great amount of transcriptome data from T. dendrolimi, providing a resource for gene function research. The diapause-related genes that we identified establish a valuable basis for future studies on the molecular mechanisms of diapause, not only for T. dendrolimi, but for other species

4 complicated process. To date, many studies have focused on ecological characteristics to optimize conditions for diapause induction or termination to improve biological control programs [17,18] . Several diapause-associated genes, such as dilp1, foxo, and akt, have been identified in other insect species [19][20][21][22][23][24][25][26][27][28][29] . Few studies, however, have been carried out on the molecular mechanism of diapause in Trichogramma spp. Moreover, diapause induction and termination are time-consuming, which limits the use of Trichogramma in practical applications. Therefore, it is necessary to understand the diapause mechanism.
Transcriptome sequencing has become a favorable tool for gene expression research. In fact, many studies have used RNA-sequencing (RNA-Seq) to solve a wide variety of problems. Some studies focused on insect resistance to insecticide [30][31][32][33] , some on insect adaptability to extreme environments [34,35] , and others on selected genes, like chemosensory genes [36] . But there have been few studies on insect diapause using RNA-Seq. Hao et al. (2019) identified the candidate genes (rai1 and foxo) related to the FOXO pathway in the egg diapause regulation of Locusta migratoria [37] .
In this study, we are the first to report the gene expression profiles of T. dendrolimi in diapause and non-diapause states. The objective of this study was to use RNA-Seq to characterize diapause-related genes in this species. The results of this study are expected to provide a reference for deciphering the diapause mechanism in T. dendrolimi and guiding the use of T. dendrolimi in biological control programs.

Diapause induction, termination, sequencing, and gene identification
After the induction process was completed, the parasitized host eggs were dissected to verify whether T. dendrolimi entered diapause successfully or not. Diapausing parasitoids should stay at the prepupal stage. If diapause induction failed, they would continue to 5 develop into pupae or adults. Following the diapause termination process, the parasitized host eggs were transferred to normal development conditions (26°C ± 1°C, 60% ± 5% RH, 16:8 L:D). If diapause was disrupted, the parasitoid would convert from prepupa to pupa within several days; conversely, if diapause was undisrupted, the T. dendrolimi would remain in the prepupa stage. In this study, 99% of the parasitoids entered diapause successfully, and about 95% broke out of the diapause state during the 70-day termination treatment.
RNA samples obtained from distinct stages of this species were prepared and sequenced using the Illumina Hiseq2000 sequencing platform. Four cDNA libraries were constructed from the samples described above (Dpre, Dp, NDpre, and NDp). After filtering raw reads (reads containing adaptor, reads containing N larger than 10%, and low-quality reads (Q phred < 20) were removed), clean reads were retained ( Table 3). The clean data were assembled and hierarchically clustered by Trinity and Corset with 87,022 unigenes produced, and an average length of 1,604 nt and an N50 of 3,148. Of the unigenes, 35,231 (40.5%) were longer than 1000 bp (Table 4).
In order to gain information of gene function, transcripts were annotated using BLASTX searches against the non-redundant (NR) sequence database; 39,969 (45.92%) of them displayed homology to known proteins (E < 1e-5; Fig. 1A). Nearly 25,000 annotated unigenes, over 65% of the annotated unigenes, were homologous to T. pretiosum, probably because the genome of T. pretiosum was the only available genome in Trichogramma. Fewer unigenes were homologous to Nasonia vitripennis (1,217, 3.1%), Apis florea (26, 0.03%), A. dorsata (12, 0.01%), or A. cerana (15, 0.01%) (Fig. 1B). Less than 40 unigenes were matched to those from Microplitis demolitor. Based on these results, we can infer that there are significant genetic differences between Trichogramma and other parasitoid species. Among all annotated unigenes, 73.0% had significant 6 homology with an E-value of <10 -30 (Fig. 1C), and 52.3% had a similarity higher than 80% (Fig. 1D). After filtering and removing redundant sequences, we kept the unigenes with significant BLAST hits and constructed a non-redundant dataset containing 7,593 sequences with an average length of 3351 nt. Based on the annotation, such as gene length, ID, and speculative function, the diapause-related genes and potential genes involved in diapause were sorted out for further analysis.

Identification of DEGs and functional classification
Ten genes were selected for validation with qPCR, and GAPDH was selected as the reference gene after measuring its stable expression level for diapause and non-diapause.
The tendencies of these genes' expressions were similar according to RNA-Seq and qPCR ( Fig. 2). Among these 10 selected genes, all except tre were up-regulated during diapause.
To get more acquainted with diapause-specific transcriptional changes in T. dendrolimi induced by low temperature, we carried out pairwise comparisons between different libraries to identify the DEGs. The result of DEG analysis revealed that more unigenes showed striking changes in mRNA levels during diapause stage. DESeq2 identified 1,204 DEGs exclusively changed in Dpre compared to NDpre, Dp, and NDp. We used the expressions of these genes at four time points to construct a gene expression profile composed of two groups: 820 genes were up-regulated and thus belonged to group 1, and the others were down-regulated significantly and thus belonged to group 2 (Fig. 3). All these genes might be involved in diapause maintenance of T. dendrolimi.
To figure out the potential function of identified DEGs, GO enrichment was performed.
Compared with non-diapause individuals, the number of up-regulated genes in the diapause development process was significantly higher than down-regulated genes. In the gene repertoire of Dpre, more DEGs were assigned to protein binding, oxidation-reduction processes, and metal ion binding. Among them, metal ion binding and ribosome biogenesis were concerned in the subsequent analysis (Fig. 4).

Comparative analysis of genes involved in diapause
Diapause is a dynamic process accompanied by a series of physiological transitions.
Several studies have focused on the general gene expression pattern of insect diapause without clear elucidation of the diapause mechanism. This is due to the complexity of the diapause process as well as the variations among insect species.
We identified 79 genes potentially involved in T. dendrolimi diapause based on their homology to genes with known functions from other insect species. Several new genes with the capability of being included in the diapause genetic toolkit were also analyzed. In the following section, they are organized into different categories according to their function.
According to the results of GO enrichment, we focused on the genes enriched in the metal ion binding term. There were eight genes coded zinc finger protein were identified in the transcriptome. Zinc finger protein gene 271 had an SFP domain (Fig. 5). Genes containing this domain were considered to be a putative transcriptional repressor during G2/M transition. One of the most noticeable characteristics of diapause is the blockage of ontogeny, and this blockage always occurs with the cell cycle cessation [38,39] . In Nasonia vitripennis, the S phase of the cell cycle disappeared in the beginning stage of diapause due to the cells being arrested in the G0/G1 and G2 phases [40] . Similarly, in drosophilid fly, Chymomyza costata, the cell cycle of CNS cells was arrested in the G0/G1 (86.6%) and G2 (12.8%) division phases during diapause. The wee1 gene, coding for a kind of inhibitory 8 kinase, was up-regulated during diapause in this species [41] . We obtained similar results in T. dendrolimi (Fig. 5). Therefore, the wee1 gene might be a molecular marker of diapause.
It was reported that cyclin-dependent kinase 4 (CDK4) and cyclin D regulated the G1/S transition during cell cycle [42] . The expressed level of CDK4 and cyclin D played an essential role in cell cycle arrest during diapause, while in T. dendrolimi, the expression of these two genes were up-regulated during diapause. We speculate that the cell cycle might be stuck at other phases, like G2 or G0. In addition, cyclin A1/G/G2/K was upregulated during diapause. This suggested that these cyclin-related genes might play a part in T. dendrolimi diapause maintenance. p53 induces growth arrest or apoptosis and has a negative regulation on cell division by controlling an array of genes in this process [43] . In addition, p53 and DNA damageregulated gene 1 (pdrg1) are involved in multiple cellular, such as apoptosis, DNA damage repair, cell cycle. In Artemia sinica, PDRG1 takes an essential role in diapause termination and regulation of cell cycle during early embryonic development [44] . By blocking the expression of pdrg1 in human colon cancer cells, cell growth reduced significantly [45] .
Transcriptome based analysis indicated that the expression of p53 was significantly increased during diapause, and the contrary result was observed for pdrg1 in T.
dendrolimi. Moreover, this result also revealed that the apoptosis activity was enhanced although the diapause individual remained in a dormancy state. It is speculated that, during diapause, the wasps had accumulated harmful substances to some degree to maintain diapause.
Ribosome biogenesis has an vital role in the control of cell growth and division in eukaryotes [46] . In the present study, we identified 31 DEGs involved in ribosome 9 biogenesis, of which 28 were up-regulated during diapause in the prepupae stage (Fig. 6).
It was reported that the rate of ribosome synthesis during diapause was lower than that of non-diapause eggs in Bombyx mori, so the upregulation of ribosomal proteins played an important role in blocking diapause [47,48] . In mosquitos, Culex pipiens, the expression of rpS3a was dramatically reduced for a short time in the diapause stage. After the injection of rpS3a dsRNA into non-diapaused females, the development of follicle was arrested, similarly to in the diapause state [49] . Conversely, in T. dendrolimi, diapause prepupae had a higher expression of ribosomal proteins compared to non-diapause wasps. It is noticeable that B. mori and C. pipiens enter diapause as adults, whereas T. dendrolimi enter diapause as prepupae. Although diapause decreased metabolism level, they may still need more energy to maintain this diapause condition. were up-regulated during T. dendrolimi prepupal diapause. This was different from other species, such as solitary bee Tetrapedia diversipes [50] and Tetranychus urticae ! , in which transcripts are down-regulated during diapause. In other species, the downregulation of GST and UDPGT might be correlated with non-feeding conditions, so the amount of exogenous substances decreases accordingly. However, T. dendrolimi is a kind of endoparasitoid wasp that spends its whole life cycle within the host egg, meaning starvation is a rare situation until the prepupae stage. Therefore, we speculated that the reason for the observed up-regulation of these two genes in diapause T. dendrolimi might be due to the increased resistance under unfavorable environmental conditions. Cytochrome P450s (CYP450s) are hemoproteins involved in several physiological processes such as biosynthesis of hormones and degradation of xenobiotics [51] . Previous studies showed that CYP450 of Schistosoma mansoni was essential for worm survival and egg development [52] . Besides, CYP4G1 was proved to be related to cuticular hydrocarbon biosynthesis in Drosophila [53] . In T. denrolimi transcriptome, we screened three genes (CYP4G15, CYP6A2, CYP6K1) belonging to two CYP families, CYP4 and CYP6. These genes were up-regulated in diapause individuals, suggesting that when T. denrolimi entered diapause, the environmental conditions were not suitable for survival. In this process, many harmful substances may be produced. So, a possible function for these genes is to balance out harmful substances, maintaining cellular homeostasis.

UDP-glucuronosyltransferase
Autophagy is a conserved degradative pathway to remove unnecessary or dysfunctional cellular components [54,55] . It is an indispensable catabolic and evolutionarily conserved process. Cytoplasmic components, including macromolecules and organelles are the targets in this pathway, which for degradation in response to stress conditions, such as developmental tissue remodeling, prolonged starvation periods, and nutritional fluctuations in the environment [56,57]. To date, at least 30 autophagy-related (ATG) genes have been identified [58][59][60]. In T. dendrolimi, there were ATG1, ATG2, ATG9, and ATG13, which were significantly up-regulated in diapause.
In yeast, Saccharomyces cerevisiae, autophagy-related protein 13 (ATG13) is required for glycogen storage during the stationary phase [61] . This result agreed with our results that the mRNA abundance of ATG13 was higher in diapause T. dendrolimi than non-diapause T.
dendrolimi. In addition, for all we know, the present study is the first to report that ATG9 might be engaged in diapause. It has been reported that ATG1 and ATG13 might be regulated by the mTOR signaling pathway [56] . In this study, the expressing profiles of genes in the mTOR signaling pathway were distinct among different diapause stages. Most of them were up-regulated, including ATG1 and ATG13. These findings were largely different from observations of other insect species (Fig. 7). The mTOR signaling pathway and ATGs may have an unknown regulatory relationship in terms of controlling diapause. A phylogenetic tree was constructed using ATG13 from different insect species (Fig. 8).
TdATG13 did not cluster with species in the same genus but instead with species in Lepidoptera. This might be due to an evolutionary adaption between endoparasitoids and hosts.
Besides, ATG2 is the most downstream ATG protein in the preautophagosomal structure organization process. Valverde (2019) implied that lipid homeostasis mainly regulated by the activity of this autophagy-dependent protein [62] , which was in line with Velikkakath's previous finding (2012) [63] . ATG2 also plays a significant role in life span extension [64] .
Certainly, extended longevity is one of the typical characteristics of diapause. Through phylogenetic analysis of ATG2, T. dendrolimi and N. vitripennis have high similarity, suggesting that ATG2 is possibly conservative to parasitoid wasp species (Fig. 9).
Lipid metabolism is essential for energy homeostasis of any organism. It has been shown that some diapausing insects use lipids as predominant energy stores [65,66] . During diapause, almost all selected lipid metabolism-related genes were up-regulated, coinciding with the mobilization of TAG reserves (Fig. 10). These genes have important influence in the formation of triacylglycerol, which is the main caloric reserve during diapause. In addition, several genes' expressions significantly increased after diapause termination. This might be due to post-diapause development. In our previous study, diapause T. dendrolimi had higher numbers of parasitized hosts than non-diapause T.
dendrolimi [9] . The increasing lipid storage can provide more energy for maintaining activities.
Trehalose, the main hemolymph sugar in many insects, functions in energy metabolism and protection in adverse ambient stresses, such as dehydration, heat, freezing, desiccation, and oxidation [67,68] . During diapause, trehalose is accumulated to increase survival [69] . Trehalase ( tre) is an essential enzyme in trehalose catabolism. The trehalose transporter (tret) plays a decisive role in regulating trehalose concentration. In our results, tre was down-regulated during diapause of T. dendrolimi. However, the opposite was verified for tret, which was significantly up-regulated at the diapause stage (Fig. 11).
The downregulation of tre and upregulation of tret increased the amount of trehalose during diapause.
Heat shock protein 68 (hsp68) was expressed constitutively in T. dendrolimi, although the expression level during the diapause stage was higher than that during the other three stages. HSPs are known as stress proteins, and Hsp68 belongs to the Hsp70 super family [70] . As in T. dendrolimi, the hsp68 of Drosophila melanogaster are cold inducible. A study on D. melanogaster found that after inducing diapause by low temperature, hsp68 mRNA accumulated to higher amounts than in normal developmental conditions [71] ! .
According to this result, hsp68 might take effect in the process of T. dendrolimi diapause.

Conclusions
Diapause is not only an important physiological process in insects but has also shown great potential as an effective method for the preservation of natural enemy commodities.
Taking advantage of the characteristics of diapause, the developmental cycle can elongat to increase the application efficiency of T. dendrolimi. This is important to the mass production of Trichogramma on commercial and industrialized scales. Although diapause has been successfully used as a technical means, the molecular mechanism of diapause 13 remains largely unknown.
In this study, we compared gene expression profiles among different diapause stages of T.
dendrolimi. Our results were either consistent with previous studies or provided additional information to understand the mechanism of diapause. Other novel genes, such as p53, pdrg1, and ATG13, might be relevant to diapause or autophagy-related processes. Further studies are needed to elucidate the functions of candidate genes during diapause. It should be noted that several pathways enrichment was not significant due to the lack of data on this species. Hence, we should not ignore the role of independent genes.
Our results not only provide crucial information to generate the genetic diapause toolkit but also establish a basis for improving the practical application of T. dendrolimi in biological control programs. However, as diapause is a complex process influenced by many factors, future studies are required to build a dynamic network to elucidate the adaptability between insects and environment.

Diapause induction and termination
Approximately 1,000 eggs of A. pernyi were parasitized for 2 h by approximately 2,500 T. dendrolimi adults. Then, the parasitized host eggs were separated into two groups for different treatments (Table 1). T. dendrolimi entered diapause at the prepupa stage.
Diapause prepupae and pupae after diapause were denoted as Dpre and Dp, respectively.
Non-diapause prepupae and non-diapause pupae were denoted as NDpre and NDp. The methods of diapause induction and termination in this experiment were selected based on those of Zhang et al. (2017). Specifically, diapause was induced by keeping T. dendrolimi at 12 °C for 30 days, and diapause was terminated by keeping T. dendrolimi at 3 °C for 70 days (Fig. S1; Table 1) [17] .

RNA isolation and qualification and library preparation
Total RNA was extracted from four sample sets described above using TRIzol Reagent (Sigma Aldrich, St. Louis, MO). Three replicates of four parasitized eggs were evaluated for each experiment.
A total of 1.5 μg RNA per sample was used as input material for library preparation.
Libraries used for sequencing were generated using the NEBNext ® Ultra TM RNA Library Prep Kit for Illumina ® (NEB, USA) following manufacturer's recommendations, and index codes were added to attribute sequences in each sample. The prepared libraries were then sequenced with Illumina HiSeq (Novogene, China). The RNA-seq raw reads were deposited as project number PRJNA597631 in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). Three replicates were sequenced for each sample.

De novo transcriptome assembly and functional annotation
Firstly, raw data (raw reads) were processed using in-house Perl scripts. Clean data, namely clean reads, were got after removing adaptor, ploy-N, and low-quality reads. In the meantime, the Q20, Q30, GC-content, and sequence duplication level were measured. De novo transcriptome assembly was performed with Trinity 2.4.0 software [72] with min_kmer_cov set to 25 and all other parameters at the default values. All assembled unigenes were aligned using DIAMOND v0.8.22, NCBI blast 2.2.28+, HMMER 3.0, and KAAS, and with the protein and nucleotide sequences in five public databases (NR, NT, Pfam, KEGG, and Swiss-Prot) with a threshold of e < 0.00001. Functional annotation of all the unigenes was carried out using the Blast2GO v2.5 [73] , and databases Gene Ontology (GO) and Clusters of Orthologs Groups for Eukaryotic Complete Genomes (KOG) were used.

Identification of differential expression genes (DEGs) and functional classification
Transcript expression levels were estimated by RSEM (RNA-Seq by Expectation-Maximization) [74] . Clean data were mapped back to the assembled transcriptome, and the readcount for each gene was obtained from the mapping results. Differentially expressed genes (DEGs) between three conditions (Dpre vs. NDpre; Dpre vs. Dp; Dpre vs. NDp) were identified using the DESeq R package. The resulting P-values were adjusted using the Benjamini and Hochberg's method to control the false discovery rate (FDR). Transcripts with an adjusted P-value less than 0.05 and fold change greater than two found by DESeq were considered as differentially expressed.
Gene Ontology (GO) enrichment analysis of DEGs was implemented by the GOseq R package [75] . For GO enrichment analysis, corrected P-values less than 0.05 were considered as significantly enriched in DEGs.

Validation experiment by quantitative real-time PCR (qPCR) analysis
Quantitative real-time PCR (qPCR) was performed on a qTOWER 3

Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Competing interests
The authors declare that they have no competing interests. Note: A, B, and C represent the three biological replicates of each sample.     GO enrichment analysis. According to our data, ten GO items that might be related to diapause were selected. The value of the horizontal ordinate represented the number of DEGs in each GO item. Up-or down-regulated genes were coded by different colors. Metal ion binding, oxidation reduction process, and ribosome biogenesis, painted with grey background color, were identified as items to be further discussed.

Supplementary Files
This is a list of supplementary files associated with the primary manuscript. Click to download.