Transcriptomic analysis of Macrobrachium rosenbergii (giant fresh water prawn) post-larvae in response to M. rosenbergii nodavirus (MrNV) infection: de novo assembly and functional annotation

Background Macrobrachium rosenbergii, is one of a major freshwater prawn species cultured in Southeast Asia. White tail disease (WTD), caused by Macrobrachium rosenbergii nodavirus (MrNV), is a serious problem in farm cultivation and is responsible for up to 100% mortality in the post larvae stage. Molecular data on how M. rosenbergii post-larvae launches an immune response to an infection with MrNV is not currently available. We therefore compared the whole transcriptomic sequence of M. rosenbergii post-larvae before and after MrNV infection. Results Transcriptome for M. rosenbergii post-larvae demonstrated high completeness (BUSCO Complete: 83.4%, fragmentation: 13%, missing:3.3%, duplication:16.2%; highest ExN50 value: 94%). The assembled transcriptome consists of 96,362 unigenes with N50 of 1308 bp. The assembled transcriptome was successfully annotated against the NCBI non-redundant arthropod database (33.75%), UniProt database (26.73%), Gene Ontology (GO) (18.98%), Evolutionary Genealogy of Genes: Non-supervised Orthologous Groups (EggNOG) (20.88%), and Kyoto Encyclopedia of Genes and Genome pathway (KEGG) (20.46%). GO annotations included immune system process, signaling, response to stimulus, and antioxidant activity. Differential abundance analysis using EdgeR showed 2413 significantly up-regulated genes and 3125 significantly down-regulated genes during the infection of MrNV. Conclusions This study reported a highly complete transcriptome from the post-larvae stage of giant river prawn, M. rosenbergii. Differential abundant transcripts during MrNV infection were identified and validated by qPCR, many of these differentially abundant transcripts as key players in antiviral immunity. These include known members of the innate immune response with the largest expression change occurring in the M. rosenbergii post-larvae after MrNV infection such as antiviral protein, C-type lectin, prophenol oxidase, caspase, ADP ribosylation factors, and dicer.


Background
Macrobrachium rosenbergii, known as the giant freshwater prawn, is an economically important crustacean species in Southeast Asia. Production of M. rosenbergii increased dramatically over the last 20 years and has exceeded 200,000 t per year since 2002 [1]. A major problem in the cultivation of M. rosenbergii relates to loss through infectious diseases caused by viruses, bacteria, and fungi [2]. One of the most serious viral threats to M. rosenbergii culture is Macrobrachium rosenbergii nodavirus (MrNV) that causes white tail disease (WTD). This disease was first discovered in Guadeloupe Island [3] and subsequently in Taiwan [4], China [5], India [6], Thailand [7], and Australia [8]. MrNV is a non-enveloped, icosahedron with 26-27 nm diameter composed of a nucleocapsid bearing two positive single-stranded RNA genomes (RNA-1 and RNA-2). RNA-1 is 3202 bp -ssRNA encoding protein A or RNA-dependent RNA polymerase (RdRp) and protein B2 [9]. Protein B2 is capable of inhibiting the RNAi pathway of the host cell [10]. RNA-2 is 1175 bp -ssRNA encoding capsid protein [11]. The prominent sign of infected post larvae (PL) is the appearance of whitish muscle commonly in abdominal region. Mortalities may reach 100% within 7-15 days after the infection or 3-5 days after the appearance of the first anatomical signs [6]. However, experimental transmission of MrNV revealed that the virus failed to cause mortality in adult prawns [6]. As there is no cure for WTD infected prawns, preventive procedures have been implemented such as screening of brood stock and PL, and good farm management [7,12,13]. Understanding the prawn's immune response specifically to WTD may also provide bio-rationale targets to help contain and restrict disease outbreak.
As an arthropod crustacean, the innate immune system of M. rosenbergii is composed of both humoral and cellular responses. Humoral immune responses include up-regulation of the prophenol oxidase system (ProPO), clotting proteins, melanization and antimicrobial peptides [14]. Cellular immune responses involve hemocyte activities such as phagocytosis, apoptosis, nodule formation, and encapsulation which function to neutralize pathogens [15]. Indirect recognition of pathogens or pathogen-associated molecular patterns (PAMPs) by pattern recognition receptors (PRRs) such as Toll receptors leads to the activation of these humoral and cellular immune responses [16].
In recent years, high-throughput technology such as next generation sequencing (NGS) has emerged and is widely used in both genomic and transcriptomic research. NGS technology can also be used to study differential gene-expression on various tissues or certain conditions such as stress and pathogen infection [17] and in both model, and non-model organisms [18]. There are transcriptomic studies for penaeid shrimps such as Peneaus monodon [19][20][21], Litopeneaus vannamei [22][23][24], Fenneropeneaus chinensis [25,26], Fenneropenaeus merguiensis [27,28], and Marsupeneaus japonicus [29] to investigate tissue-specific expression, the stress response, and viral infection. Moreover, many studies have been performed on whole transcriptome sequencing of the hepatopancreas of M. rosenbergii in response to Vibrio parahaemolyticus infection [30], hepatopancreas and lymphoid organ in response to white spot syndrome virus (WSSV) [31,32], and intestinal tissue in response to WSSV or the viral PAMP mimic (poly I:C) [33]. Most recently, transcriptomic analysis of hematopoietic tissue of M. rosenbergii adult prawn in response to MrNV infection has been studied [34]. Many of differentially abundant transcripts were belonged to various immune mechanisms such as pattern-recognition receptors, antioxidants, and antimicrobial peptides [34].
MrNV has direct impact on M. rosenbergii post-larvae culture. However, there is no transcriptomic data on M. rosenbergii post-larvae in response to the infection with MrNV. To address this deficiency, the whole transcriptome from six biological replicates of each treatment from healthy and MrNV-infected PL were sequenced. In this study, a highly complete transcriptome for M. rosenbergii post-larvae was assembled and annotated. This transcriptome can be used as reference transcriptome for the further gene expression analysis of M. rosenbergii. Differential abundant transcripts were examined and immune-related genes in response to MrNV infection were reported. In addition, differential abundance and RNAseq results were validated by quantitative PCR in separate biological samples.

Results
Sequence read data and raw data pre-processing RNA sequencing for two groups with six biological replicates (n = 12) produced a total of 522,296,142 raw reads with an average of 43.5 ± 5.1 (mean ± SD) million read pairs per sample ( Table 1). The raw reads were subjected to quality trimming which included adaptor removal. After quality trimming, a total of 501,900,423 (96.09%) 65 bp trimmed reads had high average Phred score (96.13% with Phred ≥30).

De novo transcriptome assembly and quality assessment
The trimmed reads were subjected to de novo transcriptome assembly using Trinity. The trimmed reads were further reduced into 51,971,920 reads (10.36%) during in silico normalization prior to de novo assembly. The Trinity assembler produced 109 Table 2.

Functional annotation
The assembled transcripts were subjected to the homology search by Blastx software using UniProt and nonredundant arthropod database. Blastx search against UniProt database yielded 25,761 (26.73%) significant hits (E-value < 1-e5). The majority of top-hit species distribution were Homo sapiens with 5587 (21.7%) hits followed by Mus musculus and Drosophila melanogaster with 4234 (16.4%) and 4043 (15.7%) hits, respectively (Fig. 1a).

Differential abundance analysis
To identify differentially abundant transcripts between two groups, transcripts that had count per millions (CPM) more than 1 in at least two samples were selected before the analysis. Total of 31,377 transcripts survived the cut-off and were subjected to TMM normalization. The differential abundance analysis using EdgeR was performed followed by Benjamini-Hochberg method for multiple p-value correction. Total of 5538 transcripts were reported to be differentially expressed (FDR < 0.05, LogFC < ±1). Of those, 2413 transcripts were significantly up-regulated and 3125 transcripts were significantly down-regulated after the infection of MrNV. Full-list of differentially abundant transcripts is available in Additional file 2: Table S3. Summary of the transcriptome assembly, annotation and differential abundance analysis were listed in Table 3.
To examine the homogeneity across biological replicates, principle component analysis (PCA) was performed. Transcripts with extremely low abundance (sum of read count < 10) were filtered out. Total of 84,092 transcripts survived the cut-off and were subjected to the PCA. The PCA results showed strong clustering within each group. The both groups formed distinct clusters within principle component 1 (PC1) which responsible for 42.62% of the variance in the expression (Fig. 5). In addition, the top 100 most differentially abundant transcripts were clustered using Pearson's correlation and displayed in heatmap (Fig. 6). The biological replicates were clustered within the same group and demonstrate clear distinction between control and infected group.

Quantitative RT-PCR
To validate differential abundance results from the RNAseq pipeline, qRT-PCR was performed using nine selected genes from the list of DEG involved in the immune system. Elongation factor1-alpha (EF1-alpha) was used as an internal reference gene. Four separate biological replicates from each group were used in two-step qRT-PCR. The abundance levels were calculated using the delta-delta C t method. The qRT-PCR results showed that all up-regulated genes had greater differences in transcript abundance than those of RNAseq, whereas all down-regulated genes had smaller differences (Table 5). According to the qRT-PCR results, all of the selected genes were differentially abundant after the infection of MrNV (p > 0.05) except Spz which had p-value of 0.053 (Table 5). Comparative heatmaps were generated using relative abundance (qRT-PCR) and transcript per  (Fig. 7a). Expression patterns of all selected genes were comparable between qRT-PCR and RNAseq. Furthermore, the Pearson's correlation coefficients were calculated using the average log-fold change ratio between the two methods and demonstrated highly significant correlation as shown in Fig. 7b (R 2 = 0.9531).

Discussion
In this study, we report a highly complete transcriptome for M. rosenbergii post-larvae and identified immunerelated genes in response to MrNV infection using NGS platform as well as verified the transcript abundance of selected genes using qRT-PCR. We not only present the transcriptome data, but we have published our computational pipeline to benefit the wider scientific community. Raw data has been uploaded to the National Centre for Biotechnology Information Sequence Read Archive (SRA) under the accession BioProject number: PRJNA550272. The data analysis pipeline is available on GitHub at https://github.com/prawnseq/Mrosenbergii_ MrNV_RNAseq.
The transcriptome of giant fresh water prawn (M. rosenbergii) was assembled to expand the transcriptomic resources for further gene expression analysis of this species. Therefore, we aimed for maximizing transcriptome coverage while minimizing the redundancy to generate high quality transcriptome. The BUSCO results showed that the assembled transcriptome was highly complete (C:83.4%) with low fragmentation (F:13%), missing (M:3.3%), and duplication (D:16.2%). These     [20]. According to Blastx results against UniProt database, the top hits were mammalian species (H. sapiens, M. musculus) followed by D. melanogaster because UniProt database is a comprehensive annotation database which annotated and reviewed by the experts [38]. Therefore, majority of UniProt subjects were obtained from well-studied model organisms such as human, mouse and fruit fly. However, according to Blastx results against Nr-Arthropod database, majority of the top hits were matched to crustacean species, Hyalella Azteca. Previous transcriptomic studies of M.rosenbergii performed de novo assembly separately from each sample and then clustered into a global transcriptome [30][31][32][33].
In this study, all read data were merged before the assembly, resulting in an increase in the abundance of low expressed transcripts, therefore increasing coverage of the assembled transcripts [39]. In addition, the highest E x N 50 value at 94% indicated that the assembled transcripts had high coverage and was assembled from sufficient read data [40].
To identify differentially abundance transcripts associated with MrNV infection, we performed RNA sequencing on six replicates of each healthy PL and experimentally MrNV infected PL. The results showed that 5538 transcripts were differentially expressed with 2413 transcripts were up-regulated and 3125 transcripts were down-regulated. Among those transcripts, some of these were involved in the innate immune system in response to viral infection.
To validate our transcript assembly and differential abundance results, we also selected 9 targets for validation with qPCR, and demonstrated a very low false discovery rate in these genes. We filtered our list of DEGs according to functional groups of importance including pattern recognition proteins (PRPs) and antiviral protein, prophenol oxidase (ProPO) system, the Toll-IMD signaling pathways, antimicrobial peptides (AMPs) and blood clotting system, phagocytosis and apoptosis, antioxidant system, and RNA interference (RNAi). Numerous studies demonstrated high correlation between RNAseq and qPCR data [41][42][43]. Moreover, RNAseq studies of M. rosenbergii in response to bacterial and viral infection also showed high correlation between RNAseq and qPCR results [30][31][32]. To enhance our confidence, we validated these results in independent biological samples. Crustacean innate immune system requires the recognition of the pathogens by pattern recognition proteins (PRPs) to trigger the immune responses [44]. PRPs are groups of germ-line encoded proteins that can activate humoral and cellular immune response via immune signaling pathway [45]. Several studies identified prawn PRPs and examined its role in prawn immune system regarding viral infection. M. rosenbergii C-type lectin (MrCTL) expression in hepatopancreas was up-regulated after a challenge with Vibrio parahaemolyticus or white spot syndrome virus (WSSV) [46]. Ficolin expression in M. rosenbergii hepatopancreas was found to be upregulated after the infection of V. anguillarum and WSSV [47]. Moreover, M. rosenbergii mannose-binding lectin (MBL) expression in gills was also up-regulated in response to WSSV or MrNV infection [48]. In this study, various PRPs such as C-type lectin, ficolin, and antiviral protein which are part of C-type lectin family were differentially expressed after the infection of MrNV suggesting that these PRPs play important roles in immune system against MrNV infection. Importantly, we validated the expression of one of these genes, antiviral protein, which was up-regulated by 2.48-fold in the RNAseq and 8.97-fold by qPCR (Table 5). However, in our results, MBL expression was down-regulated which contradicts the previous report [48]. This maybe because MBL is not required for antiviral response of post-larvae prawn against MrNV or the expression of MBL in postlarvae prawn being suppressed by MrNV. Further investigation is needed to understand the effects of MrNV to MBL expression in post-larvae stage.
Prophenol oxidase (ProPO) activating system participates in first line of immune system by triggering melanization and other responses such as hemocyte induction, encapsulation, and nodule formation [14,49,50]. Recognition of PAMPs by PRPs leads to activation of serine proteinases cascade, resulting in the production of active PO enzyme. The active PO enzyme produces polymeric melanin around invading pathogens resulting in melanization [50]. The RNAseq presented here shows a 2.01-fold up-regulation of ProPO expression in response to MrNV infection in which the expression was validated by qPCR (9.38-fold up-regulation; Table 5). In addition, three types of prophenoloxidase activating enzyme were also up-regulated in RNAseq results. These results indicate involvement of ProPO-activating system in MrNV infection.
The Toll-IMD signaling pathways are considered to be the most crucial immune signaling pathways in invertebrates [51]. The toll receptor is triggered by cytokine- Fig. 6 Heatmap of top 100 differentially expressed transcripts. The heatmap was generated using trimmed mean of M-values (TMM). Sample clustering was done using Pearson's correlation. The Z-score scale is shown in the top-right corner ranging from − 2 (blue) to 2 (yellow)  [52,53]. Triggering the toll receptor leads to activation of NF-kappa-B family protein Dif/Dorsal and then leads to up-regulation of immune-related genes such as antimicrobial peptide (AMPs) [53]. Toll receptor from M. rosenbergii have been identified [54,55] and was found to be gradually up-regulated in gills during the WSSV challenge [55]. M. rosenbergii spätzle protein has been found to be upregulated in hemocytes after the bacteria infection [56].
In addition, spätzle protein in F. chinensis was upregulated after challenged with V. anguillarum and WSSV [57]. Nuclear factor NF-kappa-B p110,also known as relish, is an important nuclear transcription factor in the IMD signaling pathway which functions parallel to the Toll pathway [53,58]. M. rosenbergii relish was reported to be involved in bacterial infection and overexpression of relish induced the expression of various AMPs [59]. In this study, we found that the toll and NFkappa-B p110 expression were up-regulated after the infection of MrNV, whereas the expression of spätzle was down-regulated. We also validated the expression of spätzle using qPCR which was 3.26-fold down-regulated compared to 17.88-fold down-regulated in RNAseq (Table 5). These results suggested that these genes are involved in the immune system against MrNV infection. Down-regulation of spätzle may be caused by a negative feedback regulation from the activation of the toll pathway [60,61]. Antimicrobial peptides (AMPs) are also important components in first line of immune system. AMPs are usually small cationic, amphipathic, germ-line encoded proteins that have rapid and efficient antimicrobial effects against broad spectrum of microorganisms including bacteria, fungi, and viruses. AMPs differ in structural conformation, charge, and amphipathicity [62,63]. Fundamentally, AMPs disrupt the membrane integrity of the target and then destroy the microbe by membrane destabilization or pore formation [62,64]. Several studies reported that anti-lipopolysaccharide factors (ALFs) and lysozymes expression were up-regulated after the infection of WSSV in M. rosenbergii, F. chinensis, and L. vannamei [35,65,66]. Crustin expression in M. rosenbergii hemocytes was found to be up-regulated after WSSV, infectious hypodermal and hematopoietic necrosis virus (IHHNV), and Aeromonas hydrophila challenges [67]. However, crustin isoform 1 and 2 expression in P. monodon were down-regulated after WSSV infection, whereas crustin isoform 3 was up-regulated [68]. In our report, ALFs, i-type lysozyme-like protein 2 (LYZL2), and crustin members were differentially expressed during MrNV infection. ALFs and LYZL2 were up-regulated, whereas several crustin isoforms were both up-and downregulated. Additionally, the validation by qPCR showed that ALF1 was up-regulated by 5.86-fold which was comparable to the 2.25-fold up-regulation by RNAseq (Table 5). These suggested that ALFs and LYZL2 are involved in the prawn immune response against MrNV, whereas only certain isoforms of crustin are involved.
Blood clotting is a humoral response which prevents hemolymph loss and microbial spread during injury [69]. In crustacean, blood clotting involves cross-linking aggregation of clotting proteins (CPs) by calcium-dependent transglutaminase (TGase) produced by the hemocyte [70]. Previous studies showed that lysozyme and crustin expression in M. japonicus were down-regulated in TGase depleted prawn suggesting that there is a link between the blood clotting system and AMPs [71]. We found that transglutaminase and hemicentin-1-like isoform X2 (HMCN1) expression were up-regulated. In addition, we validated the expression of HMCN1 which was upregulated by 5.31-fold in the RNAseq and 16.07-fold by qPCR (Table 5). These results suggested that these two blood coagulation components are involved as a response to the infection of MrNV. Moreover, up-regulation of two blood coagulation components, lysozyme, and certain isoforms of crustin may indicate a link between the blood clotting system and AMPs in M. rosenbergii.
Apoptosis or programmed cell death plays an important role in cellular immune response by limiting viral replication and eliminating virus-infected cells in multicellular organisms [16,72]. Apoptosis requires the activation of caspases, highly conserved cysteine proteases that are involved in the execution of cell death [73]. Previous studies characterized M. rosenbergii caspase (MrCasp) and demonstrated that the capsid protein of MrNV could inhibit apoptotic responses of MrCasp in Sf9 cells [74]. M. rosenbergii caspase 3c expression in hemocyte was found to be up-regulated after IHHNV challenge [75]. M. rosenbergii inhibitor of apoptosis protein (IAP) has been characterized and was found to be upregulated in hepatopancreas after IHHNV infection [76]. IAP is one of the apoptosis regulators that binds and inhibits the activity of caspase [77]. Additionally, L. vannamei IAPs-silenced by RNAi demonstrated higher expression of WSSV genes and significant downregulation of AMP genes [35]. In this study, we reported up-regulation of caspase, caspase4, and IAP in response to MrNV challenge. We also validated the expression of caspase which was 2.64-fold up-regulation in RNAseq and 12.05-fold by qPCR (Table 5). Up-regulation of these genes suggests that these gene are involved in MrNV infection.
Phagocytosis plays a role in innate immune system by ingesting microparticles including microbial pathogens and cellular debris from apoptosis and necrosis [78]. Recent studies reported that the small GTPases play a role in antiviral immunity by controlling cellular trafficking and regulating phagocytosis [79][80][81]. ADP ribosylation factors (Arfs), which are small ubiquitously expressed GTPases have been characterized in M. rosenbergii and M. japonicus. Arfs expression was up-regulated in both species after WSSV challenge [82,83] [84]. The expression of Rab GTPases, Ras-like GTPase superfamily members, are up-regulated in response to WSSV [85] and MrNV infection. We validated, ADP ribosylation factors (Arfs), using qPCR and found 5.45-fold upregulation compared to 2.19-fold by RNAseq (Table 5). These indicates involvement of these small GTPases in antiviral immunity against MrNV.
In prawns, RNA interference (RNAi) plays a crucial role in antiviral immunity. Several key components of RNAi has been characterized in prawns including isoforms of dicer and argonaute in M. rosenbergii [90], L. vannamei [91][92][93], and P. monodon [94,95]. Dicer, RNase III-like enzyme, is responsible for the cleavage of long dsRNA into 21-30 bp siRNA which initiates RNAi mechanism [96,97]. Genome derived silencing RNAs (siRNAs) are unwound and incorporated with argonaute protein which is the main component of the RNA-induced silencing complex (RISC) [98]. RISC then recognizes and degrades target mRNA which represses the expression of the target genes [99]. Previous studies reported that administration of synthetic dsRNA/ siRNA specific to genes of yellow head virus (YHV) [100,101], or WSSV [102] resulted in specific inhibition of viral replication. Moreover, dsRNA can be formed during the replication of both RNA (taura syndrome virus (TSV) and YHV) and DNA viruses (WSSV) which triggers antiviral responses via RNAi [103]. Recent studies revealed that 31 miRNAs were differentially expressed during WSSV infection suggesting that these miRNA may be involved in innate immunity [104]. Additionally, a total of 24 miRNAs have been identified in M. japonicus and reported that these miRNAs may be involved in innate immunity including regulating processes such as phagocytosis, proPO system, and apoptosis [105]. Taken together, both siRNAs and miRNAs have participated in shrimp antiviral immunity. In this study, dicer-2 and argonaute-3 were significantly up-regulated in response to the MrNV infection. Importantly, the qPCR validation showed 6.81-fold up-regulation of dicer-2 compared to 3.66fold up-regulation from RNAseq results (Table 5) indicating that the production of RNAi may play an antiviral role against MrNV infection in M. rosenbergii.

Conclusion
In conclusion, this study reported a highly complete transcriptome from the post-larvae stage of giant river prawn, M. rosenbergii. This transcriptome expands the transcriptomic resources for further gene functional analysis and transcriptome profiling in response to certain conditions. Transcripts abundance analysis between control and MrNV-infected group revealed significant differences in the transcript abundance of various immune responses e.g. immune signaling pathway, prophenol oxidase system, antimicrobial peptides, blood clotting system, phagocytosis and apoptosis, and RNA interference. To our knowledge, this study is the first report of the transcriptomic profile of M. rosenbergii post-larvae during the infection of MrNV. This study also provides preliminary insight on molecular responses of the prawn to MrNV infection that improve our understanding of M. rosenbergii antiviral responses and may provide molecular targets to help contain the disease outbreak.

Materials and methods
Preparation of MrNV infected M. rosenbergii post-larvae M. rosenbergii post-larvae (PL 25-30) were purchased from local farm at Suphan Buri provinces, Thailand. The healthy PL were kept in a glass tank with dechlorinated freshwater and continuous aeration at room temperature (25-27°C). After 1 day of acclimation, the PL were divided into two groups including control and MrNV group. In the MrNV group, the PL were challenged with 10% w/v MrNV-infected PL homogenated in TN buffer (20 mM Tris-HCl and 0.4 M NaCl, pH 7.4) using the immersion method [106]. The control group were challenged with TN buffer instead of MrNV. After 4 days of immersion, the MrNV-infected PL and the PL from control group were divided into 10 subgroups (10 PL per each subgroup). Six subgroups were used in Illumina sequencing, whereas four subgroups were used in quantitative RT-PCR experiment. The PL samples were immersed in DNA/RNA Shield (Zymo Research, USA) and stored at -80°C until RNA extraction.
The PL from both groups were subjected to MrNV detection using RT-PCR. Viral nucleic acid was extracted from the PL using High Pure Viral Nucleic Acid Kit (Roche). The MrNV-specific primers designed by Senapin and others were used in this study [107]. The RT-PCR was carried out using Super-Script® III One-Step RT-PCR System (Invitrogen) and the extracted viral nucleic acid as a template. The RT-PCR condition was 50°C for 30 min, followed by 94°C for 5 min, 35 cycles of 94°C for 1 min, 55°C for 45 s, and 72°C for 1 min, and final extension at 72°C for 10 min.

RNA extraction and Illumina sequencing
Total RNA was extracted from each subgroup using Quick-RNA™ MiniPrep (Zymo Research, USA) according to manufacturer's protocol. The extracted RNA was quantified using DropSense 16 Micro-volume spectrophotometer (Unchained Labs, USA) and stored at -80°C until library preparation.
The cDNA library was prepared using SENSE mRNA-Seq Library Prep Kit V2 (Lexogen, USA) and the purification module with magnetic beads (Lexogen, USA). Each library was indexed by 6nucleotide-long i7 indices during library amplification step according to manufacturer's protocol. The prepared cDNA library was subjected to quality assessment using Qubit 4 Fluorometer (Invitrogen, USA) and LabChip GX Touch 24 microfluidic nucleic acid analyzer (PerkinElmer, USA). Before denaturation, equal amount of each cDNA library were pooled together and re-purified using the magnetic beads. Total of 2 nM pooled library was subjected to denaturation steps according to NextSeq System Denature and Dilute Libraries Guide (Illumina, USA). Cluster generation and paired-end sequencing with 75 bp were performed on a NextSeq 550 sequencer (Illumina, USA) using NextSeq 500/550 High Output Reagent Cartridge v2 150 cycles (Illumina, USA).

Data analysis pipeline
The automated data analysis pipeline was written in python using the Snakemake tool [108]. The data analysis pipeline contained three major sections including raw data pre-processing, transcriptome assembly, and post-processing of the transcriptome (Fig. 8). Quality assessment of the sequence data was performed using FastQC v. 0.11.5 [109] and complied using MultiQC v 1.8 [110]. Raw read data were subjected to quality trimming including trim low quality bases, removing N nucleotides, and discarding reads below 36 bases long using Trimmomatic v 0.36 [111]. The trimmed reads were subjected to quality assessment using FastQC and MultiQC and then merged for transcriptome assembly using merge command.
The de novo transcriptome assembly was performed using Trinity software v 2.8.0 using default parameters [112]. The in-silico normalization was performed within Trinity prior to de novo assembly. This process resulted in transcripts with relatively high redundancy. To remove the redundancy, transcripts that have more than 95% of identity were clustered together using CD-HIT software [113]. The final transcriptome was subjected to quality assessment including calculate the fragment mapping rates Fig. 8 The overviews of analysis pipeline. The pipeline is divided into three major parts including raw data pre-processing, transcriptome assembly, and post-precessing of the transcriptome which indicated by area of different colors. Boxes represent datasets. Rounded boxes represent analyses. The software for each analysis is indicated at the top of the box whereas the database for homology search are listed under the box using Bowtie 2 v 2.3.0 [114], examine orthologs completeness using BUSCO v 3 [115] against arthro-poda_odb9 database, and generate N x and E x N 50 statistics using 'contig_ExN50_statistic.pl' script within Trinity assembler.
The final transcriptome was aligned against UniProt and NCBI's non-redundant arthropod database using Blastx. The results from Blastx against UNIPROT database were then used to obtain functional annotation from EggNOG, KEGG, and GO database using Trinotate v 3.0.2 [40].
To identify differentially abundant transcripts, the raw read counts were calculated by RSEM software [116] and then used to generate abundance matrix. Differential abundance analysis was performed using EdgeR [117]. EdgeR uses the trimmed mean of Mvalues normalization method (TMM) to calculate the transcript abundance levels [118] with the Benjamini-Hochberg method for multiple p-value correction [119]. The transcripts that had at least two-fold change with a false discovery rate (FDR or adjusted p-value) less than 0.05 was considered as differentially abundant transcripts.
Total RNA was extracted from separate biological samples using Quick-RNA™ MiniPrep (Zymo Research, USA). The first-strand cDNA was synthesized using SensiFAST™ cDNA Synthesis Kit (Bioline, UK) according to manufacturer's protocol. The qPCR was carried out using SensiFAST™ SYBR® Lo-ROX Kit (Bioline, UK). The qPCR condition was 95°C for 2 min, 40 cycles of 95°C for 5 s, 60°C for 10 s, and 72°C for 15 s, followed by melting curve analysis. The deltadelta C t method was used to calculate relative fold-change of gene expression between control and infected samples [121]. Statistical differences between two groups were conducted using a simple student t-test. The results from qPCR were then compared with the results from RNAseq pipeline using heatmap and coefficient of determination (R 2 ). Table 6 Primers used in the qRT-PCR experiment