Integration of transcriptomics and network analysis reveals co-expressed genes in Frankliniella occidentalis larval guts that respond to tomato spotted wilt virus infection

Background The gut is the first barrier to infection by viruses that are internally borne and transmitted persistently by arthropod vectors to plant and animal hosts. Tomato spotted wilt virus (TSWV), a plant-pathogenic virus, is transmitted exclusively by thrips vectors in a circulative-propagative manner. Frankliniella occidentalis (western flower thrips), the principal thrips vector of TSWV, is transmission-competent only if the virus is acquired by young larvae. To begin to understand the larval gut response to TSWV infection and accumulation, a genome-assisted, transcriptomic analysis of F. occidentalis gut tissues of first (early L1) and second (early L2 and late L2) instar larvae was conducted using RNA-Seq to identify differentially-expressed transcripts (DETs) in response to TSWV compared to non-exposed cohorts. Results The larval gut responded in a developmental stage-dependent manner, with the majority of DETs (71%) associated with the early L1 stage at a time when virus infection is limited to the midgut epithelium. Provisional annotations of these DETs inferred roles in digestion and absorption, insect innate immunity, and detoxification. Weighted gene co-expression network analysis using all assembled transcripts of the gut transcriptome revealed eight gene modules that distinguish larval development. Intra-module interaction network analysis of the three most DET-enriched modules revealed ten central hub genes. Droplet digital PCR-expression analyses of select network hub and connecting genes revealed temporal changes in gut expression during and post exposure to TSWV. Conclusions These findings expand our understanding of the developmentally-mediated interaction between thrips vectors and orthotospoviruses, and provide opportunities for probing pathways for biomarkers of thrips vector competence. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-08100-4.


Background
The alimentary canal of arthropod vectors is the primary site of entry by various microbes and viruses through feeding on diverse hosts [1,2]. In contrast to externallyborne, non-persistent and semi-persistent plant viruses, persistently transmitted plant viruses become internalized (enter tissue systems), thus the virus must overcome multiple insect membrane barriers and ultimately traverse the salivary gland system for plant inoculation during feeding [3,4]. The insect gut is the first major barrier that these persistent viruses must encounter during their entry into vectors [3,5]. Despite the fact that many viral determinants of transmission have been characterized [6], vector molecules that govern the initial infection of gut tissues are just beginning to be characterized. With advances in high-throughput sequencing, myriad global transcriptional responses of insect vector species to the persistent viruses they transmit have been reported [7][8][9][10][11][12][13][14][15][16]. However, most of these studies focused primarily on the responses of whole bodies to virus infection with little work centered on the gut tissue level [9,10,17]. Given that vector competence for persistent viruses is initially dependent on infection of the gut tissue, understanding the molecular interactions between vector gut and virus is essential.
Tomato spotted wilt orthotospovirus (common name: tomato spotted wilt virus, TSWV), type species in the genus Orthotospovirus (Family Tospoviridae, Order Bunyavirales), causes serious disease in the United States and global agriculture. TSWV is exclusively transmitted by herbivorous thrips species (Thysanoptera: Thripidae). By virtue of the wide geographical distribution and efficiency in virus transmission, Frankliniella occidentalis Pergande (the western flower thrips) is considered as the primary vector of TSWV [4]. TSWV consists of a tripartite genome of negative-sense, single-stranded RNAs (denoted as L, M, and S) that are encased in a lipid bilayer envelope. Each genome segment is encapsidated by viral nucleoprotein (N) and bound to viral RNAdependent RNA polymerase (L). The viral glycoproteins (Gn and Gc) that decorate the virus particle are the viral determinants of vector acquisition in the midgut [18][19][20][21]. Two non-structural proteins, NSm and NSs, are also encoded by TSWV, functioning as a movement protein in plant hosts and a silencing suppressor in plant and insect cells, respectively [22][23][24].
TSWV is a persistent-propagative virus and transmission is tightly linked to the development of F. occidentalis [15,25,26]. TSWV infection can be sustained in the midgut after virus ingestion by adult thrips, but the transmission-competency of thrips presents only if the virus is acquired as larvae [27]. The anterior region of the larval midgut is the first site of TSWV infection. After infection of a few epithelial cells in the first instar larva (L1), TSWV replicates and disseminates into adjacent cells and eventually infects the remaining midgut tissue. This results in an increase of virus titer as the L1 develops into the second instar larva (L2), reaching a peak at the late L2 stage [26]. Following the first phase of virus accumulation in the larval stages, virus titer decreases rapidly in the propupal and pupal stages. The second phase of virus accumulation appears during the adult stage when the principal salivary glands become the primary site for TSWV replication [25,26,[28][29][30].
The transcriptome responses of whole-body F. occidentalis and Frankliniella fusca to TSWV exposure at various developmental stages (larval, pupal, adult) revealed that TSWV perturbed the transcriptomes of both species in a developmental stage-specific manner. However, the overall responses of both species to virus exposure were subtle. Less than 1 and 1.6% of transcripts were differentially expressed in F. occidentalis and F. fusca, respectively [14,15]. A proteome study of TSWVexposed and non-exposed F. occidentalis L1s also reported that only 5% of proteins were differentially regulated due to TSWV infection [31]. These findings may indicate the nature of non-pathogenicity of TSWV to specific thrips vectors or the relationship between viral tissue tropism and the degree of thrips response to TSWV infection.
Considering the tight linkage between thrips development and TSWV transmission, as well as the determinative role of gut tissue in the initial infection process, we aimed to identify gut genes that are differentially regulated by TSWV infection during larval development. We conducted a comparative transcriptome analysis on gut samples of F. occidentalis associated with TSWV infection at three different time points during the larval developmental stages (L1, early L2, and late L2) using RNA sequencing (RNA-Seq). The three time points represent not only the different larval developmental stages, but the varying levels of virus accumulation and dissemination in the larval guts from a few infected epithelial cells of the anterior region of the midgut to a more widespread infection of the entire midgut [25,32,33]. Additionally, we performed a weighted gene correlation network analysis (WGCNA) to identify clusters of coexpressed gut genes (modules) significantly correlated with thrips larval stages and putative hub genes responsive to virus infection in the most significant gene modules. Lastly, to gain a deeper insight into the effect of virus on gut gene expression before and after removal of virus inoculum, a time-course experiment was performed to detect the temporal expression of network hub and peripheral transcripts. To our knowledge, this is the first gut tissue-specific omics study for any thysanopteran species. This work will provide insights into the important genetic components that respond to TSWV exposure in F. occidentalis larval guts and also further our understanding of the roles of guts in thrips-TSWV interactions.

Results
Sequence read quality and summary statistics Over 1.8 billion raw reads in total were generated for the 24 RNA-Seq libraries, and after quality filtering, 80% of the reads on average were retained with 88% of them having Phred quality scores > 30 and an average length of 143 bp (Table S1). On average across the 24 libraries, 86% of the quality paired end reads aligned to the annotated gene models in the F. occidentalis genome reference.

Differential gene expression of larval guts in response to TSWV
There were 147 non-redundant DETs in response to TSWV over the course of larval growth and development, of which 71% were associated with L1 ( Fig. 1). The number of up and down-regulated DETs in L1 guts was comparable; however, for both L2 stage-times, most to all of the DETs were down-regulated relative to their non-infected counterparts (Fig. 1A). The least perturbed stage-time was early L2s, 24 h after removal of inoculum, with only 7 DETs. Across the three stage-times, the absolute change in transcript expression between virusinfected and non-infected ranged from 2 to 588 fold (1.04-9.2 log 2 fold), and the average log2fold change ranged from − 3.68 (48-L2) to 3.41 (3-L1). The top 10 most virus-responsive transcripts and their predicted annotations for each stage-time are listed in Table 1.
Of the 147 DETs, 18.4% of the sequences matched hypothetical proteins (Table S2), whereby the most significant matches were NCBI Genbank accessions of the F. occidentalis OGS v1.0 peptides, as well as hypothetical proteins of Thrips palmi, another genome-sequenced thysanopteran. In addition, only 3.4% of the DETs matched uncharacterized proteins in other metazoans. Only four DETs in the total dataset did not contain predicted Interpro IDs specifying conserved peptide motifs, e.g., transmembrane domain, cytoplasmic domain, coil, and these sequences belonged to the 'hypothetical proteins' set.

Developmental stage-dependent larval gut response to TSWV
There was little to no overlap in virus-responsive DETs across the three larval stage-times (Fig. 1B). The two stages farthest apart in developmental time (3-L1 and 48-L2) shared the most in number of DETs (13), which might be expected given the few number of DETs associated with the early L2 stage (24-L2). The 13 shared DETs have provisional gene ontologies of proteolysis and hydrolysis, carbohydrate metabolism, lipid metabolism, and structural constituent of cuticles, such as transmembrane protease serine, putative beta-glucosidase, myrosinase, acyl-CoA desaturase, and endocuticle structural glycoprotein. Most of these 13 transcripts were down-regulated in both larval stage-times. Of those 13, showing the distribution of differentially-expressed transcripts (DETs) in virus-infected guts relative to non-infected guts at three different times over larval development: 3-L1 = cohort of first instars three hours after removal of virus inoculum after a 24-h acquisition access period; 24-L2 = same cohort of insects developed into early second instars, 24 h after removal of inoculum; and 48-L2 = same cohort of insects entering into late second instars, 48 h after removal of inoculum. The horizontal and vertical dotted lines indicate cutoffs for log2-fold change ≥1 and FDR adjusted P-values ≤ 0.05, respectively. (B) Venn diagram depicting the number of shared and unique DETs between the three time-larval stages; numbers in parentheses indicate number of DETs in each time-larval stage five disagreed in direction between 3-L1 and 48-L2, and these were annotated to be involved in lipid metabolism and structural constituent of cuticle. The single DET shared between 3-L1 and 24-L2 was putatively predicted to be a myrosinase, down-regulated in both stages.

Gene ontologies of DETs mirror developmental stagedependent response to virus
Of the 147 DETs in larval guts, 58% were significantly assigned GO terms. These GO-annotated sequences were classified by biological process (BP, 64 sequences), molecular function (MF, 81 sequences), and cellular localization (CC, 17 sequences) (Fig. 2, Table S2). Reflecting the little overlap of DETs between the developmental stages, enriched GO terms associated with these DETs also varied across stages. Distributed across the three types of GO classifications, the most commonly occurring annotations for guts of the 3-L1 stage were 'oxidation reduction process', 'carbohydrate metabolic process', 'hydrolase activity', and 'integral component of membrane', while the predominant GOs in the 48-L2 stage indicated proteins associated with 'lipid metabolic process', 'structural constituent of cuticle', and 'extracellular region' and 'integral component of membrane'. Proteins with GO annotations of 'transport' and 'heme-binding', and 'iron-binding' were identified only for 3-L1 guts. The DETs associated with roles in 'transport' included transmembrane proteins that transport sugars (Tret1-2) and solutes (MFS-type  Table S2 for a list of all significant (P adj < 0.05) differentially-expressed transcripts and annotations transporter) or cations (organic cation transporter), and proteins that transport lipids in the hemolymph to various tissues (apolipophorins). The 'heme-binding' and similarly the 'iron-binding' functional group was comprised of cytochrome p450s (CYP gene superfamily) belonging to clans 3 (typically CYP6) and 4 (typically CYP4), including two sequences (FOCC013756 and FOCC015103) belonging to two newly discovered and curated CYP gene families (CYP3653A2 and CYP3654A1, respectively) from the F. occidentalis genome [35].

Validation of the RNA-Seq analysis
Using dissected 3-L1 guts from a similarly conducted and replicated TSWV acquisition assay, it was determined that 80% of a subset of 10 selected DETs were validated with ddPCR in direction and, for the most part, the magnitude of change in expression as compared to RNA-Seq (Fig. 3). A correlation analysis of the log 2 foldchange values yielded a significant Pearson correlation coefficient of 0.72 (P = 0.0185), indication of significant validation of the gut RNA-Seq.  Table S3). With only 7 DETs associated with 24-L2, most of them were associated with the 'blue' module. Examination of gene descriptions and GO annotations revealed that the 'turquoise' module represents proteins involved in breakdown of harmful exogenous compounds (e.g. toxic plant substances and insecticides), thermal or antiviral response, proteolysis, and hydrolysis of plant-derived secondary metabolites, collectively considered to describe detoxification and host defense activities. The 'blue' module contains proteins known in other insects to be involved in protein, lipid and sugar transport and storage in preparation of quiescent periods or response to environmental stress, and proteolysis and deglycosylation of proteins in the lysosomecollectively Droplet-digital PCR (ddPCR) validation of the RNA-seq analysis using selected differentially-expressed transcripts (FOCC prefix) of Frankliniella occidentalis first instar larval (L1) guts in response to tomato spotted wilt virus infection. Each transcript was normalized to F. occidentalis actin RNA and calibrated to non-infected L1 guts; for ddPCR: each bar represents the mean (+ standard error of the mean) of three biological replicate samples, with each sample consisting of a pool of 100 L1 guts sampled after a 24-h acquisition access period followed by a 3 h-clearing on green bean pods. RNA-Seq: each bar represents the mean of four biological replicate samples (100 L1 guts per sample); due to the method of determining fold changes in expression from RNA-Seq data (fitting data using the generalized linearized model (GLM) method), standard deviation or standard error of means are not calculated considered to describe metabolic energy economics, i.e., reserves and recycling. The 'lightcyan' module consists of proteins associated with cuticles (many cuticle proteins and endocuticle glycoproteins) and membranes, lipid metabolism, several transcription factors, and neuropeptides, collectively considered to be consistent with insect growth and development.
Three DET hubs were identified in each of the 'turquoise' and 'blue' modules and four were identified for the 'lightcyan' module ( Fig. 5, Table 2). Module memberships, i.e., the correlation of a node to the module eigengene, of these 10 hubs ranged from 0.685-0.985 depending on the module ( Table 2). The DET hubs with the highest connectivity within their modules were CYP3653A2 (FOCC013756) in the 'turquoise' module, a hypothetical protein (FOCC009188) in the 'blue' module, and an endocuticle structural protein -specifically endoCP-G N (FOCC014803), a TSWV-interacting proteins identified in F. occidentalis L1s [33] -in the 'lightcyan' module ( Table 2, Table S4). The 'turquoise' and 'blue' DET hubs exhibited subtle change due to virus infection (~2-to 3-fold change), while the four 'lightcyan' hubs exhibited a more pronounced effect in response to virus regardless of developmental stage (~7to 14-fold change) ( Table 2).
Notably, there were a few tightly connected DETs assigned to other modules (Table S3). For instance, the FOCC006197 and FOCC000732 assigned to the 'cyan' and 'greenyellow' modules, respectively, showed the highest module membership in each module, and the second highest connectivity among the 144 transcripts in 'cyan' module and 191 transcripts in 'greenyellow' module (Table S3, 'cyan_d' and 'greenyellow_d' module tabs). Moreover, after a weight cutoff of 0.15, both transcripts established the highest number of connections (89 and 119 for FOCC006197 and FOCC000732, Fig. 4 Weighted gene co-expression network analysis (WGCNA) of gut-expressed transcripts in Frankliniella occidentalis to identify modules of coexpressed transcripts associated with larval development and response to tomato spotted wilt virus infection. (A) Module-trait relationships depicting significant associations (correlations, P < 0.05) between clusters (modules) of co-expressed transcripts (left-hand column of colored blocks, i.e., module eigengenes) and time-larval stages (trait). Each trait includes normalized read counts of all gut-expressed transcripts from four biological replications and two conditions (both virus-exposed and non-exposed) of the gut RNA-seq analysis, i.e., eight samples per time-stage. Asterisks indicate eight modules that are significantly associated with time-larval stages. (B) Hierarchical clustering of module eigengenes. The dendrogram depicts relationships among modules (eigengenes) of co-expressed transcripts; positively correlated eigengenes cluster within the same clade. Asterisks indicate modules that include sets of differentially-expressed transcripts (DETs) in response to tomato spotted wilt virus infection. (C) Counts of larval gut DETs across 14 modules show enrichment in the 'turquoise', 'lightcyan' and 'blue' modules. 3-L1 = cohort of first instars three hours after removal of virus inoculum after a 24-h acquisition access period; 24-L2 = same cohort of insects developed into early second instars, 24 h after removal of inoculum; and 48-L2 = same cohort of insects entering into late second instars, 48 h after removal of inoculum respectively) with other module members, which altogether indicates that they likely play an important role in influencing the expression of co-expressed genes and the subsequent biological processes.

Significant interactions revealed between time and virus treatment on larval gut expression for selected network hub and connecting DETs
Gut expression patterns were documented for developing larvae during and after removal of the virus inoculum for three of the four network DETs examined (Fig. 6A, C, D). Coinciding with the shift in larval stage (molting) from L1 to L2, a significant shift in expression pattern of noninfected guts was observed between 30 h and 42 h for FOCC013675 (blue hub: lysosomal alpha-mannosidase, P = 0.0004), FOCC013756 (turquoise hub: cytochrome P450, CYP3653A2, P < 0.0001), and FOCC001422 (blue network DET: putative beta-glucosidase 6, P = 0.0003). The shifts in expression patterns of virus-infected guts between 30 h and 42 h were not significant for any of these DETs. Additionally, there was a significant treatment*time interaction for FOCC013675 (P = 0.0005), FOCC013756 (P < 0.0001), and FOCC001422 (P = 0.0004), indicating that the effect of virus treatment depended on time of sampling, and incidentally, larval age and stage. Indeed, the effect of virus on gene expression reversed in direction between 30 and 48 h after L1s first encountered virusinfected tissue, which was equivalent to 6 (early L2) and 24 h (mid L2) after removal of the inoculum. Consistent between the three temporally-dynamic DETs, the virus effect first occurred between 6 and 18 h during exposure to virus-infected tissue, and the direction of change in Transcripts that were in the top 10% of all module transcripts with regards to both module membership (kME, correlation of a node to a module eigengene) and intramodular connectivity (kIN, sum of connection weights between a node and all its network neighbors) were considered candidate hub transcripts (red shapes). Grey and pink lines indicate the correlations between transcripts and interconnections between transcripts and hubs, respectively. For better visualization of the interactions, a weight cutoff of ≥0.15 was applied for 'turquoise' and 'blue' networks, and a weight cutoff of ≥0.35 was applied for the 'lightcyan' network. Line thickness indicates strength of the connection. 3-L1 = cohort of first instars three hours after removal of virus inoculum after a 24-h acquisition access period; 24-L2 = same cohort of insects developed into early second instars, 24 h after removal of inoculum; and 48-L2 = same cohort of insects entering into late second instars, 48 h after removal of inoculum expression confirmed findings obtained for the RNA-Seq time-stages. The virus effect was documented during feeding on infected tissue and after removal of the inoculum (P < 0.05; Fig. 6A, C, D), however, with some variation depending on the DET. For example, transcript abundance of FOCC013756 (CYP3653A2) was significantly modified by virus at 5 of the 6 sampling time points, while lysosomal alpha-mannosidase and the beta-glucosidase transcripts exhibited response to virus at 3 time points at varying times. For the fourth DET monitored -lysozyme (FOCC004976), there was no significant change in expression over time (P = 0.5186) nor a treatment*time interaction (P = 0.6756), and the amount of variation resulted in no apparent virus effect. However, it was more stably expressed in virus-infected guts compared to non-infected control, indicated by their relatively less variation in expression (Fig. 6B).
Virus accumulation, approximated by either normalized abundance of TSWV N (Fig. 6E) or TSWV NSs (Fig. 6F), was also documented in larval guts during exposure and after removal of inoculum. TSWV RNA significantly increased 3.3-fold in average over the course of feeding on infected tissue (L1-6h and L1-24h comparison, P < 0.0001 for N and P = 0.0014 for NSs) and 2.2-fold after removal of the inoculum (L1-24h and L2-24h comparison, P = 0.0093 for N and P = 0.0035 for NSs). A correlation analysis of TSWV RNA abundance and DET expression showed a significant correlation between virus abundance and CYP3653A2 gene expression with a correlation coefficient of 0.5968 between N and CYP3653A2 (P = 0.0089), and 0.6311 between NSs and CYP3653A2 (P = 0.005). There were no significant correlations between virus abundance and the other DETs.

Discussion
The insect alimentary canal is a structurally complex organ that plays a vital role in food digestion and absorption, and in promoting or suppressing microbial and viral associations [37][38][39]. Previous -omics studies on thrips vector species in response to orthotospovirus infection have primarily focused on the whole-body level [13][14][15]31], with a common finding that transcriptomelevel responses vary greatly over the course of thrips development from larvae to adulthood. In this study, we similarly found a tight link between developmental stages (L1 and L2) and gut transcriptome response to TSWV, which may be due to gut activities and physiological changes that occur between the two larval stages (L1 and L2), reflected by the number, diversity and putative functional roles of the differentially-expressed transcripts between the larval stages.
Consistent with the previous whole-body transcriptome studies on F. occidentalis and F. fusca [14,15], the responses of F. occidentalis larval guts to TSWV infection were subtle (less than 1%) in terms of the number of significantly altered gene expressions. However, the magnitude of responses (fold change in expression) in larval guts was significantly increased by 7.5-fold on average compared to whole-body F. occidentalis larvae (P < 0.0001, Mann-Whitney test). This suggests the presence of a dilution effect when pooling all thrips tissues where viruses are only limited to certain tissues/cells, and thus emphasizing the potential linkage between viral tissue tropism and thrips response to virus infection.
Insects have evolved diverse defensive mechanisms to cope with the accumulation of harmful substances in their tissues [40]. Comparison of the provisional functions of all DETs across the larval stages revealed a distinct difference between L1 and L2 guts, where three GO annotation groups were uniquely identified in L1 gut but not in L2. For example, the organic cation transporter assigned to the GO annotations of 'transport' plays an important physiological role in secretion of organic cations derived from endogenous and exogenous origin, including toxic environmental compounds, Fig. 6 Temporal gut-expression of selected hub and connecting transcripts in the 'blue' and 'turquoise' co-expression gene networks for non-exposed (NV) and tomato spotted wilt virus (TSWV) -infected (V) larvae of Frankliniella occidentalis. Droplet-digital PCR was performed to quantify normalized transcript abundance of A) FOCC013675 (lysosomal alpha-mannosidase, 'blue' hub), B) FOCC004976 (lysozyme, 'blue' hub), C) FOCC013756 (cytochrome p450: CYP3653A2, 'turquoise' hub), and D) FOCC001422 (putative beta-glucosidase, 'blue' connecting gene) during (DVE) and post virus exposure (PVE) at six time points during larval growth and development. Asterisks indicate statistically significant differences in gene expression level for pairwise comparisons between V and NV guts (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001). The corresponding abundance of E) TSWV nucleocapsid gene (N RNA) and F) TSWV nonstructural small RNA gene (NSs RNA) in larval guts was also temporally monitored using real-time quantitative reverse-transcription PCR (qRT-PCR). Normalized abundance of each thrips and viral RNA was normalized to Frankliniella occidentalis actin RNA. Bars headed by different letters indicate statistically significant differences (P < 0.05) between time-points. No virus was detected in NV guts. Developmental stage (L1, L2) and time-stage corresponding to the RNA-Seq analysis of this study are delineated at the bottom of the figure; double dashes in the developmental stage timeline indicate the transition period (molting) from L1 to L2, thus the 30-h sample included both stages. Each expression data point/bar represents the mean and standard error of three biological replicates (independent experiments), and each replicate represents a pool of 60 larval guts pharmacological drugs, and plant defense chemicals [41]. Mutation of an organic cation transporter gene in Drosophila melanogaster was also shown to cause apoptotic cell death [42]. Cytochrome P450s (P450s) have long been considered to serve multi-functional roles in the oxidative metabolism of a wide variety of plant secondary metabolites and pesticides, as well as in biosynthetic pathways of juvenile hormone and ecdysteroids, which are associated with insect growth, development, and reproduction [43][44][45]. We found that all P450coding DETs assigned to the GO annotations of 'hemebinding' and 'iron-binding' were identified in 3-L1 guts but not in L2 guts. Among them, two were newly discovered to be thrips-specific [35], one of which was further predicted to be a 'turquoise' hub transcript in our network analysis. Another set of DETs that are involved in detoxification is glutathione S-transferase (GST), which is known to detoxify xenobiotics by catalyzing the conjugation of glutathione with electrophilic substrates [46]. These GSTs were also exclusively identified in 3-L1 guts. Notably, the perturbed transcripts (down-regulation) coding for these detoxification proteins are only detected in L1 guts shortly after exposure to virus inoculum but not after 24 h and 48 h post exposure, which may indicate the indirect plant-mediated effect of virus on the larval gut, or alternatively, the potential involvement of detoxification proteins in advancing virus infection of the gut or defending against the infection. Further functional studies are required to elucidate their exact roles in thrips and in virus infection of vectors, in particular the thrips-specific, P450 hub gene. Pathogen invasion into insects stimulates a complex immune response [47]. Understanding the defense response of insect vectors to virus infection is important to uncover the molecular mechanism by which virus overcomes the host immune system. Our study found several immunerelated DETs that are up-regulated in virus-exposed guts, including apolipophorins, lysozyme, and trypsin, but only during the early stage (3-L1) of the infection. In addition to their primary role in lipid transport, apolipophorins also function in pattern recognition and innate immune response by stimulating anti-bacterial/fungal activities, phagocytosis, superoxide production, and detoxification of endotoxins of the insect pathogenic bacteria [48][49][50][51]. Lysozyme possesses ubiquitous antibacterial enzyme activity [52,53]. Trypsin is known for its proteolytic activity in food digestion and degradation of pathogens as a defensive response [54]. On the contrary, it was also shown to promote virus infection in the insect midgut. For instance, the affinity of La Crosse virus (Bunyavirales, the same order that includes TSWV) to Ochlerotatus triseriatus midgut cells was shown to be increased after in vitro proteolytic processing of the virion surface [55,56]. Similarly, midgut trypsin activity was demonstrated to be required for achieving optimal infection of dengue virus-2 (family Flaviviridae) in Aedes aegypti [57]. Furthermore, as the virus accumulated in the larval gut of the present study, we observed that some immune-related DETs were downregulated in the L2 stage. In the early L2 stage (24-L2), an inhibitor of apoptosis (the death-associated inhibitor of apoptosis) was identified to be down-regulated by almost 7-fold in virus-infected guts. Apoptosis is a highly regulated process, and viral infection could induce the programmed cell death response in host cells as a means of eliminating viral pathogens. On the other hand, some viruses may utilize such apoptotic cell death mechanisms to kill cells for their dissemination [58,59]. In both cases, viruses must block or delay cell death until neighboring cells are invaded. At a time when TSWV abundance has not peaked (as is the case in early L2s), inhibition of apoptosis in thrips gut cells may benefit virus replication to accumulate sufficient progeny for cell-to-cell spread. This may also support the notion that TSWV is non-pathogenic to F. occidentalis. Lastly, in the 48-L2 stage, some DETs involved in anti-microbial/fungal/viral response were downregulated in virus-exposed guts, such as chitinase, diptericin, and troponin. Another study showed that silencing of the diptericin gene regulated by one of the canonical signaling pathways (immune deficiency or IMD) led to the significant increase in Sindbis virus replication in Drosophila melanogaster [60]. We speculate that the downregulation of diptericin in F. occidentalis larval guts may contribute to the accumulation of TSWV.
Weighted gene correlation network analysis (WGCN A) has been widely used to facilitate the understanding of various biological processes in different organisms and to identify highly connected hub genes as candidate biomarkers or therapeutic targets of diseases [36]. Our study findings provided several candidate genes for studying vector competence of F. occidentalis. Mirroring our gene ontology study, a great number of DETs encoding detoxification proteins were clustered into the 'turquoise' module, and all DETs associated with lipid transport and storage were clustered into the 'blue' module. Similarly, DETs encoding cuticle/endocuticle proteins and others involved in lipid metabolism were clustered into the 'lightcyan' module. To some extent, these findings confirm the validity of our network-based systems biology analysis. As the hub genes share the most connections with other co-expressed genes in a module, understanding their functional roles will likely lead to important biological insights. Although it is uncertain how the hub transcripts crosstalk with other clustered DETs, the tight co-regulation and functional relatedness point to their significance in TSWV infection events. The 10 hub transcripts identified through our network analysis prioritize future evaluation to decipher their functions in thrips and in virus transmission using functional analysis tools, such as RNA interference and interactive proteomics. Particularly, CYP3653A2 and FOCC009188 encoding a novel P450 and a hypothetical protein are the two most connected hub transcripts in the 'turquoise' and 'blue' module networks, respectively. Both proteins are uniquely identified in F. occidentalis, which make them interesting and important targets to investigate in the future in the context of TSWV transmission.
Our network analysis identified a gut DET coding for endocuticle structural glycoprotein (endoCP-GP, FOCC014803) as the most connected hub in the 'lightcyan' network. A pairwise alignment of coding sequences (CDS) revealed that FOCC014803 shared 99.9% nucleotide sequence identity (100% coverage) with EndoCP-G N (GenBank accession no. MH884757), an endocuticle structural protein reported to be expressed in F. occidentalis larval guts, including the midgut, and interacted directly with TSWV glycoprotein (G N ), the viral attachment protein [33]. It is curious that an endoCP-GP containing a chitin-binding domain was expressed in the thrips midgut, the site of TSWV entry, since it is understood that the thysanopteran midgut secretes an extra-cellular lipoprotein membrane devoid of chitin, i.e., perimicrovillar membrane (PMM) [61]. The role of TSWVbinding and/or infection-responsive endoCP-GPs expressed in the midgut have yet to be determined. While it is still unknown whether EndoCP-G N serves as a receptor for TSWV, its perturbation in infected L1 guts coupled with high connectivity to other DETs (i.e., hub gene) underscores a critical role during the virus lifecycle.
Both TSWV accumulation and expression of TSWVresponsive gut genes (DETs) varied over larval development. TSWV accumulation in larval guts supported findings reported for larval whole bodies [26], where the abundance of N RNA increased significantly over larval growth and development. In the present study, accumulation of NSs RNA provided confirmatory evidence of increased virus abundance over larval development. While there was no consistent association between virus abundance and DET expression in guts, virus infection did temporally modulate DET expression patterns as compared to non-infected guts. This was most evident at or around molting (transition from L1 to L2). While only a few DETs were monitored in the temporal study, consistently the largest differential between TSWVinfected and non-infected guts occurred i) during exposure to the virus inoculum, and ii) when guts supported the highest virus titers, i.e., during the L2 stage, when the insect was far removed from virus-infected plants. Our temporal study was not intended to completely uncouple gut responses due to direct effects of virus infection from plant-mediated, indirect effects of feeding on infected tissue, however it did reveal patterns of expression that may be experienced by emergent L1s feeding on infected plants in natural settings. Findings from our temporal study warrant further investigation to uncouple direct from indirect effects of virus on larval gut responses during thrips development.

Conclusions
Our study centers on the first tissue barrier to internalization of orthotospoviruses into the thrips vector body the larval gutand documents transcriptome-level gut response to viral invasion and accumulation during larval growth and development. The key determinant of vector competence for thrips vectors is acquisition of virus during the larval stage, a requisite that uniquely distinguishes thrips vectors from other insect vectors (e.g., aphids, their closest relatives). Our study represents the first description and analysis of the transcriptomewide response of thrips vector gut tissue to an orthotospovirus, and provides promising, highly connected network hub genes for future biochemical pathway-probing of putative biomarkers of vector competence. In addition, we optimized and implemented ddPCR for quantification of thrips gene expression in order to circumvent the challenges of obtaining sufficient amounts of RNA from minute tissue systems dissected from minute insects, and consequently increasing sensitivity over that of qPCR technology. Our findings and F. occidentalis larval gut sequences will enable functional genomics studies for better understanding thrips transmission biology and for developing novel ways to disrupt virus transmission.

Thrips colony and virus source of inoculum
The insect colony of Frankliniella occidentalis originating from the island of Oahu, HI was maintained on green bean pods (Phaseolus vulgaris) in the laboratory following conditions as previously described [62]. TSWV (isolate TSWV-MT2) was maintained on Emilia sonchifolia by thrips-transmission and mechanical inoculation, alternately, as described previously [33]. Consecutive mechanical passage of virus was not allowed to ensure the vital transmissibility of virus by thrips. The symptomatic E. sonchifolia leaves collected from 12 days postmechanical inoculation were used for insect acquisition of TSWV. This study complies with relevant institutional, national, and international guidelines and legislation for research involving plants.

TSWV acquisition by larval thrips
An experiment comprised of two treatments (+/− TSWV exposure), three sampling time points during larval development, and four biological replications of each treatment-time combination (i.e., four independent trials of the experiment) formed the basis of the RNA-Seq analysis of larval gut gene expression in response to TSWV. To obtain cohorts of L1s (0 to 16-h-old) for virus acquisition, females were allowed to oviposit on healthy green bean pods (robust quality, no apparent blemishes or evidence of decay) contained in colony cups (deli cups, bottom lined with filter paper and lid fitted with thrips proof-mesh) for 24 h and then they were removed. Impregnated bean pods were incubated at room temperature for 3 days and emergent larvae (of various ages) were thoroughly brushed off of the bean pods and removed from the cups. The beans were incubated for an additional 16 h to collect the agesynchronized L1s (0 to 16-h-old) by gentle tapping and brushing beans to allow the young larvae to gently drop to the bottom of the container. A subset (approximately half) of these young L1s were exposed to symptomatic, TSWV-infected E. sonchifolia leaves, and the remaining half were exposed to TSWV-free, healthy E. sonchifolia leaves; the exposure time for both treatments was 24 h (Fig. S1.1). These virus-infected (V) and non-infected (NV) larvae were transferred to two separate containers containing green bean pods and were allowed to feed and develop for 3, 24, and 48 h, respectively. Virus acquisition and larval rearing were performed at 25-26°C with a 16-h photoperiod.
Virus acquisition efficiency was assessed for each biological replicate by real-time quantitative reversetranscription PCR (qRT-PCR) to determine the virus infection status of 10-12 individual L2 thrips (24-h time point). RNA was extracted from individual larvae using Chelex 100 (Bio-Rad Laboratories, Hercules, CA) as described by Boonham et al. 2002 [63], and cDNA was synthesized from 11 ul of RNA template using the Verso cDNA Synthesis kit (Thermo Scientific, Wilmington, DE) following the manufacturer's protocol, and diluted 4-fold with nuclease-free water. Virus infection was determined by the presence of TSWV-N RNA in each sample using the real-time quantitative Polymerase Chain Reaction (qPCR). Two technical replicates of 20 ul reaction mixture were included in qPCR for each sample, which contained 200 nM of TSWV-N or thrips internal reference Actin gene primers in the SYBR Green Supermix (Bio-Rad, Hercules, CA, USA). Samples with no signal and/or high (> 35) Cq values were considered as TSWV-negative insects. The primer pairs used in this test are listed in Table S5. The reactions were performed on the CFX Connect Real-Time PCR Detection System (Bio-Rad, Hercules, CA) with the two-step amplification protocol followed by a melting cycle. The thermal cycling protocol began with 1 cycle of 95°C for 30 s (s), then 40 cycles of 95°C for 10 s and 55°C for 30 s. The melting cycle was processed after the amplification protocol by raising the temperature from 55°C to 95°C by 0.5°C increase per second. No less than a 90% infection rate was found for TSWV-exposed larvae from each biological replicate.

Thrips gut dissection and RNA sample preparation
A sub-sample of larvae from each sampling time point was collected into a Petri-dish and placed on ice for anaesthetization by chilling. In succession, single larvae were transferred to a fresh amount of 60% ethanol (~80 ul) on a microscope slide and decapitated by using a sterile, polymer-coated, #15 scalpel blade (Southmedic, Barrie, ON, Canada) to cut between the pro-and mesothorax segments. The principal salivary glands, tubular salivary glands, fat body, and other residual parts/tissues were carefully removed from the gut tissue using an ultrafine single deer hair tool (Ted Pella, Redding, CA). A second amount of fresh 60% ethanol (~100 ul) was added to the slide, and with the deer hair, the single gut tissue was pulled through the surface of the ethanol, which aided in washing residual tissues and other body contents off of the gut surface. The single gut tissue (foregut, midgut and partial hindgut) was immediately immersed and lysed in 200-ul PicoPure extraction buffer (PicoPure DNA isolation kit, Arcturus, Mountain View, CA) in a sterile, RNase-free microfuge tube. This iterative process proceeded to achieve 100 guts pooled in the tube of extraction buffer per treatment, and then the samples were processed for total RNA extraction using the PicoPure RNA isolation kit (Arcturus) as described by the manufacturer. On-column DNase treatment was performed with a RNase-free DNase set (Qiagen, Valenica, CA). The 200-ul PicoPure extraction buffer containing gut RNA was applied onto two PicoPure spin columns and the final eluted RNA samples were combined for maximizing the RNA yield. To minimize any possible batch effects during the experiments, gut dissections were performed under the same laboratory condition during the same time of the day and by alternating between 20 virus-exposed larvae and 20 non-exposed larvae until achieving 100 guts for both samples (using separate tools for each treatment). The quantity and quality of RNA samples were evaluated with the Agilent 2100 Bioanalyzer system. The average RNA yields ranged from 1655 to 2412 ng with an average RNA Integrity Number (RIN) score of 8.33 (ranged from 7.70 to 8.80), indicating no sign of RNA degradation. RNA samples were stored at − 80°C until use. This time-course experiment was repeated four times.

Library preparation and RNA sequencing
All 24 gut RNA samples were sent to the Genomic Sciences Laboratory facility at North Carolina State University for cDNA library preparation and RNA sequencing. The directional (strand-specific) RNA-Seq libraries were constructed with poly(A) mRNA enrichment and rRNA depletion processes using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs, Ipswich, MA). The indexed libraries were multiplexsequenced in duplicate using two flow cells, each containing all 24 libraries to increase read coverage. The 150-bp paired-end sequence reads were generated by the Illumina NextSeq 500 platform. A total of 89.2 and 94.7 gigabytes of sequencing data were generated from each flow cell, after which the duplicate raw read files for each library generated from these two flow cells were concatenated using Windows Command Prompt. The raw sequence data has been deposited in the NCBI SRA as BioProject PRJNA748697.

Read mapping and differential gene expression analysis
All reads were trimmed for adapters in CLC Genomics Workbench version 12.0. Quality control of the raw sequence reads was further performed by discarding reads with a Phred score less than 20 and length shorter than 100 bp. After trimming, the average Phred score for all paired-end reads of 24 libraries was 30, ranging from 21 to 36. The trimmed, paired-end reads were mapped using the official gene set version 1.0 (OGSv1.0) [64,65] of the Frankliniella occidentalis genome [35]. The perand cross-sample library size normalization were performed on the mapped reads by RNA-Seq Analysis tools in CLC Genomic Workbench. Differential gene expression analysis was then completed using the normalized read count data in value of transcripts per million (TPM) for each treatment (V/NV) of corresponding sampling time point-stage (3-L1, 24-L2, and 48-L2) from all four biological replicates. Number of differentiallyexpressed transcripts and their fold change values were obtained after a cutoff of 2.0 for absolute fold change and 0.05 for false discovery rate (FDR) corrected pvalues.

Gene ontology analysis of differentially-expressed transcripts (DETs)
Nucleotide sequences of all DETs identified across the three sampling time points (147 non-redundant) were subjected to Blastx using the NCBI non-redundant protein database and Gene Ontology (GO) functional annotation (E < 1.0E-6) using the CloudBlast resource in Blast2GO v5.2.5 [66]. The DETs with GO terms assigned were functionally categorized based on their molecular function, biological process, and subcellular localization [67].

Validation of RNA-Seq using Droplet Digital PCR
Because of the difficulty in obtaining ample amounts of gut RNA (and sample volumes) for analysis of multiple sequence targets, and the limited sensitivity of conventional qPCR to detect templates of low abundance, Droplet Digital PCR (ddPCR) (Bio-Rad, Hercules, CA) was utilized to validate a subset the 3-L1 gut transcripts that responded to TSWV. Preliminary trials with titrated (diluted) L1 gut RNA samples determined that transcripts with RNA-Seq normalized read counts (TPM, transcripts per million) less than 150 TPM were below the level of detection or not reliably quantifiable by ddPCR. As such, we selected transcripts with TPM values of > 150 for both V and NV treatments for the ddPCR analysis. Ten transcripts satisfying this criterion were arbitrarily selected from the suite of 3-L1 DETs. The analysis was performed on gut samples obtained from three biological replications of another, independent TSWV acquisition assay, instead of using remaining RNA samples from the RNA-Seq experiment. Primers for target DETs were designed using Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/ primer3plus.cgi). Primer sequences are listed in Table  S5. The cDNA sample was synthesized from 500 ng of 100-pooled gut RNA and diluted 16-fold. Eight microliters of the diluted cDNA were added into 12 ul Eva-Green reaction mixture containing 200 mM of sequence-specific primer pairs following the manufacturer's protocol. The internal reference gene (actin) and a water control were included in the ddPCR analysis for normalizing the estimated copy number of target transcripts. Two technical replicates of 20 ul of EvaGreen reactions for target transcripts, actin reference, and water control were included in each ddPCR run. The abundance of the target transcript (copies/ul) was estimated by QuantaSoft software v1.7 (Bio-Rad, Hercules, CA). The log2 fold change of each target transcript in virusexposed guts against non-exposed guts was calculated after normalization to the reference gene (target gut transcript/reference) [68].

Weighted gene co-expression network analysis
The pairwise relationships among all thrips transcripts (a total of 16,849 transcripts) across RNA-Seq samples were analyzed using the weighted gene co-expression network analysis (WGCNA) [36]. The TPM-normalized read count data for 24 RNA-Seq samples was used for network construction to detect modules (clusters) of highly correlated transcripts, followed by eigengene calculation for each module and relating modules to external sample traits (time-larval stages) using an eigengene network. The weighted network analysis was carried out using the WGCNA R software package, following the tutorials with "Step-by-step network construction and module detection" methodology (last updated on February 13, 2016) developed by Langfelder and Horvath ( h t t p s : / / h o r v a t h . g e n e t i c s . u c l a . e d u / h t m l / CoexpressionNetwork/Rpackages/WGCNA/Tutorials/) [36]. Briefly, the input count data was cleaned for excessive missing values, after which all 24 samples were preserved for downstream analyses. The expression samples were matched to their corresponding external traits for time-larval stages (3-L1, 24-L2, and 48-L2). A softthresholding power of 12 was selected to reach values of 0.97 for the scale-free topology fit index. The adjacencies between genes were calculated using the "Signed" network after raising the co-expression similarity values by the power of 12. The hierarchical clustering tree of genes was generated with the topological overlap matrix (TOM)-based dissimilarity. Modules of highly coexpressed genes were identified using the Dynamic Tree Cut method with minimum module size set as 30 and their corresponding eigengenes were calculated. The highly correlated modules were merged into one if the pairwise correlation of their eigengenes was larger than 0.75. The modules that were significantly associated with the external traits (time-larval stages) were identified and quantified by analyzing their corresponding correlations and p-values.
Considering the significance of identified module-trait correlations and the number of DETs assigned to each module, the 'turquoise', 'blue', and 'lightcyan' modules were selected for further analysis (Table S3). The DETs identified across the three time-larval stages were exported from each selected module and used for constructing the intramodular interaction networks. The transcripts that fell into the top 10% of both module membership (kME, correlation of a node to a module eigengene) and intramodular connectivity (kIN, equals to the sum of connection weights between a node and all its network neighbors) were considered as the candidate hub transcripts. The intramodular interaction networks of DETs within 'turquoise', 'blue', and 'lightcyan' modules were visualized using VisANT software platform v5.53 [69]. For better visualization of the nodes and connecting lines between nodes (i.e. to reduce complexity and crowding of the module network connections), a weight cutoff of ≥0.15 was applied for 'turquoise' and 'blue' networks, and a weight cutoff of ≥0.35 was applied for 'lightcyan' network. Increasing the weight cutoff reduces the complexity in the visual depiction of the network neighborhood by retaining the strongest member-member (node-node) connections.
Gene expression analysis of hub and select connecting DETs in larval guts during and after virus exposure A time-course experiment was performed to analyze the temporal expression of selected hub transcripts and their connecting transcripts (see Fig. S1.2 for guided illustration). The virus acquisition experiment was set up similarly as the RNA-Seq experiment described above, where 24 h AAP was given to the synchronized L1s followed by a 24 h feeding period on healthy green bean pods. However, a subsample of larvae (~30-40) from each treatment were sequentially collected after 6 h, 18 h, 24 h, 30 h, 42 h, and 48 h since the first exposure to virusinfected leaves. Larvae from the first three time points were collected during virus exposure (DVE, 24 h AAP) and the rest were collected post virus exposure (PVE, feeding on healthy green beans). Twenty dissected gut tissues were pooled per sample and extracted for RNA using PicoPure RNA isolation kit (Arcturus) following the manufacturer's protocol. 130 ng of RNA were reverse transcribed into cDNA using a Verso cDNA kit and diluted in 4-fold with nuclease-free water. The expression levels of target gut transcripts were detected in ddPCR following the procedure described above. The estimated abundance of target transcript was normalized to the internal reference F. occidentalis Actin gene. Additionally, the abundance of TSWV-N and NSs RNA were also estimated at each sampling time point using qPCR as described above and normalized to the internal reference F. occidentalis Actin gene using the inverse equation in Pfaffl [70] as described in Rotenberg 2009 [71]. Primer sequences are listed in Table S5. The reliability of Actin as an invariant reference gene for normalized expression analysis in gut tissues between treatments within each sampling time point was confirmed with no more than 2.8% of the coefficient of variation in Actin Cq values between virus-exposed and non-exposed gut samples. The time-course experiment was independently repeated three times.
Statistical analysis of temporal expression data was performed by fitting to a generalized linear mixed model using the PROC GLIMMIX procedure of SAS 9.4 (SAS Institute, Cary, NC) with virus condition, time, and their interaction as the three fixed effects. The Gaussian distribution (link function = identity) and restricted maximum likelihood estimation method were specified. The statistically significant effects on gene expression levels were determined by the Type III tests of fixed effects and least square means using the LSMEANS 'Slice' and 'Slicediff' options with a Tukey adjustment option for multiple comparisons (P < 0.05). One-way ANOVA following Tukey's multiple comparisons (JMP Pro 15.2.0) was used to test the statistical significance in virus abundance between treatments. Pearson's correlation was performed between log2 transformed TSWV-N or -NSs normalized abundance and log2 transformed expression of target transcripts in JMP Pro 15.2.0.