The genome of T. vaporariorum
Sequencing of a T. vaporariorum colony established from a single female using the 10X Genomics Chromium linked-read system generated 239 Gbp of sequencing data (Additional file 2: Table S1). k-mer analysis revealed a coverage peak at around 95X, and estimated a heterozygosity rate of 0.49% and a genome size of 591 Mbp (Additional file 3: Table S2 and Additional file 4: Figure S2A). The latter closely matches the genome size (615 Mb) of the other sequenced whitefly species, B. tabaci [16]. Supernova effectively used 300 million raw short-reads with a minimum read length of 139.50 bp and molecule length of 33.75 kb (Additional file 5: Table S3) to generate a genome assembly of 581.92 Mb. The final assembly comprised 6016 scaffolds > 10 kb, with a contig N50 of 21.67 kb and scaffold N50 of 921.58 kb. The completeness of the gene space in the assembled genome was assessed using the Benchmarking Universal Single-Copy Orthologues (BUSCO) and Core Eukaryotic genes mapping approach (CEGMA) pipelines. BUSCO analysis identified 90.8, 92 and 93.5% of the Eukaryota, Insecta and Arthropoda test gene sets, respectively, as complete in the assembly (Additional file 4: Figure S2B). Furthermore, 94% of CEGMA core Eukaryotic genes (including both complete and partial genes) were present in the assembled genome (Additional file 6: Table S4). Structural genome annotation using a workflow incorporating RNAseq data predicted a total of 22,735 protein-coding genes (Additional file 7: Table S5). Of these, 19,138 (79%) were successfully assigned functional annotation based on BLAST searches against the non-redundant protein database of NCBI and the InterPro database (Additional file 4: Figure S2C).
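The contig and scaffold N50 values quoted above summarise assembly contiguity. As a minimal sketch (not the pipeline used in this study; the function name and the toy lengths are illustrative assumptions), N50 can be computed as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths in kb (hypothetical, not taken from the assembly).
toy_lengths = [900, 700, 400, 300, 200]
print(n50(toy_lengths))
```

The same function applies to contigs or scaffolds; only the input list of lengths differs.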
The proteome of T. vaporariorum was compared with B. tabaci -v1.2, A. glabripennis -v2.0, T. castaneum -v5.2, M. persicae G006 -v1.0, A. pisum -v2.0 and D. melanogaster -v6.0 by orthology inference to obtain 15,881 gene clusters. Among these, 5345 gene clusters were found in all species, of which 373 consisted entirely of single-copy genes. A total of 251 genes were specific to T. vaporariorum, 9841 genes were shared between T. vaporariorum and B. tabaci, and 7990, 7484, 8072, 7492 and 6805 genes were shared between T. vaporariorum and A. glabripennis, T. castaneum, A. pisum, M. persicae and D. melanogaster, respectively. Based on mcmctree analysis, the divergence time between T. vaporariorum and B. tabaci was estimated to be approximately 110 million years ago (MYA).
Modelling of global gene gain and loss revealed a gene turnover rate of 0.0026 gains and losses per gene per million years in T. vaporariorum, similar to that reported for D. melanogaster (0.0023 duplications/gene/million years) [19]. Estimation of gene gain and loss in gene families across the 7 arthropod species revealed a positive average gene family expansion (0.1427) in T. vaporariorum, with a greater number of gene families expanded (1832) and genes gained (2931) than contracted (587) or lost (734) (Additional file 8: Table S6). This contrasts with B. tabaci, which has a negative (−0.0993) average expansion resulting from a lower number of gene families expanded (545) and genes gained (1079) than contracted (2213) or lost (2600) (Additional file 8: Table S6). Thus, under the assumption of a constant gene gain and loss rate (λ) throughout the arthropod phylogeny, gene gain is higher and gene loss lower in T. vaporariorum than in B. tabaci (Fig. 1c). Gene ontology (GO) enrichment analysis of genes specific to the whitefly clade identified GO categories related to carbohydrate metabolism, peptidase activity, proteolysis and transferase activity as significantly enriched (p < 0.0001) (Additional file 9: Table S7). A total of 43 gene families were identified as rapidly evolving in T. vaporariorum, with genes involved in metabolic processes, nucleic acid binding, and catalytic activity significantly enriched (Additional file 10: Table S8). Approximately 30% of the rapidly evolving genes gained in T. vaporariorum are contracting in B. tabaci, among which genes involved in transposase activity, DNA recombination, aspartic-type peptidase activity, actin filament binding, motor activity and cytoskeletal protein binding are significantly enriched.
Curation and phylogeny of genes involved in detoxification of natural and synthetic xenobiotics
Because of our interests in the mechanisms underpinning adaptation of T. vaporariorum to plant secondary metabolites and insecticides we manually curated the gene superfamilies most frequently implicated in detoxification and/or excretion of these xenobiotics, namely cytochrome P450s (P450s), carboxyl/cholinesterases (CCEs), glutathione S-transferases (GSTs), UDP-glucuronosyltransferases (UGTs) and ATP-binding cassette transporters (ABC transporters) (Additional file 11: Table S9-S13). Phylogenetic analysis was then performed, with the curated gene sets of T. vaporariorum compared to those of B. tabaci (MEAM1) [16].
A total of 80 cytochrome P450s were identified in the T. vaporariorum genome assembly, representing an additional 23 novel genes beyond those previously described in the transcriptome of this species. While this takes the P450 gene count into the range of most other insect species (Additional file 12: Table S14), it is still significantly reduced when compared to B. tabaci, which has 130 P450 genes. Phylogenetic comparison of the CYPome of T. vaporariorum and B. tabaci (Fig. 2a) revealed that both the CYP2 and mitochondrial clades are highly conserved between the two species, with 1:1 orthologs observed for all members of the mitochondrial clan and only 3 additional enzymes found within the CYP2 clade of B. tabaci. However, significant differences in the CYPomes of the species are observed in the CYP3 and CYP4 clades. This is largely due to the presence or absence of certain P450 subfamilies in one of the species, or major expansions/contractions in other subfamilies. Within the CYP3 clan this is most apparent for the CYP402C (13 members in B. tabaci but none in T. vaporariorum), CYP6CX (7 members in B. tabaci but none in T. vaporariorum) and CYP6DT (no members in B. tabaci but 7 members in T. vaporariorum) subfamilies. While less marked than the above cases, it is also notable that the CYP6CM subfamily comprises just one gene (CYP6CM1) in B. tabaci but three genes in T. vaporariorum. CYP6CM1 of B. tabaci is the best characterised P450 in any whitefly species, as its overexpression leads to resistance to several insecticides [20,21,22,23]. A similar pattern was observed in the CYP4 clade, with the CYP3133 family, which is unique to the two whitefly species, comprising 19 genes and 7 subfamilies in B. tabaci but just one subfamily comprising 5 genes in T. vaporariorum. Likewise, the CYP4CS subfamily contains 13 members in B. tabaci but only three members in T. vaporariorum.
The net effect of the differences in the two clans sums to 17 additional CYP3 P450 genes and 31 CYP4 genes in B. tabaci. Both T. vaporariorum and B. tabaci are highly polyphagous, so this disparity in P450 gene content is somewhat surprising; however, similar numbers of P450 genes are observed in the genomes of the generalist aphid M. persicae and the specialist A. pisum [24], demonstrating that CYPome size does not necessarily correlate with an insect's host plant range.
In the case of GSTs a total of 26 genes were collated from the T. vaporariorum genome assembly - an addition of 4 sequences compared to the previous transcriptome. This number is comparable with other insect species and slightly higher than B. tabaci (24 genes). Interestingly, phylogeny (Additional file 13: Figure S3A) revealed a GST belonging to the epsilon class in T. vaporariorum, a clade not found in B. tabaci or indeed the sap sucking aphids M. persicae or A. pisum [25]. The largest clade in both whitefly species was the delta clan with 14 genes observed in T. vaporariorum and 12 in B. tabaci. Both the delta and epsilon classes of GSTs are unique to insects and members of this class have been previously implicated in detoxification of insecticides [26].
A total of 31 CCEs (4 novel) were identified in the T. vaporariorum genome. This is a comparable number to other insect species but again is reduced compared to B. tabaci which has 51 CCE genes. Phylogeny (Additional file 14: Figure S4A) assigned 14 of the T. vaporariorum CCE genes to the A and C clades, which have been previously associated with the detoxification of xenobiotics and metabolism of dietary compounds [27]. Despite the high number of CCEs in B. tabaci fewer of the CCE genes in this species are observed in these clades and so, with respect to xenobiotic tolerance, T. vaporariorum may be equally or even better equipped to hydrolyse allelochemicals and/or synthetic insecticides. B. tabaci has a larger total number of CCEs due to an expansion of CCEs belonging to the E clade which function to process hormones and pheromones [27]. Other clades principally related to neurodevelopment and cell adhesion remain largely consistent between the two whitefly species.
A total of 46 ABC transporters were curated from the T. vaporariorum genome, comparable with the number observed in B. tabaci (50) (Additional file 15: Figure S5A). In many of the clades (C, D, F and A) close to 1:1 orthology between the two species is observed. However, significant differences in the two species are observed in the B and G clades with many more ABC transporter genes observed in B. tabaci in the G clade and more genes in the B clade in T. vaporariorum. ABC transporters belonging to several clades (B, C, D and G) have been previously associated with detoxification of natural and synthetic xenobiotics in several arthropod species [28, 29]. These include B. tabaci where several ABC transporter genes of the G clade were implicated in resistance to neonicotinoids [30].
Comparison of the UGT gene family of T. vaporariorum with that previously described for B. tabaci [16] initially suggested that the genome of B. tabaci contains close to double the number of UGT genes (81) than the number observed in T. vaporariorum (42). However curation and naming (UGT nomenclature committee) of UGT genes in the two species revealed many of the previously proposed UGTs of B. tabaci were partial or not bona fide UGTs reducing the number in this species to 51 (Additional file 12: Table S14). Despite the similarity in UGT gene number in the two whitefly species, phylogenetic analysis (Additional file 16: Figure S6A) revealed marked contractions/expansions in specific UGT families between the two species. For example, the UGT353 family contained 1 gene in T. vaporariorum but 10 genes in B. tabaci. Such large species-specific blooms have been described in insect UGTs previously, for example, the UGT344 family of the pea aphid A. pisum and the UGT324, 325 and 326 families of red flour beetle (Tribolium castaneum) [31]. While other UGT families were observed in both T. vaporariorum and B. tabaci (UGT357, 358, 354), the pattern of one to one orthologs observed for several P450 subfamilies in the two species was not apparent (Additional file 16: Figure S6A). Previous analysis of insect UGTs [32] observed generally poor conservation between different insect species with genes frequently grouping in species-specific clades and our results are consistent with this. However, one clade that does not exhibit this pattern is the UGT50 family which is nearly universal across insect species, where it is composed of one member suggesting it has a conserved, and important, physiological role. Interestingly, while a single gene belonging to this family is found in B. tabaci, no member of this family was identified in T. vaporariorum, a phenomenon only previously reported for the pea aphid A. pisum [31].
In summary, across the five superfamilies of genes that play a key role in the ability of insects to detoxify and/or excrete natural and synthetic xenobiotics we observed ~ 1.4-fold difference in total gene number between T. vaporariorum (225) and B. tabaci (306). It has previously been suggested that species with larger complements of these families may be associated with a broader host range and greater propensity to develop resistance to chemical insecticides. However, both T. vaporariorum and B. tabaci are highly polyphagous and appear to be equally adept at evolving resistance to chemical insecticides [33]. Thus our findings support previous work which has found no direct link between host plant range, size of enzyme families and pesticide resistance [34, 35].
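The totals and fold difference in the summary above follow directly from the per-family counts reported in the preceding paragraphs. The short sketch below simply re-derives them; the dictionary layout and the helper name are illustrative choices, while the counts themselves are those given in the text (with the curated figure of 51 used for B. tabaci UGTs).

```python
# Curated gene counts per superfamily, as reported in the text above.
counts = {
    "P450": {"T. vaporariorum": 80, "B. tabaci": 130},
    "GST": {"T. vaporariorum": 26, "B. tabaci": 24},
    "CCE": {"T. vaporariorum": 31, "B. tabaci": 51},
    "ABC": {"T. vaporariorum": 46, "B. tabaci": 50},
    "UGT": {"T. vaporariorum": 42, "B. tabaci": 51},
}


def total(species):
    """Sum the gene counts of one species across all five superfamilies."""
    return sum(family[species] for family in counts.values())


tv_total = total("T. vaporariorum")
bt_total = total("B. tabaci")
fold = bt_total / tv_total

print(tv_total, bt_total, round(fold, 2))  # 225 306 1.36
```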
Host-plant effects on the sensitivity of T. vaporariorum to insecticides
To explore the relationship between the sensitivity of T. vaporariorum to natural or synthetic insecticides and the host plant on which it was reared, we established cultures of the insecticide-susceptible strain TV1 on bean, tobacco, tomato, cucumber and pumpkin. The sensitivity of each line to synthetic insecticides belonging to four different insecticide classes, and to the plant secondary metabolite nicotine, was then examined. The population reared on bean, the host of origin, acted as a reference for the calculation of tolerance ratios (TRs). Adaptation to different host plants was frequently associated with significant decreases in sensitivity to insecticides (Fig. 3, Additional file 17: Table S15). This was particularly apparent for the lines reared on the nightshade hosts (tobacco and tomato), which in general exhibited a higher tolerance to the tested insecticides than all other lines. All lines showed significant tolerance to the pyrethroid bifenthrin compared to the line on bean, and this was particularly pronounced for the tobacco and tomato lines (TRs of 16 in both cases). Similarly, the lines reared on tobacco and tomato showed significant tolerance to the antifeedant pymetrozine and the neonicotinoid imidacloprid compared to the bean-reared line. However, the most dramatic changes in sensitivity were observed for the diamide chlorantraniliprole. In this case the cucurbit-reared lines, in particular the cucumber-reared line, showed marked tolerance to this compound compared to both the bean-reared (TR of 42) and nightshade-reared lines (TR of 12–55). In the case of the natural insecticide nicotine, only the tobacco-reared line exhibited a significant reduction in sensitivity to this compound.
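To make the tolerance ratio concrete, the sketch below assumes that a TR is the ratio of a potency estimate (here an LC50) for a host-adapted line to the same estimate for the bean-reared reference line. The use of LC50, the function name and all numerical values are assumptions for illustration only; they are not the bioassay data behind Fig. 3.

```python
def tolerance_ratio(lc50_line, lc50_reference):
    """Return the tolerance ratio of a line relative to the reference line.

    Assumes TR = LC50(line) / LC50(reference); values > 1 indicate that
    the line is less sensitive than the reference.
    """
    if lc50_reference <= 0:
        raise ValueError("reference LC50 must be positive")
    return lc50_line / lc50_reference


# Hypothetical LC50 values (arbitrary units) for a single insecticide.
lc50 = {"bean": 2.0, "tobacco": 32.0, "pumpkin": 16.0}

# The bean-reared line is the reference, so its own TR is 1 by definition.
trs = {host: tolerance_ratio(value, lc50["bean"]) for host, value in lc50.items()}
print(trs)
```

In practice a TR would normally be reported with confidence limits derived from the dose-response fit, which this sketch omits.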
These data, in combination with a range of previous studies (see introduction), demonstrate unequivocally that host plant can strongly influence the susceptibility of herbivorous insects to insecticides. It is notable that the T. vaporariorum lines reared on the nightshade hosts showed the broadest spectrum of tolerance to the tested insecticides. Tobacco and tomato are challenging hosts for most insect species due to the profile of insecticidal allelochemicals they produce (see introduction). This finding is therefore consistent with previous studies [12, 36,37,38,39,40,41,42,43,44,45] which have provided strong evidence that host-dependent insecticide tolerance results, in part, from induction of insect detoxification pathways in response to plant allelochemicals.
Host-plant effects on T. vaporariorum gene expression
To examine if changes in insecticide sensitivity of the host-adapted lines were correlated with changes in gene expression we performed replicated messenger RNA sequencing (RNAseq) of each T. vaporariorum line. Comparisons against the bean-reared line identified 65–4304 significantly differentially expressed (DE) genes (Fig. 4b, Additional file 18: Tables S16-S19), with a greater number of genes upregulated in lines reared on the alternate (non-bean) host plant. The most dramatic transcriptional response was observed for the nightshade-reared lines, with 4304 and 2974 genes identified as DE in the tomato- and tobacco-reared lines compared to the control line on bean. In contrast, just 65 genes were DE between the pumpkin- and bean-reared T. vaporariorum lines, with an intermediate number of genes (2069) DE in the comparison with the cucumber-reared line. Comparison of the lists of DE genes revealed clear plant-family specific transcriptional signatures, with the nightshade-derived lines sharing more DE genes with each other than with either of the cucurbit-reared lines and vice versa (Fig. 4a). This clear evidence of a plant-specific transcriptional response has also been observed in Lepidoptera and spider mites [9, 11, 12]. The magnitude of the transcriptional response of T. vaporariorum to the different host plants is consistent with the profile of the defensive secondary metabolites they produce. Our results suggest extensive transcriptional reprogramming is required for T. vaporariorum to effectively utilise the nightshades as hosts, which produce a challenging profile of allelochemicals including potent natural insecticides. In contrast, our data suggest that only a limited transcriptional response is required for T. vaporariorum to adapt from bean to pumpkin, which produces a lower concentration of the anti-herbivore cucurbitacins than cucumber, on which T. vaporariorum exhibited more extensive remodelling of gene expression. Thus, generalism in T. vaporariorum is associated with marked transcriptional plasticity. This finding provides further evidence that polyphagous species can rapidly tailor gene expression for a particular host and that this plasticity plays an important role in their striking ability to utilise a diverse range of plants.
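The comparison of DE gene lists described above amounts to counting the genes shared between each pair of host comparisons. A minimal sketch of that set logic follows; the gene identifiers and list contents are invented placeholders rather than the DE lists in Additional file 18, and the function name is illustrative.

```python
from itertools import combinations

# Hypothetical DE gene sets for each comparison against the bean-reared line.
de_genes = {
    "tomato": {"g1", "g2", "g3", "g4"},
    "tobacco": {"g2", "g3", "g4", "g5"},
    "cucumber": {"g4", "g6", "g7"},
    "pumpkin": {"g6"},
}


def shared_counts(de):
    """Count the DE genes shared by every pair of host comparisons."""
    return {
        (a, b): len(de[a] & de[b])
        for a, b in combinations(sorted(de), 2)
    }


overlaps = shared_counts(de_genes)
for pair, n_shared in overlaps.items():
    print(pair, n_shared)
```

With these toy sets the two nightshade comparisons share the most genes, which mirrors the kind of plant-family signature described in the text without reproducing the actual counts.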
Gene ontology (GO)-term enrichment analysis identified significantly enriched processes for both the tobacco-reared and the tomato-reared comparisons; however, no over- or under-represented terms were identified in the RNAseq comparisons involving the cucumber- or pumpkin-reared lines (Additional file 19: Figure S7). The significantly enriched terms for the tomato-reared comparison primarily relate to nucleic acids, with many of the terms involving nucleotide, nucleoside and ribonucleotide binding. This likely reflects the DE of genes involved in regulating the large-scale transcriptional changes observed in the tomato-reared comparison (see below) and parallels the findings of previous research on host-plant adaptation of the polyphagous butterfly, Polygonia c-album [9]. Interestingly, the same terms were enriched in the genes classed as rapidly evolving in T. vaporariorum (see above). The majority of the enriched terms in the tobacco-reared comparison reflect metabolic processes and ranged from higher-level terms such as primary metabolism to more specific terms such as heterocyclic compound and nitrogen compound metabolism. With regard to the two latter terms, it is notable that nicotine, the natural insecticide produced by tobacco, is a heterocyclic nitrogen compound. Finally, the list of enriched terms also included ‘catalytic activity’, which is synonymous with enhanced enzyme activity and may reflect a response to the allelochemicals produced by tobacco. The only significantly enriched term shared by the tobacco-reared and tomato-reared comparisons was ‘ion binding’.
QPCR was used to validate the expression of 6 genes selected randomly from those that were DE across RNAseq comparisons, and three P450s CYP6CM2, CYP6CM3 and CYP6CM4 that show high similarity to a known insecticide resistance gene (CYP6CM1) in B. tabaci. All genes were validated as DE although the fold-changes observed in QPCR were lower than those reported by edgeR in RNAseq analysis (Additional file 20: Figure S8).
Detoxification and transport of natural and synthetic xenobiotics
To build on our earlier analysis of genes involved in the detoxification and/or excretion of natural and synthetic insecticides, we examined the expression of genes encoding P450s, GSTs, CCEs, UGTs and ABC transporters, and interrogated the lists of DE genes for genes encoding these proteins (Additional file 21: Table S22). Analysis of candidate genes focused on the tobacco-, tomato- and cucumber-reared T. vaporariorum lines, which exhibited the greatest transcriptional response, and explored the association between upregulation of detoxification genes and sensitivity to insecticides.
Of all detoxification enzyme superfamilies P450s have been most frequently implicated in tolerance to plant allelochemicals and synthetic insecticides [46], and, in a previous study on spider mites, showed the most profound changes in gene expression after transfer to a challenging host [12]. Consistent with these studies we observed marked differences in the expression of P450 genes between the whitefly lines adapted to novel host plants (Fig. 2b, Additional file 18: Tables S16-S21). Interestingly, the lines with the most similar profile of P450 expression were the cucumber and tobacco-reared lines (Fig. 2b). The expression profile of the pumpkin-reared line was more distantly related to that of the other three strains and also had no significantly over-expressed P450s relative to the bean-reared line. A total of 11, 18 and 28 P450 genes were DE in the cucumber-, tobacco- and tomato-reared T. vaporariorum lines respectively. Grouping these by clade (Fig. 2c) revealed the majority belong to the CYP3 and 4 clades, members of which have been most frequently linked to xenobiotic detoxification across a range of insect species. Five P450 genes were overexpressed in all three comparisons of which CYP6DP2 belonging to the CYP3 clade was by far the most highly expressed in all three lines (19.6–28.3-fold) (Fig. 2b). Two additional P450s were over-expressed in both lines reared on nightshade hosts; CYP6EA1 a member of the CYP3 clade (overexpressed 5.0–9.2-fold) and CYP306A1 (overexpressed 3.3–2.4-fold). Finally, as detailed above, QPCR revealed that three P450s, CYP6CM2, CYP6CM3 and CYP6CM4, were overexpressed in the tobacco-reared line (2.4–4.7-fold) that belong to the same subfamily as CYP6CM1 of B. tabaci (Additional file 20: Figure S8). The overexpression of CYP6CM1 in this species has been shown to confer potent resistance to several neonicotinoid insecticides which have structural similarity to nicotine [21, 23]. 
Correlation of the expression of the upregulated P450s with the phenotypic data derived from insecticide bioassays allowed us to assess their potential role in mediating the observed tolerance of the different T. vaporariorum lines to insecticides. While CYP6DP2 is the most highly upregulated P450 in the cucumber-, tobacco- and tomato-reared lines, correlation of its expression with bioassay data suggests it may play a limited role in insecticide tolerance. Specifically, this P450 is overexpressed > 20-fold in the cucumber-reared line but is not overexpressed in the pumpkin-reared line. Despite this, both of these lines show the same (~ 8-fold) tolerance to bifenthrin (Fig. 3), suggesting its overexpression has no effect on the sensitivity of T. vaporariorum to this compound. Similarly, the cucumber-reared line exhibits no tolerance to imidacloprid, pymetrozine or nicotine (Fig. 3), suggesting the overexpression of CYP6DP2 does not enhance the detoxification of these compounds. Finally, the high expression of CYP6DP2 in the tomato-reared line is not associated with tolerance to chlorantraniliprole (Fig. 3). Thus, the overexpression of this P450 in three of the lines may represent a generic stress response to challenging host plants, but is unlikely to explain the pattern of insecticide tolerance observed. Using the same process, all other overexpressed P450s were ruled out as strong candidate insecticide tolerance genes except for CYP6EA1. This P450 is overexpressed in the tobacco- and tomato-reared lines and is a candidate for the tolerance of these lines to imidacloprid, with the level of expression in the two lines (5.0-fold and 9.2-fold) mirroring their relative tolerance to this compound (3.1-fold and 5.2-fold). Finally, given previous work on the substrate profile of CYP6CM1 in B. tabaci, CYP6CM2–4, which are overexpressed in the tobacco-reared line, represent potential candidates to explain the tolerance of this line to nicotine (Fig. 3).
In the case of GSTs, two genes were upregulated in the cucumber-reared line (g10036 and g13867); however, both of these were also overexpressed at similar levels in both nightshade-reared lines (Additional file 13: Figure S3B and Additional file 18: Tables S16, S20). This suggests that while they may play a role in host plant adaptation they play no role in the enhanced tolerance of the cucumber-reared line to chlorantraniliprole, or the tolerance of the nightshade-reared lines to pymetrozine or imidacloprid (Fig. 3). In addition to these two genes, one further GST (g5077) was upregulated exclusively in the nightshade-reared lines (overexpressed 2.7- and 2.3-fold in the tobacco- and tomato-reared lines) (Additional file 18: Table S20). This GST belongs to the microsomal clade, and while its pattern of expression in the two nightshade-reared lines would make it a candidate for contributing to the observed tolerance of these lines to bifenthrin (Fig. 3), to date only cytosolic GSTs have been implicated in insecticide resistance [47]. No additional GSTs were overexpressed exclusively (or at significantly higher levels) in the tobacco-reared line that might contribute to the tolerance of this line to nicotine.
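The exclusion reasoning applied to the P450s above can be expressed as a simple consistency check: a gene remains a candidate for tolerance to a compound only if every line overexpressing it is also tolerant to that compound. The sketch below encodes only the overexpression and imidacloprid tolerance patterns stated in the text for CYP6DP2 and CYP6EA1; the subset rule and the function name are a simplifying assumption, not the authors' formal procedure.

```python
# Lines in which each P450 is overexpressed, as stated in the text.
overexpressed_in = {
    "CYP6DP2": {"cucumber", "tobacco", "tomato"},
    "CYP6EA1": {"tobacco", "tomato"},
}

# Lines showing tolerance to each compound, as stated in the text.
tolerant_lines = {
    "imidacloprid": {"tobacco", "tomato"},
}


def is_candidate(gene, compound):
    """Keep a gene only if every line overexpressing it is tolerant.

    Simplified rule: overexpression in a non-tolerant line argues
    against the gene conferring tolerance to the compound.
    """
    return overexpressed_in[gene] <= tolerant_lines[compound]


for gene in overexpressed_in:
    print(gene, is_candidate(gene, "imidacloprid"))
```

A rule of this kind only prioritises candidates; it does not replace functional validation of the shortlisted genes.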
Two CCEs, g14105 and g17172, were upregulated in the cucumber-reared line, of which the latter was also modestly overexpressed in the nightshade-reared lines (Additional file 14: Figure S4B and Additional file 18: Table S16, S20). The high expression of g14105 (11.9-fold overexpressed) and the fact that it belongs to clade A, members of which have been previously associated with the detoxification of xenobiotics and metabolism of dietary compounds [27], makes it a potential candidate for the tolerance of the cucumber-reared line to chlorantraniliprole (Fig. 3). g17172 also belongs to clade A, however, comparison of its pattern of expression in the three T. vaporariorum lines with the sensitivity of these lines to insecticides suggests it is unlikely to confer tolerance to any of the compounds tested.
Much more marked changes were observed in the expression of genes encoding UGTs, with 11 UGT genes upregulated in the cucumber-reared line and 9 upregulated in both nightshade-reared lines (Additional file 16: Figure S6B and Additional file 18: Table S16, S20). Of these, 7 were upregulated at similar levels in all three lines. The four UGT genes (UGT352P5, UGT356E1, UGT352P2 and UGT358B1) exclusively upregulated (2.3–4.5-fold) in the cucumber-reared line are potential candidates for a role in the marked tolerance of this line to chlorantraniliprole. Indeed, UGTs have been recently implicated in metabolic resistance to this compound in the diamondback moth, Plutella xylostella, and striped rice stem borer, Chilo suppressalis [48, 49]. The two UGTs (g12287 and g2864) exclusively overexpressed in the nightshade-reared lines are potential candidate genes for a role in the tolerance of these lines to insecticides, particularly g12287, which was overexpressed > 19-fold in both lines.
Several ABC transporters were found to be significantly overexpressed in response to feeding on cucumber, tobacco and tomato, although few were upregulated to the extent seen for other families of detoxification genes (Additional file 15: Figure S5B and Additional file 18: Tables S16, S18, S19). Four genes (g11125, g11231, g5414 and g3563) were moderately (up to 5.4-fold) overexpressed in the cucumber-feeding line. ABC transporter genes have previously been implicated in insecticide resistance in B. tabaci, all belonging to the G clade [30]. Three of the ABC transporter genes overexpressed in the cucumber-reared line (g11231, g5414 and g3563) also belong to this clade and thus are potential candidates for the increased tolerance to chlorantraniliprole. Both genes significantly upregulated in the tobacco-reared line (g11231 and g5415) were also upregulated in the tomato-reared line, and so are unlikely to be responsible for the tolerance of this line to nicotine (Fig. 3). However, they could be associated with the elevated tolerance to imidacloprid or pymetrozine, especially as ABC transporters belonging to the G clade have been associated with neonicotinoid resistance in B. tabaci [30].
Structural proteins and cysteine proteases
Analysis of the transcriptomes of the T. vaporariorum lines revealed other trends in the transcriptional response to host switching beyond changes in the expression of genes belonging to superfamilies commonly implicated in detoxification. These included marked changes in the expression of genes encoding cathepsin B cysteine proteases and cuticular proteins, both of which have been previously implicated in insect adjustment to new host plants [24]. In the case of cathepsin B proteases, the tomato-, tobacco- and cucumber-reared lines all had > 10 genes belonging to this family DE (Additional file 18: Tables S16, S18, S19). In the cucumber-reared line all but one of the 14 cathepsin B genes DE were upregulated (2.1- to 14.6-fold); however, in both the tobacco- and tomato-reared lines a higher number of cathepsin B genes were downregulated, with just 3 genes upregulated (2.7- to 30.2-fold) in both comparisons (Additional file 18: Table S18). Previous work on the aphid M. persicae identified marked downregulation of cathepsin B genes in aphids when transferred from cabbage (Brassica rapa) to Nicotiana benthamiana, a close relative of tobacco [24]. RNAi-mediated knock-down of genes belonging to this family impacted aphid fitness in a host-dependent manner, providing clear evidence that cathepsin B genes play a role in adaptation to specific host plants [24]. Cathepsin B proteins have a role in several biological processes in insects including digestion, embryonic development, metamorphosis and the decomposition of larvae and adult fat body. Their specific role in host plant adaptation is less clear, but their overexpression could represent a counter defence against plant protease inhibitors [50]. Alternatively, work on aphids has suggested they may function as effectors that manipulate plant cell processes in order to promote insect virulence [24].
In the case of genes encoding structural components of the insect cuticle 15 sequences were identified as over-expressed in the nightshade-reared T. vaporariorum lines that returned BLAST hits to cuticle proteins and cuticular protein precursors (Additional file 18: Table S20). All proteins which were characterised belonged to the Rebers and Riddiford subgroup 2 (RR-2) cuticular family and so are associated with hard cuticle rather than flexible cuticle [51]. These findings align with prior studies on M. persicae, Polygonia c-album and B. tabaci which all reported the upregulation of genes encoding cuticular proteins during host adaptation [9, 12, 52]. The specific role of cuticular proteins in insect host plant adaptation is unclear, however, a study of the adaptation of B. tabaci to tobacco observed both the upregulation of cuticular proteins and increases in body volume and muscle content [52]. Thus, the overexpression of cuticular proteins could play a role in host plant adaptation by mediating physical changes that allow insects to more readily survive the effects of feeding on hostile plants, and this in turn could impact their sensitivity to insecticides.
Gene regulation and signalling
Among the most striking changes in gene expression during host adaptation related to genes involved in the regulation of transcription and signal transduction namely transcription factors and G protein-coupled receptors (GPCRs).
Transcription factors have been shown to play a key role in the regulation of enzymes responsible for detoxifying xenobiotics [53,54,55,56]. Their potential role in underpinning the marked transcriptional response observed during the adaptation of T. vaporariorum to challenging host plants was suggested by the over-expression of 56 transcription factors in the tomato- and tobacco-reared lines, representing 5.1% of all DE genes (Additional file 18: Table S20). The overexpressed genes encoded factors belonging to a variety of families including zinc-finger (ZF-TFs) and nuclear hormone receptors (NHR). ZF-TFs have been previously associated with the regulation of a ribosomal protein associated with pyrethroid resistance in mosquitoes [57], and a transcription factor belonging to the NHR family was upregulated in T. urticae in response to transfer to tomato and in two insecticide resistant strains [12]. However, it is worth noting that many of the observed changes in the expression of transcription factors may be unrelated to hostile challenge or insecticide resistance but simply result from the change in the nutrient composition of the host plant.
GPCRs are the largest family of membrane proteins and are responsible for cellular responses to hormones and neurotransmitters [58]. More than 20 genes annotated as GPCRs were overexpressed during adaptation of T. vaporariorum to nightshade plants (Additional file 18: Table S18). The stress of feeding on these challenging plants could lead to upregulation of these proteins for several reasons. Firstly, GPCRs mediate the action of neurohormones, which have been implicated in the regulation of feeding and digestion in insects, processes that are likely modified when feeding on hostile plants [59,60,61]. Secondly, previous work in mosquitoes found that knocking out GPCR genes not only reduces insecticide resistance but also downregulates the expression of P450 genes, suggesting a role for GPCRs in the regulation of these enzymes [62]. As the significant upregulation of GPCRs in the nightshade-reared lines was associated with both induced tolerance to insecticides and significant over-expression of P450s, it is possible that GPCRs play a similar role here.
P450s of the CYP6CM subfamily confer tolerance to plant-derived, but not synthetic, insecticides
As described above, transcriptome profiling identified a diverse range of candidate insecticide tolerance genes, which require functional characterisation to confirm their causal role. As a first step towards this aim, we selected P450s of the CYP6CM subfamily for further functional characterisation for the following reasons. Firstly, the three P450s belonging to this subfamily in T. vaporariorum were all overexpressed in the tobacco-reared line, which exhibited tolerance to both nicotine and imidacloprid (Additional file 20: Figure S8). Secondly, in a previous study two of the genes, CYP6CM2 and CYP6CM3, were found to be upregulated in imidacloprid-resistant populations of T. vaporariorum from Greece [21, 23]. Finally, the three P450s belong to the same subfamily as CYP6CM1, a P450 in B. tabaci that confers strong resistance to several neonicotinoid insecticides including imidacloprid [23]. CYP6CM2–4 thus represent strong candidates for P450 enzymes that confer resistance to a natural insecticide (nicotine) and a structurally related synthetic insecticide (imidacloprid). To investigate this, transgenic strains of D. melanogaster were created that individually express each of the three genes, and their sensitivity to nicotine and neonicotinoids was examined. In insecticide bioassays, none of the three lines showed tolerance to the neonicotinoid imidacloprid (Fig. 5a, Additional file 22: Table S23). Indeed, all three lines were much more sensitive to this compound than flies of the same genetic background but without a transgene, suggesting that a fitness cost is associated with the expression of these transgenes in D. melanogaster. In contrast, in bioassays with nicotine, a trend of increased tolerance to this compound was observed in the three transgenic lines when compared to the control.
While the 95% confidence intervals of the calculated LC50 values of the control and transgene-expressing lines overlap, the lines expressing CYP6CM3 and CYP6CM4 both showed significant resistance compared to the control when exposed to a 30,000 ppm concentration of nicotine (one-way ANOVA, p < 0.05; post hoc: Control-CM3 and Control-CM4, p < 0.05). These data provide evidence that these P450s confer tolerance to nicotine but not to synthetic insecticides. The latter finding is consistent with a recent study which expressed CYP6CM2 and CYP6CM3 in E. coli and observed no metabolism of the neonicotinoid insecticides imidacloprid, clothianidin, dinotefuran, thiamethoxam, nitenpyram, thiacloprid, or acetamiprid [63].
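For readers unfamiliar with the test, the one-way ANOVA used for the single-dose comparison can be sketched as follows. This is a minimal illustration only: the replicate mortality values are hypothetical placeholders rather than data from this study, the function name one_way_anova is our own, and the software actually used for the analysis is not stated here. The sketch computes the F statistic and its degrees of freedom; obtaining a p value and running post hoc pairwise comparisons would require an F distribution table or a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of line name -> replicate values."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    # Variation of the replicates around their own group mean.
    ss_within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL percentage mortality at a single nicotine dose, three replicates
# per fly line. These numbers are placeholders and are not taken from the study.
mortality = {
    "Control": [80, 85, 90],
    "CYP6CM2": [70, 75, 80],
    "CYP6CM3": [50, 55, 60],
    "CYP6CM4": [45, 50, 55],
}

f_stat, df_b, df_w = one_way_anova(mortality)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the critical value for the given degrees of freedom indicates that at least one line differs from the others; it does not by itself identify which pairs differ, which is why a post hoc test is reported alongside the ANOVA.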