Chemosensory genes identified in the antennal transcriptome of the blowfly Calliphora stygia

Background Blowflies have relevance in areas of forensic science, agriculture, and medicine, primarily due to the ability of their larvae to develop on flesh. While it is widely accepted that blowflies rely heavily on olfaction for identifying and locating hosts, there is limited research regarding the underlying molecular mechanisms. Using next generation sequencing (Illumina), this research examined the antennal transcriptome of Calliphora stygia (Fabricius) (Diptera: Calliphoridae) to identify members of the major chemosensory gene families necessary for olfaction. Results Representative proteins from all chemosensory gene families essential in insect olfaction were identified in the antennae of the blowfly C. stygia, including 50 odorant receptors, 22 ionotropic receptors, 21 gustatory receptors, 28 odorant binding proteins, 4 chemosensory proteins, and 3 sensory neuron membrane proteins. A total of 97 candidate cytochrome P450s and 39 esterases, some of which may act as odorant degrading enzymes, were also identified. Importantly, co-receptors necessary for the proper function of ligand-binding receptors were identified. Putative orthologues for the conserved antennal ionotropic receptors and candidate gustatory receptors for carbon dioxide detection were also amongst the identified proteins. Conclusions This research provides a comprehensive novel resource that will be fundamental for future studies regarding blowfly olfaction. Such information presents potential benefits to the forensic, pest control, and medical areas, and could assist in the understanding of insecticide resistance and targeted control through cross-species comparisons. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1466-8) contains supplementary material, which is available to authorized users.

Behavioural and physiological studies have shown the olfactory attraction of blowflies to whole host samples (e.g. liver, mice, pigs, etc.) [5,22] as well as their ability to detect ("smell") individual odour molecules emitted from those samples [23-26]. Of the numerous VOCs available for detection, sulfur compounds appear to be some of the most important. Sulfides are consistently identified within the odour space of mammalian decomposition [14,17,19,27-29], are produced by plants that mimic decomposition odour [30], and have been shown to elicit physiological and behavioural responses in a variety of blowfly species [16,23,31-33]. Knowledge regarding the underlying molecular mechanisms regulating blowfly olfaction, however, is severely limited.
Because of the sequence diversity of olfactory genes, their identification has largely been only possible with insects for which genomic data is available [48,49]. However, recent advances in RNA-Seq and computational technologies have opened up such identifications in non-model organisms. This has resulted in the identification of olfactory genes in a wide range of insects for which no sequenced genome is available [50][51][52][53][54][55][56]. With respect to blowflies, very few chemosensory genes have been identified [57,58]. For example, the co-receptor crucial for the appropriate function of ligand-binding ORs has been identified in various blowfly species [59-61] along with two candidate ligand-binding ORs [62]. Candidate genes from the OBP, IR, and GR families have also been identified [59,63]. However, no research regarding the functional characterisation of individual olfactory genes has been published and the ligands for these candidate genes remain unidentified.
This research investigated the antennal chemosensory gene families of the blowfly Calliphora stygia (Fabricius) (Diptera: Calliphoridae) via transcriptomic analysis. C. stygia is native to Australia and is primarily carrion-dependent with respect to its feeding and reproductive behaviour [64]. Previous research with C. stygia has focused primarily on factors affecting its growth throughout its lifecycle [65][66][67][68]; there is no information regarding its olfactory abilities. Identification of members of the primary gene families mediating insect olfaction permits a better understanding of the molecular basis of blowfly olfaction. Such knowledge could ultimately lead to the identification of new targets of control strategies [11,57], an improved understanding of how blowflies recognise, locate, and colonise hosts, as well as improved methods for estimating post-mortem interval [7].
Figure 1 Scanning electron micrograph of the head of a male C. stygia. The main olfactory organs, the antennae (an) and maxillary palps (mp), are located between the compound eyes (ce) and the base of the proboscis (p), respectively. Scale bar: 1 mm.

Antennal transcriptome
The combined Trinity assembly of the male and female C. stygia antennal transcriptomes led to the generation of 75,836 contigs, from which 16,522 non-redundant putative transcripts were predicted. Searches against the NCBI non-redundant protein database returned 14,094 transcripts showing sequence similarity to known proteins (Additional file 1). Of these, 8,709 (~53% of all predicted proteins) were assigned at least one GO term (Figure 2). There was no significant difference between the male and female data sets with respect to GO annotation; the two data sets are therefore presented together. The most abundant GO term associations were in relation to basic cell functions; however, GO terms associated with olfaction (e.g. "odorant binding", "response to stimulus", and "signal transducer activity") and enzyme activity (e.g. "hydrolase activity", "transferase activity", etc.) were also represented within the data sets. The large number of transcripts without associated GO terms (7,813 transcripts, ~47%) potentially represent orphan genes.

Identification of candidate odorant receptors
Analysis of the C. stygia antennal transcriptomes identified 48 and 50 candidate OR proteins in the male and female data sets respectively (combined total of 50 candidates [GenBank accession numbers KJ702047-KJ702096], with CstyOR118 and CstyOR119 being absent from the male data set). Additional file 2: Table S1 summarises transcript name, length, best BLASTx hit, predicted domains, and male or female specificity. Twenty-four of the putative CstyORs likely represent full-length sequences. The majority of partial length transcripts possess overlapping regions with low amino acid sequence identity, which indicates that they represent separate individual proteins. However, the possibility that the remaining non-overlapping transcripts represent fragments of individual proteins cannot be excluded; therefore, based on sequence alignments and subsequent fragment location (i.e. C-terminus, internal, or N-terminus), the total number of CstyORs reported could be reduced by two.
Consistent with the diversity of the OR gene family (with the exception of Orco), full-length putative C. stygia ORs shared between 9% and 49% amino acid identity (average 16%). Predictive software also indicated that full-length candidate CstyOR transcripts possess between three and eight transmembrane domains. Depending on the length of the partial transcripts, the remaining CstyORs were predicted to have zero to seven transmembrane domains (Additional file 2: Table S1). Importantly, the highly conserved co-receptor Orco was identified in the C. stygia transcriptomes, sharing ~88% to ~99% amino acid sequence identity with Orco proteins from Drosophila melanogaster and other blowfly species. As expected, greater sequence identity was observed with other blowfly Orco proteins than with Drosophila (Figure 3). Putative D. melanogaster orthologues could be assigned for the majority of the presumably ligand-binding CstyORs; however, 13 appear to have no D. melanogaster counterpart. Of these 13, ten could be assigned putative orthologues (based on reciprocal best hits) in other species, including Anopheles gambiae, Bombyx mori, Danaus plexippus, and other Drosophila species. Interestingly, a putative orthologue of DmelOR67d, a pheromone-specific receptor, was identified in C. stygia (sharing ~43% amino acid sequence identity).
In absolute terms (i.e. presence or absence), there were no significant differences in the number of candidate OR proteins identified in the respective male and female data sets. Quantitative differences in the relative transcript abundances were observed (Additional file 3). Of the identified candidate OR transcripts, nine appear to be enriched (i.e. double the normalised FPKM value) in the female data set, while 15 are enriched in the male data set.
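The enrichment criterion used above (a transcript is treated as enriched in one sex when its normalised FPKM value is at least double that in the other) can be sketched as follows. The function name, the label strings, and the handling of transcripts with zero FPKM in both sexes are illustrative assumptions and are not taken from the study.

```python
def classify_enrichment(fpkm_male, fpkm_female, fold=2.0):
    """Label a transcript by sex-biased abundance using a fold threshold.

    Hypothetical helper: 'fold' defaults to 2.0 to mirror the
    'double the normalised FPKM value' criterion in the text.
    """
    if fpkm_male < 0 or fpkm_female < 0:
        raise ValueError("FPKM values must be non-negative")
    if fpkm_male == 0 and fpkm_female == 0:
        # Assumption: a transcript absent from both data sets is not classified.
        return "not detected"
    if fpkm_male >= fold * fpkm_female:
        return "male-enriched"
    if fpkm_female >= fold * fpkm_male:
        return "female-enriched"
    return "unbiased"
```

Because the comparison rests on a single male and a single female library, a label produced this way is only a preliminary indication and does not constitute a statistical test.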

Identification of candidate gustatory receptors
Twenty-one candidate GR transcripts were identified in the combined male and female C. stygia transcriptomes (Additional file 2: Table S2) (a total of 20 male and 20 female candidates [GenBank accession numbers KJ702097-KJ702117]). The majority of candidate CstyGRs were partial fragments (only five represent full-length proteins), encoding overlapping but distinct sequences. This establishes the proteins as being fragments of independent genes. Consistent with other insect GRs [69], transmembrane domain and topology predictions in full-length transcripts indicated between six and eight domains with an intracellular N-terminus and extracellular C-terminus being the most likely configuration. The eight candidate CstyGR transcripts included in the phylogenetic analysis formed a distinct clade (Figure 3); none clustered within the OR clades, indicating that the transcripts are more closely related to GRs than to ORs. The CstyGRs were also observed to group with their presumed Drosophila orthologues, which have been shown to have roles in carbon dioxide detection (GR21a and GR63a) [36,70] and thermosensation (GR28b) [71], or are members of the candidate sugar GR64 receptor subfamily (GR64b and GR64e) [72]. Several of the partial length candidate CstyGRs also show high amino acid sequence similarity to known sugar (DmelGR43a) and bitter (DmelGR66a and DmelGR93a) Drosophila receptors (Additional file 2: Table S2).
Figure 2 Distribution of C. stygia antennal transcriptome data in GO terms. GO analysis of 8,709 (8,628 male, 8,609 female) transcripts for their predicted involvement in molecular functions (A) and biological processes (B) or as cellular components (C). GO categorisation for molecular functions is presented at level 3 and at level 2 for biological processes and cellular components. Annotated genes are depicted as percentages of the total number of transcripts with GO term assignments.

Identification of candidate ionotropic receptors
Twenty-two candidate IRs were identified in both the male and female C. stygia antennal transcriptomes (Additional file 2: Table S3 [GenBank accession numbers KJ702118-KJ702139]). Structural analysis and amino acid sequence alignments revealed that most candidate CstyIRs shared the structural organisation of insect IRs and iGluRs (in the case of co-receptors IR8a and IR25a) (Figure 4A and B). The most conserved sequence regions were the three transmembrane domains and the ion channel pore (Figure 3C) [38,73]. Characteristic variability of the glutamate-binding residues located in the ligand-binding S1 and S2 domains was also present (Figure 5A and B). Only two CstyIRs (CstyIR8a and CstyIR64a) retain all residues characteristic of iGluRs (R, T and D/E) [38]; all other IRs have a diversity of amino acids at one or more of these positions, indicating variable ligand-binding properties. However, it should be noted that some of the putative CstyIRs with incomplete sequences could not be assessed for the presence of these crucial residues.
Phylogenetic analysis revealed that the candidate CstyIRs were more closely related to IRs than iGluRs, with all candidate CstyIRs assessed clustering with their presumed "antennal" orthologues (Figure 6). This analysis identified representatives from 10 of the 13 orthologous "antennal" IR groups conserved across the protostome species analysed by Croset et al. [73]. Thus, orthologues of the remaining three conserved groups (IR21a, IR60a, and IR68a) are either lacking from the C. stygia transcriptome assembly (due to their low expression levels [38,74], which could result in them being missed during random sequencing) or are yet to be identified within the putative CstyIRs represented by partial sequences (e.g. CstyIR101 and CstyIR106 share 78% and 70% identity with DmelIR21a, respectively). Notably, transcripts putatively encoding IR8a, IR25a, and IR76b, which are thought to function as IR co-receptors [38,75], were found in C. stygia antennae. No candidate CstyIRs clustered within the "divergent" clade.

Identification of candidate sensory neuron membrane proteins
Analysis of the male and female C. stygia antennal transcriptomes identified three candidate SNMPs present in both data sets (Additional file 2: Table S4 [GenBank accession numbers KJ702172-KJ702174]), two of which (CstySNMP1 and CstySNMP3) likely represent full-length genes. Notably, a putative orthologue to the D. melanogaster protein, SNMP1, which has been shown to have an important role in pheromone detection [39], was present in the C. stygia data sets (Figure 7).

Identification of putative odorant binding proteins
Twenty-eight candidate OBP transcripts were identified in C. stygia (Additional file 2: Table S5), all of which were present in both the male and female data sets [GenBank accession numbers KJ702140-KJ702167]. Of the 18 full-length CstyOBPs, 15 exhibited the classic arrangement of six conserved cysteines, 1 had the Plus-C motif (CstyOBP23), and 2 were Minus-C (CstyOBP25 and CstyOBP27) (Figure 8) [76,77]. A further 6 Classical, 2 Plus-C (CstyOBP22 and CstyOBP24), and 1 Minus-C (CstyOBP26) type transcripts could be allocated from the partial CstyOBP transcripts. Notably, while the CstyOBPs classified as Minus-C do lack one or more of the cysteine residues in the conserved classic locations, additional cysteines (conserved in the CstyOBP Minus-C motif sequences) were present in nearby positions that may act as an alternative.
Phylogenetic analysis revealed that all candidate CstyOBPs clustered in accordance with their respective sub-families (Figure 9). This analysis also indicated that D. melanogaster orthologues were likely to be present for many of the putative CstyOBPs, although a few small C. stygia specific clades potentially indicate a level of divergence within the blowfly. The low average amino acid identity exhibited by the candidate CstyOBPs is consistent with that of other species [76,78] and aligns with the notion that their role involves interacting with a range of diverse odour molecules. Interestingly, one particular candidate CstyOBP shared significant amino acid sequence identity to the D. melanogaster OBP LUSH (DmelOBP76a) (55% identity), which, in addition to its role in driving the avoidance of high alcohol concentrations [79], has been shown to play a role in pheromone sensitivity [39,80].

Identification of candidate chemosensory proteins
Four transcripts encoding candidate CSPs were identified in both the male and female C. stygia transcriptomes (Additional file 2: Table S6), three of which likely represent full-length proteins [GenBank accession numbers KJ702168-KJ702171]. All of the identified amino acid sequences possessed a signal peptide and the highly conserved four-cysteine profile (Figure 10).
Figure 5 Ligand-binding S1 and S2 domains. Important glutamate-interacting residues are lacking in the ligand binding domains of most C. stygia IR candidates. MAFFT amino acid alignments of the S1 (A) and part of S2 (B) ligand binding domains of candidate C. stygia IRs and Drosophila melanogaster IRs and iGluRs. The key binding residues in iGluRs are boxed.

Identification of candidate odorant degrading enzymes
GO annotation of the transcriptomes indicated an enrichment of proteins involved in catalytic activity. Further analysis of the combined male and female transcripts identified 136 candidate Cytochrome P450s and esterases (Additional file 2: Table S7 [GenBank accession numbers KJ702175-KJ702310]). Of the 39 candidate esterases (37 of which were annotated from both the male and female data sets), 17 likely represent full-length sequences. Ninety-three of the 97 candidate P450s were present in both sexes, with 28 sequences predicted to be full-length.
Phylogenetic analysis revealed that the candidate CstyEsts clustered within all three of the major functional groups of the esterase gene family (based on the classification system of [47]) (Figure 11). This indicates that the CstyEsts have possible functions in neurodevelopment (non-catalytic enzyme group), detoxification (mostly intracellular enzyme group), and hormone and pheromone processing (mostly secreted enzyme group) [47]. This analysis also indicated that D. melanogaster orthologues were likely to be present for many of the candidate CstyEsts, with no apparent C. stygia specific clades or expansions. Candidate CstyCyps were also distributed throughout the four major Cytochrome P450 gene family groups, specifically the CYP2, CYP3, CYP4, and mitochondrial clades (as classified by [81]) (Figure 12). Genes from the CYP3 clade, in which the majority of CstyCyps clustered (31 transcripts), have been shown to be involved in xenobiotic metabolism and insecticide resistance [82]. Additionally, some genes from the CYP4 clade have been associated with the metabolism of odorants or pheromones [82]. Candidate NADPH-Cytochrome reductases, proteins required for the reduction of P450s, were also identified. Notably, several of the candidate enzymes shared significant amino acid sequence identity to CYP450s and esterases specifically associated with pesticide resistance and detoxification. For example, the candidate esterase CstyEst7 was a best BLASTx hit to Lucilia cuprina's E3 (sharing 89% amino acid identity), an enzyme shown to be a prime candidate for pesticide resistance [83,84]. Additionally, CstyCyp83 was a reciprocal best hit to D. melanogaster's CYP6g1, which has been associated with DDT resistance [85]. While it is possible that many of the candidate enzyme transcripts are ODEs, significant biochemical analysis is necessary to identify their specific physiological roles.
Figure 6 Phylogenetic tree of a selection of Dipteran IRs. Neighbour-joining tree of candidate C. stygia IRs (red) with IRs from Anopheles gambiae (light blue), all iGluRs and IRs from Drosophila melanogaster (dark blue) and the single IR identified in Musca domestica (grey). The identified IR candidates of C. stygia cluster with their presumed orthologues within the "antennal IR" clades and the IR25a/8a subgroup.

Discussion
This research represents the first comprehensive analysis of a blowfly antennal transcriptome for the purpose of identifying members of the major chemosensory gene families necessary for olfaction. The reported gene sets therefore represent a significant addition to the data regarding the molecular basis of blowfly olfaction. Due to the importance of olfaction in blowfly behaviour, the identified genes could represent novel targets for future population control methods as well as providing opportunities for improving post-mortem interval estimations through a greater understanding of the odour related factors that favour or inhibit blowfly detection and colonisation.
Classification of predicted functions of the male and female C. stygia transcripts via GO assignment produced results similar to those obtained for other invertebrates [50,52,62,86,87]. The number of individual candidate transcripts identified for many of the olfactory gene families is also comparable to those of other Dipteran, Coleopteran, and Lepidopteran species for which the antennal transcriptome has been examined [50,51,86,88-91]. Without considering the potential significance of individual genes to each species studied, the similarity of the different data sets does indicate a certain level of antennal conservation with respect to gene expression.
Interestingly, the 50 candidate ORs identified in C. stygia show greater similarity to the numbers identified in Lepidopteran species (43 in Cydia pomonella, and 47 in Manduca sexta, H. armigera, and Spodoptera littoralis) [51,86,88,92] than to the more closely related D. melanogaster (37 ORs) [91]. Additionally, the number of candidate C. stygia GRs (20) and IRs (22) is greater than those reported for most species [52,91-93]. While these studies also analysed antennal transcriptomes, variations in the number of transcripts identified could arise from differences in the sequencing methods, sequencing depth, and/or sample preparation. The greater number of candidate ligand-binding transcripts annotated in C. stygia could be due to ecological differences; however, further research is required to determine the specific reason such differences may exist. Typically, invertebrates with large numbers of antennal GRs use their antennae for tasting purposes as well as for olfaction (e.g. the butterfly Heliconius melpomene) [94]. However, there are no reports of blowflies exhibiting such behaviour. Interestingly, recent transcriptomic analysis of A. gambiae also identified GRs in addition to the previously identified carbon dioxide receptors [90]. While GRs are primarily linked to the detection of tastants [95], the C. stygia and A. gambiae transcriptomes, and the increasing range of "non-gustatory" sensory functions being identified for these proteins [96], suggest that antennal GRs could have far more diverse roles.
Figure 11 Phylogenetic tree of esterases. Maximum-likelihood tree of candidate C. stygia esterases (red) with esterases from Drosophila melanogaster (dark blue). The candidate esterases identified in C. stygia are spread throughout all three of the major esterase gene family groups [47], indicating a variety of possible antennal functions.
Functional analysis of D. melanogaster IRs has demonstrated their role in the detection of amines and acids [38,97], which are significant compounds emitted during biological decomposition [14,19,28,98]. Candidate C. stygia IRs putatively orthologous to D. melanogaster IRs shown to respond to individual decomposition compounds (e.g. propanoic acid, ammonia, butyric acid, and putrescine) [93] were present amongst the male and female C. stygia data sets. Predicting the ligands to which olfactory receptors will respond based on empirical data from other receptors is problematic due to their extensive divergence. Therefore, determining receptor ligands can only be achieved experimentally. Ultimately, the reason for such a large number of ORs, GRs, and IRs in C. stygia is unknown and additional molecular biology and functional experiments are required in order to confirm the expression and role of these genes. Overall, the comparable and/or greater number of genes identified within each of the olfactory gene families suggests that a comprehensive antennal data set has been obtained for C. stygia. Such results also illustrate the sensitivity and value of transcriptomic analysis via next generation RNA-Seq for non-model organisms.
Figure 12 Cytochrome P450s. Maximum-likelihood tree of candidate C. stygia Cytochrome P450s (red) with Drosophila melanogaster Cytochrome P450s (dark blue). The identified CypP450 candidates from C. stygia cluster within the four distinct clades (CYP2, CYP3, CYP4, and mitochondrial). Candidate NADH-Cytochrome reductases were also identified.
Notably, the C. stygia transcriptome data indicates that the chemosensory gene repertoire is largely similar in the male and female. This indicates that male and female C. stygia share similar odour-coding capacity. Quantitatively, the range of relative expression levels (i.e. low to high expression) of the candidate ligand-binding ORs in C. stygia, in relation to Orco, are similar to those reported for other Dipteran species [91,99,100]. However, preliminary data suggests that there is a difference in the relative levels of expression of individual ORs between male and female C. stygia (Additional file 3). Therefore, while male and female antennae likely perceive similar odour stimuli, their sensitivities, and hence the odour significance to the male and female, may differ. This is consistent with previous studies, which show that electrophysiological responses can be elicited from males and females by a particular odour [23,101], while leading to sex-based behavioural differences [22,102-104]. Additional biological repeats and experimental validation (e.g. quantitative PCR) are required to confirm the expression data. Further research, such as in situ hybridization and single-sensilla recordings, would also be beneficial to determine the distribution and frequency of ORs within the antennae.
Sexually dimorphic expression of chemosensory genes could indicate roles in sex-specific behaviours, including those mediated by pheromones. Invertebrate pheromone detection mediates various behaviours including aggregation, mate recognition, and sexual behaviour [105]. In D. melanogaster, reception of the male volatile pheromone, cis-vaccenyl acetate, is achieved with LUSH, OR67d, and SNMP1 [1,39,80]. The identification of putative orthologues of these three proteins in C. stygia could indicate potential functional pheromone detection in this species. Proteins sharing high sequence similarity to LUSH and DmelOR67d have also been identified in Stomoxys calcitrans (stable fly) [62]. Candidate blowfly pheromones have been described for several species [106][107][108]; however, all have been non-volatile cuticular hydrocarbons and therefore are more likely to be contact pheromones. Additionally, the mechanisms that allow pheromone reception in blowflies are not yet known. Further characterisation of the candidate C. stygia proteins is required to determine if they exhibit similar chemosensory roles to the D. melanogaster genes or if they have adapted different functions.

Conclusions
A total of 264 transcripts encoding putative chemosensory proteins from the seven major olfactory gene families were annotated in the blowfly C. stygia. The transcriptomic approach proved to be a highly effective strategy for the identification of divergent blowfly chemosensory receptors for which no genomic data is publicly available. Comparative analysis with other species suggests that near-complete information regarding the molecular basis of C. stygia olfaction was obtained. This research greatly improves the gene inventory for C. stygia and provides a valuable resource for future analysis on blowfly olfaction. Such information will be fundamental for future comparative analyses that could highlight interspecies differences underlying ecological differences and genetic adaptation.

Insects
Calliphora stygia pupae were obtained from a commercial supplier (Sheldon's Bait, Victor Harbor, SA, Australia) and maintained at 23°C with a 12 hr light: 12 hr dark photoperiod. Following eclosion, males and females were separated and provided with water and protein biscuits (sugar, eggs, powdered milk, yeast, and water) as per published procedures [109].

RNA extraction and reference transcriptome generation
Antennae were excised from adult male and female flies (minimum 5 days old) following snap freezing in liquid nitrogen. Total RNA was isolated using the RNAqueous®-Micro Kit (Ambion) with DNase treatment following the manufacturer's protocol. RNA quantity was determined on a Nanodrop ND-2000 spectrophotometer (Thermo Scientific, Waltham, MA, USA). Synthesis of cDNA and Illumina library generation was completed at BGI-Hong Kong Co., Ltd. using Illumina HiSeq2000 sequencing. Raw RNA-Seq data was pre-processed, combined, and de novo assembled using Trinity [110,111]. Open reading frames were predicted using TranscriptDecoder software as implemented in Trinity. In silico expression profiles were generated using DEW [112], which aligns data using Bowtie2 v.2.1.0 [113] with RNA-eXpress post-processing [114]. Expression levels were reported as FPKM values (fragments per kilobase of transcript per million mapped reads).
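For readers unfamiliar with the FPKM unit, the standard definition can be written as a short calculation. This is a minimal sketch of the general formula only; the function and parameter names are hypothetical, and the exact normalisation applied by the DEW and RNA-eXpress pipeline is not described here and may differ in detail.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments / (length in kb) / (total mapped fragments in millions)
         = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0:
        raise ValueError("transcript length must be positive")
    if total_mapped_fragments <= 0:
        raise ValueError("total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Illustrative numbers only (not from the study): 500 fragments on a
# 2,000 bp transcript in a library of 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
```

Dividing by transcript length and library size is what makes values comparable between transcripts of different lengths and between the male and female libraries.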

Gene identification and annotation
An initial assessment of the combined male and female C. stygia transcriptomes was completed via searches against the NCBI non-redundant protein database (using BLASTp with a 1e-10 threshold) and GO Annotation with Blast2GO [115,116]. Putative chemosensory genes were identified by custom database nucleotide Blast profile searches (Geneious software) using known D. melanogaster sequences as queries. Putative C. stygia chemosensory genes were in turn used as queries to identify additional genes (tBLASTx and BLASTp). Iterative searches were completed until no new candidates were identified. Identification of candidate genes was verified by additional BLAST searches using the C. stygia contigs as queries against the NCBI non-redundant protein database (BLASTx). Protein domains (e.g. transmembrane domains, signal peptides, secondary structures, etc.) were predicted by queries against InterPro using the InterProScan Geneious software plugin running a batch analysis (e.g. HMMPanther, SignalPHMM, Gene3D, HMMPfam, TMHMM, HMMSmart, Superfamily, etc.) [117]. Membrane topology was assessed with Phobius [118]. Sequences were classified based on sequence similarity, domain structure predictions, and phylogenetic analysis.
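The iterative search strategy described above (seed with known D. melanogaster sequences, feed newly found candidates back in as queries, stop when a round returns nothing new) can be sketched as a simple fixed-point loop. The `search` callable stands in for a BLAST run against the transcriptome and is a hypothetical placeholder, as are the toy identifiers in the usage example; none of this reproduces the Geneious workflow actually used.

```python
def iterative_search(seed_queries, search):
    """Collect candidates by repeated searching until no new hits appear.

    seed_queries: iterable of initial query identifiers.
    search: hypothetical callable mapping one query to an iterable of
            hit identifiers (a stand-in for a BLAST search).
    """
    candidates = set()
    frontier = set(seed_queries)
    while frontier:
        hits = set()
        for query in frontier:
            hits.update(search(query))
        new_hits = hits - candidates   # keep only hits not seen before
        candidates |= new_hits
        frontier = new_hits            # next round queries with new hits only
    return candidates


# Toy usage with made-up identifiers.
toy_hits = {
    "DmelOrco": ["contig1"],
    "contig1": ["contig1", "contig2"],
    "contig2": ["contig2"],
}
found = iterative_search(["DmelOrco"], lambda q: toy_hits.get(q, []))
```

The loop terminates because each round can only add identifiers from a finite transcriptome, and a round that adds none leaves the frontier empty.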

Orthology determination
C. stygia genes were defined as potential orthologues when they were reciprocal best hits with the corresponding D. melanogaster gene and subsequently grouped within the same clade in phylogenetic trees.
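The first of the two criteria above, the reciprocal best hit, can be illustrated with a short sketch. It assumes the best hit in each direction has already been extracted from search output into a dictionary; the function name and example identifiers are hypothetical. The second criterion (grouping within the same clade in the phylogenetic trees) is not covered by this sketch.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Return sorted (query, subject) pairs that are each other's best hit.

    forward_best: dict mapping each C. stygia gene to its best
                  D. melanogaster hit.
    reverse_best: dict mapping each D. melanogaster gene to its best
                  C. stygia hit.
    """
    return sorted(
        (query, subject)
        for query, subject in forward_best.items()
        if reverse_best.get(subject) == query
    )


# Toy usage with made-up identifiers: CstyB also hits DmelX, but DmelX's
# best hit is CstyA, so only the CstyA/DmelX pair is reciprocal.
csty_to_dmel = {"CstyA": "DmelX", "CstyB": "DmelX", "CstyC": "DmelZ"}
dmel_to_csty = {"DmelX": "CstyA", "DmelZ": "CstyD"}
pairs = reciprocal_best_hits(csty_to_dmel, dmel_to_csty)
```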

Protein nomenclature
Candidate chemosensory transcript names are preceded by a four-letter species abbreviation in accordance with established conventions (e.g. [73,119]). C. stygia transcripts deemed orthologous (based on sequence similarity) to D. melanogaster sequences were given the same name (e.g. DmelIR8a, CstyIR8a, DmelOrco, CstyOrco). Multiple copies of a putative D. melanogaster orthologue were given the same name followed by a point and number (e.g. CstyIR76a.1, CstyIR76a.2). For IRs, novel transcripts (i.e. those without putative orthologues) were numbered from 101 upward in order to avoid confusion (D. melanogaster IRs are numbered up to IR100a). Similarly, novel ORs and GRs were numbered from 99 upwards. Transcripts identified as putative OBPs were named according to previously established conventions [80]. Briefly, OBPs were numbered from one upwards in the following order: "classical" members; "Plus-C" members; and "Minus-C" members. OBPs unable to be classified (due to incomplete sequences) were listed last. Candidate Cytochrome P450s and esterases were numbered from one upwards.
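The numbering rule for novel receptors can be expressed compactly. The sketch below covers only the starting offsets stated above (101 for IRs, 99 for ORs and GRs) and the species prefix; the function name is hypothetical, and the orthologue-based names, point-number suffixes, and OBP ordering are deliberately left out.

```python
# Starting numbers for novel (non-orthologous) transcripts, as stated in the text.
NOVEL_START = {"IR": 101, "OR": 99, "GR": 99}


def name_novel_transcripts(species_prefix, family, count):
    """Generate sequential names for novel transcripts of one receptor family."""
    if family not in NOVEL_START:
        raise ValueError(f"no numbering rule defined for family {family!r}")
    start = NOVEL_START[family]
    return [f"{species_prefix}{family}{start + i}" for i in range(count)]


# Example: the first two novel IR names for C. stygia.
first_two_irs = name_novel_transcripts("Csty", "IR", 2)
```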

Phylogenetic analysis
Amino acid sequences were aligned using MAFFT [120]. Unrooted neighbour-joining (for IRs) and maximum-likelihood trees (for ORs and GRs, OBPs, SNMPs, esterases, and Cytochrome P450s) were constructed using MEGA5 [121] and subsequently viewed and graphically edited in FigTree v1.4.0 [122] and Inkscape v0.48.2 [123]. Branch support was assessed using the bootstrap method based on 1000 replicates. Incomplete transcripts without sufficient overlap in alignments and transcripts less than 200 amino acids in length (except for the OBPs where full-length transcripts are generally shorter than 200 amino acids) were excluded from phylogenetic analyses to ensure that the analysed transcripts corresponded to individual genes and that greater accuracy in the analyses was maintained.
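The length-based part of this exclusion rule can be sketched as a simple predicate. The function and parameter names are hypothetical, and the sketch does not attempt the second criterion (sufficient overlap in the alignment), which would need to be judged from the alignments themselves.

```python
def passes_length_filter(length_aa, family, min_length=200, exempt_families=("OBP",)):
    """Return True if a transcript is long enough for phylogenetic analysis.

    Transcripts shorter than min_length amino acids are excluded, except in
    exempt families (OBPs), whose full-length proteins are generally shorter.
    """
    if length_aa <= 0:
        raise ValueError("length must be a positive number of amino acids")
    return family in exempt_families or length_aa >= min_length
```

For example, a 150 amino acid OR fragment would be excluded, whereas a 150 amino acid OBP would be retained.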
Phylogenetic trees were based on Dipteran data sets. The IR data set contained 12 C. stygia sequences (10 transcripts were omitted due to their short length and/or their lack of predicted M1-M3 domains), one from Musca domestica [GenBank accession number AFP89966.1], 59 IR and 14 iGluR sequences identified in D. melanogaster [38], and 44 IR sequences from Anopheles gambiae [124]. For construction of the OBP dendrogram, all 29 putative C. stygia sequences were analysed with 52 from D. melanogaster [125]. The OR and GR data set contained 43 and eight amino acid sequences, respectively, from C. stygia (seven OR and 13 GR transcripts were omitted due to their short length and/or lack of overlap when aligned) and 60 and 5, respectively, from D. melanogaster. Blowfly OR and GR sequences available from the NCBI database were also included. For the SNMP dendrogram, all three putative C. stygia sequences were analysed with two SNMPs from D. melanogaster and an additional two SNMPs from A. gambiae. The esterase and Cytochrome P450 data sets contained 29 and 70 C. stygia sequences, respectively (12 esterase transcripts and 27 Cytochrome P450 transcripts were omitted due to their short length) and 35 and 85 sequences, respectively, from D. melanogaster. Protein names and GenBank accession numbers of genes used for building phylogenetic trees are listed in Additional file 4.

Availability of supporting data
All supporting data is included within the article and its additional files. Candidate chemosensory genes were submitted to the National Center for Biotechnology