Chemosensory genes in the antennal transcriptome of two syrphid species, Episyrphus balteatus and Eupeodes corollae (Diptera: Syrphidae)

Background: Predatory syrphid larvae are important natural enemies of aphids in cotton agro-ecosystems in China. Their prey foraging, prey localization and oviposition behaviors rely heavily on the perception of chemical cues. As a first step toward understanding syrphid olfaction at the molecular level, we performed a systematic identification of their major chemosensory genes.

Results: Male and female antennal transcriptomes of Episyrphus balteatus and Eupeodes corollae were sequenced and assembled using Illumina HiSeq 2000 technology. A total of 154 chemosensory genes were identified in the E. balteatus transcriptome, including 51 candidate odorant receptors (ORs), 32 ionotropic receptors (IRs), 14 gustatory receptors (GRs), 49 odorant-binding proteins (OBPs), 6 chemosensory proteins (CSPs) and 2 sensory neuron membrane proteins (SNMPs). In the E. corollae transcriptome, we identified 134 genes, including 42 ORs, 23 IRs, 16 GRs, 44 OBPs, 7 CSPs and 2 SNMPs. We provide full-length sequences of the highly conserved co-receptor Orco, the IR8a/25a family and the carbon dioxide gustatory receptor in both syrphid species. The expression of candidate OR genes in the two species was evaluated by semi-quantitative reverse transcription PCR. Transcript abundances did not differ significantly between male and female antennae, which is consistent with the differentially expressed gene (DEG) analysis based on FPKM values. The sequences of the candidate chemosensory genes were confirmed and a phylogenetic analysis was performed.

Conclusions: This study identified and analyzed many novel candidate chemosensory genes involved in syrphid olfaction. It provides a basis for understanding how syrphids use chemical cues to guide their behavior within tritrophic interactions among plants, herbivorous insects and natural enemies in agricultural ecosystems.
Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3939-4) contains supplementary material, which is available to authorized users.

In general, chemoreception, which includes olfaction and taste, involves several gene families, including odorant receptors (ORs), ionotropic receptors (IRs) and gustatory receptors (GRs) [20,29-31]. In addition, odorant-binding proteins (OBPs), chemosensory proteins (CSPs) and sensory neuron membrane proteins (SNMPs) play crucial roles in chemoreception [32-36]. The insect chemoreceptor superfamily, which comprises ORs and GRs, was first identified in the Drosophila melanogaster genome [18,21]. Insect odorant and gustatory receptors were once thought to be G-protein-coupled receptors like the ORs of worms and vertebrates, but subsequent studies showed that they lack homology to vertebrate ORs [37]. The OR family is highly divergent across insect taxa, with sequences and numbers varying widely [18,38,39]. ORs are broadly tuned to alcohols, ketones and esters that are generally present in the environment [40,41]. The GR family, which encodes receptors for taste or contact stimuli, is also highly divergent across insect taxa [31]. In contrast, GR21a and GR63a are exceptionally conserved GRs that work together as a CO2 receptor in Drosophila [42]. Such chemoreceptors play an important role in host-seeking behavior in many insects, especially mosquitoes [43,44]. A further insect chemosensory family, the ionotropic receptors (IRs), was identified more recently. IRs belong to the ionotropic glutamate receptor (iGluR) superfamily and have been identified in both the olfactory and gustatory systems [30,45]. IRs are more highly conserved than ORs and GRs, although considerable variation can be observed in their ligand-binding domains. They are mainly tuned to acids, amines and other odorants that are not detected by ORs [30,45,46].
Since chemosensory gene families were characterized in two important model species, D. melanogaster and Anopheles gambiae [47,48], a growing number of chemosensory genes have been identified in many Dipteran species, such as Musca domestica [49], Bactrocera dorsalis (genome: assembly ASM78921v2), Calliphora stygia [50], Glossina morsitans morsitans [51] and Mayetiola destructor Say [20]. Protein prediction methods have been the first step toward functional identification of chemosensory genes. All of this information on insect chemosensory genes was obtained bioinformatically and has helped to clarify how insects process diverse volatile compounds and how chemical communication differs between species.
Because of the agricultural importance and potential applications of syrphid larvae, several reports have been published on the chemical ecology of these insects. In Episyrphus balteatus DeGeer, larvae may use a sesquiterpene as a kairomone [17,52] and other potential semiochemicals to locate their prey [8,53]. In Sphaerophoria rueppellii, adult females are strongly attracted to odors from aphid colonies, showing that specific volatile compounds are important for prey detection [1]. Some studies on the relationships between aphid or host plant volatile emissions and aphid localization and foraging behavior have shown strong associations with syrphid recognition. One striking finding is that volatiles from plants attacked by aphids elicit strong electrophysiological responses in the antennae of syrphids [7,8]. These studies indicate that detecting prey-derived volatiles ((E)-β-farnesene), herbivore-induced plant volatiles (monoterpenes and sesquiterpenes) or naturally occurring general leaf volatiles (GLVs; alcohols, aldehydes and esters) helps natural enemies to select oviposition sites and locate their prey [8,12,13,17,52-55].
Despite these reports on chemosensory behavior, little is known about the molecular basis of syrphid olfaction. Identifying the chemosensory gene families of predatory syrphids will therefore help reveal how syrphids forage for their prey and choose oviposition sites. In this study, we selected two syrphid species that are active in cotton fields of northern China, E. balteatus and Eupeodes corollae Fabricius, and sequenced their antennal transcriptomes to explore and compare their chemosensory genes. A total of 154 and 134 candidate chemosensory genes were identified in the E. balteatus and E. corollae transcriptomes, respectively, including ORs, IRs, GRs, OBPs, CSPs and SNMPs. Furthermore, we report the expression profile of the OR family found in each transcriptome. A comparison between these two syrphids and other insect species revealed candidate chemosensory genes that could be involved in prey selection and plant volatile recognition. The discovery of these putative chemosensory genes paves the way for functional studies of syrphid chemoreception.

Results
Antennal transcriptome sequencing and sequence assembly
Table S1). In addition, unigenes with a sequence length > 500 bp accounted for 47.64% and 43.59% of the E. corollae and E. balteatus transcriptome assemblies, respectively.
Gene ontology (GO) annotations were used to classify the transcripts into functional groups according to specific GO categories. A total of 12,441 (23.22%) predicted proteins from E. balteatus and 12,425 (24.39%) predicted proteins from E. corollae were assigned to at least one GO term (Additional file 2: Fig. S1B). The distribution of GO terms across the three categories was similar in the two species. In the "molecular function" category, the most abundant GO terms were "binding" and "catalytic activity". In the "biological process" category, "cellular process", "single-organism process" and "metabolic process" were the most represented. Finally, "cell", "cell part" and "organelle" were the most abundant GO terms in the "cellular component" category (Additional file 2: Fig. S1B). GO terms associated with chemosensory genes were distributed in the "biological process" category (e.g. "cellular process", "developmental process", "response to stimulus", "establishment of localization" and "biological regulation"), the "molecular function" category (e.g. "molecular transducer activity") and the "cellular component" category (e.g. "extracellular region", "membrane part" and "membrane").

Candidate ORs in E. balteatus and E. corollae
Based on our analysis of the antennal transcriptomes of the two species, 51 and 42 transcripts for candidate ORs were identified in the combined male and female data sets from E. balteatus and E. corollae, respectively (Additional file 3: Table S2). A total of 21 E. balteatus ORs (EbalORs) and 29 E. corollae ORs (EcorORs) contained full-length open reading frames (ORFs), whose translation products are predicted to possess 2-8 transmembrane domains (TMDs). The remaining partial-length transcripts encoded proteins whose overlapping regions showed low identity, and they were therefore classified as unique genes. After a more exhaustive comparison with OR genes from other insect species, we found that all putative EbalORs shared between 22% and 86% amino acid identity with other ORs, with almost identical values (22% to 87%) for EcorORs. Detailed information is reported in Additional file 3: Table S2.
We next performed a phylogenetic analysis using our candidate ORs and the ORs from four other Dipteran species: B. dorsalis, C. stygia, D. melanogaster and M. domestica (Fig. 1). Orthologues of the highly conserved co-receptor Orco, which clustered with DmOR83b, were identified in the antennal transcriptomes of both syrphid species and named EbalOrco and EcorOrco. As expected, sequence identity between EbalOrco and EcorOrco is very high (97.27%). Among the other ORs, five EbalORs (EbalOR9, 16, 18, 22 and 37) and three EcorORs (EcorOR8, 24 and 29) clustered with DmelOR67d, the pheromone receptor of D. melanogaster. This OR67d-specific clade also included the OR67d orthologues from M. domestica and B. dorsalis. Two of these genes, EbalOR16 and EcorOR24, are full-length transcripts with 71.65% amino acid identity. The remaining ORs in this group were highly divergent among species. Within the Dipteran OR sequences, we found a species-specific clade including eight members from E. balteatus (EbalOR7, 10, 30, 32, 41, 46, 47 and 48) and seven from E. corollae (EcorOR9, 10, 11, 15, 16, 34 and 38) that shared low identity with other Dipteran ORs (Fig. 1).
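For readers who want to reproduce identity values such as those quoted above, the following is a minimal sketch of one common way to compute percent amino acid identity from a pairwise alignment. The source does not state which definition or tool was used, so the function name `percent_identity` and the choice to ignore gapped columns are assumptions made purely for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned protein sequences.

    Assumption (not from the paper): columns in which either sequence
    has a gap ('-') are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment: 4 identical residues out of 5 ungapped columns.
print(percent_identity("MKT-LV", "MKTALI"))  # 80.0
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, which gives slightly different values for the same alignment.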

Candidate GRs in E. balteatus and E. corollae
We identified 14 and 16 candidate GR genes in the E. balteatus and E. corollae transcriptomes, respectively (Additional file 3: Table S2). The majority of candidate EbalGRs and EcorGRs were partial fragments, with only three from E. balteatus and six from E. corollae encoding full-length proteins. These complete sequences all show six or seven TMDs with an intracellular N-terminus and an extracellular C-terminus. Phylogenetic analysis with GRs from six Dipteran species suggests that Drosophila GR21a and GR63a, reported as carbon dioxide sensors [42,56], clustered first with EbalGR2 and EcorGR2 and second with EbalGR1 and EcorGR1. In addition, EbalGR4, EbalGR13 and EcorGR4 showed high identity to the thermoreceptor DmelGR28b, which is responsible for rapid warmth avoidance [57].
Candidate IRs in E. balteatus and E. corollae
We identified 32 transcripts for putative ionotropic receptors in E. balteatus and 23 in E. corollae. Of these, seven EbalIRs and 14 EcorIRs contained full-length ORFs, with two to five TMDs (Additional file 3: Table S2). Among these we found the common conserved co-receptors IR8a (EbalIR8a and EcorIR8a) and IR25a (EbalIR25a and EcorIR25a) in both species. Other candidate IRs were found as partial sequences (Fig. 3).
To further distinguish putative IRs from iGluRs, all EbalIRs and EcorIRs were aligned with IRs from A. gambiae, C. stygia and D. melanogaster, as well as some AgamiGluRs and DmeliGluRs, for phylogenetic analysis. The results showed that the candidate EbalIRs and EcorIRs clustered with the presumed "antennal" orthologues IR76b, IR93a, IR21a, IR68a, IR40a, IR75l, IR75d, IR64a, IR84a, IR31a and IR92a, and were well separated from the AgamiGluR and DmeliGluR clade (Fig. 3) [63]. Interestingly, the conserved "antennal" orthologue IR60a was absent from both the E. balteatus and E. corollae transcriptome assemblies, while IR68a was absent only from E. corollae. The E. balteatus sequences clustering with DmelIR94d and DmelIR94e were quite divergent (Fig. 3; Additional file 5: Fig. S2). Compared with their orthologues in other species, these IRs may play different roles in olfaction.

Candidate OBPs in E. balteatus and E. corollae
We identified 49 different transcripts encoding candidate OBPs in E. balteatus and 44 in E. corollae, numbers similar to the 52 OBPs of D. melanogaster [64]. Of these, 38 transcripts of EbalOBPs and 31 EcorOBPs contained full-length ORFs with predicted signal peptide sequences (with EbalOBP31 as the only exception) (Additional file 3: Table S2).
Candidate CSPs in E. balteatus and E. corollae
Through bioinformatic analysis, six and seven different transcripts encoding candidate CSPs were identified in the E. balteatus and E. corollae transcriptomes, respectively. Five EbalCSPs and six EcorCSPs represented full-length proteins, and only EbalCSP6 lacked a signal peptide (Additional file 3: Table S2). All of the identified amino acid sequences possessed the highly conserved four-cysteine profile. A phylogenetic tree was built with all the syrphid CSPs and those of A. gambiae, C. stygia and D. melanogaster (Fig. 5).

Candidate SNMPs in E. balteatus and E. corollae
In both species, two SNMPs with full-length ORFs were identified, each possessing two TMDs (with EbalSNMP1, which has a single TMD, as an exception) (Additional file 3: Table S2). EbalSNMP1 and EcorSNMP1 are very similar to DmelSNMP1, a protein shown to be required for correct pheromone detection [50,66-68]. EbalSNMP2 and EcorSNMP2 are similar to DmelSNMP2, reported to be expressed in supporting cells (Fig. 6) [27,69,70].

Differentially expressed genes (DEGs) analysis
Gene expression levels of all chemosensory genes in male and female antennae of E. balteatus and E. corollae were assessed using fragments per kilobase per million fragments (FPKM) values and are represented in a heatmap (Fig. 7). Normalised antennal expression levels of candidate E. balteatus and E. corollae ORs are shown in Additional file 7. Of all ORs, Orco had the highest transcript level in both sexes of each species. OR transcript abundances (FPKM values) did not differ significantly between male and female antennae, except for EcorOR14 (Additional file 7). Using the combined thresholds of false discovery rate (FDR) ≤ 0.001 and |log2 Ratio| ≥ 1, the EcorOBP family had the highest number of differentially expressed genes (DEGs), with eleven more highly expressed in males and six more highly expressed in females (Additional file 7). In addition, the candidate carbon dioxide receptors GR1 and GR2, as well as SNMP1, showed high expression levels in both sexes (Fig. 7).
Tissue- and sex-specific expression of candidate E. balteatus and E. corollae OR genes
The expression of the candidate ORs in male and female antennae and in legs (control sample) of E. balteatus and E. corollae was analyzed using semi-quantitative reverse transcription PCR (RT-PCR). All 51 EbalORs and 42 EcorORs were detected in the antennae at high expression levels. Only EbalOR49 was found to be mainly expressed in legs. Transcript abundances did not differ significantly between male and female antennae (Fig. 8). The Orco co-receptor gene also showed a high expression level in both syrphid species. These results are consistent with the DEG analysis of OR transcript abundances based on FPKM values.

Discussion
The syrphids E. balteatus and E. corollae are aphid-specific predators that predominantly inhabit wheat and cotton fields in northern China. As in most insects, chemical cues drive several aspects of their behavior, such as foraging for prey and choosing oviposition sites [7,8,10]. Chemosensory proteins play an important role in this process. We analyzed the antennal transcriptomes of E. balteatus and E. corollae and searched for chemosensory genes in order to understand chemical communication in tritrophic interactions among plants, herbivorous insects and natural enemies.
In our study, we sequenced the E. balteatus and E. corollae antennal transcriptomes using next-generation sequencing technology on the Illumina HiSeq 2000 platform. Total RNA was converted into a template library for high-throughput DNA sequencing, allowing us to obtain all expressed transcripts. De novo assembly of transcripts with Trinity efficiently yields reliable full-length transcripts across a wide range of expression levels, even without genome information [71]. Our sequence assembly yielded a final dataset of 50,942 unigenes from E. corollae and 53,575 unigenes from E. balteatus. Of these, 44.2% of the E. balteatus unigenes and 50.3% of the E. corollae unigenes shared sequence similarity with known proteins in a BLASTX homology search against the NCBI non-redundant protein database. These percentages are very similar to those of other Dipteran species [50,72]. The remaining transcripts without associated GO terms may represent species-specific genes. Antennal transcriptome analysis has proved to be a powerful tool for identifying chemosensory genes in insects without genome information. It has been successfully employed in many insect orders, including Lepidoptera, Coleoptera, Hymenoptera and Hemiptera. In Diptera, chemosensory genes have been successfully identified in the C. stygia, B. dorsalis and Scaeva pyrastri antennal transcriptomes [50,72,73]. Here, we identified 154 and 134 candidate chemosensory genes in E. balteatus and E. corollae, respectively, numbers similar to those of other Dipteran antennal transcriptomes (e.g. 128 in C. stygia) [50] but lower than the numbers of chemosensory genes identified in the D. melanogaster (254), M. domestica (386) and A. gambiae (292) genomes [30,33,47,49,63,65,74]. This difference could reflect differential expression across developmental stages (larva versus adult) or expression in other adult olfactory organs, such as the maxillary palp and proboscis.
Taken together, these data indicate that the chemosensory genes identified by antennal transcriptome sequencing are accurate and reliable.
We identified 154 candidate chemosensory genes (51 ORs, 32 IRs, 14 GRs, 49 OBPs, 6 CSPs and 2 SNMPs) in E. balteatus and 134 (42 ORs, 23 IRs, 16 GRs, 44 OBPs, 7 CSPs and 2 SNMPs) in E. corollae, numbers that differ slightly from those of other Dipteran species [50,73,75,76]. Such differences could be due to sequencing methods, coverage and/or depth. More chemosensory genes were identified in E. balteatus than in E. corollae, although assembly quality (unigene number and N50 length) was better in E. corollae than in E. balteatus. The differences in the number and quality of the transcripts identified could arise from variations in sample preparation or could reflect evolution [77] and adaptation to the environment (tritrophic interactions).
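Because the comparison above relies on assembly statistics such as N50 length, a short sketch of how these summary values are conventionally computed from a list of unigene lengths may be useful. The function names and the toy lengths are hypothetical; this is the standard N50 definition, not necessarily the exact script used to produce the values in this study.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def fraction_longer_than(lengths, threshold_bp=500):
    """Fraction of sequences strictly longer than threshold_bp."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    return sum(1 for length in lengths if length > threshold_bp) / len(lengths)


# Toy example with hypothetical unigene lengths (bp).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))                                    # 8
print(fraction_longer_than([100, 600, 700, 400], 500))     # 0.5
```

A higher N50 at a comparable unigene count generally indicates a more contiguous assembly, which is the sense in which the two assemblies are compared here.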
A total of 49 and 44 OBPs were identified in the E. balteatus and E. corollae transcriptomes, respectively. The number of OBPs is variable across species, with 52 members in D. melanogaster, 66 in A. gambiae (Diptera), 21 in Apis mellifera (Hymenoptera), 34 in Helicoverpa armigera, 29 in Helicoverpa assulta (Lepidoptera), 26 in Colaphellus bowringi, 46 in Tribolium castaneum (Coleoptera) and 15 in A. pisum (Hemiptera) [27,74,77-83]. Moreover, the OBPs of these two syrphid species are highly divergent from those of other insects. These evolutionary differences may result from different physiological functions or ecological niches. Compared with OBPs, only a small number of CSPs have been detected in Diptera. There are only 4 CSPs in D. melanogaster, 4 in C. stygia and 7 in A. gambiae (Diptera). These numbers are much lower than in other insect orders, such as the 18 CSPs found in H. armigera and 17 in H. assulta (Lepidoptera) [27,50,80,81]. In our study, six EbalCSPs and seven EcorCSPs were identified by transcriptome sequencing, showing that the size of the CSP gene family differs among species. CSPs show high evolutionary diversity in insects, probably related to different physiological functions.
We identified 51 ORs in E. balteatus and 42 ORs in E. corollae. Compared with other Dipteran species, these numbers are similar to those identified in C. stygia (50) [50] and G. morsitans morsitans (46) [51] but lower than those of D. melanogaster (62), M. domestica (86) and A. gambiae (79) [47,49,84-86]. This suggests that differences in sequencing method or depth between studies may leave some genes undetected because of their low expression [77]. Here, we were able to detect species-specific OR transcripts in E. balteatus and E. corollae. This clade of ORs may be particularly important for recognizing specific odors, especially aphid-derived volatiles and herbivore-induced plant volatiles that allow syrphids to locate their prey.
The tissue- and sex-specific expression analysis showed no differences between males and females, which is consistent with the DEG analysis of OR transcript abundances based on FPKM values. Lepidopteran ORs can show male-specific expression, which is usually associated with detection of the sex pheromone [19,26,27], but this does not seem to be the case in syrphids. Additional real-time quantitative PCR, in situ hybridization and single-sensillum recordings would be required to validate OR expression and function.
In D. melanogaster, males release the volatile sex pheromone cis-vaccenyl acetate (cVA) [87-89]. The perception of cVA is mediated by OR67d [88], OR65a [90,91], LUSH [92] and SNMP1 [34]. In our two syrphid species, EbalOBP17 and EcorOBP14 are the orthologues of the DmelOBP-lush gene, EbalOR16 and EcorOR24 are the orthologues of DmelOR67d, and EbalSNMP1 and EcorSNMP1 are very similar to DmelSNMP1, suggesting that these proteins may be involved in the detection of as yet unidentified syrphid pheromones. Further functional characterization of these candidate proteins will therefore help reveal any mechanism associated with pheromone reception in E. balteatus and E. corollae.
The E. balteatus and E. corollae IR family is relatively conserved, especially with respect to the co-receptors IR8a and IR25a, which are expressed in both the olfactory and gustatory systems [30,45]. The numbers of IRs identified in E. balteatus (32) and E. corollae (23) are similar to that of C. stygia (22) [50], but lower than those of D. melanogaster (66) and A. gambiae (46) [63]. It is possible that some IRs are not expressed in antennal tissue, or that the number of IRs varies between species and depends on natural habitat. A large number of EbalIRs and EcorIRs clustered with "antennal" orthologues in Drosophila, indicating that IRs are highly conserved in Diptera. Furthermore, the IRs identified in these two species may be activated by acids, amines and other odorants that are not sensed by ORs [30,45,46].
In the antennae of E. balteatus and E. corollae, we identified 14 and 16 candidate GRs, respectively. The total number of GRs in these two species may be much larger, because some members could be expressed exclusively in other gustatory organs, such as the maxillary palps, proboscis and legs. However, the numbers are still lower than those reported in other Dipteran antennal transcriptomes [50]. The conserved receptors identified in the two syrphid species may be involved in CO2 perception. However, we infer that the mechanism of CO2 perception differs from that of mosquitoes, in which it is involved in host seeking [43,44,48,93]. Some GRs may function as taste or contact receptors [31], particularly in relation to the specific pollination behavior of syrphids [94,95]. Some GRs from these two species clustered with thermosensing and sugar-detecting GRs from Drosophila, indicating that they may perform similar functions. Functional analysis of the candidate E. balteatus and E. corollae chemosensory proteins is required to identify their physiological roles.

Conclusions
We have identified and annotated 154 transcripts encoding putative chemosensory proteins in the antennal transcriptome of E. balteatus and 134 in that of E. corollae. Comparisons between the two syrphid species and with other Dipteran species were made using sequence information. This work lays a foundation for future studies aimed at understanding chemical communication in syrphids and tritrophic interactions among plants, herbivorous insects and natural enemies in agricultural ecosystems.

Methods
Insect rearing and tissue collection
E. balteatus and E. corollae larvae were fed with aphids (Aphis gossypii Glover) and maintained at 22 ± 1°C with a 12 h light: 12 h dark photoperiod at the Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China. Following eclosion, adult males and females were separated and provided with pollen and 10% honey solution.
Antennae were excised separately from 2- to 5-day-old adult males and females, and legs were collected together; all tissues were then immediately frozen and stored in liquid nitrogen.

cDNA library construction and Illumina sequencing
Total RNA was extracted from male and female antennae of E. balteatus and E. corollae using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer's instructions. Total RNA was dissolved in RNase-free water and RNA integrity was verified by gel electrophoresis. RNA concentration and purity were measured on a Nanodrop ND-2000 spectrophotometer (NanoDrop products, Wilmington, DE, USA). Ten micrograms of total RNA from each sample were used to construct the cDNA library. The cDNA library construction and Illumina HiSeq 2000 (Illumina, San Diego, CA, USA) sequencing of the samples were performed at Beijing Genomics Institute (BGI, Shenzhen, China). The insert sequence length was around 200 bp and the libraries were paired-end sequenced using a PE100 strategy [22,27].

Assembly and function annotation
Raw reads were pre-processed by filtering out low-quality reads, trimming low-quality nucleotides at each end and removing 3′ adaptors and poly-A/T tails. Each clean-read dataset from male and female antennae was fed to Trinity [71]. The Trinity assembly procedure, comprising Inchworm, Chrysalis and Butterfly, followed Grabherr et al., 2011 [71]. In the first step, Inchworm assembles reads into unique transcript sequences using the default parameters (default k-mer size = 25). Next, Chrysalis clusters related contigs that correspond to portions of alternatively spliced transcripts or otherwise unique portions of paralogous genes. Finally, Butterfly uses read sequences, read pairings and Chrysalis' read mappings to select the paths that are best supported by the read sequences [71].
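To make the role of the k-mer size concrete, the sketch below counts overlapping k-mers in a set of reads, which is the kind of k-mer catalogue that a k-mer-based assembler starts from. It is an illustration only and not Trinity's implementation: the function name `kmer_counts` is hypothetical, and skipping k-mers that contain the ambiguous base N is an assumption.

```python
from collections import Counter


def kmer_counts(reads, k=25):
    """Count every overlapping k-mer across a collection of reads.

    Illustration only; assumption: k-mers containing 'N' are skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        # A read of length n contains n - k + 1 overlapping k-mers.
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy reads with a small k so the result is easy to inspect.
toy = kmer_counts(["ACGTAC", "CGTACG"], k=4)
print(toy["CGTA"], toy["ACGT"])  # 2 1
```

With k = 25, a single 100 bp read contributes 76 overlapping k-mers, so each base in the read interior is covered by many k-mers.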
The Trinity outputs were clustered with TGICL [96]. The consensus cluster sequences and singletons make up the unigene dataset [22]. Unigenes were annotated by NCBI BLASTX searches against a pooled database of non-redundant and SwissProt protein sequences with an e-value < 1e-5. The BLASTX results were then imported into the Blast2GO pipeline for GO annotation [97].

Identification of chemosensory genes
Candidate unigenes encoding putative ORs, IRs, OBPs, CSPs, SNMPs and GRs were found by running Perl scripts against the transcriptome assembly and annotation on a remote server. The Perl scripts extracted sequences from the functional annotation results using olfaction-related keywords. Subsequently, all candidate chemosensory genes were manually checked by BLASTX against a local non-redundant database with an e-value < 1e-5. Using the NCBI BLASTX database, we manually aligned the transcripts against all known proteins to examine full-length coverage. Full-length transcripts contain both start and termination codons. The ORFs of all putative chemosensory genes were predicted using the ExPASy (Expert Protein Analysis System) Translate tool (http://web.expasy.org/translate/) according to the BLASTX best-hit result [98]. Putative N-terminal signal peptides of OBPs and CSPs were predicted with the SignalP 4.0 server using default parameters [99]. The TMDs of ORs, IRs and GRs were predicted using the TMHMM server version 2.0 [100].
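The keyword-based extraction and the full-length criterion described above can be sketched as follows. The original scripts were written in Perl and are not reproduced in the text, so this Python version is only an illustration: the keyword list, the function names and the annotation dictionary are all hypothetical.

```python
import re

# Hypothetical keyword list; the actual keywords used are not given in the text.
KEYWORDS = re.compile(
    r"odorant receptor|ionotropic receptor|gustatory receptor|"
    r"odorant[- ]binding protein|chemosensory protein|"
    r"sensory neuron membrane protein",
    re.IGNORECASE,
)

STOP_CODONS = {"TAA", "TAG", "TGA"}


def select_candidates(annotations):
    """Return IDs of unigenes whose annotation text matches a keyword.

    `annotations` maps unigene ID -> best-hit description (assumed layout).
    """
    return [uid for uid, desc in annotations.items() if KEYWORDS.search(desc)]


def is_full_length_orf(cds):
    """True if the coding sequence starts with ATG, ends with a stop codon,
    is a multiple of three in length and has no internal stop codon."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    has_internal_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return codons[0] == "ATG" and codons[-1] in STOP_CODONS and not has_internal_stop


# Toy annotation table with made-up unigene IDs.
toy = {"u1": "odorant receptor 83b", "u2": "actin", "u3": "Odorant-binding protein 19a"}
print(select_candidates(toy))            # ['u1', 'u3']
print(is_full_length_orf("ATGAAATAA"))   # True
```

Keyword matching of this kind only proposes candidates, which is why each hit was subsequently checked manually by BLASTX in the procedure described above.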

DEGs analysis
A mapping-based expression profiling analysis of the chemosensory genes was conducted to compare gene expression between male and female antennae. All of the clean reads were remapped onto the transcripts using SOAPaligner (http://soap.genomics.org.cn/soapaligner.html), allowing up to three base mismatches and a minimum length of 40 bp. The FPKM method was used to calculate unigene expression levels [20,50,102,103]. P-values for identifying differentially expressed genes were calculated according to the hypergeometric test [103]. The FDR was used to correct the P-values for multiple hypothesis testing. The criteria for significant differential expression were set at FDR ≤ 0.001 and |log2 Ratio| ≥ 1. Heatmaps of differential gene expression between male and female antennae in both species were generated with Heml 1.0 software [104].
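As a worked illustration of the expression measure and the significance criteria above, the sketch below computes FPKM with the conventional formula, FPKM = 10^9 × C / (N × L), where C is the number of fragments mapped to a unigene, N the total number of mapped fragments and L the unigene length in bp, and then applies the two thresholds. The FDR is taken as an input rather than recomputed; the function names and the small expression floor used to avoid division by zero are assumptions, not part of the published pipeline.

```python
import math


def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return 1e9 * fragments / (total_mapped_fragments * transcript_length_bp)


def log2_ratio(fpkm_a, fpkm_b, floor=0.01):
    """log2(B / A); values below `floor` are raised to it (assumed convention)."""
    return math.log2(max(fpkm_b, floor) / max(fpkm_a, floor))


def is_deg(fpkm_a, fpkm_b, fdr, max_fdr=0.001, min_abs_log2=1.0):
    """Apply both criteria: FDR <= 0.001 and |log2 ratio| >= 1."""
    return fdr <= max_fdr and abs(log2_ratio(fpkm_a, fpkm_b)) >= min_abs_log2


# Hypothetical unigene: 500 fragments, 1000 bp long, 10 million mapped fragments.
print(fpkm(500, 1000, 10_000_000))   # 50.0
print(log2_ratio(10.0, 40.0))        # 2.0
print(is_deg(10.0, 40.0, fdr=0.0005))  # True
```

Both conditions must hold: a fourfold change with an FDR of 0.01 would not be called differentially expressed under these thresholds, and neither would a 1.5-fold change with a very small FDR.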

Expression analysis by semi-quantitative RT-PCR
Semi-quantitative RT-PCR was performed to verify the expression of candidate chemosensory genes. Male and female antennae and legs were collected from adult E. balteatus and E. corollae after eclosion. Total RNA was extracted following the manufacturer's instructions [27]. The cDNA was synthesized from total RNA using the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, Waltham, MA, USA). Gene-specific primers were designed using the PrimerQuest Tool (http://sg.idtdna.com/Primerquest/Home/Index) (Additional file 9: Table S5) and synthesized by Sangon Biotech Co., Ltd. (Shanghai, China). Taq MasterMix (CWBIO, Beijing, China) was used for PCR reactions with a general three-step amplification program of 94°C for 30 s, 55°C for 30 s and 72°C for 30 s. RT-PCR products were separated on 2% agarose gels, stained with ethidium bromide (EB) and photographed under UV light in a Gel Doc XR+ Gel Documentation System with Image Lab Software (Bio-Rad, Hercules, CA, USA).