Identification of genes expressed in the sex pheromone gland of the black cutworm Agrotis ipsilon with putative roles in sex pheromone biosynthesis and transport

Background One of the challenges in insect chemical ecology is to understand how insect pheromones are synthesised, detected and degraded. Genome-wide surveys by comparative sequencing and gene-specific expression profiling provide rich resources for this challenge. A. ipsilon is a destructive pest of many crops, and further characterization of the genes involved in pheromone biosynthesis and transport could offer potential targets for disruption of its chemical communication and for crop protection. Results Here we report 454 next-generation sequencing of the A. ipsilon pheromone gland transcriptome, and the identification and expression profiling of genes putatively involved in pheromone production, transport and degradation. A total of 23,473 unigenes were obtained from the transcriptome analysis, 86% of which were A. ipsilon specific. Forty-two transcripts encoded enzymes putatively involved in pheromone biosynthesis, of which 15 were specifically, or mainly, expressed in the pheromone glands at 5- to 120-fold higher levels than in the body. Two transcripts encoding a fatty acid synthase and a desaturase were highly abundant in the transcriptome and expressed more than 40-fold higher in the glands than in the body. The transcripts encoding 2 acetyl-CoA carboxylases, 1 fatty acid synthase, 2 desaturases, 3 acyl-CoA reductases, 2 alcohol oxidases, 2 aldehyde reductases and 3 acetyltransferases were expressed at a significantly higher level in the pheromone glands than in the body. Seventeen esterase transcripts were not gland-specific and 7 of these were expressed highly in the antennae. Seven transcripts encoding odorant binding proteins (OBPs) and 8 encoding chemosensory proteins (CSPs) were identified. Two CSP transcripts (AipsCSP2, AipsCSP8) were highly abundant in the pheromone gland transcriptome and this was confirmed by qRT-PCR. One OBP (AipsOBP6) was pheromone gland-enriched and three OBPs (AipsOBP1, AipsOBP2 and AipsOBP4) were antennal-enriched. Based on these studies we propose possible A. ipsilon biosynthesis pathways for major and minor sex pheromone components. Conclusions Our study identified genes potentially involved in sex pheromone biosynthesis and transport in A. ipsilon. The identified genes are likely to play essential roles in sex pheromone production, transport and degradation and could serve as targets to interfere with pheromone release. The identification of highly expressed CSPs and OBPs in the pheromone gland suggests that they may play a role in the binding, transport and release of sex pheromones during sex pheromone production in A. ipsilon and other Lepidoptera insects.


Background
Lepidoptera sex pheromones are primarily C10-C18 long straight chain unsaturated alcohols, aldehydes or acetate esters [1], biosynthesised and released mainly from pheromone glands located between the 8th and 9th abdominal segments of the female moths. Usually the females use a mixture of compounds in a unique ratio to attract conspecific males [2]. The extremely high specificity and sensitivity of species-specific pheromones make them potential biological control agents for population monitoring, mass trapping and reducing pesticide use in integrated pest management (IPM) programs [3][4][5]. Further use of pheromones in such strategies would be aided by an understanding of the pathways involved in pheromone biosynthesis and transport.
Most sex pheromone blends of Lepidoptera insects are synthesised de novo via modified fatty acid biosynthesis pathways [2,6,7] and gland-specific enzymes are involved in desaturation, chain shortening, reduction and acetylation [1,2]. Different species use different combinations of these reactions to produce unique species-specific pheromone blends. The first step is the synthesis of saturated fatty acid precursors: acetyl-CoA carboxylase (ACC) produces malonyl-CoA from acetyl-CoA, which is then used by fatty acid synthase (FAS) [8,9]. Labeling studies conducted with acetate indicated that malonyl-CoA and NADPH are used by FAS to produce mainly saturated stearic acid (18:0) and palmitic acid (16:0), with 18 and 16 carbon atoms, respectively, and no double bonds, as precursors [10][11][12]. Modification of the fatty acid chain includes the introduction of a double bond by desaturases specific to pheromone biosynthesis, followed by chain shortening using specific β-oxidation enzymes [13,14]. So far, several types of desaturases have been extensively studied through gene characterization and expression analysis, including Δ5 [15], Δ9 [16,17], Δ10 [18], Δ11 [19,20], and Δ14 [21] desaturases. Once an unsaturated pheromone precursor with a specific chain length is produced, the carboxyl carbon is modified to form one of the functional groups (aldehyde, alcohol or acetate ester). These modifications require a fatty acyl reductase to produce the alcohols from the fatty acyl precursor [22]; in some species the alcohols are oxidized to aldehydes serving as pheromone components [23] or converted to acetate esters (OAc) by acetyltransferase [24]. Recently, a few members of the reductase gene family have been discovered and functionally characterized in several Lepidoptera species, including Ostrinia scapulalis [25], Heliothis virescens, Heliothis subflexa, Helicoverpa armigera, Helicoverpa assulta [26], Ostrinia nubilalis [27], Yponomeuta evonymellus (L.), Yponomeuta padellus (L.) and Yponomeuta rorellus (Hübner) [28]. A number of pheromone gland-specific enzymes have been identified and their essential functions in pheromone production demonstrated in vitro as well as in vivo. For example, using RNA interference, Matsumoto and colleagues showed that two pheromone gland-specific enzymes (an acyl-CoA desaturase and a fatty-acyl reductase) are responsible for pheromone production in the silk moth Bombyx mori [29][30][31].
After the female moths produce and release the sex pheromone components, the males detect the pheromone and respond to locate a mate. It is commonly accepted that pheromone molecules are captured and transported to the pheromone receptors on the dendrites of pheromone-sensitive neurons by olfactory binding proteins, including odorant binding proteins (OBPs) and chemosensory proteins (CSPs) [32][33][34]. Pheromone binding proteins (PBPs) bind to sex pheromone components and are classified as a subclass of OBPs [35]. After activation of the pheromone receptors, the olfactory signals must be degraded rapidly to prevent prolonged neuronal excitation [36]. This may involve pheromone degrading enzymes (PDEs) capable of degrading the pheromone molecules [37].
The black cutworm Agrotis ipsilon is a destructive polyphagous insect pest of many crops. For a strain from China, the female sex pheromone blend comprises five main acetate components: (Z)-11-hexadecenyl acetate (Z11-16:OAc), (Z)-9-tetradecenyl acetate (Z9-14:OAc), (Z)-7-dodecenyl acetate (Z7-12:OAc), (Z)-8-dodecenyl acetate (Z8-12:OAc) and (Z)-5-decenyl acetate (Z5-10:OAc) [38]. These components indicate the involvement of different desaturases and β-oxidases during sex pheromone biosynthesis. However, the genes/proteins and their specific functions in mediating A. ipsilon pheromone production, transport and degradation have not been characterized. Over the last few years, next-generation sequencing techniques such as 454 pyrosequencing have provided an easy and effective method for the discovery of novel genes. In the present study, using the Roche GS FLX Titanium sequencing platform, we report a genetic database of the genes expressed in the pheromone glands of A. ipsilon and the identification of genes with putative roles in pheromone biosynthesis, degradation and transport, as well as their tissue expression profiles.

Sequencing and unigene assembly
Sequencing of a cDNA library prepared from mRNAs of the pheromone glands of A. ipsilon gave a total of 631,425 raw reads with an average length of 517 base pairs (bp). After trimming adaptor sequences and removing low quality sequences, 629,273 clean reads remained with an average length of 496 bp. The size distribution of the clean reads is shown in Additional file 1. The sequences of all reads have been deposited in the NCBI SRA database with the accession number SRX189143.
The 629,273 clean reads were assembled into 23,473 unigenes, including 20,541 contigs (87.5%) and 2,932 singletons (12.5%), the largest transcriptome dataset so far from moth sex pheromone glands. An overview of the sequencing and assembly results is presented in Table 1. The length of the assembled unigenes ranged from 100 bp to 21,842 bp with an average length of 770 bp. Among the unigenes, 22,035 (93.9%) are between 200 bp and 2,000 bp long with an average length of 649 bp. These unigenes are in fact transcripts in the A. ipsilon pheromone gland cDNA library. We therefore refer to them as transcripts. All sequences of the unigenes used in the current study are provided in Additional file 2.

Analysis of the transcripts from the A. ipsilon pheromone gland
BLASTx and BLASTn were used to compare each A. ipsilon transcript against GenBank entries with a cut-off E-value of 1.0E-5. 12,989 transcripts (55%) had BLASTx hits in the non-redundant protein (nr) databases and 9,392 (40%) had BLASTn hits in the non-redundant nucleotide sequence (nt) databases. This is consistent with a previous report of H. virescens pheromone gland ESTs [39]. Some of the A. ipsilon transcripts were homologous to those from more than one species, but in general the largest share were homologous to sequences from other Lepidoptera species, which accounted for 2,379 of the 9,392 BLASTn hits, including 1,124 (12%) to B. mori entries. The second highest number of hits was to Dipteran species, with 343 hits to Drosophila melanogaster and 279 and 221 hits to the mosquitoes Anopheles gambiae and Aedes aegypti, respectively. The fewest hits were to the wasp Nasonia vitripennis (190 hits), the beetle Tribolium castaneum (147 hits) and the pea aphid Acyrthosiphon pisum (136 hits). The top 15 insect species that have significant BLASTn hits are shown in Figure 1.

Gene Ontology of the genes expressed in the A. ipsilon pheromone gland
The 23,473 assembled transcripts were annotated into different functional groups according to Gene Ontology (GO) analysis. Some transcripts were annotated into more than one GO category. Of the 23,473 transcripts, 7,546 (32%) could be assigned to a GO category (Additional file 3). Within the biological process ontology, the "cellular process" and "metabolic process" categories were most abundantly represented, with 4,056 (17.3%) and 3,361 (14.3%) transcripts, respectively. In the cellular component ontology the transcripts were mainly distributed in cell (4,415 transcripts, 18.8%) and cell part (4,133 transcripts, 17.6%). The GO analysis also showed that in the molecular function ontology 3,271 transcripts (13.9%) were annotated as having binding functions and 3,484 (14.8%) as having catalytic activity.

Comparative analysis of transcripts in Lepidoptera pheromone glands
In order to compare the A. ipsilon pheromone gland transcriptome with those from other Lepidoptera and to identify A. ipsilon transcripts with potential involvement in sex pheromone production and transport, we downloaded the pheromone gland ESTs of three other Lepidoptera, A. segetum, B. mori and H. virescens, from the dbEST database of NCBI, together with the previously published pheromone gland transcriptome of H. virescens [39]. After assembling these ESTs we obtained 925 unigenes from A. segetum, 3,943 from B. mori and 8,202 from H. virescens with an average length of 384 bp, 692 bp and 474 bp, respectively. These numbers are much lower than that obtained in the current study through 454 sequencing of the A. ipsilon pheromone gland, demonstrating that our pheromone gland transcriptome is currently the largest transcriptome resource for an insect pheromone gland.
When comparing the pheromone gland transcripts pairwise using best bidirectional hits, we found that there were 461 homologous transcripts between A. ipsilon and A. segetum, 1,110 homologous transcripts between A. ipsilon and B. mori, and 2,106 homologous transcripts between A. ipsilon and H. virescens (Figure 2).
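The best bidirectional hit criterion used for this pairwise comparison can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the input layout (query, subject, score triples) and the function names `best_hits` and `bidirectional_best_hits` are assumptions made for the example.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples; this layout
    is an assumption for illustration, not a format given in the paper.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy example with made-up identifiers and scores.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 190.0), ("b2", "a1", 140.0)]
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

In the toy data only a1 and b1 select each other, so a2 is not counted as having a homolog even though it has a hit.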

Transcript abundance in the A. ipsilon pheromone gland
The pheromone gland mRNA samples used for constructing the cDNA library were non-normalized and not amplified by PCR, so the reads in the sequencing dataset most likely represent the relative abundance of each assembled transcript in the pheromone gland, as summarized in Table 2. The most abundant transcripts include vitellogenin, a major reproductive protein in insects (2,925 reads per kilobase per million mapped reads (RPKM); 2.2% reads), the precursor of egg yolk proteins for insect egg production [40] and genes involved in PBAN [...]

Figure 1 Top 15 insect species that have significant BLASTn hits. All A. ipsilon pheromone gland unigenes were used in BLASTn searches against the GenBank entries. The significant hits with an E-value <=1.0E-5 for each query were grouped according to species and the number of the unigenes that had significant homology is indicated after the species name.

Candidate genes in the A. ipsilon pheromone gland with putative functions in pheromone production, transport and degradation
The overall enzymatic steps during pheromone biosynthesis in A. ipsilon are likely to be similar to those in other moth species, which include fatty acid synthesis, desaturation, chain shortening, reduction and acetylation [1,2,6]. By homology searches we identified members of gene subfamilies in the A. ipsilon pheromone gland transcriptome putatively involved in these biosynthetic processes and pheromone production, including transcripts putatively encoding 3 synthases (2 acetyl-CoA carboxylases and 1 fatty acid synthase), 5 desaturases, 13 acyl-CoA reductases, 5 alcohol oxidases and 5 acetyltransferases as well as 11 aldehyde reductases (Table 3); 17 transcripts encoding putative pheromone degradation enzymes (Table 4); 8 transcripts encoding putative CSPs and 7 transcripts encoding putative OBPs (Table 5). Their abundances in the pheromone gland transcriptome are shown in Figures 3 and 4. We further validated and characterized the expression levels and the tissue distribution of these genes by RT-PCR and qRT-PCR; the results are summarised below. The transcript abundance estimated by transcriptome sequencing agrees well with the transcript expression level in the pheromone gland as measured by RT-PCR and qRT-PCR.
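Transcript abundance in this section is reported as reads per kilobase per million mapped reads (RPKM). As a minimal sketch of that normalisation, the function below applies the standard RPKM definition; the function name and the example numbers are hypothetical and are not taken from the paper's tables.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (transcript length in bp * total mapped reads),
    i.e. reads divided by length in kb and by mapped reads in millions.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 100 reads on a 1,000 bp transcript in a library
# of 1,000,000 mapped reads.
example_value = rpkm(100, 1000, 1_000_000)
```

Because the count is divided by transcript length, a long transcript needs proportionally more reads than a short one to reach the same RPKM.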

Receptor for the pheromone biosynthesis activating neuropeptide (PBAN)
PBAN is released from the suboesophageal ganglion in the brain and enters the hemolymph, where it binds to the PBAN receptor in the membrane of the pheromone gland and triggers pheromone production [42,43]. Although no PBAN receptor was found in the pheromone gland transcriptome of H. virescens [39], we found one transcript (Unigene_3821) encoding a protein highly homologous to PBAN receptor isoform B. It has very low abundance in the A. ipsilon transcriptome (31 RPKM) but a high amino acid identity of 97% to the H. virescens PBAN receptor in GenBank (Protein ID: ABU93813) [44].

Acetyl-CoA carboxylase (ACC)
Saturated long chain fatty acids are the precursors of sex pheromones in most moth species. Their biosynthesis starts with ACC catalysing the production of malonyl-CoA from acetyl-CoA in the first committed biosynthesis step [8,9]. In the A. ipsilon pheromone gland we found two transcripts (ACC-JX989149 and ACC-JX989150) encoding ACCs. ACC-JX989149, with an open reading frame (ORF) of 5841 bp, encodes an ACC with 67% amino acid identity with the ACC of T. castaneum (Protein ID: XP_969851), and ACC-JX989150 encodes a protein with 56% amino acid identity with the ACC of H. virescens (Protein ID: ACX53705) (Table 3). The RT-PCR and qRT-PCR revealed that both ACC-JX989149 and ACC-JX989150 are highly expressed in the pheromone gland as compared to the body (Figure 5 and Figure 6). However, they have very low abundance (81 and 21 RPKM) in the transcriptome (Figure 3).

Fatty acid synthase (FAS)
FAS has been shown to catalyse the conversion of malonyl-CoA and NADPH to produce saturated fatty acids [8]. We identified one putative FAS transcript (FAS-JX989151) in the A. ipsilon pheromone gland (Table 3), containing an ORF of 7176 bp and encoding a FAS with 57% amino acid identity to the FAS of T. castaneum (Protein ID: XP_970417). The RT-PCR and qRT-PCR revealed that FAS-JX989151 is highly expressed in the pheromone gland (40-fold higher than in the body, Figure 5 and Figure 6) and also has a high abundance (343 RPKM) in the transcriptome (Figure 3).

Desaturase (DES)
Pheromone-specific desaturases introduce double bond(s) into the fatty acids at specific positions along the chain. Five putative sex pheromone components extracted from A. ipsilon sex pheromone gland are unsaturated fatty acids with acetate as the functional group and 16 or fewer carbons [...] [46] and Japan [47]. It is reasonable to propose that the saturated fatty acid precursor of A. ipsilon sex pheromones would be palmitic acid (16:0), which is desaturated by Δ11-desaturase to form the precursor Z11-16:acyl-CoA for the production of two major (Z7-12:OAc and Z9-14:OAc) and two minor (Z11-16:OAc and Z5-10:OAc) pheromone components (Figure 7). It is not clear how the minor pheromone component Z8-12:OAc is synthesized in A. ipsilon; its synthesis should involve a Δ12-desaturase. Other studies in Lepidoptera species support a Δ11-desaturase acting on palmitic acid and leading to the production of the sex pheromone components [19,20,48]. In the A. ipsilon pheromone gland transcriptome 5 transcripts have high homology to genes encoding desaturases (Table 3) [...] JX989156 encode proteins, respectively, with 94% amino acid identity to the acyl-CoA desaturase from H. assulta (Protein ID: AF482909), 64% amino acid identity to a S. littoralis desaturase (Protein ID: AAQ74260) and 93% amino acid identity to an acyl-CoA desaturase of S. exigua (Protein ID: AAM28510). These transcripts could possibly encode Δ12-desaturases involved in the formation of the minor pheromone component Z8-12:OAc from the precursor Z12-16:acyl-CoA in A. ipsilon. However, they could also function as Δ9-desaturases. Further study of their enzyme activity is needed to confirm their role in sex pheromone biosynthesis. The RT-PCR and qRT-PCR results indicated that DES-JX989153 and DES-JX989154 are highly expressed in the A. ipsilon pheromone gland compared with the body (85- and 63-fold higher, respectively) (Figure 5 and Figure 6). One of the transcripts (DES-JX989154) is also highly abundant (1206 RPKM) in the pheromone gland transcriptome (Figure 3), suggesting a possible role in A. ipsilon sex pheromone biosynthesis.

Fatty acyl-CoA reductase (FAR)
Once a specific Δ11 and possibly Δ12 double bond is introduced into fatty acid precursors to form a fatty acyl-CoA precursor, the chain of the precursors is then shortened sequentially by β-oxidation to form different shorter chain fatty acyl-CoA precursors [6]. These precursors are further reduced individually by fatty acyl reductase (FAR) to form the corresponding fatty alcohols [26,28,51]. In the A. ipsilon pheromone gland transcriptome there are 13 transcripts homologous to putative FAR genes (Table 3). Among them, 5 transcripts encode proteins with 59%-80% amino acid identity to the fatty- [...] (Table 3). The RT-PCR and qRT-PCR results indicated that three transcripts (FAR-JX989157, FAR-JX989162 and FAR-JX989164) are highly expressed in the pheromone gland (Figure 5 and Figure 6). The other ten transcripts seem to be equally expressed in the pheromone gland and the body or more highly expressed in the body. All FAR transcripts except two (FAR-JX989157 and FAR-JX989159) have low abundance (from 81 to 16 RPKM) in the pheromone gland transcriptome (Figure 3).

Alcohol oxidase/dehydrogenase (AOX)
Fatty alcohols can be used as pheromone components in many moth species, and they are also intermediates that alcohol oxidases convert into aldehyde pheromones [52,53]. In the A. ipsilon pheromone gland (PG), 5 genes homologous to alcohol oxidases/dehydrogenases were identified. The BLASTx results revealed that three unigenes (AOX-KC007341, AOX-KC007342 and AOX-KC007344) have amino acid identities of 43%, 55% and 64%, respectively, to a putative alcohol dehydrogenase of D. plexippus (Protein ID: EHJ70611), and one unigene (AOX-KC007345) is homologous to another putative alcohol dehydrogenase of D. plexippus (Protein ID: EHJ73729) with an amino acid identity of 68%. AOX-KC007343 showed 78% amino acid identity with the alcohol dehydrogenase of H. virescens (Protein ID: ACX53694). The RT-PCR and qRT-PCR results indicated that AOX-KC007341 and AOX-KC007343 are expressed at a higher level in the PG than in the body (Figure 5 and Figure 6).

Aldehyde reductase (AR)
Aldehyde reductases are members of the aldo-keto reductase superfamily and could be used to reduce long-chain acyl-CoA to form alcohol intermediates [13]. In the A. ipsilon pheromone gland we identified 11 transcripts with homology to the aldo-keto reductases of Papilio dardanus, B. mori, H. armigera, D. plexippus, Culex quinquefasciatus, H. virescens and Papilio xuthus (Table 3). The derived protein sequences of these 11 transcripts show 53%-88% amino acid identity with their homologs in other insects. The RT-PCR and qRT-PCR results indicated that AR-KC007350 and AR-KC007351 are mainly expressed in the pheromone gland, while the other 9 putative aldehyde reductase transcripts have equal expression levels between the pheromone gland and the body or a higher expression level in the body (Figure 5 and Figure 6). All aldehyde reductase transcripts are present at low abundance (from 67 to 10 RPKM) in the pheromone gland transcriptome (Figure 3). The involvement of aldehyde reductase in sex pheromone biosynthesis has not been demonstrated in moth species.

Acetyltransferase (ATF)
The fatty acid alcohols are used as pheromone components in many moth species. In A. ipsilon, whose sex pheromone blend comprises only acetates, they are intermediates that are acetylated to the acetate ester pheromone components by acetyltransferases [13]. In the A. ipsilon pheromone gland transcriptome 5 acetyltransferase homologous transcripts were identified (Table 3) [...]

Genes encoding candidate pheromone degrading enzymes in the A. ipsilon pheromone gland
It would be potentially harmful to insects if pheromone molecules and other odorants remained on the olfactory receptors after they had stimulated the olfactory receptor neurons (ORNs). It is therefore thought that there are mechanisms to protect the ORNs by odorant degrading enzymes (ODEs) [37], including esterases [54,55], aldehyde oxidases [56][57][58], cytochromes P450 [59][60][61], carboxyl esterase [62], and glutathione S-transferase (GST) [63]. In this study, we identified 17 transcripts predicted to encode esterases in the A. ipsilon pheromone gland, and the BLASTx results showed that all have very high amino acid identities with the antennal esterases of S. littoralis (Table 4). We named them AipsCXE1-AipsCXE16 and AipsCXE20, following the nomenclature in S. littoralis. Our qRT-PCR results revealed that 7 of the transcripts (AipsCXE3, AipsCXE7, AipsCXE8, AipsCXE9, AipsCXE11, AipsCXE14 and AipsCXE20) are antennal-enriched, 3 (AipsCXE5, AipsCXE10 and AipsCXE15) are both antennal- and pheromone gland-enriched and the remaining 7 (AipsCXE1, AipsCXE2, AipsCXE4, AipsCXE6, AipsCXE12, AipsCXE13 and AipsCXE16) have similar expression levels in antennae, body and pheromone gland, suggesting they are not pheromone specific (Figure 8).

Figure 6 qRT-PCR results showing the relative expression levels of the A. ipsilon pheromone biosynthesis related genes between the pheromone gland (PG) and the body (BO). The putative enzyme names are indicated as gene abbreviations followed by GenBank accession numbers. ACC Acetyl-CoA carboxylase, FAS Fatty acid synthase, DES Desaturase, FAR Fatty acyl reductase, AOX Alcohol oxidase, AR Aldehyde reductase, ATF Acetyltransferase. The internal controls β-actin and ribosomal protein S3 were used to normalize transcript levels in each sample. This figure was prepared using β-actin as the reference gene to normalize the target gene expression and correct sample-to-sample variation; similar results were also obtained with ribosomal protein S3 as the reference gene. The standard error is represented by the error bar, and the different letters (a, b) above each bar denote significant differences (p < 0.05).

Figure 7 Putative biosynthesis pathways of the sex pheromones in Agrotis ipsilon. The saturated fatty acid precursor palmitic acid (16:0) is desaturated by Δ11-desaturase to form the precursor Z11-16:acyl-CoA for the production of three major and one minor pheromone components (adapted from [2,6,12,13,50]).

Genes encoding candidate pheromone carrier proteins in the A. ipsilon pheromone gland
Moth sex pheromones are synthesised and protected from degradation until they are released from the female pheromone gland, and it has been proposed that OBPs and CSPs could participate in this process. In this study we identified transcripts of 7 OBPs and 8 CSPs from the A. ipsilon pheromone gland (Table 5). All of these have the typical insect OBP sequence motif C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6 [35,64] or the CSP sequence motif C1-X6-8-C2-X16-21-C3-X2-C4 [65]. One CSP transcript, AipsCSP2, seems to be gland-specific and has an extremely high expression level (>100-fold) in the pheromone glands compared with the antennae and body and a relatively high abundance in the pheromone gland transcriptome. AipsCSP8 shows a higher expression level in the pheromone gland (10-fold higher than in the body) (Figure 9) and is extremely abundant, with 1,364 RPKM in the pheromone gland transcriptome (Figure 4). There is one OBP transcript (AipsOBP6) which is highly expressed in the pheromone gland (more than 3-fold higher than in the antennae), and 3 OBPs (AipsOBP1, AipsOBP2 and AipsOBP4) are highly expressed in the antennae (Figure 10). This high expression of OBPs and CSPs in the pheromone gland is interesting because it suggests a possible involvement in carrying and releasing sex pheromones, as demonstrated for the antennal OBPs and CSPs. However, the molecular mechanisms that connect these proteins with pheromone production need further investigation. No ORs, IRs or SNMPs were identified in the A. ipsilon pheromone gland.
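The conserved cysteine spacing patterns quoted for OBPs and CSPs in this section can be expressed as regular expressions for screening translated sequences. The sketch below is illustrative only: the function name is hypothetical, X is assumed to mean any amino acid letter, and the test sequences are synthetic strings rather than A. ipsilon proteins.

```python
import re

# OBP motif: C1-X15-39-C2-X3-C3-X21-44-C4-X7-12-C5-X8-C6
OBP_MOTIF = re.compile(
    r"C[A-Z]{15,39}C[A-Z]{3}C[A-Z]{21,44}C[A-Z]{7,12}C[A-Z]{8}C"
)
# CSP motif: C1-X6-8-C2-X16-21-C3-X2-C4
CSP_MOTIF = re.compile(r"C[A-Z]{6,8}C[A-Z]{16,21}C[A-Z]{2}C")


def classify_binding_protein(protein_seq):
    """Return 'OBP', 'CSP' or None depending on which motif is found.

    The six-cysteine OBP pattern is checked first because it is the
    more restrictive of the two.
    """
    seq = protein_seq.upper()
    if OBP_MOTIF.search(seq):
        return "OBP"
    if CSP_MOTIF.search(seq):
        return "CSP"
    return None


# Synthetic sequences built only to exercise the spacing rules.
obp_like = "C" + "A" * 20 + "C" + "A" * 3 + "C" + "A" * 30 + "C" + "A" * 10 + "C" + "A" * 8 + "C"
csp_like = "M" + "C" + "A" * 7 + "C" + "A" * 18 + "C" + "A" * 2 + "C" + "K"
not_obp = "C" + "A" * 20 + "C" + "A" * 2 + "C" + "A" * 30 + "C" + "A" * 10 + "C" + "A" * 8 + "C"
```

A motif match is only a first screen; annotation in practice would also rely on homology searches such as the BLASTx comparisons reported in Table 5.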

Conclusions
The black cutworm A. ipsilon is a destructive pest of many crops [66,67] and is mainly controlled by chemical pesticides, which has led to the development of resistance to various compounds [68]. Our study provides information and resources to identify and facilitate functional studies of genes responsible for pheromone production, transport and degradation at the molecular level both in vivo and in vitro. By deep sequencing of the A. ipsilon sex pheromone gland transcriptome, we have identified 42 transcripts encoding enzymes putatively involved in pheromone production. This is the first study reporting the key enzyme Δ11-desaturase involved in A. ipsilon sex pheromone biosynthesis. One new transcript (DES-JX989154) encoding a desaturase is highly abundant in the transcriptome and highly expressed in the pheromone gland, suggesting that the desaturase encoded by DES-JX989154 or by other newly identified transcripts (DES-JX989155 and DES-JX989156) may play important roles in A. ipsilon sex pheromone biosynthesis. They may contribute to introducing a double bond at the C11 and C12 positions of the saturated fatty acid precursor palmitic acid for the production of pheromone precursors. Further studies are needed to confirm the substrates and products, and thus the involvement of these desaturases and other newly identified genes, such as those encoding aldehyde reductases and acetyltransferases, in A. ipsilon sex pheromone biosynthesis. Two of the CSP transcripts (AipsCSP2 and AipsCSP8) are highly abundant, with 100- and 10-fold higher transcription levels in the pheromone gland, respectively, than in the body. Furthermore, AipsCSP2 and AipsOBP6 are pheromone gland-specific and -enriched, respectively (Figure 9 and Figure 10). This suggests a functional role of the PG-enriched CSPs and OBPs in sex pheromone transport and release. It is clear that during perireceptor events, after pheromones and odorants enter the sensillum lymph, the antenna-specific odorant binding proteins (OBPs) capture these hydrophobic pheromones and odorants and deliver them to the membrane-bound olfactory receptors (ORs) [35]. Further study of these PG-expressed OBPs, especially their binding to sex pheromone components, is needed to confirm their function.

Insect material
The A. ipsilon colony has been reared in our laboratory (State Key Laboratory for Biology of Plant Diseases and Insect Pests, Chinese Academy of Agricultural Sciences, Beijing, China) since 2006 with field-collected moths introduced each summer to prevent inbreeding effects. The larvae were reared on an artificial diet comprising wheat germ, casein and sucrose as the main components. The colony was kept at 24°C with 75% relative humidity and a 14h:10h light:dark photoperiod. Pupae were sexed and kept separately in hyaline plastic cups before emergence. Adult moths were given 20% honey solution after emergence.

Pheromone gland dissection
The pheromone gland plus associated ovipositor valves and parts of the terminal abdominal segments were dissected with fine scissors [39] from the rest of the body parts, referred to as 'body', which comprises heads, thoraxes, legs, wings and abdomens (without the pheromone glands). The calling behavior of female A. ipsilon moths begins on the first night after eclosion and increases sharply, peaking on the third night [38]. In order to cover all genes involved in pheromone biosynthesis, four glands of 1-day-old females, four glands of 2-day-old females and ten glands of 3-day-old females were therefore dissected during the second half of the scotophase, which is reported to be the calling period of this moth [69][70][71]. The eighteen glands were pooled in one RNase-free centrifuge tube for total RNA extraction and frozen in liquid nitrogen until further processing.

RNA extraction and cDNA library construction
Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. The quantity of RNA was determined using a Nanodrop spectrophotometer (Thermo Scientific, Wilmington, DE, USA) and 1.1% agarose gel electrophoresis. About 500 ng mRNA was further purified from 50 μg total RNA using the polyATtract mRNA isolation system III (Promega, Madison, WI, USA). The mRNA was then sheared into fragments of about 800 nucleotides using an RNA fragmentation solution (Autolab, Beijing, China) at 70°C for 30 sec, and then cleaned and concentrated using the RNeasy MinElute Cleanup kit (Qiagen, Valencia, CA, USA). The mRNA was used as a template for first-strand cDNA synthesis using N6 random primers and MMLV reverse transcriptase (TaKaRa, Dalian, China) and the second strands were synthesized using Secondary Strand cDNA synthesis enzyme mixtures (Autolab, Beijing, China). cDNAs of appropriate length were purified with the QIAquick PCR Purification kit (Qiagen, Valencia, CA, USA) and eluted with 10 μl Elution Buffer. After blunt ending and the addition of a poly-A tail at the 3' end according to Roche's Rapid Library Preparing protocols (Roche, USA), the purified cDNAs were linked to GS-FLX sequencing adaptors (Roche, USA). Finally, the cDNAs shorter than 500 bp were removed using Ampure Beads according to the manufacturer's instructions (Beckman, USA) before the preparation of the cDNA library.

Sequencing
Pyrosequencing of the cDNA library was performed by Beijing Autolab Biotechnology Company using a 454 GS-FLX sequencer (Roche, IN, USA). All sequencing reads were deposited into the Short Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under the accession number SRX189143.

Sequence analysis and assembly
Base calling of the raw 454 reads in SFF files was carried out using the Python script sff_extract.py developed by COMAV (http://bioinf.comav.upv.es). All raw reads were then processed to remove low-quality and adaptor sequences using the programs TagDust [72], LUCY [73] and SeqClean [74] with default parameters. The resulting sequences were then screened against the NCBI UniVec database (http://www.ncbi.nlm.nih.gov/VecScreen/UniVec.html) to remove possible vector sequence contamination. Cleaned reads shorter than 60 bases were discarded because they are likely to be sequencing artifacts [75].
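The final length filter described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in this study: the function names `parse_fasta` and `filter_short_reads` are hypothetical, and the sketch assumes the cleaned reads are available as FASTA text.

```python
MIN_READ_LENGTH = 60  # from the text: cleaned reads shorter than 60 bases are discarded


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_short_reads(records, min_length=MIN_READ_LENGTH):
    """Keep only reads with at least min_length bases (hypothetical helper)."""
    return [(h, s) for h, s in records if len(s) >= min_length]
```

A read of exactly 60 bases is kept, since only reads shorter than 60 bases are removed.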
The clean reads were assembled in two steps. First, MIRA3 [76] was used with a minimum sequence overlap of 30 bp and a minimum overlap identity of 80%. Then, CAP3 was used with an overlap length cutoff >30 and an overlap percent identity cutoff >90% [77]. The resulting contigs and singletons of more than 100 bases were retained as unigenes and annotated as described below.
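To make the two sets of overlap thresholds concrete, the following toy Python sketch checks whether a candidate overlap would satisfy each stage. It is not how MIRA3 or CAP3 work internally: real assemblers compute gapped alignments, whereas this sketch assumes two equal-length, ungapped overlap strings, and all function names are hypothetical.

```python
def overlap_identity(a, b):
    """Percent identity of two equal-length, ungapped overlap strings."""
    if len(a) != len(b):
        raise ValueError("overlap strings must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def passes_first_stage(a, b):
    # Thresholds stated for the MIRA3 step: overlap >= 30 bp, identity >= 80%.
    return len(a) >= 30 and overlap_identity(a, b) >= 80.0


def passes_second_stage(a, b):
    # Thresholds stated for the CAP3 step: overlap > 30, identity > 90%.
    return len(a) > 30 and overlap_identity(a, b) > 90.0
```

Under these assumptions, a 40-base overlap with four mismatches (90% identity) would pass the first stage but not the stricter second stage.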

Homology searches and functional classification
Following the assembly, homology searches of all unigenes were performed using the BLASTx and BLASTn programs against the GenBank non-redundant protein (nr) and nucleotide sequence (nt) databases at NCBI [78]. Matches with an E-value less than 1.0E-5 were considered significant [79]. Gene names were assigned to each unigene based on the BLASTx hit with the highest score.
Gene Ontology terms were assigned by Blast2GO [80] through the BLASTx program with an E-value less than 1.0E-5. Then, WEGO [81] software was used to assign each GO ID to the related ontology entries. The longest open reading frame (ORF) of each unigene was determined with an ORF finder tool (http://www.ncbi.nlm.nih.gov/gorf/gorf.html).
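The selection rule above (keep hits with an E-value below 1.0E-5, then name each unigene after its highest-scoring hit) can be sketched as follows. The sketch assumes the BLAST results were exported in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap openings, query start and end, subject start and end, E-value, bit score); the text does not state which output format was used, and `best_hits` is a hypothetical helper.

```python
E_VALUE_CUTOFF = 1.0e-5  # significance threshold stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Map each query to (subject, evalue, bitscore) of its top-scoring significant hit."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= cutoff:
            continue  # not significant under the stated cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Unigenes without any hit below the cutoff are simply absent from the returned dictionary.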
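For readers unfamiliar with ORF detection, a simplified six-frame search is sketched below. It is not the NCBI tool's implementation: the sketch assumes an ORF runs from an ATG start codon to the first in-frame stop codon (TAA, TAG or TGA) under the standard genetic code, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return the longest ATG...stop ORF (stop included) over six frames, or '' if none."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            # Walk the frame codon by codon.
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None
    return best
```

Partial ORFs that lack a stop codon, which are common in fragmentary unigenes, are ignored by this simplified version.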

Sequence analyses
The putative N-terminal signal peptides and most likely cleavage sites were predicted by the SignalP V3.0 program [83] (http://www.cbs.dtu.dk/services/SignalP/). Sequence alignments were done with ClustalX 1.83 [84] with default gap penalty parameters of gap opening 10 and extension 0.2.

RT-PCR and qRT-PCR
The cDNAs from female pheromone glands and from other body parts (a mixture of heads, thoraxes, legs, wings and abdomens without the pheromone glands) were synthesized using PrimeScript RT Reagent with gDNA Eraser (TaKaRa, Dalian, China). 200 ng cDNA was used as the template for RT-PCR and qRT-PCR. Specific primer pairs for RT-PCR analysis were designed with Primer 3 (http://frodo.wi.mit.edu/) or Primer Premier 5 (see Additional file 4). To test the integrity of the cDNA templates, a pair of control primers for the β-actin gene (GenBank Acc. JQ822245) of A. ipsilon was used. The PCR cycling profile was: 95°C for 2 min, followed by 35 cycles of 95°C for 30 sec, 60°C for 30 sec and 72°C for 1 min, and a final extension at 72°C for 10 min. PCR products were separated in 1.2% agarose gels and stained with ethidium bromide. Each reaction was done at least six times with three biological replicates.
qRT-PCR analysis was conducted using the ABI 7500 Real-Time PCR System (Applied Biosystems, Carlsbad, CA). The primers were designed with Beacon Designer 7.90 (PREMIER Biosoft International) (see Additional file 5). Two reference genes, β-actin (GenBank Acc. JQ822245) and ribosomal protein S3 (GenBank Acc. JQ822246), were used for normalizing expression of the target genes and correcting for sample-to-sample variation. qRT-PCRs were done in a 25 μl reaction containing 12.5 μl of Platinum SYBR Green qPCR SuperMix-UDG (Invitrogen, Shanghai, China), 0.5 μl of each primer (10 pmol/μl), 0.5 μl of Rox Reference Dye, 1 μl of sample cDNA (200 ng/μl) and 10 μl of sterilized H2O. The cycling parameters were: 50°C for 2 min, 95°C for 2 min, followed by 40 cycles of 95°C for 15 sec and 60°C for 30 sec. Then, to measure the dissociation curves, the PCR products were heated to 95°C for 15 sec, cooled to 60°C for 1 min, heated to 95°C for 30 sec and cooled to 60°C for 15 sec. Negative controls, without either template or reverse transcriptase, were included in each experiment. To check reproducibility, each qRT-PCR reaction for each sample was carried out in three technical replicates and three biological replicates.
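Normalization against two reference genes can be illustrated with a short Python sketch. The calculation method is not specified in this section, so the sketch makes explicit assumptions: relative expression is computed by the 2^-ΔΔCt approach, amplification efficiency is taken as 100%, and the two reference genes are combined by averaging their Ct values. The function names and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Target Ct minus the mean Ct of the reference genes (assumed normalization)."""
    return target_ct - mean(reference_cts)


def fold_change(gland, body):
    """Relative expression of gland versus body by 2^-ddCt (assumed method).

    Each argument is a tuple (target_ct, [ref_ct_1, ref_ct_2]), where the
    reference Ct values would correspond to beta-actin and ribosomal protein S3.
    """
    ddct = delta_ct(*gland) - delta_ct(*body)
    return 2.0 ** (-ddct)
```

With made-up values, a target Ct of 20.0 in the gland and 25.0 in the body, and reference Ct values of 18.0 and 19.0 in both tissues, give a ΔΔCt of -5 and therefore a 32-fold higher level in the gland.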