Transcriptome analysis of the almond moth, Cadra cautella, female abdominal tissues and identification of reproduction control genes

Abstract

Background

The almond moth, Cadra cautella, is a destructive pest of stored food commodities, including dates, that causes severe economic losses for the farming community worldwide. To date, no genetic information on the molecular mechanisms of its reproduction is available. Thus, transcriptome analysis of C. cautella female abdominal tissues was performed via next-generation sequencing (NGS) to identify the genes responsible for reproduction.

Results

The NGS was performed with an Illumina HiSeq 2000 sequencer (Beijing Genomics Institute: BGI). The transcriptome data comprised 9,804,804,120 nucleotides, and their assembly resulted in 62,687 unigenes. Functional annotation against different databases annotated 27,836 unigenes in total. The transcriptome data of C. cautella female abdominal tissue were submitted to the National Center for Biotechnology Information (accession no: PRJNA484692). The transcriptome analysis yielded several genes responsible for C. cautella reproduction, including six Vg gene transcripts. Among the six Vg gene transcripts, only one was highly expressed, with an FPKM value (fragments per kilobase per million mapped reads) of 3234.95, much higher than that of the other five transcripts. The large differences in the expression level of the six Vg transcripts were confirmed by RT-PCR using gene-specific primers; expression was observed for only one transcript, which was named CcVg.

Conclusions

This is the first study to explore C. cautella reproduction control genes, and it may support the exploration of the reproduction mechanism in this pest at the molecular level. The NGS-based transcriptome pool is valuable for functional genomics studies and will support the design of biotech-based management strategies for C. cautella.

Background

Date palm, Phoenix dactylifera, is an important fruit tree of the Arabian Peninsula and temperate regions worldwide [1]. In hot dry regions globally, dates have a long history and are considered one of the most important nutritional fruits. Dates can be consumed in many ways: eaten directly as fresh or dried dates, or used in the preparation of date cookies, date paste, date syrup, and many other products. Additionally, dates have considerable medicinal value as they are a rich source of minerals [2]. The presence of amino acids, flavonoids, steroids, and antioxidant, anti-inflammatory, and anticancer compounds in the flesh highlights the medicinal and nutritional importance of dates [3, 4]. The by-products of dates are used for the production of organic acids, antibiotics, and fermented yeast. In the Gulf region, the populace prefers to consume a certain quantity of dates [5].

Several devastating pests can infest date fruits, causing great economic losses. These pests include the almond moth, Cadra cautella (Walker) (Lepidoptera: Pyralidae), and the sawtoothed grain beetle, Oryzaephilus surinamensis [1]. In the Middle East as well as in many other regions of the world, C. cautella is a destructive polyphagous storage pest of date fruits, cereals, dried fruits, ground nuts, and maize [6,7,8]. The life cycle of C. cautella is short, with many generations per year, and a single female can produce 213 and 422 eggs when reared on artificial diet and “khodari” date fruits, respectively [7, 9,10,11,12].

The moth C. cautella infests date fruits both in the field and in warehouses and reduces the quantity and quality of dates, which leads to trade restrictions. Many countries enforce strict quarantine limitations, which restrict the world trade in agricultural produce [13]. The control of C. cautella mostly depends on fumigation with methyl bromide and phosphine gas, which are effective and inexpensive and have been widely applied over the last few decades. However, the use of such treatments has recently been questioned because excessive use of these chemicals raises environmental and human health concerns, and because phosphine resistance has been reported in several stored-product insect species [14,15,16]. In addition, methyl bromide, which was an efficient and cost-effective fumigant, has been declared an ozone-depleting chemical and has been phased out of production and use [17].

Several studies have reported on the basic ecological and biological characteristics of C. cautella [11, 18,19,20]. However, the molecular mechanism of its reproduction remains unknown, and there is an urgent need to develop environmentally friendly strategies to manage this serious pest. Over the last two decades, genomes of different insects have been sequenced, and genes related to reproduction, physiology, and sex pheromone biosynthesis and their receptors have been intensively studied [21,22,23,24,25,26]. Thus, the objective of the present study was to identify the reproduction control genes, especially vitellogenin (Vg), through transcriptome data analysis. Vg is the key component of egg yolk protein; it is synthesized extra-ovarially in the fat body tissues and transported to the developing oocytes, where it is internalized by the vitellogenin receptor (VgR) and serves as a nutrient source for the developing embryo. Vg and VgR have been reported at the genetic and molecular level in many insect species [21, 22, 27,28,29,30,31].

The transcriptome is the entire set of transcripts in a cell, tissue, or organism. De novo transcriptome sequencing is a method of creating a transcriptome profile via the Illumina HiSeq 2000/2500 platform [32]. Next-generation sequencing (NGS) can extensively explore the structure and provide an indication of the functional role of a particular gene product in a given tissue without the aid of any reference genome [33, 34]. NGS is an analytical technique that sequences RNA molecules with a large number of reads [35,36,37]. Transcriptome analysis has been used to study fatal diseases in humans, plants, and other organisms [38,39,40]. Transcriptomes from many insect species have been sequenced, such as the silkworm, Bombyx mori, the red flour beetle, Tribolium castaneum, and the oriental fruit fly, Bactrocera dorsalis [41,42,43].

Sequencing of the C. cautella abdominal tissue transcriptome would clarify the reproduction strategies of this pest at the molecular level.

To the best of our knowledge, the present study is the first to report on the transcriptome analysis of C. cautella abdominal tissues, and it provides evidence-based knowledge to facilitate the development of future eco-friendly management strategies for this pest.

Results

Cadra cautella transcriptome sequencing and sequence assembly

A library of C. cautella adult female abdominal tissue was sequenced on the Illumina HiSeq 2000 system. The raw reads generated were cleaned with filter-fq software (version: internal filter_fq software of BGI). The de novo assembly produced 62,687 unigenes. The total length, average length, and N50 of the unigenes are presented in Additional file 1: Table S1.
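For readers unfamiliar with the N50 statistic, the following is a minimal sketch of how it can be computed from a list of unigene lengths. The function name `n50` and the example lengths are illustrative assumptions; they are not the values reported in Table S1.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical unigene lengths in bp (illustration only, not study data).
example_lengths = [200, 300, 500, 800, 1200, 2000]
example_n50 = n50(example_lengths)
print(example_n50)
```

Because the longest sequences are counted first, a larger N50 generally indicates that more of the assembly sits in long unigenes.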

Structural and functional annotation of unigenes

For functional annotation analysis, 25,880, 15,432, 17,738, 16,106, 8828, and 9494 unigenes were annotated to the NR, NT, Swiss-Prot, KEGG, COG, and GO databases, respectively. The total number of annotated unigenes was 27,836 (Table 1). For protein coding region prediction, the number of coding DNA sequences (CDS) that mapped to the protein databases was 25,715, whereas the number of predicted CDS was 2719 (Additional file 3: Table S2).
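The counts above can be turned into annotation rates with simple arithmetic. The sketch below uses only numbers stated in the text (62,687 unigenes in total); the variable and function names are illustrative.

```python
TOTAL_UNIGENES = 62_687  # unigenes from the de novo assembly

# Annotated unigene counts per database, as reported in the text.
annotated = {
    "NR": 25_880,
    "NT": 15_432,
    "Swiss-Prot": 17_738,
    "KEGG": 16_106,
    "COG": 8_828,
    "GO": 9_494,
    "Any database": 27_836,
}


def percent(count, total=TOTAL_UNIGENES):
    """Share of all unigenes, in percent."""
    return 100 * count / total


rates = {db: percent(n) for db, n in annotated.items()}
for db, rate in rates.items():
    print(f"{db:>12}: {rate:.2f}%")

# CDS mapped to protein databases plus CDS predicted for the remaining unigenes.
cds_total = 25_715 + 2_719
print("Total CDS:", cds_total)
```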

Table 1 Summary of annotated unigenes obtained from Cadra cautella female abdominal tissue transcriptome analysis

Among the unigenes, 6789, 2, 13, and 36 were annotated exclusively to the NR, COG, KEGG, and Swiss-Prot protein databases, respectively, with 1297 unigenes annotated using both the NR and KEGG databases. In addition, 42 unigenes were commonly annotated using the NR, COG, and KEGG databases whereas no unigenes were commonly annotated using the KEGG and COG protein databases. Furthermore, 8401 common elements were annotated in the NR, COG, KEGG, and Swiss-Prot databases (Fig. 1).
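Counts of this kind (exclusive to one database, shared by a specific combination, or common to all) are set operations on the lists of annotated unigene IDs. The sketch below shows the idea with Python sets; the unigene IDs and the helper names `exclusive_to` and `only_in` are hypothetical and do not reproduce the study data.

```python
# Hypothetical unigene IDs annotated by each database (illustration only).
hits = {
    "NR": {"U1", "U2", "U3", "U4"},
    "COG": {"U3", "U4"},
    "KEGG": {"U2", "U3", "U5"},
    "Swiss-Prot": {"U3", "U6"},
}


def exclusive_to(db, hits):
    """Unigenes annotated by `db` and by no other database."""
    others = set().union(*(ids for name, ids in hits.items() if name != db))
    return hits[db] - others


def only_in(dbs, hits):
    """Unigenes annotated by exactly the databases listed in `dbs`."""
    inside = set.intersection(*(hits[d] for d in dbs))
    outside = set().union(*(ids for name, ids in hits.items() if name not in dbs))
    return inside - outside


common_to_all = set.intersection(*hits.values())

print(exclusive_to("NR", hits))
print(only_in(("NR", "KEGG"), hits))
print(common_to_all)
```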

Fig. 1

Schematic presentation of Cadra cautella female abdominal tissue transcripts annotated in different protein databases (e-value < 0.00001)

A total of 27,836 unigene sequences shared some similarity to known genes from the National Center for Biotechnology Information (NCBI) database. The ranges in e-value and sequence similarity of the top hits in the NR database were comparable, with 49% (e-value of 0 to 60) and 28.5% (100–80%), respectively, of the sequences possessing homology (Fig. 2a, b). On a species basis, the highest proportion of matching sequences in the NR database was derived from Bombyx mori (45.59%), followed by Danaus plexippus (31%) (Fig. 2c).

Fig. 2

Distribution of e-values, sequence similarity, and species for unigene hits against the non-redundant protein (NR) database

Functional annotation was assigned using the protein (NR and Swiss-Prot), COG, and GO databases. BLASTX was employed to identify related sequences in the protein databases. The COG database classifies proteins from completely sequenced genomes on the basis of orthology. The COG analysis permitted the functional classification of 8828 of the unigenes. Among these, the most frequently recognized classes were “general function” (3636, 41.18%), followed by “replication, recombination, and repair” (1816, 20.57%), “translation, ribosomal structure, and biogenesis” (1562, 17.69%), “function unknown” (1342, 15.20%), “transcription” (1278, 14.47%), and “posttranslational modification, protein turnover, and chaperones” (1237, 14.01%) (Fig. 3).
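The class shares can be checked against the 8828 COG-annotated unigenes. The short sketch below uses the counts from the text; it also shows that the six classes alone add up to more than 8828, which suggests (as an interpretation, not a statement from the study) that a unigene can be assigned to more than one COG class.

```python
COG_ANNOTATED = 8_828  # unigenes with a COG classification

# Counts for the six most frequent COG classes, as reported in the text.
cog_counts = {
    "general function": 3_636,
    "replication, recombination, and repair": 1_816,
    "translation, ribosomal structure, and biogenesis": 1_562,
    "function unknown": 1_342,
    "transcription": 1_278,
    "posttranslational modification, protein turnover, and chaperones": 1_237,
}

shares = {name: 100 * n / COG_ANNOTATED for name, n in cog_counts.items()}
assignments = sum(cog_counts.values())

for name, share in shares.items():
    print(f"{share:6.2f}%  {name}")
print("Class assignments in the top six classes:", assignments)
print("Exceeds number of annotated unigenes:", assignments > COG_ANNOTATED)
```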

Fig. 3

COG functional classification of unigenes from Cadra cautella female abdominal tissue transcriptome. The horizontal coordinates represent the functional classes identified using COG analysis and the vertical coordinates show the numbers of unigenes in each class. The functions of each class are provided in the notation on the right

Functionally categorized unigenes of C. cautella were assigned GO terms [44]. The unigenes were placed in three main GO categories: biological process (34,770, 55.46%), cellular component (17,661, 28.17%), and molecular function (11,232, 17.91%). These GO terms were further divided into 62 sub-categories. Within the “biological process” ontology, the three most common terms were “biogenesis” (5521, 15.27%), “metabolic process” (5177, 14.88%), and “single-organism process” (4731, 13.60%). At the level of cellular components, the three most common terms were “cell part” (3714 unigenes, 21.02%), “cell” (3714 unigenes, 21.02%), and “organelle” (2637 unigenes, 14.93%). Within the ontology of molecular functions, “catalytic activity” (4574, 40.72%) and “binding” (4380, 38.99%) proteins made up the majority of the unigenes (Fig. 4).

Fig. 4

GO functional classification of unigenes identified from Cadra cautella female abdominal tissue transcriptome. The horizontal coordinates represent the functional classes identified using GO analysis and the vertical coordinates show the numbers of unigenes in each class

Protein coding region prediction

Unigenes were aligned by BLASTX (e-value < 0.00001) to protein databases in the following order: NR, Swiss-Prot, KEGG, and COG. The highest-ranking proteins in the BLAST results were used to determine the coding region sequences of the unigenes, and the coding region sequences were translated into amino acid sequences. Unigenes that could not be aligned to any database were scanned by ESTScan (version V3.0.2) to predict the protein coding region and to determine the sequence direction (5′ → 3′). The number of CDS that mapped to the protein databases was 25,715, whereas ESTScan predicted CDS for 2719 unigenes. The total number of CDS obtained in the study was 28,434 (Additional file 3: Table S2). Prediction of the protein coding region is important for determining gene function, because genes contain both introns and exons, and the exons are the only segments of a gene that carry the code for protein formation. The length distributions of the protein-coding sequences and of the ESTScan sequences from the Cadra cautella female abdominal tissue transcriptome are presented in Figs. 5 and 6.
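The database priority and the ESTScan fallback described above can be sketched as a small selection rule, followed by translation of the chosen coding sequence. This is an illustrative sketch only: the function names, the hit records, and the use of the standard genetic code are assumptions, and ESTScan itself is not invoked here.

```python
# Priority order stated in the text; the first database with a hit wins.
DB_PRIORITY = ["NR", "Swiss-Prot", "KEGG", "COG"]


def choose_cds_source(best_hits):
    """best_hits maps database name -> top-hit record (absent if no hit).
    Returns (source, hit); falls back to ESTScan when nothing aligned."""
    for db in DB_PRIORITY:
        hit = best_hits.get(db)
        if hit is not None:
            return db, hit
    return "ESTScan", None


# Standard genetic code in the usual TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence (5' -> 3'); unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


# Hypothetical unigene with hits in KEGG and COG only.
print(choose_cds_source({"KEGG": "kegg_hit_1", "COG": "cog_hit_1"}))
print(choose_cds_source({}))
print(translate("ATGGCCTAA"))
```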

Fig. 5

Length distribution of protein-coding sequence from Cadra cautella female abdominal tissue transcriptome. The horizontal axis shows the length and the vertical axis shows the numbers of unigenes with a given length

Fig. 6

Length distribution of ESTScan sequences from Cadra cautella female abdominal tissue transcriptome. The horizontal axis shows the length while the vertical axis shows the numbers of unigenes with a given length

Most highly abundant transcripts in the Cadra cautella female abdominal tissue

The transcripts that were most highly expressed in the C. cautella adult female abdominal tissues are presented in Table 2. The highly abundant transcripts were yolk polypeptide 2 and follicular epithelium yolk protein subunits with FPKM values of 19,538.56 and 6939.47, respectively. Moreover, apolipophorin III and Vg genes were also among the highly expressed transcripts in the C. cautella female abdominal tissue with 4262.26 and 3234.95 FPKM values, respectively. The abundance of the reproduction control genes and yolk polypeptide encoding transcripts in the data reflects their key role in the development of future embryos inside the eggs.
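For context, FPKM normalizes the number of fragments mapped to a transcript by transcript length (in kilobases) and by sequencing depth (in millions of mapped fragments). The sketch below implements that standard definition; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2000 bp transcript,
# in a library with 10 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 10_000_000)
print(example_fpkm)
```

Because of the length and depth normalization, FPKM values can be compared between transcripts within the same library, which is how the abundance ranking in Table 2 is read.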

Table 2 Most highly abundant transcripts detected by transcriptome analysis in the Cadra cautella adult female abdominal tissue

Identification of reproduction control genes from Cadra cautella female abdominal tissue

By means of BLASTX, approximately 57 genes potentially responsible for C. cautella reproduction were identified from the transcriptome analysis of female abdominal tissue. The genes identified included Vg, VgR, the lipid carrier protein apolipophorin, hexamerins (proteins carrying sulfur-containing amino acids that enhance vitellogenesis), and the eggshell protein chorion. All of these genes were submitted to NCBI and their accession numbers obtained (see Table 3). The details regarding FPKM values, BLAST hit score, putative identification of the gene, and resemblance to closely related species are presented in Table 3. There were also transcripts encoding important proteins and enzymes that play a role in development. The identification of the juvenile hormone and ecdysone receptor may be an important addition to the study of reproductive development in this pest, because these two genes are responsible for regulating many aspects of arthropod life cycles. Insect development and reproduction are mainly linked to the fluctuating levels of juvenile hormone and ecdysone.

Table 3 Putative reproduction control genes obtained from transcriptome analysis of Cadra cautella adult female abdominal tissue

Identification of Vg genes from Cadra cautella transcriptome data and validation by RT-PCR

The C. cautella transcriptome data provided six partial Vg gene transcripts. One of these transcripts was far more highly expressed, with an FPKM value of 3234.95, than the other five Vg transcripts (FPKM values of 6.343, 3.34, 1.13, 0.83, and 0.057). These transcripts were designated CcVg, CcVg like 1, CcVg like 2, CcVg like 3, CcVg like 4, and CcVg like 5. Information on the length and composition of the six transcripts identified in the transcriptome assembly is given in Additional file 4: Table S3. It was important to check how many of the Vg transcripts were functional in C. cautella. Therefore, the expression of all Vg transcripts was verified by RT-PCR using gene-specific primers (Additional file 5: Table S4). The gene-specific primers were designed from the partial transcripts identified in the transcriptome assembly using Primer3 software (http://bioinfo.ut.ee/primer3-0.4.0/). The amplified cDNA was sequenced and aligned with the six Vg transcripts using BioEdit Sequence Alignment Editor; the amplified sequence was identical to the partial sequence of the CcVg transcript. This indicates that CcVg had a much higher expression level than the other five Vg transcripts (roughly 500-fold or more, based on the FPKM values), and it might be the primary functional Vg gene in C. cautella (Fig. 7).
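The size of the expression gap can be computed directly from the reported FPKM values. In the sketch below, the FPKM numbers come from the text, while the pairing of each value with a particular "CcVg like" name follows the order of listing and is an assumption.

```python
# FPKM values from the text; the name-to-value pairing is assumed from list order.
vg_fpkm = {
    "CcVg": 3234.95,
    "CcVg like 1": 6.343,
    "CcVg like 2": 3.34,
    "CcVg like 3": 1.13,
    "CcVg like 4": 0.83,
    "CcVg like 5": 0.057,
}

# Fold difference of CcVg relative to each of the other five transcripts.
fold_vs_ccvg = {
    name: vg_fpkm["CcVg"] / value
    for name, value in vg_fpkm.items()
    if name != "CcVg"
}

for name, fold in fold_vs_ccvg.items():
    print(f"CcVg is about {fold:,.0f} times higher than {name}")

smallest_fold = min(fold_vs_ccvg.values())
largest_fold = max(fold_vs_ccvg.values())
```

On these numbers the ratio ranges from about 510 (against the second most abundant transcript) to more than 56,000 (against the least abundant one).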

Fig. 7

Confirmation of Cadra cautella Vg gene transcripts. Cadra cautella Vg gene transcripts identified by next-generation sequencing were confirmed by reverse transcription polymerase chain reaction (RT-PCR). A 1.2% agarose gel was used to analyze the amplified PCR products. The sizes of the CcVg and actin amplified products are shown on the right. The amplified bands were visualized under ultraviolet light and photographed using the gel documentation system BioDocAnalyze (Biometra). M = molecular weight marker, bp = base pairs

Discussion

The order Lepidoptera is one of the most important groups of insect pests, which cause severe losses to agricultural products worldwide. The majority of lepidopterans (approximately 90%) are moths, with their caterpillars in particular being notorious pests of agricultural produce. Approximately 70% of moths are linked to stored product infestations. The almond moth, C. cautella (Walker), is an economically important pest of dates [6, 12, 45]. Recent studies have focused on its biology and ecology, and have proposed several management strategies to control this pest, including the use of botanical extracts [46], heat treatments [47], freezing [48], essential oil extracts [49, 50], and modified atmospheres [12, 51]. However, due to a lack of genetic information, nothing is known about the reproductive mechanism of this economically important pest. Thus, the objective of the present study was to isolate the reproduction control genes from C. cautella using the NGS approach.

Illumina NGS sequencing of C. cautella yielded 62,687 unigenes, of which 44.4% (27,836) showed significant homology to known protein-coding genes in BLASTX analysis against GenBank. The analysis of unigene homology indicated that 45.59% of the unigenes showed the highest resemblance to Bombyx mori sequences, followed by 31% to D. plexippus. These results suggest that C. cautella has a closer relationship to Bombyx mori and D. plexippus than to other lepidopteran members [52]. Bombyx mori is an extremely significant model organism for insect biology in particular and for other life sciences in general. The species distribution of C. cautella unigenes was largely in accordance with the transcriptome analysis results of other lepidopteran species such as Galleria mellonella and Heliothis virescens [53, 54].

In the present study, 8828 unigenes were annotated using the COG database. In the COG analysis, the most frequently identified class was general function prediction, followed by replication, recombination, and repair; translation, ribosomal structure, and biogenesis; function unknown; transcription; and posttranslational modification (Fig. 3). The general function prediction class (3636 unigenes, 41.18%) was the largest COG class, which was similar to the results of Shen et al. [42] and Yan et al. [55].

We surveyed our transcriptome data and identified several important enzymes and genes involved in reproduction. The Vg, VgR, lipophorin, lipophorin receptor, apolipophorins, doublesex, transmembrane protein, juvenile hormone esterase, ecdysone oxidase, rab5, and many others were identified (Table 3). In the present study, 57 genes encoding proteins vital for reproduction have been submitted to the NCBI genomic database and their accession numbers obtained (Table 3).

The Vg gene plays a major role in insect reproduction and proliferation. The sex, tissue, and stage specificity of Vg has been reported in many insect species [30, 56]. Vg gene expression in female fat body tissues and the presence of Vg protein in adult female hemolymph and ovariole extracts have been reported in the American cockroach, Periplaneta americana [21], the Madeira cockroach, Leucophaea maderae [28], and the oriental leafworm moth, Spodoptera litura [57]. Different insect species have different numbers of Vg genes [58]. In some insects there is one Vg gene, whereas others have two or more. Multiple Vg genes have been described from numerous insect species, including Aedes aegypti [59], the brown-winged green bug, Plautia stali [60], Periplaneta americana [21, 27], and Leucophaea maderae [28, 29]. However, in lepidopteran species only one Vg transcript has been reported to date, which might yield different numbers of yolk polypeptides, as identified in pyralid moths including C. cautella, which has two true vitellogenin subunits (approximately 160 kDa and 47 kDa) [61]. We likewise report only one functional vitellogenin transcript in the present data; because of post-transcriptional modifications and cleavage of the Vg product into two subunits of different sizes, two polypeptides of different sizes can be observed. We have previously cloned and characterized several vitellogenin and vitellogenin receptor genes in different insect species and described the cleavage process of Vg in insects; for details, see [28, 56, 58].

To date, the complete Vg mRNA has been sequenced from 23 lepidopteran species; however, only one of these, the rice moth Corcyra cephalonica, is associated with stored grain infestations. Thus, the addition of C. cautella Vg/VgR and other transcripts to GenBank will increase the amount of available genomic data regarding reproductive physiology. Additionally, to date there is no report on the sequencing of the VgR from any moth species associated with stored grain infestations. The Vg protein is carried by the hemolymph to the ovaries, where it is taken up by its counterpart, the VgR, and deposited in the developing oocyte. The VgR is an important carrier for the uptake of Vg into the developing oocytes of all oviparous species [58]. Insect VgRs are large membrane-bound proteins approximately 180–214 kDa in size [62]. The molecular characterization of insect VgRs has revealed that these receptors, regardless of their origin, are extremely conserved not only in their structure but also in terms of their regulation [21, 63]. VgR plays a crucial role in insect reproduction, but little is known about this receptor in insects compared to its ligand, Vg.

The high expression of some transcripts in the C. cautella adult female abdominal tissue reflects the importance of these genes in the biological processes, physiology, and reproduction of C. cautella. Vg and apolipophorin are very important for the nourishment of the developing embryo inside the egg. Yolk polypeptide 2, follicular epithelium yolk protein, and ribosomal proteins, which play crucial roles in the reproduction of insects, were among the most abundant transcripts [64, 65]. Similarly, the lipid carrier proteins (apolipophorins) and hexamerins (proteins carrying sulfur-containing amino acids) might enhance vitellogenesis, whereas the eggshell protein (chorion), juvenile hormone (JH), and ecdysteroids play crucial roles in insect metamorphosis and reproduction [66,67,68,69,70]. In the desert locust, silencing of the ecdysone receptor affected choriogenesis and ovarian development. The effect was not limited to the disruption of oogenesis; it also affected JH biosynthesis in the corpora allata [71]. Insect development and reproduction are primarily associated with the fluctuating levels of these hormones [72, 73]. Methoprene-tolerant protein works as a nuclear receptor for JH and plays a key role in larval metamorphosis as well as in vitellogenesis in adult females. In H. armigera, knockdown of the methoprene-tolerant gene adversely affected larval development and adult female oogenesis [74].

The present study is the first to provide a wealth of genes related to the molecular mechanism of reproduction in C. cautella, a key pest of stored grains and dates.

Conclusions

The warehouse moth, C. cautella, is a serious pest of dates, both in the field and under storage conditions. The present study provides comprehensive data on reproduction control genes, including the vitally important Vg, and on genes expressed in the abdominal tissues that are related to different physiological functions, such as juvenile hormone and ecdysone receptor. Results from the present study strengthen the genetic understanding of different life processes of this pest. The availability of a large number of transcripts will provide a foundation for future studies. Although the NGS data provided six partial CcVg transcripts, RT-PCR analysis together with the high expression level in terms of FPKM values indicated that there might be one functional Vg gene (CcVg) in C. cautella. Future efforts will be directed at obtaining the full sequences of these genes and at their characterization, expression analysis, and knockdown using RNAi technology. The sequencing, characterization, and silencing of reproduction control genes will elucidate the developmental strategies of C. cautella at the molecular level and could lead toward the development of an environmentally benign strategy for the management of this key pest.

Methods

Insect rearing

The C. cautella culture was maintained at the Economic Entomology Research Unit, Department of Plant Protection, College of Food and Agriculture Sciences, King Saud University, Riyadh, Saudi Arabia. The colony was maintained in an environmental chamber (Steridium, Australia) at 25 ± 2 °C and 65 ± 5% relative humidity under a 15:9 h light/dark cycle on a slightly modified version of the artificial diet described in [75]. Wandering instar larvae were separated to pupate, and female pupae were placed separately in the growth chamber under the same conditions until tissue collection.

Tissue preparation for transcriptome analysis

One-day-old virgin adult C. cautella females were selected for tissue preparation because previous studies, using semi-quantitative and qRT-PCR, have shown that the expression of the vitellogenin receptor and its ligand is highest in one- to two-day-old females. In the silk moth, the maximum expression of Vg was reported in 24-h-old female moths. Several other studies have reported similar findings in lepidopteran species [28, 31, 57, 66]. The female moths were held gently, their wings were removed, and the last 5–7 abdominal segments were cut off with micro scissors and placed directly into phosphate buffered saline (PBS; pH 8.0) for washing [28]. Tissues were washed for 3–4 min in PBS, then transferred into a 1 mL Eppendorf tube containing liquid nitrogen and preserved at −80 °C until subsequent analysis.

RNA isolation and construction of cDNA library for transcriptome analysis

Abdominal tissues (~800 mg) of one-day-old virgin female moths were used for RNA extraction with Tri-RNA reagent (Favorgen Biotech Corp., Taiwan). The total RNA concentration, RNA integrity number, 28S/18S ratio, and size of the RNA sample were determined using an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano kit, and the purity of the sample was assessed using a NanoDrop instrument. The concentration and total volume of the RNA sample were 378 ng/μL and 70 μL, respectively, and the RNA integrity number was 5.2. The integrity of RNA was confirmed by 1% agarose gel electrophoresis.

After the confirmation of RNA integrity, total RNA was used for cDNA synthesis. The total RNA sample was digested with DNase I (New England Biolabs) and mRNA was purified with oligo-dT beads (Dynabeads mRNA purification kit, Invitrogen); the poly(A)-containing mRNA was then fragmented into 130 base pair (bp) fragments with the first-strand buffer. First-strand cDNA was generated using random hexamer primers (N6), first-strand master mix, and SuperScript II reverse transcriptase (Invitrogen) (reaction conditions: 25 °C for 10 min, 42 °C for 50 min, and 70 °C for 15 min).

For the second-strand cDNA synthesis, a second-strand master mix was added to the first-strand cDNA and the mixture was incubated for 1 h at 16 °C. AMPure XP magnetic beads were used to purify the double-stranded cDNA. The purified cDNA was treated with the End-Repair mix to repair any damaged or incompatible ends, incubated for 30 min at 20 °C, and then purified. The products were ligated with sequencing adapters and, after agarose gel electrophoresis, a suitable size range of fragments was selected for polymerase chain reaction (PCR) amplification with a PCR primer cocktail and PCR master mix at 20 °C for 20 min. Finally, PCR products were purified using AMPure XP beads, the library was quantitated, and the qualified libraries were sequenced using the Illumina HiSeq™ 2000 system.

Illumina sequencing and de novo assembly

The library was quantitated and the qualified libraries were amplified on cBot to generate clusters on the flow cell (TruSeq PE Cluster Kit V3–cBot–HS, Illumina), and the amplified flow cell was sequenced paired-end on the HiSeq 2000 system (TruSeq SBS KIT-HS V3, Illumina). Sequencing used a paired-end strategy with a read length of 50 bp (Additional file 2: Figure S1). Raw reads produced by the sequencing machine include dirty reads that contain adapters, unknown bases, or low-quality bases, which have a negative effect on the bioinformatics analysis. Therefore, the raw reads were cleaned by removing reads with adapters and reads with more than 5% unknown nucleotides using filter-fq software (version: internal filter_fq software of BGI). Transcriptome de novo assembly was carried out with the short-read assembly program Trinity (version = release-20130225) [76]. Further, the TGICL (version = v2.1) and Phrap (version = Release 23.0) software were used for the downstream processing of the large volume of RNA-sequence reads into unigenes.
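The two stated cleaning criteria (no adapter sequence, and no more than 5% unknown nucleotides) can be sketched as a simple predicate. This is not the BGI filter_fq program, whose internals are not described here; the function name, the adapter string, and the example reads are hypothetical, and low-quality filtering is omitted because no quality threshold is given in the text.

```python
def is_clean(read, adapters, max_unknown_fraction=0.05):
    """Keep a read only if it contains no adapter sequence and
    its fraction of unknown bases (N) does not exceed the threshold."""
    seq = read.upper()
    if not seq:
        return False
    if any(adapter.upper() in seq for adapter in adapters):
        return False
    return seq.count("N") / len(seq) <= max_unknown_fraction


# Hypothetical adapter and 20 bp reads, for illustration only.
ADAPTERS = ["ACGTACGTAC"]
reads = [
    "ACGTTGCAACGTTGCAACGT",  # no adapter, no N
    "ACGTNNNNACGTTGCAACGT",  # 4 of 20 bases unknown (20%)
    "TTTTACGTACGTACTTTTGG",  # contains the adapter
    "ACGTTGCAACNTTGCAACGT",  # 1 of 20 bases unknown (exactly 5%)
]

clean_reads = [r for r in reads if is_clean(r, ADAPTERS)]
print(len(clean_reads), "of", len(reads), "reads kept")
```

In a paired-end run, a production filter would normally drop both mates when either one fails; that step is left out of this sketch.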

Unigenes annotation and functional organization

In the final step, unigenes were aligned to the nucleotide database (NT) with BLASTN and to the protein databases, namely non-redundant protein (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and clusters of orthologous groups of protein (COG), with BLASTX at an e-value < 0.00001. The alignment results with the greatest sequence similarity were selected and used to annotate the unigenes. The unigenes that failed to align to any of the above-mentioned databases were analyzed with ESTScan software to determine the sequence direction and detect the coding region. Blast2GO software (version = v2.5.0) was used with the NR annotation to obtain gene ontology (GO) annotation (i.e., biological process, molecular function, and cellular component) [77].
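The e-value cutoff and best-hit selection can be illustrated on BLAST results in the common 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score). Whether the original pipeline used this output format is an assumption, as is ranking hits by bit score; the IDs and values below are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # e-value < 0.00001, as stated in the text


def best_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} keeping, for each query,
    the hit with the highest bit score among hits below the e-value cutoff."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= cutoff:
            continue  # not significant at the chosen threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical tabular BLAST lines (illustration only).
example = [
    "Unigene1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t2e-50\t350",
    "Unigene1\tprotB\t60.0\t180\t72\t0\t1\t540\t1\t180\t1e-20\t120",
    "Unigene2\tprotC\t35.0\t50\t32\t1\t1\t150\t1\t50\t0.5\t30",
]

annotation = best_hits(example)
print(annotation)
```

Here Unigene2 has no hit below the cutoff, so it would be one of the unigenes passed on to ESTScan.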

WEGO software was applied to deduce the functional classification of all annotated unigenes [78]. All unigenes were aligned with the COG database to classify and investigate their possible functions. Similarly, the KEGG pathway database was searched with the BLASTX program to predict the possible pathways in which each of the unigenes was involved.

Validation of Vg gene transcripts via reverse transcription (RT) PCR

Six transcripts of the Vg gene with different fragments per kilobase of transcript per million mapped reads (FPKM) values were identified in the C. cautella female abdominal tissue transcriptome. The six Vg transcripts were evaluated through RT-PCR with gene-specific primers designed from the six transcript sequences in the transcriptome assembly. Actin gene primers, Cc-Act-F1 and Cc-Act-R1, were used as internal controls (Additional file 5: Table S4). For validation, a cDNA library was subjected to PCR in a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, CA, USA) under the following conditions: initial denaturation at 94 °C for 1 min, followed by 32 cycles of denaturation at 94 °C for 30 s and annealing at 68 °C for 3 min. The PCR-amplified products were run on a 1.2% agarose gel, stained with ethidium bromide for 30 min, and visualized under ultraviolet light with the gel documentation system BioDocAnalyze (Biometra). The successfully amplified samples were sent to BGI for sequencing.
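Because the transcripts were compared by their FPKM values, a short sketch of the standard FPKM calculation may be helpful. It follows the definition given in the text (fragments per kilobase of transcript per million mapped reads). The function name and the input numbers are hypothetical and are not taken from this study's data.

```python
def fpkm(fragments: float, transcript_length_bp: float, total_mapped_fragments: float) -> float:
    """Compute FPKM = 10^9 * C / (N * L).

    C: fragments mapped to the transcript
    N: total mapped fragments in the library
    L: transcript length in base pairs
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 fragments on a 2,000 bp transcript
    # in a library of 10 million mapped fragments.
    print(fpkm(1000, 2000, 10_000_000))
```

The factor of 10^9 combines the per-kilobase (10^3) and per-million (10^6) scaling, so that transcripts of different lengths and libraries of different depths can be compared on the same scale.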

Availability of data and materials

The transcriptome data of C. cautella female abdominal tissue has been submitted to the National Center for Biotechnology Information (NCBI) (accession no: SRP156514) and is freely accessible.

Abbreviations

BGI:

Beijing Genomics Institute

BLAST:

Basic Local Alignment Search Tool

CcVg:

Cadra cautella vitellogenin

CDS:

Coding DNA sequence

COG:

Clusters of Orthologous Groups of protein

EERU:

Economic Entomology Research Unit

FPKM:

Fragments per kilobase of transcript per million mapped reads

GO:

Gene Ontology

KEGG:

Kyoto Encyclopedia of Genes and Genomes

NCBI:

National Center for Biotechnology Information

NGS:

Next-generation sequencing

NR:

Non-redundant protein

NT:

Non-redundant nucleotide

RT-PCR:

Reverse transcription polymerase chain reaction

References

  1. Chao CT, Krueger RR. The date palm (Phoenix dactylifera L.): overview of biology, uses, and cultivation. HortScience. 2007;42(5):1077–82.
  2. Tang ZX, Shi LE, Aleid SM. Date fruit: chemical composition, nutritional and medicinal values, products. J Sci Food Agric. 2013;93(10):2351–61.
  3. Zhang C-R, Aldosari SA, Vidyasagar PS, Nair KM, Nair MG. Antioxidant and anti-inflammatory assays confirm bioactive compounds in Ajwa date fruit. J Agric Food Chem. 2013;61(24):5834–40.
  4. Assirey EAR. Nutritional composition of fruit of 10 date palm (Phoenix dactylifera L.) cultivars grown in Saudi Arabia. J Taibah Univ Sci. 2015;9(1):75–9.
  5. Al-Shahib W, Marshall RJ. The fruit of the date palm: its possible use as the best food for the future? Int J Food Sci Nutr. 2003;54(4):247–59.
  6. Arbogast R, Chini S, Kendra P. Infestation of stored saw palmetto berries by Cadra cautella (Lepidoptera: Pyralidae) and the host paradox in stored-product insects. Fla Entomol. 2005;88(3):314–20.
  7. Husain M, Alwaneen WS, Mehmood K, Rasool KG, Tufail M, Aldawood AS. Biological traits of Cadra cautella (Lepidoptera: Pyralidae) reared on khodari date fruits under different temperature regimes. J Econ Entomol. 2017;110(4):1923–8.
  8. Keever D, Arbogast R, Mullen M. Population trends and distributions of Bracon hebetor Say (Hymenoptera: Braconidae) and lepidopterous pests in commercially stored peanuts. Environ Entomol. 1985;14(6):722–5.
  9. Bell C. Effects of temperature and humidity on development of four pyralid moth pests of stored products. J Stored Prod Res. 1975;11(3–4):167–75.
  10. Subramanyam B, Hagstrum D. Predicting development times of six stored-product moth species (Lepidoptera: Pyralidae) in relation to temperature, relative humidity, and diet. Eur J Entomol. 1993;90:51–64.
  11. Aldawood AS, Rasool KG, Alrukban AH, Sofan A, Husain M, Sutanto KD, et al. Effects of temperature on the development of Ephestia cautella (Walker) (Pyralidae: Lepidoptera): a case study for its possible control under storage conditions. Pak J Zool. 2013;45:1573–8.
  12. Husain M, Rasool KG, Tufail M, Alhamdan AM, Mehmood K, Aldawood AS. Comparative efficacy of CO2 and ozone gases against Ephestia cautella (Lepidoptera: Pyralidae) larvae under different temperature regimes. J Insect Sci. 2015;15(1):126.
  13. Follett PA, Neven LG. Current trends in quarantine entomology. Annu Rev Entomol. 2006;51:359–85.
  14. Pimentel MAG, Faroni LRDA, Tótola MR, Guedes RNC. Phosphine resistance, respiration rate and fitness consequences in stored-product insects. Pest Manag Sci. 2007;63(9):876–81.
  15. Pimentel MAG, Faroni LRDA, Batista MD, FHd S. Resistance of stored-product insects to phosphine. Pesq Agrop Brasileira. 2008;43(12):1671–6.
  16. Opit G, Phillips TW, Aikins MJ, Hasan M. Phosphine resistance in Tribolium castaneum and Rhyzopertha dominica from stored wheat in Oklahoma. J Econ Entomol. 2012;105(4):1107–14.
  17. Bouma W. Fourth meeting of the parties to the Montreal protocol. Clean Air. 1993;27(1):11.
  18. Graham W. Warehouse ecology studies of bagged maize in Kenya: the distribution of adult Ephestia (Cadra) cautella (Walker) (Lepidoptera, Phycitidae). J Stored Prod Res. 1970;6(2):147–55.
  19. Olsson P-OC, Anderbrant O, Löfstedt C. Flight and oviposition behavior of Ephestia cautella and Plodia interpunctella in response to odors of different chocolate products. J Insect Behav. 2005;18(3):363–80.
  20. Ryne C, Ekeberg M, Jonzén N, Oehlschlager C, Löfstedt C, Anderbrant O. Reduction in an almond moth Ephestia cautella (Lepidoptera: Pyralidae) population by means of mating disruption. Pest Manag Sci. 2006;62(10):912–8.
  21. Tufail M, Lee J, Hatakeyama M, Oishi K, Takeda M. Cloning of vitellogenin cDNA of the American cockroach, Periplaneta americana (Dictyoptera), and its structural and expression analyses. Arch Insect Biochem. 2000;45(1):37–46.
  22. Tufail M, Takeda M. Molecular cloning, characterization and regulation of the cockroach vitellogenin receptor during oogenesis. Insect Mol Biol. 2005;14(4):389–401.
  23. Tufail M, Takeda M. Molecular cloning and developmental expression pattern of the vitellogenin receptor from the cockroach, Leucophaea maderae. Insect Biochem Molec. 2007;37(3):235–45.
  24. Liu L, Li Y, Li S, Hu N, He Y, Pong R, et al. Comparison of next-generation sequencing systems. Biomed Res Int. 2012;2012:11. Article ID: 251364.
  25. Zhong R, Ding T-B, Niu J-Z, Xia W-K, Liao C-Y, Dou W, et al. Molecular characterization of vitellogenin and its receptor genes from citrus red mite, Panonychus citri (McGregor). Int J Mol Sci. 2015;16(3):4759–73.
  26. Antony B, Soffan A, Jakše J, Alfaifi S, Sutanto KD, Aldosari SA, et al. Genes involved in sex pheromone biosynthesis of Ephestia cautella, an important food storage pest, are determined by transcriptome sequencing. BMC Genomics. 2015;16(1):532.
  27. Tufail M, Hatakeyama M, Takeda M. Molecular evidence for two vitellogenin genes and processing of vitellogenins in the American cockroach, Periplaneta americana. Arch Insect Biochem. 2001;48(2):72–80.
  28. Tufail M, Takeda M. Vitellogenin of the cockroach, Leucophaea maderae: nucleotide sequence, structure and analysis of processing in the fat body and oocytes. Insect Biochem Molec. 2002;32(11):1469–76.
  29. Tufail M, Bembenek J, Elgendy AM, Takeda M. Evidence for two vitellogenin-related genes in Leucophaea maderae: the protein primary structure and its processing. Arch Insect Biochem. 2007;66(4):190–203.
  30. Tufail M, Naeemullah M, Elmogy M, Sharma P, Takeda M, Nakamura C. Molecular cloning, transcriptional regulation, and differential expression profiling of vitellogenin in two wing-morphs of the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae). Insect Mol Biol. 2010;19(6):787–98.
  31. Veerana M, Kubera A, Ngernsiri L. Analysis of the vitellogenin gene of rice moth, Corcyra cephalonica Stainton. Arch Insect Biochem. 2014;87(3):126–47.
  32. Faunes F, Sánchez N, Castellanos J, Vergara IA, Melo F, Larraín J. Identification of novel transcripts with differential dorso-ventral expression in Xenopus gastrula using serial analysis of gene expression. Genome Biol. 2009;10(2):R15.
  33. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5(7):621.
  34. Hegedűs Z, Zakrzewska A, Ágoston VC, Ordas A, Rácz P, Mink M, et al. Deep sequencing of the zebrafish transcriptome response to mycobacterium infection. Mol Immunol. 2009;46(15):2918–30.
  35. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, et al. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008;456(7221):470.
  36. Nagalakshmi U, Waern K, Snyder M. RNA-Seq: a method for comprehensive transcriptome analysis. Curr Protocols Mol Biol. 2010;89:1–13.
  37. Li S, Tighe SW, Nicolet CM, Grove D, Levy S, Farmerie W, et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotechnol. 2014;32(9):915–25.
  38. Wakasa Y, Oono Y, Yazawa T, Hayashi S, Ozawa K, Handa H, et al. RNA sequencing-mediated transcriptome analysis of rice plants in endoplasmic reticulum stress conditions. BMC Plant Biol. 2014;14(1):101.
  39. Bai X, Mamidala P, Rajarapu SP, Jones SC, Mittapalli O. Transcriptomics of the bed bug (Cimex lectularius). PLoS One. 2011;6(1):e16336.
  40. Rajan P, Sudbery IM, Villasevil MEM, Mui E, Fleming J, Davis M, et al. Next-generation sequencing of advanced prostate cancer treated with androgen-deprivation therapy. Eur Urol. 2014;66(1):32–9.
  41. Park Y, Aikins J, Wang L, Beeman RW, Oppert B, Lord JC, et al. Analysis of transcriptome data in the red flour beetle, Tribolium castaneum. Insect Biochem Molec. 2008;38(4):380–6.
  42. Shen G-M, Dou W, Niu J-Z, Jiang H-B, Yang W-J, Jia F-X, et al. Transcriptome analysis of the oriental fruit fly (Bactrocera dorsalis). PLoS One. 2011;6(12):e29127.
  43. Dong Q, Hu L, Zhuang L, Li R, Liu Q, Wang P. Odor discrimination by mammalian olfaction based on brain-machine interface and olfactory decoding. Sens Lett. 2014:1023–108. https://doi.org/10.1166/sl.2014.3180.
  44. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25.
  45. Navarro S, Donahaye E. Generation and application of modified atmospheres and fumigants for the control of storage insects. Fumigation and Controlled Atmosphere Storage of Grain. 1989;14:152.
  46. Ayvaz A, Sagdic O, Karaborklu S, Ozturk I. Insecticidal activity of the essential oils from different plants against three stored-product insects. J Insect Sci. 2010;10(1):21.
  47. Hulasare R, Subramanyam B, Fields P, Abdelghany A. Heat treatment: a viable methyl bromide alternative for managing stored-product insects in food-processing facilities. Julius-Kühn-Archiv. 2010;425:661.
  48. Collins D, Conyers S. The effect of sub-zero temperatures on different lifestages of Lasioderma serricorne (F.) and Ephestia elutella (Hübner). J Stored Prod Res. 2010;46(4):234–41.
  49. Chu SS, Liu O, Zhou L, Du SS, Liu ZL. Chemical composition and toxic activity of essential oil of Caryopteris incana against Sitophilus zeamais. Afr J Biotechnol. 2011;10(42):8476–80.
  50. Chu SS, Du SS, Liu ZL. Fumigant compounds from the essential oil of Chinese Blumea balsamifera leaves against the maize weevil (Sitophilus zeamais). J Chem. 2012;2013:7. Article ID: 289874.
  51. Husain M, Sukirno S, Mehmood K, Tufail M, Rasool KG, Alwaneen WS, et al. Effectiveness of carbon dioxide against different developmental stages of Cadra cautella and Tribolium castaneum. Environ Sci Pollut R. 2017;24(14):12787–95.
  52. Wang X, Li Y, Peng L, Chen H, Xia Q, Zhao P. Comparative transcriptome analysis of Bombyx mori spinnerets and Filippi’s glands suggests their role in silk fiber formation. Insect Biochem Molec. 2016;68:89–99.
  53. Vogel H, Altincicek B, Glöckner G, Vilcinskas A. A comprehensive transcriptome and immune-gene repertoire of the lepidopteran model host Galleria mellonella. BMC Genomics. 2011;12(1):308.
  54. Perera OP, Shelby KS, Popham HJ, Gould F, Adang MJ, Jurat-Fuentes JL. Generation of a transcriptome in a model lepidopteran pest, Heliothis virescens, using multiple sequencing strategies for profiling midgut gene expression. PLoS One. 2015;10(6):e0128563.
  55. Yan W, Liu L, Qin W, Li C, Peng Z. Transcriptomic identification of chemoreceptor genes in the red palm weevil Rhynchophorus ferrugineus. Genet Mol Res. 2015;14(3):7469–80.
  56. Tufail M, Takeda M. Molecular characteristics of insect vitellogenins. J Insect Physiol. 2008;54(12):1447–58.
  57. Shu Y, Zhou J, Tang W, Lu K, Zhou Q, Zhang G. Molecular characterization and expression pattern of Spodoptera litura (Lepidoptera: Noctuidae) vitellogenin, and its response to lead stress. J Insect Physiol. 2009;55(7):608–16.
  58. Tufail M, Raikhel AS, Takeda M. Biosynthesis and processing of insect vitellogenins. Progress Vitellogenesis Reprod Biol Invertebrates. 2005;12(Part B):1–32.
  59. Chen J-S, Cho W-L, Raikhel AS. Analysis of mosquito vitellogenin cDNA: similarity with vertebrate phosvitins and arthropod serum proteins. J Mol Biol. 1994;237(5):641–7.
  60. Lee JM, Hatakeyama M, Oishi K. A simple and rapid method for cloning insect vitellogenin cDNAs. Insect Biochem Molec. 2000;30(3):189–94.
  61. Shirk PD. Comparison of yolk production in seven pyralid moth species. Int J Invertebr Reprod Dev. 1987;11(2):173–87.
  62. Ferenz H-J. Yolk protein accumulation in Locusta migratoria (R. & F.) (Orthoptera: Acrididae) oocytes. Int J Insect Morphol. 1993;22(2):295–314.
  63. Tufail M, Elmogy M, Ali Fouda M, Elgendy A, Bembenek J, Trang L, et al. Molecular cloning, characterization, expression pattern and cellular distribution of an ovarian lipophorin receptor in the cockroach, Leucophaea maderae. Insect Mol Biol. 2009;18(3):281–94.
  64. Shirk PD, Bean D, Millemann AM, Brookes VJ. Identification, synthesis, and characterization of the yolk polypeptides of Plodia interpunctella. J Exp Zool Part A. 1984;232(1):87–98.
  65. Zhang W, Ma L, Xiao H, Xie B, Smagghe G, Guo Y, Liang G. Molecular characterization and function analysis of the vitellogenin receptor from the cotton bollworm, Helicoverpa armigera (Hübner) (Lepidoptera, Noctuidae). PLoS One. 2016;11(5):e0155785.
  66. Jing YP, Wang D, Han XL, Dong DJ, Wang JX, Zhao XF. The steroid hormone 20-Hydroxyecdysone enhances gene transcription through the cAMP response element-binding protein (CREB) signaling pathway. J Biol Chem. 2016;291(24):12771–85.
  67. Liu S, Li K, Gao Y, Liu X, Chen W, Ge W, Feng Q, Palli SR, Li S. Antagonistic actions of juvenile hormone and 20-hydroxyecdysone within the ring gland determine developmental transitions in Drosophila. Proc Natl Acad Sci. 2018;115(1):139–44.
  68. Roy S, Saha TT, Zou Z, Raikhel AS. Regulatory pathways controlling female insect reproduction. Annu Rev Entomol. 2018;63:489–511.
  69. Tufail M, Takeda M. Vitellogenesis and yolk proteins in insects. Encyclopedia Reprod. 2018;6:285–9.
  70. Huybrechts R. Endocrine control of reproduction. Insects Encyclopedia Reprod. 2018;6:385–92.
  71. Lenaerts C, Marchal E, Peeters P, Broeck JV. The ecdysone receptor complex is essential for the reproductive success in the female desert locust, Schistocerca gregaria. Sci Rep. 2019;9(1):15.
  72. Ramaswamy SB, Shu S, Park YI, Zeng F. Dynamics of juvenile hormone-mediated gonadotropism in the Lepidoptera. Arch Insect Biochem. 1997;35(4):539–58.
  73. Medeiros MN, Logullo R, Ramos IB, Sorgine MH, Paiva-Silva GO, Mesquita RD, et al. Transcriptome and gene expression profile of ovarian follicle tissue of the triatomine bug Rhodnius prolixus. Insect Biochem Molec. 2011;41(10):823–31.
  74. Ma L, Zhang W, Chen L, Liu C, Xu Y, Xiao H, Liang G. Methoprene-tolerant (met) is indispensable for larval metamorphosis and female reproduction in the cotton bollworm Helicoverpa armigera. Front Physiol. 2018;9:1601.
  75. Al-Azab AMA. Alternative approaches to methyl bromide for controlling Ephestia cautella (Walker) (Lepidoptera: Pyralidae) [dissertation]. Master thesis, King Faisal University, Kingdom of Saudi Arabia; 2007.
  76. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–52.
  77. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.
  78. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, et al. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006;34(suppl_2):W293–W7.

Acknowledgements

The authors thank the Research Support Service Unit (RSSU) for technical support and the Deanship of Scientific Research at King Saud University for supporting the present work through Research group No: RGP-1438-009.

Funding

MT and ASA received funding from the Deanship of Scientific Research at King Saud University for this project through “Research group No: RGP-1438-009”.

Author information

MH, MT, KGR, and ASA participated in the planning, design, and coordination of the study. MH and KM participated in C. cautella rearing, tissue preparation, practical work, data analysis, and write-up. MT and ASA supervised the work. All authors have read and approved the final version of the manuscript.

Authors’ information

Economic Entomology Research Unit, Plant Protection Department, College of Food and Agriculture Sciences, King Saud University, Riyadh 11451, Saudi Arabia.

Correspondence to Mureed Husain.

Ethics declarations

Ethics approval and consent to participate

Cadra cautella was collected from date palm orchards in the Riyadh region of Saudi Arabia. Its colony was maintained at the Economic Entomology Research Unit (EERU), Plant Protection Department, College of Food and Agriculture Sciences, King Saud University, Riyadh, Saudi Arabia. We confirm that no C. cautella specimens were collected from public parks or protected areas. Moreover, it is not an endangered species.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Husain, M., Tufail, M., Mehmood, K. et al. Transcriptome analysis of the almond moth, Cadra cautella, female abdominal tissues and identification of reproduction control genes. BMC Genomics 20, 883 (2019) doi:10.1186/s12864-019-6130-2

Keywords

  • Cadra cautella
  • Next-generation sequencing
  • Female abdominal tissues
  • Transcriptome
  • Reproduction