Comparative transcriptome analysis of soybean response to bean pyralid larvae

Background Soybean is one of most important oilseed crop worldwide, however, its production is often limited by many insect pests. Bean pyralid is one of the major soybean leaf-feeding insects in China. To explore the defense mechanisms of soybean resistance to bean pyralid, the comparative transcriptome sequencing was completed between the leaves infested with bean pyralid larvae and no worm of soybean (Gantai-2-2 and Wan82–178) on the Illumina HiSeq™ 2000 platform. Results In total, we identified 1744 differentially expressed genes (DEGs) in the leaves of Gantai-2-2 (1064) and Wan82–178 (680) fed by bean pyralid for 48 h, compared to 0 h. Interestingly, 315 DEGs were shared by Gantai-2-2 and Wan82–178, while 749 and 365 DEGs specifically identified in Gantai-2-2 and Wan82–178, respectively. When comparing Gantai-2-2 with Wan82–178, 605 DEGs were identified at 0 h feeding, and 468 DEGs were identified at 48 h feeding. Gene Ontology (GO) annotation analysis revealed that the DEGs were mainly involved in the metabolic process, single-organism process, cellular process, responses to stimulus, catalytic activities and binding. Pathway analysis showed that most of the DEGs were associated with the plant-pathogen interaction, phenylpropanoid biosynthesis, phenylalanine metabolism, flavonoid biosynthesis, peroxisome, plant hormone signal transduction, terpenoid backbone biosynthesis, and so on. Finally, we used qRT-PCR to validate the expression patterns of several genes and the results showed an excellent agreement with deep sequencing. Conclusions According to the comparative transcriptome analysis results and related literature reports, we concluded that the response to bean pyralid feeding might be related to the disturbed functions and metabolism pathways of some key DEGs, such as DEGs involved in the ROS removal system, plant hormone metabolism, intracellular signal transduction pathways, secondary metabolism, transcription factors, biotic and abiotic stresses. We speculated that these genes may have played an important role in synthesizing substances to resist insect attacks in soybean. Our results provide a valuable resource of soybean defense genes that will benefit other studies in this field. Electronic supplementary material The online version of this article (10.1186/s12864-017-4256-7) contains supplementary material, which is available to authorized users.


Background
Soybean (Glycine max (L.) Merr.) is the largest oil crop worldwide, and is widely used in the production of food, feed, industrial products and other sideline fields [1]. However, there have been large increases in soybean production costs due to pests. Bean pyralid (Lamprosema indicate (Fabricius)) is one of the major leaffeeding insects that affects soybean crops in central and southern China, the larvae lurk inside soybean leaves, cause leaf curling and feed on leaf tissues, affecting photosynthesis. Therefore, the plants cannot grow normally [2]. So bean pyralid is different from other leaffeeding insects with chewing mouthparts which cause holes or incisions by means of encroachment [3]. In the soybean-producing areas of southern China, many generations of bean pyralids may appear in 1 year. In serious pest-damaged years, only veins and petioles will be left, causing serious yield losses [4]. Sun et al. and Long et al. evaluated rolled leaflet number and larva number could be used as an evaluation index for bean pyralid in soybean, and screened the highly resistant line Gantai-2-2 and the highly sensitive line Wan 82-178 [4][5][6]. Two indicators of resistance to bean pyralid, rolled leaflet number and rolled leaflet percentage, were a significantly positively correlated with the pubescence angle, length on leaf blade, angle on petiole and a significantly negatively correlated with the pubescence density on leaf blade, but on correlation with pubescence tip shape was observed [7]. Xing et al. and Li et al. showed that soybean resistance to bean pyralid accords with two or three major genes and polygene, 81-92% of the phenotypic variation was accounted for by additive quantitative trait locus (QTL) (27-43%), epistatic QTL pairs (5-13%) and collective unmapped minor QTL (38-58%) [8][9][10]. The contents of soluble sugar, superoxide dismutase (SOD), polyphenol oxidase (PPO), jasmonate (JA) and abscisic acid (ABA) are significantly increased after bean pyralid feeding [11]. However, the results of comparative transcriptome research which has focused on soybeans' resistance to bean pyralid has not yet been made available. This is the first study of soybean transcriptome in response to bean pyralid feeding.
Transcriptome sequencing has become an important method for gene expression analysis, differentially expressed genes (DEGs) selection, functional gene mining, and genetic evolution analysis. The soybean genome was released in 2010 [12]. Based on the soybean genome data and transcriptome technology, can be better to examine all the transcription reactions, structural functions, and transcriptional regulation of resistant soybean varieties at the overall level. In the present paper, we tried to find important DEGs and metabolism pathways might related to the soybean in response to bean pyralid larvae through the comparative transcrptome analysis between the leaves infested with bean pyralid larvae and no worm of soybean using the Illumina HiSeq™ 2000 platform. Our results provide a valuable resource of soybean defense genes that will benefit other studies in this field.

Transcriptome sequencing and sequence alignment
An Illumina HiSeq™2000 sequencer was employed to analyze the comparative transcriptome of eight samples of Gantai-2-2 and Wan82-178 leaves that bean pyralid had been feeding on 0 h and 48 h. The original image data obtained by sequencing base-calling were the original sequence reads. Each read in the Solexa paired-end (PE) sequencing was 100 bp in length. There were 45.88 G original data sets produced during sequencing. The mean sequencing depth was 5.67. After the raw data were trimmed, 442,422,398 clean reads were obtained. The clean/raw read rates of the eight samples ranged from 95.30% to 97.11%. All clean reads were matched to the soybean reference genome by BWA software, allowing two base mismatches. The mapped genome reads ranged from 42,474,863 to 44,489,050 sets, genome map rates ranged from 77.23% to 78.95%, and unique match rates ranged from 72.86% to 75.22%. The expressed genes ranged from 50,283 to 53,739 (Table 1). To estimate whether the sequencing depth was sufficient for transcriptome coverage, the sequencing saturation in the eight cDNA libraries was analyzed. The results showed that most genes became saturated when the amount of PE reads was 20 M (200 × 100 kb) ( Fig. 1), which indicated that the overall quality of sequencing saturation in the eight cDNA libraries was high and that the sequencing amount covered the vast majority of expressed genes.

Correlation analysis of samples
The correlation of gene expression levels among samples is a key criterion to test whether the experiments are reliable and whether the chosen samples are reasonable. If one sample is highly similar to another one, the correlation value between them is very close to 1. We calculated the correlation value between each of two samples based on the FPKM results. According to the standard that Encode plan recommends, the square of the correlation value (R 2 ) should be ≥0. 92 (under an ideal experimental environment with reasonable samples). Our results showed that the R 2 of all repetitions were >0.95 (Fig. 2), which signified that our experimental samples and results were satisfactory and reliable.  Noiseq, DESeq2 and edgeR methods were used to screen DEGs (Fig. 3). The results showed that the Noiseq method can screen DEGs between two groups, showing a good performance when comparing it to other differential expression methods, such as edgeR and DESeq2. Noiseq maintains good True Positive and False Positive rates when the sequencing depth is increased, whereas most other methods show poor performance. Further more, Noiseq models the noise distribution from the actual data to better adapt to the size of the data set and be more effective in controlling the rate of false discoveries. Therefore, the Noiseq method was used to screen the DEGs.
As a result, under bean pyralid larvae feeding for 48 h, 1064 DEGs were identified in the Gantai-2-2, of which 894 DEGs were up-regulated and 170 DEGs were downregulated compared to 0 h (Additional file 1: Table S1). Additionally, 680 DEGs were identified in Wan82-178, of which 495 DEGs were up-regulated and 185 DEGs were down-regulated (Additional file 2: Table S2). After being induced with bean pyralid larvae, the number of up-regulated genes was significantly higher than that of down-regulated genes. These results indicated that most of the genes were activate and a few genes were inhibited after insect feeding.
The DEGs were further divided into three categories. The first category was the "DEGs with non-bean pyralidinduced genotype", and there were 605 DEGs in total. This class of genes was the "DEGs identified in Gantai-2-2 compared to Wan 82-178 before bean pyralid feeding induction", in which 52 DEGs were always upregulated and 83 DEGs were always down-regulated at 0 h and 48 h. In addition, 9 DEGs were up-regulated at 0 h but down-regulated at 48 h, and 2 DEGs were down-regulated at 0 h but up-regulated at 48 h, whereas the other 459 DEGs displayed no changes at 48 h (Fig. 4a). The second category was the "bean pyralidinduced DEGs that appeared in both materials". This category included 315 DEGs, which mainly displayed an up-regulated trend, with 274 DEGs were up-regulated and 31 DEGs were down-regulated in the two materials. A total of 8 DEGs were down-regulated in Gantai-2-2 but up-regulated in Wan82-178, 2 DEGs up-regulated in Gantai-2-2 but down-regulated in Wan82-178 (Fig. 4b). The third type was the "bean pyralid-induced genotype DEGs", which consisted of a total of 1114 DEGs, of which 749 DEGs were only expressed in Gantai-2-2 and 365 DEGs were only expressed in Wan82-178.
Among GO annotations of the DEGs in the above four groups, the largest common functional groups were metabolic process, cellular process, singleorganism process, responses to stimuli, catalytic activities and binding. These results indicated that when the soybean was subjected to bean pyralid larvae feeding, the defense systems in the plants would immediately respond to the stimuli, appropriately increase metabolic activities in vivo and produce defense substances, such as defendant enzymes, and proteinase inhibitors, thereby enhancing the activities of various enzymes to promote defense.
Pathway analysis showed that differentially expressed of these genes in the metabolic pathways after induced by bean pyralid might be related to the resistance, such as plant-pathogen interactions, phenylpropanoid biosynthesis, phenylalanine metabolism, flavonoid biosynthesis, peroxisome, plant hormone signal transduction, terpenoid backbone biosynthesis, and so on, and that they played a defensive role against insect stress. R language was used to conduct a super geometric algorithm. The result showed that the DEGs belonged to the bean pyralid-induced DEGs that appeared in both materials, mainly involving the global and overview maps, biosynthesis of the other secondary metabolites, carbohydrate metabolism and amino acid metabolism (Fig. 7a). A total of 146 DEGs were identified in Gantai-2-2 compared to Wan 82-178 before and after bean pyralid feeding that were mainly involved environmental adaptation, translation, global and overview maps and signal transduction (Fig. 7b).

Analysis of DEGs potentially related to anti-bean pyralid in soybean
Man pathway cluster analysis was employed to identify the pathway classification of the DEGs. The results showed that some important DEGs were categorized into ROS removal, hormone metabolism, signaling, stress, secondary metabolism and cell wall (Tables 2 and  3).
Under biotic and abiotic stresses, a great number of ROS in plants will be produced, and the cell structure is then destroyed. Plants often remove ROS in vivo through the production of a reactive oxygen scavenging agent, in order to alleviate damage to the plants caused by oxidative stress [13]. ROS removal system mainly includes enzymes, such as POD and PPO, and low molecular weight antioxidants, such as ferredoxin and thioredoxin (TRX) [14,15]. After bean pyralid larvae feeding for 48 h, many genes related to the ROS removal system were identified that were significantly upregulated (Tables 2 and 3). For example, there were 23 POD, 5 PPO, 9 glutathione S-transferase (GST) and 3 TRX1 identified in Gantai-2-2. Additionally, there were 20 POD, 4 PPO and 2 GST identified in Wan82-178. When comparing Gantai-2-2 with Wan82-178 at 0 h feeding, 1 GST was down-regulated. When comparing Gantai-2-2 with Wan82-178 at 48 h feeding, 1 TRX1 was up-regulated.

Analysis of DEGs by qRT-PCR
The quantitative real time-PCR (qRT-PCR) technology was used to verify the credibility of the RNA-Seq. Therefore, we used the qRT-PCR technology to verify 17 DEGs identified by RNA-Seq. It was found that the qRT-PCR expression patterns of 17 DEGs were all consistent with the RNA-Seq results (Fig. 8), which indicated that the RNA-Seq results were reliable in the present study.

Discussion
Influence of the genes related to ROS removal on bean pyralid response As an electron acceptor of H 2 O 2 , POD oxidizes various materials in the process of secondary metabolites and can produce phenoxy and oxygen free radicals by interacting with some phenolic compounds, which may also directly interfere with the feeding of insects or reduce the nutrition level of leaves, thereby lowering the edibility of the plants [27]. Meanwhile, POD has been found to have significant insecticidal effects on Lepidoptera and Coleoptera [28]. Previous studies have shown that POD could be induced by insects in wheat [29], sorghum [30], cucumber [31] and rice [32,33]. PPO is widely distributed in plants and can potentially catalyze the oxidation of polyphenols into a keto metal enzyme, it can directly oxidize acid into quinone using O 2 as an oxidation substrate, then produce amino acids and proteins that are difficult to digest and toxic to herbivores.
Meanwhile, as an anti-nutritional protein, PPO can produce protein affinity, and therefore reduce the edibility of plants, which plays an important role in the defense of insect feeding [34,35]. Previous research studies have shown that the activity of PPO increased after tomatoes were eaten by Spodoptera exigua or were treated with oral secretions of Spodoptera exigua [36,37]. GST is encoded by a large and diverse family of genes, that plays important roles in the responses of plants to oxidative damage induced by various environmental conditions, especially in the resistance to resist the toxic effects of ROS on cells [38][39][40]. In oxygen stress reactions, TRX would pass restoring forces to the reductase, which could potentially remove the lipid peroxide or repair the oxidized protein, resulting in an alleviation of oxygen stress [41]. TRXh5 and TRXh8 in Arabidopsis thaliana were strongly induced under biotic or abiotic stress [42,43]. After bean pyralid larvae feeding for 48 h, we found that POD, PPO and GST were all significantly up-regulated in Gantai-2-2 and Wan82-178. These results suggested that these genes were involved in the defense reactions of soybean to bean pyralid. TRX1 was up-regulated in Gantai-2-2, which suggested that TRX1 involved in defense responses to bean pyralid in the highly resistant material only.

Influence of plant hormones on bean pyralid response
JA signaling pathway is implicated in insect-induced responses in plant; it was involved in the signal transduction of insect defense, regulated the expression of plant downstream defense genes, and significantly induced responses of defense systems in plant, thereby effectively reducing pests [44]. After bean pyralid feeding, 3 key genes involved in JA biosynthetic pathway were identified in Gantai-2-2 and Wan82-178, namely, LOX, α-DOX and OPDA, which were significantly up-regulated. LOX is the key enzyme in the synthesis of the JA [45], and plays an important signal factor in plant induction defense pathways [46]. Adversity stresses, such as insects and diseases, could induce single or multiple LOX genes in plants [47,48]. Additionally, α-DOX could catalyze the oxidation of fatty acids and produce 2-hydrogen peroxide fatty acids, and it is also an enzyme related to plant stress resistance in JA biosynthetic pathway [32]. For example, α-DOX can be induced by insects in rice [32,49]. OPDA is the intermediate product of JA biosynthesis pathway with biological activities that can regulate plant defense genes. For example, OPDA has been found to induce defense genes in Arabidopsis thaliana which resist attacks from Bradysia odoriphaga [50]. As a plant endogenous hormone, ET regulates the defense responses of plants to pests and diseases [51,52]. Our results found that 3 types of genes related to ET biosynthesis and signal transduction pathway. For example, most of ACC oxidase, ethylene receptor and ERF1 genes were up-regulated after bean pyralid feeding. ACC oxidase is one of the key enzymes in ethylene biosynthesis pathway [52]. Ethylene receptor is an upstream component of ethylene signal transduction pathway, that plays an important role in plant growth and responses to adversity stresses [53]. ERF is a specific plant transcription factor that plays an important role in plant cell growth, hormone regulation, disease resistance and abiotic stresses [54][55][56]. Auxin can directly act on cell membranes or intracellular components, and affects some cellular responses. In addition, it can also indirectly regulate the expression of genes, and has a direct impact on plant stress [57,58]. After bean pyralid feeding, some genes related to auxin synthesis and signal transduction pathways were identified, such as IAAamino acid hydrolase, auxin responsive GH3 (Gretchen Hagen3) gene family and SAUR (small auxin up RNA) family protein. Early/primary auxin response genes were composed of three major gene families: Aux/IAA (auxin/indoleacetic acid), GH3 and SAUR transcription factor family [59]. GH3 gene family is a typical auxin-responsive gene family and can promote amino acids to be combined with IAA, JA and SA, then changed the concentration of biologically active forms within the cells. This aided in regulating plant growth, development and defense responses [60]. SAUR gene family is known to be the specific and largest family among the auxin response factors in plants and is related to environmental stimuli [61,62]. These results indicated that JA, ET and auxin may be involved in elevating the basal resistance of soybean to herbivory.

Influence of the intracellular signal transduction pathway on bean pyralid response
Protein kinase is involved in the signal transduction pathway of plants and plays an important role in the defense responses of plants [63]. Protein kinases could be induced by bean pyralid, including protein kinase, protein kinase A, LRR receptor-like serine/threonineprotein kinase FLS2, serine/threonine-protein kinase PBS1 (STK), serine/threonine-protein kinase SRK, serine/threonine-protein kinase WNK1, and serine/ threonine-protein kinase CTR1. LRR receptor-like kinase is the largest family of receptor protein kinase in plants; it plays an important role in the regulation of plant growth, development, biotic and abiotic stresses [64]. Serine/threonine-protein kinase (STK) is an important signal molecule; when plant suffers stimulation, such as pest feeding, salinity, drought stress, trauma, cytokines or hormones, STK is rapidly activated in the serine and threonine residue phosphorylation sites and is further activated in the downstream signal molecules through phosphorylation cascades, which activate specific signaling pathways. It is eventually transmitted by an outside signal to the nucleus, and activates or inhibits specific genes [64][65][66]. These results indicated that protein kinases played an important role in the defense against bean pyralid. When the plants were stimulated by external environmental and received the signal, the receptor activated the calcium channels on the membrane through a series of phosphorylation reactions to cause calcium ions to be released from the calcium base into the cytoplasm. This response led to an increase in the concentration of Ca 2+ in the cytoplasm, which then activated the plant defense response [67,68]. Previous studies have shown that Ca 2+ could be induced by Spodoptera littoralis in Phaseolus lunatus [69] and Ginkgo biloba [70]. After bean pyralid feeding, most of the DEGs related to Ca 2+ signaling identified in Gantai-2-2 were up-regulated, but in Wan82-178 were down-regulated. It was suggested that Ca 2+ signaling was involved in the defense responses to bean pyralid in the highly resistant material. After soybean was stimulated by bean pyralid, the concentrations of Ca 2+ in the cytoplasm changed; therefore, calcium was sent to transfer the stimulation signal. However, Ca 2+ signaling may have played a negative role in the regulation of defense responses in the highly susceptible material.

Influence of the genes related to biotic stress on bean pyralid response
When plants suffer from pest stress, a variety of defense proteins which inhibit the insect from producing digestive enzymes are produced to resist insect pests, including PR, proteinase inhibitors, chitinase and lectin [71,72] and thereby destroy the insect's normal digestive absorption functions, then disrupt their nutrient uptake and utilization, which could ultimately lead to malnutrition and inhibited growth of the insects [73].
PR proteins are generated and accumulated after pathogen invasions or abiotic stress and are inducible components in the self defense mechanisms of plants [74][75][76]. Uknes et al. found that the contents of PR-1, PR-2 and PR-5 were increased, when the Arabidopsis thaliana were treated with TCV and INA, and its disease resistance was increased [77]. After bean pyralid feeding, most of the DEGs related to PR were induced. Our results speculated that the PR proteins might be involved in the defense responses of soybean to bean pyralid.
Proteinase inhibitors can inhibit the activity of chymotrypsin and trypsin in the digestive tract of the Lepidoptera and Diptera, to inactivate the related digestive enzyme in intestines, and thereby interfering with the insects' normal feeding and digestion. Therefore, the insects cannot attain enough nutrition, which in turn affects their growth and development, then leads to insect death [78,79]. In other cases, it caused excessive production of proteases in insects, which resulted in a deficit of essential amino acids, then caused insect death [80]. Our results showed that the proteinase inhibitors could be induced by bean pyralid, so we assumed that the proteinase inhibitors were involved in the defense responses.
Chitinase could destroy the peritrophic membranes of insects [81]. When insect eats plants with sustained chitinase expression, its digestive tract becomes damaged and its epidermis cannot form normally [82]. Chitinase could be induced by aphid and spider mites in sorghum [83] and tomato [84], respectively. Our results showed that chitinase could be induced by bean pyralid as well. It was assumed that after the chitinase was absorbed into the insects' bodies, chitin was hydrolyzed, which inhibited the growth and development of insects and achieved the purpose of defense for the plants [85].
Lectin has at least one non-catalytic domain-specific reversible binding to monosaccharides or oligosaccharides, and after being absorbed into the gut of an insect, lectin induces local or systematic toxic effects, which bond to glycoproteins on the peritrophic membrane and damage the structure of the peritrophic membrane. This response causes insect growth arrest, antifeedant and even death [86,87]. Previous results have shown that lectin displays insecticidal activities against a variety of insects. For example, after Heliothis virescens [88], Myzus persicae [89] and Lacanobia oleracea [90] fed on lectin-transgenic plants, their survival rates and fecundity were greatly reduced. Our results showed that the lectin, mannose-binding 2 could be induced by bean pyralid. It was assumed that the lectin that was released from damaged insect cells of the soybean combined with the chitin of the peritrophic membranes, sugar compounds of the digestive tract epithelial cells, and the glycosylation digestive enzymes, after bean pyralid feeding. This affected the normal absorption of nutritional materials. It also induced disease in the digestive tracts and promoted gastrointestinal bacterial reproduction, thereby causing insect growth inhibition or death, which achieved the defensive purpose of killing the pests [91].
Influence of the genes related to the secondary metabolism on bean pyralid response Isoprenoid biosynthesis pathway is an important secondary metabolism route that widely exists in plants. It has the ability to synthetize isoprenoid compounds, and plays an important role in plant resistance reactions, in addition to participating in the growth and development of plants [92]. The genes related to isoprenoid biosynthesis pathway could be induced by bean pyralid, which speculated that isoprenoid biosynthesis pathway was involved in the defense reactions of soybean to bean pyralid.
Phenylpropanoid and its derivatives play an important role in plant development and stress responses. Environmental stresses can promote plant carbon synthesis, which changes into benzene propane synthesis, causing the accumulations of large amounts of substances related to plant stress resistance, such as lignin, flavone, flavonol and alkaloids. These substances have been found to improve the resistance of plants to various biotic and abiotic stresses [93]. The genes related to phenylpropanoid biosynthesis pathway, including PAL, CCOAOMT, COMT and CAD, could be induced by bean pyralid. PAL is not only the enzyme that catalyzes the first step of phenylpropanoid biosynthesis pathway but also the key and rate limiting enzyme for the pathway. Various metabolite substances that are catalyzed by PAL, such as lignin, flavonoids and phytoalexin, were related to plant insect resistance [94]. PAL can be induced by many types of biotic and abiotic stresses, including pathogen infections, insect feeding, mechanical injuries and exogenous plant hormones [95][96][97]. CCOAOMT is a type of important methyltransferase in plant lignin biosynthesis [98,99]. COMT catalyzes the reaction that converts caffeic acid into ferulic acid, which is an important step in the biosynthesis of lignin monomers [100]. CAD is the last step in lignin synthesis metabolic pathway and plays a key role in lignin biosynthesis [101,102]. PAL, CCOAOMT, COMT and CAD were related to lignin, which suggested that soybean might have enhanced its cell mechanical strength through the synthesis of lignin, which reinforced the cell walls, strengthened the cell to blocked insect feeding, and defended the insect pests by nutrient limitation [103,104].
Flavonoid plays an important role in insect-induced, disease, and other stress responses in plants [105]. The genes related to flavonoid biosynthesis pathway, including CHI, CHS, DFR and LAR, were found to be induced by bean pyralid. CHS and CHI are two key enzymes in flavonoid biosynthesis pathway; CHS catalyzes the condensation reaction of 4-coumaric acid-CoA and acyl-CoA, which causes chalcone to be formed, and flavonoid is then formed under the action of CHI, a necessary enzyme that syntheses flavanone, flavone, flavonol and anthocyanin substances [106,107]. The amount of flavonoid metabolites was directly affected by the expression of CHS and CHI, and the loss of expression ability or loss of enzyme function [108]. DFR and LAR are key enzymes in flavonoid synthesis, and during the process of plant flavonoid biosynthesis, anthocyanins, proanthocyanidins and other common synthesis precursors are leucoanthocyanins. DFR was catalyzed by flavanonols to generate leucoanthocyanidin, LAR converted leucoanthocyanidin into 2,3-trans-flavan-3-ols [109]. Previous studies have shown that over-expressions of DFR could improve the content and antioxidant capacities of flavonoid [110]. Our results suggested that flavonoid biosynthesis pathway played an important role in the defense against bean pyralid and that flavonoid accumulation is an important characteristic of responses to abiotic stress in plants.
Cytochrome P450 (CYP) is a type of pheme containing oxidoreductases. It can catalyze some substances with a defensive function, such as sterol, isoflavone, alkaloid, terpenoid, and so on [111,112]. CYP plays an important role in the defense of organisms against pests and abiotic stress [113]. CYP genes can be induced by aphid in soybean [114]. Our results showed that some genes related to CYP can also be induced by bean pyralid. Previous studies have shown that cyanogentic glycoside which was catalyzed and synthesized by the CYP79A and CYP71E1 genes in sorghum, was harmful to pests [115]. Therefore, it was speculated that soybean would use the CYP family to mitigate the threat of insect infestation.
AO is a copper-containing oxidase family that is wide spread in plants. It is a type of simple phenol and can catalyze the ascorbic acid (AA) of protoplasts in vitro and can oxygenate AA to generate monodehydroascorbate (MDHA), which regulates the overall oxidation state of the AA library of plant protoplasts in vitro [116,117]. When activity of AO increased, the ratio of oxidation and reduction in extracellular AA was also increased. There was a significant redox gradient on both sides of the plasma membrane, which modulated plant defense responses against insect attack and other stressed [118][119][120]. Our results showed that some genes related to L-AO can also be induced by bean pyralid. Therefore, L-AO was involved in the response reactions of soybean to bean pyralid.
Transcription factor on bean pyralid response WRKY is one of the largest families of transcription factors in plants and plays an important role in the regulation of plant growth and development as well as biotic and abiotic stresses [121][122][123]. Previous studies have shown that WRKY could be induced by insect stress [124]. For example, one WRKY23 gene was induced by Heterodera schachtii in Arabidopsis thaliana [125] and 20 WRKY transcription factors were induced by cotton boll weevil in Gossypium hirsutum [126]. NAC is known to be a second major transcription factor in plants and is also a specific transcription factor. NAC can activate downstream genes when plants are exposed to biotic or abiotic stress, so it is involved in the stress responses of plants [127][128][129]. Some NAC genes were induced by insects in rape [130], sugarcane [131] and Gossypium hirsutum [126]. MYB is one of the central regulators of development, metabolism, and response to abiotic and biotic stresses [132]. One of the known mechanisms is that MYB transcription factor contribute to the accumulation of flavonoids to protect plants from radiation injury and enhance their resistance to cold and insect pests in plants [133]. MYB could be induced by small cabbage white caterpillars [134] and cotton boll weevil [126]. AP2/EREBP is a type of specific transcription factor which is bound with ethylene responsive elements. It is able to combine with the GCC box of the promoter region of some resistance related genes, and carry out the function of activating the expression of these genes [135]. AP2/EREBP can potentially regulate the molecular responses of plants to hormones, pathogens, low temperatures, drought and high salt, which could improve the tolerance of crops to stress [136][137][138]. For example, the over expression of Tsi1 gene encoding EREBP/AP2 transcription factor increased the resistance to pathogens in tobacco, as well as the resistance ability to osmotic stress [138]. Our results showed that the transcription factor, such as WRKY, NAC, MYB and AP2/EREBP, could be induced by bean pyralid. It was confirmed that these transcription factors were involved in the defense reactions to bean pyralid.

Conclusions
To explore the defense mechanisms of soybean resistance to bean pyralid, the comparative transcriptome sequencing was completed between the leaves infested with bean pyralid larvae and no worm of soybean (Gantai-2-2 and Wan82-178) on the Illumina HiSeq™ 2000 platform. The results showed that there were 1064 and 680 DEGs were identified in the Gantai-2-2 and Wan82-178 after bean pyralid larvae feeding for 48 h, respectively, compared to 0 h. When comparing Gantai-2-2 with Wan82-178, 605 DEGs were identified at 0 h feeding, and 468 DEGs were identified at 48 h feeding. The DEGs were divided into three categories, including the DEGs with non-bean pyralid-induced genotype, bean pyralid-induced DEGs that appeared in both materials and bean pyralid-induced genotype DEGs. According to GO and KEGG functional and metabolic pathway analysis combined with the previously reported literatures, we concluded that the response to bean pyralid feeding might be related to the disturbed functions and metabolism pathways of some key DEGs, such as DEGs involved in the ROS removal system, mainly POD, PPO, GST and TRX; phyto-hormone signaling pathways, mainly JA, ET and auxin pathway; intracellular signal transduction pathways, including plant protein kinases and Ca 2+ signaling; secondary metabolism, such as isoprenoid biosynthesis pathway, phenylpropanoid biosynthesis pathway, flavonoid biosynthesis pathway, CYP and L-AO; transcription factors, such as WRKY, NAC, MYB and AP2/EREBP. Meanwhile, bean pyralid activated a large number of genes related to biotic and abiotic stresses. These results will help to elucidate the molecular mechanism of response to bean pyralid feeding in soybean, and provide a valuable resource of soybean defense genes that will benefit other studies in this field. Future research will focus on the cloning and transgenic function validation of possible candidate genes associated with the anti-bean pyralid soybean.

Plant materials
The tested materials Gantai-2-2 (highly resistant line) [3] and Wan82-178 (highly susceptible line) [3] were planted inside insect net rooms in experimental fields at the Guangxi Academy of Agricultural Sciences. The plants were sown in three rows, with 10 strains per row. During the entire growth period of soybeans, pesticides and fertilizers were not used. When the plants grew to a level of 10 compound leaves, the seventh compound leaves on the left side were collected before the infestation. There was a total of five plants for each sample. Then, each of the samples was simultaneously artificially infested with 5 four-year-old bean pyralids on the right side of the seventh leaves, counted downward. There were two biological repetitions. In each sample, five leaves were mixed, and they were frozen with liquid nitrogen and stored at −80°C for further use.

Total RNA extraction
The total RNA (5 μg) from the leaves (100 mg per sample) of Gantai-2-2 and Wan82-178 by using TRIzol kit (Invitrogen, Carlsbad, CA, USA) according to the manufacture's protocol. Briefly 1.3 ml of TRIzol kit was added into 100 mg of leaf sample, the sample was homogenized using power homogenizer and centrifuged at 12,000×g for 10 min at 4°C. After the fatty layer was removed and discarded, the cleared supernatant was transferred into a new tube and mixed with 0.2 ml of chloroform. The sample tube was shaken for 15 s, followed by an incubation for 3-5 min at room temperature. Next, the sample was centrifuged 12,000×g for 10 min at 4°C and the aqueous phase was moved into a new tube for RNA precipitation. For precipitating RNA from each sample, 10 μg of RNase-free glycogen was added to the aqueous phase as a carrier, followed by 0.5 ml of 100% isopropanol, then samples were placed at −20°C for 1 h and centrifuged at 13,600×g for 20 min at 4°C and discarded the supernatant. To wash the RNA pellet, we added 1.0 ml of 75% ethanol into the tube, vortexed the tube gently, centrifuged the tube 13,600×g for 3 min at 4°C and discarded the wash. The RNA pellet was air-dried, suspended in 50 μl Nuclease-free water, incubated at 55-60°C for 10 min. RNA concentration, purity and integrity were determined using a Nano-Drop2000 (Thermo Fisher Scientific, Waltham, MA, USA) and an Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA).

cDNA library construction and transcriptome deep sequencing
Equal amount of total RNA (5.0 μg) was used for cDNA library construction using TruSeq™ RNA Sample Preparation Kit v2 (Illumina) (Illumina, SanDiego, CA, USA) and the cDNA library was sequenced on an Illumina HiSeq™ 2000 platform (Hiseq2000 Truseq SBS Kit v3-HS (200 cycles), Illumina) following the manufacturers' protocols. Briefly, Dynal Oligo (dT) beads (Invitrogen) were obtained poly(A) mRNAs. Then mRNAs were chemically fragmented into~200 nt fragments. Firststrand cDNA was generated by using reverse transcriptase and random primers, the second strand cDNA synthesis using DNA Polymerase I (Invitrogen) and RNase H (Invitrogen) treatment. The cDNA fragments were end repaired by using End Repair Mix (Illumina) reagent, Finally, purified and enriched to create the final cDNA library. Eight cDNA libraries were sequenced by using pair-end (2 × 100 bp) sequencing technology with an Illumina HiSeq™ 2000 sequencer (The RNA-Seq test and its results were analyzed by the BGI Tech Solutions Co., Ltd. (BGI Tech) (BGI, Shenzhen, PR China).
Raw read filtering and mapping to the reference genome and gene sequences SOAPfuse software was used to filter noise for the original sequencing reads [139]. Raw reads of eight libraries were performed by removing adapter sequences and low-quality reads, higher N rate sequences, and excessively short sequences. After passing the QC process of the alignment, the results were used for further analysis. The remaining high-quality reads were submitted for mapping analysis against the soybean reference genome (ftp://ftp.jgi-psf.org/pub/compgen/phytozome/v9.0/ear-ly_release/Gmax_275_Wm82.a2.v1/, version Glyma 2.0, 975 Mb), using BWA software and Bowtie software and allowing two base mismatches. We assembled the transcripts with reads using Cufflinks (http://cufflinks.cbcb.umd.edu/) [140].

DEG analysis
Genes and isoforms expression level are quantified by a software package: RSEMV1.2.12 (RNASeq by Expectation Maximization) [141]. RSEM computes maximum likelihood abundance estimates using the Expectation-Maximization (EM) algorithm for its statistical model, including the modeling of paired-end (PE) and variablelength reads, fragment length distributions, and quality scores, to determine which transcripts are isoforms of the same gene. The expression quantity of each gene (fragments per kilobase of exon model per million mapped fragments, FPKM) was used for the calculated expression level, the formula for which is as follows: where FPKM (A) is the expression of gene A, C is the number of fragments that are uniquely aligned to gene A, N is the total number of fragments that are uniquely aligned to all genes, and L is the number of bases on gene A. The FPKM method can eliminate the influence of different gene lengths and sequencing discrepancies in the calculation of gene expression. Therefore, the calculated gene expression can be directly used for comparing the difference in gene expression among samples. According to the Noiseq method, an absolute value of |Log2FC (Fold Change)| ≥ 1 and diverge probability ≥0.8 were used as the threshold to screen DEGs. In the DESeq2 method, the absolute value of |Log2FC (Fold Change)| ≥ 1 and p-adj ≤ 0.001 were used as the threshold to judge the significance of the gene expression difference. For the EdgeR method, the absolute value of |Log2FC (Fold Change)| ≥ 1 and FDR (False Discovery Rate) ≤ 0.001 were used as the threshold to judge the significance of the gene expression difference.

Bioinformatics analysis
GO (GO, http://www.geneontology.org/) and functional enrichment analysis were conducted on all DEGs using TermFinder software (http://www.yeastgenome.org/ help/analyze/go-term-finder). Then, all DEGs were mapped to a pathway in the KEGG database (http:// www.genome.jp/kegg/pathway.html) using Blast_v2.2.26 software. A p-value was used as ≤0.05 was used as the threshold to judge the significance of the GO and KEGG pathway enrichment analyses. The transcript version number of Wm82.a2.v1, which was obtained from the sequencing analysis was deleted from the website (http://www.soybaen.org/correspondence/) and only the names of the genes were retained. These names were converted into the differential gene transcription version number of Wm82.a2.v1. MapMan software (http://www.gabipd.de/projects/MapMan/) was used to assign the pathways and functional classifications to the DEGs. The latest Osa_MSU_v7 mapping file and pathways files were downloaded from the website.

Quantitative real time-PCR (qRT-PCR) analysis
qRT-PCR analysis was used to verify the RNA-Seq results. Primer Premier 5.0 (Premier Biosoft International, Palo Alto, CA) was used to design primers for qRT-PCR experiment (Additional file 13: Table S13). PrimeScript RT Reagent kits with gDNA Eraser (Takara, Dalian, China) were used to reverse transcriptase and synthesize the cDNA. The qRT-PCR reaction mixture (25 μl) contained 12.5 μl SybrGreen qPCR Master Mix (2 × concentration, Ruian Biotechnologies, Shanghai, China), 0.5 μl reverse and forward primers (10 μM), 2 μl cDNA and 9.5 μl ddH 2 O. Then, the qRT-PCR reaction was performed on an ABI7500FAST Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) as follows: 2 min at 95°C, 40 cycles of heating at 95°C for 10 s and annealing at 60°C for 40 s. The β-actin gene was used as the internal control to calculate the relative