Transcriptomic analysis of genes in soybean in response to Peronospora manshurica infection

Background Soybean downy mildew (SDM), caused by Peronospora manshurica (Pm), is a major fungal disease in soybean. To date, little is known regarding the defense mechanism at molecular level and how soybean plants response to Pm infection. In this study, differential gene expression in SDM-resistant (HR) and SDM-susceptible (HS) genotype was analyzed by RNA-seq to identify differentially expressed genes (DEGs) following Pm infection. Results Of a total of 55,017 genes mapped to the soybean reference genome sequences, 2581 DEGs were identified. Clustering analysis of DEGs revealed that these genes could be grouped into 8 clusters with distinct expression patterns. Functional annotation based on gene ontology (GO) and KEGG analysis indicated they involved in diverse metabolism pathways. Of particular interest were the detected DEGs participating in SA/ROS and JA signalling transduction and plant/pathogen interaction. Conclusion Totally, 52 DEGs with P value < 0.001 and log2 fold change > 2 or < − 2 upon fungal inoculation were identified, suggesting they were SDM defense responsive genes. These findings have paved way in further functional characterization of candidate genes and subsequently can be used in breeding of elite soybean varieties with better SDM-resistance. Electronic supplementary material The online version of this article (10.1186/s12864-018-4741-7) contains supplementary material, which is available to authorized users.


Background
Soybean (Glycine max L.) is a major crop worldwide that provides abundant amount of protein and oil for human and animal consumption. However, soybean was severely affected by an obnoxious fungal disease, downy mildew (SDM), caused by Peronospora manshurica (Pm). The yield loss could be as high as 6-15% on average in the years when it was epidemic [1][2][3][4]. The proliferation of this fungus was favored by high humidity and relatively low temperature at 20-22°C. Initially infected leaves had pale green to light yellow spots on the upper surface but later on these spots were creamy white to pinkish coupled with growth of fruiting bodies on the back side of leaves leading to chlorophyll degradation and premature defoliation. In addition, the fungi obtain nutrients exclusively from living plant cells and cannot be cultured apart from their host [5,6].
Although some fungicides can be applied to reduce SDM infection, the most effective way to control this disease and reduce yield loss is to develop and use resistant variety in soybean production. In our previous study [7], 15 soybean genotypes were either in vitro diseaseindexed via leaf/Pm inoculation or by field observation in two consecutive years. By which the genotypes were classified into five categories: HR (highly resistant), R (resistant), MR (moderate resistant), S (susceptible) and HS (highly susceptible). We had cloned a GmSAGT1 gene coding for a bi-functional enzyme of serine/alanine glyoxylate aminotransferase from a soybean genotype highly resistant to SDM. Fifteen soybean genotypes were used for qRT-PCR analysis, which showed the average expression level of GmSAGT1 in HR group was 13 times of that in HS group. Northern blot analysis indicated that the expression level of GmSAGT1 in HS, S, MR and R group was 5, 22, 39 and 65% of that in HR, respectively. Transgenic tobacco expressing GmSAGT1 increased resistance to a fungal pathogen Rhizoctonia solani, demonstrating that higher expression of GmSAGT1 gene was closely correlated with increased SDM resistance.
However, the genomic mechanism associated with SDM resistance is currently not known that needs to be further elucidated. Understanding the genetic regulation of plant defense system will help to identify specific genes for either gene-based marker selection or genetic engineering to develop soybeans with better disease resistance. With the advent of next-generation sequencing technology, Illumina sequencing approach has been used for understanding the complexity of gene expression and regulation networks in plants responding to different pathogen attack, including bacterial leaf pustule (Xanthomonas axonopodis pv. glycines) [8,9], Phytophthora root and stem rot (Phytophthora sojae) [10], Asian soybean rust (Phakopsora pachyrhizi) [11] in soybean, and soybean cyst nematode (SCN; Heterodera glycines) in common bean [12]. By which the responsive genes and metabolism pathways associated with disease resistance have been identified. Unfortunately, no such a study on soybean/Pm interaction has yet been reported so far.
To gain transcriptional profiling data after Pm infection, a highly SDM-resistant genotype (HR) Jilinxiaoli1 (JL1) and a highly susceptible (HS) genotype Kefeng1 (KF1) were used for comparison in this study. The differentially expressed genes of HR and HS at 72 h after inoculation (hai) was compared with that obtained from non-inoculated leaves. Totally, 52 DEGs with P value < 0.001 and log 2 fold change > 2 or < − 2 upon fungal inoculation were identified. Of particular interest was the detected DEGs might involve in SA (salicylic acid), ROS (reactive oxygen species) and jasmonic acid (JA) signaling, implicating their roles in response to Pm invasion. A high level of accumulation of defense responsive gene products such as pathogenesis-related (PR) proteins and NBS-LRR (nucleotide binding site-leucine rich repeat) might also contribute to SDM-resistance in soybean. In addition, some transcription factors (TFs) of bHLHs, MYBs and WRKYs were up-regulated after inoculation. These findings provided an insight for further functional characterization of candidate genes resistant to SDM that would help in improving soybean breeding program. To the best of our knowledge, this is the first report describing differential expression of genes in response to soybean/Pm interaction.

SDM symptoms
Symptoms on leaves of JL1 (HR) and KF1 (HS) were investigated 72 hai. There were a large number of brown lesions surrounded by yellow haloes on the surface of leaves of HS KF1. In contrast, only a few number of small yellow spots on the leaves of HR JL1 formed (Fig. 1). The behavior of HR and HS genotype after inoculation was as expected, indicating their leaf samples could be used for further transcriptomic analysis.

RNA-seq analysis
Changes in transcript levels between inoculated and non-inoculated (JL1i vs. JL1ni and KF1i vs. KF1ni) were analyzed by RNA-seq. A total of 128.22 million raw reads were generated by 75 bp single end sequencing from the four cDNA libraries, constituting 4.51 Gb of cDNA sequences per library on average (Additional file 1: Table S1). By trimming adaptor sequences and elimination of low quality reads as well as reads containing unknown nucleotides larger than 10%, a total of 120.07 million clean reads were generated. Of which, 97.00 and 95.75 million clean reads were mapped (80.78%) and uniquely mapped (79.47%), respectively, to the soybean genome reference sequence (Glycine max Wm82.a2.v1) by using TopHat software (Additional file 2: Table S2). Of 65,781 predicted genes in the soybean genome, the expression level of 55,017 mapped genes were quantified based on RPKM values (reads per kilo bases per million reads) converted from mapped read counts of DEGs.

Transcriptome analysis in response to Pm-inoculation
Using the random sampling model in the DEGseq program, the mapped read counts of each gene with a Pvalue < 0.001 and log 2 fold change > 2.0 or < − 2.0 were selected, by which a total of 2581 DEGs were obtained. In comparison of gene expression level in the noninoculated samples (JL1ni vs. KF1ni), the expression level of 1174 and 937 DEGs were higher or lower in JL1 than that in KF1 (Fig. 2a), respectively, representing genotypic differences between HR and HS genotype. In the inoculated samples, the expression level of 164 genes were Y-axes indicates the mean of normalized counts (padj < 0.05) and X-axes indicates the log 2 fold change values. DEGs are shown in blue, green and red indicating down-regulated, no change and up-regulated genes, respectively higher and 235 genes were lower in JL1 than that in KF1, implying these genes might be specifically involved in the process of soybean/Pm interaction (Fig. 2b).
In terms of paired comparison of up-and downregulated DEGs in the inoculated vs. non-inoculated of HR or HS genotype, it was found that in the HR/JL1, out of 215 DEGs 90 were up-regulated and 125 were down-regulated (Figs. 2c and 3), while in the HS/KF1, of a total of 278 DEGs, 240 genes were up-regulated and 38 down-regulated, respectively (Figs. 2d and 3).

Clustering of DEGs
H-clustering analysis was used to group 2581 DEGs into clusters and sub-clusters based on common expression patterns. Eight distinct sub-clusters A to H were generated reflecting the general trends and key transitional states in the HR and HS following Pm-inoculation. The number of genes within each cluster was presented in Fig. 4.
To understand the functional components involved in molecular responses to Pm, we annotated the most enriched GO terms in the HR/JL1 and HS/KF1 with significant difference at a level of corrected P value < 0.05 (Table 1).
It was notable that the DEGs related to photosystem were all down-regulated upon Pm-inoculation in both HR and HS genotypes. Under the category of biological process, 7 DEGs associated with photosynthesis and 2 DEGs related to thyazole metabolic process were downregulated in the HR genotype. Similarly, in the category of cellular component, 7 enriched DEGs related to photosynthesis and 7 DEGs associated with thylakoid were down-regulated in HR genotype, while in the HS genotype 4 DEGs in the category of biological process and 4 DEGs in the cellular component were also downregulated. This common feature reflects inhibition of photosystem in soybean plants upon Pm-infection.
In contrast to the reduction of photosynthesis, 5 of the most enriched DEGs under the category of biological process and 4 DEGs in the category of molecular function, that all associated with ion homeostasis (e.g. oxidizing metal ions, ferric iron binding, ferroxidase and oxidoreductase activity), were up-regulated in HR genotype, implicating modification of diverse ion channels after inoculation might play a vital role in defense responses to Pm invasion (Table 1).

KEGG pathway analysis
In doing KEGG analysis, 2581 DEGs were annotated to KEGG pathway (http://www.genome.jp/kegg/), by which a total of 216 DEGs were identified. Results showed that in HR/JL1, among a total of 107 DEGs, there were 64 DEGs up-regulated involving in 31 KEGG pathways and 43 DEGs down-regulated participating in 19 different KEGG pathways (Additional file 4: Table S4). In HS/ KF1, of 109 DEGs, 91 were up-regulated involving in 26 KEGG pathways and 18 were down-regulated involving in 13 KEGG pathways (Additional file 5: Table S5).
Apart from the basic metabolic pathways, e.g. carbohydrate, amino acid and lipid metabolism, the most interest to us were pathways participating phytohormone signalling transduction and plant/pathogen interactions based on the knowledge already published in the literature.
In the plant hormone signaling, two genes encoding either EFB1/2 (EIN3-binding F-box protein1-like, involving in ethylene signaling) or AUX/IAA protein (auxinresponsive protein IAA, involving in auxin signaling) were identified in JL1i vs. JL1ni (Table 2), while in KF1i vs. KF1ni two AUX/IAA genes and seven JAZ (JASMO-NATE ZIM-motif ) genes (involving in jasmonate signaling), and one BAK1 protein kinase gene (Brassinosteroid insensitive1-associated receptor kinase1) were detected ( Table 2). In addition, in the glyoxylate and dicarboxylate metabolism, two genes coding for isocitrate lyase, a key catalytic enzyme involving in ROS signaling, were identified. All of these results demonstrated that these genes In the plant pathogen interaction, three genes in JL1i vs. JL1ni were detected including genes coding for LRR receptor-like serine/threonine-protein kinase FLS2 protein (Flagellin-sensitive 2), calcium-binding protein CML, and a protein associated with cyclic nucleotide gated channel (Table 2). In KFi vs. KF1ni, totally 18 genes were identified, including seven JAZ genes, seven calmodulin genes, one BAK1, one DNA-binding WRKY (Glyma01g31921), and one FLS2 ( Table 2).
Except one calmodulin gene was down-regulated, all above mentioned DEGs in the pathways of phytohormone signal transduction and plant/pathogen interaction were up-regulated in both genotypes of HR and HS after inoculation with Pm (Table 2).

SDM defense responsive genes
In searching SDM defense responsive genes, a total of 52 DEGs either up-or down-regulated were identified. Of the total, 31, 11 and 10 DEGs were categorized into phytohormone signalling transduction, plant/pathogen interaction and TFs, respectively ( Table 2).
In the phytohormone signalling pathway, 12, 8, 7, 1, 1 and 2 DEGs participating calmodulin, SA/ROS, JA, ET, BR (brassinosteroid) and AUX (auxin), respectively, in response to Pm-infection. In particular, 3 and 4 peroxidase genes were highly expressed with 4-6-fold increase in HR/JL1 and 7-9-fold increase in HS/KF1 by comparison of inoculated with non-inoculated samples. In the category of plant/pathogen interaction, 11 PR proteins were involved. In terms of TFs, 1 bHLH, 6 MYBs and 3 WRKYs were identified in the study (Table 2). By comparison of the expression level of the 52 DEGs in non-inoculated samples of JL1ni and KF1ni, if the read counts of KFni was taken as 1, then the log 2 fold change of 38 DEGs in JL1ni were higher than that in KF1ni, accounting for 73% of the total DEGs. Noteworthy, there were 14 DEGs associated with plant/pathogen interaction (including 5 PR proteins, 5 calmodulins and 4 oxidative proteins) and 3 transcription factors, all of which with log 2 fold change value greater than 2.0, demonstrating that the basal expression level of these DEGs were much higher in HR genotype without fungal pathogen and might be served as basal defense responsive genes ( Table 2).
On the other hand, when comparing the data of HR and HS after fungal inoculation, it was found that 50 DEGs in KF1i were up-regulated with log 2 fold change all greater than 2,0, accounting for 96% of the total. While in the JL1i, 43 DEGs (83%) with log 2 fold change greater than 2,0 were found after pathogen attack. These data suggested that the HS responded to Pm infection more quickly and strongly. By counting the number of DEGs, however, it was found that the read counts of 30 DEGs in JL1i were still higher than that in KF1i (Table 2).

qRT-PCR validation
To validate the RNA-seq expression data and its reliability, seven DEGs were randomly selected for Real-time quantitative PCR (qRT-PCR) analysis. To compare these two methods, the relative expression measurement from qRT-PCR was transformed into fold change by base 2 to match with the RNA-seq fold change value. The 7 selected genes for this comparison included five defense responsive genes i.e. Glyma08g08810, Glyma04g20330, Glyma04g10940, Glyma16g06940 and Glyma06g40710 and two WRKYs. Results of comparison between qRT-PCR and RNA-seq showed similar expression patterns indicating they were correlated. Therefore, the RNA-seq data could be used for gene expression profiling of soybean in response to Pm-inoculation. The primer set used for qRT-PCR were listed in Additional file 6: Table S6.

Discussion
The sequences of soybean genome publicly available at Phytozome (http://www.phytozome.net/soybean) coupled with  availability of next-generation RNA sequencing analysis provided a powerful tool to study the differential expression of genes in soybean plants in response to biotic and abiotic stresses. Kim et al. (2011) reported the first application of RNA-seq for profiling gene expression in soybean in response to pathogen attack. In their study, the transcriptomes of two near isogenic lines (NILs), one resistant and one susceptible to bacterial leaf pustule (Xanthomonas axonopodis pv. glycines), were analyzed 0, 6, and 12 hai and a total of 2761 DEGs, including a set of defense response genes, were identified. In the following years, transcriptomic analyses were reported on Asian soybean rust (Phakopsora pachyrhizi) [11], soybean root and stem rot (Phytophthora sojae) [10] in soybean, and soybean cyst nematode (SCN; Heterodera glycines) in common bean [12]. By which the responsive genes and metabolism pathways associated with disease resistance have been identified. Unfortunately, no such a study in soybean/Pm interaction has been reported so far. Here, we provided data demonstrating there were significant transcriptomic variations in SDM-resistant and susceptible genotypes of soybean in response to Pm-infection.

SDM resistance and inoculation
In our previous study, we cloned a GmSAGT1 gene from a HR genotype of soybean, Zaofeng5. After 4 h of SA induction, the transcript level of this gene in the HR genotype was 13 times higher than that in HS genotype Heinong10, demonstrating that higher expression of GmSAGT1 gene was associated with increased SDM resistance [7]. In the current study, we used an additional pair of genotypes i.e. JL1 (HR) and KF1 (HS) as plant materials for conducting transcriptomic analysis. JL1 was bred by the Soybean Institute, Jilin Academy of Agricultural Sciences in 1979. This cultivar was derived from a cross between cultivated soybean (G. max) and wild spices (G. sojae, GD50477). It featured as small grain size, relative narrow and long leaves, early maturity and higher resistance to SDM. These characters might be derived from wild species of soybean [13,14]. The KF1 was a commercial variety in late 1990s with SMV (soybean mosaic virus) resistance that later on turned to be highly susceptible to SDM. For understanding differential expression of genes in response to Pm infection, it is important to know the sampling time point corresponding to what stage of pathogen invasion. In this study, 72 hai was used. According to the literature, the 72 hai was corresponding to the stage of hyphae growth and haustoria formation of Pm fungi in the soybean plant tissue. Riggle (1974) conducted a detailed histopathological study of Pm on leaves of soybean cultivars with varying degree of resistance and on a non-host. It indicated that the germination of conidia, germ tube growth, appressorium formation, development of penetration pegs, and hyphal growth for 36 hai were about the same on all categories of plants, including a susceptible, a resistant near isogenic line (NIL), a heterozygous line for resistance, a moderately resistant cultivar and a non-host common bean. During 36 to 120 hai, hyphae grew rapidly on the leaves of susceptible cultivars and the number of haustoria/mm of hyphae was high. However, on the resistant NIL, the hyphae growth was very slow and few haustoria were formed. The author concluded that the resistance appeared to be related to slow rate of hyphal growth and the ability of hyphae to form haustoria. These data suggested that the 72 hai used in the current study was the right time point to represent the early stage of Pm invasion. The DEGs identified in the HR and HS genotype at this stage might provide insight and expand our knowledge in understanding the molecular basis of SDM resistance.

SDM defense responsive genes
By comparison of the read counts of the 52 SDMdefense responsive DEGs, it was interesting to find that the expression levels of most DEGs (73%) in HR noninoculated samples were higher than that in the HS non-inoculated ones ( Table 2). It indicated that the basal expression levels of these DEGs might be sufficient for resistance and could be serve as basal defense responsive genes. In contrast, by comparison of the log 2 fold change value of inoculated vs non-inoculated samples of HR and HS, it showed that the susceptible KF1 reacted to Pm attack more rapidly and strongly (Table 2). This was not surprising since SDM lesions were developed on leaves of HS plant 72 hai, while during this period only few hypersensitive spots showed on leaves of HR plant. Although HS/KF1 responded to fungal attack more quickly, the read counts of DEGs (30/52, 58%) were still lower than that in HR/JL1 after inoculation (Table 2). It should also note that, after Pm inoculation, the log2 fold change value of 83% DEGs in HR and 96% DEGs in HS were greater than 2.0 (Table 2), demonstrating additive effects existed following fungal invasion.
Plants have developed two defense mechanisms in its evolution, which are (i) pathogen triggered immunity (PTI) that confers non-host resistance by recognizing a broad range of pathogens with conserved molecular pattern. (ii) effectors-triggered immunity (ETI), in which the pathogen effectors are recognized by specific R genes that encode nucleotide binding site and NBS-LRR domains [15,16]. Unfortunately, to the best of our knowledge, the ETI genes specifically resistant to Pm have not yet been reported to date in soybean due to both genetic and pathogenic analysis at molecular level are lacking. As such, the gene-for-gene interaction between soybean and Pm is unclear so that it is not possible to classify the putative genes detected in this study into PTI and ETI. Therefore, we further analyzed the genes involving in phytohormone signalling pathways and the genes encoding PR proteins as well as TFs based on the existing knowledge and the functional annotation of DEGs in GO and KEGG analysis in the study.
It has been shown that SA, JA and ET play major roles in response to biotic stress as their levels increase with pathogen infection [17,18]. Upon pathogen infection, the early signalling events occurred, including increase of intracellular Ca 2+ concentration and production of secondary signalling molecules such as ROS. In our GO analysis, 5 of the most enriched DEGs (iron ion homeostasis, metal ion homeostasis, cation homeostasis, ion homeostasis, chemical homeostasis) under the category of biological process were all associated with ion homeostasis and 4 DEGs (ferroxidase activity, oxidoreductase activity, ferric iron binding, isocitrate lyase activity) in the category of molecular function were associated with oxidation as well as ROS production (Table 1). In the analysis of SDM defense responsive genes, 10 DEGs coding for calmodulin proteins and 1 coding for calmodulin-regulated Ca 2+ ATPase (ACA4) were identified (Table 2), indicating a wide range of genes participated in the modulation of Ca 2+ metabolism.
Studies have shown that Ca 2+ functions in concert with other important second messengers like ROS [19]. In the SA signalling pathway, a rapid increase in the rate of ROS production, known as 'the oxidative burst' , occurs in response to biotic and abiotic stresses [20]. Higher rate of ROS production results in the induction of PR genes expression which in turn leading to a longlasting and broad-spectrum induced resistance known as systemic acquired resistance (SAR) [18]. Once SA pathway is activated, the expression of downstream PR genes is induced. Results obtained in this study indicated there were four genes encoding PR proteins (Glyma01g31730, Glyma03g30420, Glyma06g40710, Glyma16g06940) that were highly differential expressed (Table 2).
In soybean, our previous study indicated that after SA induction, the peroxisome-located gene GmSAGT1 and its upstream gene GmGOX coding for glycolate oxidase were highly expressed in soybean resistant variety than that in susceptible ones [7]. The function of GmGOX protein is to convert glycolic acid into glyoxylate and H 2 O 2 by oxidation [21]. Interestingly, in the current study we have identified seven DEGs coding for oxidative enzymes, including one glyoxylase, two isocitrate lyases and four peroxidases. It is known the isocitrate lyase is a key enzyme in the glyoxylic acid cycle, which catalyzes the lysis of isocitric acid to produce succinic acid and glyoxylic acid. Subsequently, the resultant glyoxylic acid is further converted to serine and H 2 O 2 . These findings have provided evidence that SA in conjunction with ROS appear to be involved in the establishment of systemic defenses and play an essential role in H 2 O 2 accumulation, PR gene expression, hypersensitive response and SAR induction [22].
SA is generally involved in the activation of defense responses against biotrophic and hemi-biotrophic pathogens [23], whereas, JA and ET are responsible for defense against necrotrophic pathogens [24,25]. The P. manshurica is a biotrophic pathogen and the annotation of the most DEGs identified in this study let us reasonably to suggest SA signalling coupled with ROS is a key prominent pathway in soybean in response to Pm infection. Based on above results, a schematic diagram was drawn in describing the molecular processes and the involvement of DEGs in soybean host upon Pm-invasion (Fig. 5).
Recent studies have shown that there is sophisticated crosstalk between different hormone-mediated pathways. In a recent review article, Verma et al. (2016) have emphasized that the defense responses activated in plants in response to different stresses depends on the type of crosstalk (positive or negative) between the hormone signaling pathways rather than solely on the individual contributions of each hormone.
It is commonly viewed that SA and JA regulate biotic stress responses antagonistically [17]. Studies have shown NPR1 to be a key player in the antagonistic crosstalk of SA and JA [26]. The SA-facilitated suppression of JA-responsive genes like LIPOXYGENASE 2 (LOX2) has been reported [27]. Interestingly, an isozyme LOX1 was also detected in the present study, however, it was upregulated after Pm inoculation. Moreover, it is noteworthy that, in contrast to antagonistic crosstalk between SA and JA, we found they are simultaneously activated since seven JAZ and two FLS2 were highly differential expressed in both HR/JL1 and HS/KF1 after Pm infection. Although most studies prove antagonistic interaction between SA and JA, synergistic interactions have also been observed at low SA-JA concentrations and by simultaneous induction of both defenses [28,29].
The DEGs involving in ET, BR and AUX signalling pathways were detected in this study. It was reported that JA and ET operate synergistically in regulating defense related genes after pathogen infection [30]. However, in considering JA and ET are responsible for defense against necrotrophic pathogens [30] and auxin is mainly responsible for plant growth and development [18], the roles of these signalling in SDM defense responses require further elucidation. Due to the complexity of crosstalk between different plant hormones and in line with the simultaneous activation of SA and JA observed in this study, it is of great interest to investigate how they crosstalk one another and whether they are synergistically reacted. All of these remain to be addressed.

Transcription factors
In the study, three types of TFs including one bHLH (the core JA-signalling components MYC2), six MYBs and three WRKYs, which differentially expressed in HR/ JL1i vs. JL1ni and HS/KF1i vs. KF1ni, were identified ( Table 2).
MYC2 is a basic helix-loop-helix (bHLH) TF that recognizes the G-box and G-box variants in the promoter of its target genes and regulates different branches of the JA pathway [31][32][33][34][35][36][37]. MYB1 gene in tobacco is known to encode a signaling component downstream of SA that may participate in transcriptional activation of PR genes and plant disease resistance [38]. MYBs and WRKYs are large families in plant genome, in soybean there are 252 and 188 members of MYB [39] and WRKY families [40], respectively. WRKY proteins, a large family of transcriptional regulators that has to date only been found in plants. Many previous studies have shown WRKYs are involved in the plant response to biotic and abiotic stresses either act as transcriptional activators or repressors [41][42][43]. In soybean, considerable effort has gone into identifying WRKY TFs and its mediated gene expression in response to pathogen invasion. Bencke-Malato et al. [11] have conducted a detailed study on genome-wide annotation of the soybean WRKY family and functional characterization of genes involved in response to Phakopsora pachyrhizi (Asian soybean rust) infection. In this study, we have identified three WRKYs, i. e. GmWRKY4 (Glyma01g31921), 31 (Glyma04g06495) and 156 (Glyma17g04710) (the name of GmWRKYs are given according to the Yu et al., [40]), that are highly differential expressed following Pm inoculation. In particular, their expression level has increased 4.13-to 5.14-fold in HS/KF1, while it has increased 1.39-to 3.95fold in HR/JL1. To date, the roles of WRKYs in soybean immune response to different pathogen attack and how WRKYs react with their target genes or crosstalk to the genes involving in the SA-, JA-and other plant hormone signalling remain largely unknown.

Conclusion
Comparative transcriptomic analysis of HR and HS genotype with and without inoculation allowed us to explore defense responsive genes to P. manshurica infection. Totally, 52 DEGs were identified in the study, which participated in the SA/ROS and JA signalling pathway resulting in the calcium modulation, oxidative burst and PR protein production. Of particular interest are the DEGs coding for glyoxylase, isocitrate lyase and an array of peroxidase. The PR proteins and TFs are also worth to be further investigated and characterized to identify candidate genes resistant to soybean downy mildew disease. As plant defense response is a very complex system, which needs further comprehensive analysis step by step.

Plant materials and Pm-infection
In our previous study [7], 15 soybean genotypes were classified into five categories: HR (highly resistant), R (resistant), MR (moderate resistant), S (susceptible) and HS (highly susceptible) by using in vitro leaf/Pm inoculation or by field observation. In the current study, we have used JL1 (HR) and KF1 (HS) as plant materials for conducting transcriptomic analysis.
To prepare source of Pm, infected leaves (mixture of different races) were collected from soybean field at Shenyang Agricultural University. Collected leaves were placed on moisture paper in a Petri dish and maintained at 20°C overnight for proliferation of conidia. Leaves were then washed with autoclaved double distilled water to collect conidia followed by dilution to a final concentration of 5×10 5 conidia/ml. Back side of the top third expanded leaves of 40 d-old seedlings were inoculated using brush pen. Leaves of JL1 and KF1 brushed with distilled water in the same manner without pathogen were used as a control. Plants were then placed in a growth chamber under conditions of 20/24°C, 14 h dark/10 h light cycle and relative humidity ≥95%.

Sample preparation and RNA sequencing
Total RNA was extracted from leaves 72 hai using RNAprep Pure Plant Kit (Tiangene Co., China) following the manufacturer's instructions. To prepare cDNA samples for transcriptome sequencing, the quantity of total RNA was assessed with Nanodrop spectrophotometer (Thermo Fisher Scientific Inc., USA). The RNA quality and integrity was evaluated on the Agilent Bioanalyzer 2100 system (Agilent Technologies, USA) and then processed with NEBNext Ultra RNA Library Prep Kit. The cDNA fragments of~150 bp in length were selected and further enriched with PCR amplification to generate cDNA libraries. Four cDNA libraries derived from JL1ni, JL1i, KF1ni and KF1i, each with two biological replicates, were used for Illumina sequencing. RNA-sequencing was conducted by Allwegene Co., China using equipment of HiSeqTM2000/ MiSeq, which were technically repeated three times.

Reads filtering, alignment and de novo assembly
Before assembly, the raw reads were first filtered to gain high quality clean reads by removing low quality reads as well as reads containing adaptor or unknown nucleotides larger than 10%. Clean reads were de novo assembled into transcripts based on the soybean reference genome sequence. For removing redundancy, when a component contained more than one assembled transcript, only the longest one was preserved. The clean reads from each RNA-seq library were aligned with soybean genome (Glycine max Wm82.a2.v1) as a reference (http://www.phytozome.net/soybean) using TopHat software (http://tophat.cbcb.umd.edu). Resulted read counts for each matched gene were normalized with RPKM (reads per kilo bases per million reads) to measure its expression level. For gene annotation, the matched genes were searched by BLAST against the databases of Phytozome and NCBI (www.ncbi.nlm.nih.gov/).

Differential expression of genes in response to Pm infection
Genes differentially expressed in JL1i vs. JL1ni and KF1i vs. KF1ni with P-value < 0.001 were selected and the difference of expression level was log 2 -transformed and filtered at a level of 2-fold or greater. These genes were considered as DEGs. DEGs in inoculated and noninoculated leaves of HR and HS genotype were represented as Venn diagrams.
Annotation of the up-and down-regulated genes was carried out based on the identification of conserved PFAM domains (pfam.xfam.org). PFAM domains were converted into GO-identities by mapping to GO (http:// www.geneontology.org).
DEGs were further clustered according to the expression patterns using H-cluster method. The significantly enriched GO terms involving in biological process, cellular component and molecular function in response to soybean/Pm interaction were identified. In addition, the DEGs were mapped to soybean KEGG database using the DEGseq package [44] to search the genes involving in different KEGG pathways. To identify genes in soybean immune responses to Pm infection, the up-and down-regulated DEGs and TFs were analyzed based on annotation of their function.
Validation of RNA-seq data by qRT-PCR Seven DEGs were randomly selected for qRT-PCR analysis for validation of RNA-seq data. In brief, 1.0 μg of total RNA prepared from each sample was first reverse transcribed into its cDNA with oligo(dT)18 through M-MLV Reverse Transcriptase (TaKaRa, Japan). The qPCR was then performed on an ABI 7500 Real-Time System (Applied Biosystems, USA) with a 20 μl reaction system. In which it contained 2 μl of cDNA, 0.8 μl of each primer, 6 μl of sterile water, 10 μl of SYBR premix TagII (TaKaRa, Japan), and 0.4 μl of Rox Reference DyeII (TaKaRa, Japan). The conditions for PCR amplification were: 15 s denaturation at 95°C followed by 40 cycles at 95°C for 5 s, 60°C for 34 s, and 72°C for 10 s. Primers used for qRT-PCR were designed by DNAMAN software and listed as Additional file 6: Table S6.
Soybean Actin gene (Glyma 08g19420) was used as an internal control. Samples were analyzed based on the stable expression level of the reference gene according to the method described by Livak & Schmittgen [45]. Three independent replicates were performed for each sample. The data of mean and standard deviation were calculated and statistically analyzed by Student's T-test.