Identification of novel soybean microRNAs involved in abiotic and biotic stresses

Background Small RNAs (19-24 nt) are key regulators of gene expression that guide both transcriptional and post-transcriptional silencing mechanisms in eukaryotes. Current studies have demonstrated that microRNAs (miRNAs) act in several plant pathways associated with tissue proliferation, differentiation, and development and in response to abiotic and biotic stresses. In order to identify new miRNAs in soybean and to verify those that are possibly water deficit and rust-stress regulated, eight libraries of small RNAs were constructed and submitted to Solexa sequencing. Results The libraries were developed from drought-sensitive and tolerant seedlings and rust-susceptible and resistant soybeans with or without stressors. Sequencing the library and subsequent analyses detected 256 miRNAs. From this total, we identified 24 families of novel miRNAs that had not been reported before, six families of conserved miRNAs that exist in other plants species, and 22 families previously reported in soybean. We also observed the presence of several isomiRNAs during our analyses. To validate novel miRNAs, we performed RT-qPCR across the eight different libraries. Among the 11 miRNAs analyzed, all showed different expression profiles during biotic and abiotic stresses to soybean. The majority of miRNAs were up-regulated during water deficit stress in the sensitive plants. However, for the tolerant genotype, most of the miRNAs were down regulated. The pattern of miRNAs expression was also different for the distinct genotypes submitted to the pathogen stress. Most miRNAs were down regulated during the fungus infection in the susceptible genotype; however, in the resistant genotype, most miRNAs did not vary during rust attack. A prediction of the putative targets was carried out for conserved and novel miRNAs families. Conclusions Validation of our results with quantitative RT-qPCR revealed that Solexa sequencing is a powerful tool for miRNA discovery. The identification of differentially expressed plant miRNAs provides molecular evidence for the possible involvement of miRNAs in the process of water deficit- and rust-stress responses.


Background
Small, non-coding RNAs have been characterized in plants as important factors involved in gene expression regulation in developmental processes [1,2], as well as adaption to biotic and abiotic stress conditions [3,4]. In general, small RNAs are grouped into two major classes: microRNAs (miRNAs) and short-interfering RNAs (siRNAs). These two classes of small RNAs cannot be discriminated by either their chemical composition or mechanism of action [5,6]. However, siRNAs and miR-NAs can be distinguished by their origin, evolutionary conservation and the types of genes that they silence [5,6]. In this way, miRNAs are well differentiated due to some particular characteristics. These characteristics include the following: derived from genomic loci distinct from other recognized genes, processed from transcripts that can form local RNA hairpin structures, and usually, miRNAs sequences are nearly always conserved in related organisms [6,7].
In plants, MIRNA genes are transcribed by RNA polymerase II enzymes (Pol II) generating primary miRNA (pri-miRNA). The pri-miRNA forms an imperfect foldback structure, which is processed into a stem-loop precursor (pre-miRNA) by nuclear RNaseIII-like enzymes called DICER-LIKE proteins (e.g., DCL1) [8]. The resulting pre-miRNA contains a miRNA:miRNA* intermediate duplex, formed by a self-complementary fold-back structure. A mature miRNA sequence can range from 19 to 24 nucleotides (nt) in length and act as a regulatory molecule in post-transcriptional gene silencing by base pairing with target mRNAs. This leads to mRNA cleavage or translational repression, depending on the degree of complementarity between the miRNA and its target transcript [6,9]. The same mature miRNA can also present several variants of their sequence in length. These populations of miRNA variants are called isomiR-NAs, which are isoforms of microRNAs [10]. They are caused by an imprecise or alternative cleavage of Dicer during pre-miRNA processing [10]. IsomiRNAs have been recently identified in both plants and animals [10][11][12].
In soybean (Glycine max (L.) Merrill), the major legume crop worldwide, Subramanian et al. in 2008 [26] identified 35 novel miRNA families for the first time. In this study, the role of miRNAs in soybean-rhizobial symbiosis was investigated [26]. During that same year, Zhang et al. [27] used a comparative genome-based in silico screening of soybean EST databases and quantitative PCR to provide evidence for 69 miRNAs belonging to 33 families. A second study involving miRNAs and soybean root nodules was performed by Wang and colleagues [28]. They identified 32 miRNAs belonging to 11 miRNA families. The identification of nine novel miRNAs in wild soybean (Glycine soja) was also reported by Chen et al. [29]. Another study looked at four different soybean tissues (root, seed, flower and nodule) and identified 87 novel soybean miRNAs [30]. Recently, Song and coworkers [31] identified 26 new miRNAs and their related target genes from developing soybean seeds. Although these studies resulted in a large number of miRNAs identified in soybean, none of them looked at microRNAs with respect to biotic and abiotic stresses.
Drought is the major abiotic stress factor to negatively affect soybean productivity around the world. The impact of limited water during the flower formation can cause shorter flowering periods [32,33], and water stress during the later phases of soybean reproductive development has been reported to accelerate senescence, which decreases the duration of the seed-filling period [32,33]. With regards to biotic stress, Asian soybean rust (ASR) is a foliar disease caused by the fungus Phakopsora pachyrhizi Sydow & Sydow. This pathogen presents a rapid aerial spread and a high capacity to colonize leaf tissue and, to a lesser extent, stem and pods [34]. ASR is one of the most severe diseases on the soybean culture, which causes damage between 10% and 90% in the different regions where it has been identified [35,36]. This disease is the main threat in soybean-producing countries.
Currently, there are 203 miRNAs identified in Glycine max (miRBase database, release 16, http://www.mirbase. org/); however, none of these miRNAs were associated with water deficit or ASR stress conditions. We consider that the identification of these miRNAs is important to understanding small RNA-mediated gene regulation in soybean roots under water deficit stress and in leaves during rust infection. In this context, our goal was to identify new miRNAs and to discover those that may be regulated by water deficit and soybean rust stress. Using high-throughput sequencing, we constructed four libraries of small RNAs from the roots of drought-sensitive and tolerant seedlings in response to control or water deficit conditions. We also constructed four libraries from leaves of rust-susceptible and resistant seedlings with mock and infected conditions. A set of eight small RNAs libraries was analyzed from soybean plants. A total of 256 miRNAs were detected in Solexa sequencing. We discovered 24 novel miRNAs families and also detected several isomiRNAs in soybean. In our RT-qPCR analysis, we verified that the expression profile of several miRNAs varied during abiotic and biotic stresses. This study has important implications for gene regulation under water deficit and pathogen-infection conditions and also contributes significantly to increase the number of identified miRNAs in soybean.

Plant materials and treatments
Water deficit assay For water deficit treatment, we used the soybean (Glycine max (L.) Merrill) cultivars 'Embrapa 48' as a drought-tolerant standard and 'BR 16' as a sensitive standard [37]. Plants were grown in a greenhouse at Embrapa-Soybean (Londrina, Brazil) using a hydroponic system compound for plastic containers (30 liters) and an aerated pH 6.6-balanced nutrient solution. Seeds were pre-germinated on moist filter paper in the dark at 25°C ± 1°C and in 65% ± 5% relative humidity. Plantlets were then placed in polystyrene supports so the roots of the seedlings were completely immersed in the nutrient solution. Each seedling tray was maintained in a greenhouse at 25°C ± 2°C and in 60% ± 5% relative humidity under natural daylight (photosynthetic photon flux density (PPFD) = 1.5 × 10 3 μmoles m -2 s -1 , equivalent to 8.93 × 10 4 lux) for a 12 h day. After 15 days, seedlings with the first trifoliate leaf fully developed (V2 developmental stage) [38] were submitted to different waterdeficit treatments according to Martins et al. [39]. The nutrient solution was removed from each plastic container where the roots were kept in the tray in the dark without nutrient solution or water for 0 minutes (T0 or control), 125 minutes (T125) and 150 minutes (T150). At the end of each water-deficit period, the roots of the seedlings were immediately frozen in liquid nitrogen and stored at -80°C until RNA extraction. The experimental design was a factorial (cultivars × duration of water deficit) with three replicates. Each replicate was composed of five plantlets that were sampled in bulk. Four libraries of small RNAs were constructed for the water deficit-stress assays from the following root tissues: 1) roots of drought-sensitive seedlings submitted to 0 minutes of stress (Drought-Sensitive Root Control (DSRC)); 2) roots of drought-sensitive seedlings submitted to 125 minutes and 150 minutes of stress (Drought-Sensitive Root Treated (DSRT)); 3) roots from drought-tolerant seedlings submitted to 0 minutes of stress (Drought-Tolerant Root Control (DTRC)); and 4) roots of drought-tolerant seedlings submitted to 125 minutes and 150 minutes of stress (Drought-Tolerant Root Treated (DTRT)).

Asian Soybean Rust assay
The ASR reaction was evaluated in soybean plants in a greenhouse at Embrapa-Soybean (Londrina, Brazil) using a field population of Phakopsora pachyrhizi collected from soybean fields in the state of Mato Grosso, which were maintained for over 10 generations on the susceptible cv. BRSMS-Bacuri. ASR identification was confirmed by ITS-sequencing analysis as described by Silva et al. [40], and it revealed a similarity to the MUT Zimbabwe isolate. The soybean plants were grown in a pot-based system. The 'Embrapa 48' genotype was used as a susceptible host plant, which develops a susceptible lesion (TAN) after Phakopsora pachyrhizi infection. The 'PI561356' genotype was used as the resistant host, which carries an ASR resistance gene mapped onto linkage group G (Ricardo V. Abdelnoor, personal communication) and develops a reddish-brown (RB) lesion with few or no spores.
Urediniospores were collected from infected BRSMS-Bacuri plants in a separate greenhouse by tapping infected leaves over a plastic tray. The urediniospores were then diluted in distilled water with 0.05% Tween-20 to a final concentration of 3 × 10 5 spores/mL. This spore suspension was sprayed onto three plants per pot at the V2 to V3 growth stages [38]. A solution without the spores was used for the mock inoculations. Following the ASR or mock inoculations, water-misting bags were placed over all plantlets for one day to aid the infection process and to prevent cross-contamination of the mock-infected plants. The third trifoliolate leaves of six plants were collected 12 hours after inoculation (hai) for RNA extraction. The experiment followed a completely randomized design with the three replicates as blocks and a full factorial treatment structure consisting of three treatment factors: hai (12 hours), genotype (resistant or susceptible), and inoculation type (ASR or mock).

RNA extraction and sequencing
Total RNA was isolated from fresh leaves and root materials using Trizol (Invitrogen, CA, USA), and the RNA quality was evaluated by electrophoresis on a 1% agarose gel. The amount of the RNA was verified using a Quibit fluorometer and Quant-iT RNA assay kit according to the manufacturer's instructions (Invitrogen, CA, USA). Total RNA ( > 10 μg) was sent to Fasteris Life Sciences SA (Plan-les-Ouates, Switzerland) for processing and sequencing using Solexa technology on the Illumina Genome Analyzer GAII. The libraries were constructed from the eight bar-coded samples (DSRC, DSRT, DTRC, DTRT, RSLC, RSLI, RRLC and RRLI) sequenced in a total of two channels. Quality scores were generated from Illumina's data analysis pipeline, which are similar to SAGE Phred scores with a maximum value of 40. Quality scores are based on the relative confidence of base calls using elements of cluster generation and image quality. Briefly, the processing by Illumina for the miRNA analyses consisted of the following successive steps: acrylamide gel purification of the RNA bands corresponding to the size range 20-30 nt, ligation of the 3' and 5' adapters to the RNA in two separate subsequent steps each followed by acrylamide gel purification, (3) cDNA synthesis followed by acrylamide gel purification, and a final step of PCR amplification to generate cDNA colonies template library for Illumina sequencing. After removing the adapter sequences, the sequences were trimmed into different read lengths from 19 to 24 nt for further analysis.

Prediction of miRNAs
The reads were grouped into unique sequences, and the read counts were calculated for each library. The sequences that presented low read counts (read count < = 2) were discarded from the final list of unique sequences, which are referred to as a tag. The sequences were mapped into the soybean genome (http://www. phytozome.net) assembly using the SOAP program [41], which returns information concerning the alignment position, chromosome number and strand. No mismatches were allowed in the alignments. The tag alignment position's upstream and downstream genomic sequences (200 bp each) were extracted from the genome assembly using homemade Perl scripts. These genomic regions were then aligned against the reverse complement of its respective tag (rc-tag) using the Smith-Waterman algorithm [42]. To ensure that these pre-miRNA sequences could be precisely processed into mature miRNA, the candidates were examined according the following criteria [43]: i) the miRNA and antisense miRNA should derive from the opposite stemarms and must be entirely within the arm of the hairpin; ii) base-pairing between the miRNA and anti-sense miRNA were restricted to four or fewer mismatches; and iii) the frequency of asymmetric bulges was restricted to less than one and the size should be less than two bases. The genomic regions that were not possible to align the tag and rc-tag were discarded. Finally, the genomic regions that were limited between the alignment positions of the tag and rc-tag were considered as pre-microRNA candidates. From all the pre-microRNA candidate sequences that we selected, only the ones with no more than five matches to the soybean genome were selected for analyzing the secondary structure using the RNA-folding program Mfold [44]. If a perfect stem-loop structure was formed, the small RNA sequence was at one arm of the stem, and the respective anti-sense sequence was at the opposite arm; then, the small RNA was consisted as a novel soybean miRNA.

miRNA validation and expression analysis by RT-qPCR
To validate predicted new miRNAs, RT-qPCR in respect to eleven miRNAs was performed to examine their expression across the eight different libraries. From those, six were new miRNAs belonging to conserved soybean miRNAs families (MIR166a-5p, MIR166f, MIR169f-3p, MIR482bd-3p, MIR1513c, MIR4415b); one new miRNA pertencing to conserved miRNAs families in other plants species (MIR397ab); and four were miR-NAs belonging to novel miRNAs families (MIR-Seq07, MIR-Seq11, MIR-Seq13, MIR-Seq15ab). The forward miRNAs primers were designed based on the full miR-NAs sequence, and the reverse primer was the universal reverse primer for miRNA [45]. The stem-loop primer, used for miRNA cDNA synthesis, was designed according to Cheng et al. [45]. The stem-loop sequence consisted of 44 conserved and six variable nucleotides that were specific to the 3' end of the miRNA sequence (5' GTCGTATCCAGTGCAGGGTCCGAGGTATTCG-CACTGGATACGACNNNNNN 3'). The RT-qPCR was performed in an ABI 7500 Real-Time PCR System (Applied Biosystems) using SYBR Green I (Invitrogen) to detect double-stranded cDNA synthesis. Reactions were completed in a volume of 24 μL containing 12 μL of diluted cDNA (1:50), 1X SYBR Green I (Invitrogen), 0.025 mM dNTP, 1X PCR Buffer, 3 mM MgCl 2 , 0.25 U Platinum Taq DNA Polymerase (Invitrogen) and 200 nM of each reverse and forward primer. The universal reverse primer (5' GTGCAGGGTCCGAGGT 3') was used in all RT-qPCR reactions. Samples were analyzed in biological triplicate in a 96-well plate, and a no-template control was included. We used MIR156b (5'-TGACAGAAGAGAGAGAGCACA -3'), MIR172ab (5'-AGAATCTTGATGATGCTGCAT -3') and MIR1520d (5'-ATCAGAACATGACACGTGACAA -3') as reference genes, which has been demonstrated as optimal normalizers for water deficit and rust-stress analysis in Glycine max [46]. The conditions were set as the following: an initial polymerase activation step for 5 minutes at 94°C, 40 cycles for 15 seconds at 94°C for denaturation, 10 seconds at 60°C for annealing and 25 seconds at 72°C for elongation. A melting curve analysis was programmed at the end of the PCR run over the range 65-99, increasing the temperature stepwise by 0.4°C. Threshold and baselines were manually determined using the ABI 7500 Real-Time PCR System SDS Software v2.0. To calculate the relative expression of the miRNAs, we used the 2 -ΔΔCt method. Student's t-test was performed to compare pair-wise differences in expression. The parameters of two-tailed distribution and two samples assuming unequal variances were established. The means were considered significantly different when P < 0.05.

Prediction of miRNA targets
Target prediction for miRNAs is straightforward because it is assumed that most of them match their targets with almost perfect complementarity [8,9]. The putative target genes for all miRNAs identified were searched for by using the web-based computer psRNA Target Server (http://biocomp5.noble.org/psRNATarget/) [47] which can identify putative targets that may be regulated at post-transcriptional or at translational levels. Mature miRNA sequences were used as queries to search for potential target mRNAs in the Glycine max database (DFCI gene index release 15). The total scoring for an alignment was calculated based on the miRNA length, and the sequences were considered to be miRNA targets if the total score were less than 3.0 points (mismatch = 1 and G:U = 0.5). Results from these analyses were individually inspected on the Phytozome, where the loci and protein annotation were obtained. In order to look for evidences of the predicted targets of the novel identified miRNA, we searched for the miRNA targets sites in the soybean degradome libraries published by Song et al. [31] available under NCBI-GEO accession nμ. GSE25260. Finally, all putative targets regulated by soybean new miRNAs which were analyzed by RT-qPCR were subjected to AgriGO database to investigate the gene ontology [48].

Results
To identify miRNAs from soybean under water deficit and rust stresses, we generated eight libraries of small RNAs species. From these libraries, a total of 256 miR-NAs ranged from 19 to 24 nt-long sequence sizes were identified ( Table 1). All pre-miRNA sequence candidates that were selected by the parameters stipulated during the miRNA prediction and those that had no more than five matches on the soybean genome were folded using the Mfold program. All miRNA sequences with the respective precursor sequence originating at a hairpin structure were submitted to the miRBase to determine if they were a new or known miRNA. We separated the results of these miRNAs according the following classes: novel miRNAs belonging to miRNAs families never detected before (29 miRNAs); new miRNAs belonging to conserved miRNA families in other plants species detected for the first time in soybean (15 miRNAs); miRNAs belonging to conserved miRNA soybean families (71 miRNAs); different isoforms of new and known miRNAs (121 isoforms); and known miRNAs already deposited into the miRBase database (20 miR-NAs) ( Table 1).

Identification of novel miRNAs from soybean
A total of 29 new miRNAs belonging to 24 novel families ( Table 2) were identified by Solexa sequencing in libraries from water deficit and rust infections of Glycine max. These families were provisory nominated Seq01 to Seq25 ( Table 2). The precursor miRNA sequences varied from 55 to 239 nt in length. Precursors of these novel miRNAs were identified, and they formed proper secondary hairpin structures, with MFEs ranging from -16.50 to -153.80 kcal/mol (Additional file 1). The most abundant mature miRNAs were 21 nt in length. We also evaluated the genomic location of the new miRNAs ( Table 2). Of the 29 new miRNAs genes identified in soybean, around 86% (25) were located in intergenic regions and the rest were situated inside genes. The mature miRNAs sequences were localized inside the stem-loop sequence with almost half in each arm: 17 miRNAs were localized in the 3' arm and 12 miRNAs were in the 5' arm. More than 63% of the pre-miRNA sequences were in the same sense direction (+) as the soybean genome annotation. For all 24 novel families identified, four were compounded by miRNAs provided from two loci, and we detected only one miRNA member for the rest. Sense and anti-sense miRNAs were detected only in one family, the Seq10, and both were nominated according the arm localization (3p or 5p). Most of the new mature miRNA sequences presented a uracil (U) as their first nucleotide, which is in agreement with previous results for soybean root sequences [26].

Identification of homologues miRNAs of other plant species
To determine whether any of the miRNAs identified in our libraries were conserved among other plant species, we searched miRBase for homologues. Besides the novel families identified, we also detected 15 miRNAs belonging to six conserved families in other plants species (Table 3). The families MIR170, MIR395, MIR397, MIR408, MIR2118 and MIR3522 were detected for the first time in soybean. For families MIR170 and MIR3522, only a single locus was identified, and for MIR408, three genes were found. In two families, MIR408 and MIR2118, we detected sense and antisense    (Table 3). We observed two families (MIR2118 and MIR3522) to be conserved between Glycine max and Glycine soja; however, we expect that more miRNA families could be conserved between these species considering that they are closely related. This low number is probably due to Glycine soja showing fewer miRNAs identified to date.

Identification of conserved soybean miRNAs
To identify conserved soybean miRNAs, all 256 sequences were searched using BLASTn against the soybean miRNAs in miRBase. We identified 22 families of conserved soybean miRNAs in our libraries. Only 20 miRNA soybean genes that were registered in the miR-Base were observed (indicated by the number five in Table 1). From the remaining 71 miRNA genes, 12 were miRNAs antisense (in the opposite arm) to the miRNAs presents in miRBase (indicated as group four in Table  4), and 59 were new members detected from new loci of known families (indicated by number three in Table 4). Of the 12 miRNAs identified from the opposite strand of previously known miRNAs, six were in the 5' arm and six in the 3'arm. For the 59 new members of conserved soybean families, 45 miRNAs were 21 nt in length. The family with the largest number of new miRNA genes (nine genes) was MIR319 (Table 4). Interestingly, in family MIR166, we found three new members with sense and antisense miRNAs. Also, in MIR159, two new genes with sequences originated from both the 3'and 5'arms were identified. One new gene was detected in MIR169, MIR172, MIR396 and MIR482 with mature sequences originated from both the 3'and 5'arms (Table 4). Similar to the observation for the novel soybean miRNAs (Table 2), the new genes in these conserved soybean families were compounded for a majority of mature miRNAs with a uracil as the first nucleotide in the 5' end.

Identification of miRNAs isoforms
Isoforms of microRNAs (isomiRNAs) are a population of known miRNA variants. They are caused by an imprecise or alternative cleavage of Dicer during pre-miRNA processing [10]. We detected numerous miRNAs with additional nucleotides in the 5'or 3' terminus compared to the recorded mature miRNAs. As isomiRNAs were previously reported in soybean highthroughput sequencing [31], we found 121 isomiRNAs in our libraries (Table 5). These isoforms were observed in 22 conserved miRNA families and in four novel families. These miRNA isoforms occurred in both strands from the 5' or 3' arm. The conserved MIR1507a and MIR1507b were found with the most isomiRNAs detected (eight isoforms each). MIR1507a showed a variation of three nucleotides in the 5'end and six nucleotides in the 3'end, and MIR1507b showed a variation of three and five nucleotides in the 5'and 3' terminal region respectively (Table 5). From the novel miRNAs identified, the MIR-Seq07 was the read with the most isoforms detected in our sequencing. This miRNA presented a total of 14 different sequences with 14 varying nucleotides in both the 5'and 3' ends from six fixed nucleotides (Table 5). All isoforms and their respective nominated mature miRNAs can be found in Additional File 1.

Validation of miRNAs validation and expression profile by RT-qPCR
The stem-loop RT-qPCR was used to validate and measure the expression of the respective miRNAs: MIR166a-5p, MIR166f, MIR169f-3p, MIR397ab, MIR482bd-3p, MIR1513c, MIR4415b, MIR-Seq07, MIR-Seq11, MIR-Seq13 and MIR-Seq15ab, detected by Solexa sequencing. These miRNAs were validated in all genotypes analyzed during dehydration and rust stress. The relative expressions of these miRNAs in the same eight conditions are shown in Figure 1.

Expression patterns of miRNAs during water deficit
To identify water deficit-responsive miRNAs, we compared the expression profiles of the 11 miRNAs in both genotypes before and after stress ( Figure 1A). A set of five different miRNAs (MIR166-5p, MIR169f-3p, MIR1513c, MIR397ab and MIR-Seq13) presented the same behavior during the water deficit stress. These miRNAs were commonly up-regulated during the stress condition in the sensitive genotype, and the opposite occur in the tolerant genotype, where they were downregulated during the water deficit. MIR-Seq11 and MIR-Seq15 demonstrated a similar expression across the four conditions. Water deficit significantly increased MIR-Seq11 and MIR-Seq15 expression in the roots compared to the control condition in the sensitive genotype, but both miRNAs did not vary in the tolerant plants.
MIR166f had its level increased in the sensitive genotype and decreased in the tolerant during the stress compared to the control situation. Interestingly, both genotypes presented the same level during the control condition. In the sensitive plants, MIR-482bd-3p showed a strong decrease when submitted to water deficit, being this low level equally observed in the tolerant genotype during the control condition and decreasing when subjected to stress. MIR4415b presented an effective rise in its expression level during the water deficit in the sensitive plants, and its high level was also observed in the tolerant genotype independent of the condition. Both sensitive and tolerant genotype exhibited the same expression pattern for MIR-Seq07 and its level was increased during the stress compared to the control situation.

Expression patterns of miRNAs during soybean rust stress
The RT-qPCR analyses of four libraries from the rust assays are shown in Figure 1B. The differential   a The group number refers to: (2) the miRNAs previously identified in other plant species as described in Table 2; (3) the new miRNAs identified in the families of conserved miRNAs in soybean; (4) miRNAs originated from the opposite arm of miRNAs previously identified; and (5) miRNAs registered in the miRBase database that were detected in our libraries. b Sequence conserved between all isoforms and the number of nucleotide variations in each end. c Total number of isoforms (isos) including the typical member for that gene. Figure 1 Effects of biotic and abiotic stresses on miRNA relative expression evaluated by RT-qPCR. A) Comparative analyses of four libraries from the water deficit experiment. For the water deficit-stress assay, the four libraries were named as: DSRC (drought-sensitive seedlings root submitted to 0 minutes of stress); DSRT (drought-sensitive seedlings root submitted to 125 minutes and 150 minutes of stress); DTRC (drought-tolerant seedlings root submitted to 0 minutes of stress) and DTRT (drought-tolerant seedlings root submitted to 125 minutes and 150 minutes of stress). B) Comparative analyses of four libraries from the rust infection experiment. For the rust-stress assay, the four libraries were named as: RSLC (rust-susceptible seedlings leaves mock inoculation); RSLI (rust-susceptible seedlings leaves with rust-spore inoculation); RRLC (rust-resistant seedlings leaves with mock inoculation) and RRLT (rust-resistant seedlings leaves with rust-spore inoculation). Samples that significantly differs (P < 0.05) according to a Students t-test statistical analysis, were label as: "*" effective differences between cultivars in control conditions; "a" effective differences between control and stressed conditions for sensitive or susceptible plants; "b" effective differences between control and stressed conditions for tolerant or resistant plants and "1" when an effective difference was also observed between sensitive or susceptible and tolerant or resistant under stress conditions. expression analyses revealed that MIR166a-5p, MIR166f, MIR169-3p, MIR397ab and MIR-Seq13 were dow-regulated in the susceptible genotype during pathogen infection, and equally expressed in the resistant plants. The level of MIR482bd-3p did not vary significantly between the two different conditions in the susceptible. However in the resistant genotype, its level is higher during the control condition and decrease with the pathogen attack. MIR1513c presented unchangeable expression in the control and stressed condition for both genotypes, but when we compared the two genotypes; the resistant was down-regulated compared to the susceptible. A strong decrease was observed for MIR4415b in the rust infection when compared with the control in the susceptible plants, and its level is higher in the resistant genotype showing no expression alteration between the conditions. MIR-Seq07 was down-regulated with respect to the soybean rust infection in both genotypes. Significant difference was observed in MIR-Seq11 expression between the mock and infected plants from the susceptible genotypes. This miRNA presented a low expression level after rust inoculation, and its level decreased in the resistant genotypes remaining similar in the both conditions. MIR-Seq15ab expression level was significantly decreased in the rust compared to the mock treatment in the susceptible genotype, the opposite occurs in the resistant genotype, when the control showed a lower level of expression compared to the stressed condition.

Target prediction of the soybean miRNAs
MiRNAs suppress gene expression by inhibiting translation, promoting mRNA decay or both [9]. Target gene identification is challenging due to many factors including the following: binding to their target mRNAs by partial complementarity over a short sequence, suppression of an individual target genes is often small, and targeting rules are not completely understood. We predicted the potential miRNAs targets in the psRNA database using all identified miRNAs as queries. The results of the analysis were divided into two tables, showing the targets predicted for the novel (Table 6) and for the conserved miRNAs families (Additional file 2). Among the 24 novel identified miRNAs families, only 14 families had targets predicted ( Table 6). The miRNAs families MIR-Seq01, MIR-Seq03, MIR-Seq06, MIR-Seq07, MIR-Seq08, MIR-Seq12 and MIR-Seq13 had multiple distinct targets. MIR-Seq10, MIR-Seq15 and MIR-Seq18 targeted only one locus. Although, MIR-Seq05, MIR-Seq11, MIR-Seq16 and MIR-Seq19 presented several loci as targets, all of them are coding for the same proteins. Fructose-bisphosphate aldolase, LRR (leucine-rich-repetitions)-containing proteins, translation elongation factor were predicted to be potential targets of the novel MIR-Seq07 which was investigated by RT-qPCR. The search for a target of the novel MIR-Seq11, also analyzed by RT-qPCR, showed a match to Glycine max peroxidase precursors mRNAs as potential targets. The oxidoreductase and a transcription regulator factor were predicted to be targeted by MIR-Seq13; and for the MIR-Seq15 only a translation initiator factor was predicted as a target.
After a comparative analysis of our novel identified miRNAs and the degradome libraries of developing soybean seeds it was possible to identify specific sequences in the degradome that corresponds to the downstream sequence of the predicted miRNA recognition site. We identified target sequences to six among the 24 novel soybean miRNAs (MIR-Seq01, MIR-Seq 06, MIR-Seq07, MIR-Seq11, MIR-Seq12 and MIR-Seq16). The list of the 10 identified genes is composed by a glucosyl transferase, serine carboxypeptidase, fructose biphosphate aldolase, three leucine-rich repeat protein, two peroxidases and two ATP dependent RNA helicases (Additional file 3).
Although many soybean conserved miRNAs targets have been predicted and validated by previous studies [26,27,30,31], we also investigated the possible targets for the 28 known families of miRNAs detected in our sequencing. Of these, only 21 families had predicted targets and they are listed in the Additional file 2. The conserved miRNA families showed multiples targets, however families MIR156, MIR172, MIR396, MIR397, MIR1510 and MIR1513 were highly conserved about their targets. For example, all members from the MIR156 family (which had a predicted target) targeted SBP (squamosa promoter binding)-domain protein. AP (2) APETALA 2 transcription factors were targeted by MIR172 family. The same occur with MIR396, MIR397, MIR1510 and MIR1513 families that targeted various genes families as GRF (growth regulating factor) transcription factor, multicopper oxidases, LRR (leucinerich-repetitions)-containing proteins and F-BOX domain proteins respectively. These results were already observed across several plant species (except for MIR1510 and MIR1513) [25,[49][50][51][52][53].

Gene Ontology analysis
The targets of those miRNAs which the expression was analyzed by RT-qPCR were also investigated in respect to their gene ontology (GO) [48]. Among the 11 miRNA genes, six presented target predictions, which were: MIR397ab, MIR1513c, MIR-Seq07, MIR-Seq11, MIR-Seq13 and MIR-Seq15ab. The putative soybean miRNAs targets presented diverse functions, however the most representative group was the proteins involved in oxidoreductase activity followed by the proteins involved in the catabolic process ( Figure 2). The result demonstrates that more than 76% of the target proteins are involved in oxidoreductase activity is consistent with the fact that some of the miRNAs libraries are originated from stressed plants. A consequence of many environmental stresses -including water deficit and pathogen attack -is a oxidative stress, i. e. the accumulation of reactive oxygen species (ROS), which damage cellular structures [49,54]. As miRNAs MIR397, MIR-Seq11 and MIR-Seq13 were predicted to match proteins with oxidative activity, they may act in some level of regulation during water deficit or ASR stress.

Discussion
The use of deep-sequencing technology was efficient to identify 256 miRNAs of Glycine max. These miRNAs were identified from eight different libraries from precursors with stem-loop secondary structures that also map to the soybean genome (Additional file 1). They were detected from water deficit and rust libraries and were characterized as following: detected for the first time, already detected in some plant species, conserved in soybean, or a variant of a known miRNA (isoform).
From these analyses, we found 24 novel families that had not been detected before, six families that had already been detected in Coniferophytes, Embryophytes and Magnoliophytes (dicotyledons and monocotyledons), and 22 conserved soybean families. In terms of conserved soybean miRNAs, we only detected 20 known miRNAs in our sequencing. This small number of known miRNA genes detected in our libraries could be due to the two filters used in our processing. These filters may have missed some known, conserved soybean miRNAs because they discarded reads with low frequency and those with more than five matches in the genome.
We detected 121 miRNAs with additional nucleotides in the 3' or 5' terminus compared to the recorded mature miRNA. These miRNA variants (isomiRNAs) were very common in our population of small, detected RNAs. Out of the isomiRNAs, we observed 21 pairs of sense and antisense miRNAs. The duplex presents the antisense strand paired to the corresponding miRNA with two nucleotides 3' overhangs (Additional file 1). This shows that the sense and antisense miRNAs originated from DCL1 processing and supports their validation as true miRNAs [26,55,56].
In addition, we validated the conserved miRNAs in our libraries based on homology to known miRNAs in miRBase. The phylogenetic conservation of miRNA sequences is one rule proposed by Ambros et al. [7] to characterize miRNAs. In this study, we established new miRNAs in soybean that were already detected in other plants species. However, as opposed to some studies that only blast the candidate to the known miRNA mature sequence, our identifications were determined by precursor sequence folding and verification of the genuine hairpin structures.
The complexity of the plant response to biotic and abiotic stresses involves many genes and biochemical and molecular mechanisms, and adaptation to these stresses is achieved through regulating gene expression at the transcriptional and post-transcriptional levels. With regard to post-transcriptional regulation, miRNAs are associated with water deficit response in others plants, but this was the first time that differential expression of these small RNAs were observed in soybean during water deficit. In order to validate 11 of the novel miRNAs detected in sequencing by the RT-qPCR method, we constructed primers stem-loop and analyzed their expression during abiotic and biotic stresses ( Figure 1). We observed that several miRNAs were up-regulated during the water deficit in the sensitive genotype ( Figure 1A). However, during the same stress, these miRNAs had a different expression in the tolerant genotype. This distinct miRNAs behavior between the two contrasting genotypes under the same conditions could be involved with the drought-tolerance that is observed in the tolerant genotype. One of these miRNAs with this expression pattern is the new MIR-Seq11. Interestingly, MIR-Seq11 was predicted to target peroxidase protein. As known, stress conditions can produce excess concentrations of reactive oxygen species (ROS), resulting in oxidative damage at the cellular level [57]. The increase of this miRNA in the sensitive genotype, when subjected to water deficit, could be one of the factors associated with vulnerability of these sensitive plants. Whereas in tolerant genotype during the two conditions, the expression levels of MIR-Seq11 are lower than in the sensitive cultivar during stress. This situation could indicate that the unchangeable MIR-Seq11 levels in the tolerant genotype may be related to its drought-tolerance capacity.
Another interesting point is the expression of a novel miRNA MIR-Seq07 that showed increased expression levels during the water deficit stress for both genotypes. This result allows us to associate this miRNA with water deficit stress mechanism independently of the genotype background. Our computational approach showed that one of the loci targeted by MIR-Seq07 corresponds to a fructose-bipfosphato-aldolase enzyme which is a constituent of both the glycolytic/gluconeogenic pathway and the pentose phosphate cycle in plants [58]. Therefore increase and/or activation of aldolase appear to be implicated in the plant growth mainly through promotion of the glycolytic pathway function to synthesize ATP [58]. Since, MIR-Seq07 expression was increased during the stress condition in both genotypes and assuming that it can inhibit or degrade aldolases, it could be associated to metabolism decreasing during water deficit in roots.
Plants possess several adaptive traits to support pathogen attacks. In Glycine max, ASR is responsible for significant losses in soybean growth areas. Nevertheless, no study investigating miRNAs and ASR disease had been preformed to date. To determine if miRNAs act as key factors during rust infection or for resistance maintenance, we performed expression analyses with the same 11 miRNAs during mock and infected conditions in two different genotypes ( Figure 1B). In general the miRNAs under the fungus infection were down-regulated in the susceptible genotype (except MIR482bd-3p). For example, MIR-Seq11, MIR-Seq13 and MIR-Seq15 which had predicted peroxidases, oxidoreductases and translational initiation factor respectively as targets proteins, were down regulated when infected with ASR. The peroxidases enzymes help to metabolize H 2 O 2 in higher plants, and these proteins, as also others proteins with oxidoreductase activity, have already been reported to be upregulated after pathogen infection and especially after ASR [57], indicating a possible involvement of MIR-Seq11 and MIR-Seq13 with the responses to ASR infection. Considering, that a translational initiator factor was predicted to be targeted by MIR-Seq15, we could speculate about the participation of this miRNA in the protein synthesis machinery.
In the resistant plants, most of the miRNAs analyzed by RT-qPCR (except MIR482bd-3p, MIR-Seq07, MIR-Seq15ab) did not vary across the mock and rust infection. Surprisingly, MIR-Seq07 was the unique miRNA that was down-regulated during the fungi infection for both genotypes analyzed in our study. We already mentioned that the MIR-Seq07 had predicted protein target related to metabolism and thus its possible association with water stress. However MIR-Seq07 also had predicted LRRs (leucine-rich repeats)-domain target which are known to be present in disease resistance proteins [59,60]. This suggested a good candidate for the investigation of the miRNAs' regulatory mechanisms during ASR stress. Although we investigated the expression patterns of some miRNAs detected in our sequencing and predicted the target genes that it regulates, additional experimental approaches must be addressed to confirm these hypotheses.

Conclusions
The present study detected a large number of small RNA sequences that were characterized as novel and as already known soybean miRNAs. We grouped some of these unique sequences into 24 novel soybean miRNAs and further classified several of new members in known families or as new loci in the soybean genome. Validation of new miRNA with quantitative RT-qPCR revealed that Solexa sequencing is a powerful tool for miRNA discovery. Many miRNA expression patterns were upor down-regulated by water deficit and rust stresses, which is an important discovery. Future investigations should use supplementary experimental approaches to verify the targets and to understand the complex gene regulatory network of these miRNAs. This work will contribute to improve systems to support soybean crop production and to mitigating crop losses during biotic or abiotic stresses.