Identification of microRNAs in species with or without fully sequenced genomes has been revolutionized by high-throughput sequencing methods. High-throughput sequencing miRNA analysis in Phaseolus vulgaris will facilitate more particular and specific bean miRNA studies as well as legume sncRNAs research in this rapidly growing field. Although deep-sequencing experiments have become the major source supporting microRNA annotations, technical variations, transcript length bias and mapping bias have been reported with this approach for transcriptome analysis of coding RNAs (mRNAs) [17, 55, 56]. Studies to elucidate the number of miRNA molecules sequenced from a small RNA library are still needed for more accurate small RNA profiling studies. In terms of reads, the small RNA libraries sequenced here yielded a larger number of total raw reads to work with than did many other studies in which plant miRNAs have been identified; however, the total number of reads of identified miRNAs, including non-abundant and miRNA isoforms, constitutes only 0.047% of total raw reads. Further studies are needed to understand to what extent the 24-nt class representing siRNA populations, usually the most abundant and diverse class of sncRNAs sequenced in small RNA libraries, masks miRNA populations. Also, coverage analyses with fully sequenced genomes are needed to elucidate sequenced sample proportions of small non-coding RNAs such as tRNA, rRNA, snoRNA or snRNA.
Most miRNA families identified in this study in P. vulgaris are evolutionarily conserved in Fabaceae, particularly in the best-studied plants M. truncatula and G. max. MiRNA family miR157 has not been reported in these two legumes, probably because of the similarity of its sequence to that of miR156. Other miRNA families not reported in miRBase in G. max and M. truncatula are miR397 and miR408. The only Fabaceae where miR408 has been previously found was Arachis hypogaea. One miRNA of the bean miR397 family with the same sequence as ath-miR397 was detected herein with 7416 reads in total, so it is likely to be present in other legumes as well. MicroRNA miR2199 has been reported in miRBase only for M. truncatula and the variant found here for this miRNA in all four libraries of P. vulgaris was previously reported as miRS1 by Arenas-Huertero et al. (2009) because mtr-miR2199 was not yet reported. As well as in Lotus japonicus, the A17- > G (miRS1) variant was found in peanuts (Arachis hypogaea L.) [35, 57].
It is interesting that miRNA families such as miR1511, miR1514 and miR1515 first detected in soybean, have not been found in other legumes besides P. vulgaris. In this regard, other miRNA families reported specifically in soybean (gma-miR1524, gma-miR1532, gma-miR1526, gma-miR1516 and gma-miR1513 and gma-miR1508) were detected by Valdés-López et al. (2010) using a hybridization approach (macroarrays) under several abiotic stress conditions in leaves, roots and nodules of P. vulgaris. These miRNAs from soybean were not detected by high-throughput sequencing, with the exception of two reads for gma-miR1508. In the Valdés-López et al. (2010) analysis, expression was detected for miRNAs miR1524, miR1526, miR1532 and miR1508 in common bean plants grown under sufficient nutrient conditions or stressed conditions. Family miRNAs gma-miR1513 and gma-miR1516 were detected only under stressed conditions. In addition, expression was detected in the macroarrays for an oligonucleotide probe designated as pvu-miR1509. This probe was based on a variant detected by Arenas-Huertero et al. (2009) that was not detected here because it has more than two mismatches with the reference miRNA gma-miR1509. The miR170 family, for which Valdés-López et al. (2010) detected expression in P. vulgaris, was neither found in this study. MiR170 is very similar to miR171, so it is possible that the macroarray probes hybridized with the same miRNA family.
MiRNA variants are considered to be a consequence of inaccuracies in Dicer pre-miRNA processing. Smaller variants with missing bases and low frequencies are viewed as degradation products or pyrophosphate sequencing errors. Herein, small RNA sequences from libraries were classified as miRNA isoforms only if they were similar to a reference miRNA reported in miRBase and had a significantly greater number of reads compared to those found for the reference miRNA in all four organs. From this isoform identification analysis, four more miRNA families (miR1510, miR2199, miR4376 and miR479) were added to the total number of conserved miRNA families identified in common bean. All of the variants found here for these four families were highly abundant in all four libraries. It is probable that miR1510 and miR2199 miRNA families are part of the P. vulgaris miRNA population, based on the two variants identified for each of these two families, along with their conservation in other species (see results) and previously experimental expression analyses [13, 22].
Another family highly likely to be present in common bean is the miR4376 family. The variant found here for miR4376 was very abundant in common bean and has only one nucleotide missing relative to its reference miRNA found in soybean. On the other hand, further experimental and genomic sequences analyses are still needed to validate the variant identified in common bean for the miR479 family, which has two nucleotides missing and one mismatch compared with its reference miRNA csi-miRNA479. That is also the case for other less conserved miRNA families identified based on the isoform analysis, like miR858, miR2597, miR894 and miR1310. Most of the reference miRNAs for these less conserved miRNA family variants were poorly or not detected in the libraries.
In plant and animal microRNAs, 3' heterogeneity is the most frequent length variation. Most of the variants identified from the length variant group exhibit 3' heterogeneity. Sequence length heterogeneity for plant microRNAs has been proven to be essential for correct plant development and environmental responses. It is known that miR168 in Arabidopsis thaliana is produced in length variants of 21 and 22 nucleotides . A decrease in abundance of the 21 nt variant reduces miR168 homeostasis and leads to developmental defects. The large number of reads detected for some variants in P. vulgaris suggests significant regulatory roles like detected for miR168 in A. thaliana. Using the pre-miRNA of pvu-miR482 (Figure 4) as an example, three variants generated for "mature-star" miRNAs (5p) and two variants generated for the mature miRNAs (3p) were observed. Considering the number of reads found for these variants, mature-star sequence variants were by far greater in abundance than their corresponding mature sequence variants. In particularl two mature-star sequence length variants, one previously reported in miRBase as pvu-miR482* of 22 nucleotides and the other here denoted as pvu-isomiR482* of 21 nucleotides, had 1641 and 1207244 total reads respectively. It is important to take into account that sequences for reference miRNA gso-miR482a and its highly abundant variant pvu-isomiR482a were detected, and that these variants necessarily have to be excised from another stem-loop precursor not yet identified. For this reason, it is possible that the variants pvu-miR482* and pvu-isomiR482* were generated from different loci, as actually happens for the two miR168 variants in A. thaliana. In addition to miRNAs previously reported in other plants and those present in closely related species such as soybean or Medicago, common bean may encode species-specific miRNAs. To address this question, sequencing reads were explored using the miRDeep algorithm . Those candidates were favoured where in addition to a mature miRNA present in the stem region of the hairpin precursor, there was also evidence of a miRNA* sequence recovered. Next, candidate miRNAs having a plausible stem-loop precursor but without any miRNA* sequences in the libraries were obtained. It will be interesting to see to what extent these candidates can be confirmed as genuine miRNAs and whether they are involved in biological processes specific to P. vulgaris. As in other examples shown here, the availability of a fully-sequenced genome will provide a more complete picture of novel miRNAs.
Currently, microarray hybridization approaches, real time quantitative PCR (RT-qPCR) analyses and high-throughput sequencing technologies are widely used microRNA profiling methods. MiRNA regulatory functions in different organs, different stress conditions, and different developmental stages are still unknown. MiRNA profiling studies are important first approximations to analyze miRNA functions according to their different expression patterns. In this work miRNA expression was analyzed in order to find important differences in miRNA family expression levels within common bean organs. Biological replicates are essential to determine if differences observed are caused by conditions and not just by experimental variations. Because of this, cluster analysis was employed. The DESeq package allows users to work without replicates with the caveat that the test will lose strength. It assumes that most transcripts will have approximately the same levels within replicates under the different conditions, and that the estimated variance should not change too much.
Experiments designed to explore miRNA and mRNA expression are subjected to many different technical and biological biases. Nevertheless, differential processing of stem-loop precursors, small RNA duplexes, and, in general, miRNA biochemical properties challenge expression analysis methods usually used to analyze mRNAs. The majority of conserved miRNA families identified were expressed in all organs at different levels. Conserved miRNA families seem to be transcribed in almost all plant organs. Which of those conserved miRNAs have an essential role in legumes for certain developmental stages or stress responses is not completely understood. Perhaps less conserved miRNAs families that are highly abundant in legumes such as miR1510, miR1511, miR1514, miR1515, miR2118 and miR2199, play essential roles in characteristic processes of legumes such as nodulation, as altered expression of miR482 and miR1515 has been proven to increase soybean nodulation .
The present study contributes, together with previous common bean miRNA studies, to characterizing the P. vulgaris miRNA population. It represents a unique analysis of P. vulgaris miRNAs performed with high-throughput next-generation DNA sequencing (NGS) technologies. Shortly, full genome sequence and transcriptome datasets from different P. vulgaris cultivars will be available. The analysis of novel common bean genome sequence information will benefit from the tools presented here to expand important small RNA research needed in this critical worldwide crop.