Analysis of the melon (Cucumis melo) small RNAome by high-throughput pyrosequencing

Background Melon (Cucumis melo L.) is a commercially important fruit crop that is cultivated worldwide. The melon research community has recently benefited from the determination of a complete draft genome sequence and the development of associated genomic tools, which have allowed us to focus on small RNAs (sRNAs). These are short, non-coding RNAs 21-24 nucleotides in length with diverse physiological roles. In plants, they regulate gene expression and heterochromatin assembly, and control protection against virus infection. Much remains to be learned about the role of sRNAs in melon. Results We constructed 10 sRNA libraries from two stages of developing ovaries, fruits and photosynthetic cotyledons infected with viruses, and carried out high-throughput pyrosequencing. We catalogued and analysed the melon sRNAs, resulting in the identification of 26 known miRNA families (many conserved with other species), the prediction of 84 melon-specific miRNA candidates, the identification of trans-acting siRNAs, and the identification of chloroplast, mitochondrion and transposon-derived sRNAs. In silico analysis revealed more than 400 potential targets for the conserved and novel miRNAs. Conclusion We have discovered and analysed a large number of conserved and melon-specific sRNAs, including miRNAs and their potential target genes. This provides insight into the composition and function of the melon small RNAome, and paves the way towards an understanding of sRNA-mediated processes that regulate melon fruit development and melon-virus interactions.


Background
Melon (Cucumis melo L., family Cucurbitaceae) is an important horticultural species cultivated in temperate, subtropical and tropical regions worldwide, with Spain being the largest producer in Europe and fifth in the world [1]. The melon genome has 12 chromosomes and is thought to contain 450-500 Mb of DNA, which is 3-4 times more than Arabidopsis [2]. Melon is a useful model for the analysis of fruit traits because of the vast morphological, physiological and biochemical diversity within the species, which can be exploited to dissect the biological processes controlling color, flavor and texture and how these properties arise during fruit development [3,4].
Despite the importance of melon, not much was available in the way of genomic sequence information prior to the establishment of a functional genomics consortium in 2004, which developed a range of tools and accumulated more than 33,000 expressed sequence tags (ESTs) and~17,000 tentative consensus sequences (unigenes) [5]. This EST collection has been expanded recently with the addition of 94,000 new ESTs from full-length enriched cDNA and standard cDNA libraries from various melon tissues and cultivars in the framework of the International Cucurbit Genome Initiative [6]. These ESTs as well as other resources are now accessible in a public database [7]. The unigene sequences have also been used to construct an oligonucleotide microarray, which has been applied in the analysis of fruit quality traits, ovary development and pathogen resistance [8]. In addition, a melon sequencing consortium has recently produced a high-quality draft of the melon genome (unpublished data). Although these resources provided significant advances in the analysis of melon gene expression, the small RNA (sRNAs) component of the melon transcriptome has not been studied in detail. These important molecules have been studied in other crop species and have been shown to fulfill a number of critical regulatory roles [9][10][11][12].
sRNAs are short, non-coding RNAs 21-24 nucleotides (nt) in length which are found in protists, fungi, plants and animals [13]. In plants, their roles include maintenance of genome stability, initiation of heterochromatin assembly, post-transcriptional regulation of gene expression and protection against viruses using an RNA-based immune system. The most abundant and best-characterised sRNAs include microRNAs (miRNAs) and small interfering RNAs (siRNAs). miRNAs are widely studied because of their regulatory activity, particularly in development, pathogen resistance and stress responses [13]. miRNAs are cleaved from stem-loop precursor molecules that derive from single stranded non-coding transcripts. miRNAs regulate protein-coding genes posttranscriptionally by mediating RNA cleavage or translational repression. Unlike miRNAs, siRNAs are generated from double-stranded RNA precursors and function on cognate RNA or DNA molecules by instigating degradation or promoting RNA-directed DNA methylation, respectively. cis-acting siRNAs (ca-siRNAs) arise from and target endogenous loci such as transposons and DNA repeats to direct cytosine methylation and chromatin modifications [14]. Natural antisense-transcript siRNAs (nat-siRNAs), which derive from pairs of natural-antisense transcripts, guide the cleavage of one of the two parent transcripts, leading to the production of a series of secondary 21-nt siRNAs of unclear function [15,16]. Finally, trans-acting siRNAs (ta-siRNAs) derived from TAS genes, which transcribe long primary noncoding RNAs as precursors for ta-siRNA biogenesis. TAS primary RNAs are cleaved by specific miRNAs and are sequentially processed into 21-nt ta-siRNAs starting from the miRNA-cleaved end, to generate clusters of phased siRNAs [17,18]. In addition to endogenous sRNAs, exogenous siRNAs from virus genomes can be detected in virus-infected plants as a part of the RNAbased immune system [19].
RNA viruses that infect melon are responsible for significant yield losses as well as poor fruit quality [20,21], particularly the widespread Watermelon mosaic virus (WMV, genus Potyvirus, family Potyviridae) [22,23]. Recently, a collection of accessions representing cultivated melon and its wild relatives was screened to identify sources of resistance to mosaic-inducing viruses [24]. TGR-1551 was identified as a resistant accession based on the lower WMV titer compared to susceptible genotypes (e.g. melon cv. Tendral) and the absence or mildness of the mosaic symptoms normally observed in systemically infected leaves [25]. Melon necrotic spot virus (MNSV, genus Carmovirus, family Tombusviridae), although less economically important, may also cause yield losses, and epidemic outbreaks have been reported worldwide [26,23]. In melon, resistance to MNSV is controlled by the single recessive gene nsv, which encodes eukaryotic translation initiation factor 4E (Cm-eIF4E) [27]. This resistance is effective against all MNSV strains (e.g. MNSV-Malfa5) except MNSV-264 [28]. Studies of chimeric viruses have shown that the MNSV 3' untranslated region (3'-UTR) contains the resistance-breaking determinant of MNSV-264, and that it functions as a cap-independent translational enhancer [29,30].
We constructed 10 sRNA libraries from a range of healthy and virus-infected melon tissues, and we sequenced a set of endogenous and exogenous sRNAs using the pyrosequencing-based 454 technology from Roche [31]. To gain insights into the role of sRNAs on key aspects of fruit development, maturation and pathogen defense, samples from two stages of the developing ovary, fruits 15 and 45 days after pollination, and photosynthetic cotyledons from resistant and susceptible melon accessions infected with WMV and MNSV were analysed. In a previous study, we reported the profile of virus-derived sRNAs (viRNAs) from cotyledon samples [32]. Here we report a catalog of endogenous melon sRNAs, including miRNAs from known families and new candidate miRNAs potentially unique to melon, focusing on the number of sequence reads as a reflection of their expression profiles. Potential targets for these miRNAs in the melon transcriptome were identified.

cDNA libraries and sequencing of small RNAs
We used high throughput sequencing data to analyze the composition of the small RNA transcriptome (sRNAome) of melon and compare the results to data in publicly-available RNA and genomic databases. Ten sRNA libraries were constructed from total RNA extracted from fruits, ovaries and healthy and virusinfected melon cotyledons (Table 1). PCR amplification products corresponding to each library were pooled in equal amounts and sequences were obtained by multiplexed high-throughput pyrosequencing (Roche 454). This produced 447,180 raw sequences, each~100 bases in length, 432,743 of which had a complete 3' adaptor in the correct position. Based on these data, we estimated a sequencing error rate of 3.7%. After removing reads where one or the two adaptors could not be identified, 398,450 useful sequences with 3' and 5' adaptors were selected. Only 44 sequences comprising ligated adaptors without an insert were identified. Although we pooled similar amounts of PCR products from each library, different numbers of sequences were obtained according to the 5' adaptor sequence barcode (Table 1). For instance, the fruit and ovary libraries (15d, 45d, c1 and c5) were poorly represented providing a collection of fewer than one third of the number of sequences from the other six libraries. A set of 186,698 non-redundant sRNA sequences was generated for downstream analysis. The representation of sequences with different lengths in the redundant and non-redundant sRNAs datasets is shown in Figure 1. The most abundant sequences were 21, 24, 20 and 22 nts. A few sequences shorter than 20 nt were also retrieved, and these probably represent cloning artifacts and/or degradation products. Sequences > 30 nt in length in our dataset predominantly represented combinations of other melon sRNAs identified in our work. Detailed data are provided in Additional file 1.

Identification of known miRNAs
In order to identify known miRNAs, the melon sRNA data set was used as a BLAST query against the Arabidopsis small RNA database (ASRP) [33] and the microRNA database (miRbase) [34]. We identified 46 melon unique sequences corresponding to 26 miRNA families. Thirty nine sequences were identical to known miRNAs from other plant species, while 7 additional species were sequence variants highly conserved (up to two mismatches allowed). In order to clearly identify each melon sequence, melon miRNAs were named according to the homologous reference miRNA from each database ( Table 2). For each reference miRNA, we found that~3% of the corresponding melon sequences differed at one or two sites with mismatches distributed randomly along the sequence, so these were considered sequencing errors. Only specific sequence variants that represented more than 3% of the total population for each reference miRNA were considered biologically relevant. We identified only two non-conserved miRNAs, corresponding to ath-miR2111a from Arabidopsis and peu-miR2910 from Populus euphratica, respectively ( Table 2). The largest diversity of miRNA species was found in ovary samples and the lowest in fruit samples.   The abundance distribution of different miRNAs in each library was estimated based on sequencing frequencies as shown in Figure 2. We used sequencing data for quantitative profiling of small RNAs, though estimation of abundance based on sequencing frequencies could be misleading due to limited sequencing depth. Many miRNAs differed in abundance according to the source library. Nevertheless, most of the redundancy reflected the accumulation of miR159a, which accounted for more than 14,000 sequences in total.  genotypes Tendral and TGR-1551. Melon miRNA species with similarity to Arabidopsis miR156abcdef, miR160abc and miR168ab, which target mRNAs encoding squamosa promoter binding proteins, auxin response factors (ARFs) and argonaute-like proteins (AGO), respectively, showed different trends in the genotypes tested. For example, miR168ab is more abundant in healthy Tendral tissues compared to infected tissues whereas it is more abundant in WMV-infected TGR-1551 tissues than in healthy tissues. Other known miR-NAs in our sequenced set were generally more abundant in healthy tissues irrespective of the melon variety tested. For example, miRNAs with similarity to Arabidopsis miR159a and miR167d, which target MYB transcription factors and ARFs, respectively, followed this trend in both genotypes albeit with differences in magnitude. Comparison of the two libraries from ovary and fruit samples ( Figure 2C, D) revealed that miRNAs were particularly abundant and diverse in ovaries compared to fruits. Several miRNAs appeared to be temporally regulated during ovary development (e.g. members of the miR160, miR164, miR167, miR169, miR319 and miR390 families) whereas others were equally abundant at both ovary stages (miR156, miR167 and miR171 families). Fruits contained far fewer miRNAs than ovaries, and only miRNAs similar in sequence to Arabidopsis miR159a, miR164ab and miR397a showed significant differences in accumulation (with trends opposite to those seen in ovaries). These findings indicated that miRNAs in melon were expressed in specific tissues and in response to particular physiological conditions. In Arabidopsis, most of these miRNAs target mRNAs encoding transcription factors with roles in development, such as hormone signal transduction and organ identity. Figure 2E, F compares the accumulation of miRNAs in healthy and MNSV-infected tissues. Similar accumulation profiles were observed in both samples for most of the miRNAs identified. Exceptionally, miRNAs similar to Arabidopsis miR396a, miR396b and miR162a, which regulate transcripts encoding GRF transcription factors and DCL proteins, respectively, showed opposite accumulation patterns.
Identification of miRNA/miRNA* duplexes DCL-mediated cleavage of miRNA precursors having the characteristic stem-loop structure gives rise to miRNA duplexes where one of the two strands is the guide miRNA (the functional molecule) while the nearperfect complement sequence is known as the passenger miRNA, or miRNA*. The miRNA* is rapidly degraded but transient species can be cloned and therefore sequenced. We identified 16 miRNA* sequences complementary to some of the 46 miRNAs in our dataset ( Table 2), nine of which had the predicted sequence based on the fold-back structure of their presumptive precursors with internal mismatches and two additional terminal nucleotides forming a 3' tail ( Figure 3A), whereas the other six had a different number of protruding nucleotides and were considered non-typical ( Figure 3B).
The number of sequenced miRNA*s was generally much lower than the number of mature sequences but there were some remarkable exceptions. For example, for miR396a we counted 134 miRNA and 84 miRNA* sequences, as opposed to miR159a for which 14,651 miRNA sequences but no corresponding miRNA* sequences were retrieved in the sequenced collections ( Table 2). The most extreme example was miR157d, for which we recovered the same numbers of miRNA and miRNA* sequences.

Identification of putative melon-specific miRNAs
After identification of known miRNA sequences and other sRNA sequences (see below), 108,454 unique melon sRNAs remained unclassified, from which the most abundant (28.6%) were 24-nt species. Initial analysis confirmed that 36,783 (33.9%) of these sequences had a perfect match in the melon genome. The frequency distribution was highly skewed: 33,621 sequences had fewer than 25 hits (24,488 originated from a single locus), and only 659 sequences had more than 100 hits.
Sequences that were 21, 22 or 24 nt in length with a maximum of six hits in the genome were selected as potential novel miRNAs, and flanking genomic regions were analysed according to three consecutive criteria. First, we used miRanda software to detect sequences complementary to the potential miRNA inside the flanking regions. Second, potential miRNAs with precursors less than 70 nt in length were discarded. Finally, the MFEI index, which is used to distinguish miRNA precursors from other coding and non-coding RNAs and is based on free energy estimates and nucleotide composition [35], was calculated for each precursor and the results were sorted accordingly (the more negative the index, the better the precursor).
Predicted miRNA precursors and their genomic flanking regions that were found to be similar in sequence to previously described transposons were discarded. Other predicted miRNA precursors with intramolecular folding potential showed no similarity to known transposon sequences although their secondary structures were similar to those of known foldback transposons; these were characterised by strong negative MFEI indexes and high miRanda scores, both features consequence of high sequence complementarity in the pairing stem sequence. For some of these precursors, several uncharacterised melon sRNAs mapped on them in both the sense and antisense orientations (e.g. a11_62726 in Figure 4), up to 85 in some cases. Therefore, these were also considered unsuitable miRNA candidates. Three other potential miR-NAs were shown to be the miRNA* sequences of known miRNAs that had not been picked up in our initial screen.
After manually inspecting the remaining secondary structures, 77 loci that fulfilled the structural criteria for annotation of plant miRNAs [36,37] were selected as plausible miRNA precursors; we also added to this list 7 other loci that had an asymmetric bulge involving 3 bases inside the putative miRNA duplex. From them, 43, 20 and 21 corresponded to sequenced sRNAs of 21, 22 and 24 nt in length, respectively (Table 3). Six selected sequences are shown as examples in Figure 4. By checking the pairing sequence on the stem of the predicted precursors, miRNA*s for seven candidate miR-NAs were found in the sequenced set. Therefore these miRNAs were regarded as authentic miRNAs that conformed to the biogenesis and expression criteria for confident miRNA annotation [37]. The remaining sequences, not supported by the complementary passenger strands, were classified as candidate miRNAs. Most of the potential novel miRNA were represented by a small number of sequences, a single sequence in more than half of the cases, but six exceptional candidates were represented by more than 10 sequences (Table 3). As occurred for conserved miRNAs, sequence variants were identified for some novel miRNAs (Table 3) which mapped on the genomic melon sequence with slight variations in length and position relative to the most abundant sequence. In the absence of a reference sequence from any database, these variants were counted. Sequencing errors of differential cleavage of potential miRNA precursors possibly explain these length and positional polymorphisms.  Table 3) were selected based on quality criteria and are shown as examples. Red lines represent regions where miRNA/miRNA* duplexes are located. Identifier a11_72726 represents an example of unsuitable miRNA candidate. Prediction of potential miRNA targets in the melon transcriptome To identify potential targets of miRNAs, we screened melon unigenes in the publicly-available database [7]. Two independent searches were performed using miRanda [38] and TargetFinder [39], and the results were compared. Each program scores potential targets based on sequence complementarity, with high scores better in miRanda, and low scores better in TargetFinder. Both algorithms identified a common set of presumptive targets albeit with different scores, and the few discrepancies involved targets with low confidence scores.
Targets in Arabidopsis defined by miRanda generally have a score ≥170, and using this value as cutoff we found 150 melon unigenes as potential miRNA targets, the best of which are summarized in Table 4 (a complete list is provided in Additional file 2). The potential miRNA targets in Table 4 generally had similar annotations to their Arabidopsis counterparts, although there are some exceptions. For example, melon unigene cHS_39-F10-M13R_c is a predicted target of miR159a but it is annotated as positive regulator of brassinosteroid signaling rather than a MYB or TCP transcription factor, which is sensitive to miR159 regulation in a Precursor secondary structures of sRNAs in bold are represented in Figure 4. b miRanda score calculated for the identification of the complementary region to the putative miRNA. c MFEI index calculated as described [35]. d Include cases where sequence variants with up to two mismatches were identified and counted e Indicates miRNAs for which a miRNA* sequence was identified. f Indicates miRNAs for which an asymmetric bulge of more than 2 bases was identified in the miRNA/miRNA* duplex of the precursor, not meeting exactly the structural criteria previously set forth [36,37]. Arabidopsis. Many of the melon unigenes identified as potential targets were not annotated, and some had previously been identified as potential miRNA precursors [5]. Interestingly, the highest miRanda scores (> 300) were achieved for transcripts with two separate miRNA targets on the same molecule. For example, unigene c15d_05-D02-M13R_c had two target sites for miR390ab separated by~200 nt. When this region was used as a BLAST query against the melon sRNA dataset, a group of 257 sequences (more than 92% of them being 21 nt long) was identified with nearby clusters of related 21-nt sequences in both the sense and antisense orientations, which is reminiscent of the ta-siRNAs biogenesis mechanism [18] (Figure 5). Both sites (complementary to miR390 family members in unigene c15d_05-D02-M13R_c) had similar miRanda scores, they did not contain mismatches or G:U wobbles involving nucleotides 9-11 and were phased 21 nt one of each other. The number of sRNA copies was different in each cluster and were more abundant in sense orientation compared to antisense orientation. Two registers of phased 21-nt siRNAs were observed. One of them was phased with the miR390 complementary sites but the other one was not. A representative sequence from each cluster was selected and used to search for potential targets in melon unigenes, identifying > 100 transcripts with a miRanda score > 170. Several of these transcripts were annotated as ARFs and ubiquitin related gene products ( Table 5).
The remaining unigenes with two predicted miRNA target sites listed in Table 4 were annotated as protein-coding transcripts and no sRNAs were identified with similarity to the region flanked by the two target sites ( Figure 5), suggesting that they did not account for authentic ta-siRNA-producing loci. Targets were also sought in the reverse-complement sequences of melon unigenes, because a small proportion of the ESTs could be incorrectly oriented as an artifact of the cloning procedure [5]. Twenty-eight unigenes were identified as potential miRNA targets using the same criteria described above, most of which were found to be nonannotated (Table 6). In this new set of data, unigenes with two targets were used again as a BLAST query against the melon sRNA dataset but no hits were obtained, so these unigenes were no longer considered as potential ta-siRNAs.
With some exceptions, several miRNA targets with miRanda scores ≥170 (see Additional File 3) were identified for each of the potential novel melon-specific miR-NAs listed in the previous section.

Characterization of other melon sRNAs
Next, we blasted our sRNA sequences against RNA and genomic databases to search for other sRNA species by sequence similarity (Figure 6). sRNAs similar to transfer RNA (tRNA), trans-acting siRNA, small nucleolar RNA (snoRNA) and transposons were the least abundant, whereas ribosomal RNAs (rRNA) were largely the most abundant non-coding sRNA species ( Figure 6A). Intriguingly, exogenous virus-derived sRNAs were as abundant as other endogenous plant sRNAs, at least in the case of MNSV. Most of the sRNAs identified had complete sequence similarity with the reference RNA from each    Figure 6A), even if up to two mismatches were allowed in BLAST comparisons. The only exception were sequences similar to ta-siRNAs, for which 14 melon sRNAs with similarity to Arabidopsis TAS3a|D7 (+) and TAS3a|D8(+) were identified, 2 containing 1 mismatch, and 12 containing 2 mismatches. All of them mapped very close in the melon genome and in a different region than c15d_05-D02-M13R_c unigene (the other potential source of ta-siRNAs, see above). To determine if they were authentic melon ta-siRNAs, we selected a 600 bp window sequence upstream and downstream from the genomic location determined in the melon genome for each candidate; then, a BLAST query against the melon sRNA dataset was performed, revealing that at least 126 sequences (95 of them being 21-nt in length) mapped in this region and were arranged according to a near 21-nt phase spacing (data not shown). Many sRNA sequences also generated hits in the plastid genomes (30,239 sRNAs corresponding to 4,254 unique plastid sequences). When these sRNAs were mapped onto the melon chloroplast genome (unpublished data) ( Figure 6B), two clusters of sequences resolved in regions presumably annotated as chloroplast rRNA. These regions lie within two inverted genomic repeats, and sRNAs were accordingly Table 6 Best quality miRNA targets identified in reverse-complement sequences of melon unigenes Some of the chloroplast sRNAs had previously been cloned in other species [40]. For example, melon sRNA a33_398374 (sequence AGT TAC TAA TTC ATG ATC TGG C) was the most abundant melon plastid sRNA (18,054 counts), and a matching sequence is present in more than 900 chloroplast genomes. It is located in an intergenic region and may target a methyltrasferase transcript, although there is no direct evidence that it has silencing functions. Melon sRNA a14_392967_ (sequence GGT AGT TCG ATC GTG GAA TTT) was less abundant (166 counts), it is present in 10 different chloroplast genomes, and it may target a transcript encoding an electron carrier protein. Interestingly, different numbers of plastid sequences were obtained from each library ( Figure 6C). For example, in the virus-resistant melon accession TGR-1551 there was no difference in the number of sRNAs with hits to melon chloroplast genome between healthy and virus inoculated samples, but in the virussusceptible accession Tendral, more sRNAs were counted in inoculated samples ( Figure 6C). Unlike chloroplast sRNAs, only 7,854 sRNAs (corresponding to 2,384 unique sequences) matched the melon mitochondrial genome (unpublished data). These sRNAs were mapped on the mitochondrial genome sequence and formed three clusters, again corresponding to the sites of rRNA genes (data not shown).

Discussion
In this report, we describe the first screen for melon sRNAs by deep sequencing. In total, 398,450 high-quality sequences were generated, representing 90% of the total raw reads. RNA species 21, 24, 20 and 22 nt in length dominated the sRNA transcriptome in melon with the 21-nt class being the most abundant in our libraries. Molecules of 24-nt processed by DCL3 are often the most abundant endogenous plant sRNAs [13], but this may vary according to species. For example, 24nt sRNAs are more abundant in Arabidopsis, rice and tomato [41,42,9], whereas 21-nt sRNAs are more abundant in grapevine, wheat and conifers [12,43,44]. It is also possible that the composition of the sRNA population of a given plant species varies according to tissue and physiological conditions, as seems to be the case of melon (see Additional file I). Perhaps the higher proportion of 24-nt sRNAs found in melon ovaries compared to the other tissues reflects the predominance of developmental processes based on epigenetic events in the ovary.
Recent studies have shown that cis-acting siRNAs arising from heterochromatin, transposons and other repeat elements account for the greatest proportion of endogenous sRNA populations in plants [13,[45][46][47][48]. In melon, only~7,000 sRNAs matched known transposon sequences, in contrast to~60,000 sRNA sequences matching ribosomal RNA, which may simply reflect the paucity of melon transposon sequence information in databases, as only 1.5% of the melon genome has been annotated for transposable elements [49]. Transposon sequences in different species show more divergence than rRNA sequences, so the representation of transposon-related sRNAs could increase when a more accurate and complete annotation of the melon genome becomes available. We also identified two sets of ta-siRNAs in our data, which mapped to different loci in the melon genome thus revealing the presence of at least two potential TAS genes. One locus was not represented in the melon unigene database, most likely because of its incomplete coverage. The sequence of the other locus was similar to that of a non-annotated melon transcript, and contained two registers of sRNAs in a 21-nt phase bounded by two target sites to miR390ab, reminiscent to TAS3 genes. Non-coding transcripts containing two miR390 complementary sites that give rise to phased siRNAs have been described in other organisms. In the moss Physcomitrella patens, both 3' and 5' target sites are cleaved. In Arabidopsis, the 5' miR390 complementary site contains a mismatch and two G:U wobbles involving positions 9-11 and, despite it is not cleaved, it binds the silencing complex and is required for full AtTAS3 function in vivo [50]. In melon, both 3' and 5' miR390 had perfect complementarity at positions 9-11, suggesting that both could be cleaved, as opposed to Arabidopsis, to specify a phased register for ta-siRNA biogenesis. Interestingly, an additional siRNA register that is likely independent of miR390-directed cleavage of the putative melon TAS transcript was observed. This alternative register might be determined by the processing activity of TAS transcripts by one of the most abundant melon primary ta-siRNAs during generation of secondary ta-siRNAs ( Figure 5), as proposed for alternatively phased TAS3 ta-siRNAs in Arabidopsis [18,50]. Since there are additional TAS loci in other plant genomes, it is reasonable that other melon TAS loci remain to be discovered.
More than 30,000 of our sRNA sequences matched the plastid genome, suggesting intense sRNA activity in this organelle. Mitochondrion-specific sRNAs were less abundant in comparison. The abundance of plastid sRNAs varied by source, with fewer sequences obtained from the ovary and fruit libraries compared to the cotyledon libraries, perhaps reflecting a relationship between chloroplast sRNA activity and photosynthesis. Interestingly, there was no significant difference in sRNA accumulation when comparing infected and healthy TGR-1551 cotyledons (resistant to WMV) whereas more sRNA accumulated in healthy Tendral (susceptible to WMV and MNSV) cotyledons than in infected ones. Whether or not this is related with the resistance phenotype is a matter of speculation.
More than 28,000 melon sRNAs in our sequenced collections matched known miRNAs in other plants, and 46 distinct melon sRNA species could be assigned to 26 known miRNA families. Although we generated a relatively low number of sequence reads, our data nevertheless were in good harmony with previous studies of miRNA profiling based on exhaustive sequencing of sRNA populations (e.g. in grapevine, 24 million reads, 26 known miRNA families and 26 non-conserved miRNA families; in tomato, 721,874 reads, 30 known miRNA families; and in orange, 13,106,573 reads, 42 highly-conserved miRNA families) [12,11,9]. This probably reflects the generally-accepted high level of expression reported for conserved miRNAs.
In addition to known miRNAs, 84 sRNA sequences derived from genomic loci with intramolecular folding capacities and not previously described as miRNAs in other plant species were predicted as potential melonspecific miRNAs. In most cases, only one sequence was counted from each of these miRNAs, which is consistent with reports suggesting that species-specific miRNAs are usually expressed at low level and in a tissue-specific manner [41]. The candidates listed in Table 2 include a number of special cases, i.e. miRNAs with miRanda scores ≥195 and very strong secondary structures including an internal loop, resembling type III foldback transposons [51,52,9]. Although these sequences do not match known melon transposons, they were not considered as miRNA candidates because accurate homologybased transposon annotation and prediction occasionally needs to be complemented with ab initio approaches based on structural features [53,49]. However, even not considering this particular group, our data indicate that most of the precursors we identified are candidates to encode melon-specific miRNAs.
The accumulation of miRNAs was estimated by census sequencing and this showed that there is more miRNA diversity and that miRNAs are more abundant in ovaries than fruits. Although miRNAs are involved in many processes, 60-70% of known plant miRNAs control the expression of transcription factors that regulate critical developmental processes, such as proper specification of floral organ identity or leaf polarity, and overexpression or knockout of MIRNA genes led to severe developmental defects [48,54,13]. It is likely that the greater abundance of miRNAs in the early ovary stages compared to fruit reflects the more significant developmental activity in ovaries, and confirms that meristems and other developmentally active tissues are good resources for miRNA screening.
The comparison of healthy and virus-infected melon tissues showed that generally miRNAs were less abundant in infected tissues. Viruses interfere with and exploit endogenous RNA-silencing pathways using diverse strategies [55,19]. For example, the potyvirus silencing suppressor HC-Pro has been shown to suppress the miRNA pathway by inhibiting miRNA assembly into AGO1-containing silencing complexes and unwinding of miRNA/miRNA* duplexes, causing accumulation of stable duplexes [56]. Several studies have shown that virus infection can regulate the accumulation of mRNAs targeted by miRNAs without affecting the abundance of the miRNAs themselves, or even by promoting a slight accumulation [57][58][59]. In contrast, we found that miRNA accumulation was generally depressed in infected plants compared to controls subjected to mock inoculations. A notable exception was miR168ab, which was upregulated in the resistant genotype but downregulated in the susceptible one. This miRNA has previously been shown to be involved in controlling the expression of ARGONAUTE1 (AGO1), the catalytic subunit of the RNA-induced silencing complex responsible for slicing of target mRNAs [60]. Recent work has described the enhanced expression of miR168 and AGO1 mRNA in virus-infected plants specifically and independently of other miRNAs [61][62][63]. The contrasting miRNA profiles observed in the TGR-1551 and Tendral varieties suggests that silencing may underly the resistance of TGR-1551 to WMV, although this is a hypothesis that will require further research.
We have identified more than 150 melon unigenes as potential targets for the known and novel miRNA sequences discovered in this investigation. Many animal transcripts are targets for more than one miRNA but this phenomenon is uncommon in plants [64]. Accordingly, most of melon unigenes identified as potential targets featured only a single miRNA site. miRNAs that are conserved across species tend to have conserved targets too, and our data confirm this is the case in melon. However, several unigenes predicted with high confidence as targets for conserved miRNAs had different annotations to the corresponding target genes in Arabidopsis, although these may represent false positives that would fail additional validation. Furthermore, the nonconserved targets of conserved miRNAs can be cleaved at a lower frequency than conserved targets [12]. For these two reasons, the selection of targets for individual validation experiments can be challenging.
An interesting alternative for miRNA target discovery in a genome-scale is the analysis of the small RNA degradome [65], as this avoids the a priori selection of potential targets. High-throughput gene expression profiling techniques such as microarray hybridization can also help to predict miRNA targets because some times a negative correlation between the abundance of miR-NAs and their target mRNAs can be identified [66,67]. We have used microarrays to monitor gene expression profiles in healthy TGR-1551 and Tendral plants and plants infected with WMV. When compared with our miRNA data, we were able to identify two unigenes encoding AGO proteins that were differentially expressed and showed contrasting expression profiles in susceptible and resistant genotypes (Gonzalez-Ibeas and Aranda, unpublished data). The same profile was observed for miR168 accumulation, suggesting that miR168 may be involved in virus resistance and providing the basis for future experiments.

Conclusion
We have analysed and catalogued a collection of melon endogenous sRNA obtained through massive cDNA sequencing and have identified known miRNAs and ta-siRNAs (conserved in other species) as well as potential melon-specific miRNAs with no database matches. We have also identified potential targets for these miRNAs in the melon transcriptome. Census sequencing (i.e. counting the number of sequence reads for each sRNA) was used to profile their expression in different tissues, and in healthy vs. virus-infected cotyledons. By comparing the predicted targets and the differential expression profiles we were able to provide insights into the role of miRNAs in the regulation of fruit development and plant-virus interactions.
Melon plants were grown under greenhouse conditions (~25/20°C, 16-h photoperiod,~70% relative humidity) in 0.5-L pots with substrate (Tendral and TGR-1551) or in soil bags with the capacity for four plants (Piel de Sapo). Fruits of 15 and 45 days after pollination (DAP) were collected and mesocarp tissues were recovered and used for RNA extractions. Virus infected samples were obtained from completely expanded cotyledons rubbed with carborundum (ø = 0.037 mm) and the corresponding viral inoculum. MNSV-infected melon cotyledons exhibiting lesions and marrow leafs systemically infected with WMV were ground in cold inoculation buffer (0.2 M phosphate buffer pH = 8.0, 0.1% (v/v) beta-mercaptoethanol, 0.03 g/ ml activated charcoal) for inoculum preparation. Mockinoculated control cotyledons were rubbed with inoculation buffer and carborundum alone.
Cotyledons were harvested at 1, 3, 5 and 7 days postinoculation (dpi) and pooled for RNA extraction. Fruit samples were prepared as previously described [8].
Ovaries were collected at stages C1 and C5 (Mascarell-Creus et al., unpublished). The C1 stage corresponds to flower emergence from the inflorescence bud, when the outermost perianth organs commence development and no floral whorls are visible. The C5 stage corresponds to anthesis, when the flower is ready to be fertilized and all floral organs are fully formed, including the yellow petals that attract pollinators. Under normal growth conditions, C1 to C5 development takes approximately 5 days.
DNA amplicons were gel-purified using 4% Metaphor Agarose and isolated using the QIAEX II Gel Extraction Kit (Qiagen, Hilden, Germany). The quantity and quality of the DNA amplicons were determined using a ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, Delaware, USA) and an Experion Automated Electrophoresis System (Bio-Rad, Hercules, California, USA). The same quantity of DNA from each library was pooled and sequenced using the 454 Life Science Technology platform (Lifesequencing S.L., Paterna, Valencia, Spain). Sequence data in this publication have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE28653 http://www.ncbi.nlm.nih.gov/geo/query/acc. cgi?acc=GSE28653.