Systematic characterization of small RNAome during zebrafish early developmental stages
© Yao et al.; licensee BioMed Central Ltd. 2014
Received: 5 November 2013
Accepted: 7 February 2014
Published: 10 February 2014
During early vertebrate development, various small non-coding RNAs (sRNAs) such as MicroRNAs (miRNAs) and Piwi-interacting RNAs (piRNAs) are dynamically expressed for orchestrating the maternal-to-zygotic transition (MZT). Systematic analysis of expression profiles of zebrafish small RNAome will be greatly helpful for understanding the sRNA regulation during embryonic development.
We first determined the expression profiles of sRNAs during eight distinct stages of early zebrafish development by sRNA-seq technology. Integrative analyses with a new computational platform of CSZ (characterization of small RNAome for zebrafish) demonstrated an sRNA class transition from piRNAs to miRNAs as development proceeds. We observed that both the abundance and diversity of miRNAs are gradually increased, while the abundance is enhanced more dramatically than the diversity during development. However, although both the abundance and diversity of piRNAs are gradually decreased, the diversity was firstly increased then rapidly decreased. To evaluate the computational accuracy, the expression levels of four known miRNAs were experimentally validated. We also predicted 25 potentially novel miRNAs, whereas two candidates were verified by Northern blots.
Taken together, our analyses revealed the piRNA to miRNA transition as a conserved mechanism in zebrafish, although two different types of sRNAs exhibit distinct expression dynamics in abundance and diversity, respectively. Our study not only generated a better understanding for sRNA regulations in early zebrafish development, but also provided a useful platform for analyzing sRNA-seq data. The CSZ was implemented in Perl and freely downloadable at: http://csz.biocuckoo.org.
KeywordsDeep sequencing miRNA piRNA Zebrafish Embryonic development
Small non-coding RNAs (sRNAs) of about 20 ~ 30 nucleotides (nt) play an essential role in a variety of animal developmental processes, such as embryonic, neuronal, muscle, and germline development [1–3]. MicroRNAs (miRNAs) and Piwi-interacting RNAs (piRNAs), which are different in biogenesis and biological function, are two predominant types of sRNAs [4, 5]. Mature single-strand miRNAs are 21 ~ 25 nt and derived from longer primary miRNA molecules (pri-miRNAs), which are double-stranded RNAs (dsRNAs) and are successively processed by two RNase III endonucleases, namely Drosha and Dicer [1, 4]. It was estimated that miRNAs regulate the gene expression of at least one third of all human genes and are involved in a broad spectrum of biological processes, such as development, metabolism, and tumorigenesis by either translational repression, RNA degradation or both through an RNA-induced silencing complex (RISC) [1, 2, 6, 7]. In contrast to miRNAs, piRNAs are slightly longer with a peak size distribution of 26 ~ 28nt, mainly necessary for germ cell maintenance and genome protection by silencing transposable elements . The primary piRNAs are maternally deposited or generated through an unclear mechanism, while more secondary piRNAs are originated from discrete genomic loci termed piRNA clusters, which produce piRNAs from both genomic strands and then reciprocally promote the generation of complementary piRNAs following a Ping-Pong model [9, 10].
Analysis of expression profiles of sRNAs is fundamental for understanding the molecular regulations during early vertebrate development . Recent studies in mice revealed an sRNA class transition that the expression levels of piRNAs/siRNAs are gradually reduced while more and more miRNA are expressed during embryonic development [11, 12]. This observation was also confirmed in chicken  and sea urchins . As a model system, zebrafish has been extensively used as for studying early vertebrate development . Previous studies revealed that most miRNAs are rarely expressed during the first 12 hpf (hours post fertilization) of early zebrafish development but their expression and diversity increase in later stages [16, 17]. Although the lack of diversity in early stages, miRNAs still play a crucial role in the regulation of gene expression [2, 3, 18]. For example, zebrafish miR-430 family is one of the most abundant miRNA families at early developmental stages and leads to degradation and clearance of maternal mRNAs for facilitating the maternal-to-zygotic transition (MZT) . Recently, with the next-generation sequencing (NGS) technique, Wei et al. quantitatively analyzed sRNA expression profiles in 256-cell (2.5 hpf), sphere (4 hpf), shield (6 hpf), and 1 dpf (days post fertilization) stages of early zebrafish development . In contrast with previous results, they observed the expressions of both miRNAs and piRNAs are firstly increased and then decreased, with a peak expression at sphere stage for miRNAs and shield stage for piRNAs, respectively . Therefore, more analyses should be carried out to clarify controversial viewpoints of sRNA expression dynamics during zebrafish embryonic development.
Rapid progress in NGS technologies has provided a great opportunity to investigate the sRNA transcriptome at an unprecedented sensitivity . However, it’s still a great challenge to analyze the deep sequencing data in an accurate and fast manner. For sRNA-seq data, characterization of both known and novel miRNAs have attracted most attention because of the functional importance of miRNAs . In contrast with directly mapping reads to known miRNAs for the quantification, prediction of potentially new miRNAs from short reads is particularly difficult and intriguing. In 2005, Xue et al. firstly obtained 32 local structure-sequence features (triplet elements) from known human pre-miRNAs and constructed a pre-miRNA predictor of triplet-SVM based on the support vector machines (SVMs) algorithm . Later, MiPred was constructed with a random forest (RF) algorithm by incorporating minimum of free energy (MFE) of the secondary structure of human pre-miRNAs and P-value of randomization test with triplet elements in triplet-SVM, with a superior performance than triplet-SVM . Recently, Lertampaiporn et al. designed an ensemble predictor of HeteroMirPred by combining a number of machine learning algorithms such as SVMs, RF and k-nearest neighbors (kNN), while the prediction results were integrated and selected through a voting system . Previously, by collecting known pre-miRNAs from 133 species, we also developed an SVM-based tool of miRD, which adopted 59 and 139 sequence and structure features for the prediction of single- and multi-stem pre-miRNAs . However, these approaches were designed to predict pre-miRNAs but not mature miRNAs.
Recently, MIREAP (http://sourceforge.net/projects/mireap/files/mireap/) and miRDeep  were developed for the identification and prediction of mature miRNAs from the heterogeneous background of high-throughput sequencing data based on the compatibility of miRNA biogenesis model with the distribution characteristics of short reads on candidate precursors. However, the local sequence and structure features of pre-miRNAs were not considered for the prediction. Furthermore, Hackenberg et al. constructed an integrative tool of miRanalyzer, which can identify and quantify both known and novel mature miRNAs . To predict novel miRNAs, the RF algorithm and 48 sequence and structure features were used, while known miRNAs for training were taken from human, C. elegans and rat. In addition, we also developed a computational platform of CPSS for analyzing the sRNA-seq data, whereas MIREAP and miRDeep were used for the prediction of novel miRNAs . Although a number of efforts have been contributed to this area, no tools were implemented specifically for analyzing zebrafish sRNA-seq data.
In this work, the sRNA-seq technology was first used to determine the expression profiles of sRNAs during eight stages of early zebrafish embryonic development. Based on known zebrafish pre-miRNAs, we designed a zebrafish-specific algorithm of ZmirP (zebrafish miRNA prediction), with 8 new and 57 previously reported sequence and structure features. These features were combined together to construct an SVM model for further filtering potentially false positive hits identified from MIREAP and miRDeep2. The performance and robustness of ZmirP were extensively evaluated by the leave-one-out (LOO) validation and n-fold cross-validations. By comparisons on zebrafish-specific pre-miRNAs, ZmirP exhibits greater sensitivity of 95.64% and specificity of 98.84%, which is proved to be better than other existing approaches through comparison. Also, the performance of ZmirP is comparative with other tools for predicting human pre-miRNAs. Then we greatly improved the CPSS  and developed a more specific platform as CSZ (characterization of small RNAome for zebrafish) for the analysis of the high-through sequencing data. From the results, we observed that the expression levels of piRNAs are gradually decreased, while miRNA expressions are gradually increased during early embryonic stages. Thus, the sRNA class transition from piRNA to miRNA was confirmed in early zebrafish embryonic development. Furthermore, we observed that the diverse and complex of expression patterns and levels of 129 known miRNA families are dramatically increased as development proceeds. Moreover, 25 novel miRNA candidates were predicted by CSZ with high confidence. We randomly selected three predicted miRNAs for further experimental investigation, and two of them, m0027-5p and chr6_7844-5p, were confirmed through Northern blots. In addition, widespread expression of piRNAs before MZT suggested piRNAs may play a potential role during early development. Taken together, our studies contributed valuable clues for further investigating the sRNA regulation of embryonic development, and provided useful techniques for small RNAome analysis.
Results and discussion
A novel algorithm for the prediction of zebrafish-specific pre-miRNAs
Features used in ZmirP
%(|G| + |C|)
(|G| + |C|) / L * 100, where L is the length of sequence
|XY| / (L − 1) * 100, where |XY| denotes the number of dinucleotide XY, X,Y ⊂ [A,C,G,U]
Maximum of consecutive paired nucleotides in the secondary structure
The total number of paired nucleotides / L
The total number of unpaired nucleotides / L
The total number of paired nucleotides / the total number of unpaired nucleotides
The total number of base pairs / the total number of stems a
The total number of base pairs / L
The total number of bulges b
The total number of unpaired nucleotides / the total number of bulges
The total number of bulges / L
Triplet structure-sequence elements
Calculated using RNAfold under the default parameters
MFE / L
(MFE / L) /%(|G| + |C|)
(MFE / L) / n_stems, where n_stems denotes the number of stems
(MFE / L) / n_loops, where n_loops denotes the number of loops
MFE / the total number of base pairs
(MFE / L) / n_bulges
Comparison of ZmirP with other approaches for zebrafish pre-miRNAs
Development of the CSZ platform for analyzing zebrafish sRNA-seq data
In our results, a large proportion of reads were identically mapped to known miRNAs in miRBase, but with a few number of shorter or longer nucleotides. These sequences might be produced by imprecise Dicer processing and non-template-directed nucleotide addition, and stand for different isoforms of the same miRNA, which were referred as isomiRs . Therefore, a small window around the location of an annotated pre-miRNA plus 2 nt upstream and 5 nt downstream, was tolerated through mapping unique reads to known pre-miRNAs . Although different isomiRs existed for one miRNA, the observed frequencies of most abundant isomiRs were adopted as the expression levels of sequenced miRNAs . For unknown sequences that could not be assigned to any of known categories, we detected potentially novel miRNAs using MIREAP (https://sourceforge.net/projects/mireap/) and miRDeep2  with the default settings. Because too many putative results were generated by MIREAP or miRDeep2, we adopted the ZmirP algorithm for further filtering potentially false positive hits, with a default cut-off value of 0.8.
Systematic analysis of small RNAome in early zebrafish development
The analysis of the length distribution of reads matching the reference genome within each developmental stage uncovered two peaks between 18 nt and 35 nt (Figure 3B). The first peak around 22 nt is a potentially strong signal for miRNAs according to their characteristic of size distribution, while the second broader peak around 28 nt may represent piRNAs (Figure 3B). Because different samples had different sequencing depths, we took the factor into account by normalizing observed frequencies of reads into RPM data, while the relative abundances of reads with different lengths in eight developmental stages were illustrated under the same scale. Again, the bimodal length distribution with two different peaks around 22 nt and 28 nt was robustly detected (Figure 3C). Interestingly, we also observed that the percentiles of ~22-nt sequences dramatically increased from ~1.64% of 1-cell to ~18.96% of 48hpf (~11.5 fold), while the proportions of ~28-nt sequences significantly decreased from ~18.11% in 1-cell to ~5.36% in 48hpf (~3.4 fold) (Figure 3C). In addition, to analyze the diversity of sRNA sequences, the relative abundances of unique sequences with different lengths also were visualized (Figure 3D). Although the peak around 22 nt was not very significant, the percentiles still increased from ~1.32% of 1-cell to ~6.91% in 48hpf (~5.2 fold) (Figure 3D). Thus, our results demonstrated that both miRNA abundance and diversity dramatically increased during the early development. In contrast, the second peak around 28 nt was not influenced, and the results suggested that both abundance and diversity of piRNAs significantly decreased from 1-cell stage to 48 hpf (Figure 3D). Taken together, our analyses clearly demonstrated an sRNA transition from piRNA to miRNA during zebrafish early embryonic development.
Expression profiles of known and novel miRNAs in zebrafish early embryonic development
By using different thresholds for mapped reads (≥ 3, ≥ 10, ≥ 100, and ≥ 1000), we observed that less miRNAs were observed before 5.3 hpf and the number of miRNAs rapidly increased after the stage (Figure 4B). Using a cutoff value of ≥ 3, the numbers of known miRNAs in eight samples were 107, 98, 81, 99, 141, 173, 202 and 214, respectively. When the threshold was increased to 100, the number of known miRNAs in eight samples significantly dropped to 12, 13, 12, 12, 33, 65, 107 and 130, respectively. Thus, the results demonstrated most miRNAs are lowly expressed before 5.3 hpf (Figure 4B). Moreover, the percentages of miRNAs in sRNA-seq data were also shown, while the expression curve further confirmed that the miRNA abundance increased more significantly than the miRNA diversity, especially after the 5.3 hpf stage (Figure 4B). In addition, the distribution of different types of sRNAs was shown for total reads (Figure 4C) and unique reads (Figure 4D). The results not only confirmed the piRNA to miRNA transition, but also demonstrated that miRNA abundance enhanced more dramatically than the diversity. Take together, the expression levels of miRNAs were largely determined by a limited number of potentially important miRNAs.
Totally, we predicted 25 potentially novel miRNAs with high confidence
Experimental validation of known and novel miRNAs
Expression levels of piRNAs in zebrafish early development
Previously, we only considered piRNAs that were not annotated as repeat sequences (Non-repeat-associated piRNAs) in CPSS . However, over 50% of piRNAs in both piRNABank database and our results could be mapped to repetitive sequences (Repeat-associated piRNAs). Also, by analyzing the length distribution, we observed that piRNAs from different sources were very similar (Figure 8B). Furthermore, the nucleotide preferences of 15-nt subsequences taken from 5′-end sequences of piRNAs were visualized by WebLogo 3 . No significant differences were observed between repeat-associated and non-repeat-associated piRNAs (Figure 8C). In addition, we separately grouped repeat-associated and non-repeat-associated piRNAs from piRNABank and our results into distinct clusters, and revealed that ~90% piRNA clusters were derived from repeat-associated piRNAs (Figure 8D). In this regard, the rescue of piRNAs from unclassified repeat sequences is essential for a more comprehensive analysis of piRNAs. Taken together, although piRNAs were proposed to be functional and abundantly produced in germline cells, highly expressed piRNAs before the MZT suggested that a considerable number of piRNAs might play an important role in zebrafish early embryonic development.
During early vertebrate development, various sRNAs are temporally and spatially expressed to orchestrate the embryogenesis . Previous studies in several model organisms revealed an sRNA class transition from piRNAs/siRNAs to miRNAs, and the transition is essential for the MZT by activating zygotic genome and clearing maternal RNAs [11–14, 18]. However, this phenomenon was not observed from a recent study of four developmental stages in zebrafish . To evaluate the viewpoint, here we systematically characterized the transcriptional profiles of zebrafish sRNAs with the sRNA-seq technology from eight early developmental stages, including 1-cell, 16-cell, 512-cell, oblong, 5.3 hpf, 6-somite, 24 hpf and 48 hpf. To promise the quality for mapping reads, only one nucleotide mismatch was permitted for all sequence alignments. After removing low quality reads and uninformative sequences, we totally mapped 32.78% ~ 70.43% of all reads to the zebrafish genome (Additional file 1: Table S1).
How to retrieve useful information from the huge amount of repeat sequences is a great challenge for analyzing the sRNA-seq data. Many types of RNAs were annotated in repetitive sequences, such as rRNA, tRNA, snRNA/snoRNA, and repeat-associated piRNAs. If these annotations were not considered, the identification of the five types of sRNAs would be greatly underestimated. Thus, we greatly optimized our previous pipeline and designed a more efficient platform of CSZ, which first recalled rRNA, tRNA, and snRNA/snoRNA back to their own groups and then re-characterized piRNAs from the remaining repeats. For predicting potentially novel miRNAs, we also greatly refined our algorithm. Totally, there were 197 sequence and structure features used in miRD for the prediction of single- or multi-stem pre-miRNAs . However, only 65 features were used in ZmirP, and only two features including MFE and the ratio of paired nucleotide to unpaired nucleotide were share by the two methods. Furthermore, because only small proportion of multi-stem pre-miRNAs exists in zebrafish, we used all known zebrafish pre-miRNAs for training. Indeed, we did not observe any multi-stem pre-miRNAs in 25 newly predicted results (Table 3, Additional file 4: Figure S1). By comparison, ZmirP exhibited a superior performance for predicting zebrafish pre-miRNAs (Figure 1A), and can be comparable with than other existing tools for human pre-miRNAs (Figure 1B). Because several steps in CSZ, such as the identification of potential miRNAs by MIREAP/miRDeep2 and reads mapping to genome, were too time-consuming, the development of a web server will be a heavy burden for our computational resources. Thus, the CSZ was written in Perl as a stand-alone package at: http://csz.biocuckoo.org/down.php. The SVM model for zebrafish pre-miRNAs was directly included in the program, while the human-specific model was also provided. Take together, the newly developed platform can accurately identify both miRNAs and piRNAs from the sRNA-seq data.
Our analyses on known miRNAs expressed in early development are consistent with previous studies in zebrafish or other model organisms. For example, as one of the most abundant miRNA families, zebrafish miR-430 family facilitates the MZT by the clearance of maternal mRNAs . In our results, the dre-miR-430a family (including dre-miR-430a, dre-miR-430b, dre-miR-430c and dre-miR-430i) is highly expressed from 512-cell to 6-somite stage, and may be functional in MZT (Figure 5C). Also, our results of dre-miR-456 and dre-miR-22a families are also consistent with the studies in the early development of chicken  and mouse , respectively. Moreover, we identified a number of known miRNAs reported to be involved in other processes may also play a potential role in embryonic development. For example, the dre-miR-192 family was characterized to be implicated in zebrafish immune response to bacterial infection , whereas our results suggested that the family may also participate in regulating the early development (Figure 5C). In addition, we predicted 25 potential novel miRNAs, and validated two of them, m0027-5p and chr6_7844-5p, are expressed in zebrafish 16-cell stage samples (Figure 7). Taken together, our results can be a useful resource for further analysis of sRNAs in early zebrafish embryonic development.
The zebrafish genome assembly version 2010 (Zv9) was downloaded from the UCSC database . Sequences of 247 zebrafish mature miRNAs and 344 pre-miRNAs were obtained from miRBase (release 19) . Also, zebrafish snRNA, snoRNA, tRNA and rRNA were retrieved from Rfam (release 11.0) . Zebrafish repeat sequences were extracted from the reference genome according to the annotations in the UCSC database . Moreover, the RefSeq mRNAs of zebrafish were downloaded from the UCSC database . The zebrafish piRNAs were retrieved from the piRNABank, a comprehensive data resource for piRNAs . And the protein coding sequences (CDSs) of zebrafish were obtained through the Table Browser in the UCSC database . In addition, we obtained 1,600 human pre-miRNAs from miRBase (release 19) .
The ZmirP algorithm
Previously, we selected different features to predict single- and multi-stem pre-miRNAs, respectively . However, because there were only 19 (~5.5%) known multi-stem pre-miRNAs in zebrafish, we simply mixed single- and multi-stem pre-miRNAs together for the feature selection. First, we collected 195 sequence and structure features from previously published studies [21, 24, 46]. We also introduced 11 new structure features, including the ratio of paired nucleotides, the ratio of unpaired nucleotides, the number of bulges, the normalized number of bulges, maximum number of consecutive paired nucleotides, average unpaired nucleotides per bulge, MFE4, MFE5, the size of largest bulge, the normalized number of stems, and the normalized number of loops. Because exhaustively searching optimal combination of 206 features is too time-consuming, we classified these features into 23 sets. The first set contains 16 sequence features of dinucleotide frequencies from microPred , and the second set includes 32 structure features of triplet elements raised by triplet-SVM . The third set contains 138 structure features for specifically predicting multi-stem pre-miRNAs in miRD . The remaining unclassified 20 sequence or structure features were directly regarded as 20 sets.
where n + and n − are the numbers of positive and negative samples, respectively; , and are the average of the i th feature set of total, positive and negative samples, respectively. and are the i th feature set of the k th positive sample and negative sample, respectively. For 20 sets containing only one feature, the F-scores were directly computed. For three sets with multiple features, the F-score value of each feature was calculated separately, and then the average F-score of each set was calculated, respectively. Finally, 19 feature sets including 17 sequence and 48 structure features were selected to build an SVM model for pre-miRNA prediction.
The LIBSVM package was used for training (version 3.16, http://www.csie.ntu.edu.tw/~cjlin/libsvm/), whereas the svm-scale program in LIBSVM was used to rescale all features with the interval [−1.0, 1.0]. The most widely used radial basis function (RBF) was chosen. The penalty parameter C and RBF kernel parameter γ were exhaustively searched as 8.0 and 0.03125, respectively.
In this study, the LOO validation and 4-, 6-, 8-, 10-fold cross-validations were performed as previously described . And the Receiver Operating Characteristic (ROC) curves were plotted.
Small RNA library construction and sequencing
Breeding wild-type zebrafish (Danio rerio) (AB type) were maintained under standard library conditions and the embryonic stages in this study were as described [15, 48]. Zebrafish embryos were collected at 1-cell (0.2 hpf), 16-cell (1.5 hpf), 512-cell (2.75 hpf), oblong (3.7 hpf), 5.3 hpf (50% epibody), 6-somite (12 hpf), 24 hpf (day1) and 48 hpf (day2) stages. Total RNA from embryos was isolated using Trizol reagent (Invitrogen). RNAs were fractioned on 15% denaturing polyacrylamide gels, and small RNAs were isolated and purified. Subsequently, small RNAs were ligated with both a 5′ adapter and 3′ adapter for reverse transcription using SuperscriptTM II reverse transcription kit (Invitrogen) following the manufacturer’s instructions at 42°C for 1 h and 70°C for 15 min. After that, the reverse transcribed product, cDNA was amplified by the following PCR program: a 15-cycle reaction at 98°C for 30 sec, followed by 15 cycles consisting of 10 sec at 98°C, 15 sec at 72°C, and then 10 min at 72°C. After obtaining a 92 bp DNA band on 6% denaturing PAGE gels, the PCR products were enriched by ethanol precipitation and purified using Spin-X filter columns (Fisher). Finally, small RNA libraries were constructed and sequenced by the Illumina HiSeq™ 2000 following the manufacturer’s protocol in BGI-Shenzhen, China. The primers
3′ ligation adaptor: 5′-GUUCAGAGUUCUACAGUCCGACGAUC-3′
5′ ligation adaptor: 5′-PUCGUAUGCCGUCUUCUGCUUGUidT-3′
PCR forward primer: 5′-CAAGCAGAAGACGGCATACGA-3′
PCR reverse primer:
Raw reads obtained from sequencing platform were processed by filtering out the low quality reads, trimming 3′ adaptor sequences, removing the 5′ adaptor contaminants, and eliminating reads containing ploy(A).
Quantitative real-time PCR
Quantitative real-time PCR (qRT-PCR) assays (miScript Reverse Transcription Kit and miScript SYBR Green PCR Kit, Qiagen) were performed to determine the expression levels of four known miRNAs (dre-miR-456, dre-miR-22a, dre-miR-206 and dre-miR-192). 10 ng of total RNA extracted from zebrafish embryos was used for cDNA synthesis. U6 snRNA was used as an endogenous control and each reverse transcription was conducted in triplicate. The expression levels of miRNAs were measured by the threshold cycle values (Ct). The relative expression levels were assigned as Equation 2-∆∆ct. The primers were designed as follows (Universal reverse primer was brought in miScript SYBR Green PCR Kit of Qiagen):
U6 snRNA_F: 5′-CTCGCTTCGGCAGCACA-3′
A total of 200 μg of total RNA was extracted from zebrafishes embroys at 16-cell stage, electrophoresed on 1.2% denaturing agarose-formaldehyde gel, and transferred to HybondTM-N + nylon membranes (Amersham, USA) for Northern blot analysis of three novel miRNAs, m0027-5p, chr6_7844-5p and m0026-5p. U6 snRNA was used as an endogenous control. The sequences of northern probes were 3′-DIG-labelled and complementary to the novel mature miRNAs and U6 sequences. The probes were as follows:
m0027-5p: 5′-CACGTCCCCCCAGAATTCACCATAAC-3′ DIG
chr6_7844-5p: 5′-CACCCTCATCTCCCCAAACCTACAC-3′ DIG
m0026-5p: 5′-AAGTGAGTCTCGTCATTCTGTC-3′ DIG
U6 snRNA: 5′-CTCGCTTCGGCAGCACA-3′ DIG
All of the studies using zebrafish were approved by the Animal Care and Use Committee of Huazhong University of Science and Technology.
The sequencing data reported in this work can be available from the NCBI Short Read Archive (SRA) under accession number [SRP028862].
small non-coding RNAs
primary miRNA molecules
RNA-induced silencing complex
hours post fertilization
days post fertilization
- triplet elements:
local structure-sequence features
Support vector machines
Minimum of free energy
Zebrafish miRNA prediction
Characterization of small RNAome for zebrafish
Radial basis function
Mathew’s correlation coefficient
Receiver Operating Characteristic
quantitative real-time PCR
reads per million
- Non-repeat-associated piRNAs:
piRNAs that were not annotated as repeat sequences
- Repeat-associated piRNAs:
piRNAs that could be mapped to repetitive sequences.
The authors thank Dr. Yong Zhang (Tongji Univ.) for their helpful suggestions on the analysis of deep sequencing data. This work was supported by grants from the National Basic Research Program (973 project) (2013CB933903, 2012CB910101, 2010CB945401, 2013CB945300, 2012CB911201); Natural Science Foundation of China (31171263, 81272578, 31071154, 31000640, 31171387, 81090414, 81230052, 91019020); and International Science & Technology Cooperation Program of China (0S2013ZR0003).
- Moazed D: Small RNAs in transcriptional gene silencing and genome defence. Nature. 2009, 457 (7228): 413-420. 10.1038/nature07756.PubMed CentralPubMedView ArticleGoogle Scholar
- Stefani G, Slack FJ: Small non-coding RNAs in animal development. Nat Rev Mol Cell Biol. 2008, 9 (3): 219-230. 10.1038/nrm2347.PubMedView ArticleGoogle Scholar
- Pauli A, Rinn JL, Schier AF: Non-coding RNAs as regulators of embryogenesis. Nat Rev Genet. 2011, 12 (2): 136-149. 10.1038/nrg2904.PubMed CentralPubMedView ArticleGoogle Scholar
- Bartel DP: MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004, 116 (2): 281-297. 10.1016/S0092-8674(04)00045-5.PubMedView ArticleGoogle Scholar
- Houwing S, Kamminga LM, Berezikov E, Cronembold D, Girard A, van den Elst H, Filippov DV, Blaser H, Raz E, Moens CB, et al: A role for Piwi and piRNAs in germ cell maintenance and transposon silencing in zebrafish. Cell. 2007, 129 (1): 69-82. 10.1016/j.cell.2007.03.026.PubMedView ArticleGoogle Scholar
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005, 120 (1): 15-20. 10.1016/j.cell.2004.12.035.PubMedView ArticleGoogle Scholar
- Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert BL, Mak RH, Ferrando AA, et al: MicroRNA expression profiles classify human cancers. Nature. 2005, 435 (7043): 834-838. 10.1038/nature03702.PubMedView ArticleGoogle Scholar
- Malone CD, Brennecke J, Dus M, Stark A, McCombie WR, Sachidanandam R, Hannon GJ: Specialized piRNA pathways act in germline and somatic tissues of the Drosophila ovary. Cell. 2009, 137 (3): 522-535. 10.1016/j.cell.2009.03.040.PubMed CentralPubMedView ArticleGoogle Scholar
- Brennecke J, Aravin AA, Stark A, Dus M, Kellis M, Sachidanandam R, Hannon GJ: Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell. 2007, 128 (6): 1089-1103. 10.1016/j.cell.2007.01.043.PubMedView ArticleGoogle Scholar
- Aravin AA, Hannon GJ, Brennecke J: The Piwi-piRNA pathway provides an adaptive defense in the transposon arms race. Science. 2007, 318 (5851): 761-764. 10.1126/science.1146484.PubMedView ArticleGoogle Scholar
- Suh N, Blelloch R: Small RNAs in early mammalian development: from gametes to gastrulation. Development. 2011, 138 (9): 1653-1661. 10.1242/dev.056234.PubMed CentralPubMedView ArticleGoogle Scholar
- Ohnishi Y, Totoki Y, Toyoda A, Watanabe T, Yamamoto Y, Tokunaga K, Sakaki Y, Sasaki H, Hohjoh H: Small RNA class transition from siRNA/piRNA to miRNA during pre-implantation mouse development. Nucleic Acids Res. 2010, 38 (15): 5141-5151. 10.1093/nar/gkq229.PubMed CentralPubMedView ArticleGoogle Scholar
- Shao P, Liao JY, Guan DG, Yang JH, Zheng LL, Jing Q, Zhou H, Qu LH: Drastic expression change of transposon-derived piRNA-like RNAs and microRNAs in early stages of chicken embryos implies a role in gastrulation. RNA Biol. 2012, 9 (2): 212-227. 10.4161/rna.18489.PubMedView ArticleGoogle Scholar
- Song JL, Stoeckius M, Maaskola J, Friedlander M, Stepicheva N, Juliano C, Lebedeva S, Thompson W, Rajewsky N, Wessel GM: Select microRNAs are essential for early development in the sea urchin. Dev Biol. 2012, 362 (1): 104-113. 10.1016/j.ydbio.2011.11.015.PubMed CentralPubMedView ArticleGoogle Scholar
- Kimmel CB, Ballard WW, Kimmel SR, Ullmann B, Schilling TF: Stages of embryonic development of the zebrafish. Dev Dyn. 1995, 203 (3): 253-310. 10.1002/aja.1002030302.PubMedView ArticleGoogle Scholar
- Chen PY, Manninga H, Slanchev K, Chien M, Russo JJ, Ju J, Sheridan R, John B, Marks DS, Gaidatzis D, et al: The developmental miRNA profiles of zebrafish as determined by small RNA cloning. Genes Dev. 2005, 19 (11): 1288-1293. 10.1101/gad.1310605.PubMed CentralPubMedView ArticleGoogle Scholar
- Wienholds E, Kloosterman WP, Miska E, Alvarez-Saavedra E, Berezikov E, de Bruijn E, Horvitz HR, Kauppinen S, Plasterk RH: MicroRNA expression in zebrafish embryonic development. Science. 2005, 309 (5732): 310-311. 10.1126/science.1114519.PubMedView ArticleGoogle Scholar
- Giraldez AJ, Mishima Y, Rihel J, Grocock RJ, Van Dongen S, Inoue K, Enright AJ, Schier AF: Zebrafish MiR-430 promotes deadenylation and clearance of maternal mRNAs. Science. 2006, 312 (5770): 75-79. 10.1126/science.1122689.PubMedView ArticleGoogle Scholar
- Wei C, Salichos L, Wittgrove CM, Rokas A, Patton JG: Transcriptome-wide analysis of small RNA expression in early zebrafish development. RNA. 2012, 18 (5): 915-929. 10.1261/rna.029090.111.PubMed CentralPubMedView ArticleGoogle Scholar
- Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009, 10 (1): 57-63. 10.1038/nrg2484.PubMed CentralPubMedView ArticleGoogle Scholar
- Xue C, Li F, He T, Liu GP, Li Y, Zhang X: Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinformatics. 2005, 6: 310-10.1186/1471-2105-6-310.PubMed CentralPubMedView ArticleGoogle Scholar
- Jiang P, Wu H, Wang W, Ma W, Sun X, Lu Z: MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res. 2007, 35 (Web Server issue): W339-W344.PubMed CentralPubMedView ArticleGoogle Scholar
- Lertampaiporn S, Thammarongtham C, Nukoolkit C, Kaewkamnerdpong B, Ruengjitchatchawalya M: Heterogeneous ensemble approach with discriminative features and modified-SMOTEbagging for pre-miRNA classification. Nucleic Acids Res. 2013, 41 (1): e21-10.1093/nar/gks878.PubMed CentralPubMedView ArticleGoogle Scholar
- Zhang Y, Yang Y, Zhang H, Jiang X, Xu B, Xue Y, Cao Y, Zhai Q, Zhai Y, Xu M, et al: Prediction of novel pre-microRNAs with high accuracy through boosting and SVM. Bioinformatics. 2011, 27 (10): 1436-1437. 10.1093/bioinformatics/btr148.PubMedView ArticleGoogle Scholar
- Friedlander MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, Rajewsky N: Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol. 2008, 26 (4): 407-415. 10.1038/nbt1394.PubMedView ArticleGoogle Scholar
- Hackenberg M, Sturm M, Langenberger D, Falcon-Perez JM, Aransay AM: miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments. Nucleic Acids Res. 2009, 37 (Web Server issue): W68-W76.PubMed CentralPubMedView ArticleGoogle Scholar
- Zhang Y, Xu B, Yang Y, Ban R, Zhang H, Jiang X, Cooke HJ, Xue Y, Shi Q: CPSS: a computational platform for the analysis of small RNA deep sequencing data. Bioinformatics. 2012, 28 (14): 1925-1927. 10.1093/bioinformatics/bts282.PubMedView ArticleGoogle Scholar
- Hofacker IL: Vienna RNA secondary structure server. Nucleic Acids Res. 2003, 31 (13): 3429-3431. 10.1093/nar/gkg599.PubMed CentralPubMedView ArticleGoogle Scholar
- Ding J, Zhou S, Guan J: MiRenSVM: towards better prediction of microRNA precursors using an ensemble SVM classifier with multi-loop features. BMC Bioinformatics. 2010, 11 (Suppl 11): S11-10.1186/1471-2105-11-S11-S11.PubMed CentralPubMedView ArticleGoogle Scholar
- Morin RD, O’Connor MD, Griffith M, Kuchenbauer F, Delaney A, Prabhu AL, Zhao Y, McDonald H, Zeng T, Hirst M, et al: Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 2008, 18 (4): 610-621. 10.1101/gr.7179508.PubMed CentralPubMedView ArticleGoogle Scholar
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10 (3): R25-10.1186/gb-2009-10-3-r25.PubMed CentralPubMedView ArticleGoogle Scholar
- Friedlander MR, Mackowiak SD, Li N, Chen W, Rajewsky N: miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Res. 2012, 40 (1): 37-52. 10.1093/nar/gkr688.PubMed CentralPubMedView ArticleGoogle Scholar
- Zhang B, Stellwag EJ, Pan X: Large-scale genome analysis reveals unique features of microRNAs. Gene. 2009, 443 (1–2): 100-109.PubMedView ArticleGoogle Scholar
- Schier AF: The maternal-zygotic transition: death and birth of RNAs. Science. 2007, 316 (5823): 406-407. 10.1126/science.1140693.PubMedView ArticleGoogle Scholar
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998, 95 (25): 14863-14868. 10.1073/pnas.95.25.14863.PubMed CentralPubMedView ArticleGoogle Scholar
- Lee SI, Lee BR, Hwang YS, Lee HC, Rengaraj D, Song G, Park TS, Han JY: MicroRNA-mediated posttranscriptional regulation is required for maintaining undifferentiated properties of blastoderm and primordial germ cells in chickens. Proc Natl Acad Sci U S A. 2011, 108 (26): 10426-10431. 10.1073/pnas.1106141108.PubMed CentralPubMedView ArticleGoogle Scholar
- Berenguer J, Herrera A, Vuolo L, Torroba B, Llorens F, Sumoy L, Pons S: MicroRNA 22 regulates cell cycle length in cerebellar granular neuron precursors. Mol Cell Biol. 2013, 33 (14): 2706-2717. 10.1128/MCB.00338-13.PubMed CentralPubMedView ArticleGoogle Scholar
- Stahlhut C, Suarez Y, Lu J, Mishima Y, Giraldez AJ: miR-1 and miR-206 regulate angiogenesis by modulating VegfA expression in zebrafish. Development. 2012, 139 (23): 4356-4364. 10.1242/dev.083774.PubMed CentralPubMedView ArticleGoogle Scholar
- Wu TH, Pan CY, Lin MC, Hsieh JC, Hui CF, Chen JY: In vivo screening of zebrafish microRNA responses to bacterial infection and their possible roles in regulating immune response genes after lipopolysaccharide stimulation. Fish Physiol Biochem. 2012, 38 (5): 1299-1310. 10.1007/s10695-012-9617-1.PubMedView ArticleGoogle Scholar
- Chen X, Li Q, Wang J, Guo X, Jiang X, Ren Z, Weng C, Sun G, Wang X, Liu Y, et al: Identification and characterization of novel amphioxus microRNAs by Solexa sequencing. Genome Biol. 2009, 10 (7): R78-10.1186/gb-2009-10-7-r78.PubMed CentralPubMedView ArticleGoogle Scholar
- Sai Lakshmi S, Agrawal S: piRNABank: a web resource on classified and clustered Piwi-interacting RNAs. Nucleic Acids Res. 2008, 36 (Database issue): D173-D177.PubMed CentralPubMedGoogle Scholar
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res. 2004, 14 (6): 1188-1190. 10.1101/gr.849004.PubMed CentralPubMedView ArticleGoogle Scholar
- Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, Goldman M, Barber GP, Clawson H, Coelho A, et al: The UCSC genome browser database: update 2011. Nucleic Acids Res. 2011, 39 (Database issue): D876-D882.PubMed CentralPubMedView ArticleGoogle Scholar
- Kozomara A, Griffiths-Jones S: miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011, 39 (Database issue): D152-D157.PubMed CentralPubMedView ArticleGoogle Scholar
- Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, Eddy SR, Gardner PP, Bateman A: Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013, 41 (Database issue): D226-D232.PubMed CentralPubMedView ArticleGoogle Scholar
- Batuwita R, Palade V: microPred: effective classification of pre-miRNAs for human miRNA gene prediction. Bioinformatics. 2009, 25 (8): 989-995. 10.1093/bioinformatics/btp107.PubMedView ArticleGoogle Scholar
- Xue Y, Liu Z, Gao X, Jin C, Wen L, Yao X, Ren J: GPS-SNO: computational prediction of protein S-nitrosylation sites with a modified GPS algorithm. PloS one. 2010, 5 (6): e11290-10.1371/journal.pone.0011290.PubMed CentralPubMedView ArticleGoogle Scholar
- Arata N, Nakayasu H: A periaxonal net in the zebrafish central nervous system. Brain Res. 2003, 961 (2): 179-189. 10.1016/S0006-8993(02)03888-X.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.