Conserved generation of short products at piRNA loci
© Berninger et al; licensee BioMed Central Ltd. 2011
Received: 16 September 2010
Accepted: 19 January 2011
Published: 19 January 2011
The piRNA pathway operates in animal germ lines to ensure genome integrity through retrotransposon silencing. The Piwi protein-associated small RNAs (piRNAs) guide Piwi proteins to retrotransposon transcripts, which are degraded and thereby post-transcriptionally silenced through a ping-pong amplification process. Cleavage of the retrotransposon transcript defines at the same time the 5' end of a secondary piRNA that will in turn guide a Piwi protein to a primary piRNA precursor, thereby amplifying primary piRNAs. Although several studies provided evidence that this mechanism is conserved among metazoa, how the process is initiated and what enzymatic activities are responsible for generating the primary and secondary piRNAs are not entirely clear.
Here we analyzed small RNAs from three mammalian species, seeking to gain further insight into the mechanisms responsible for the piRNA amplification loop. We found that in all these species piRNA-directed targeting is accompanied by the generation of short sequences that have a very precisely defined length, 19 nucleotides, and a specific spatial relationship with the guide piRNAs.
This suggests that the processing of the 5' product of piRNA-guided cleavage occurs while the piRNA target is engaged by the Piwi protein. Although they are not stabilized through methylation of their 3' ends, the 19-mers are abundant not only in testes lysates but also in immunoprecipitates of Miwi and Mili proteins. They will enable more accurate identification of piRNA loci in deep sequencing data sets.
Members of the Argonaute family of proteins have been identified as key players in small-RNA-guided silencing pathways. Of these, proteins of the Piwi clade are predominantly expressed in germ cells, where they associate with a small RNA population distinct from that of miRNAs, as was initially shown for mammals and insects [1–4]. Some of the Piwi-associated small RNAs (piRNAs) are involved in regulating the expression of transposable elements, a function which is particularly important during genome activation in the germ line. Evidence for this specific role of piRNAs comes from studies in fly [5–7], zebrafish [8, 9] and mouse [10–13]. Although Piwi-protein-associated small RNAs have also been discovered in worm (where they are called 21U RNAs) [14–17], their function in this species is less clear. Post-transcriptional silencing of retrotransposon transcripts is achieved by a piRNA amplification mechanism called ping-pong [6, 18]. Briefly, piRNAs that are generated by an as yet unidentified mechanism and are complementary to retrotransposon transcripts guide Piwi proteins to cleave these transcripts. The Piwi-catalyzed cleavage occurs at the bond between the target nucleotides that base-pair with nucleotides 10 and 11 of the primary piRNA. This cleavage defines the 5' end of a secondary piRNA that is generated from the transposon transcript. Because a very high proportion of piRNAs have a uridine (U) at the first position and because the complementarity between piRNAs and targets is expected to be nearly perfect, secondary piRNAs typically have an adenosine at position 10, which base-pairs with the U at the first position of the primary piRNA [6, 18]. After the 3' end of the secondary piRNA is generated through cleavage by a yet unknown nuclease and is subsequently 2'-O-methylated by the Hen-1 methyltransferase [5, 19, 20], the secondary piRNA is loaded into a Piwi protein to guide the cleavage of a new primary piRNA precursor.
This results in the production of piRNAs from the same loci from which the initial piRNAs were derived. Though the function of piRNAs has been characterized in detail in only a few species, evidence for Piwi protein-catalyzed cleavage and secondary piRNA production (a 10 nt overlap of the 5' end of piRNA sequences from opposite strands of the genome) has been provided for vertebrates, insects [6, 18], flatworms, sea anemones and sponges, suggesting an early origin of the ping-pong mechanism.
In mouse, piRNAs are observed in the developing germ cells, from the stage of G1-arrested gonocytes in the embryo up to the round spermatids in the adult. Two classes of piRNAs with somewhat different characteristic lengths have been described: piRNAs which are ≈ 26 nucleotides (nt) in length are observed prior to meiosis, pre-natally and in the pre-pachytene stage, whereas a population of longer piRNAs appears to take over around the pachytene stage. Evidence for secondary piRNA production has so far been found only in the prenatal and pre-pachytene piRNA populations, which associate with the Mili and Miwi2 proteins [10, 12]. In contrast, pachytene piRNAs, which associate with both Mili and Miwi, are thought to consist solely of primary piRNAs [12, 23]. After initial studies defined the characteristic length of piRNAs to be in the range of 23-32 nucleotides [3, 4, 10, 12, 13], larger than that of miRNAs, most studies of piRNAs extracted and analyzed only sequence reads whose length was in this range. Longer potential intermediates and longer or shorter by-products of the amplification loop, which could arise during the processing steps that generate the 5' and 3' ends of piRNAs and could in principle be observed in deep sequencing libraries, have therefore not been captured in these studies. Because such products may shed light on the individual steps of piRNA production and piRNA-dependent targeting, we re-analyzed the few published libraries from mouse, rat and platypus that covered a broader range of small RNA sizes (15-36 nt). We further generated and analyzed an additional library of small RNAs from adult mouse testes.
Signature of processing products at piRNA loci
As expected, we detected a strong signal characteristic for piRNA-directed targeting (see Figure 1B), i.e. we found a relatively high frequency of sense-antisense pairs with a distance of 9 between the 5' ends of their loci. Surprisingly, we found that a 28 nt distance between 5' ends is also very common, though it has not been reported before. In total, about 1,010,500 of the uniquely, perfectly-mapping sequences in the library contributed to those two peaks (which is about 16% of all mapped sequences). Reasoning that previous studies that did not report the peak at 28 nt investigated relatively long sequences (longer than 22 nt), we split the sequence set based on the length of the small RNAs and repeated the analysis. Indeed, restricting the analysis to sequences longer than 22 nt (the piRNA range) revealed only the peak at 9 nt (P9). The peak at 28 (P28) only became apparent when we included sequences with a broader length distribution, because it corresponds to a pattern in which on one strand there is a long sequence, in the range of a prototypical piRNA, and on the opposite strand a shorter sequence (≤ 22 nt) (see Figure 1B). Furthermore, we found that the shorter sequence is almost always 19 nucleotides in length (see Figure 1B, C, D).
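The relationship between the two peaks can be made concrete with a small coordinate calculation. The sketch below is illustrative only: the function name and the coordinate convention (guide piRNA on the minus strand, positions given as plus-strand genomic coordinates) are assumptions introduced here, and it further assumes that the 3' end of the 19-mer directly abuts the piRNA-guided cleavage site, which is what the observed distances of 9 and 28 nt imply.

```python
def pingpong_coordinates(guide_5p_end, short_len=19):
    """Coordinates implied by a minus-strand guide piRNA (hypothetical helper).

    guide_5p_end: plus-strand genomic coordinate of the guide's 5' nucleotide.
    short_len:    length of the short by-product (19 nt in the text).
    """
    # Guide nucleotide k pairs with plus-strand coordinate guide_5p_end - (k - 1).
    pos10 = guide_5p_end - 9    # target nucleotide paired with guide nucleotide 10
    pos11 = guide_5p_end - 10   # target nucleotide paired with guide nucleotide 11

    # Cleavage falls between pos11 and pos10 on the plus-strand target.
    # The 3' cleavage fragment (secondary piRNA) starts at pos10.
    secondary_5p = pos10

    # Assumption: the short product is the 3'-terminal part of the 5' fragment,
    # so its last nucleotide is pos11.
    short_5p = pos11 - (short_len - 1)

    return {
        "secondary_5p": secondary_5p,
        "short_5p": short_5p,
        "delta_secondary": guide_5p_end - secondary_5p,
        "delta_short": guide_5p_end - short_5p,
    }


coords = pingpong_coordinates(1000)
print(coords["delta_secondary"], coords["delta_short"])  # prints "9 28"
```

Under these assumptions, a 19 nt product ending at the cleavage site accounts for the 28 nt distance (9 + 19), whereas a product of any other length would shift the second peak accordingly.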
P9 and P28 processing patterns co-localize
Short by-products are generated during piRNA targeting
To determine at what stage of the piRNA biogenesis or targeting cascade the short products are generated, we retrieved the genomic sequence of all 2,888 loci from the mouse testes lysate data set where the two piRNAs occurred together with the short by-product. Because one of the main characteristics of primary piRNAs is their U bias [6, 18], we determined the frequency of the U nucleotide at position 1 of the long piRNA sequences generated from these 2,888 loci. While the piRNAs that are located on the same strand as the 19-mers have a frequency of 44% U at position 1, the piRNAs on the opposite strand have a much higher frequency, namely 78%. This suggests that the 19-mer is a product that arises during piRNA-guided cleavage of target transcripts, whose 5' end is defined by a yet-unknown nuclease.
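The U-bias comparison amounts to counting the first nucleotide in two groups of piRNA sequences. A minimal sketch follows; the function name is hypothetical, each distinct sequence is counted once (whether the original analysis weighted sequences by copy number is not stated), and the example sequences are invented for illustration.

```python
def first_nt_frequency(sequences, nucleotide="U"):
    """Fraction of sequences whose first nucleotide equals `nucleotide`.

    Sequences may be given as RNA or DNA; T is treated as U.
    Assumption: every sequence counts once (no copy-number weighting).
    """
    seqs = [s.upper().replace("T", "U") for s in sequences if s]
    if not seqs:
        return 0.0
    return sum(s[0] == nucleotide for s in seqs) / len(seqs)


# Invented example sequences, not data from the study.
same_strand = ["AGGCUAGC", "UGCAUGCA", "CGAUCGAU", "GGAUCCAA"]
opposite_strand = ["UGCAUGCA", "UAGGCAUC", "TCGATCGA", "AGCUAGCU"]

print(first_nt_frequency(same_strand))      # 0.25
print(first_nt_frequency(opposite_strand))  # 0.75
```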
Generation of short by-products is conserved across species
Table: Statistics about mapped sequences and processing sites (table body not recovered; surviving column label: perfect unique mappers)
Thus, the 19-mers that we identified as products of piRNA targeting yield new insights into this process. Furthermore, because we cannot currently define precisely what a piRNA is (other than by the length of the sequence and its association with the Piwi proteins), the frequently occurring P28 processing pattern can be used to identify loci where piRNA-directed targeting takes place.
From the GEO database (http://www.ncbi.nlm.nih.gov/geo/) we obtained the following publicly available data sets that were generated in previous studies [24–27]: GSE10571 for platypus, GSE19054 for rat, GSE19172 for mouse and GSE15186 for fly.
The data sets from rat were converted from SOLiD color space into sequence format with the script Solid2Solexa.pl (http://seqanswers.com/). After downloading the data sets, the adaptors were trimmed and all sequences of at least 15 nt in length were aligned against the corresponding genomes (excluding the mitochondrial chromosome) with oligomap. For mouse, rat, platypus and fly, we obtained the genome assemblies mm9, rn4, ornAna1 and dm3, respectively, from the website of the University of California Santa Cruz (http://genome.cse.ucsc.edu).
After mapping the reads to the respective genomes, we only used those sequences that mapped uniquely (1 locus) and perfectly (without any error) for further processing analysis.
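The filtering step can be sketched as follows. The record layout used here (read identifier, chromosome, strand, start, number of errors) is a hypothetical simplification and not the output format of oligomap. The sketch also makes one interpretive assumption: a read is kept only if it has exactly one reported alignment and that alignment has zero errors.

```python
from collections import defaultdict


def unique_perfect_mappers(alignments):
    """Keep reads with exactly one reported locus and no alignment errors.

    alignments: iterable of (read_id, chrom, strand, start, n_errors) tuples
                (hypothetical record layout).
    Returns a dict mapping read_id -> (chrom, strand, start).
    """
    hits_by_read = defaultdict(list)
    for read_id, chrom, strand, start, n_errors in alignments:
        hits_by_read[read_id].append((chrom, strand, start, n_errors))

    kept = {}
    for read_id, hits in hits_by_read.items():
        # Unique: a single locus; perfect: zero mismatches/indels.
        if len(hits) == 1 and hits[0][3] == 0:
            chrom, strand, start, _ = hits[0]
            kept[read_id] = (chrom, strand, start)
    return kept


example = [
    ("r1", "chr1", "+", 100, 0),   # unique and perfect: kept
    ("r2", "chr1", "-", 200, 0),   # maps to two loci: dropped
    ("r2", "chr2", "+", 50, 0),
    ("r3", "chr3", "+", 10, 1),    # unique but with one error: dropped
]
print(unique_perfect_mappers(example))  # {'r1': ('chr1', '+', 100)}
```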
Correlations of positions
Here, weight+(i) is the sum of the copy numbers of all sequences that have their 5' end on the plus strand at a particular position i, and weight−(i + Δ) is the sum of the copy numbers of all sequences that have their 5' end on the minus strand at position i + Δ. This measurement focuses only on the distance between the 5' ends; the length of the sequences is ignored. The 10 nucleotide overlap between the 5' ends of sequences from opposite strands that are generated by the ping-pong mechanism corresponds to Δ = 9. The result of this computation can be interpreted quite intuitively: it indicates how often a particular distance has been observed in the entire set of reads. At the same time, our measure is more conservative than that proposed by Olson et al., who multiplied the counts of sequences overlapping from sense and antisense strands, thereby putting more weight on distances observed in genomic loci with a large number of reads.
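A distance profile of this kind can be sketched in a few lines. Because the defining formula is not reproduced in this text, the way the two weights are combined at each position is an assumption: the sketch defaults to taking the minimum of the plus-strand and minus-strand weights, which is one conservative choice, and accepts a product as an alternative to mimic the multiplicative weighting attributed to Olson et al. The function name, the dictionary-based input and the example numbers are all hypothetical, and a single chromosome is assumed.

```python
import operator


def distance_profile(plus_weights, minus_weights, max_delta=40, combine=min):
    """Score each 5'-to-5' distance between plus- and minus-strand reads.

    plus_weights / minus_weights: dict mapping a 5'-end position to the summed
    copy number of all reads starting there on that strand.
    combine: how the two weights are merged at each position pair
             (assumption: min by default; operator.mul gives a product).
    """
    profile = {}
    for delta in range(max_delta + 1):
        total = 0
        for i, w_plus in plus_weights.items():
            w_minus = minus_weights.get(i + delta, 0)
            if w_minus:
                total += combine(w_plus, w_minus)
        profile[delta] = total
    return profile


# Invented weights: two plus-strand start sites and three minus-strand ones.
plus = {100: 5, 200: 2}
minus = {109: 3, 128: 10, 209: 1}

conservative = distance_profile(plus, minus)
multiplicative = distance_profile(plus, minus, combine=operator.mul)
print(conservative[9], conservative[28])      # 4 5
print(multiplicative[9], multiplicative[28])  # 17 50
```

The comparison illustrates the point made in the text: with a product, loci with many reads dominate the profile, whereas a bounded combination keeps the contribution of each locus closer to the number of read pairs actually observed.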
For the systematic analysis of the relative position of the P9 and P28 peaks, we computed the cross-correlation between the genomic positions where the pairs of sequences contributing to these peaks occurred with the crosscorr function of Matlab. First, we identified the sequence reads that give rise to P28 and P9 patterns, respectively (see also Additional files 5 and 6). For each of these patterns, we generated independently a vector in which the index was the genomic location of the nucleotide that was located midway between the 5' end of the sequence on the plus strand and the 5' end of the sequence on the minus strand, and the entries were 1 if a pair was associated with a given genomic location, otherwise 0. We then applied cross-correlation analysis between the two vectors in a window of length 50. This resulted in two peaks, one at -9 and the other at 10. We then repeated this analysis using the same vector for P9, but two different vectors for P28. One of these vectors was constructed from pairs containing the 19-mer on the plus strand (P28T) and the other from pairs containing the 19-mer on the minus strand (P28B). With the first vector we obtained only the peak at -9 and with the second vector only the peak at 10.
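The idea behind this analysis can be illustrated without Matlab. The sketch below counts, for each lag in a window, how often a P9 midpoint and a P28 midpoint are separated by exactly that lag, which is the unnormalized cross-correlation of the two 0/1 indicator vectors. It is not a reimplementation of crosscorr: Matlab's function reports a normalized sample cross-correlation, so only the positions of the peaks are comparable, and the sign convention of the lag used here is an assumption. Function name and example positions are hypothetical.

```python
def lagged_cooccurrence(positions_a, positions_b, max_lag=50):
    """Unnormalized cross-correlation of two binary position vectors.

    positions_a, positions_b: genomic positions where each pattern occurs
    (equivalent to the indices of the 1-entries in the indicator vectors).
    Returns a dict lag -> number of pairs (a, b) with b - a == lag.
    """
    set_a = set(positions_a)
    set_b = set(positions_b)
    return {
        lag: sum((a + lag) in set_b for a in set_a)
        for lag in range(-max_lag, max_lag + 1)
    }


# Invented midpoints: one pattern 9 nt upstream and one 10 nt downstream.
p9_midpoints = [100, 300]
p28_midpoints = [91, 310, 500]

xc = lagged_cooccurrence(p9_midpoints, p28_midpoints)
peaks = sorted(lag for lag, count in xc.items() if count > 0)
print(peaks)  # [-9, 10]
```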
Strand-bias of processing sites
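Assuming a Markov model with two states, T and B, and with transitions T→T, T→B, B→T, B→B, the most likely values of the transition probabilities can be estimated from the number of occurrences of T and B patterns at P28 patterns that occur consecutively along the chromosomes. For example, the maximum-likelihood estimate of the T→T transition probability is $p(T \to T) = n_{TT} / (n_{TT} + n_{TB})$, where $n_{TT}$ and $n_{TB}$ are the numbers of times a T pattern is followed by a T or a B pattern, respectively.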
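For a first-order Markov chain, the maximum-likelihood transition probabilities are the observed transition counts normalized by the number of transitions leaving each state. A minimal sketch, assuming the P28 patterns of one chromosome have already been ordered by position and encoded as a string of T and B characters (the function name and the example string are hypothetical):

```python
from collections import Counter


def transition_estimates(states):
    """Maximum-likelihood transition probabilities for a two-state chain.

    states: sequence of 'T' / 'B' labels in genomic order on one chromosome.
    Returns a dict mapping (from_state, to_state) -> estimated probability.
    """
    counts = Counter(zip(states, states[1:]))  # consecutive pairs
    probs = {}
    for src in "TB":
        total = counts[(src, "T")] + counts[(src, "B")]
        for dst in "TB":
            # Undefined (NaN) if the source state never has a successor.
            probs[(src, dst)] = counts[(src, dst)] / total if total else float("nan")
    return probs


est = transition_estimates("TTTBBT")  # transitions: TT, TT, TB, BB, BT
print(est[("T", "T")], est[("B", "B")])
```

When several chromosomes are analyzed, the counts would need to be accumulated per chromosome before normalizing, so that no transition is counted across a chromosome boundary.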
Small RNA isolation, β-elimination and cloning
Total testis RNA from 6-month-old C57/BL6 mice was isolated using TRIzol (Invitrogen) according to the manufacturer's instructions. Sodium periodate treatment and β-elimination were performed as described. Briefly, 20 μg of total RNA was incubated with freshly prepared NaIO4 (final concentration 25 mM) in borate buffer (30 mM borax, 30 mM boric acid, pH 8.6) for 10 min at room temperature. Unreacted NaIO4 was quenched with glycerol, and the sample was dried under vacuum, resuspended in borax buffer (30 mM borax, 30 mM boric acid, 50 mM NaOH, pH 9.5) and incubated at 45°C for 45 min. RNA was precipitated with ethanol, 5' radiolabeled by T4 polynucleotide kinase (New England Biolabs) and resolved by 15% denaturing PAGE. Small RNA fractions sized between 15 and 40 nt were recovered from the gel, converted into a cDNA library as described and Solexa sequenced. The deep sequencing data from this study have been deposited in the Gene Expression Omnibus (GEO) database, http://www.ncbi.nlm.nih.gov/geo (accession no. GSE26160).
We are grateful to the members of the Zavolan lab for fruitful discussions and comments on the manuscript. This work was supported by the Swiss National Science Foundation (SNF) grant #3100A0-114001, the SNF ProDoc program (grants PDAMP3_127218 and PDFMP3_123123) and a SNF Fellowship for prospective researchers (grant PBBSP3-133782).
- Vagin VV, Sigova A, Li C, Seitz H, Gvozdev V, Zamore PD: A distinct small RNA pathway silences selfish genetic elements in the germline. Science. 2006, 313 (5785): 320-324. 10.1126/science.1129333.
- Saito K, Nishida KM, Mori T, Kawamura Y, Miyoshi K, Nagami T, Siomi H, Siomi MC: Specific association of Piwi with rasiRNAs derived from retrotransposon and heterochromatic regions in the Drosophila genome. Genes Dev. 2006, 20 (16): 2214-2222. 10.1101/gad.1454806.
- Aravin A, Gaidatzis D, Pfeffer S, Quintana ML, Landgraf P, Iovino N, Morris P, Brownstein MJ, Miyagawa SK, Nakano T, Chien M, Russo JJ, Ju J, Sheridan R, Sander C, Zavolan M, Tuschl T: A novel class of small RNAs bind to MILI protein in mouse testes. Nature. 2006, 442 (7099): 203-207.
- Girard A, Sachidanandam R, Hannon GJ, Carmell MA: A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature. 2006, 442 (7099): 199-202.
- Saito K, Sakaguchi Y, Suzuki T, Suzuki T, Siomi H, Siomi MC: Pimet, the Drosophila homolog of HEN1, mediates 2'-O-methylation of Piwi-interacting RNAs at their 3' ends. Genes Dev. 2007, 21 (13): 1603-1608. 10.1101/gad.1563607.
- Brennecke J, Aravin AA, Stark A, Dus M, Kellis M, Sachidanandam R, Hannon GJ: Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell. 2007, 128 (6): 1089-1103. 10.1016/j.cell.2007.01.043.
- Brennecke J, Malone CD, Aravin AA, Sachidanandam R, Stark A, Hannon GJ: An epigenetic role for maternally inherited piRNAs in transposon silencing. Science. 2008, 322 (5906): 1387-1392. 10.1126/science.1165171.
- Houwing S, Kamminga LM, Berezikov E, Cronembold D, Girard A, van den Elst H, Filippov DV, Blaser H, Raz E, Moens CB, Plasterk RHA, Hannon GJ, Draper BW, Ketting RF: A role for Piwi and piRNAs in germ cell maintenance and transposon silencing in Zebrafish. Cell. 2007, 129: 69-82. 10.1016/j.cell.2007.03.026.
- Houwing S, Berezikov E, Ketting RF: Zili is required for germ cell differentiation and meiosis in zebrafish. EMBO J. 2008, 27 (20): 2702-2711. 10.1038/emboj.2008.204.
- Aravin AA, Sachidanandam R, Girard A, Toth KF, Hannon GJ: Developmentally regulated piRNA clusters implicate MILI in transposon control. Science. 2007, 316 (5825): 744-747. 10.1126/science.1142612.
- Carmell MA, Girard A, van de Kant HJG, Bourc'his D, Bestor TH, de Rooij DG, Hannon GJ: MIWI2 is essential for spermatogenesis and repression of transposons in the mouse male germline. Dev Cell. 2007, 12 (4): 503-514. 10.1016/j.devcel.2007.03.001.
- Aravin AA, Sachidanandam R, Bourc'his D, Schaefer C, Pezic D, Toth KF, Bestor T, Hannon GJ: A piRNA pathway primed by individual transposons is linked to de novo DNA methylation in mice. Mol Cell. 2008, 31 (6): 785-799. 10.1016/j.molcel.2008.09.003.
- Miyagawa SK, Watanabe T, Gotoh K, Totoki Y, Toyoda A, Ikawa M, Asada N, Kojima K, Yamaguchi Y, Ijiri T, Hata K, Li E, Matsuda Y, Kimura T, Okabe M, Sakaki Y, Sasaki H, Nakano T: DNA methylation of retrotransposon genes is regulated by Piwi family members MILI and MIWI2 in murine fetal testes. Genes Dev. 2008, 22 (7): 908-917. 10.1101/gad.1640708.
- Graham Ruby J, Jan C, Player C, Axtell MJ, Lee W, Nusbaum C, Ge H, Bartel DP: Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell. 2006, 127 (6): 1193-1207. 10.1016/j.cell.2006.10.040.
- Wang G, Reinke V: A C. elegans Piwi, PRG-1, regulates 21U-RNAs during spermatogenesis. Curr Biol. 2008, 18 (12): 861-867. 10.1016/j.cub.2008.05.009.
- Das PP, Bagijn MP, Goldstein LD, Woolford JR, Lehrbach NJ, Sapetschnig A, Buhecha HR, Gilchrist MJ, Howe KL, Stark R, Matthews N, Berezikov E, Ketting RF, Tavare S, Miska EA: Piwi and piRNAs act upstream of an endogenous siRNA pathway to suppress Tc3 transposon mobility in the Caenorhabditis elegans germline. Mol Cell. 2008, 31: 79-90. 10.1016/j.molcel.2008.06.003.
- Batista PJ, Graham Ruby J, Claycomb JM, Chiang R, Fahlgren N, Kasschau KD, Chaves DA, Gu W, Vasale JJ, Duan S, Conte D, Luo S, Schroth GP, Carrington JC, Bartel DP, Mello CC: PRG-1 and 21U-RNAs interact to form the piRNA complex required for fertility in C. elegans. Mol Cell. 2008, 31: 67-78. 10.1016/j.molcel.2008.06.002.
- Gunawardane LS, Saito K, Nishida KM, Miyoshi K, Kawamura Y, Nagami T, Siomi H, Siomi MC: A slicer-mediated mechanism for repeat-associated siRNA 5' end formation in Drosophila. Science. 2007, 315 (5818): 1587-1590. 10.1126/science.1140494.
- Horwich MD, Li C, Matranga C, Vagin V, Farley G, Wang P, Zamore PD: The Drosophila RNA methyltransferase, DmHen1, modifies germline piRNAs and single-stranded siRNAs in RISC. Curr Biol. 2007, 17 (14): 1265-1272. 10.1016/j.cub.2007.06.030.
- Kirino Y, Mourelatos Z: 2'-O-methyl modification in mouse piRNAs and its methylase. Nucleic Acids Symp Ser (Oxf). 2007, 51: 417-418. 10.1093/nass/nrm209.
- Friedlaender MR, Adamidi C, Han T, Lebedeva S, Isenbarger TA, Hirst M, Marra M, Nusbaum C, Lee WL, Jenkin JC, Alvarado AS, Kim JK, Rajewsky N: High-resolution profiling and discovery of planarian small RNAs. Proc Natl Acad Sci USA. 2009, 106 (28): 11546-11551. 10.1073/pnas.0905222106.
- Grimson A, Srivastava M, Fahey B, Woodcroft BJ, Rosaria Chiang H, King N, Degnan BM, Rokhsar DS, Bartel DP: Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature. 2008, 455 (7217): 1193-1197. 10.1038/nature07415.
- Betel D, Sheridan R, Marks DS, Sander C: Computational analysis of mouse piRNA sequence and biogenesis. PLoS Comput Biol. 2007, 3 (11): e222. 10.1371/journal.pcbi.0030222.
- Robine N, Lau NC, Balla S, Jin Z, Okamura K, Miyagawa SK, Blower MD, Lai EC: A broadly conserved pathway generates 3'UTR-directed primary piRNAs. Curr Biol. 2009, 19 (24): 2066-2076. 10.1016/j.cub.2009.11.064.
- Linsen SEV, de Wit E, de Bruijn E, Cuppen E: Small RNA expression and strain specificity in the rat. BMC Genomics. 2010, 11: 249. 10.1186/1471-2164-11-249.
- Murchison EP, Kheradpour P, Sachidanandam R, Smith C, Hodges E, Xuan Z, Kellis M, Gruetzner F, Stark A, Hannon GJ: Conservation of small RNA pathways in platypus. Genome Res. 2008, 18 (6): 995-1004. 10.1101/gr.073056.107.
- Malone CD, Brennecke J, Dus M, Stark A, Richard McCombie W, Sachidanandam R, Hannon GJ: Specialized piRNA pathways act in germline and somatic tissues of the Drosophila ovary. Cell. 2009, 137 (3): 522-535. 10.1016/j.cell.2009.03.040.
- Alefelder S, Patel BK, Eckstein F: Incorporation of terminal phosphorothioates into oligonucleotides. Nucleic Acids Res. 1998, 26 (21): 4983-4988. 10.1093/nar/26.21.4983.
- Berninger P, Gaidatzis D, van Nimwegen E, Zavolan M: Computational analysis of small RNA cloning data. Methods. 2008, 44: 13-21. 10.1016/j.ymeth.2007.10.002.
- Olson AJ, Brennecke J, Aravin AA, Hannon GJ, Sachidanandam R: Analysis of large-scale sequencing of small RNAs. Pac Symp Biocomput. 2008, 126-136.
- Hafner M, Landgraf P, Ludwig J, Rice A, Ojo T, Lin C, Holoch D, Lim C, Tuschl T: Identification of microRNAs and other small regulatory RNAs using cDNA library sequencing. Methods. 2008, 44: 3-12. 10.1016/j.ymeth.2007.09.009.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.