Annotation of mammalian primary microRNAs
© Saini et al; licensee BioMed Central Ltd. 2008
Received: 04 June 2008
Accepted: 27 November 2008
Published: 27 November 2008
MicroRNAs (miRNAs) are important regulators of gene expression and have been implicated in development, differentiation and pathogenesis. Hundreds of miRNAs have been discovered in mammalian genomes. Approximately 50% of mammalian miRNAs are expressed from introns of protein-coding genes; the primary transcript (pri-miRNA) is therefore assumed to be the host transcript. However, very little is known about the structure of pri-miRNAs expressed from intergenic regions. Here we annotate transcript boundaries of miRNAs in human, mouse and rat genomes using various transcription features. The 5' end of the pri-miRNA is predicted from transcription start sites, CpG islands and 5' CAGE tags mapped in the upstream flanking region surrounding the precursor miRNA (pre-miRNA). The 3' end of the pri-miRNA is predicted based on the mapping of polyA signals, and supported by cDNA/EST and ditags data. The predicted pri-miRNAs are also analyzed for promoter and insulator-associated regulatory regions.
We define sets of conserved and non-conserved human, mouse and rat pre-miRNAs using bidirectional BLAST and synteny analysis. Transcription features in their flanking regions are used to demarcate the 5' and 3' boundaries of the pri-miRNAs. The lengths and boundaries of primary transcripts are highly conserved between orthologous miRNAs. A significant fraction of pri-miRNAs have lengths between 1 and 10 kb, with very few introns. We annotate a total of 59 pri-miRNA structures, which include 82 pre-miRNAs. 36 pri-miRNAs are conserved in all 3 species. In total, 18 of the confidently annotated transcripts express more than one pre-miRNA. The upstream regions of 54% of the predicted pri-miRNAs are found to be associated with promoter and insulator regulatory sequences.
Little is known about the primary transcripts of intergenic miRNAs. Using comparative data, we are able to identify the boundaries of a significant proportion of human, mouse and rat pri-miRNAs. We confidently predict transcripts that include a total of 77, 58 and 47 human, mouse and rat pre-miRNAs, respectively. Our computational annotations provide a basis for subsequent experimental validation of predicted pri-miRNAs.
MicroRNAs (miRNAs) are short (21–23 nt), non-coding RNAs present in diverse organisms that regulate gene expression via the RNA silencing machinery. miRNAs can induce translational repression of a target transcript and/or mRNA degradation depending to some extent on the degree of complementarity between the miRNA and binding sites in the 3' untranslated regions (3'UTR) of its target [1–3]. A number of miRNAs have been implicated in the pathogenesis of human diseases, such as neurodegenerative disorders, cancer, and more recently in viral and metabolic diseases [4–11].
Previous studies have suggested that genes encoding miRNAs are surprisingly long, given the size of the processed mature final product. The miRNA biogenesis process is well elucidated and involves two intermediate transcript species [12–15]. The primary transcript (pri-miRNA), which can be several thousand bases long, is cleaved by the ribonuclease enzyme Drosha in the nucleus to a shorter, ~70 nt stem-loop structure known as the precursor (pre-) miRNA. A subset of intronic miRNAs, known as mirtrons, bypass Drosha processing and are spliced from the intron [16–18]. The pre-miRNA is exported to the cytoplasm by the export factor Exportin 5, where it is cleaved by the Dicer enzyme to form the mature miRNA [13, 20]. Finally, the mature miRNA is incorporated into a ribonucleoprotein particle (RNP), which becomes the RNA-induced silencing complex (RISC), capable of executing RNA-based gene silencing [21, 22]. A large number of studies have been directed at understanding the processing of mature miRNAs and their target recognition. However, few studies have addressed the structure of the primary miRNA transcripts [14, 23–26]. Indeed, while the genomic coordinates and structures of precursor and mature miRNAs are easily obtained, there are only a handful of mammalian pri-miRNAs whose complete structures have been determined experimentally [25–29]. Thus, there is a need to predict the transcript structure of pri-miRNAs and to demarcate their 5' and 3' boundaries. Such studies will help us to locate transcriptional regulatory motifs, facilitate our understanding of the regulation of miRNA expression and provide information required to make target constructs for miRNA knockouts.
Previous studies attempted to predict the transcript boundaries of pri-miRNAs based on features such as expressed sequence tags (ESTs) and transcription factor binding sites (TFBS) [30–32]. Recently, we described a large-scale analysis of the distribution of transcription features in the flanking regions of human pre-miRNAs. This study showed that many transcription start sites (TSSs) and CpG islands lie within 2 kb of the precursor, but a small number appear to be tens of kb upstream. Using other features in combination proved to be useful for predicting pri-miRNA boundaries. However, our previous study focused only on human sequences and was able to predict the putative boundaries for a limited set of pri-miRNAs. It is known that miRNAs are well conserved across a wide range of species, so it is of interest to determine whether pri-miRNAs have conserved transcript structures. Furthermore, identifying the consensus features of conserved miRNAs facilitates the prediction of transcript boundaries of a larger set of miRNAs.
We have analyzed a combination of predicted transcriptional features (TSSs, CpG islands and polyadenylation (polyA) signals) and direct evidence (ESTs, cDNAs, cap analysis of gene expression (5' CAGE) and gene identification signature (GIS) ditags) in order to predict the 5' and 3' boundaries of pri-miRNAs. We have used three closely related genomes, human, mouse and rat, to obtain sets of conserved and non-conserved pre-miRNAs using bidirectional BLAST and conserved synteny analysis. Each set is then surveyed for transcription features in their flanking regions, and transcriptional boundaries annotated. We describe here the characteristics of the predicted pri-miRNA transcripts.
Results and discussion
Obtaining conserved pre-miRNAs
Pre-miRNAs from the three genomes (human, mouse and rat) are divided into four groups: (i) Group I, pre-miRNAs conserved in all three genomes; (ii) Group II, pre-miRNAs conserved in two of the three genomes; (iii) Group III, pre-miRNAs that are unique to one of the three genomes but have multiple paralogous copies; and (iv) Group IV, singleton pre-miRNAs unique to one of the three genomes.
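The four-way grouping can be written as a small decision rule. The sketch below is illustrative only: the function name `classify_group` and its two inputs (the set of other species in which a conserved ortholog was found, and a flag for paralogs in the same genome) are assumptions introduced here, not part of the original analysis pipeline.

```python
from typing import Set


def classify_group(orthologs_in: Set[str], has_paralogs: bool) -> str:
    """Assign a pre-miRNA to Group I-IV.

    orthologs_in: names of the OTHER species (hypothetical input) in which
                  a conserved ortholog of this pre-miRNA was found.
    has_paralogs: True if the pre-miRNA has homologs within its own genome.
    """
    n_other = len(orthologs_in)
    if n_other >= 2:
        return "I"    # conserved in all three genomes
    if n_other == 1:
        return "II"   # conserved in two of the three genomes
    # No ortholog in either of the other genomes: species-specific.
    return "III" if has_paralogs else "IV"


if __name__ == "__main__":
    # A human pre-miRNA with orthologs in mouse and rat.
    print(classify_group({"mouse", "rat"}, has_paralogs=False))
    # A human pre-miRNA with no orthologs but several human paralogs.
    print(classify_group(set(), has_paralogs=True))
```

Note that in this sketch paralogy only matters once no ortholog has been found, which mirrors the order in which the groups are defined above.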
Group I pre-miRNAs
Distribution of conserved and non-conserved pre-miRNAs in the human, mouse and rat genomes, with respect to protein-coding gene annotation.
Group II pre-miRNAs
We obtained 55 pairs of human-mouse pre-miRNAs, which are not conserved in rat. Similarly, we obtained 3 pairs of human-rat conserved pre-miRNAs and 26 pairs of mouse-rat conserved pre-miRNAs (Table 1).
Group III pre-miRNAs
Paralogous families of human pre-miRNAs.
mir-199a-1 mir-199a-2†† mir-199b†m
mir-376a-2 mir-376a-1†† mir-376b†† mir-376c††
mir-450a-2 mir-450a-1†† mir-450b†m
mir-487a mir-487b†† mir-539†† mir-154††
mir-500 mir-501†† mir-502††
mir-509-1 mir-509-2 mir-509-3 mir-514-1 mir-514-2 mir-514-3 mir-510
mir-513a-1 mir-513a-2 mir-513c mir-513b
mir-520e mir-520c mir-524 mir-515-2 mir-515-1 mir-525 mir-519b mir-519a-2 mir-520f mir-516b-2 mir-519d mir-516a-2 mir-526a-1 mir-517c mir-518e mir-521-2 mir-516a-1 mir-516b-1 mir-518c mir-527 mir-520g mir-526b mir-520b mir-517a mir-519e mir-526a-2 mir-518f mir-522 mir-517b mir-518a-2 mir-519c mir-523 mir-520d mir-521-1 mir-520h mir-518d mir-520a mir-518b mir-518a
mir-570 mir-548c mir-548d-2 mir-548a-3 mir-548b mir-548d-1 mir-603 mir-548a-1 mir-548a-2
mir-941-1 mir-941-2 mir-941-3 mir-941-1
Paralogous families of mouse pre-miRNAs.
mir-135a-1 mir-135a-2†† mir-135b††
mir-199a-1 mir-199b †h mir-199a-2††
mir-465a mir-465b-2 mir-465b-1 mir-465c-1 mir-465c-2†h
mir-466a mir-466e mir-466c mir-466b-1 mir-466b-3 mir-466b-2 mir-466d mir-466h mir-466g mir-669a-1 mir-669a-2 mir-669a-3 mir-466f-1 mir-669c mir-669b mir-297b mir-297c mir-466f-3 mir-466f-2 mir-297a-4 mir-297a-3 mir-297a-5 mir-297a-2
mir-467a mir-467b mir-467c mir-467d mir-467e
mir-680-1 mir-680-2 mir-680-3
Paralogous families of rat pre-miRNAs.
Group IV pre-miRNAs
There are 154 singleton human miRNAs with no defined homologs. We also find 66 mouse and 5 rat singleton miRNAs (Table 1). These may represent species-specific miRNAs. It is also likely that with ongoing miRNA discovery and the addition of new sequences to miRBase, some singleton miRNAs may find relationships to new miRNAs.
Annotation of pri-miRNAs
We analyzed different transcriptional features in the flanking regions of miRNAs in order to predict the putative boundaries of their primary transcripts. It is widely assumed that intronic miRNAs are generally transcribed coincidentally with their host genes. The pri-miRNA in these cases is therefore the host protein-coding transcript. We therefore focus on predicting primary transcripts of intergenic miRNAs (that is, miRNAs located between annotated protein-coding genes). The 5' ends of pri-miRNAs are annotated based on the mappings of predicted TSSs, CpG islands and 5' CAGE tags to the upstream flanking regions. Similarly, the 3' end is demarcated based on predicted polyA signals and 3' ditags in the downstream flanking region. Further, these predictions are supported by transcriptional evidence, either from cDNAs or ESTs. Highly confident annotations are obtained for 59 pri-miRNAs: 36 pri-miRNAs conserved in all three species (Group I), 4 pri-miRNAs conserved in only two species (Group II), 15 pri-miRNAs unique to human and 4 pri-miRNAs unique to mouse. The predicted transcript structures are also analyzed for functional regulatory regions such as promoter-associated regulatory sequences and CTCF-enriched insulator sites surrounding the putative 5' ends.
Group I Pri-miRNAs
Polycistronic clusters of intergenic miRNAs with complete EST/cDNA coverage.
Human: CR737132, DB266639, DA2895925, BI752321, AA631714
Rat: AW919398, BF2869095, AI008234
Human: DA528985, BX355821
Mouse: BE332980, CA874578
Mouse: AK081202, BC058715
Human: AI801869, CB961518, CB991710, BU729805, CB996698, BM702754
Human: DA545600, DA579531, DA474693, DA558986, DA600978
Mouse: BB657503, BM936455
Rat: BF412891, BF412890, BF412889, BF412895
Human: DA706043, DA721080
Rat: BF559199, BI274699
Mouse: BC027389, AK035525, BC076616, AK085125
Human: BG612167, BU932403, BG613187, BG500819
Human: BC022349, BC022282, BC070292, BC026275, BC055417, AF264787
Mouse: AI789372, BY718835
Mouse: AK134888, AF380423, AF380425, AK080165
Human: AI969882, AI695443, AA863395, BM855863.1, AA863389
Human: DA685273, AL698517, DA246751, DA755860, CF994086, DA932670, DA182706
Further, we analyzed regulatory features such as promoters and insulator sequences in the upstream region of the predicted human pri-let-7i. Insulators are sequences located between enhancers and promoters of adjacent genes that prevent an enhancer from inappropriately binding to and activating the promoter of a neighbouring gene. In vertebrates, insulator function requires association with CCCTC-binding factor (CTCF) binding sites. The normalized chromatin immunoprecipitation genome-tiling (ChIP-chip) array scores for CTCF binding sites and the sequence conservation of the regulatory features in the upstream regions of mouse and rat pri-let-7i are shown (Figure 3b). We identify promoter sequences and CTCF binding sites spanning a region from 61,282.5 kb to 61,283.5 kb, ~1 kb upstream of the predicted 5' end of pri-let-7i. The corresponding regions in mouse and rat show strong conservation in relative position, suggesting that they are putative promoter regions. Analyzing these regions using the UCSC conserved transcription factors track allowed us to identify two conserved transcription factor binding sites, for activating transcription factor 6 (ATF6) and upstream transcription factor 1 (USF1), located at ~61,282.8 kb, which may be important for let-7i expression. However, delineating the transcription factors that bind in the promoter region requires further analysis and experimental validation.
We predicted the consensus secondary structure of pri-let-7i from an alignment of the human, mouse and rat sequences, using RNAalifold (Figure 3c). The conserved paired residues are marked in red. It can be seen that the stem segments immediately flanking the pre-miRNA are conserved (blue box). Previous studies have also shown that the sequences flanking the miRNA hairpin are important for miRNA biogenesis [16, 36]. In particular, the stem extension located immediately adjacent to the pre-miRNA hairpin and the single-stranded basal segments at the ends are required for efficient processing by Drosha [37, 38].
Group II pri-miRNAs
We annotate the boundaries of four pri-miRNAs conserved in 2 of the 3 genomes. Among them are two polycistronic transcripts (miR-15a~16-1 and miR-193b~365-1), and two expressing single miRNAs (miR-148a and miR-155). Figure 2 shows the predicted length of the pri-miRNAs and the features supporting them. The predicted genomic coordinates of pri-miRNAs are provided in Additional file 1. Here, we describe in detail the annotation of the pri-miRNA containing miR-15a and miR-16-1.
PolyA signals 'AATAAA', 'ATTAAA' and 'TATAAA' are predicted at an average distance of 4,695 bp and 4,595 bp from the 3' end of miR-16-1 in human and mouse respectively. The 3' end is also supported by ditags in human (U_144334, U_1281401 and U_141201), 4,208 bp from the 3' end of miR-16-1, and by ESTs and cDNAs in both human and mouse (Figure 6). We conclude that the 3' boundaries of human and mouse pri-miRNAs are similar, but that the length of the 5' upstream transcript is significantly different. The respective genomic coordinates and predicted lengths of pri-miRNAs are shown (Figure 2 and table S1). We identify promoter and CTCF binding sites (average tiling array score = 1.36) ~150 bp upstream of the predicted TSS in human, with the corresponding region conserved in mouse (Figure 6).
These data agree with the previous annotation by the VEGA project of non-protein-coding transcripts (accessions: OTTHUMT00000044959 and OTTHUMT00000044961) expressing miR-15a and miR-16-1 in human, called DLEU2. This region has been shown to be deleted or down-regulated in chronic lymphocytic leukaemia cases.
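As a rough illustration of how candidate polyA signals such as 'AATAAA', 'ATTAAA' and 'TATAAA' can be located in a downstream flanking sequence, the sketch below performs a plain string scan for the three hexamers. This is an assumption-laden simplification: the function name and the toy sequence are hypothetical, and exact hexamer matching is not necessarily how the polyA signal predictions used in this study were generated.

```python
from typing import List, Tuple

# Hexamers named in the text above.
POLYA_SIGNALS = ("AATAAA", "ATTAAA", "TATAAA")


def find_polya_signals(downstream_seq: str,
                       signals: Tuple[str, ...] = POLYA_SIGNALS
                       ) -> List[Tuple[int, str]]:
    """Return (offset, hexamer) pairs sorted by offset.

    The offset is 0-based, counted from the first base of the supplied
    downstream flank (i.e. the base after the 3' end of the pre-miRNA).
    Overlapping occurrences are reported.
    """
    seq = downstream_seq.upper()
    hits = []
    for sig in signals:
        start = seq.find(sig)
        while start != -1:
            hits.append((start, sig))
            start = seq.find(sig, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy_flank = "ccAATAAAggATTAAA"  # made-up sequence, not real data
    for offset, hexamer in find_polya_signals(toy_flank):
        print(offset, hexamer)
```

Averaging the returned offsets over the hits in a given species would give a distance comparable in spirit to the average distances quoted above, although a real analysis would also need to handle strand orientation.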
Species-specific (Group III and IV) pri-miRNAs
Characteristics of Predicted Pri-miRNAs
By analyzing the transcriptional features mapped in the upstream and downstream flanking regions surrounding the precursor miRNAs, we are able to characterize the 5' and 3' ends and lengths of their primary transcripts. Several observations can be made from these analyses.
Mapping of 5' CAGE tags and prediction of polyA signals in the flanking sequences of precursor miRNAs in human, mouse and rat clearly indicate that the pri-miRNAs are both 5' capped and polyadenylated. This provides strong evidence that the major fraction of mammalian miRNAs is transcribed by RNA polymerase II (pol II). The distribution of pol II TSS predictions also supports this assumption. Previous studies have also reported that pol II is the major polymerase of human miRNA transcription [25, 26]. However, a small number of miRNAs lying within Alu repeats have been reported to be transcribed by pol III.
We have examined the exon-intron organization of predicted pri-miRNAs based on EST/cDNA alignments. ESTs or cDNAs spanning the entire pre-miRNA reveal that pri-miRNAs have conventional exon-intron structures, although they appear to contain fewer introns than protein-coding messages. 44% (26/59) of our annotated pri-miRNAs have good EST/cDNA alignments across the entire transcript. 92% of these have fewer than four introns (mean number of introns per transcript = 0.74). Six pri-miRNAs are intronless. For example, the cluster mmu-mir-144~451 is overlapped by a full-length cDNA, 'AK158085.1', whose 5'/3' ends coincide with ditags. The set of unspliced transcripts also includes pri-mir-21, which was previously shown experimentally to be intronless. About 50% of the predicted pri-miRNAs have only one intron. For example, pri-mir-196a-1 has one full-length cDNA with its 5'/3' ends coinciding exactly with the predicted TSS, polyA signal and ditags.
Previously, very little data has been presented regarding the primary transcript structures of miRNAs. We have systematically annotated the primary transcripts of human, mouse and rat intergenic miRNAs using various transcription-related features. The 5' end of the primary transcript is predicted based on mapped TSSs, CpG islands and 5' CAGE tags in the upstream region. The 3' end is predicted based on the mapping of polyA signals, and supported by multiple EST/cDNA mappings. In addition, the complete transcript structure is also supported by mapped ends of ditags. Using conservation and synteny, we are able to identify the boundaries of a significant proportion of mammalian miRNAs. We show that the transcription features in the flanking regions around conserved miRNAs have similar distributions and that the miRNAs exhibit similar transcript structures in the three genomes. The results also indicate that a significant fraction of pri-miRNAs have lengths between 1 and 10 kb. Previous experimental studies of pri-miRNAs have also identified transcript lengths of 1–4 kb [25, 26, 29]. However, we also identify a small number of pri-miRNA candidates of exceptional length, up to hundreds of kb. While pri-miRNAs are significantly shorter on average than protein-coding messages (including those with intronic miRNAs), the disparity between the length of the transcribed sequence and the final functional product is startling. It remains to be seen whether long non-protein-coding pri-miRNAs have functions in addition to that of the miRNA itself.
Obtaining pre-miRNA sequences
The sequences and genomic coordinates of human, mouse and rat pre-miRNAs were obtained from miRBase::Sequences (version 10.0). The human, mouse and rat genome annotations were obtained from Ensembl release 48. miRNAs located outside of Ensembl transcripts were classified as "intergenic", while those overlapping annotated transcripts were classified as "intronic".
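The intergenic/intronic split described above amounts to an interval-overlap test. The following is a minimal sketch under stated assumptions: coordinates are treated as 1-based and inclusive, strand is ignored (the text does not say whether overlap was required on the same strand), and the names `overlaps` and `classify_location` are introduced here purely for illustration.

```python
from typing import Iterable, Tuple

# (chromosome, start, end), 1-based inclusive coordinates (assumed).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def classify_location(pre_mirna: Interval,
                      transcripts: Iterable[Interval]) -> str:
    """Label a pre-miRNA 'intronic' if it overlaps any annotated
    transcript, otherwise 'intergenic' (labels as used in the text)."""
    if any(overlaps(pre_mirna, t) for t in transcripts):
        return "intronic"
    return "intergenic"


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    annotated = [("1", 50, 500), ("2", 1000, 2000)]
    print(classify_location(("1", 100, 180), annotated))
    print(classify_location(("1", 600, 680), annotated))
```

For genome-scale inputs an interval tree or sorted sweep would be preferable to this all-against-all comparison; the linear scan is kept here for clarity.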
Obtaining conserved miRNAs
We identified a set of conserved pre-miRNAs between human, mouse and rat. Reciprocal best BLAST hits identified miRNA pairs that are each other's best match. Each miRNA pair was subjected to synteny analysis using Ensembl Compara. Pairs were retained for subsequent analysis if the neighboring genes of the pre-miRNA in one species had one-to-one matches to the neighboring genes of the orthologous pre-miRNA in the other species. Pre-miRNAs with no reciprocal hits were further classified as paralogs if all-against-all BLAST found homologs in the same genome but no homologs in the other two genomes.
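The reciprocal-best-hit step can be sketched as follows. The input format (tuples of query, subject and score parsed from BLAST output), the function names and the example identifiers are all assumptions made for illustration; the synteny filter applied afterwards with Ensembl Compara is not reproduced here.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, score), assumed format


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Hypothetical identifiers and scores, not real BLAST output.
    human_vs_mouse = [("hsa-A", "mmu-A", 120.0),
                      ("hsa-A", "mmu-B", 90.0),
                      ("hsa-C", "mmu-B", 80.0)]
    mouse_vs_human = [("mmu-A", "hsa-A", 118.0),
                      ("mmu-B", "hsa-A", 91.0)]
    print(reciprocal_best_hits(human_vs_mouse, mouse_vs_human))
```

In this toy input only the hsa-A/mmu-A pair is reciprocal: hsa-C's best hit, mmu-B, points back to hsa-A rather than to hsa-C.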
Obtaining flanking regions
The upstream and downstream flanking sequences around human, mouse and rat pre-miRNAs were obtained from Ensembl using the Perl API (release 48), representing genome assemblies NCBI 36, NCBI m37 and RGSC 3.4 respectively. For intergenic miRNAs, we truncated the flanking region if it overlapped with any neighboring Ensembl-annotated transcript.
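Truncating a flank at a neighboring transcript can be illustrated with simple coordinate arithmetic. The sketch below handles only the upstream flank of a forward-strand pre-miRNA; the flank length passed in, the function name and the coordinates in the example are hypothetical, since the window size used in the study is not stated in this section.

```python
from typing import Iterable, Tuple


def truncate_upstream_flank(pre_start: int,
                            flank_len: int,
                            transcripts: Iterable[Tuple[int, int]]
                            ) -> Tuple[int, int]:
    """Return (flank_start, flank_end) for a forward-strand pre-miRNA.

    pre_start:   1-based start of the pre-miRNA.
    flank_len:   requested upstream window in bases (assumed parameter).
    transcripts: (start, end) of annotated transcripts on the same
                 chromosome, 1-based inclusive.
    The flank is shortened so that it begins just after the end of the
    nearest upstream transcript that reaches into the window.
    """
    flank_end = pre_start - 1
    flank_start = max(1, pre_start - flank_len)
    for _, t_end in transcripts:
        # A transcript ending inside the window blocks everything up to t_end.
        if flank_start <= t_end <= flank_end:
            flank_start = t_end + 1
    return flank_start, flank_end


if __name__ == "__main__":
    # Made-up coordinates: a 5 kb window cut short by a transcript ending at 6000.
    print(truncate_upstream_flank(10000, 5000, [(2000, 6000)]))
```

The downstream flank, and flanks of reverse-strand pre-miRNAs, would follow the same logic with the coordinate comparisons mirrored.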
We analyzed seven different transcriptional features: transcription start sites (TSSs), CpG islands, ESTs, cDNAs, polyA signals, 5' CAGE and GIS-PET, as described previously. CAGE tags are 20- or 21-nt sequence tags derived from the mRNA sequence in the proximity of the cap site; their mapping to unique genomic sequences identifies TSSs. Ditags are 5' and 3' signatures of a full-length transcript and thus are useful in defining the transcript boundaries. Additionally, regulatory features such as promoters and insulators were obtained from the Ensembl Functional Genomics database (Release 48), which includes data from experiments such as DNaseI hypersensitivity site and CTCF binding site mapping [46, 49, 50]. The conserved transcription factor binding sites in promoter regions were obtained from the UCSC genome browser. CTCF binding sites in the human genome were obtained from ChIP-chip experiments.
The pri-miRNA annotations are available to the public as DAS sources for viewing in the Ensembl genome browser or other DAS clients (http://das.sanger.ac.uk/das/hsaprimiRNA, http://das.sanger.ac.uk/das/mmuprimiRNA and http://das.sanger.ac.uk/das/rnoprimiRNA). The feature sets used to annotate pri-miRNAs here are also available through the Genomics section of the miRBase database (http://microrna.sanger.ac.uk/sequences/genomics.shtml).
We thank members of Team101 at the Wellcome Trust Sanger Institute for useful discussion and advice. HKS was supported by a GlaxoSmithKline postdoctoral fellowship. AJE was supported by the Wellcome Trust and SG-J was supported by the University of Manchester.
- Bagga S, Bracht J, Hunter S, Massirer K, Holtz J, Eachus R, Pasquinelli AE: Regulation by let-7 and lin-4 miRNAs results in target mRNA degradation. Cell. 2005, 122 (4): 553-563. 10.1016/j.cell.2005.07.031.
- Lai EC: Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation. Nat Genet. 2002, 30 (4): 363-364. 10.1038/ng865.
- Giraldez AJ, Mishima Y, Rihel J, Grocock RJ, van Dongen S, Inoue K, Enright AJ, Schier AF: Zebrafish MiR-430 promotes deadenylation and clearance of maternal mRNAs. Science. 2006, 312 (5770): 75-79. 10.1126/science.1122689.
- Alvarez-Garcia I, Miska EA: MicroRNA functions in animal development and human disease. Development. 2005, 132 (21): 4653-4662. 10.1242/dev.02073.
- Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S, Shimizu M, Rattan S, Bullrich F, Negrini M, Croce CM: Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proc Natl Acad Sci USA. 2004, 101 (9): 2999-3004. 10.1073/pnas.0307323101.
- Mishima Y, Stahlhut C, Giraldez AJ: miR-1-2 gets to the heart of the matter. Cell. 2007, 129 (2): 247-249. 10.1016/j.cell.2007.04.008.
- Caudy AA, Myers M, Hannon GJ, Hammond SM: Fragile X-related protein and VIG associate with the RNA interference machinery. Genes Dev. 2002, 16 (19): 2491-2496. 10.1101/gad.1025202.
- Calin GA, Croce CM: MicroRNA signatures in human cancers. Nat Rev Cancer. 2006, 6 (11): 857-866. 10.1038/nrc1997.
- Mattes J, Collison A, Foster PS: Emerging role of microRNAs in disease pathogenesis and strategies for therapeutic modulation. Curr Opin Mol Ther. 2008, 10 (2): 150-157.
- Miska EA: How microRNAs control cell division, differentiation and death. Curr Opin Genet Dev. 2005, 15 (5): 563-568. 10.1016/j.gde.2005.08.005.
- Scaria V, Hariharan M, Pillai B, Maiti S, Brahmachari SK: Host-virus genome interactions: macro roles for microRNAs. Cell Microbiol. 2007, 9 (12): 2784-2794. 10.1111/j.1462-5822.2007.01050.x.
- Bartel DP: MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004, 116 (2): 281-297. 10.1016/S0092-8674(04)00045-5.
- Kim VN: MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell Biol. 2005, 6 (5): 376-385. 10.1038/nrm1644.
- Cullen BR: Transcription and processing of human microRNA precursors. Mol Cell. 2004, 16 (6): 861-865. 10.1016/j.molcel.2004.12.002.
- Pasquinelli AE, Hunter S, Bracht J: MicroRNAs: a developing story. Curr Opin Genet Dev. 2005, 15 (2): 200-205. 10.1016/j.gde.2005.01.002.
- Lee Y, Ahn C, Han J, Choi H, Kim J, Yim J, Lee J, Provost P, Radmark O, Kim S, Kim VN: The nuclear RNase III Drosha initiates microRNA processing. Nature. 2003, 425 (6956): 415-419. 10.1038/nature01957.
- Han J, Lee Y, Yeom KH, Kim YK, Jin H, Kim VN: The Drosha-DGCR8 complex in primary microRNA processing. Genes Dev. 2004, 18 (24): 3016-3027. 10.1101/gad.1262504.
- Ruby JG, Jan CH, Bartel DP: Intronic microRNA precursors that bypass Drosha processing. Nature. 2007, 448 (7149): 83-86. 10.1038/nature05983.
- Lund E, Guttinger S, Calado A, Dahlberg JE, Kutay U: Nuclear export of microRNA precursors. Science. 2004, 303 (5654): 95-98. 10.1126/science.1090599.
- Hutvagner G, McLachlan J, Pasquinelli AE, Balint E, Tuschl T, Zamore PD: A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science. 2001, 293 (5531): 834-838. 10.1126/science.1062961.
- Hammond SM, Bernstein E, Beach D, Hannon GJ: An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature. 2000, 404 (6775): 293-296. 10.1038/35005107.
- Martinez J, Patkaniowska A, Urlaub H, Luhrmann R, Tuschl T: Single-stranded antisense siRNAs guide target RNA cleavage in RNAi. Cell. 2002, 110 (5): 563-574. 10.1016/S0092-8674(02)00908-X.
- Saini HK, Griffiths-Jones S, Enright AJ: Genomic analysis of human microRNA transcripts. Proc Natl Acad Sci USA. 2007, 104 (45): 17719-17724. 10.1073/pnas.0703890104.
- Rodriguez A, Griffiths-Jones S, Ashurst JL, Bradley A: Identification of mammalian microRNA host genes and transcription units. Genome Res. 2004, 14: 1902-1910. 10.1101/gr.2722704.
- Lee Y, Kim M, Han J, Yeom KH, Lee S, Baek SH, Kim VN: MicroRNA genes are transcribed by RNA polymerase II. EMBO J. 2004, 23 (20): 4051-4060. 10.1038/sj.emboj.7600385.
- Cai X, Hagedorn CH, Cullen BR: Human microRNAs are processed from capped, polyadenylated transcripts that can also function as mRNAs. RNA. 2004, 10 (12): 1957-1966. 10.1261/rna.7135204.
- Zeng Y, Cullen BR: Recognition and cleavage of primary microRNA transcripts. Methods Mol Biol. 2006, 342: 49-56.
- Tam W: Identification and characterization of human BIC, a gene on chromosome 21 that encodes a noncoding RNA. Gene. 2001, 274 (1–2): 157-167. 10.1016/S0378-1119(01)00612-6.
- Bracht J, Hunter S, Eachus R, Weeks P, Pasquinelli AE: Trans-splicing and polyadenylation of let-7 microRNA primary transcripts. RNA. 2004, 10 (10): 1586-1594. 10.1261/rna.7122604.
- Smalheiser NR: EST analyses predict the existence of a population of chimeric microRNA precursor-mRNA transcripts expressed in normal human and mouse tissues. Genome Biol. 2003, 4 (7): 403. 10.1186/gb-2003-4-7-403.
- Gu J, He T, Pei Y, Li F, Wang X, Zhang J, Zhang X, Li Y: Primary transcripts and expressions of mammal intergenic microRNAs detected by mapping ESTs to their flanking sequences. Mamm Genome. 2006, 17 (10): 1033-1041. 10.1007/s00335-006-0007-9.
- Zhou X, Ruan J, Wang G, Zhang W: Characterization and identification of microRNA core promoters in four model species. PLoS Comput Biol. 2007, 3 (3): e37. 10.1371/journal.pcbi.0030037.
- Dike S, Balija VS, Nascimento LU, Xuan Z, Ou J, Zutavern T, Palmer LE, Hannon G, Zhang MQ, McCombie WR: The mouse genome: experimental examination of gene predictions and transcriptional start sites. Genome Res. 2004, 14 (12): 2424-2429. 10.1101/gr.3158304.
- Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, Semple CA, Taylor MS, Engstrom PG, Frith MC, Forrest AR, Alkema WB, Tan SL, Plessy C, Kodzius R, Ravasi T, Kasukawa T, Fukuda S, Kanamori-Katayama M, Kitazume Y, Kawaji H, Kai C, Nakamura M, Konno H, Nakano K, Mottagui-Tabar S, Arner P, Chesi A, Gustincich S, Persichetti F, Suzuki H, Grimmond SM, Wells CA, Orlando V, Wahlestedt C, Liu ET, Harbers M, Kawai J, Bajic VB, Hume DA, Hayashizaki Y: Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet. 2006, 38 (6): 626-635. 10.1038/ng1789.
- Hofacker IL: RNA consensus structure prediction with RNAalifold. Methods Mol Biol. 2007, 395: 527-544.
- Zeng Y, Cullen BR: Sequence requirements for micro RNA processing and function in human cells. RNA. 2003, 9 (1): 112-123. 10.1261/rna.2780503.
- Zeng Y, Cullen BR: Efficient processing of primary microRNA hairpins by Drosha requires flanking nonstructured RNA sequences. J Biol Chem. 2005, 280 (30): 27595-27603. 10.1074/jbc.M504714200.
- Han J, Lee Y, Yeom KH, Nam JW, Heo I, Rhee JK, Sohn SY, Cho Y, Zhang BT, Kim VN: Molecular basis for the recognition of primary microRNAs by the Drosha-DGCR8 complex. Cell. 2006, 125 (5): 887-901. 10.1016/j.cell.2006.03.043.
- Wilming LG, Gilbert JG, Howe K, Trevanion S, Hubbard T, Harrow JL: The vertebrate genome annotation (Vega) database. Nucleic Acids Res. 2008, 36 (Database issue): D753-760.
- Calin GA, Cimmino A, Fabbri M, Ferracin M, Wojcik SE, Shimizu M, Taccioli C, Zanesi N, Garzon R, Aqeilan RI, Alder H, Volinia S, Rassenti L, Liu X, Liu CG, Kipps TJ, Negrini M, Croce CM: MiR-15a and miR-16-1 cluster functions in human leukemia. Proc Natl Acad Sci USA. 2008, 105 (13): 5166-5171. 10.1073/pnas.0800121105.
- Borchert GM, Lanier W, Davidson BL: RNA polymerase III transcribes human microRNAs. Nat Struct Mol Biol. 2006, 13 (12): 1097-1101. 10.1038/nsmb1167.
- Altuvia Y, Landgraf P, Lithwick G, Elefant N, Pfeffer S, Aravin A, Brownstein MJ, Tuschl T, Margalit H: Clustering and conservation patterns of human microRNAs. Nucleic Acids Res. 2005, 33 (8): 2697-2706. 10.1093/nar/gki567.
- Baskerville S, Bartel DP: Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA. 2005, 11 (3): 241-247. 10.1261/rna.7240905.
- Prlic A, Down TA, Kulesha E, Finn RD, Kahari A, Hubbard TJ: Integrating sequence and structural biology with DAS. BMC Bioinformatics. 2007, 8: 333. 10.1186/1471-2105-8-333.
- Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ: miRBase: tools for microRNA genomics. Nucleic Acids Res. 2008, 36 (Database issue): D154-158.
- Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Eyre T, Fitzgerald S, Fernandez-Banet J, Gräf S, Haider S, Hammond M, Holland R, Howe KL, Howe K, Johnson N, Jenkinson A, Kähäri A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Megy K, Meidl P, Overduin B, Parker A, Pritchard B, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Slater G, Smedley D, Spudich G, Trevanion S, Vilella AJ, Vogel J, White S, Wood M, Birney E, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Herrero J, Hubbard TJ, Kasprzyk A, Proctor G, Smith J, Ureta-Vidal A, Searle S: Ensembl 2008. Nucleic Acids Res. 2008, 36 (Database issue): D707-714.
- Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T, Fukuda S, Sasaki D, Podhajska A, Harbers M, Kawai J, Carninci P, Hayashizaki Y: Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci USA. 2003, 100 (26): 15776-15781. 10.1073/pnas.2136655100.
- Ng P, Wei CL, Sung WK, Chiu KP, Lipovich L, Ang CC, Gupta S, Shahab A, Ridwan A, Wong CH, Liu ET, Ruan Y: Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation. Nat Methods. 2005, 2 (2): 105-111. 10.1038/nmeth733.
- Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, Davis S, Margulies EH, Chen Y, Bernat JA, Ginsburg D, Zhou D, Luo S, Vasicek TJ, Daly MJ, Wolfsberg TG, Collins FS: Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 2006, 16 (1): 123-131. 10.1101/gr.4074106.
- Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, Ren B: Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007, 128 (6): 1231-1245. 10.1016/j.cell.2006.12.048.
- Karolchik D, Hinrichs AS, Kent WJ: The UCSC Genome Browser. Curr Protoc Bioinformatics. 2007, Chapter 1:
- Kim TH, Barrera LO, Zheng M, Qu C, Singer MA, Richmond TA, Wu Y, Green RD, Ren B: A high-resolution map of active promoters in the human genome. Nature. 2005, 436 (7052): 876-880. 10.1038/nature03877.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.