Identification and characterization of new miRNAs cloned from normal mouse mammary gland

Background MicroRNAs (miRNAs) are small non-coding RNAs that play important roles in silencing target genes and are involved in the regulation of various normal cellular processes. Until now, their implication in mammary gland biology has been suggested by a few studies, mainly focusing on pathological situations, which allowed the characterization of miRNAs as markers of breast cancer tumour classes. In the normal mammary gland, the expression of known miRNAs has been studied in human and mouse, but the full repertoire of miRNAs expressed in this tissue is not yet available.
Results To extend the repertoire of miRNAs expressed in the mouse mammary gland, we constructed several libraries of small RNAs, allowing the cloning of 455 sequences. After bioinformatic analysis, 3 known miRNAs (present in miRBase) and 33 new miRNAs were identified. Expression of 24 of the 33 was confirmed by RT-PCR. None of them was found to be mammary specific, although some showed a tissue-restricted distribution. No correlation could be established between their expression pattern and evolutionary conservation. Six of them appear to be mouse specific. In several cases, multiple potential precursors of a miRNA were present in the genome, and we developed a strategy to determine which of them could be processed into the mature miRNA.
Conclusion The cloning approach has improved the repertoire of miRNAs known in the mammary gland, an evolutionarily recent organ. This tissue is a good candidate in which to find tissue-specific miRNAs and to detect miRNAs specific to mammals. We provide evidence for 24 new miRNAs. Although none of them is mammary gland specific, a few are not ubiquitously expressed. For the first time, 6 mouse-specific miRNAs have been identified.

ulation of target messenger RNAs (mRNAs) on cell cycle arrest. The total estimated number of reasonably conserved miRNAs in vertebrates varies from 250 [2] to 600 [3]. In human, Bentwich et al. [4] suggested that the total number of miRNAs is above 800. The sequences of many miRNAs are conserved among distantly related organisms [5], but recent evidence has demonstrated the presence of primate-specific miRNAs [6,7]. miRNAs are transcripts that are cleaved from a ~70-nucleotide hairpin precursor by Dicer [8,9]. They regulate gene expression at the post-transcriptional level by base-pairing with their target mRNAs and subsequently inducing either translational repression or mRNA destabilization [10]. miRNAs are involved in the regulation of various cellular processes, including cell differentiation, cell proliferation, development and apoptosis [11].
Several methods are used to characterize miRNA expression profiles in specific tissues, such as Northern blotting, RNase protection assay, RT-PCR and microarray analyses. All these approaches depend on prior knowledge of the miRNA sequences. Although the accurate profiling of known miRNA expression is an important tool to investigate physiological and pathological states, the discovery of new miRNAs is still important. Bioinformatic strategies and miRNA gene prediction algorithms have been used to screen genome sequences and to identify potential miRNAs [[2]; for review [12]]. Schematically, the bioinformatic approaches scan genomic sequences for the phylogenetic conservation of short nucleotide motifs located within genomic stretches that have the structural characteristics, i.e. the secondary structures, of miRNA precursors. However, such gene predictions may not reveal all miRNAs, and might especially miss those that are not phylogenetically conserved. Furthermore, all these in silico predictions require independent experimental validation. In contrast, cloning approaches allow the identification of miRNAs without prior knowledge of their sequences [for example [13]], but limit the identification to those miRNAs present at specific moments in the studied organ.
The mammary gland is a dynamic organ whose structure changes throughout the female reproductive cycle. These successive physiological stages, which are regulated by hormones, growth factor ligands, their receptors and some transcription factors, are characterized by proliferation, differentiation and apoptosis of the mammary epithelial tissue, which is embedded in the stroma [for review: [14,15]].
An implication of miRNAs in mammary gland biology has been suggested by a few studies mainly focusing on pathological situations, such as the appearance of breast cancer [[16] for review]. Some miRNAs were found to be deregulated in human breast cancers [17][18][19][20][21]. Recently, the role of specific miRNAs (miR-206, -221 and -222) in the regulation of the human Estrogen Receptor-α in breast cancer cell lines has been demonstrated [22,23]. Moreover, some miRNA expression profiles have been used to identify human breast cancer tumor subclasses [24,25]. In the normal mammary gland, the expression of known miRNAs has been studied in human [26] and mouse [27,28] at different developmental stages and found to be regulated. Moreover, the identification and characterization of miRNAs from bovine mammary tissue by Gu et al. [29] allowed the cloning of 33 distinct miRNAs, including 3 novel ones. Recently, Ibarra et al. [30] showed that several miRNAs are involved in the maintenance of mouse mammary epithelial progenitor cells. The full repertoire of miRNAs expressed in one tissue, in a specific condition or in specific cell types is not yet available and could reveal specific miRNAs. The mammary gland, which is an evolutionarily recent organ, is a good candidate in which to search for such tissue-specific miRNAs.

Cloning of new miRNAs from mouse mammary gland
Using the small RNA cloning method described by Lagos-Quintana et al. [31], cDNA libraries were constructed from RNA samples of mouse mammary glands collected at different stages (virgin 8 weeks; gestation days 2, 6 and 18; lactation day 4; and involution day 1). A total of 455 clones were obtained and sequenced, of which 169, corresponding to 120 non-redundant sequences, had inserts ranging between 18 and 28 nt in length (Additional file 1). Most of the remaining clones contained inserts smaller than 17 nt and were not analysed further. The 120 non-redundant sequences were compared with known miRNAs by searching the miRBase database (release 12.0). Nine sequences matched mature miRNAs: 3 matched perfectly with mouse let-7b (MG030) and let-7c (MG070) and human miR-923 (MG041), respectively; 3 matched with a 1-nt difference with mouse miR-126-5p (MG034) and miR-429 (MG018) and human miR-1268 (MG068); and 3 had partial homologies (Additional file 1). Human miR-923 is 100% identical to the mouse genomic sequence. The 3 sequences that perfectly matched known miRNAs were thus considered to correspond to true miRNAs and were not analysed further. Thirty-three sequences had partial homologies with 1 or several precursors from different species; among them, 17 had partial homology with the miRNA strand, 8 with the miRNA* strand and 8 with the loop (Additional file 1).
One important criterion that distinguishes miRNAs from other endogenous small RNAs is the ability of their genomic flanking sequences to adopt a hairpin structure, with the mature miRNA properly positioned within one of its strands so that it can be excised during Dicer processing [32]. The 117 remaining cloned sequences were aligned with the mouse genome using BLAST; 31 of them had no homologous sequence in the genome and were discarded as probable cloning artefacts, although for some of them we cannot exclude that the observed mismatches are due to RNA editing [33]. Among the 86 homologous sequences, 15 showed homology with a single location, whereas the others could be located in several genomic regions, leading overall to a total of 441 localisations. To assess which of the 441 regions correspond to potential miRNA genes, their secondary structures were studied using the RNA folding program MFOLD [34]. One hundred and fourteen chromosomal regions, corresponding to 40 cloned sequences, could form a stable hairpin structure. Among these 114 chromosomal regions, 57 were annotated as ribosomal RNA or small nuclear RNA and were discarded. The 57 remaining regions corresponded to 33 cloned sequences.
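The positional criterion described above can be illustrated with a minimal sketch. It assumes that the predicted fold is available as a dot-bracket string (MFOLD output would first need to be converted to that notation). The function name `lies_on_one_arm` and the `min_paired` threshold are hypothetical choices made for illustration, not parameters reported in this study.

```python
def lies_on_one_arm(structure: str, start: int, end: int, min_paired: int = 16) -> bool:
    """Check that structure[start:end] sits on a single arm of a hairpin.

    structure  : dot-bracket string, '(' and ')' for paired bases, '.' for unpaired
    start, end : 0-based, end-exclusive position of the cloned small RNA
    min_paired : illustrative minimum number of paired bases within the span
    """
    segment = structure[start:end]
    opens = segment.count("(")
    closes = segment.count(")")
    # A span that crosses the terminal loop contains both bracket types.
    if opens and closes:
        return False
    # Otherwise require enough paired bases on the single arm.
    return max(opens, closes) >= min_paired


# Toy hairpin: 22 paired bases, a 6-nt loop, 22 paired bases.
toy_hairpin = "(" * 22 + "." * 6 + ")" * 22
on_five_prime_arm = lies_on_one_arm(toy_hairpin, 0, 22)
```

A real filter would also consider the folding free energy and the size of internal bulges, which this sketch deliberately ignores.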
Their chromosomal locations and sequence alignments revealed that several cloned sequences partially overlapped and could derive from the same precursor. In our study, 5 families, corresponding to 13 sequences, were detected (Figure 1). Recently, the presence of variants has been reported after the characterization of cDNA libraries of small RNAs from porcine fibroblast cells [35] and from bovine adipose tissue and mammary gland [29]. The 3'-end variants may arise from preferential degradation at the 3' end or from imprecise processing of miRNA precursors by Dicer, thereby generating miRNAs with differing 3' ends [36,37]. However, we cannot exclude the possibility that these miRNA variants originate from multiple genomic loci. The functional specialization of miRNA variants is still unknown. The 33 cloned sequences could derive from 41 precursors (Additional file 2). Indeed, one cloned sequence could derive from several precursors (Additional file 2; for example, MG141 could derive from 3 precursors). In some cases, one cloned sequence has i) 2 identical precursors located on 2 different chromosomes (Additional file 2, MG016-13 and -16), ii) 3 identical precursors clustered on one chromosome (MG055) or iii) 2 similar precursors (differing by 1 or 3 nt) on two different chromosomes (Additional file 2; for example, MG009/MG037/MG056/MG066-06 and -17, MG016-04 and -1 or MG141-12 and -15).
Figure 1 Sequence variations of cloned mouse miRNAs. Two to 4 variants were identified for 5 distinct families. The number of clones for each variant is indicated in parentheses beside the sequence. The sequences of the corresponding precursors are presented in Additional file 2. For several families, more than one predicted precursor was determined by bioinformatic analysis (Additional file 2).
Overall, the precursors are distributed on all the chromosomes except chromosomes 5, 10 and Y (Figure 2). Chromosome 1 appears to contain more mammary gland miRNA genes than any other chromosome. Our results are in agreement with published results [38,39] showing that miRNA genes are distributed among all the chromosomes except chromosome Y. Twenty-six of the precursors are located in intergenic regions, 14 in genes (intron: 11, 5'UTR: 1, 3'UTR: 1 and exon: 1) and 1 corresponds to a not fully characterised miRNA (ENSMUSG00000076325) (Additional file 2). Thus the majority of miRNA genes are part of intergenic sequences (62%), as observed by Ro et al. [40]. By analysing the proximity of the 57 chromosomal locations in the mouse genome, 4 miRNA clusters were observed (Table 1). miRNA genes were considered to be in the same cluster if they were less than 1000 bp apart on the same chromosome [29]. This physical proximity is consistent with recent reports of miRNA clustering within the human genome [41]. Among the 4 clusters, 2 correspond to the association of two new precursors, MG016-01 and MG016-16, with 2 already described miRNAs, mmu-miR-689-1 and mmu-miR-689-2, respectively (Table 1). The cluster MG055 on chromosome 1 has three repeats of the same precursor.
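The clustering rule (loci less than 1000 bp apart on the same chromosome) can be sketched as follows. This is only an illustration under stated assumptions: each locus is represented as a hypothetical `(name, chromosome, start, end)` tuple, the gap is measured from the end of the previous locus to the start of the next one, and the coordinates in the example are invented rather than taken from Table 1.

```python
from collections import defaultdict


def cluster_loci(loci, max_gap=1000):
    """Group loci lying less than max_gap bp apart on the same chromosome.

    loci: iterable of (name, chromosome, start, end) tuples.
    Returns a list of clusters (lists of names) with at least 2 members.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in loci:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current, last_end = [], None
        for start, end, name in sorted(by_chrom[chrom]):
            if current and start - last_end < max_gap:
                # Close enough to the previous locus: extend the cluster.
                current.append(name)
                last_end = max(last_end, end)
            else:
                # Too far away: close the previous group, start a new one.
                if len(current) > 1:
                    clusters.append(current)
                current, last_end = [name], end
        if len(current) > 1:
            clusters.append(current)
    return clusters


# Invented coordinates, for illustration only.
example_loci = [
    ("locusA", "chr1", 100, 170),
    ("locusB", "chr1", 600, 670),
    ("locusC", "chr1", 5000, 5070),
    ("locusD", "chr2", 100, 170),
]
example_clusters = cluster_loci(example_loci)
```

Sorting by start position within each chromosome means a single pass is enough, since a locus only needs to be compared with the end of the running cluster.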
The evolutionary conservation of the 33 new miRNAs (25 distinct miRNAs + 8 variants; Table 2) was studied by comparing their sequences with the human, rat, dog, monkey, chicken, Danio rerio, Monodelphis and Ornithorhynchus anatinus genome sequences. Only 3 were conserved in all the species studied. Six were not detected in any other species (Table 2). Although the phylogenetic conservation of the miRNA sequence is one of the criteria established by Ambros et al. [32] to characterize miRNAs, it is important to recall that for some species the genome sequence is not complete enough to detect all the sequences cloned here. Nevertheless, the bioinformatic analysis performed in this study allowed the identification of mouse-specific miRNAs.

Expression profiling of the 33 new miRNAs
miRNA expression is generally examined to better understand physiological function. In this study, the tissue expression profile was used to assess the mammary gland specificity of the new miRNAs. In addition, these analyses provide further evidence for the identification of bona fide miRNAs. Expression of the 33 new miRNAs cloned in this study was analysed using an adapted RT-PCR method described by Shi and Chiang [42] and Ro et al. [43] and detailed in the Methods section. Using this approach, we could detect expression of 22 of them in at least one of the 4 analysed tissues (lactating or involuting mouse mammary gland, brain, muscle and liver; Figure 3 and Table 3). Based upon the multi-tissue expression profiles, these new miRNAs were grouped into 4 categories: undetected (11 miRNAs), ubiquitously expressed (10 miRNAs), mammary specific (4 miRNAs), and expressed in several but not all tissues (8 miRNAs). Within the latter category, one is not expressed in the mammary gland stages used for the RT-PCR experiment (lactation and involution).
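The grouping into 4 categories can be written as a small decision rule. The sketch below is one possible reading of the text, not the authors' procedure. It assumes that the two mammary gland stages count as a single tissue (detected if either stage is positive), and the sample labels `MG_L10` and `MG_I03` and the function name are hypothetical.

```python
# Hypothetical labels for the two mammary gland samples (lactation, involution).
MAMMARY_SAMPLES = ("MG_L10", "MG_I03")


def classify_profile(detected, mammary_samples=MAMMARY_SAMPLES):
    """Assign a detection profile to one of the 4 categories.

    detected: dict mapping a sample name to True (band seen) or False.
    """
    # The mammary gland counts as detected if any of its stages is positive.
    mammary = any(detected.get(sample, False) for sample in mammary_samples)
    others = [seen for tissue, seen in detected.items()
              if tissue not in mammary_samples]
    n_other = sum(1 for seen in others if seen)

    if not mammary and n_other == 0:
        return "undetected"
    if mammary and n_other == len(others):
        return "ubiquitous"
    if mammary and n_other == 0:
        return "mammary specific"
    return "restricted"  # detected in several, but not all, tissues


# Invented profile: positive in involution and in brain only.
example_profile = {"MG_L10": False, "MG_I03": True,
                   "brain": True, "liver": False, "muscle": False}
example_category = classify_profile(example_profile)
```

With only 4 tissues, "mammary specific" is a provisional label, which is why a wider tissue panel is needed before drawing any conclusion about specificity.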
The expression profile is the same for the 3 miRNAs present in the cluster MG113/MG130; MG141. The characterization of the undetected miRNAs was completed by studying their expression in some supplementary mammary gland stages (virgin 8 weeks, gestation day 18 or lactation day 4). Among the 11 miRNAs, only 2 were detected (MG004 and MG123, data not shown). Among the 9 miRNAs that remained undetected, 3 correspond to variants of different families. We cannot exclude that these 3 variants are cloning artefacts or that this lack of detection is due to individual variation for such miRNA families. Within a given family, not all the variants are expressed in the same tissues. A first analysis identified 4 new miRNAs as mammary gland specific; to confirm this preliminary result, their expression was studied by RT-PCR in more tissues (heart, intestine, ovary, lung, spleen and kidney). Finally, they were found not to be mammary gland specific: MG013 is expressed in all the tissues except kidney and heart; MG056 is expressed in all the tissues except ovary and kidney; MG119 was detected in all tissues; and MG009 was found to be expressed only in spleen and mammary gland.
Figure 2 Chromosomal distribution of the miRNA genes identified in this study. The number of hits represents the number of miRNA genes localized on each chromosome. This study did not reveal the occurrence of a specific chromosome encompassing most of the mammary-expressed miRNA loci.
In 11 cases, the miRNA is detected in the mammary gland in involution but not in lactation, a result in agreement with the expression profile of known miRNAs at different stages of mammary gland biology obtained in our team (unpublished results). However, no correlation was observed between the tissue distribution of miRNA expression (Table 3) and evolutionary conservation (Table 2), or between the mammary expression profile (Table 3) and conservation in mammals (Table 2). Moreover, the 6 miRNAs present only in the mouse genome are expressed in the 4 tissues tested. Although no mammary gland specific miRNA was identified in this study, several new miRNAs that are not ubiquitously expressed have been cloned.

Functional validation of precursors detected by bioinformatic analysis
The bioinformatic analysis allowed the detection of potential precursors, but these results could not determine whether these hairpin structures are processed by the RNase III enzymes Drosha in the nucleus and Dicer in the cytoplasm. The use of expression vectors in cell culture to synthesize hairpin structures that are matured into miRNAs has already been demonstrated [44,45]. To test the functional validity of the precursors obtained by bioinformatic analysis, we expressed them in transfected cells and checked for the presence of the mature miRNAs. The constructs carrying the precursors (MG008-X and MG053-01) were transiently transfected into COS-7 cells and the expression of the mature miRNAs (MG008 and MG053) was studied by Northern blot analysis (Figure 4). As a control, the precursor of a known miRNA (let-7b) was used (data not shown). This approach could be used to validate precursor(s) of miRNAs.

Conclusion
In spite of the development of new tools such as miRNA arrays, cloning approaches remain the only strategy to identify miRNAs in a tissue at a specific stage. As the mammary gland is an evolutionarily recent organ, it is a good candidate in which to search for new tissue-specific miRNAs. Our study provides evidence for the occurrence in this tissue of 3 already known miRNAs and of 33 new mouse miRNAs. Among the 33 new miRNAs, the expression of 24 was confirmed by RT-PCR analysis.
Although none of them is mammary gland specific, some are not ubiquitous and are good candidates for further analysis of their roles in mammary gland biology.
Figure 3 Expression of new miRNAs in different tissues. RT-PCR analysis (+, with reverse transcriptase; -, without reverse transcriptase) of newly identified miRNAs in brain (B), liver (L), muscle (M) and mammary gland (MG) during lactation (day 10, L10) and involution (day 3, I03) is presented for 6 miRNAs. C: samples without cDNA. ma: DNA marker (1-kb ladder from Gibco-BRL), shown using two different UV-light intensities to allow better visualization.
One of the rules proposed by Ambros et al. [32] to characterize a miRNA is the phylogenetic conservation of the miRNA sequence. In our study, no correlation could be established between the expression and the evolutionary conservation of these new miRNAs. Our result is in agreement with the data obtained by Berezikov et al. [7], who showed, by comparing the miRNA content of human and chimpanzee brain, that the evolution of miRNAs is an ongoing process and that, along with ancient, highly conserved miRNAs, there are a number of emerging miRNAs. Farh et al. [46] have suggested that the binding sites of miRNAs in 3'UTRs do not necessarily have to be conserved among different species. Therefore, the miRNAs identified here that are conserved in non-mammalian species could also have a specific role in the mammary gland, as could the others. Expression in the different stages of mammary gland biology and target identification of these new miRNAs will be critical for determining their functions.
This study has allowed the identification of 6 mouse-specific miRNAs, reinforcing the as yet unique ability of the cloning approach to identify such evolutionarily non-conserved miRNAs.

Collection of samples
The tissues were isolated from FVB/N mice. The mammary glands were collected at different stages: virgin 8 weeks; gestation days 2, 6 and 18; lactation days 3 and 4; and involution days 1 and 3. For all pregnancy samples, day 0 of pregnancy is the day the vaginal plug was observed. Day 1 of involution was designated as 24 h after the removal of the pups. All mouse manipulations were done following the French Commission de Génie Génétique recommendations.
Table legend: 0, not detected; x, detected.
Figure 4 Functional validation of precursors MG008-X and MG053-01. Detection of miRNAs MG008 and MG053 by Northern blot analysis after transfection of COS-7 cells with expression vectors vMG008 and vMG053. RNAs were extracted from cells transfected with expression vectors (T) and with the control empty vector (C), separated on a 15% denaturing PAGE and transferred onto a nylon membrane. Blots were hybridized with miRNA antisense oligonucleotides and a U6 probe used as an internal loading control.

Small RNA isolation and cloning
For the library construction, small RNAs were isolated using mirVana TMP miRNA isolation kit (Ambion) according to the manufacturer's instructions. Small RNA cloning was performed as described by Lagos-Quintana et al. [31], without the step of concatemerization of the PCR products. Briefly, 500 µg of RNA were size fractionated using 15% denaturing polyacrylamide gel electrophoresis (PAGE). Excised gel bands were homogenized in 0.3 M NaCl overnight at 4°C to solubilize RNA. A 3' adapter (5' phosphorylated) was ligated to the RNA fraction in the presence of T4 RNA ligase (Amersham-Pharmacia). The mix was size fractionated using 15% PAGE and a 5' adapter was ligated.
The sample was again size fractionated using 15% PAGE before being reverse transcribed using a primer complementary to the 3' adapter sequence and PCR amplified using primers on both adapters. Amplified products were cloned using the pGEM-T vector system I (Promega). Clones with inserts were sequenced.

Bioinformatic analysis
Small RNA sequences ranging between 18 and 25 nt in length obtained from the libraries were multi-aligned with the CLUSTAL W software [47] to exclude redundant sequences [48]. The distinct sequences were used to search miRBase [49] with BLASTN to identify conserved miRNAs [50]. The sequences were mapped onto the mouse genome using the EnsEMBL mouse genome database [51]. A fragment of ~260 nucleotides of genomic sequence flanking the small RNA at both the 5' and 3' ends was used to predict the secondary structure of the miRNA precursor (stem-loop formations) using the MFOLD program (version 3.2) [34,52]. If a sequence including a small RNA formed a stem-loop, and if the small RNA ranged in size from 18 to 25 nt and had not been registered in miRBase, we classified it as a new miRNA. The sequences of the new miRNA candidates and their precursors were subjected to a BLASTN search against NCBI genomes to estimate species conservation.
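The size filter, redundancy removal and final classification rule can be summarised in a short sketch. Note the substitution: redundancy is removed here by exact string matching, not by the CLUSTAL W multi-alignment used in the study. The stem-loop prediction is also assumed to have been made elsewhere and is passed in as a boolean. All function and parameter names are hypothetical.

```python
def unique_in_size_range(reads, min_len=18, max_len=25):
    """Keep distinct reads within the size window, in order of first appearance."""
    seen, kept = set(), []
    for read in reads:
        seq = read.upper().replace("U", "T")  # normalise RNA/DNA alphabets
        if min_len <= len(seq) <= max_len and seq not in seen:
            seen.add(seq)
            kept.append(seq)
    return kept


def is_new_mirna_candidate(seq, forms_stem_loop, known_mirnas,
                           min_len=18, max_len=25):
    """Apply the three conditions: size window, stem-loop, absent from the database.

    forms_stem_loop : result of an external folding prediction (assumed given)
    known_mirnas    : set of already registered mature sequences (assumed given)
    """
    in_window = min_len <= len(seq) <= max_len
    return in_window and forms_stem_loop and seq not in known_mirnas


# Invented reads: one duplicate, one RNA-alphabet duplicate, one too short, one too long.
toy_reads = ["ACGT" * 5, "ACGT" * 5, "ACGU" * 5, "A" * 10, "G" * 30]
distinct_reads = unique_in_size_range(toy_reads)
```

Exact matching only collapses identical reads, so 3'-end length variants of the same miRNA would remain as separate entries and would need to be grouped in a later step.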

miRNA expression analysis
Total RNAs were isolated from tissue samples and from COS-7 cells using the RNA NOW kit (Biogentex inc.) according to the manufacturer's protocol with a small modification: the RNAs were precipitated at -20°C overnight by the addition of 3 volumes of ethanol. An equimolar mix of total RNA from 3 different mice was used for reverse transcription.
Detection was achieved by PCR using a set of primers composed of the universal primer corresponding to the 5' end of the poly(T) adapter (5' CGAATTCTAGAGCTCGAGGCAGG 3') and a primer specific to the miRNA sequence (Additional file 3). Reaction products were separated on a 2% agarose gel. The detection of miRNA let-7c was used as a positive control, and let-7c was indeed detected in all analysed samples (data not shown).
For Northern blot analysis, 20 µg of total RNA were fractionated using 15% PAGE and transferred to a Hybond-N+ membrane (Amersham) by capillarity. Blots were hybridized overnight at 55°C in phosphate buffer [53] with a DNA oligonucleotide probe complementary to the miRNA sequence and radioactively labeled with [γ-32P]ATP, washed twice with 2×SSC at 55°C, exposed to a Phosphor Screen and analysed with the StormScan software.

Vector design
The precursor vectors contain the sequences designed from the MG008 and MG053 precursors (sequence in bold in Additional file 2). For the MG053 precursor, the sense (5' CGGGATCCCGAGCGCCGAATCCCCGCCGCGCGTCGCGGCGTG 3') and antisense (5' CGGGATCCCGGGTCTTCCGTACGCCACATTTCCCACGCCGCGACGCGCGGCGG 3') oligonucleotides were annealed, filled in with the Klenow fragment enzyme and digested with BamHI. The resulting fragment was inserted into the BamHI site of the pUHD10.3 plasmid [54]. The MG008 precursor fragment was obtained by PCR on mouse genomic DNA using the sense (5' CGGGATCCCGGTTTCAAAGTTTTGATAGGTTCTACGCATG 3') and antisense (5' CGGGATCCCGGCTTCAGCTTTGACTTTCAGAGCACTGGG 3') primers and digested with BamHI before being cloned into the BamHI site of the pUHD10.3 plasmid. The orientation of the insert was determined by PCR using the sense primer of the precursor and the plasmid primers designed on each side of the cloning site (primer rtTA/1: 5' GATGCCCTGGAATTGACGAG 3' and primer glob/2: 5' TATAACATGAATTTTTCAATAGCG 3'). In these vectors, the precursor is placed under the transcriptional regulation of the CMV promoter, already validated in COS-7 cell culture (personal communication).
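The annealing and fill-in step for the MG053 oligonucleotides can be checked in silico. The sketch below is an illustration, not part of the published protocol. It uses the two oligonucleotide sequences given above, and the 20-nt overlap is an assumption obtained by inspecting their 3' ends; it is not a value stated by the authors. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def fill_in(sense, antisense, overlap):
    """Top strand of the duplex obtained after annealing and polymerase fill-in.

    The two oligonucleotides must be complementary over `overlap` nt at their 3' ends.
    """
    bottom_as_top = revcomp(antisense)  # antisense oligo read on the top strand
    if sense[-overlap:] != bottom_as_top[:overlap]:
        raise ValueError("3' ends are not complementary over the stated overlap")
    # Extension of the sense oligo copies the rest of the antisense oligo.
    return sense + bottom_as_top[overlap:]


# MG053 oligonucleotides as given in the text (internal spaces removed).
MG053_SENSE = "CGGGATCCCGAGCGCCGAATCCCCGCCGCGCGTCGCGGCGTG"
MG053_ANTISENSE = "CGGGATCCCGGGTCTTCCGTACGCCACATTTCCCACGCCGCGACGCGCGGCGG"

# Overlap length assumed from inspection of the two 3' ends.
mg053_duplex = fill_in(MG053_SENSE, MG053_ANTISENSE, overlap=20)
bamhi_sites = mg053_duplex.count("GGATCC")  # BamHI recognition sequence
```

Counting BamHI recognition sites in the filled-in product is a quick way to confirm that both ends carry a site for the subsequent digestion and cloning step.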

Cell culture experiments
COS-7 cells were cultured in Dulbecco's Modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum, 2 mM glutamine, penicillin 10 U/l and streptomycin 100 U/l at 37°C (5% CO2). The expression vectors were transfected using the jetPEI™ cationic reagent (Poly-Plus transfection) following the manufacturer's instructions. Cells were harvested 48 h after transfection.

Authors' contributions
NS and LS were responsible for miRNA cloning and precursor validation in cell culture. JL analyzed the miRNA expression in different tissues by RT-PCR. GT participated in miRNA cloning. SL participated in expression vector construction. JC provided the mice used to collect tissue samples. JLV participated in the project development and the manuscript elaboration. FLP was responsible for project development and sequence analysis and wrote the paper. All contributing authors reviewed and approved the final copy of this manuscript.