Target mimics: an embedded layer of microRNA-involved gene regulatory networks in plants
© Meng et al.; licensee BioMed Central Ltd. 2012
Received: 9 March 2012
Accepted: 17 April 2012
Published: 21 May 2012
MicroRNAs (miRNAs) play an essential role in gene regulation in plants. At the same time, the expression of miRNA genes is also tightly controlled. Recently, a novel mechanism called “target mimicry” was discovered, providing another layer for modulating miRNA activities. However, apart from the artificial target mimics engineered for functional studies of certain miRNA genes, only one example, IPS1 (Induced by Phosphate Starvation 1)–miR399, has been experimentally confirmed in planta. To date, few analyses aimed at comprehensive identification of natural target mimics have been performed in plants. Thus, limited evidence is available to address the open question of whether target mimicry is widespread in planta and implicated in specific biological processes.
In this study, genome-wide computational prediction of endogenous miRNA mimics was performed in Arabidopsis and rice, and dozens of target mimics were identified. In contrast to a recent report, the densities of target mimic sites were found to be much higher within the untranslated regions (UTRs) when compared to those within the coding sequences (CDSs) in both plants. Some novel sequence characteristics were observed for the miRNAs that were potentially regulated by the target mimics. GO (Gene Ontology) term enrichment analysis revealed some functional insights into the predicted mimics. After degradome sequencing data-based identification of miRNA targets, the regulatory networks constituted by target mimics, miRNAs and their downstream targets were constructed, and some intriguing subnetworks were further exploited.
These results together suggest that target mimicry may be widely implicated in regulating miRNA activities in planta, and we hope this study could expand the current understanding of miRNA-involved regulatory networks.
MicroRNAs, the most thoroughly characterized small RNA (sRNA) species, have been shown to play essential regulatory roles in gene expression in plants [1, 2]. Owing to the high complementarity of the recognition sites on their targets, plant miRNAs exert their repressive roles mostly through target RNA cleavage at the post-transcriptional level. Like protein-coding genes, most miRNA genes are transcribed by RNA polymerase II [3–5]. At the same time, the biogenesis and the activities of these critical small molecules are themselves under tight transcriptional and post-transcriptional surveillance.
One novel mechanism that modulates miRNA activities in plants was uncovered by Franco-Zorrilla et al. (2007). A 23-nt motif was observed to be highly conserved among the phosphate starvation-induced, non-coding RNAs transcribed from the TPSI family genes, including IPS1 and At4. In Arabidopsis, this motif can be recognized by miR399 but cannot serve as an effective cleavage site because of a 3-nt bulge on the “target” RNA sequence opposite the 10th to 11th nt of miR399, the canonical slicing position. Intriguingly, the non-cleavable transcript acts as a target mimic that sequesters the corresponding miRNA, thus reducing the active level of miR399. Based on this result, the term “target mimicry” was coined to describe the target mimic–miRNA regulatory relationship. By generating artificial mimics, the authors demonstrated that target mimicry might be implicated not only in phosphate signaling but also in other biological processes, and that the mechanism might be widespread in plants. Using the IPS1 transcript as a scaffold, subsequent research efforts generated a collection of target mimics in Arabidopsis [8, 9], which were valuable for functional studies of certain miRNA genes.
To date, however, only IPS1–miR399 has been experimentally identified as a naturally occurring example of target mimicry in planta. Although dozens of engineered target mimics have shown great potential for modulating the activities of specific miRNA genes, whether the mechanism is widespread in plants remains a pressing question. Only one study, by Ivashuta et al. (2011), has partially uncovered the natural target mimics of miRNAs in Arabidopsis. However, no in-depth analysis was performed beyond some basic statistics. In addition, the older miRNA registry (miRBase release 15 previously used vs. miRBase 17 currently available) and gene model annotations [TAIR (The Arabidopsis Information Resource) 9 vs. TAIR 10] utilized in that study, together with the exclusion of the currently available non-coding gene information, may have left this topic insufficiently explored.
Here, by using the latest versions of the gene annotations from TAIR (release 10) and TIGR rice (The Institute for Genomic Research, release 6.1; currently named the J. Craig Venter Institute), genome-wide in silico prediction of potential target mimics was performed for all the registered miRNAs of Arabidopsis and rice in miRBase (release 17). The miRNAs predicted to be sequestered by certain transcripts were then subjected to degradome sequencing data-based identification of their downstream targets. Combining these two results, numerous target mimic–miRNA–target regulatory relationships were extracted for comprehensive network construction. Certain subnetworks were further analyzed, and some interesting findings are reported.
The latest versions of the gene model annotations of Arabidopsis and rice were retrieved from TAIR and TIGR rice, respectively, and served as the transcript database for the following prediction. All the miRBase-registered miRNAs of both plants (release 17) were used to search for complementary sites on the gene transcripts with the tool Ssearch from the FASTA3 package [13, 14]. The search results were then filtered to identify the potential target mimics of certain miRNAs according to rules derived from previous experimental results [7–9] (see Methods for detailed rule-based filtering). As a result, 300 and 260 mimic–miRNA interactions were identified, involving 137 and 155 different mature miRNAs in Arabidopsis and rice, respectively (Additional file 1: Table S1 and Additional file 2: Table S2). In the study by Ivashuta et al. (2011), only a limited set of non-coding transcripts from TAIR was included for target mimic prediction. Thus, the question of whether non-coding RNAs are more or less likely to function as target mimics needs to be addressed. To investigate this issue, most currently available non-coding RNA sequences were obtained from the Genomic tRNA Database and NONCODE [16, 17]. The same Ssearch- and rule-based identification of target mimics was carried out. Surprisingly, only one mimic of ath-miR418, tRNA238-LysTTT on chromosome 1, was predicted to be a potential candidate in Arabidopsis (Additional file 1: Table S1).
A large portion of miRNA targets have been demonstrated to be transcription factors in plants, suggesting their important role in gene regulatory cascades. To gain a global functional view of these identified targets of the sequestered miRNAs, GO term enrichment analysis was performed again. As expected, “transcription factor activity” is a highly enriched function possessed by the target sets in both plants (Additional file 6: Figure S3A and S3B). Interestingly, the GO term “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides” belonging to the “Molecular Function” category was found to be enriched in the rice target set (Additional file 6: Figure S3B). Considering the functional enrichment of the target mimic genes in phosphorus metabolism-related processes in both Arabidopsis and rice, the embedded implication is worth investigating. Moreover, according to the GO annotations, a large portion of the miRNA targets in Arabidopsis were suggested to be involved in the biological processes “RNA interference”, “cell differentiation”, “vegetative (leaf) and reproductive (flower, fruit, and seed) organ development”, and “meristem initiation” (Additional file 6: Figure S3C).
Also based on target prediction and degradome data-based validation, certain target mimics were identified to be regulated by specific miRNAs. For instance, the mimic transcripts AT1G69440.1 and AT5G03545.1 were indicated to be regulated by ath-miR5021 and ath-miR414 respectively (Additional file 7: Figure S4), and LOC_Os02g36880.3 was cleaved by osa-miR164a-f in rice (Additional file 8: Figure S5).
Through target mimic prediction and degradome data-based miRNA target identification, the basic data for establishing the “miRNA–mimic–miRNA–target” regulatory relationships were obtained. We therefore set out to construct comprehensive networks involving target mimic–miRNA regulations in both Arabidopsis and rice by using Cytoscape. Overall, 465 nodes (including miRNAs, miRNA targets, and target mimics) were found to be connected by 559 edges in Arabidopsis, and 441 nodes by 1048 edges in rice (Additional file 9: Figure S6 and Additional file 10: Figure S7). To demonstrate the biological meaning of the constructed networks, certain subnetworks were further investigated.
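As an illustration of how such relationships could be assembled for import into Cytoscape, the following minimal Python sketch writes a few edges in the tab-delimited SIF layout (source, interaction type, target) and counts nodes and edges. The edges are taken from relationships mentioned in this article; the interaction labels `sequesters` and `cleaves` and the function names are hypothetical choices made for this sketch, not the labels or scripts used in the study.

```python
# Sketch only: the interaction labels and helper names are assumptions for illustration.
EDGES = [
    ("AT1G52060.1", "sequesters", "ath-miR169g*"),   # mimic -> miRNA*
    ("AT4G16070.1", "sequesters", "ath-miR169g*"),
    ("AT4G16070.2", "sequesters", "ath-miR169g*"),
    ("ath-miR5021", "cleaves", "AT1G69440.1"),       # miRNA -> mimic transcript
    ("ath-miR414", "cleaves", "AT5G03545.1"),
]


def to_sif(edges):
    """Render edges as SIF text: one 'source<TAB>interaction<TAB>target' line per edge."""
    return "\n".join(f"{src}\t{kind}\t{dst}" for src, kind, dst in edges)


def network_size(edges):
    """Return (number of distinct nodes, number of distinct edges)."""
    nodes = {node for src, _kind, dst in edges for node in (src, dst)}
    return len(nodes), len(set(edges))


if __name__ == "__main__":
    print(to_sif(EDGES))
    print(network_size(EDGES))
```

Keeping the interaction type as the middle column lets mimic-to-miRNA and miRNA-to-target edges be styled differently after import.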
Several other interesting subnetworks were also extracted for characterization. For instance, only one member of the miR156 family in each plant, i.e. ath-miR156h and osa-miR156k, was identified within the established comprehensive networks (Additional file 13: Figure S9). Both mimics and downstream targets were discovered for the two miR156 genes. Thus, whether these two miR156 family members play a dominant role in specific regulatory pathways in both plants requires further investigation. Using our computational approach, the target genes of miR172 belonging to the AP2 family were identified based on the significant cleavage signals within the target sites in Arabidopsis (Additional file 4: Figure S1 and Additional file 13: Figure S9). Thus, in addition to the previously reported translational repression of the AP2 genes by miR172 [37, 38], cleavage of the AP2 transcripts may also play an indispensable role in floral organ development. Within the ath-miR169-mediated subnetwork, the miRNA star species, ath-miR169g*, was found to be potentially regulated by three mimic transcripts, AT1G52060.1, AT4G16070.1 and AT4G16070.2 (Additional file 13: Figure S9). Considering the widespread regulatory activities of miRNA*s unraveled in recent years [39–42], it is reasonable that the active levels of certain miRNA*s should be under strict surveillance through target mimicry. In rice, a similar miR169-involved subnetwork was identified, although no such mimic–miR169* regulatory relationship was found (Additional file 13: Figure S9). However, based on the TIGR rice annotations, two mimic genes of osa-miR169, LOC_Os05g24010 and LOC_Os09g37800, were suggested to be responsive to stress. Considering the reported involvement of miR169 in drought and nitrogen starvation responses in rice, the characterized subnetwork may play an essential role in multi-stress-induced responses.
Moreover, the largest subnetwork involving miR446, miR809 and miR819 was identified in rice (Additional file 13: Figure S9). These three miRNA families seem to be rice-specific according to the current miRBase registries (release 17), and their biological functions have not been characterized. Unfortunately, we could not gain any informative hints from the current annotations of the mimic and target genes within this subnetwork. Hence, it will be interesting to gain functional insights from the established subnetwork through experimental approaches.
Taken together, comprehensive networks constituted by numerous target mimic–miRNA–target regulatory cascades have been constructed in Arabidopsis and rice through in silico target mimic site prediction and degradome data-based target identification. In-depth characterization of certain interesting subnetworks indicated that the established networks are relatively reliable and biologically meaningful. Several subnetworks were observed to be conserved between Arabidopsis and rice to some extent (Figure 5, Figure 6, Additional file 12: Figure S8 and Additional file 13: Figure S9), whereas others might be species-specific. We hope this study will expand the current view of miRNA-mediated regulatory networks in plants and inspire more research efforts on novel regulatory mechanisms for modulating miRNA activities, such as the target mimicry characterized in our analysis.
The degradome sequencing data sets were retrieved from GEO (Gene Expression Omnibus; http://www.ncbi.nlm.nih.gov/geo/) and NGSDBs (Next-Gen Sequence Databases; http://mpss.udel.edu/). The accession numbers of these data sets are: (1) Arabidopsis degradome data: GSM278333, GSM278334, GSM278335, and GSM278370 from GEO; and AxIDT, AxIRP, AxSRP, Col, ein5l, TWF, and Tx4F from NGSDBs. (2) Rice degradome data: GSM434596, GSM455938, GSM455939, and GSM476257 from GEO. The gene annotation and sequence information of Arabidopsis and rice were retrieved from the FTP sites of TAIR (release 10; ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/) and TIGR rice (release 6.1; ftp://ftp.plantbiology.msu.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/), respectively. The sequences of the miRNA precursors and the mature miRNAs were downloaded from miRBase (release 17; http://www.mirbase.org/). The other non-protein-coding RNA sequences were retrieved from the Genomic tRNA Database (http://gtrnadb.ucsc.edu/download.html) and NONCODE (http://www.noncode.org/NONCODERv3/download.htm) [16, 17].
First, Ssearch from the FASTA3 program package [downloaded from the FTP site of EBI (European Bioinformatics Institute), ftp://ftp.ebi.ac.uk/pub/software/unix/fasta/] [13, 14] was used to search for the sites in cDNA sequences that were reverse complementary to the miRNAs. Each miRNA of Arabidopsis and rice was searched against the cDNA sequence library of the corresponding plant species. The cDNA sequences were retrieved from TAIR and TIGR rice, and the miRNAs from miRBase, as mentioned above. To retain predicted sites with low complementarity to the miRNAs (allowing the identification of target mimics with large bulges within the complementary sites), the first 5,000 Ssearch results for each miRNA were kept for further identification. To discover the miRNA mimics, we applied the following set of rules, based on previous experimental results [7–9]: (1) A 3- to 5-nt bulge must exist within the complementary site of the cDNA, and the bulge should be located in the middle of the corresponding miRNA (definition of the middle positions: 9th to 11th nt of the 19-nt-long miRNAs; 10th to 11th nt of the 20-nt miRNAs; 10th to 12th nt of the 21-nt ones; 11th to 12th nt of the 22-nt ones; 11th to 13th nt of the 23-nt ones; 12th to 13th nt of the 24-nt ones). (2) Within the non-middle region of each miRNA, no more than 4 mismatches in total were allowed, and consecutive mismatches should not exceed 2 nt. (3) No bulge was permitted within the non-middle regions of the miRNAs. A Perl script was developed to perform this rule-based screening. The cDNAs that satisfied the above criteria were considered to be target mimic candidates.
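The three rules above can be sketched in a few lines of Python. This is not the Perl script used in the study; it is a minimal illustration under several stated assumptions: the alignment is supplied as two equal-length gapped strings, the site is given as its reverse complement so that paired positions show identical letters, any non-identical pair (including G:U) counts as a mismatch, mismatches inside the middle window are not counted, and a bulge is taken to be "in the middle" when both miRNA positions flanking it fall inside the middle window. The function name and the test sequences are hypothetical.

```python
# Middle positions (1-based, inclusive) per miRNA length, as listed in rule (1).
MIDDLE = {19: (9, 11), 20: (10, 11), 21: (10, 12),
          22: (11, 12), 23: (11, 13), 24: (12, 13)}


def is_mimic_candidate(mirna_aln, site_aln):
    """Apply rules (1)-(3) to one gapped alignment.

    mirna_aln: miRNA, 5'->3', with '-' where the transcript carries bulged nucleotides.
    site_aln:  reverse complement of the transcript site (assumed representation).
    """
    mirna_aln = mirna_aln.upper().replace("T", "U")
    site_aln = site_aln.upper().replace("T", "U")
    if len(mirna_aln) != len(site_aln) or "-" in site_aln:
        return False  # assumption: no bulge on the miRNA side is tolerated
    length = len(mirna_aln.replace("-", ""))
    if length not in MIDDLE:
        return False
    start, end = MIDDLE[length]

    pos = 0          # 1-based position of the last miRNA nucleotide consumed
    bulges = []      # (miRNA position preceding the bulge, bulge length)
    total_mm = 0     # mismatches outside the middle window
    run_mm = 0       # current run of consecutive non-middle mismatches
    i = 0
    while i < len(mirna_aln):
        if mirna_aln[i] == "-":
            j = i
            while j < len(mirna_aln) and mirna_aln[j] == "-":
                j += 1
            bulges.append((pos, j - i))
            run_mm = 0
            i = j
            continue
        pos += 1
        if mirna_aln[i] != site_aln[i] and not (start <= pos <= end):
            total_mm += 1
            run_mm += 1
            if run_mm > 2:          # rule (2): at most 2 consecutive mismatches
                return False
        else:
            run_mm = 0
        i += 1

    if len(bulges) != 1:            # rules (1) and (3): exactly one bulge overall
        return False
    before, size = bulges[0]
    if not 3 <= size <= 5:          # rule (1): 3- to 5-nt bulge
        return False
    if not (start <= before and before + 1 <= end):
        return False                # bulge must sit inside the middle window
    return total_mm <= 4            # rule (2): at most 4 non-middle mismatches


if __name__ == "__main__":
    # Hypothetical 21-nt miRNA with a 3-nt bulge between positions 10 and 11.
    print(is_mimic_candidate("ACGUACGUAC---GUACGUACGUA",
                             "ACGUACGUACCUAGUACGUACGUA"))
```

Other readings of "located in the middle" are possible, for example requiring only that the bulge overlap the window; the flanking-position test is the one place where the sketch would need adjusting.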
Based on the TAIR and TIGR rice annotations, only the target mimic sites located on the gene transcripts with 5’ UTR—CDS—3’ UTR structure were included in this analysis. In some cases, one mimic site might be recognized by two or more different miRNAs (especially for the members of the same miRNA families). These sites were considered only once.
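A minimal Python sketch of this bookkeeping is given below. It assumes that each site is described by its transcript identifier and start coordinate, that each transcript is described by its CDS start, CDS end and total length (1-based), and that density is expressed as sites per kilobase of region summed over all retained transcripts. The unit, the data layout and the function names are assumptions for illustration and are not taken from the original analysis.

```python
def region_of(site_start, cds_start, cds_end):
    """Classify a site by its start coordinate relative to the CDS."""
    if site_start < cds_start:
        return "5UTR"
    if site_start > cds_end:
        return "3UTR"
    return "CDS"


def site_densities(sites, transcripts):
    """Sites per kb in 5' UTR, CDS and 3' UTR.

    sites:       iterable of (transcript_id, site_start, mirna_name)
    transcripts: dict transcript_id -> (cds_start, cds_end, transcript_length)
    """
    lengths = {"5UTR": 0, "CDS": 0, "3UTR": 0}
    kept = {}
    for tid, (cds_start, cds_end, length) in transcripts.items():
        # Keep only transcripts with a complete 5' UTR, CDS and 3' UTR structure.
        if cds_start > 1 and cds_end < length:
            kept[tid] = (cds_start, cds_end)
            lengths["5UTR"] += cds_start - 1
            lengths["CDS"] += cds_end - cds_start + 1
            lengths["3UTR"] += length - cds_end

    # A site recognized by several miRNAs is counted only once.
    unique_sites = {(tid, start) for tid, start, _mirna in sites if tid in kept}

    counts = {"5UTR": 0, "CDS": 0, "3UTR": 0}
    for tid, start in unique_sites:
        counts[region_of(start, *kept[tid])] += 1

    return {region: (1000 * counts[region] / lengths[region]) if lengths[region] else 0.0
            for region in lengths}


if __name__ == "__main__":
    # Hypothetical transcripts and sites, for illustration only.
    transcripts = {"t1": (101, 900, 1000), "t2": (1, 500, 600)}
    sites = [("t1", 50, "miR-a"), ("t1", 50, "miR-b"),
             ("t1", 400, "miR-c"), ("t2", 550, "miR-d")]
    print(site_densities(sites, transcripts))
```

In the hypothetical example, transcript `t2` is excluded because it has no 5' UTR, and the two miRNAs sharing the site at position 50 of `t1` contribute a single count.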
Target prediction was performed by using the miRU algorithm [21, 22] with default parameters. The degradome sequencing data were utilized to validate the predicted miRNA–target pairs. First, the read counts of all the degradome reads from each library were normalized to allow cross-library comparison. The normalized read count (in RPM, reads per million) of a short read from a specific library was calculated by dividing the raw count of this read by the total read count of the library and then multiplying by 10^6. Second, all the degradome short reads were mapped to the predicted target transcripts by using the BLAST algorithm, and only the perfectly matched reads were retained. Then, two-step filtering was performed to extract the most likely miRNA–target pairs. In the first step, the predicted targets were retained for further validation only if three or more degradome reads with identical 5’ ends were located within the predicted target binding sites. For this filtering step, all the degradome data sets were utilized at the same time for a comprehensive screening, on the basis that a miRNA–target pair was considered a candidate once cleavage signal(s) existed in any data set. After the first filtering, the degradome signals along each retained transcript were obtained from the BLAST results to provide a global view of the signal noise compared with the signal intensity within a specific target binding site. Following our previous study, both the global and the local t-plots (target plots) [49, 50] were drawn. Finally, exhaustive manual filtering was performed, and only the transcripts with readily recognizable cleavage signals were extracted as the miRNA–target pairs.
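The normalization and the first filtering step can be sketched as follows. This is a minimal Python illustration, assuming that the 5' end coordinates of perfectly matched reads on a transcript are already available as a list with one entry per read, and that the binding site is given by inclusive transcript coordinates. The function names and example numbers are hypothetical.

```python
from collections import Counter


def rpm(raw_count, library_total):
    """Reads per million: raw count divided by the library total, times 10^6."""
    return raw_count * 1_000_000 / library_total


def passes_first_filter(five_prime_ends, site_start, site_end, min_reads=3):
    """True if at least `min_reads` reads share one 5' end inside the binding site.

    five_prime_ends: transcript coordinates of read 5' ends, one entry per read,
                     pooled across all degradome libraries (assumed input layout).
    """
    inside = Counter(p for p in five_prime_ends if site_start <= p <= site_end)
    return any(n >= min_reads for n in inside.values())


if __name__ == "__main__":
    print(rpm(5, 2_000_000))                                    # one read seen 5 times
    print(passes_first_filter([105, 105, 105, 120], 100, 120))  # three reads at 105
    print(passes_first_filter([105, 106, 107], 100, 120))       # no shared 5' end
```

Pooling the 5' ends from every library before counting mirrors the statement that a pair is kept once a cleavage signal exists in any data set; counting per library instead would be a stricter variant.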
IPS1: Induced by Phosphate Starvation 1
TAIR: The Arabidopsis Information Resource
TIGR: The Institute for Genomic Research
GEO: Gene Expression Omnibus
NGSDBs: Next-Gen Sequence Databases
RPM: Reads per million
We would like to thank the scientists who generated the publicly available datasets used in this study.
This work was funded by the National Natural Science Foundation of China [31100937, 31125011] and by the Starting Grant from Hangzhou Normal University to Yijun Meng [2011QDL60].
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.