Transcriptome map of plant mitochondria reveals islands of unexpected transcribed regions
© Fujii et al; licensee BioMed Central Ltd. 2011
Received: 11 January 2011
Accepted: 1 June 2011
Published: 1 June 2011
Plant mitochondria contain a relatively large amount of genetic information, suggesting that their functional regulation may not be as straightforward as that of metazoans. We used a genomic tiling array to draw a transcriptomic atlas of the mitochondrial genome of Oryza sativa japonica (rice), which is estimated to be approximately 490 kb long.
Statistical analysis verified the transcription of all previously known functional genes, such as those related to oxidative phosphorylation. However, a similar extent of RNA expression was frequently observed in inter-genic regions where no previously annotated genes are located. The newly identified open reading frames (ORFs) predicted in these transcribed inter-genic regions were generally not conserved among flowering plant species, suggesting that these ORFs do not play a role in the principal functions of mitochondria. We also identified two partial fragments of retrotransposon sequences as being transcribed in rice mitochondria.
The present study indicated the previously unexpected complexity of plant mitochondrial RNA metabolism. Our transcriptomic data (Oryza sativa Mitochondrial rna Expression Server: OsMES) is publicly accessible at [http://bioinf.mind.meiji.ac.jp/cgi-bin/gbrowse/OsMes/#search].
The marked expansion of their genome size indicates that higher plant mitochondria have undergone dramatic evolution. Mitochondrial genomes are typically limited to approximately 16 kb in metazoans, whereas in higher plants they can span 200-2400 kb (Additional file 1). The principal role of mitochondria (i.e. oxidative phosphorylation) is undoubtedly shared between metazoans and higher plants. Indeed, two genes encoding subunits of ATP synthase (atp6 and atp8), three genes encoding subunits of cytochrome oxidase (cox1-cox3), cytochrome b (cob), and seven genes for NADH dehydrogenase (nad1-nad4, nad4L, nad5 and nad6) are conserved in the mitochondria of both kingdoms. Plant mitochondria additionally carry only a few more respiratory-related genes (including atp1, atp9, nad7 and nad9) and dozens of genes encoding ribosomal subunits (rps or rpl). A higher plant mitochondrion therefore usually encodes about 40 genes with known functions, whereas metazoan mitochondria in most cases encode 13 tightly conserved genes. The greater number of mitochondrial genes thus explains only a small proportion of the genome size increase in plant mitochondria.
A partial answer to the mysterious expansion of mitochondrial genome size was given by recent plant mitochondrial genome sequencing studies. According to NCBI (http://www.ncbi.nlm.nih.gov), mitochondrial genome sequencing of 14 Magnoliophyta species is currently complete. These include the dicot species Arabidopsis thaliana, Beta vulgaris [4, 5], Brassica napus and Nicotiana tabacum; and the monocots Oryza sativa [8–10], Zea mays [11, 12] and Triticum aestivum. These studies reported great variability in mitochondrial genome size and gene content among species, and showed that mitochondrial genomic structure can vary significantly even within a species. For example, different Z. mays lineages can carry mitochondrial genomes ranging from 536 to 740 kb [11, 12]. Much of the size difference between these genomes is due to different numbers of large genomic duplication events and apparent genomic recombination events between lineages. Comparison of a B. vulgaris cytoplasmic male sterile (CMS) strain with the non-CMS strain showed a complex rearrangement of sequence blocks [4, 5]. We recently sequenced two CMS strains of rice (O. sativa and O. rufipogon), and at least 12 genomic recombination events were necessary to explain the origin of their mitochondrial genomes relative to the reference genome.
To understand the nature of the large inter-genic regions, we used a 60-mer probe-tiling array to visualize the expression pattern of the entire rice mitochondrial genome. In calli, 48.5% of the genome could be regarded as being transcribed. By defining transcriptional units and their borders, we showed that 36.9% of the open reading frames (ORFs) present in inter-genic regions were transcribed without association with known mitochondrial housekeeping genes. We also identified two different partial fragments of transposable elements (TEs) that were being transcribed, suggesting unexpected complexity of transcriptional regulation in plant mitochondria.
The mitochondrial genome size of Nipponbare, the rice cultivar commonly used in molecular biology, is estimated at 490 520 bp. Tiling probes of 60 nucleotides were designed for 374 866 bp of non-redundant sequence after discarding large duplicated regions of > 10 000 bp. Adjacent probes overlapped by 58 nucleotides, meaning that the probes were tiled at 2-bp intervals across the 374 866-bp region. Mitochondrial RNA was prepared from calli or etiolated seedlings and hybridized against the tiling probes after reverse transcription with random primers and biotin labeling. For further information on the processing of tiling array output, such as expression value normalization, see Methods.
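The tiling layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than the design pipeline actually used: the function names tile_probes and reverse_complement and the toy sequence are hypothetical, and only the 60-nucleotide probe length and the 2-bp step are taken from the text.

```python
def tile_probes(sequence, probe_len=60, step=2):
    """Return (start, probe) pairs for probes tiled along one strand.

    start is a 0-based offset; adjacent probes overlap by
    probe_len - step bases (58 for a 60-mer tiled every 2 bp).
    """
    last_start = len(sequence) - probe_len
    return [(i, sequence[i:i + probe_len])
            for i in range(0, last_start + 1, step)]


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


# Toy 100-bp sequence standing in for the genome (hypothetical data).
toy = "ACGT" * 25
sense = tile_probes(toy)
antisense = [(start, reverse_complement(probe)) for start, probe in sense]

print(len(sense), "probes per strand,",
      sense[1][0] - sense[0][0], "bp between probe starts")
```

With a 2-bp step, each strand receives roughly half as many probes as there are bases in the target, so designing both sense and antisense probes gives a probe count close to the target length in base pairs.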
All regions carrying known mitochondrial genes were regarded as transcribed under the statistical criterion we used (Figure 1B). Almost all probes within the known genic regions gave p < 0.001; the criterion used to call expressed regions was therefore appropriate, at least in this respect.
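As an illustration of how a per-probe threshold such as p < 0.001 can be turned into transcribed regions and a genome-wide transcribed fraction, the sketch below merges overlapping significant probes into intervals. The merging rule, the function names and the toy values are assumptions made for this example; the text does not specify how transcriptional borders were set.

```python
def transcribed_regions(starts, p_values, probe_len=60, alpha=0.001):
    """Merge overlapping significant probes into half-open (start, end) regions.

    Assumption: a probe counts as transcribed when p < alpha, and probes
    that overlap or touch are merged into one region.
    """
    regions = []
    for start, p in sorted(zip(starts, p_values)):
        if p >= alpha:
            continue
        end = start + probe_len
        if regions and start <= regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(r) for r in regions]


def covered_fraction(regions, genome_len):
    """Fraction of the genome covered by the merged regions."""
    return sum(end - start for start, end in regions) / genome_len


# Hypothetical probe start positions and p-values.
starts = [0, 2, 4, 100, 102, 300]
p_values = [1e-5, 1e-4, 0.5, 1e-6, 1e-6, 0.2]
regions = transcribed_regions(starts, p_values)
print(regions, covered_fraction(regions, genome_len=400))
```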
In this study we used a rice mitochondrial genomic tiling array to look for hidden factors behind the expanded size of plant mitochondrial genomes. Depending on the tissue type, at most 48.5% of the mitochondrial genome was transcribed. The transcription frequency of iORFs can depend on tissue type (Figure 1). Several iORFs appear to be transcribed other than by coincidence, suggesting hidden roles in mitochondria. There are currently no further hints as to the functions of the iORFs conserved among species: no similar protein motif was identified in searches of the non-redundant protein database, so their protein functions would be difficult to infer even if they were translated. Alternatively, these transcripts may not encode proteins at all and could instead be transcribed as non-coding RNA (ncRNA) genes. This is supported by the fact that a few transcripts do not carry an ORF longer than 70 amino acids (Additional file 4). For example, ncRNA4 does not carry a reading frame longer than 42 codons. It is more plausible that such transcripts are ncRNAs than that they encode proteins. To date, no ncRNA-mediated gene regulation mechanism has been described in plant mitochondria. It is possible that the ncRNAs produced from these transcribed inter-genic regions are required for the regulation of other genic regions.
One of the most important findings of the present study was the detection of transcription from two partial TEs. TE fragments have been found in all plant mitochondrial genomes sequenced so far [6, 7, 9, 12, 13]; however, most were not considered functional, as they are present only as fragments. One possible explanation is the detection of contaminating nuclear RNA. However, the nucleotide identities of these two TEs (iORF_487 and iORF_502) with their respective nuclear sister copies were 59.3 and 63.2%, low enough for nuclear transcripts to be washed off during post-hybridization processing. Secondly, detection of TE fragments through nuclear RNA contamination seems unlikely given that 20 TE-like sequences are present in the mitochondrial genome and transcripts were not detected for 18 of them, even though they too have substantial numbers of nuclear sister copies. We also confirmed the RNA expression of iORF_487 and iORF_502 by RT-PCR analysis (Additional file 4). Lastly, as unrestricted transposition of TEs may cause deleterious genomic mutations, the majority of nuclear TEs would be suppressed by DNA methylation. Indeed, the best-matching sister copies of each mitochondrial TE fragment were not transcribed, according to the nuclear tiling array profile in the TIGR rice genome browser. We therefore consider false detection of a nuclear copy unlikely in this case.
Although these TE fragments (iORF_487 and iORF_502) are unlikely to possess transposing activity given the incomplete fragments of TE protein sequences they carry (Additional file 5), they are likely to be remnants of past TE integration into the mitochondrial genome. A recent study of RNA editing site loss in Silene spp. has shown evidence of integration of reverse-transcribed mRNA into the mitochondrial genome. It is possible that similar re-integration events occur for other mitochondrial RNAs. In the present study we showed that the mitochondrial RNA pool is a mixture of various transcripts from unexpected places within the genome. Re-integration of such transcripts might have contributed to the complexity of flowering plant mitochondria compared to those of other species. A good example is the chimeric structure of most CMS-related genes, which often encode a fused chimeric protein consisting of part of a respiratory chain subunit and an unknown sequence. The peptide region of unknown sequence may originate from inter-genic transcripts; such sequences may be quickly lost during evolution and be missing in extant species, while being retained as part of a CMS-related gene. This might be one explanation for why sequences of unknown origin are often found within CMS-related genes.
Approximately 50% of a typical plant mitochondrial genome does not show any similarity to other plant mitochondrial genomes, suggesting the involvement of frequent genomic recombination or horizontal transfer events in the enlargement of the plant mitochondrial genome. It is also possible that, because mitochondria alter their genomic arrangement quite frequently [5, 8, 11], nuclear-encoded RNA polymerases have not yet adapted to optimize the transcriptional efficiency of mitochondria. This could be why so many inter-genic regions that are seemingly unrelated to mitochondrial functions are transcribed by the residual activity of un-optimized RNA polymerases.
We conclude that the plant mitochondrion is enriched with unknown transcripts that are mapped on inter-genic regions. This is obviously related to the large genomic information size of plant mitochondria, and revealing the functions of these transcripts should improve our understanding of the expansion of the plant mitochondrial genome. Finally, the results presented here are publicly accessible as Oryza sativa Mitochondrial rna Expression Server (OsMES: http://bioinf.mind.meiji.ac.jp/cgi-bin/gbrowse/OsMes/#search).
Calli and seedling mitochondria were purified on sucrose gradients with an ultracentrifuge, as described by Tanaka et al. Three-week-old calli induced from seeds were transferred to fresh medium, and mitochondria were isolated after one week of incubation. Seedlings were grown in complete darkness for two weeks before mitochondrial preparation. Total mitochondrial RNA was extracted by a standard phenol/chloroform RNA isolation method.
Sixty-mer probes were designed for 374 866 bp of non-redundant rice mitochondrial genomic sequence, at 2-bp intervals. Thus, a total of 374 866 probes were designed across the sense and antisense directions. Probe synthesis and array construction were done at GeneFrontier (Tokyo, Japan). Cy3-labeled double-strand cDNA products of total mitochondrial RNA were obtained using a random primer. All of the hybridization and signal detection experiments were done at GeneFrontier. Custom Perl scripts were written to re-map the probes and hybridization signals onto the 490 520-bp rice mitochondrial genome (BA000029). The tiling array data are deposited in the Center for Information Biology Gene Expression database (CIBEX; http://cibex.nig.ac.jp) under accession no. CBX156.
Signal intensities of sense and antisense probes were treated as experimental replicates. Each tissue was normalized by quantile scaling using the limma library in R. Expression values were then converted into Z-scores. The expression levels of orf490 and orf181 were used as the background control, i.e. the reference for an untranscribed region. The 1124 probes spanning these regions were compared against every sliding window of consecutive 60-mer probes. The expression confidence of each sliding window (i.e. the p-value for a significant difference in means) was calculated using Student's t-test (two-tailed). Custom Perl scripts were written to perform these statistical analyses and to visualize the expression patterns as bar graphs.
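The comparison of each sliding window against the background probes can be sketched as follows. This is an illustrative Python version under several assumptions, not the authors' Perl code: the window size of 5 probes, the function names and the toy data are hypothetical, and the two-tailed p-value uses a normal approximation to the t distribution (reasonable only because the background set is large) so that the example needs nothing beyond the standard library.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev, variance


def z_scores(values):
    """Convert raw expression values to Z-scores."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def student_t(sample, background):
    """Pooled-variance (Student) t statistic for two independent samples."""
    n1, n2 = len(sample), len(background)
    pooled = ((n1 - 1) * variance(sample)
              + (n2 - 1) * variance(background)) / (n1 + n2 - 2)
    return (mean(sample) - mean(background)) / sqrt(pooled * (1 / n1 + 1 / n2))


def two_tailed_p(t):
    """Two-tailed p-value via a normal approximation (assumes large df)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def window_p_values(signal, background, window=5):
    """p-value of every sliding window of `window` consecutive probes."""
    return [two_tailed_p(student_t(signal[i:i + window], background))
            for i in range(len(signal) - window + 1)]


# Hypothetical data: 200 background probes, then 10 silent and 10 expressed probes.
raw_background = [((i * 7) % 11 - 5) / 10 for i in range(200)]
raw_signal = raw_background[:10] + [3.0 + b for b in raw_background[:10]]

scaled = z_scores(raw_background + raw_signal)
background, signal = scaled[:200], scaled[200:]
p_values = window_p_values(signal, background)
print(len(p_values), p_values[0] > 0.05, p_values[-1] < 0.001)
```

For an exact Student's t-test the normal approximation would be replaced by the t distribution with n1 + n2 - 2 degrees of freedom.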
The OsMES database was constructed on a Linux server (Fedora Core operating system) to provide information on transcribed regions and gene annotations in the rice mitochondrial genome. In OsMES, the transcribed regions and gene annotations are presented using the genome browser GBrowse version 1.7. For better accessibility and simplicity, 8245 non-overlapping probes were selected for display. The Generic Feature Format (GFF) file containing the transcribed regions predicted by the tiling array, the six-frame translation and the known genic regions in the mitochondrial genome was created using custom Perl scripts. The contents of the GFF and rice mitochondrial genome sequence files were imported into MySQL database tables.
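As a rough illustration of what such a GFF file contains, the sketch below formats features as nine tab-separated columns. The column layout follows GFF version 3, which is an assumption because the text does not state which GFF version was used; the function name, source label, feature types, coordinates and attribute values are all hypothetical, and only the sequence accession BA000029 comes from the text.

```python
def gff_line(seqid, source, feature_type, start, end,
             score=".", strand=".", phase=".", **attributes):
    """Format one feature as a nine-column, tab-separated GFF3-style line.

    Coordinates are 1-based and inclusive, as in GFF.
    """
    attrs = ";".join(f"{key}={value}" for key, value in attributes.items()) or "."
    fields = [seqid, source, feature_type, start, end,
              score, strand, phase, attrs]
    return "\t".join(str(field) for field in fields)


# Hypothetical features on the rice mitochondrial genome (BA000029).
lines = [
    gff_line("BA000029", "tiling_array", "transcribed_region",
             1200, 1850, score=0.0004, strand="+", ID="region_1"),
    gff_line("BA000029", "getorf", "ORF",
             1300, 1600, strand="+", phase=0, ID="orf_example"),
]
print("\n".join(lines))
```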
Six-frame translation was performed using the getorf program implemented in the EMBOSS v6.0.0 package. Orthologs in 14 plant species, Arabidopsis thaliana (NC_001284), Brassica napus (NC_008285), Carica papaya (NC_012116), Cucurbita pepo (NC_014050), Citrullus lanatus (NC_014043), Nicotiana tabacum (NC_006581), Beta vulgaris (NC_002511), Vitis vinifera (NC_012119), Zea mays (NC_007982), Tripsacum dactyloides (NC_008362), Sorghum bicolor (NC_008360), Triticum aestivum (NC_007579), Cycas taitungensis (NC_010303) and Physcomitrella patens (NC_007945), were identified by BLASTp against a six-frame translation BLAST library created for each species. ORFs in these libraries had to be ≥ 70 amino acids in size, as the shortest mitochondrial protein known in rice is orf79, with 79 amino acids.
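The idea of a six-frame ORF scan with a 70-amino-acid minimum can be sketched in plain Python. This is a simplified stand-in for getorf, not a reimplementation of it: it treats an ORF as any stop-free stretch of codons (an assumption about the settings used), uses the stop codons of the standard genetic code (also an assumption), and reports 0-based coordinates on the scanned strand. All function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def orfs_in_frame(seq, frame, min_aa):
    """Yield half-open (start, end) spans of stop-free stretches in one frame."""
    orf_start = frame
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOPS:
            if (pos - orf_start) // 3 >= min_aa:
                yield orf_start, pos
            orf_start = pos + 3
    # Stretch left open at the end of the sequence.
    end = frame + ((len(seq) - frame) // 3) * 3
    if (end - orf_start) // 3 >= min_aa:
        yield orf_start, end


def six_frame_orfs(seq, min_aa=70):
    """Yield (strand, frame, start, end); coordinates are on the scanned strand."""
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(strand_seq, frame, min_aa):
                yield strand, frame, start, end


# Toy sequence: an 81-codon stop-free stretch followed by a stop codon.
toy = "ATG" + "GCT" * 80 + "TAA" + "CC"
for orf in six_frame_orfs(toy):
    print(orf)
```

Because no start codon is required in this sketch, stretches in other frames and on the reverse strand are also reported whenever they happen to contain no stop codon.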
Mitochondria and mitochondrial RNA were isolated from four-week-old etiolated seedlings and calli (Nipponbare) as described by Tanaka et al. and Zeltz et al., respectively. A 3-μg sample of mitochondrial RNA was subjected to northern blot analysis as previously described [25, 26]. Digoxigenin (Roche)-labeled probes were obtained by PCR using the following primers: nad4, 5'-CCAATATGAGTTTACCCGGC-3' and 5'-GCCATGTTGCACTAAGTTAC-3'; orf181, 5'-ACCAGACTACATGCCAAGAC-3' and 5'-GCTAAAATAGATGCCAACCGCA-3'; and orf490, 5'-AGATGATCGCAAGTCCACTG-3' and 5'-TTAACCTCCACAATGGAGGC-3'. Signal detection was carried out using LAS-4000 Mini (Fuji Film, Tokyo).
Primers used for RT-PCR analysis are listed in Additional file 6.
CMS: cytoplasmic male sterile
GFF: Generic Feature Format
LTR: long terminal repeat
ORF: open reading frame
OsMES: Oryza sativa Mitochondrial rna Expression Server
This study was supported by a Grant-in-Aid for Special Research on Priority Areas (No. 18075002) and a Grant-in-Aid for Basic Research (No. 23658002) from the Ministry of Education, Science, Sports and Culture, Japan to KT, and Grant-in-Aid for Special Research on Priority Areas (Nos. 19043015 and 21024010) to KY. TT is the recipient of Research Fellowships of the Japan Society for the Promotion of Science for Young Scientists.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.