Transfer RNA gene arrangement and codon usage in vertebrate mitochondrial genomes: a new insight into gene order conservation
- Takashi P Satoh†1,
- Yukuto Sato†2,
- Naoharu Masuyama3, 4,
- Masaki Miya5 and
- Mutsumi Nishida3
© Satoh et al; licensee BioMed Central Ltd. 2010
Received: 14 October 2009
Accepted: 19 August 2010
Published: 19 August 2010
Mitochondrial (mt) gene arrangement has been highly conserved among vertebrates from jawless fishes to mammals for more than 500 million years. It remains unclear, however, whether such long-term persistence is a consequence of some constraints on the gene order.
Based on the analysis of codon usage and tRNA gene positions, we suggest that the tRNA gene order of typical vertebrate mt-genomes may be important for their translational efficiency. The vertebrate mt-genome encodes 2 rRNAs, 22 tRNAs, and 13 transmembrane proteins consisting mainly of hydrophobic domains. We found that the tRNA genes specifying the hydrophobic residues were positioned close to the control region (CR), where the transcription efficiency is estimated to be relatively high. Using 47 vertebrate mt-genome sequences representing jawless fishes to mammals, we further found a correlation between codon usage and tRNA gene positions, implying that highly-used tRNA genes are located close to the CR. In addition, an analysis considering the asymmetric nature of mtDNA replication suggested that the tRNA loci that remain single-stranded for a longer time tend to have more guanine and thymine, which do not suffer deamination mutations, in their anticodon sites.
Our analyses imply the existence of translational constraint acting on the vertebrate mt-gene arrangement. Such translational constraint, together with the deamination-related constraint, may have contributed to long-term maintenance of gene order.
The animal mitochondrial (mt)-genome generally encodes 13 protein, 2 rRNA, and 22 tRNA genes. Although their arrangement is rather variable among invertebrate mt-genomes, a typical gene arrangement has been highly conserved among vertebrate mt-genomes from jawless fishes to mammals, with some exceptions [1, 2]. This implies an extremely long-term persistence of mt-gene order, probably for > 500 million years, across diverse clades of vertebrates. However, it has been unclear whether such high conservation of gene order is a consequence of some constraints, or whether it results only from shared common ancestry. This has been a long-standing enigma for more than 20 years since the initial reports of the whole mt-genome sequence of vertebrates.
To address this problem, we analyzed codon usage and tRNA gene arrangements of the vertebrate mt-genomes to examine possible constraints on their gene order.
Results and Discussion
Amino acid usage and tRNA gene arrangement
Comparison of the positions of mitochondrial (mt) tRNA genes corresponding to hydrophobic and hydrophilic amino acids based on human mt-genome data
Such tRNA gene localization may indicate that, in the typical vertebrate mt-genomes, highly-used tRNA genes are located in the genomic region close to the CR, where the transcription efficiency is thought to be relatively high. The transcription of the vertebrate mt-genome is initiated from regulatory elements within the CR [6, 7], and thus, the complete transcription of the genes into mRNA and functional RNAs would be more successful in the genomic region closer to the CR. In fact, the two rRNA genes immediately adjacent to the CR (12S and 16S; see Fig. 2) are highly expressed. Likewise, the tRNA genes localized close to the CR, which specify hydrophobic residues, would also be highly expressed. Such an efficient production of the highly-used tRNAs may be favorable for translation of vertebrate mt-genomes.
Correlation between codon usage and tRNA gene position
Table 2. List of the species studied and the DDBJ/EMBL/GenBank accession numbers of their mitochondrial genome sequences (table body not recovered; its two groups are "evolutionarily stable gene orders" and "rearranged gene orders within lower taxa")
In the correlation analysis, we aimed to eliminate the effects of shared common ancestry based on independent contrast analysis of the codon usage and tRNA positions. Independent contrasts for these two variables were estimated using the program CAIC based on a composite tree of the sampled species that was constructed from recent molecular phylogenies for the major clades of vertebrates (supplementary Fig. S2 [see Additional file 1]). This method focuses on differences only between sister lineages or nodes in a phylogeny, which have arisen after a split, therefore yielding sets of independent "contrasts". Our data points were derived from ancestral states (namely, various nodes across the tree) based on independent contrasts of tRNA genes (listed in supplementary Table S1, Table S2, and Table S3 [see Additional file 1]). Thus, many data points were above the level of major clades such as mammals, birds, reptiles, amphibians, actinopterygians, chondrichthyans, and agnathans, and the total number of data points became much smaller than the number of pair-wise comparisons.
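The idea of independent contrasts can be illustrated with a minimal sketch of Felsenstein's standardized contrasts on a strictly bifurcating tree. This is not the CAIC program itself, whose implementation details may differ; the tree encoding, the function name `contrasts`, the tip names, and all trait values below are hypothetical and chosen only for illustration.

```python
import math

def contrasts(node, trait):
    """Return (value, branch_length, contrast_list) for a subtree.

    Assumed encoding (hypothetical): a tip is ("name", length); an internal
    node is ((left, right), length). The tree must be strictly bifurcating.
    """
    label, length = node
    if isinstance(label, str):
        # Tip: observed trait value, no contrasts yet.
        return trait[label], length, []
    left, right = label
    x1, v1, c1 = contrasts(left, trait)
    x2, v2, c2 = contrasts(right, trait)
    # Difference between sister lineages, standardized by its expected spread.
    standardized = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral value: average weighted by the inverse of branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch below the node to reflect uncertainty in `value`.
    extended = length + v1 * v2 / (v1 + v2)
    return value, extended, c1 + c2 + [standardized]

# Toy three-species tree ((A:1, B:1):1, C:2) with made-up log-transformed values.
ab = ((("A", 1.0), ("B", 1.0)), 1.0)
root = ((ab, ("C", 2.0)), 0.0)
_, _, toy_contrasts = contrasts(root, {"A": 2.0, "B": 4.0, "C": 9.0})
print(toy_contrasts)
```

A tree with n tips yields n - 1 contrasts, which is why the number of data points is much smaller than the number of pair-wise species comparisons.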
The significance of the above correlations, however, may be due to the larger number of degrees of freedom generated by multiple-species comparisons on multiple tRNA genes, although we sought to eliminate the effect of shared common ancestry as described above. To limit this effect and to corroborate our results, we took two other approaches. First, we averaged codon usage and distance from the CR across taxa for each of the 22 tRNAs, yielding 22 independent data points for 22 non-homologous tRNA genes. This set of data might reflect an ancient adaptation between codon usage and tRNA positions of an original vertebrate mt-genome. Second, we analyzed variations of codon usage and distance from the CR of each of the 22 tRNA genes across taxa, specifically focusing on the rearranged gene orders within lower taxa. The second analysis might detect a recent adaptation between codon usage and tRNA positions in the taxonomic groups concerned (lower than an order level; for details, see supplementary Fig. S2 [see Additional file 1]).
The second approach, a meta-analysis of the results for the respective 22 tRNA gene sets, showed that the rearranged gene orders within lower taxa had little effect on the preexisting correlation between codon usage and tRNA positions (overall weighted Fisher's r = 0.0722, Stouffer's combined p = 0.3128; the correlations for the respective 22 tRNAs are shown in supplementary Fig. S3 and Fig. S4 [see Additional file 1]). This result can be explained by the elimination of novel gene arrangements that deviate from the relationship. In fact, among 13 phylogenetically independent cases of gene-order rearrangement (supplementary Fig. S5 [see Additional file 1]), three cases (deep-sea eels, frogs, and Tuatara) showed an improved correlation between tRNA position and codon usage compared to the evolutionarily stable gene orders; the majority of the cases showed no improvement in the relationship (detailed data are shown in supplementary Table S4 [see Additional file 1]). This implies that quite a few gene order rearrangements without improvement have persisted for some periods of evolutionary time. It should be noted, however, that the correlation between tRNA position and codon usage was significant (or marginally significant) only for the above three cases. Their novel gene orders might be maintained through some form of natural selection.
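A minimal sketch of how per-tRNA results could be combined in such a meta-analysis is given below. The text does not state the weighting scheme, so this sketch assumes the common choices: Fisher's z-transform with weights n - 3 for the combined r, and an unweighted Stouffer's method on one-tailed p-values. The function names and the input values are hypothetical.

```python
import math
from statistics import NormalDist

def weighted_fisher_r(rs, ns):
    """Combine correlation coefficients via Fisher's z, weighting each by n - 3
    (an assumed weighting; each sample size must exceed 3)."""
    weights = [n - 3 for n in ns]
    zs = [math.atanh(r) for r in rs]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)

def stouffer_p(pvalues):
    """Combine one-tailed p-values with the unweighted Stouffer method.
    Each p must lie strictly between 0 and 1."""
    normal = NormalDist()
    zs = [normal.inv_cdf(1.0 - p) for p in pvalues]
    z_combined = sum(zs) / math.sqrt(len(zs))
    return 1.0 - normal.cdf(z_combined)

# Made-up per-tRNA results, for illustration only.
toy_r = [0.10, -0.05, 0.20]
toy_n = [12, 14, 10]
toy_p = [0.38, 0.57, 0.29]
print(weighted_fisher_r(toy_r, toy_n), stouffer_p(toy_p))
```

With this construction, a set of individually weak correlations pointing in mixed directions combines into an overall r near zero and a non-significant p.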
On the basis of these sets of analyses, we propose that the tRNA gene arrangement of vertebrate mt-genomes, and possibly that of an ancestral, original vertebrate mt-genome, may be adaptive with regard to translational efficiency. The genes close to the CR, where the transcription initiation sites of both strands exist, appear to be highly expressed in the vertebrate mt-genomes. Consequently, genes of tRNAs specifying highly-used codons would be favorably located close to the CR to ensure the efficient translation of the protein-coding genes in the vertebrate mt-genomes.
Mitochondrial gene arrangement and translational constraint
On the basis of the results obtained from our analyses, we suggest the existence of translational constraint on the positions of mt-tRNA genes, but not on their gene copy numbers, in the vertebrate mt-genomes, although the constraint may be weak. In nuclear genomes, translational selection is known to promote adaptation of tRNA gene number to the usage of the corresponding codon [12, 13]. Clear association of tRNA gene number with codon usage has been observed in the genomes of various organisms ranging from E. coli to humans [14–19]. The vertebrate mt-genome is also likely exposed to translational selection because vertebrates are considered to be metabolically active and have high rates of ATP synthesis. However, translational selection would not act at the level of tRNA gene numbers in the vertebrate mt-genome since it is extremely compact and the number of contained genes is limited.
Recent studies suggest that replication and translational constraints have affected the positions of transcription and translation genes such as RNA polymerase, rRNA, and tRNA genes in bacterial genomes, and of abundant and broadly expressed genes in the human genome. Such constraints associated with translation and gene expression may also have limited the gene order rearrangement of vertebrate mt-genomes, specifically the rearrangements that interfere with the transcriptional efficiency of mt-tRNAs (see Fig. 3B, Fig. 4B, and supplementary Fig. S3 and Fig. S4 [see Additional file 1]). This constraint may have driven the conservation of the mt-gene arrangement among vertebrates from jawless fishes to mammals for more than 500 million years.
Gene-order rearrangements are often found in vertebrate mt-genomes within lower taxonomic categories such as families, genera, and species (166/769 = 21.6% of species); however, there are no extensive rearrangements shared across higher taxa, that is, rearrangements likely to have persisted for long evolutionary periods. This observation further implies the existence of constraint on vertebrate mt-gene orders, possibly through translational efficiency as discussed above. Exceptionally, the mt-genomes of birds and lampreys deviate slightly from the typical gene order: in either birds or lampreys, the positions of the tRNA-Glu, tRNA-Thr, and tRNA-Pro genes are changed (see supplementary Fig. S1 [see Additional file 1]). These non-typical gene orders, however, interfere little with the correlation reported above (when birds and lampreys were excluded: r = -0.1499, one-tailed p = 0.0071, n = 267). This also implies that the deviations of the tRNA-Leu (CUN) and tRNA-Thr from the supported relationship (see Fig. 3) do not arise from the rearrangements of birds and lampreys.
The present analysis considering the asymmetric nature of mtDNA replication provided results that match the above prediction (Fig. 5). The tRNA loci that are expected to be exposed as a single strand for a longer time tend to have more guanine (G) and thymine (T) in their anticodon region on the coding strands (Fig. 5A: tRNA loci located between the OL and OH along the direction of L-strand replication: r = 0.1750, one-tailed p = 0.2838, n = 13; Fig. 5B: all tRNA genes: r = 0.4281, one-tailed p = 0.0234, n = 22). This suggests that the tRNA gene arrangement of the typical vertebrate mt-genome is also adaptive in avoiding mutations in anticodons through deamination during replication and, possibly, during transcription. Judging from the correlation coefficients, the deamination-related constraint may be stronger than the codon/amino-acid usage-related constraint discussed above.
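The quantity on the nucleotide side of this comparison, the number of G and T bases in an anticodon, can be tallied as in the following sketch. The function name `gt_count` and the example triplets are illustrative assumptions; the estimate of single-strand exposure time for each locus is not reproduced here because the text does not give its formula.

```python
def gt_count(anticodon: str) -> int:
    """Count guanine (G) and thymine (T) bases in an anticodon given as DNA
    on the coding strand; case-insensitive."""
    return sum(base in "GT" for base in anticodon.upper())

# Illustrative anticodon triplets written as DNA (hypothetical inputs).
for triplet in ("TAG", "GAA", "tgc"):
    print(triplet, gt_count(triplet))
```

Each count would then be paired with the expected single-strand exposure of the corresponding tRNA locus before computing the correlation.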
In this paper, we propose that the high conservation of the gene arrangement of the vertebrate mt-genome is underpinned not only by a shared common ancestry, but also by translational constraint acting on the tRNA gene arrangement. This conclusion can be derived from the simple observation that the mt-tRNA genes corresponding to hydrophobic amino acids, which are frequently used in translation of the mt-genes, are localized close to the CR. In addition, an analysis considering the asymmetric nature of mtDNA replication suggested that deamination-related constraint against mutations in tRNA anticodons is also an important determinant of the tRNA gene arrangement in the typical vertebrate mt-genome. The translational constraint together with the deamination-related constraint may have contributed to shaping and maintaining the typical gene order of the vertebrate mt-genomes.
Taxonomic sampling of mitochondrial genome data
To consider variation in codon usage and gene arrangement across typical vertebrate mt-genomes, we chose five species from each of mammals, birds, reptiles, amphibians, actinopterygians, and chondrichthyans, and three species from agnathans, for which only a few mt-genome sequences were available in databases. Those 33 mt-genomes are defined as "evolutionarily stable" gene orders. In addition, to include mt-genomes that have rearranged gene orders within lower taxa, we chose eight species from actinopterygians and three species each from reptiles and amphibians. Those 14 mt-genomes are defined as "rearranged gene orders within lower taxa". Species names and GenBank accession numbers of the mt-genomes are listed in Table 2; these species were selected to represent a broad niche breadth. The invertebrates could not be analyzed in this study, because the transcription system of the mt-genome and a sound phylogenetic framework remain unclear for most of them.
Measuring codon usage and the position of each tRNA gene
The usage of each codon was counted in the sequences of the 13 protein-coding genes (ND1, ND2, ND3, ND4, ND4L, ND5, ND6, CO I, CO II, CO III, ATPase6, ATPase8, and Cyt b) of the mt-genomes examined. The overlapping codons between ATPase8 and ATPase6, and between ND4L and ND4, were counted once for each gene, because the open reading frames differ between these neighboring genes. To measure the position of each tRNA gene, base-pair distances from the 3' end of the CR to the 5' end of each tRNA gene were counted in their respective positions on the H- and L-strands of the mt-genome sequences. Although the accurate locations of the transcription start sites of the mt-genome are unknown in most vertebrate species, it is assumed that the transcription start sites for the heavy and light strands may differ in distance from the 3' end of the CR on the respective strand. Therefore, we examined whether such supposed differences affect the analysis in this study, and we found that hypothetical differences of ± 150 bp and ± 500 bp in distance, which are based on a reference, do not affect the significance of the results of the Mann-Whitney U-test and correlation analyses shown in the Results. Thus, we considered that measuring the positions of tRNA genes based on their base-pair distances from the 3' end of the CR is justified.
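The two measurements described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: the function names, the toy sequences, and the coordinates are hypothetical, and the distance helper assumes that both coordinates are already expressed along the direction of transcription of the strand that carries the tRNA gene.

```python
from collections import Counter

def count_codons(cds: str) -> Counter:
    """Count in-frame codons of one protein-coding sequence (sense strand)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # skip an incomplete trailing codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def total_codon_usage(genes: dict) -> Counter:
    """Sum codon counts over all genes. Each gene is read in its own frame,
    so bases in an overlap contribute once to each overlapping gene."""
    total = Counter()
    for sequence in genes.values():
        total += count_codons(sequence)
    return total

def distance_from_cr(cr_end: int, trna_start: int, genome_length: int) -> int:
    """Base-pair distance from the 3' end of the CR to the 5' end of a tRNA
    gene on a circular genome (coordinates assumed to run along the
    transcription direction of the gene's strand)."""
    return (trna_start - cr_end) % genome_length

# Toy input: two short made-up "genes" and made-up coordinates.
usage = total_codon_usage({"geneA": "ATGCTAATG", "geneB": "CTACTAT"})
print(usage["CTA"], distance_from_cr(16000, 500, 16500))
```

The modulo in `distance_from_cr` handles tRNA genes whose coordinate wraps past the origin of the circular sequence.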
Regression analysis considering the effects of shared common ancestry
To examine whether the frequency of usage of each codon varies with the position of its corresponding tRNA gene (the base-pair distance from the CR), we calculated Pearson's correlation coefficient (r) and Spearman's rank-correlation coefficient (rs), and evaluated the significance of the relationship parametrically and non-parametrically, respectively. To account for the effect of shared common ancestry, "independent contrasts" for these two variables were estimated using the program CAIC based on a composite tree of the sampled species (supplementary Fig. S2 [see Additional file 1]). The typical mt-gene order has predominated and persisted in most of the major vertebrate lineages for more than 500 million years; however, local gene order rearrangements and codon usage variation have been observed and described in vertebrates [1, 2, 28]. By considering such potential changeability of mt-gene order and codon usage, we regarded the data points obtained from independent contrast analysis as virtually independent of each other, although all vertebrate mt-genomes share a common ancestor. The analysis using the program CAIC was performed using logarithmically transformed data to focus on the proportional change in the variables. The validity of this approach is discussed in the CAIC User's Guide.
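The two coefficients named above can be computed as in the following self-contained sketch. It only illustrates the coefficients on log-transformed toy values; in the analysis described here they are applied to independent contrasts rather than to raw species values. All function names and numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up distances from the CR (bp) and codon counts, log-transformed.
log_distance = [math.log(d) for d in (100, 1000, 5000, 12000)]
log_usage = [math.log(u) for u in (300, 250, 120, 60)]
print(pearson_r(log_distance, log_usage), spearman_rs(log_distance, log_usage))
```

A negative coefficient in this toy setting corresponds to codons being used more often when the matching tRNA gene lies closer to the CR.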
We thank our colleagues at the Atmosphere and Ocean Research Institute of the University of Tokyo for helpful discussions. This manuscript has greatly benefited from the constructive and helpful comments of two reviewers. The final version of the manuscript was carefully read by Dr. Christopher Loretz, to whom we are grateful. This study was partially supported by Grants-in-Aid from the Japan Society for the Promotion of Science to MN, MM, and YS, the NF-Hadal Environmental Science Education Program from the Nippon Foundation to TPS, the Sasakawa Scientific Research Grant from The Japan Science Society to YS, and NIG (National Institute of Genetics, Japan) postdoctoral fellowship to YS.
- Boore JL: Animal mitochondrial genomes. Nucleic Acids Res. 1999, 27: 1767-1780. 10.1093/nar/27.8.1767.
- Inoue JG, Miya M, Tsukamoto K, Nishida M: Evolution of the deep-sea gulper eel mitochondrial genomes: large-scale gene rearrangements originated within the eels. Mol Biol Evol. 2003, 20: 1917-1924. 10.1093/molbev/msg206.
- Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG: Sequence and organization of the human mitochondrial genome. Nature. 1981, 290: 457-465. 10.1038/290457a0.
- Adachi J, Hasegawa M: Model of amino acid substitution in proteins encoded by mitochondrial DNA. J Mol Evol. 1996, 42: 459-468. 10.1007/BF02498640.
- Jones DT, Taylor WR, Thornton JM: The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992, 8: 275-282.
- Chang DD, Clayton DA: Identification of primary transcriptional start sites of mouse mitochondrial DNA: accurate in vitro initiation of both heavy- and light-strand transcripts. Mol Cell Biol. 1986, 6: 1446-1453.
- Ojala D, Montoya J, Attardi G: tRNA punctuation model of RNA processing in human mitochondria. Nature. 1981, 290: 470-474. 10.1038/290470a0.
- Christianson TW, Clayton DA: A tridecamer DNA sequence supports human mitochondrial RNA 3'-end formation in vitro. Mol Cell Biol. 1988, 8: 4502-4509.
- Harvey PH, Pagel MD: The Comparative Method in Evolutionary Biology. 1991, Oxford: Oxford University Press
- Felsenstein J: Phylogenies and the comparative method. Am Nat. 1985, 125: 1-15. 10.1086/284325.
- Purvis A, Rambaut A: Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Comput Appl Biosci. 1995, 11: 247-251.
- Akashi H: Gene expression and molecular evolution. Curr Opin Genet Dev. 2001, 11: 660-666. 10.1016/S0959-437X(00)00250-1.
- Akashi H: Translational selection and yeast proteome evolution. Genetics. 2003, 164: 1291-1303.
- Ikemura T: Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J Mol Biol. 1981, 146: 1-21. 10.1016/0022-2836(81)90363-6.
- Yamao F, Andachi Y, Muto A, Ikemura T, Osawa S: Levels of tRNAs in bacterial cells as affected by amino acid usage in proteins. Nucleic Acids Res. 1991, 19: 6119-6122. 10.1093/nar/19.22.6119.
- Moriyama EN, Powell JR: Codon usage bias and tRNA abundance in Drosophila. J Mol Evol. 1997, 45: 514-523. 10.1007/PL00006256.
- Percudani R, Pavesi A, Ottonello S: Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae. J Mol Biol. 1997, 268: 322-330. 10.1006/jmbi.1997.0942.
- Duret L: tRNA gene number and codon usage in the C. elegans genome are co-adapted for optimal translation of highly expressed genes. Trends Genet. 2000, 16: 287-289. 10.1016/S0168-9525(00)02041-2.
- Kotlar D, Lavner Y: The action of selection on codon bias in the human genome is related to frequency, complexity, and chronology of amino acids. BMC Genomics. 2006, 7: 67. 10.1186/1471-2164-7-67.
- Couturier E, Rocha EP: Replication-associated gene dosage effects shape the genomes of fast-growing bacteria but only for transcription and translation genes. Mol Microbiol. 2006, 59: 1506-1518. 10.1111/j.1365-2958.2006.05046.x.
- Huvet M, Nicolay S, Touchon M, Audit B, d'Aubenton-Carafa Y, Arneodo A, Thermes C: Human gene organization driven by the coordination of replication and transcription. Genome Res. 2007, 17: 1278-1285. 10.1101/gr.6533407.
- NCBI Organelle Genome Resources Website. [http://www.ncbi.nlm.nih.gov/genomes/OrganelleResource.cgi?opt=organelle&taxid=33208]
- Seligmann H, Krishnan NM, Rao BJ: Mitochondrial tRNA sequences as unusual replication origins: pathogenic implications for Homo sapiens. J Theor Biol. 2006, 243: 375-385. 10.1016/j.jtbi.2006.06.028.
- Lynch M: The Origins of Genome Architecture. 2007, Sunderland, MA: Sinauer
- Shadel GS, Clayton DA: Mitochondrial DNA maintenance in vertebrates. Annu Rev Biochem. 1997, 66: 409-435. 10.1146/annurev.biochem.66.1.409.
- Brown TA, Cecconi C, Tkachuk AN, Bustamante C, Clayton DA: Replication of mitochondrial DNA occurs by strand displacement with alternative light-strand origins, not via a strand-coupled mechanism. Genes Dev. 2005, 19: 2466-2476. 10.1101/gad.1352105.
- Saccone C, Pesole G, Sbisá E: The main regulatory region of mammalian mitochondrial DNA: structure-function model and evolutionary pattern. J Mol Evol. 1991, 33: 83-91. 10.1007/BF02100199.
- Xia X: Mutation and selection on the anticodon of tRNA genes in vertebrate mitochondrial genomes. Gene. 2005, 345: 13-20. 10.1016/j.gene.2004.11.019.
- The CAIC User's Guide. [http://www.bio.ic.ac.uk/evolve/software/caic/index.html]
- Sayle R, Milner-White EJ: RASMOL: biomolecular graphics for all. Trends Biochem Sci. 1995, 20: 374-376. 10.1016/S0968-0004(00)89080-5.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.