Analysis of functional variants in mitochondrial DNA of Finnish athletes

Background: We have previously reported a paucity of mitochondrial DNA (mtDNA) haplogroups J and K among Finnish endurance athletes. Here we aimed to further explore differences in mtDNA variants between elite endurance and sprint athletes. For this purpose, we determined the rate of functional variants and the mutational load in mtDNA of Finnish athletes (n = 141) and controls (n = 77) and determined the sequence variation in haplogroups. Results: The distribution of rare and common functional variants differed between endurance athletes, sprint athletes and the controls (p = 0.04) so that rare variants occurred at a higher frequency among endurance athletes. Furthermore, the ratio between rare and common functional variants in haplogroups J and K was 0.42 of that in the remaining haplogroups (p = 0.0005). The subjects with haplogroups J and K also showed a higher mean level of nonsynonymous mutational load attributed to common variants than subjects with the other haplogroups. Interestingly, two of the rare variants detected in the sprint athletes were the disease-causing mutations m.3243A > G in MT-TL1 and m.1555A > G in MT-RNR1. Conclusions: We propose that endurance athletes harbor an excess of rare mtDNA variants that may be beneficial for oxidative phosphorylation, while sprint athletes may tolerate deleterious mtDNA variants that have a detrimental effect on the oxidative phosphorylation system. Some of the nonsynonymous mutations defining haplogroups J and K may produce an uncoupling effect on oxidative phosphorylation, thus favoring sprint rather than endurance performance.


Background
Prolonged muscle activity in aerobic endurance performance requires a sustained supply of energy that is provided in the form of adenosine triphosphate (ATP) [1]. Most ATP is produced by oxidative phosphorylation (OXPHOS), where the transfer of electrons through four enzyme complexes (I-IV) and two electron carriers leads to the formation of a proton gradient across the inner mitochondrial membrane. The gradient is then employed by complex V, ATP synthase, to generate ATP [2]. Short, high-intensity efforts, such as those in sprint/power sports or in team sports, rely more on anaerobic glycolysis than on OXPHOS.
The subunits of OXPHOS complexes are encoded in part by mitochondrial DNA (mtDNA) that harbors genes for 13 subunits as well as 22 tRNAs and two rRNAs [3]. Maternal inheritance, high mutation rate and lack of recombination have led mutations to accumulate sequentially in mtDNA lineages during population history. The ensuing groups of related haplotypes are continent-specific, e.g. Europeans harbor haplogroups H, V, U, K, T, J, W, I and X [4]. We have previously found that the frequencies of mtDNA haplogroups J and K are higher in Finnish sprinters than in Finnish endurance athletes and that none of the endurance athletes harbored haplogroup K or subhaplogroup J2 [5]. Such results prompted us to suggest that these mtDNA lineages could be "uncoupling genomes". In mitochondrial uncoupling, electron transport is uncoupled from energy production so that heat is generated instead of ATP [6]. Hence, an "uncoupling genome" would be detrimental for endurance athletic performance. Consistent with our findings, Polish male endurance athletes harbor haplogroup K less frequently than the controls [7], and Iranian athletes representing power events or team sports have a higher frequency of haplogroup J than the controls [8]. Indeed, it has been shown that men with haplogroup J have lower maximal oxygen consumption than men with non-J haplogroups [9]. Together, these findings suggest that haplogroup J as a whole, rather than just subhaplogroup J2, and haplogroup K are candidates for being "uncoupling genomes".
Most of the variants in mtDNA do not affect mitochondrial function. Unlike such neutral variants, nonneutral variants may have functional consequences and their effect on mitochondrial metabolism may be deleterious, mildly deleterious or beneficial [10]. Deleterious mutations cause OXPHOS defects and a decline in ATP production and lead to variable disease phenotypes [11]. Combinations of mildly deleterious mtDNA mutations may confer a risk for complex diseases and phenotypes [12,13]. In addition, beneficial nonneutral variants may become enriched in the population by adaptive selection [4]. Beneficial variants could affect elite athletic performance by increasing OXPHOS coupling efficiency and possibly explain why certain mitochondrial lineages may be more favorable for endurance athletes than others.
Here we have analyzed entire mtDNA sequences from 141 Finnish elite athletes in order to study whether the frequency of functional variants or the mutational load differs between the athletes and controls. In addition, the complete sequences enabled us to search for possible uncoupling variants within haplogroups J and K.

Results
We determined complete mtDNA sequences of 141 Finnish athletes. These sequences and 77 sequences from control subjects were then used to generate a comprehensive phylogeny of 218 Finnish mtDNAs (Additional file 1: Figure S1). The athletes harbored 604 functional variants (rare variants, 28%) and the controls harbored 323 functional variants (rare variants, 23%). Altogether, there were 103 different rare variants including 65 nonsynonymous, 12 tRNA and 26 rRNA variants (Additional files 2, 3 and 4: Tables S1, S2 and S3). Quite strikingly, among the sprint athletes one of the rare variants was the pathogenic m.3243A > G mutation in MT-TL1 and one was the pathogenic m.1555A > G mutation in MT-RNR1. The m.3243A > G mutation was heteroplasmic at a rate of 43% and the m.1555A > G mutation was homoplasmic.
The distribution of rare functional variants and common functional variants differed between endurance and sprint athletes and the controls (p = 0.04, χ² test). The difference appeared to be due to a higher number of rare functional variants among endurance athletes (Table 1).
Mutational load of nonsynonymous variants and rare nonsynonymous variants did not differ between the groups (Additional file 5: Table S4).
We have previously shown that haplogroups J and K are infrequent among Finnish endurance athletes compared to sprinters or the control population [5]. Here we determined whether these haplogroups differ in sequence variation from the remaining mtDNA haplogroups among 218 Finnish subjects consisting of athletes and controls. Analysis revealed that the ratio between rare functional variants and common functional variants in haplogroups J and K was 0.42 of that in the remaining haplogroups (p = 0.0005, χ² test). In line with this, common nonsynonymous variants were more frequent in haplogroups J and K than in the remaining haplogroups (Table 2). The subjects with haplogroups J and K also showed a higher mean level of nonsynonymous mutational load attributed to common variants than subjects with the other haplogroups, while the mutational load attributed to rare nonsynonymous variation was similar between haplogroups J and K and the remaining haplogroups (Additional file 6: Table S5).
Table footnote: Values are means ± standard deviations. 1 Mann-Whitney test. Samples from all three groups were pooled in order to maximize the number of sequences. Use of Finnish sequences was preferred over the use of sequences from GenBank in order to avoid introduction of variation caused by population differences.

Discussion
We found differences in the distribution of rare functional variants in mtDNA between athletes and controls, suggesting that endurance athletes harbor rare mutations that are beneficial for prolonged aerobic performance. We propose that such mutations could be favorable to the function of OXPHOS. Indeed, Japanese endurance athletes have previously been found to harbor a subset of rare mitochondrial DNA variants, clustered in branches of haplogroup A3, possibly influencing elite athletic performance [14]. It should also be noted that rare mtDNA variants have been associated with physiological and clinical phenotypes related to endurance performance, including regulation of blood pressure [15], vascular function [16], body mass index and waist-hip ratio [17]. Non-neutral mutations in mtDNA may affect the function of OXPHOS and influence adaptation to varied energy demands. Adaptive mtDNA variants are less frequent in the population than the deleterious ones [18,19], but animal studies have estimated that 26% of nonsynonymous substitutions are fixed by adaptive evolution [20]. Natural selection could favor retention of adaptive mutations enhancing OXPHOS and such mutations could be concentrated among endurance athletes, whose performance relies on efficient ATP production. Indeed, heterogeneous selection on OXPHOS genes has been detected among different fish species with extremes of high and low aerobic swimming performance [21]. Adaptive mutations could affect endurance performance by altering the expression of nuclear DNA. In keeping with this, mtDNA variants have been shown to be important modulators of autosomal disease [22].
Some of the rare nonsynonymous variants that were only harbored by endurance athletes (m.3308T > C, m.5319A > T, m.9822C > T and m.12940G > A) show a quite high probability of pathogenicity (> 0.4). The score suggests that these variants are at least function-altering. We consider that rare variants as a group, rather than any one of these rare variants alone, could potentially affect OXPHOS. The status of m.3308T > C as a disease-causing variant has been a matter of debate and haplogroup background could influence its penetrance [23]. Germline variants m.5319A > T, m.9822C > T and m.12940G > A, on the other hand, have not been reported as disease-causing in MITOMAP. Certainly, more studies will be needed to elucidate whether these variants have a beneficial effect on endurance capacity.
Previously, mtDNA mutations with high pathogenic potential have been detected in healthy human individuals in the 1000 Genomes Project and in individuals from the United Kingdom [24,25]. However, to our knowledge, pathogenic mtDNA mutations have rarely been reported in elite athletes. Thus, surprisingly, two of the sprinters in our study harbored a disease-causing mtDNA mutation. One had the m.1555A > G mutation, a cause of hereditary nonsyndromic hearing loss [26], and the other had m.3243A > G, the common cause of the mitochondrial encephalopathy, lactic acidosis, stroke-like episodes syndrome (MELAS) [27]. The heteroplasmy of m.3243A > G was 43%, which is highly interesting, as age-adjusted m.3243A > G heteroplasmy in blood is as strongly associated with clinical disease burden and progression as muscle heteroplasmy levels [28]. Furthermore, heteroplasmy of > 40% in blood may lead to a fully expressed MELAS phenotype [29]. The frequency of m.1555A > G in the population is 0.33% and that of m.3243A > G is 0.14% [30,31], whereas the population frequencies estimated from patient cohorts are one tenth or less [32,33]. This discrepancy suggests that there are unaffected or mildly affected subjects in the population. The finding that there were two sprinters with a mutation suggests that sprint athletes may tolerate deleterious mtDNA mutations, while endurance athletes may not. Rather strikingly, given the above population frequencies and using the general formula for the probability mass function, the probability of one and only one carrier of m.3243A > G among 89 subjects would be 11% and that of m.1555A > G would be 22%. These probabilities imply that the two mutations may have a beneficial, or at least not detrimental, effect for sprint performance. Indeed, sprint performance is based on anaerobic glycolysis rather than OXPHOS [34] and, hence, mutations affecting OXPHOS would be less detrimental for sprinters than for endurance athletes.
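The carrier probabilities quoted above can be reproduced with a short calculation. The sketch below assumes that the "general formula" is the binomial probability mass function with independent subjects, which is our reading and is not stated explicitly in the text; the function and constant names are illustrative only.

```python
from math import comb

def prob_exactly_k(n: int, k: int, p: float) -> float:
    """Binomial probability of exactly k carriers among n independent subjects."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

N_SPRINTERS = 89      # number of sprint athletes in the cohort
FREQ_3243 = 0.0014    # population frequency of m.3243A>G (0.14%)
FREQ_1555 = 0.0033    # population frequency of m.1555A>G (0.33%)

# Probability of one and only one carrier of each mutation among the sprinters
p_3243 = prob_exactly_k(N_SPRINTERS, 1, FREQ_3243)
p_1555 = prob_exactly_k(N_SPRINTERS, 1, FREQ_1555)

print(f"m.3243A>G: {p_3243:.2f}")  # prints "m.3243A>G: 0.11"
print(f"m.1555A>G: {p_1555:.2f}")  # prints "m.1555A>G: 0.22"
```

Under this assumption the values round to the 11% and 22% given in the text.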
The rate of common nonsynonymous variation was higher in haplogroups J and K than in the remaining haplogroups, but rare nonsynonymous variation was similar, suggesting that the difference is due to nonsynonymous haplogroup-associated variants with minor allele frequency > 1%. The fact that only one endurance athlete belonged to haplogroup J and none to haplogroup K suggests that some of the nonsynonymous variants specific to these lineages may have a detrimental effect on endurance performance. Moreover, the high frequency of haplogroup J among centenarians and nonagenarians has suggested that this haplogroup is beneficial for longevity [13,35]. The reactions in OXPHOS produce a proton motive force across the inner mitochondrial membrane that is then harnessed in ATP formation. We have previously suggested the term "uncoupling genome" for a genome that would code for OXPHOS complexes that are less efficient in ATP production, contributing to poor endurance performance, and that produce lower amounts of reactive oxygen species, contributing to longevity [5]. In the presence of an "uncoupling genome" the reactions dissipate the membrane potential, favoring heat generation instead of ATP production. Indeed, experiments on human cell cybrids have shown that haplogroup J cybrids have lower levels of ATP and reactive oxygen species production than haplogroup H cybrids [36].
Electrons enter the mitochondrial respiratory chain primarily via complex I. Hence, the complex plays an essential role in generating mitochondrial membrane potential, determines the NADH/NAD+ ratio and is a major source of reactive oxygen species [37]. Interestingly, two variants defining haplogroup J (m.4216T > C, m.13708G > A) and m.3394T > C occurring in haplogroup J are located in genes encoding subunits of complex I. These three mtDNA variants occur in the branches of European and Asian phylogeny, indicating that they have arisen independently during evolution, i.e. are homoplasic, and suggesting that selective factors have favored their retention in the populations [38]. In addition, the variants are enriched in Tibetan highlanders and the Sherpas [39,40], who are adapted to a hypoxic environment.
Adaptation to ambient hypoxia brings about repression of mitochondrial respiration and induction of glycolysis. Recently, rather striking results have been seen in an experimental mouse model with an inactivated Ndufs4 gene, which encodes another complex I subunit and whose inactivation leads to OXPHOS reduction. Ambient oxygen of 11%, corresponding to 4000 m altitude, resulted in amelioration of symptoms and longer survival compared to knock-out mice in atmospheric oxygen [41]. Our results, those showing that the frequency of rare variants in MT-ND1 is higher in Japanese sprinters than in controls [14], and population genetic and experimental data on adaptation and survival in a hypoxic environment suggest that haplogroup J mtDNA or m.4216T > C may reduce the capacity of OXPHOS and induce the glycolytic pathway, which would be beneficial for sprint performance. Furthermore, it is worth mentioning that some of the variants defining haplogroup J are located in the mtDNA regulatory region and may have functional importance. For instance, the m.295C > T variant has been shown to impact mtDNA transcription and replication in in vitro transcription and cell culture studies [42]. Such variants could potentially enable a quick transcriptional response to changing environmental conditions and stress, and thereby partially account for the functional impact of haplogroup J.

Conclusions
Our results suggest that endurance athletes harbor an excess of rare mtDNA variants that may be beneficial for OXPHOS, while sprinters may tolerate mtDNA mutations that have disease-causing properties and have a detrimental effect on cellular OXPHOS. Our previous finding on the paucity of haplogroups J and K among endurance athletes was further examined by using complete mtDNA sequences. Common nonsynonymous variants were more frequent in haplogroups J and K than in other haplogroups, suggesting that the uncoupling variants in haplogroups J and K are those defining these haplogroups. Indeed, the mutational load of these variants was considerable, which increases the likelihood that some of these variants could alter function and negatively affect endurance performance. Our results are in line with previous studies indicating that at least some of the haplogroup-specific polymorphisms in mtDNA can have adaptive significance and that common mutations in OXPHOS complex I genes are potential candidates to drive the functional impact of haplogroup J [4,43-45].

Subjects and controls
Total DNA has previously been extracted from a national cohort of 141 Finnish track and field athletes including 52 endurance athletes (mean age, 21 ± 7 years; men, 26) and 89 sprinters (mean age, 20 ± 3 years; men, 45) [5]. Control mtDNA sequences (n = 77) were randomly selected from 192 Finnish sequences so that the proportions of mtDNA haplogroups matched with those in the population [46,47]. The mean age of the sample population for the controls was 41 ± 12 years (men, 60%). Controls were not age-matched, as germline mtDNA variation remains unaltered throughout life.

Molecular methods
The entire mtDNA coding sequence was determined by using a strategy consisting of conformation sensitive gel electrophoresis (CSGE) and subsequent sequencing (Big-Dye Terminator v1.1 Cycle Sequencing Kit, Applied Biosystems, Foster City, CA, U.S.A.) [46]. In addition, the mtDNA D-loop was sequenced directly. The sequence reads were aligned to the revised Cambridge Reference Sequence (rCRS; NC_012920) using Sequencher® version 5.0 sequence analysis software (Gene Codes Corporation, Ann Arbor, MI, U.S.A.). The mtDNA sequences were assigned to haplogroups based on PhyloTree v.17 with HaploGrep2 software [48,49]. Sequencing was repeated in cases where haplogroup-defining mutations were missing or where private mutations were found. Phire® Hot Start II DNA polymerase (Thermo Fisher Scientific, Waltham, MA, U.S.A.) was used for all amplifications.

Variation of interest and mutational load estimates
HaploGrep2 software was used to construct a phylogenetic tree that was based on complete mtDNA sequences and used superhaplogroup L3 as the outgroup [48]. Mutational hotspots m.523_524delAC, m.16182A > C, m.16183A > C, m.16519T > C and C-insertions at positions m.309, m.315 and m.16193 were not included in the tree. Functional variants were defined as single nucleotide variants in tRNA and rRNA genes, and as variants in protein-coding genes causing amino acid substitutions. The number of such variants was counted in each sequence and the count of rare functional variants included those with minor allele frequency (MAF) less than 1% in MITOMAP (http://www.mitomap.org) and common functional variants included those with MAF ≥ 1%. The variants m.9966G > A and m.2702G > A in subclade N1, m.6261G > A in subclade T2c and m.10398A > G in haplogroup J were removed from all subsequent analysis because of back mutations in these positions. Allele frequencies were based on 30,589 GenBank sequences available at the time of analysis.
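The rare/common classification described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the variant identifiers and MAF values are hypothetical placeholders rather than MITOMAP data, and the function names are our own.

```python
# Hypothetical records: identifiers and MAF values are illustrative only,
# not taken from MITOMAP or from the study data.
RARE_MAF_THRESHOLD = 0.01  # MAF < 1% is rare, MAF >= 1% is common

def classify_variant(maf: float) -> str:
    """Return 'rare' or 'common' using the 1% MAF cut-off."""
    return "rare" if maf < RARE_MAF_THRESHOLD else "common"

def count_by_class(variants):
    """Count rare and common functional variants in one sequence."""
    counts = {"rare": 0, "common": 0}
    for variant in variants:
        counts[classify_variant(variant["maf"])] += 1
    return counts

# One hypothetical sequence carrying three functional variants
example_sequence = [
    {"id": "variant_1", "maf": 0.0004},
    {"id": "variant_2", "maf": 0.0100},  # exactly 1% counts as common
    {"id": "variant_3", "maf": 0.2500},
]

counts = count_by_class(example_sequence)
print(counts)  # prints {'rare': 1, 'common': 2}
```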
The APOGEE meta-predictor was used to evaluate the impact of nonsynonymous substitutions [50]. Nonsynonymous variants were deemed nonneutral if the APOGEE bootstrap mean probability of pathogenicity was greater than 0.5. Mutational load, i.e. the sum of these probabilities in each sequence, was calculated. Probabilities were not estimated for the five nonsynonymous mutations (m.10398G > A, m.8701G > A, m.14766T > C, m.15326G > A and m.8860G > A) that connect L3 and rCRS in the phylogeny.
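The mutational load calculation can be sketched as below. The sketch assumes that the load sums the probabilities of all scored nonsynonymous variants in a sequence (the text could also be read as summing only the nonneutral ones), and the scores and the placeholder variant names are hypothetical, not APOGEE output.

```python
# Variants connecting L3 and rCRS, for which no probability was estimated.
UNSCORED = {"m.10398G>A", "m.8701G>A", "m.14766T>C", "m.15326G>A", "m.8860G>A"}
NONNEUTRAL_CUTOFF = 0.5  # probability of pathogenicity above this is nonneutral

def mutational_load(variant_scores):
    """Sum the pathogenicity probabilities of the scored variants in one sequence.

    Assumption: all scored nonsynonymous variants contribute to the sum.
    """
    return sum(score for name, score in variant_scores.items()
               if name not in UNSCORED)

def nonneutral_variants(variant_scores):
    """List the scored variants whose probability exceeds the 0.5 cut-off."""
    return sorted(name for name, score in variant_scores.items()
                  if name not in UNSCORED and score > NONNEUTRAL_CUTOFF)

# Hypothetical sequence: placeholder names and scores, for illustration only
example_scores = {
    "variant_a": 0.25,
    "variant_b": 0.75,
    "variant_c": 0.125,
    "m.8860G>A": None,  # unscored, skipped by both functions
}

load = mutational_load(example_scores)
print(load)                                 # prints 1.125
print(nonneutral_variants(example_scores))  # prints ['variant_b']
```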

Statistical analysis
Chi-square test (χ²) was used to evaluate differences in rare and common functional variants between endurance athletes, sprint athletes and the controls, and between haplogroups J and K and the remaining haplogroups. Kruskal-Wallis or Mann-Whitney test was used to assess differences between the groups in continuous variables. IBM® SPSS® Statistics Version 22 software was used.
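For illustration, a Pearson chi-square test on a groups-by-variant-class contingency table can be sketched in plain Python as follows. The counts are hypothetical and are not those of Table 1; the sketch applies no continuity correction, so its results may differ from those reported by SPSS for 2 × 2 tables, and the closed-form p-values cover only 1 or 2 degrees of freedom.

```python
import math

def chi_square_test(table):
    """Pearson chi-square test of independence for a rows x columns count table.

    Returns (statistic, degrees_of_freedom, p_value). The p-value uses the
    closed-form survival functions that exist for 1 and 2 degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    if dof == 1:
        p_value = math.erfc(math.sqrt(statistic / 2))
    elif dof == 2:
        p_value = math.exp(-statistic / 2)
    else:
        raise NotImplementedError("closed form implemented for 1 or 2 df only")
    return statistic, dof, p_value

# Hypothetical counts of (rare, common) functional variants per group
hypothetical_counts = [
    [30, 70],  # endurance athletes
    [20, 80],  # sprint athletes
    [25, 75],  # controls
]

stat, dof, p = chi_square_test(hypothetical_counts)
print(f"chi2 = {stat:.3f}, df = {dof}, p = {p:.3f}")
```

The same function applies to the 2 × 2 comparison of haplogroups J and K against the remaining haplogroups, in which case the test has one degree of freedom.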
Additional file 1: Figure S1. Phylogenetic tree of mitochondrial genomes of 141 Finnish athletes and 77 controls (S = sprint athlete; E = endurance athlete; C = control) based on PhyloTree v.17 [48] constructed with HaploGrep2 software [49] and edited using the PDF-XChange Editor V7. Unless an exact base substitution is specified, variants are transitions. Insertions are indicated by a decimal point position on the base that the insert follows. Deletions are denoted by "d" after the nucleotide flanking the deletion site. Heteroplasmic positions are indicated by R, S or Y. Variants preceded by @ are assumed back mutations or missing mutations. Colors indicate the following: blue, nonsynonymous mutation; brown, tRNA mutation; green, rRNA mutation.
Additional file 2: Table S1. A list of rare nonsynonymous functional variants in Finnish athletes and controls.
Additional file 3: Table S2. A list of rare tRNA functional variants in Finnish athletes and controls.
Additional file 4: Table S3. A list of rare rRNA functional variants in Finnish athletes and controls.
Additional file 5: Table S4. Mean mutational load of common and rare nonsynonymous functional variants per subject in Finnish athletes and controls.
Additional file 6: Table S5. Mean mutational load of common and rare nonsynonymous variants per subject in subjects with haplogroup J and K as compared to other haplogroups.