Genetic variants in nuclear DNA along with environmental factors modify mitochondrial DNA copy number: a population-based exome-wide association study

Background Mitochondrial DNA (mtDNA) copy number has been associated with multiple diseases, including cancers and diabetes. Both environmental and genetic factors can affect mtDNA copy number. However, few studies have examined the relationship between genetic variants and mtDNA copy number, and most previous studies considered only environmental or only genetic factors. It is therefore necessary to explore genetic effects on mtDNA copy number while accounting for PM2.5 exposure and smoking. Results A multi-center population-based study was performed with 301 subjects from Zhuhai, Wuhan and Tianjin. Personal 24-h PM2.5 exposure levels, smoking and mtDNA copy number were evaluated. The Illumina Human Exome BeadChip, which contains 241,305 single nucleotide variants, was used for genotyping. Association analysis was conducted in each city, and meta-analysis was used to combine the effects across the three cities. Seven SNPs were significantly associated with mtDNA copy number, with P values less than 1.00E-04 after meta-analysis. A joint analysis of the identified SNPs showed a significant allele-dosage association between the number of effect alleles and mtDNA copy number (P = 5.02 × 10⁻¹⁷). Further, gene-based analysis identified 11 genes associated with mtDNA copy number at P < 0.01. Conclusion This study is the first attempt to evaluate genetic effects on mtDNA copy number while accounting for personal PM2.5 exposure level. Our findings provide further evidence that genetic variants play important roles in modulating mtDNA copy number. Electronic supplementary material The online version of this article (10.1186/s12864-018-5142-7) contains supplementary material, which is available to authorized users.


Background
Mitochondria are vital eukaryotic organelles that participate in various physiological processes, including energy supply, oxidative phosphorylation and apoptosis [1,2]. Mitochondria carry genetic material independent of the host nuclear genome, known as mitochondrial DNA (mtDNA). In human cells, each mitochondrion usually carries 2-10 copies of mtDNA [3]. However, mtDNA is easily damaged because mitochondria have limited DNA repair capacity. As a result, mitochondria alter their mtDNA copy number to compensate for mtDNA damage [4]. Many studies have shown that variation in mtDNA copy number can modify susceptibility to cancers, diabetes and infertility [5][6][7]. Both genetic and exogenous factors have been found to be associated with mtDNA copy number [8][9][10].
Among the exogenous factors, exposure to PM2.5 (fine particulate matter), a mixture of organic chemicals and transition metals, is ubiquitous and unavoidable. The situation is especially serious in China owing to rapid economic development in recent years [11]. Previous studies suggested that high PM2.5 exposure was associated with decreased mtDNA copy number [12,13]. In addition to exogenous factors, genetic factors can also modulate mtDNA copy number. Curran et al. and Xing et al. found that mtDNA content appeared to be highly heritable [14,15]. Furthermore, some candidate genes (such as TFAM, TIMM23 and PARL) and single nucleotide polymorphisms (SNPs) could also influence mtDNA copy number [14,16]. In 2014, Lopez and colleagues performed the first genome-wide association study (GWAS) of mtDNA copy number and identified 15 significant SNPs in 386 Spanish subjects [17]. However, environmental factors (such as PM2.5 exposure level and smoking) were not considered in that study, and the identified variants explained only a small fraction of the total variation. Moreover, the genetic background may differ between Spanish and Chinese populations. Therefore, more efforts are warranted to evaluate the association between genetic variants and mtDNA copy number in the Chinese population.
In the present study, we performed a multi-center population-based study with 301 subjects from three cities to evaluate the association between genetic variants and mtDNA copy number. The effects of PM2.5 exposure level and smoking pack-years on mtDNA copy number were also assessed. Quantitative real-time PCR was used to measure the mtDNA copy number of peripheral blood leukocytes, and 241,305 SNPs were genotyped using the Illumina Human Exome BeadChip.

Study subject
The subjects in this study were the same as those in our previous studies [18,19]. In brief, 307 subjects with different PM2.5 exposure levels from Zhuhai, Wuhan and Tianjin were included. After giving informed consent, each participant provided 5 mL of peripheral blood for genotyping and measurement of mtDNA copy number. Demographic and smoking information was collected using a standardized questionnaire. The Ethics and Human Subject Committees of Tongji Medical College and Nanjing Medical University approved this study. The basic characteristics of the subjects are summarized in Table 1.

Monitoring of PM2.5 exposure level
The monitoring of personal 24-h real-time PM2.5 exposure has been described previously [18,19]. Briefly, a Sensidyne sampler pump and 37-mm Teflon filters from Beijing Lianyi Xingtong Apparatus & Instrument Co., Ltd. were used to measure the PM2.5 exposure level. The flow rate was set at 2.0 L/min for 24 h. The filter was weighed before and after sampling. We calculated the PM2.5 concentration using the following equation:

$$C = \frac{(m_2 - m_1) \times 10^{6}}{V \times t}$$

where C is the PM2.5 concentration (μg/m³), m_1 and m_2 are the weights of the filter (mg) before and after sampling, V is the flow rate (2 L/min in this study), t is the sampling time (24 h × 60 min/h = 1440 min in this study), and the factor 10^6 converts mg to μg and L to m³.
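The gravimetric calculation can be illustrated with a minimal Python sketch. The function name and the example filter weights are hypothetical; only the flow rate (2 L/min) and the sampling time (1440 min) come from the text, and the unit conversions follow from the stated units (mg, L/min, min, μg/m³).

```python
def pm25_concentration(m1_mg, m2_mg, flow_l_per_min=2.0, minutes=1440.0):
    """Return the PM2.5 concentration in ug/m^3 from filter weights in mg.

    m1_mg: filter weight before sampling (mg)
    m2_mg: filter weight after sampling (mg)
    """
    if flow_l_per_min <= 0 or minutes <= 0:
        raise ValueError("flow rate and sampling time must be positive")
    mass_ug = (m2_mg - m1_mg) * 1000.0              # mg -> ug
    volume_m3 = flow_l_per_min * minutes / 1000.0   # L -> m^3
    return mass_ug / volume_m3


# Hypothetical example: a filter gaining 0.288 mg over 24 h at 2 L/min
# samples 2.88 m^3 of air, giving about 100 ug/m^3.
example = pm25_concentration(100.000, 100.288)
```

Keeping the two conversions separate makes the overall factor of 10^6 easy to check by hand.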

Measurement of mitochondrial DNA copy number
Genomic DNA was extracted from peripheral blood leukocytes using the phenol/chloroform method. Relative mtDNA content was then measured by qPCR (7900HT Real-Time PCR System, Applied Biosystems, Foster City, CA). In brief, we designed two primer pairs, one for the mitochondrial NADH dehydrogenase subunit 1 gene (MT-ND1; F, 5′-CCCTAAAACCCGCCACATCT-3′ and R, 5′-GAGCGATGGTGAGAGCTAAGGT-3′) and another for the nuclear human β-globin gene (HGB; F, 5′-GAAGAGCCAAGGACAGGTAC-3′ and R, 5′-CAACTTCATCCACGTTCACC-3′) [20]. Each 10-μL reaction contained DNA at a final concentration of 5 ng/μL. The thermal cycling procedure was as follows: 50°C for 2 min, then 95°C for 2 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min (MT-ND1) or 56°C for 1 min (HGB). Relative mtDNA content was calculated as the ratio of MT-ND1 to HGB based on the standard curves. All samples were measured in triplicate and the average value was reported. For each sample, the HGB Ct value was subtracted from the MT-ND1 Ct value to give ΔCt. The ΔCt of the calibrator DNA was then subtracted from the ΔCt of each sample to give ΔΔCt. Finally, we calculated the relative mtDNA copy number using the formula 2 × 2^(−ΔΔCt) [5].
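The Ct arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' pipeline: the function name and the example Ct values are hypothetical, and averaging the triplicate Ct values before taking differences is an assumption, since the text only says that the average value was reported.

```python
from statistics import mean


def relative_mtdna_copy_number(nd1_cts, hgb_cts, calibrator_dct):
    """Relative mtDNA copy number as 2 * 2^(-ddCt).

    nd1_cts, hgb_cts: replicate Ct values for MT-ND1 and HGB (assumed triplicates)
    calibrator_dct:   Ct(MT-ND1) - Ct(HGB) measured on the calibrator DNA
    """
    dct = mean(nd1_cts) - mean(hgb_cts)   # dCt of the sample
    ddct = dct - calibrator_dct           # ddCt relative to the calibrator
    return 2.0 * 2.0 ** (-ddct)


# Hypothetical example: the sample dCt is -6 and the calibrator dCt is -5,
# so ddCt = -1 and the relative copy number is 2 * 2^1 = 4.
example = relative_mtdna_copy_number([18, 18, 18], [24, 24, 24], -5.0)
```

A sample whose ΔCt equals that of the calibrator returns 2, which matches the leading factor of 2 in the formula.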

Genotyping and quality control (QC)
Genotyping was performed using the Illumina Human Exome BeadChip, which contains 241,305 SNVs (single nucleotide variants) around exonic regions. Systematic quality control was performed before the association analysis. At the sample level, six samples (two from Zhuhai and four from Wuhan) with call rates below 95% were excluded. At the variant level, SNVs meeting any of the following criteria were removed: (1) non-autosomal; (2) genotyping call rate < 95%; (3) P value for Hardy-Weinberg equilibrium (HWE) < 0.001. As a result, 301 qualified subjects with 238,927 SNVs were kept for further analysis.
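The three variant-level filters can be sketched as follows. This is a simplified illustration, not the PLINK workflow used in the study: the function names and genotype counts are hypothetical, and the HWE P value is taken from a 1-degree-of-freedom goodness-of-fit χ² test, for which the upper tail probability equals erfc(√(χ²/2)).

```python
import math

AUTOSOMES = {str(i) for i in range(1, 23)}


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of the 1-df goodness-of-fit chi-square test for HWE."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)       # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if any(e == 0 for e in expected):     # monomorphic site: nothing to test
        return 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snv_qc(chrom, call_rate, n_aa, n_ab, n_bb):
    """Apply the three exclusion criteria described in the text."""
    return (
        chrom in AUTOSOMES
        and call_rate >= 0.95
        and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= 0.001
    )


# Hypothetical variants: one in perfect HWE, one with no heterozygotes.
ok = passes_snv_qc("1", 0.99, 25, 50, 25)
bad = passes_snv_qc("1", 0.99, 50, 0, 50)
```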

Statistical analysis
The PM2.5 exposure level and relative mtDNA copy number were described using the 25th, 50th and 75th percentiles. The HWE test was performed using the goodness-of-fit χ² test. Because mtDNA copy number was not normally distributed, it was transformed using the rank-based inverse-normal transformation (INT) [21]. A multivariable linear regression model under an additive genetic model was used to evaluate the association between genetic variants and mtDNA copy number, with adjustment for age, gender, PM2.5 exposure level and pack-years of smoking to control for potential confounding. The association analysis was performed separately in each city, and the results of the three cities were combined using meta-analysis. SNPs with a consistent direction of association in all three cities and a P value less than 1 × 10⁻⁴ were regarded as significant variants [22,23]. Further, we used a multivariable stepwise regression model to screen for independent factors of mtDNA copy number using Stata 11; variables with P < 0.05 were retained in this model. Functional annotation was performed using four public databases: RegulomeDB (http://regulome.stanford.edu/), HaploReg v4.1 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php), GTEx V7 (https://www.gtexportal.org/home/) and CADD (http://cadd.gs.washington.edu/home). The gene-based analysis was conducted using the SKAT-O method (SNV-set (sequence) kernel association test) [24]. The association analysis was performed using PLINK 1.9 and R 3.3.3.
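The rank-based inverse-normal transformation can be sketched in Python as below. The text does not state which offset was used, so the Blom offset c = 3/8 and the averaging of tied ranks are assumptions; the function name and example values are hypothetical.

```python
from statistics import NormalDist  # available in Python 3.8+


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Map values to normal quantiles of their (tie-averaged) ranks.

    Each rank r out of n is converted to Phi^-1((r - c) / (n - 2c + 1)),
    where c = 3/8 is the Blom offset (an assumption here).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0    # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical skewed copy-number values; the median maps to 0.
transformed = rank_inverse_normal([3.0, 1.0, 2.0])
```

Because only ranks enter the transformation, extreme copy-number values no longer dominate the regression, which is the usual motivation for applying it to skewed traits.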

Association between smoking, PM2.5 exposure level and mtDNA copy number
In this study, we evaluated the effects of smoking and PM2.5 exposure on mtDNA copy number. As shown in Additional file 1: Figure S1, the median mtDNA copy number in smokers was significantly higher than that in non-smokers (P = 0.025), suggesting that smoking could be associated with increased mtDNA copy number. This result was supported by the linear regression model with adjustment for age, gender and PM2.5 exposure level (β = 0.448, P = 0.012, Additional file 2: Table S1). We then divided the subjects in each city into high and low PM2.5 exposure subgroups based on the median PM2.5 exposure level. In Zhuhai, the median mtDNA copy number in the high PM2.5 exposure subgroup was lower than that in the low PM2.5 exposure subgroup (P = 0.001). The regression model showed that PM2.5 was significantly associated with decreased mtDNA copy number in Zhuhai (P = 0.024, Additional file 2: Table S1). However, we did not observe consistent results in Wuhan and Tianjin. Meta-analysis indicated that PM2.5 was negatively associated with mtDNA copy number, but the association was not statistically significant (P = 0.241).

Association between genetic variants and mtDNA copy number
In total, 238,927 SNVs were retained for the association analysis using the linear regression model with adjustment for age, gender, pack-years and PM2.5 exposure level (Fig. 1). Among them, 13,027 SNVs showed a consistent direction of regression coefficients in all three cities. Of these, 7 SNPs were significantly associated with mtDNA copy number, with P values less than 1.00E-04 after meta-analysis (Table 2, Fig. 1). SNP rs37576 (located in PDE4D, 5q11.2, G > A, β = −0.478, P = 2.21E-06) showed the most significant association. Furthermore, we used multivariable stepwise regression analysis to identify independent factors that could modulate mtDNA copy number. The 7 significant SNPs were analyzed along with age, gender, pack-years and PM2.5 exposure level. Only the 7 SNPs were retained in the final stepwise regression model (Table 3), suggesting that these 7 SNPs could influence mtDNA copy number independently.
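The selection rule (same direction in every city and a combined P value below 1.00E-04) can be sketched as follows. The text does not name the meta-analysis model, so the fixed-effect inverse-variance model used here is an assumption, and the per-city effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis (assumed model).

    Returns the pooled beta, its standard error and a two-sided P value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))
    return beta, se, p


def is_significant(betas, ses, threshold=1e-4):
    """Require a consistent direction in all cities and pooled P < threshold."""
    same_direction = all(b > 0 for b in betas) or all(b < 0 for b in betas)
    _, _, p = fixed_effect_meta(betas, ses)
    return same_direction and p < threshold


# Hypothetical per-city estimates for one SNP (three cities).
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([-0.5, -0.5, -0.5], [0.2, 0.2, 0.2])
```

With three equally precise estimates the pooled standard error shrinks by a factor of √3, which is how a variant that is only modestly significant in each city can pass the combined threshold.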
To explore the cumulative effect of the 7 identified SNPs on mtDNA copy number, we performed a joint analysis. In each city, subjects were divided into three subgroups ("≤5", "6-7" and "≥8") according to the number of effect alleles they carried. We observed a significant allelic dosage effect of the 7 combined SNPs on mtDNA copy number in all three cities (P for trend was 6.99E-09, 1.78E-06 and 2.40E-03 for Zhuhai, Wuhan and Tianjin, respectively). When the three cities were combined, a similar significant dosage effect was observed (P for trend: 5.02E-17, Table 4 and Fig. 2).
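The grouping step of the joint analysis can be sketched as below, assuming each of the 7 SNPs is coded as the number of effect alleles carried (0, 1 or 2). The function names and example genotypes are hypothetical, and the trend test itself is not shown.

```python
def effect_allele_count(dosages):
    """Sum the effect-allele dosages (0, 1 or 2 per SNP) for one subject."""
    if any(d not in (0, 1, 2) for d in dosages):
        raise ValueError("each dosage must be 0, 1 or 2")
    return sum(dosages)


def dosage_group(count):
    """Assign the subgroup labels used in the text."""
    if count <= 5:
        return "≤5"
    if count <= 7:
        return "6-7"
    return "≥8"


# Hypothetical subject carrying 6 effect alleles across the 7 SNPs.
group = dosage_group(effect_allele_count([2, 1, 1, 1, 1, 0, 0]))
```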

Gene-based analysis
We further analyzed the overall effects of SNPs located in the same gene using the SKAT package. In total, 995 genes with at least three variants were evaluated. Among them, 11 genes were significantly associated with mtDNA copy number (P value < 0.01, Table 5). XIRP1 showed the most significant association, with a P value of 6.54E-04.

Functional annotation
For the 7 significant SNPs, we performed functional annotation using multiple databases. According to RegulomeDB, rs7326068 and rs33962844 were located in transcription factor binding sites or DNase peaks, suggesting that these two SNPs might modify transcription factor binding (Additional file 2: Table S2). Most of the 7 SNPs were predicted to alter regulatory motifs and to regulate the expression of surrounding genes. Notably, SNP rs33962844 (MYO3B, A > G) was a missense variant with a CADD score of 13.98, indicating that this variant could be deleterious. Consistently, this missense variant was predicted to be deleterious by PROVEAN, with a score of −4.90.

Fig. 1 The Manhattan plot of the association between genetic variants and mtDNA copy number
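To put the CADD score in context, a short Python sketch follows. It assumes that the reported value is a PHRED-scaled CADD score, where a score s places a variant in the top 10^(−s/10) fraction of scored variants; the function name is hypothetical.

```python
def cadd_phred_to_top_fraction(phred):
    """Fraction of variants ranked at or above a PHRED-scaled CADD score."""
    return 10.0 ** (-phred / 10.0)


# A score of 13.98 corresponds to roughly the top 4% of variants,
# that is, a score higher than about 96% of them.
top_fraction = cadd_phred_to_top_fraction(13.98)
```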

Discussion
In the current study, we performed an exome-wide association study to assess the association between genetic variants and mtDNA copy number while accounting for personal PM2.5 exposure levels. Seven SNPs and 11 genes were identified as significantly associated with mtDNA copy number. Our findings provide further evidence that genetic variants in nuclear DNA could modulate mtDNA copy number.
To date, a growing number of studies have shown that variation in mtDNA copy number is closely related to various diseases [5,25,26]. Studies of the factors influencing mtDNA copy number are therefore important for the prevention and treatment of mtDNA-related diseases. In previous studies, high PM2.5 exposure was associated with decreased mtDNA copy number [12,13]. This was consistent with our finding in Zhuhai but was not replicated in Wuhan and Tianjin. Many factors could be responsible for this, such as differences in the composition and concentration of PM2.5 and the presence of other compounds [12]. Besides PM2.5 exposure level, we also observed that smoking was associated with increased mtDNA copy number, which was consistent with the report by Xing et al. [15]. Studies on the roles of PM2.5 and smoking in mtDNA copy number are still limited, and our findings could provide clues for future studies.
For the 7 identified SNPs, functional annotation indicated that most of them could regulate the expression of surrounding genes. SNP rs9507174 is located at 13q12.12 and was significantly associated with the expression of MIPEP (mitochondrial intermediate peptidase) according to the GTEx database. MIPEP encodes a protein that participates in the maturation of oxidative phosphorylation (OXPHOS)-related proteins. These OXPHOS-related proteins are targeted to the mitochondrial matrix or inner membrane, suggesting that MIPEP could influence the replication and expression of mtDNA [27][28][29]. In addition, MIPEP has been associated with the risk of lung cancer and myopia [30,31], although the underlying pathogenic mechanisms remain unclear. Notably, several studies revealed that increased mtDNA copy number contributed to lung cancer risk [32,33]. These findings suggest that a MIPEP-mediated change in mtDNA copy number might be a potential disease-causing pathway (Additional file 3: Figure S2). Indeed, we observed a significant association between rs9507174 and lung cancer risk using our previous data (C > A, OR = 1.10 (1.01-1.20), P = 0.037) [30].
SNP rs33962844 is a missense variant located in the 20th exon of MYO3B at 2q31.1. According to the PROVEAN database, rs33962844 was predicted to be deleterious with a score of −4.90. Consistently, its CADD score was 13.98, higher than that of more than 96% of human genetic variants, indicating that this variant might be deleterious. MYO3B encodes an ATPase and plays important roles in the production and utilization of ATP [34]. Similarly, SNP rs6857360 was associated with the expression of SMARCA5, which encodes a member of the SWI/SNF family that also has ATPase activity [35]. Together, this evidence suggests that both MYO3B and SMARCA5 could affect oxidative phosphorylation and the normal supply of ATP. This might increase the burden on mitochondria and result in a series of changes [36]. In addition, previous studies indicated that MYO3B was associated with obesity and Kawasaki disease [37,38]. SMARCA5 is an important chromatin remodeling gene and has been associated with breast cancer, Alzheimer's disease and leukemia [39][40][41]. Interestingly, in our previous study, rs6857360 (SMARCA5) was associated with DNA damage levels [42]. This finding suggests that SMARCA5 could influence both nuclear and mitochondrial DNA.

Table 4 The cumulative effects of our identified 7 significant SNPs on mtDNA copy number
To date, three articles have reported genome-wide association studies of mtDNA copy number [17,43,44]. We attempted to compare our results with these previous findings; however, no consistent finding was found. Several reasons could explain this, such as differences in the methods used to measure mtDNA, genotyping chips and study populations. In addition to the 7 significant SNPs, we also identified 11 candidate genes that could modulate mtDNA copy number based on gene-based analysis. Among these genes, RASGRP3 encodes a guanine nucleotide exchange factor [45,46]. In addition, CREBBP encodes the CREB-binding protein, which binds the cAMP-response element binding protein (CREB) and is involved in the transcriptional co-activation of many different transcription factors [47]. Numerous studies have shown that CREBBP could influence susceptibility to cancers, Rubinstein-Taybi syndrome and diabetes [48][49][50].
For the other genes, related studies are limited and more work is needed to reveal their functions. Compared with previous studies, this study has two main advantages: (1) for the first time, we systematically evaluated the association between genetic variants and mtDNA copy number while accounting for personal PM2.5 exposure levels and smoking pack-years; (2) the study samples were recruited from southern, central and northern cities, and the results were validated in each city. However, this study also has some limitations. First, the sample size was relatively limited because of the difficulty of carrying a PM2.5 sampler for 24 h. Second, the underlying mechanisms of the identified SNPs and genes were not fully elucidated. Studies with larger sample sizes and functional assays are warranted to validate our findings.

Conclusions
This study is the first attempt to evaluate genetic effects on mtDNA copy number while accounting for smoking and personal PM2.5 exposure level. Seven significant SNPs and 11 genes were identified as associated with mtDNA copy number. Our study provides further evidence that genetic variants in nuclear DNA, along with environmental factors, could modulate mitochondrial DNA copy number.

Additional files
Additional file 1: Figure S1. The effects of smoking and PM2.5 exposure level on mtDNA copy number. The first column indicates that smokers have a higher mtDNA copy number than non-smokers. Columns 2-5 show the median mtDNA copy number in subjects with low and high PM2.5 exposure in each city and in the combined analysis. (DOCX 331 kb)
Additional file 2: Table S1. Association between PM2.5, smoking and mtDNA copy number. Table S2. Functional annotations for our identified 7 significant SNPs. (DOCX 21 kb)
Additional file 3: Figure S2.