Mendelian randomization studies of brain MRI yield insights into the pathogenesis of neuropsychiatric disorders

Background Observational studies have identified various associations between neuroimaging alterations and neuropsychiatric disorders. However, whether such associations could truly reflect causal relations remains still unknown. Results Here, we leveraged genome-wide association studies (GWAS) summary statistics for (1) 11 psychiatric disorders (sample sizes varied from n = 9,725 to 1,331,010); (2) 110 diffusion tensor imaging (DTI) measurement (sample size n = 17,706); (3) 101 region-of-interest (ROI) volumes, and investigate the causal relationship between brain structures and neuropsychiatric disorders by two-sample Mendelian randomization. Among all DTI-Disorder combinations, we observed a significant causal association between the superior longitudinal fasciculus (SLF) and the risk of Anorexia nervosa (AN) (Odds Ratio [OR] = 0.62, 95 % confidence interval: 0.50 ~ 0.76, P = 6.4 × 10− 6). Similar significant associations were also observed between the body of the corpus callosum (fractional anisotropy) and Alzheimer’s disease (OR = 1.07, 95 % CI: 1.03 ~ 1.11, P = 4.1 × 10− 5). By combining all observations, we found that the overall p-value for DTI − Disorder associations was significantly elevated compared to the null distribution (Kolmogorov-Smirnov P = 0.009, inflation factor λ = 1.37), especially for DTI − Bipolar disorder (BP) (λ = 2.64) and DTI − AN (λ = 1.82). In contrast, for ROI-Disorder combinations, we only found a significant association between the brain region of pars triangularis and Schizophrenia (OR = 0.48, 95 % CI: 0.34 ~ 0.69, P = 5.9 × 10− 5) and no overall p-value elevation for ROI-Disorder analysis compared to the null expectation. Conclusions As a whole, we show that SLF degeneration may be a risk factor for AN, while DTI variations could be causally related to some neuropsychiatric disorders, such as BP and AN. Also, the white matter structure might have a larger impact on neuropsychiatric disorders than subregion volumes. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07661-8.


Background
Neuroimaging study is the most widely used procedure for studying brain disorders [1]. Many neuroimaging studies in the past quarter-century have revealed brain abnormalities in neuropsychiatric disorders [1,2], which have served as the basis for biomarker discovery, clinical guidance, and investigations into the mechanisms of neuropsychiatric disorders [2,3]. However, it is unclear whether such associations reflect disease causality [4]. One concern is that the spurious correlations [5] could emerge from indirect correlations with confounders such as medication, circadian and dietary changes, or falsepositive events. Another issue is the direction of causality: the neurotoxicity hypothesis [6] suggests that psychiatric illnesses have toxic effects on the central nervous system, which leads to structural alterations following disease onset [7]. This theory gained support from several observations in which neuroimaging abnormalities exhibited dynamic progression during the neuropsychiatric disorder course [6,7]. Under such theories, these associations should be utilized as clinical biomarkers rather than mechanism identifiers.
Several case-control studies with large sample sizes [8,9] have found a significant correlation between neuroimaging alteration and neuropsychiatric disorder, yet still unable to distinguish the causality from the correlation. Longitudinal analyses have partially overcome the limitations of cross-sectional observational studies [10] for investigating the disease causality, such as detecting neuroimaging alterations in participants with the so-called high-risk status along with their progression [11,12]. However, these studies are still limited by our current definition of high-risk cohorts. For example, the onset of neuropsychiatric disorders may be far earlier than a clinically recognizable high-risk status, such that the associated abnormalities may include changes that occurred after the primary pathology.
Despite the longitudinal analysis focusing on the highrisk status, an alternative method for addressing the challenge of causality is Mendelian Randomization (MR) [13]. MR has been used to derive the relationships between peripheral inflammatory markers and schizophrenia [14] and between physical activity and depression [15] with great successes. By selecting genetic locus that is strongly associated with exposure, so-called instruments, MR separates subjects into high and low lifetime exposure groups according to their genotypes on the instruments [16], then compares the prevalence of outcomes between the high and low exposure groups. Such grouping is considered unbiased as genotypes are randomly determined during meiosis. Furthermore, a twosample Mendelian Randomization (MR) estimates instrument-exposure and instrument-outcome relationships in different cohorts to infer the exposureoutcome relationship, without the need for information on individuals [16]. The explosive growth of genomewide association studies (GWAS) offers opportunities for applying MR to solve the debate of causality in different clinical medicine fields. Recently, two GWAS from UK-Biobank (UKBB) [17,18]

Causal associations between DTI and neuropsychiatric Disorder
After data harmonization, 1253/1320 (110*12) DTI-Disorder pairs and 973/1212 (101*12) ROI-Disorder pairs had at least one strong instrument (SNP with P < 5 × 10 − 8 in DTI GWAS [17], Additional file 1) and were analyzed by MR (Additional file 2 and 3). We started by evaluating the DTI-Disorder association. In the inverse variance-weighted (IVW) analysis (outer layer of Fig. 2a Fig. 1a) applied to DTI-Disorder pairs with only one instrument. A basic assumption of MR is that MR genetic instruments should only impact the outcome via the exposure and not any other pathway (horizontal pleiotropy) [16]. The existence of heterogeneity, which is introduced by outlier instruments, can also bias the MR estimation [19]. By applying various sensitivity tests, we confirmed that our results were not impacted by horizontal pleiotropy (Egger intercept P > 0.05; the third layer of Fig. 1a and Additional file 2) or heterogeneity (modified Cochran's Q test P > 0.05; Table 1), or bidirectional effects (reverse MR P > 0.05; Additional file 2). They also showed consistent trends across different MR methods, which are robust against pleiotropy and measurement error (see Method for detail) (except for Fornix.FA − bipolar disorder [BP]; Table 1). They also showed little directional pleiotropy, as indicated by funnel plots (Additional file 4). Neither MR-PRESSO nor leave-one-out tests found the impact of outliers on these DTI-Disorder pairs (Additional files 2 and 4). Taken together, these results confirmed the casual relation between SLF.AxD-AN, and suggested potential relations of other five DTI-Disorder pairs.

Overall contribution of DTI on neuropsychiatric disorders
Despite separate DTI-Disorder pairs that reached the significance threshold, we were also interested in whether DTI as a whole made a causal contribution to the diseases. To answer this question, we pooled the IVW P value for all DTI-Disorder pairs and compared them to the null uniform distribution ( Fig. 3a and Additional file 2). The distribution of IVW P was significantly inflated (KS P = 0.009, λ = 1.37); this result persisted after removing heterogeneous DTI-Disorder pairs (those with Cochran's P < 0.05) (KS P = 0.002, λ = 1.48) or outlier instruments (SNPs with MR-PRESSO P < 0.05) (KS P = 8.0 × 10 − 5 , λ = 1.49). When we analyzed each disease separately ("Original" in Fig. 3b), BP (Fig. 3c) and AN showed the most significant inflation ( Table 2). These results were also relatively stable against heterogeneity and outlier removal ( Fig. 3b and Additional file 4). The permutation test confirmed that the results for BP (permutation P value [pp] for KS test: 0.012; pp for λ: 0.002) and AN (pp for λ: 0.012) were not due to the bias inherent in the data or method. Additionally, heterogeneous DTI-Disorder pairs generally had nonsignificant MR results and did not contribute to inflation ( Fig. 3d and Additional file 2). Thus, we concluded that DTI polymorphisms made an overall causal contribution to neuropsychiatric disorders, especially BP and AN.

Causal associations between brain volume and neuropsychiatric Disorders
Similar to the DTI-Disorder analysis, we also assessed the ROI-Disorder association. 973/1212 (101*12) RD pairs had at least one strong instrument (SNP with P < 5 × 10 − 8 ) and were analyzed by MR. In the IVW analysis,  Fig. 4a), which was driven by a single SNP (rs2279829). The only trait associated with rs2279829 in PhenoScanner [20] was daytime dozing or sleeping (P = 2.02 × 10 − 6 ). The SMR-HEIDI test detected a significant MR effect in the same direction (OR = 0.48, P = 7.57 × 10 − 4 ) with no evidence of colocalization pleiotropy (HEIDI P = 0.84). We estimated that per 1-SD increment in normalized PT volume, the risk of SCZ decreased by 52 % (OR = 0.48). In conclusion, these results suggested a potential causal relationship between PT and SCZ. However, the validation of this relationship requires more work on potential pleiotropy. The overall distribution of the IVW p-value did not significantly differ from the null distribution (Fig. 4b), even after removing heterogeneity (KS P = 0.12) and outliers (KS P = 0.07). However, IVW p-value for SCZ showed significant inflation (λ = 1.90, KS P = 0.001), and the result was significantly impacted by heterogeneity (Fig. 4c). Additionally, 10 % (6/58) of the nominally significant RD pairs showed heterogeneity, and 3 contained outlier SNPs (Additional files 3 and 4). In conclusion, there was no evidence that ROI polymorphisms had a universal contribution to neuropsychiatric disorders.

Discussion
The issue of causality has long beleaguered clinical neuroimaging studies [4]. Confirmation of a causal change can provide insights into disease mechanisms at the circuit and region levels and may reveal useful biomarkers for predicting prognosis. Many current neuroimaging studies cannot directly draw this conclusion [1,4], which partly limits the translation of their findings to clinical practice. In this study, we conducted hypothesis-free, data-driven MR analyses to assess the causal relationship between neuroimaging polymorphisms and neuropsychiatric disorders in an unbiased manner.
Our results showed that, in general, WM connectivity was more closely associated with the risk of neuropsychiatric disorders than gray matter volume (GMV) (Fig. 3a and 4b). Among neuropsychiatric disorders, BP showed the most significant association with genetically determined connectivity polymorphisms (Fig. 3b,c). These results support the dysconnectivity theory [21] of psychiatry, positing that major psychiatric disorders such as SCZ and BP have common WM abnormalities in their pathology [22]. According to this hypothesis, anatomic and neurodevelopmental changes that arise from neurotoxicity [6] are a consequence rather than a cause of the illness. Indirect evidence from a functional study of neuromodulation and myelination [21] and a casecontrol study of the high-risk state [12] supports the dysconnectivity theory. Research interest has now shifted from the region-of-origin to a connectome's concept [22], in which connections between brain regions rather than the regions themselves cause BP and other mental illnesses. Our finding that DTI is more closely associated with the onset of neuropsychiatric disorders than ROI provides supportive evidence for this paradigm shift.
Our study's top MR result was a novel risk factor for AN-namely, decreased SLF.AxD. A few studies have reported a decreased SLF integrity in AN patients [23,24], but the results were inconsistent with the marginal effect size. As neuroimaging findings in AN patients are influenced by dietary and metabolic alterations [25], a causal change may not be manifested as a visible signal. Nonetheless, the SLF may contribute to body image distortion in the pathology of AN [24], probably through its connection to areas responsible for body image perception (prefrontal and parietal networks) and self-perception (inferior parietal lobe) [26]. Although abnormalities in both GMV [26] and WM connectivity [23,24] have been observed, our results suggest that the latter is a primary cause. In contrast, the former is a consequence of neuromodulator mechanisms such as activity-dependent pruning [21].
The roles of BCC in AD and PT in SCZ-2 marginally significant results from our MR analysis-have received  more attention in the literature than SLF in AN. However, both MR results should be interpreted with caution. BCC atrophy is widely observed in AD patients even at an early stage and reflects Wallerian degeneration and myelin breakdown [27]. However, our MR analysis revealed a reverse association: the FA of BCC was positively associated with AD risk (β = 0.07). One possible explanation for this discrepancy is that an enlarged BCC in early life is a risk factor for AD development at an older age, causing BCC atrophy after disease onset. Confirmation of such a complex theory requires more robust evidence from large-scale longitudinal studies. As for the PT-SCZ association, although it was validated by several additional analyses such as SMR-HEIDI and reverse MR, a single SNP-driven MR result is by nature suspect due to the unexplored pleiotropy [16,28]. Because the volume reduction of PT has been demonstrated in high-risk psychosis and first-episode SCZ patients [29], we suggest that the inferred causality between PT and SCZ is plausible. There were some limitations to this study. Firstly, classic MR methods largely depend on high heritability and strong exposure instruments [16,28]. However, both heritability and number of instruments [17,18] vary across the tested neuroimaging parameters, such that the power of MR is inconsistent across all DTI-Disorder and RD pairs. In fact, 67 DTI-Disorder and 239 RD pairs were discarded at the beginning of our analysis due to the absence of instruments. Even if there were causal links among them, they would not have been detected in our study. Thus, negative results for DD/RD pairs with limited instruments are not as convincing as positive results for those with adequate instruments. For the positive results, it should also be noticed that only the genetically regulated proportion of polymorphisms are associated with the disorders. Secondly, the original GWAS sample sizes may have impacted the MR results since estimation accuracy (i.e., standard error for effect size) for the instrument-outcome relationship is directly linked to the confidence interval of the MR effect estimate [16,28]. Since GWAS OCD and TS recruited fewer than 10,000 cases, their MR results were underpowered. In fact, several DD/RD pairs for OCD and TS had a large MR effect, but their wide CI range resulted in non-significant P values. Future GWAS with a larger sample size, both for neuroimaging polymorphism and neuropsychiatric disorders, will provide a better chance to improve our understanding in this field.

Conclusions
In conclusion, our analysis results demonstrate that, in general, WM structures make a more significant contribution to the etiology of neuropsychiatric disorders-especially BP and AN-than brain subregion volumes. SLF.AxD was causally related to AN; marginally significant relationships were also found between BCC.FA and AD and between PT and SCZ.

Methods
We obtained publicly available GWAS summary statistics for DTI, ROI, and neuropsychiatric without collecting any individual information. Ethics approval was obtained in each of the original studies; therefore, no further ethics approval was needed for the current study.

Data collection and preprocessing
The genetic instruments for DTI measurements have been previously described [17]. Briefly, the ENIGMA-DTI pipeline [30] was used to analyze UKBB diffusion MRI data for 17,706 European participants and generate 110 DTI parameters-namely, fractional anisotropy (FA), axial diffusivity (AxD), mean diffusivity, mode of anisotropy, and radial diffusivity, of 21 WM tracts as well as their mean values. The genetic instrument for ROI volumes was also obtained from UKBB GWAS [18], which included 19,629 European participants and used the standard OASIS-30 Atropos template for registration and Mindboggle-101 atlas for labeling [31].
For each DTI and ROI measurement, we retained single nucleotide polymorphisms (SNPs) with P < 5 × 10 − 8 as strong instruments for MR; measurements without a strong instrument were discarded. We removed SNPs with linkage disequilibrium (LD) r2 ≥ 0.001 for each measurement using reference LD data from the 1000 Genomes Project [44]. Data harmonization was applied independently for each DTI-disease (DD) and ROIdisease (RD) pair with the TwoSampleMR R package   45]. Since many of the GWAS summary statistics analyzed in this study did not provide allele frequency information, we did not exclude SNPs based on ambiguous strand error. For all binary phenotypes, we logtransformed the odds ratio to generate the β value.

Power calculation
We calculated the variance in phenotype explained by each instrument by Where EAF was the effect allele frequency, β was the effect size, N was the sample size, and se(β) was the standard error of effect size. The F statistic was then denoted as F ¼ R 2 Ã ðN À 2Þ 1 À R 2 R 2 and F were used to evaluate power for each instrument. For each RD and DTI-Disorder pair, we calculated the overall MR power using mRnd tool [46], assuming OR = 1.3 and type I error = 0.05. The assumption of OR was based on the actual MR effects passing the significance threshold. We took this assumption because there is limited observational estimation of OR that is currently available. MR power and the number of valid instruments for each pair are recorded in Additional files 2 and 3. Details for all instruments are shown in Additional file 1.

Calculation of MR effects
For DD/RD pairs with at least two instruments, we performed a meta-analysis of each instrument's MR effect using the inverse variance-weighted (IVW) method. The results were considered preliminary results and were used for downstream analyses. For the top IVW findings, we additionally applied weighted mode [47], weighted median [13], and MR-Egger regression [48] approaches-which are relatively robust against horizontal pleiotropy [15]-to further confirm the validity of the MR effect. For DD/RD pairs with only one instrument, estimates based on the Wald ratio were considered preliminary results. For the top Wald ratio finding, we used summary data-based MR (SMR) [49] to confirm the existence of the MR effect and heterogeneity in dependent instruments (HEIDI) [49] to rule out the probability that the MR effect was driven by colocalization of the instrument with the effective locus.
Since both IVW and Wald ratio results were taken into account, the p-value was adjusted by the Bonferroni method by numbers of all DTI-Disorder pairs (1320) or ROI-Disorder pairs (1212).

Sensitivity analysis
The intercept of the Egger regression was used as an indicator of potential horizontal pleiotropy, while modified Cochran's Q for IVW and Rucker's Q for Egger regression [50] were used as an indicator of heterogeneity. We used MR pleiotropy residual sum and outlier (MR-PRES SO) [19] with the number of permutations = 2500 to detect potential outlier SNPs for each DD/RD pair and generate an overall p-value for heterogeneity; those with outlier(s) were reanalyzed by IVW after removing the outlier(s). We also applied leave-one-out tests for all top findings to further evaluate the effects of unknown outliers.
For all MR results, we tested the reverse MR effect (i.e., neuropsychiatric disorders as exposure and DTI/ ROI as outcome) by selecting SNPs with P < 5 × 10 − 8 for each neuropsychiatric disorder as instruments. The absence of a reverse MR effect (IVW P > 0.05) was considered as evidence for the validity of directionality.

Analysis of the general causal contribution
To assess the general contribution of DTI (ROI) polymorphisms to neuropsychiatric disorder, we pooled the IVW P values for all DTI-Disorder (RD) pairs and compared them to the null uniform distribution using quantile-quantile plots. A positive bias (inflation) from uniform distribution was considered as evidence for a general contribution. The significance of inflation was evaluated with the Kolmogorov-Smirnov (KS) test, while the extent of inflation was assessed with the inflation factor λ, which was calculated by chi-square regression using the GenABEL R package [51]. Since the inflation factor might be overestimated due to the small number of P values, we shuffled SNP labels for BP and AN 1,000 times to carry out a permutation test. Permutation P < 0.05 was considered evidence for significant inflation. These tests were also separately applied to each disease and repeated after removing heterogeneous results (those with Cochran's P < 0.05) or outlier SNPs (those with MR-PRESSO P < 0.05).
Additional file 4. Funnel plots for top MR findings, leave-one-out results for top DTI-Disorder pairs, general contribution of DTI to neuropsychiatric disease after sensitivity test adjustment, Rank-Rank overlaps between MR effect and heterogeneity for ROI-Disease analysis.