Multivariate genome wide association and network analysis of subcortical imaging phenotypes in Alzheimer’s disease



Genome-wide association studies (GWAS) have identified many individual genes associated with brain imaging quantitative traits (QTs) in Alzheimer’s disease (AD). However, association discovery at the single-marker level may not capture the biological interactions that underlie the disease mechanism.


In this paper, we used the MGAS (Multivariate Gene-based Association test by extended Simes procedure) tool to perform multivariate GWAS on eight AD-relevant subcortical imaging measures. We conducted multiple iPINBPA (integrative Protein-Interaction-Network-Based Pathway Analysis) network analyses on the MGAS findings using protein-protein interaction (PPI) data, and identified five Consensus Modules (CMs) from the PPI network. Functional annotation and network analysis were performed on the identified CMs. MGAS yielded significant hits within the APOE, TOMM40 and APOC1 genes, which are known AD risk factors, and highlighted several additional genes such as LAMA1, XYLB, HSD17B7P2, and NPEPL1. The five identified CMs were enriched for pathways related to Alzheimer’s disease, Legionellosis, Pertussis, and Serotonergic synapse.


Coupling MGAS with iPINBPA provided higher statistical power than the traditional GWAS method and yielded new findings that GWAS missed. This study provides novel insights into the molecular mechanism of Alzheimer’s disease and will be of value for novel gene discovery and functional genomic studies.


Alzheimer’s disease (AD) is a debilitating and highly heritable disease with great complexity in its genetic contributors [1]. Genome-wide association studies (GWAS) of AD or AD biomarkers have been performed at the single-nucleotide polymorphism (SNP) level [2,3,4] as well as at higher levels (e.g., gene, pathway and/or network) [5,6,7,8]. It is widely recognized that AD has a complicated genetic mechanism involving multiple genes. Because different combinations of functionally related variants in genes and pathways may interact to produce the phenotypic outcomes in AD, single SNP-level and gene-level GWAS results are unlikely to completely reveal the underlying genetic mechanism. GWAS have greatly facilitated the identification of genetic markers (e.g., SNPs) associated with brain imaging quantitative traits (QTs) in AD [9, 10]. As a complex disease, AD is highly likely to be influenced by multiple genetic variants [11, 12], and the identified single-SNP-single-QT associations typically have small effect sizes. To bridge this gap, exploring single-SNP-multi-QT associations has the potential to increase statistical power and identify meaningful imaging genetic associations. Motivated by this observation, we employ the MGAS (Multivariate Gene-based Association test by extended Simes procedure) tool [13] to perform multivariate GWAS on eight AD-relevant subcortical imaging measures.

In addition, biological interactions may contribute importantly to intermediate imaging QTs and overall disease outcomes [14]. Network-based analysis guided by biologically relevant connections from public databases provides a powerful tool for improved mechanistic understanding of complex disorders [15,16,17,18]. Considering that the etiology of AD might depend on functional protein-protein interaction (PPI) networks, we conduct multiple iPINBPA (integrative protein-interaction-network-based pathway analysis) [19] network analyses on the MGAS findings using PPI data, and identify Consensus Modules (CMs) based on the iPINBPA discoveries. Functional annotation and network analysis are subsequently performed on the identified CMs.

To better capture the aggregate effect of multiple SNPs, it may be desirable to perform association analysis at the SNP-set (or gene) level rather than at the single-SNP level. This paper aims to reveal the relationship between genetic markers and multiple phenotypes, improve statistical power, and use MGAS to recover findings missed by conventional GWAS. Network analysis can provide meaningful biological relationships that help interpret GWAS data and further the study of the genetic mechanism of AD. A schematic framework of our analysis is shown in Fig. 1.

Fig. 1 An overview of the proposed analysis framework. (a) Multivariate genome-wide association analysis of eight subcortical imaging measures. (b) Network-based analysis of MGAS findings using the CM-based network strategy. (c) Functional enrichment analysis of the identified consensus modules (CMs)


Participant characteristics

The subjects (N = 866) consisted of 467 males (53.9%) and 399 females (46.1%) aged 48–91 years. Table 1 shows the demographic and clinical characteristics of these subjects stratified by the five diagnostic groups. APOE e4 status does not differ significantly among the five diagnostic groups. Significant differences are observed in gender (p = 0.035), education (p = 0.037), and age (p < 0.001). Furthermore, eight neuroimaging phenotypes (LAmygVol, RAmygVol, LHippVol, RHippVol, LAccumVol, RAccumVol, LPutamVol, RPutamVol; see Table 2) differ significantly across the five diagnostic groups (p < 0.001). Fig. 2 shows the correlation matrix of these eight phenotypes. The correlations between LHippVol and RHippVol (r = 0.83) and between LPutamVol and RPutamVol (r = 0.90) are among the highest.

Table 1 Demographic information and total number of participants involved in each analysis
Table 2 14 FreeSurfer subcortical ROIs
Fig. 2 Phenotypic correlations among the eight subcortical volume traits
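As a minimal sketch of how a phenotype correlation matrix such as the one in Fig. 2 could be computed, the following assumes Pearson correlation (the coefficient used is not stated here) and runs on made-up toy values, not ADNI data. The helper names `pearson` and `correlation_matrix` are illustrative only.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(traits):
    """traits: dict mapping trait name -> list of values (one per subject).
    Returns a nested dict so that result[a][b] is the correlation of a and b."""
    names = list(traits)
    return {a: {b: pearson(traits[a], traits[b]) for b in names} for a in names}

# Toy values for four hypothetical subjects (NOT taken from the study).
toy_traits = {
    "LHippVol": [3.1, 3.5, 2.9, 3.8],
    "RHippVol": [3.2, 3.6, 3.0, 3.7],
    "LPutamVol": [4.9, 4.4, 5.1, 4.6],
}
corr = correlation_matrix(toy_traits)
```

With real data, each list would hold one volume per participant, and the resulting 8 × 8 matrix would correspond to the entries plotted in the figure.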

Multivariate genome wide association study

In the multivariate genome-wide association analysis with MGAS [13, 20], the top SNP hit is rs769449 from the APOE and TOMM40 region (p = 1.19E-09) (Table 3). Following the hypothesis that genes are the functional units in biology [15, 21], we also obtained multivariate gene-level association p-values with MGAS, which combines the p-values from regressing each univariate phenotype on common SNPs. Figure 3 shows the Manhattan plot of the gene-based MGAS results. Using a Bonferroni-corrected p-value of 0.05 as the threshold, three genes (APOE, TOMM40, APOC1) were significantly associated with the eight studied subcortical measures. Table 4 shows the top 10 gene-level findings identified by MGAS, among which APOE (p = 2.77E-08), TOMM40 (p = 3.49E-08), and APOC1 (p = 2.09E-06) lie in well-known AD risk regions. LAMA1 (p = 3.79E-05) encodes a laminin alpha subunit that has been reported in connection with late-onset AD in the Amish [22]. HSD17B7P2 (p = 8.40E-05) was reported to play an important role in brain development [23]. The other five gene-level findings in the top 10 are XYLB, NPEPL1, CYP24A1, OR5B2 and MIR7160.
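The Bonferroni thresholding step can be sketched as follows. The gene-level p-values are those quoted in the paragraph above; the number of genes tested is not reported in this section, so `N_GENES_TESTED = 20000` is a placeholder assumption chosen only for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# Gene-level MGAS p-values quoted in the text.
gene_p = {
    "APOE": 2.77e-08,
    "TOMM40": 3.49e-08,
    "APOC1": 2.09e-06,
    "LAMA1": 3.79e-05,
    "HSD17B7P2": 8.40e-05,
}

# ASSUMPTION: the number of genes tested is not given in the text;
# 20000 is a hypothetical placeholder.
N_GENES_TESTED = 20000

threshold = bonferroni_threshold(0.05, N_GENES_TESTED)
significant = [gene for gene, p in gene_p.items() if p < threshold]
```

Under this placeholder the threshold is 2.5E-06, so only the three genes in the APOE region pass it. A different number of tested genes would shift the cut-off.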

Table 3 The top 10 SNPs identified by MGAS
Fig. 3 A Manhattan plot showing the gene-level p-values in the multivariate GWAS of eight subcortical volumes. The blue line corresponds to p = 10^-5; the red line corresponds to p = 10^-7

Table 4 The top 10 FDR corrected genes identified by MGAS in 8 subcortical ROIs

Consensus modules

Consensus modules (CMs) were constructed based on our previous work [24]. To search for subnetworks in the multivariate GWAS findings, we ran iPINBPA ten times, varying the random seed value from 1 to 10. Table 5 shows the top 5 subnetworks identified in each run, together with the Dice’s coefficient between each subnetwork and its most similar module in the other runs. Compared with the standard iPINBPA method, our CM-based network strategy was designed to identify modules that are more reliable across multiple runs.

Table 5 The characteristics of the identified consensus modules in 10 iPINBPA runs

From the overlapping subnetworks, five unique CMs were identified (Fig. 4). CM1 contains eight genes: MAPK8, ATF2, TNFRSF1A, JUND, NR3C1, RB1, IKBKB, and CDK2. CM2 contains four genes: IGF1R, PRKCA, INSR, and PTPRC. CM3 contains six genes: APP, APOE, CASP3, C3, PLTP, and CNTF. CM4 contains five genes: APP, APOE, CASP3, C3, and PLTP. CM5 contains six genes: MED8, GATA6, HNF4A, MED1, THOC7, and PHYHIP. The individual genes in a CM might not reach statistical significance on their own. However, the genes in an identified module have a collective effect on the studied QTs, and thus have the potential to provide valuable information about the underlying biology.
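To make the overlap between modules concrete, the sketch below computes the shared genes and the Dice’s coefficient for the gene lists given above. It assumes the standard Sorensen-Dice definition, 2|A ∩ B| / (|A| + |B|), since the exact formula is not spelled out in the text.

```python
def dice_coefficient(a, b):
    """Sorensen-Dice similarity between two gene sets (assumed definition)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Gene lists as reported in the text.
CM1 = {"MAPK8", "ATF2", "TNFRSF1A", "JUND", "NR3C1", "RB1", "IKBKB", "CDK2"}
CM3 = {"APP", "APOE", "CASP3", "C3", "PLTP", "CNTF"}
CM4 = {"APP", "APOE", "CASP3", "C3", "PLTP"}

shared_3_4 = CM3 & CM4                 # genes present in both CM3 and CM4
dc_3_4 = dice_coefficient(CM3, CM4)    # 2 * 5 / (6 + 5) = 10/11
dc_1_3 = dice_coefficient(CM1, CM3)    # no shared genes
```

CM4 is a strict subset of CM3 (only CNTF differs), which gives a Dice’s coefficient of 10/11, roughly 0.91, whereas CM1 and CM3 share no genes.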

Fig. 4 Consensus modules identified by the CM-based network strategy. Different CMs are shown in different colors: blue indicates genes in CM1, green indicates genes in CM2, cyan indicates genes in CM3, red indicates genes in CM4, and yellow indicates genes in CM5. Genes appearing in multiple CMs have multiple colors. For example, the genes APOE, APP, CASP3, C3 and PLTP are in both CM3 and CM4

Pathway analysis of consensus modules

We hypothesize that the identified trait-prioritized CMs with high replication across runs might have strong functional associations with the studied subcortical volume phenotypes. We clustered the relevant pathways for the five CMs and plotted a heat map to summarize the relationships between these pathways and the CMs (Fig. 5). Figure 5 shows that Alzheimer’s disease, Apoptosis, TNF signaling pathway, Herpes simplex infection, and MAPK signaling pathway are significantly enriched in one or more CMs [25]. CM1, CM3, and CM4 are enriched for many of these pathways. In particular, CM3 demonstrates the strongest functional association with AD (p = 4.94E-05).

Fig. 5 Functional annotation of the five identified consensus modules (CM1-CM5) using KEGG pathways. The five consensus modules were treated as five gene sets and went through pathway enrichment analysis based on the KEGG pathway database. The enrichment results at a nominal statistical threshold of p < 0.05 are shown. -log10(p) values are color-mapped and displayed in the heat map. Heat map blocks labeled with “x” reach the nominal significance level of p < 0.05. Only top enrichment findings are included in the heat map, so each row (pathway) has at least one “x” block


In this work, we performed a multivariate genome-wide association analysis with MGAS on eight AD-relevant subcortical ROIs, using 866 samples from the ADNI database. To the best of our knowledge, this is the first MGAS analysis of the quantitative traits of eight subcortical ROIs. We confirmed associations at multiple genes previously associated with AD, such as APOE (p = 2.77E-08, rs769449), TOMM40 (p = 3.49E-08, rs769449), and APOC1 (p = 1.18E-06, rs4420638), and identified a few novel associations shown in Table 4. Table 4 also shows that the associations with individual subcortical QTs (e.g., APOE, TOMM40 and APOC1 are associated with LAmygVol, RAmygVol, LHippVol and RHippVol) vary in significance.

XYLB (p = 6.72E-05, rs196376) has been reported to be associated with neurological diseases such as ischemic stroke [26]. We observed that this gene is associated with RAmygVol, LHippVol and RHippVol. LAMA1 (p = 3.79E-05, rs656734) encodes the alpha 1 subunit of laminin, which has been demonstrated to be expressed in the hippocampal neuronal cell layers [27]. NPEPL1 was confirmed to be a potential direct target of miR-19a in a breast cancer study [28], and miR-19a was up-regulated in the primary motor cortex and hippocampus of amyotrophic lateral sclerosis mice at late disease stage [29]. In our study, NPEPL1 was associated with LPutamVol (p = 2.13E-05) and RPutamVol (p = 3.39E-03). In Table 4, the hippocampus and amygdala volumes were associated with multiple genes. Although CYP24A1 was not associated with any of the eight studied QTs individually, MGAS identified an overall association with all eight QTs. This suggests that MGAS can be more powerful than univariate phenotype models at detecting gene associations. Because OR5B2 and MIR7160 have not been reported to be related to AD or AD-related biomarkers, their roles in AD warrant further investigation in independent cohorts.

Because different subnetworks could be identified by using different random seed values, we present the consensus modules discovered by an enhanced iPINBPA strategy. The genes in the CMs might not show direct individual statistical significance but demonstrated a collective effect on the studied phenotypes. We assessed the significance of each identified consensus module (Table 6). CM1 (Score = 3.32, p = 9.00E-04) contains 8 genes in total, including the KEGG AD gene TNFRSF1A. CM2 (Score = 1.41, p = 0.16) contains 4 genes in total and does not reach the significance level. CM3 (Score = 4.37, p = 1.24E-05) contains 6 genes, including the KEGG AD genes APOE, APP, and CASP3. CM4 (Score = 4.32, p = 1.56E-05) contains 5 genes, including the KEGG AD genes APOE, APP, and CASP3. CM5 (Score = 1.89, p = 5.88E-02) contains 6 genes and reaches marginal significance. The genes in the significant CMs warrant further investigation. The consensus module strategy applied to the iPINBPA framework yielded more stable results than the standard iPINBPA.

Table 6 The properties of Consensus Modules identified from the PPI network
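The module scores and p-values quoted above appear consistent with treating each score as a standard-normal z-score and taking a two-sided tail probability. This is an inference from the reported numbers, not a statement of how iPINBPA computes significance. A short check, using only the values given in the text:

```python
import math

def two_sided_normal_p(z):
    """Two-sided tail probability of a standard normal variable at |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# (score, reported p-value) pairs copied from the text.
reported = {
    "CM1": (3.32, 9.00e-04),
    "CM2": (1.41, 0.16),
    "CM3": (4.37, 1.24e-05),
    "CM4": (4.32, 1.56e-05),
    "CM5": (1.89, 5.88e-02),
}

computed = {name: two_sided_normal_p(score) for name, (score, _) in reported.items()}

for name, (score, p_reported) in reported.items():
    print(f"{name}: score={score:.2f}  computed p={computed[name]:.2e}  reported p={p_reported:.2e}")
```

If this reading is right, a module needs a score of about 1.96 or higher to reach p < 0.05, which matches CM2 and CM5 falling short.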

The intersection of CM3 and CM4 yielded five genes: APP, APOE, CASP3, C3, and PLTP. Among these, APP, APOE, and CASP3 are known AD risk factors. The C3 gene was shown to contribute to the pathogenesis of demyelinating disease by directly or indirectly chemoattracting encephalitogenic cells to the CNS [30]. The PLTP gene was reported to play an important role in Aβ metabolism, and further elucidating the functions of PLTP in AD susceptibility is an interesting topic. Table 7 shows the top ten pathways enriched by the intersection genes. Several significant pathways were observed, including Alzheimer’s disease (p-value = 2.50E-05, FDR = 1.54E-03), Legionellosis (p-value = 1.90E-04, FDR = 7.03E-03), Pertussis (p-value = 3.63E-04, FDR = 8.40E-03), Serotonergic synapse (p-value = 8.02E-04, FDR = 1.28E-02), Tuberculosis (p-value = 2.00E-03, FDR = 1.92E-02), and Herpes simplex infection (p-value = 2.13E-03, FDR = 1.92E-02). It has been reported that Legionella pneumophila, one species of Legionella, is an intracellular microorganism that causes Legionellosis. This type of pulmonary infection is usually associated with neurological dysfunction [31]. Serotonergic neurotransmission and synapse activity are highlighted as primary pathological factors in neuropsychiatric symptoms [32, 33]. Pertussis toxin inhibits the apoptosis and DNA synthesis caused by familial AD (FAD) mutants of APP; this DNA synthesis precedes FAD APP-mediated apoptosis in neurons, and inhibiting neuronal entry into the cell cycle inhibits the apoptosis [34]. Apoptotic pathways and DNA synthesis are activated in neurons in the brains of individuals with AD.

Table 7 The top ten pathways enriched by the genes shared by CM3 and CM4
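The pathway results are reported with both a p-value and an FDR value. How the FDR values were computed is not stated here. One common choice is the Benjamini-Hochberg adjustment, sketched below on made-up p-values; the function name and the toy inputs are illustrative assumptions, not the study's data.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy p-values for four hypothetical pathways (NOT values from Table 7).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_fdr = benjamini_hochberg(toy_p)
```

The adjusted value depends on the total number of pathways tested, so the reported FDR values cannot be reproduced from the handful of p-values quoted in the text alone.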

Because of the limited number of samples available to us, we were only able to perform a discovery study in this work. When more data become available, replication studies in independent cohorts will be needed to validate the identified CMs.


In this study, we performed MGAS analysis to explore the multivariate imaging genetic association effects for a set of AD-related subcortical measures. In addition, we conducted iPINBPA network analysis to discover consensus modules related to these imaging phenotypes from a protein-protein interaction network. The MGAS analysis identified several genes associated with the studied imaging phenotypes, including APOE, TOMM40, APOC1, LAMA1, XYLB, HSD17B7P2 and others. Coupling MGAS with iPINBPA provided higher statistical power than the traditional GWAS method and yielded findings missed by GWAS. We reported the top five consensus modules based on the MGAS results. Network-based analysis can take into account information on biological relationships to interpret GWAS data. Our results suggest several susceptibility genes and network modules for further investigation and replication to better understand the genetic mechanism of Alzheimer’s disease.


Subjects and data

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). For up-to-date information, see the ADNI website.

Baseline 3 T MRI scans, demographic information, and diagnoses for the ADNI-1 and ADNI-GO/2 cohorts were downloaded [35]. MRI scans were segmented using FreeSurfer version 5.1. We examined the volume measures of 14 subcortical ROIs; see Tables 1-2. We performed analysis of variance (ANOVA) to evaluate the effect of diagnosis on the 14 volume measures. Using a significance level of p < 0.05, we retained the volume measures of eight subcortical ROIs (i.e., LAmygVol, RAmygVol, LHippVol, RHippVol, LAccumVol, RAccumVol, LPutamVol, RPutamVol; see Table 2) for the subsequent genetic association studies.

Genotyping data of both ADNI-1 and ADNI-GO/2 cohorts were downloaded, and then quality controlled and combined as described in [36]. A total of 866 non-Hispanic Caucasian participants with both complete subcortical imaging measurements and genotyping data were included in the study. The study sample (N = 866) included 183 cognitively normal (CN), 95 significant memory concern (SMC), 281 early MCI (EMCI), 177 late MCI (LMCI) and 130 AD subjects. The demographic and clinical characteristics of participants, stratified by the diagnosis, are shown in Table 1.

Multivariate genome wide association study

GWAS was performed to examine the main effects of 563,980 SNPs on eight subcortical measures as quantitative traits (QTs). Linear regression was performed using PLINK [37] to examine the association between each SNP-QT pair. An additive genetic model was tested with age, gender and brain volume as covariates. We computed the 8 × 8 correlation matrix of the QT data for the eight imaging phenotypes (Fig. 2). We applied the MGAS (Multivariate Gene-based Association test by extended Simes procedure) tool to all 563,980 SNPs and examined their multivariate gene-based associations with the eight imaging QTs [13]. A Manhattan plot was generated using R to visualize the gene-level MGAS results (Fig. 3). We obtained one multivariate gene-based p-value, P_MGAS, per gene as follows.

$$ {P}_{MGAS}=\min \left(\frac{q_e{p}_j}{q_{ej}}\right) $$

Here, q_e represents the effective number of p-values within a gene, q_ej represents the effective number of p-values among the top j p-values, where j runs from 1 to 8 × 563,980, and p_j represents the j-th p-value in the list of ordered p-values. P_MGAS is the smallest weighted p-value within a gene. It tests the null hypothesis that none of the eight phenotypes is related to any of the 563,980 SNPs within the gene against the alternative hypothesis that at least one of the eight phenotypes is related to at least one of the 563,980 SNPs. We identified 1386 genes with p-value < 0.05 [13, 38].
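The formula above can be illustrated with a small sketch. MGAS estimates the effective numbers q_e and q_ej from the correlations among SNPs and phenotypes. That estimation is not reproduced here, so the effective counts are passed in as an input. When none are supplied, the sketch assumes independent tests (q_ej = j), in which case the expression reduces to the ordinary Simes test. The function name and toy numbers are illustrative assumptions.

```python
def extended_simes(pvalues, effective_counts=None):
    """Gene-level p-value: min over j of q_e * p_j / q_ej.

    pvalues: all SNP-by-phenotype p-values mapped to one gene.
    effective_counts: effective_counts[j-1] is q_ej, the effective number of
        p-values among the top j ordered p-values; its last entry is q_e.
        ASSUMPTION: if omitted, tests are treated as independent (q_ej = j).
    """
    ordered = sorted(pvalues)
    m = len(ordered)
    if effective_counts is None:
        effective_counts = list(range(1, m + 1))
    if len(effective_counts) != m:
        raise ValueError("need one effective count per p-value")
    q_e = effective_counts[-1]
    return min(q_e * p / q_ej for p, q_ej in zip(ordered, effective_counts))

# Toy p-values for one hypothetical gene (NOT study data).
toy_p = [0.04, 0.001, 0.2, 0.03]

p_independent = extended_simes(toy_p)                       # plain Simes
p_correlated = extended_simes(toy_p, [1.0, 1.5, 1.8, 2.0])  # hypothetical q_ej
```

In the toy example, correlation among tests lowers the effective number of tests from 4 to 2, so the smallest p-value is penalized less (0.002 instead of 0.004).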

Identifying consensus modules using iPINBPA

This study used protein-protein interaction (PPI) data from the Human Protein Reference Database (HPRD) [39], containing 9617 proteins and 39,240 interactions. Gene-level p-values obtained from the MGAS of subcortical imaging phenotypes were mapped to the PPI network. The annotated network was then analyzed with the iPINBPA (integrative protein-interaction-network-based pathway analysis) procedure [19] to identify enriched PPI network modules. Consensus modules (CMs) were identified using the following approach, which is based on our prior study [24].

Briefly, building on our prior study, we focus on the top 5 subnetworks (TN1, TN2, TN3, TN4, TN5) in each iPINBPA run. Let TN_ij be the i-th top subnetwork identified in the j-th run, where i ∈ {1, 2, …, 5} and j ∈ {1, 2, …, 10}. We first find SN_n(TN_ij), the subnetwork in the n-th run that is most similar to TN_ij, where n ∈ {1, 2, …, 10}\{j}:

$$ S{N}_n\left(T{N}_{ij}\right)=\underset{s_n}{\arg \max }\ DC\left(T{N}_{ij},{s}_n\right), $$

where s_n is any subnetwork enriched in Run n, and DC(x, y) denotes the Dice’s coefficient between two subnetworks x and y. For Run j, we then define the i-th consensus module CM_ij as

$$ C{M}_{ij}=T{N}_{ij}\cap \left(\bigcap \limits_{n\in \left\{1,2,\dots ,10\right\}\backslash \left\{j\right\}}S{N}_n\left(T{N}_{ij}\right)\right) $$

That is, CM_ij is the intersection of TN_ij and its most similar subnetworks identified in all the other runs. In our empirical study, we report the consensus modules based on Run 1, i.e., CM_i1 is the i-th consensus module.
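The consensus-module definition can be sketched in a few lines. The example assumes each run is represented as a list of gene sets, uses the standard Sorensen-Dice formula for DC, and runs on made-up subnetworks with placeholder gene labels. None of the names below come from iPINBPA itself.

```python
def dice(a, b):
    """Sorensen-Dice similarity between two sets (assumed definition of DC)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def most_similar(target, candidates):
    """SN_n(TN_ij): the candidate subnetwork with the highest Dice score."""
    return max(candidates, key=lambda s: dice(target, s))

def consensus_module(runs, j, i):
    """CM_ij: intersect TN_ij with its most similar subnetwork in every other run.

    runs: list of runs; each run is a list of subnetworks (sets of gene labels).
    j, i: zero-based indices of the reference run and of the top subnetwork.
    """
    target = runs[j][i]
    module = set(target)
    for n, subnetworks in enumerate(runs):
        if n != j:
            module &= most_similar(target, subnetworks)
    return module

# Toy runs with placeholder gene labels (NOT iPINBPA output).
toy_runs = [
    [{"G1", "G2", "G3", "G4"}, {"G7", "G8"}],
    [{"G7", "G8", "G9"}, {"G1", "G2", "G3"}],
    [{"G1", "G2", "G3", "G5"}, {"G8", "G10"}],
]
cm_first = consensus_module(toy_runs, j=0, i=0)
cm_second = consensus_module(toy_runs, j=0, i=1)
```

Only genes that recur in the matched subnetwork of every other run survive the intersection, which is why a consensus module can be smaller than any single-run subnetwork.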

Functional analysis

Cytoscape 3.4 [40] was used to visualize the identified CMs. We used the ToppGene online tool for functional enrichment analysis. The ToppGene Suite is a bioinformatics tool that detects and prioritizes candidate genes from a given gene list through a comprehensive assessment of a variety of factors, including gene ontology (GO) annotation, phenotype, signaling pathway and protein interactions [41]. The top 10 findings of our multivariate gene-based association analysis were analyzed for functional enrichment. For the identified CMs, we also performed functional enrichment analysis using the ToppGene Suite.
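The statistical test behind the enrichment analysis is not described here. A hypergeometric (one-sided Fisher) test is a common choice for gene-set enrichment, and the sketch below shows that calculation under this assumption. The function name and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_pathway, n_module, n_overlap):
    """P(X >= n_overlap) when n_module genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_module)
    total = comb(n_background, n_module)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Small case that can be checked by hand: (6*6 + 4*1) / 120 = 1/3.
p_small = hypergeom_enrichment_p(n_background=10, n_pathway=4, n_module=3, n_overlap=2)

# Hypothetical counts: a 6-gene module with 3 genes in a 170-gene pathway,
# against a background of 20000 genes (all three numbers are assumptions).
p_toy = hypergeom_enrichment_p(n_background=20000, n_pathway=170, n_module=6, n_overlap=3)
```

Small modules can yield very small p-values from only a few overlapping genes, which is one reason enrichment results for modules of four to eight genes are usually read alongside a multiple-testing correction.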

Availability of data and materials

The genotyping and subcortical imaging phenotype data were downloaded from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Anyone can apply for access to the ADNI data; the process includes completion of an online application form and acceptance of the Data Use Agreement. We have received administrative approval for access to the ADNI database. The human PPI data were downloaded from the public Human Protein Reference Database.



AD: Alzheimer’s disease
QT: Quantitative trait
QTs: Quantitative traits
GWAS: Genome-wide association studies
SNP: Single nucleotide polymorphism
SNPs: Single nucleotide polymorphisms
MGAS: Multivariate Gene-based Association test by extended Simes procedure
iPINBPA: Integrative Protein-Interaction-Network-Based Pathway Analysis
PPI: Protein-protein interaction
CM: Consensus module
CMs: Consensus modules
HPRD: Human Protein Reference Database
ADNI: Alzheimer’s Disease Neuroimaging Initiative
MRI: Magnetic resonance imaging
PET: Positron emission tomography
MCI: Mild cognitive impairment
CN: Cognitively normal
SMC: Significant memory concern
EMCI: Early mild cognitive impairment
LMCI: Late mild cognitive impairment
ANOVA: Analysis of variance
DC: Dice’s coefficient
SN: Groups of enriched subnetworks
TN: Top subnetwork


  1. Andrews SJ, Fulton-Howard B, Goate A. Interpretation of risk loci from genome-wide association studies of Alzheimer's disease. Lancet Neurol. 2020;19(4):326–35.

  2. Kunkle BW, Grenier-Boley B, Sims R, Bis JC, Damotte V, Naj AC, Boland A, Vronskaya M, van der Lee SJ, Amlie-Wolf A, et al. Genetic meta-analysis of diagnosed Alzheimer's disease identifies new risk loci and implicates Abeta, tau, immunity and lipid processing. Nat Genet. 2019;51(3):414–30.

  3. Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, Sealock J, Karlsson IK, Hagg S, Athanasiu L, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer's disease risk. Nat Genet. 2019;51(3):404–13.

  4. Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, DeStafano AL, Bis JC, Beecham GW, Grenier-Boley B, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat Genet. 2013;45(12):1452–8.

  5. Mukherjee S, Kim S, Ramanan VK, Gibbons LE, Nho K, Glymour MM, Ertekin-Taner N, Montine TJ, Saykin AJ, Crane PK, et al. Gene-based GWAS and biological pathway analysis of the resilience of executive functioning. Brain Imaging Behav. 2014;8(1):110–8.

  6. Ramanan VK, Kim S, Holohan K, Shen L, Nho K, Risacher SL, Foroud TM, Mukherjee S, Crane PK, Aisen PS, et al. Genome-wide pathway analysis of memory impairment in the Alzheimer's disease neuroimaging initiative (ADNI) cohort implicates gene candidates, canonical pathways, and networks. Brain Imaging Behav. 2012;6(4):634–48.

  7. Yao X, Yan J, Liu K, Kim S, Nho K, Risacher SL, Greene CS, Moore JH, Saykin AJ, Shen L, et al. Tissue-specific network-based genome wide study of amygdala imaging phenotypes to identify functional interaction modules. Bioinformatics. 2017;33(20):3250–7.

  8. Shen L, Thompson PM. Brain imaging genomics: integrated analysis and machine learning. Proc IEEE Inst Electr Electron Eng. 2020;108(1):125–62.

  9. Shen L, Kim S, Risacher SL, Nho K, Swaminathan S, West JD, Foroud T, Pankratz N, Moore JH, Sloan CD, et al. Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: a study of the ADNI cohort. NeuroImage. 2010;53(3):1051–63.

  10. Shen L, Thompson PM, Potkin SG, Bertram L, Farrer LA, Foroud TM, Green RC, Hu X, Huentelman MJ, Kim S, et al. Genetic analysis of quantitative phenotypes in AD and MCI: imaging, cognition and biomarkers. Brain Imaging Behav. 2014;8(2):183–207.

  11. Lee S, Kerns S, Ostrer H, Rosenstein B, Deasy JO, Oh JH. Machine learning on a genome-wide association study to predict late genitourinary toxicity after prostate radiation therapy. Int J Radiat Oncol Biol Phys. 2018;101(1):128–35.

  12. Kim J, Zhang Y, Pan W, Alzheimer's Disease Neuroimaging Initiative. Powerful and adaptive testing for multi-trait and multi-SNP associations with GWAS and sequencing data. Genetics. 2016;203(2):715–31.

  13. Van der Sluis S, Dolan CV, Li J, Song Y, Sham P, Posthuma D, Li MX. MGAS: a powerful tool for multivariate gene-based genome-wide association analysis. Bioinformatics. 2015;31(7):1007–15.

  14. Liu Y, Maxwell S, Feng T, Zhu X, Elston RC, Koyuturk M, Chance MR. Gene, pathway and network frameworks to identify epistatic interactions of single nucleotide polymorphisms derived from GWAS data. BMC Syst Biol. 2012;6(Suppl 3):S15.

  15. Marin M, Esteban FJ, Ramirez-Rodrigo H, Ros E, Saez-Lara MJ. An integrative methodology based on protein-protein interaction networks for identification and functional annotation of disease-relevant genes applied to channelopathies. BMC Bioinformatics. 2019;20(1):565.

  16. Gosak M, Markovic R, Dolensek J, Slak Rupnik M, Marhl M, Stozer A, Perc M. Network science of biological systems at different scales: a review. Phys Life Rev. 2018;24:118–35.

  17. Yan J, Risacher SL, Shen L, Saykin AJ. Network approaches to systems biology analysis of complex disease: integrative methods for multi-omics data. Brief Bioinform. 2018;19(6):1370–81.

  18. Li J, Chen F, Zhang Q, Meng X, Yao X, Risacher SL, Yan J, Saykin AJ, Liang H, Shen L, et al. Genome-wide network-assisted association and enrichment study of amyloid imaging phenotype in Alzheimer's disease. Curr Alzheimer Res. 2019;16(13):1163–74.

  19. Wang L, Mousavi P, Baranzini SE. iPINBPA: an integrative network-based functional module discovery tool for genome-wide association studies. Pac Symp Biocomput. 2015:255–66.

  20. Vroom CR, Posthuma D, Li MX, Dolan CV, van der Sluis S. Multivariate gene-based association test on family data in MGAS. Behav Genet. 2016;46(5):718–25.

  21. Barabasi AL, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56–68.

  22. D'Aoust LN, Cummings AC, Laux R, Fuzzell D, Caywood L, Reinhart-Mercer L, Scott WK, Pericak-Vance MA, Haines JL. Examination of candidate Exonic variants for association to Alzheimer disease in the Amish. PLoS One. 2015;10(2):e0118043.

  23. Shehu A, Mao J, Gibori GB, Halperin J, Le J, Devi YS, Merrill B, Kiyokawa H, Gibori G. Prolactin receptor-associated protein/17beta-hydroxysteroid dehydrogenase type 7 gene (Hsd17b7) plays a crucial role in embryonic development and fetal survival. Mol Endocrinol. 2008;22(10):2268–77.

  24. Cong W, Meng X, Li J, Zhang Q, Chen F, Liu W, Wang Y, Cheng S, Yao X, Yan J, et al. Genome-wide network-based pathway analysis of CSF t-tau/Abeta1-42 ratio in the ADNI cohort. BMC Genomics. 2017;18(1):421.

  25. Tamura K, Wakimoto H, Agarwal AS, Rabkin SD, Bhere D, Martuza RL, Kuroda T, Kasmieh R, Shah K. Multimechanistic tumor targeted oncolytic virus overcomes resistance in brain tumors. Mol Ther. 2013;21(1):68–77.

  26. Zhang YW, Tong YQ, Zhang Y, Ding H, Zhang H, Geng YJ, Zhang RL, Ke YB, Han JJ, Yan ZX, et al. Two novel susceptibility SNPs for ischemic stroke using exome sequencing in Chinese Han population. Mol Neurobiol. 2014;49(2):852–62.

  27. Chen ZL, Strickland S. Neuronal death in the hippocampus is promoted by plasmin-catalyzed degradation of laminin. Cell. 1997;91(7):917–25.

  28. Ouchida M, Kanzaki H, Ito S, Hanafusa H, Jitsumori Y, Tamaru S, Shimizu K. Novel direct targets of miR-19a identified in breast cancer cells by a quantitative proteomic approach. PLoS One. 2012;7(8):e44095.

  29. Marcuzzo S, Bonanno S, Kapetis D, Barzago C, Cavalcante P, D'Alessandro S, Mantegazza R, Bernasconi P. Up-regulation of neural and cell cycle-related microRNAs in brain of amyotrophic lateral sclerosis mice at late disease stage. Mol Brain. 2015;8:5.

  30. Boos L, Campbell IL, Ames R, Wetsel RA, Barnum SR. Deletion of the complement anaphylatoxin C3a receptor attenuates, whereas ectopic expression of C3a in the brain exacerbates, experimental autoimmune encephalomyelitis. J Immunol. 2004;173(7):4708–14.

  31. Lagana P, Soraci L, Gambuzza ME, Delia S. Innate immune surveillance in the central nervous system in Legionella Pneumophila infection. CNS Neurol Disord Drug Targets. 2017.

  32. Ballard C, Aarsland D, Francis P, Corbett A. Neuropsychiatric symptoms in patients with dementias associated with cortical Lewy bodies: pathophysiology, clinical features, and pharmacological management. Drugs Aging. 2013;30(8):603–11.


  33. Doraiswamy PM. Non-cholinergic strategies for treating and preventing Alzheimer's disease. CNS Drugs. 2002;16(12):811–24.


  34. McPhie DL, Coopersmith R, Hines-Peralta A, Chen Y, Ivins KJ, Manly SP, Kozlowski MR, Neve KA, Neve RL. DNA synthesis and neuronal apoptosis caused by familial Alzheimer disease mutants of the amyloid precursor protein are mediated by the p21 activated kinase PAK3. J Neurosci. 2003;23(17):6914–27.


  35. Fischl B, Dale AM. Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proc Natl Acad Sci U S A. 2000;97(20):11050–5.


  36. Li J, Zhang Q, Chen F, Yan J, Kim S, Wang L, Feng W, Saykin AJ, Liang H, Shen L. Genetic interactions explain variance in cingulate amyloid burden: an AV-45 PET genome-wide association and interaction study in the ADNI cohort. Biomed Res Int. 2015;2015:647389.


  37. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.


  38. Li MX, Gui HS, Kwan JS, Sham PC. GATES: a rapid and powerful gene-based association test using extended Simes procedure. Am J Hum Genet. 2011;88(3):283–93.


  39. Goel R, Harsha HC, Pandey A, Prasad TS. Human protein reference database and human Proteinpedia as resources for phosphoproteome analysis. Mol BioSyst. 2012;8(2):453–63.


  40. Demchak B, Hull T, Reich M, Liefeld T, Smoot M, Ideker T, Mesirov JP. Cytoscape: the network visualization tool for GenomeSpace workflows. F1000Res. 2014;3:151.


  41. Chen J, Bardes EE, Aronow BJ, Jegga AG. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 2009;37(Web Server issue):W305–11.




Alzheimer’s Disease Neuroimaging Initiative.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at:

About this supplement

This article has been published as part of BMC Genomics Volume 21 Supplement 11, 2020: Bioinformatics methods for biomedical data science. The full contents of the supplement are available at


Data analysis, result interpretation and manuscript writing were supported in part by grants from the National Natural Science Foundation of China (61901063, 61803117, 61773134), the Heilongjiang Provincial Natural Science Foundation of China (QC2018080), the Humanities and Social Science Fund of the Ministry of Education of China (19YJCZH120), the Qinglan Project of Jiangsu Province (2020), the Science and Technology Plan Project of Changzhou (CE20205042), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (19KJB520003). Data analysis, method development, and manuscript editing were supported in part by NIH R01 EB022574, U01 AG024904, R01 AG19771, and P30 AG10133 at U Penn and IU. Publication costs were funded by the National Natural Science Foundation of China (61901063).

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

Author information





X.M., J.L., H.L. and L.S. led and supervised the research. X.M., H.L., L.S., and J.L. designed the research and wrote the article. X.M. performed the gene analysis, MGAS analysis, network mapping, consensus modules analysis, and pathway analysis. C.B., Q.Z., F.C., and Z.X. prepared subcortical imaging measurements and genotyping data, and performed quality control. X.Y., J.Y., S.L.R., A.J.S., and L.S. provided guidance and consultation on the genotyping and biomarker details about ADNI data, data preprocessing, quality control, population stratification, GWAS protocol, gene analysis, network analysis and pathway analysis. All the authors reviewed, commented, edited and approved the manuscript.

Corresponding authors

Correspondence to Hong Liang or Li Shen.

Ethics declarations

Ethics approval and consent to participate

The study procedures were approved by the institutional review boards of all participating centers (see ADNI_Acknowledgement_List.pdf), and written informed consent was obtained from all participants or their authorized representatives. Ethics approval was obtained from the institutional review boards of each institution involved: Oregon Health and Science University; University of Southern California; University of California—San Diego; University of Michigan; Mayo Clinic, Rochester; Baylor College of Medicine; Columbia University Medical Center; Washington University, St. Louis; University of Alabama at Birmingham; Mount Sinai School of Medicine; Rush University Medical Center; Wien Center; Johns Hopkins University; New York University; Duke University Medical Center; University of Pennsylvania; University of Kentucky; University of Pittsburgh; University of Rochester Medical Center; University of California, Irvine; University of Texas Southwestern Medical School; Emory University; University of Kansas Medical Center; University of California, Los Angeles; Mayo Clinic, Jacksonville; Indiana University; Yale University School of Medicine; McGill University, Montreal-Jewish General Hospital; Sunnybrook Health Sciences, Ontario; U.B.C. Clinic for AD & Related Disorders; Cognitive Neurology—St. Joseph’s, Ontario; Cleveland Clinic Lou Ruvo Center for Brain Health; Northwestern University; Premiere Research Inst (Palm Beach Neurology); Georgetown University Medical Center; Brigham and Women’s Hospital; Stanford University; Banner Sun Health Research Institute; Boston University; Howard University; Case Western Reserve University; University of California, Davis—Sacramento; Neurological Care of CNY; Parkwood Hospital; University of Wisconsin; University of California, Irvine—BIC; Banner Alzheimer’s Institute; Dent Neurologic Institute; Ohio State University; Albany Medical College; Hartford Hospital, Olin Neuropsychiatry Research Center; Dartmouth-Hitchcock Medical Center; Wake Forest University Health Sciences; Rhode Island Hospital; Butler Hospital; UC San Francisco; Medical University South Carolina; St. Joseph’s Health Care; Nathan Kline Institute; University of Iowa College of Medicine; Cornell University; and University of South Florida: USF Health Byrd Alzheimer’s Institute.

Consent for publication

Not applicable.

Competing interests

The authors have no actual or potential conflicts of interest including any financial, personal, or other relationships with other people or organizations that could inappropriately influence (bias) our work.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Meng, X., Li, J., Zhang, Q. et al. Multivariate genome wide association and network analysis of subcortical imaging phenotypes in Alzheimer’s disease. BMC Genomics 21 (Suppl 11), 896 (2020).




  • Brain imaging
  • Multivariate gene-based genome-wide analysis
  • iPINBPA network analysis
  • Consensus modules