Genome-wide association study using family-based cohorts identifies the WLS and CCDC170/ESR1 loci as associated with bone mineral density

Background Osteoporosis is a common and debilitating bone disease that is characterised by a low bone mineral density (BMD), a highly heritable trait. Genome-wide association studies (GWAS) have proven to be very successful in identifying common genetic variants associated with BMD adjusted for age, gender and weight, however a large portion of the genetic variance for this trait remains unexplained. There is evidence to suggest significant genetic correlation between body size traits and BMD. It has also recently been suggested that unintended bias can be introduced as a result of adjusting a phenotype for a correlated trait. We performed a GWAS meta-analysis in two populations (total n = 6,696) using BMD data adjusted for only age and gender, in an attempt to identify genetic variants associated with BMD including those that may have potential pleiotropic effects on BMD and body size traits. Results We observed a single variant, rs2566752, associated with spine BMD at the genome-wide significance level in the meta-analysis (P = 3.36 × 10−09). Logistic regression analysis also revealed an association between rs2566752 and fracture rate in one of our study cohorts (P = 0.017, n = 5,654). This is an intronic variant located in the wntless Wnt ligand secretion mediator (WLS) gene (1p31.3), a known BMD locus which encodes an integral component of the Wnt ligand secretion pathway. Bioinformatics analyses of variants in moderate LD with rs2566752 produced strong evidence for a regulatory role for the variants rs72670452, rs17130567 and rs1430738. Expression quantitative trait locus (eQTL) analysis suggested that the variants rs12568456 and rs17130567 are associated with expression of the WLS gene in whole blood, cerebellum and temporal cortex brain tissue (P = 0.034–1.19 × 10−23). Gene-wide association testing using the VErsatile Gene-based Association Study 2 (VEGAS2) software revealed associations between the coiled-coil domain containing 170 (CCDC170) gene, located adjacent to the oestrogen receptor 1 (ESR1) gene, and BMD at the spine, femoral neck and total hip sites (P = 1.0 × 10−06, 2.0 × 10−06 and 2.0 × 10−06 respectively). Conclusions Genetic variation at the WLS and CCDC170/ESR1 loci were found to be significantly associated with BMD adjusted for only age and gender at the genome-wide level in this meta-analysis.


Background
Osteoporosis is a common and debilitating bone disease that is characterised by a low bone mineral density (BMD) and micro architectural deterioration of the bone tissue, leading to decreased bone strength and an increased risk of fracture [1]. Excess mortality caused by osteoporotic fracture in women has been estimated at 9 % 1-year post fracture and 24 % 5-years post fracture [2]. The disease is particularly prevalent in postmenopausal women due to a reduction in oestrogen production, with subsequent effects on bone as well as intestinal and renal calcium handling [3,4]. Environmental factors, such as dietary calcium intake and exercise, also play a role in the disease [5,6].
In addition to the effects of oestrogen and environmental factors, osteoporosis has a strong genetic component. BMD, as assessed by dual-energy X-ray absorptiometry (DXA), is currently the best clinical indicator of fracture risk and is a highly heritable trait. Twin and family studies have generated BMD heritability estimates ranging from 0.46-0.92 depending on the anatomical site studied [7,8], while individuals with an affected first-degree relative have a considerably elevated familial relative risk of fragility fracture of 1.31-4.24 [9,10]. Various measures of body size, including height and body mass index (BMI), have also been shown to have substantial heritable components [11,12]. There is evidence to suggest significant genetic correlation between these traits and BMD [13,14], and the existence of genes with pleiotropic effects [15].
Genome-wide association studies (GWAS) have proven to be very successful in identifying common genetic variants associated with BMD, with at least 71 loci reported as associated at a high level of confidence (NHGRI GWAS Catalogue [16]). However, a large portion of the genetic variance for BMD remains unexplained and many of the most significantly associated variants appear to contribute little to fracture risk [17]. The vast majority of genetic studies for BMD so far have used weight, age and gender as covariates to identify genetic variants associated with BMD independently of these factors. However, this potentially removes some of the influence of genes with pleiotropic effects on body size and BMD, and it has recently been suggested that unintended bias can be introduced as a result of adjusting a phenotype for a correlated trait [18]. We decided to perform a GWAS meta-analysis using BMD data adjusted for only age and gender, in an attempt to identify genetic variants associated with BMD including those that may have potential mediated pleiotropic effects on body size traits and bone density (whereby the skeleton adapts to the extra load by increasing bone density, or excess fat mass leads to altered secretion of bone active hormones such as oestrogen, leptin and adiponectin [19]), while also removing the possibility of false positives induced by collider bias.

Genetics of Osteoporosis (GENOS) cohort
The discovery population used in this study is known as the Genetics of Osteoporosis (GENOS) cohort. This cohort is based on~1,050 individuals from an extreme discordant and concordant (EDAC) family-based study of Northern European/UK ancestry [20][21][22]. The EDAC families were selected based on containing a proband aged 25-83 years and having a lumbar spine, femoral neck or total hip BMD Z-score (defined as the number of standard deviations above or below the mean BMD of an age and gender matched control population) of < −1.5. The GENOS cohort is a powerful resource for detection of loci relevant to osteopenia and osteoporosis due to the EDAC study design and enrichment of genetic effects [23]. Clinical data were collected for BMD at the spine, total hip and femoral neck as well as extensive medical and lifestyle data. Exclusion criteria were applied and included hyperparathyroidism, long-term steroid use (>6 months), rheumatoid arthritis, anorexia nervosa or surgical oophorectomy. All subjects from the study provided written informed consent and the experimental protocols were approved by the Sir Charles Gairdner Group Human Research Ethics Committee and the St Thomas' Hospital Research Ethics Committee.
At a clinic visit, data including age, height, weight, medical, gynaecological, and lifestyle factors were recorded and a blood sample was collected for DNA extraction. DXA BMD was assessed (Hologic, Bedford, MA, USA) at the lumbar spine L1-L4, femoral neck and total hip. BMD data were adjusted for age and gender prior to analysis by conversion to BMD Z-scores using the formula: (patient BMDmean BMD of age and gender matched control population)/standard deviation. The NHANES (National Health and Nutrition Examination Survey) 3 reference ranges were used to calculate BMD Z-scores at the femoral neck and total hip sites, while the Hologic reference range was used for the lumbar spine.

TwinsUK cohort
The replication population used in this study is known as the TwinsUK cohort and is comprised of~12,000 monozygotic and dizygotic twins unselected for any particular disease or trait from St Thomas' UK Adult Twin Registry (TwinsUK). The cohort is from Northern European/UK ancestry and has been shown to be representative of singleton populations and the UK population in general [24]. Medical and lifestyle-factor data were obtained from questionnaires, with exclusion criteria applied including rheumatoid arthritis, oral steroid use or surgical oophorectomy. All participants provided written, informed consent and the research was approved by the Guy's and St Thomas' Hospital Research Ethics Committee.
Clinical data for most of the twins were obtained at several time points for multiple phenotypes including fracture data (any fracture since 16 years of age) and DXA BMD (Hologic, Bedford, MA, USA) at the lumbar spine, femoral neck and total hip sites. DNA for genotyping was extracted from whole blood samples obtained for the vast majority of the cohort at the time of the study visit. BMD measures were adjusted for age and gender by conversion to BMD Z-scores as described above.

Genotyping and imputation
Genotyping was performed for 1,046 individuals in the GENOS cohort using the Illumina HumanOmniExpress-12 v1.1 700 K BeadChip, with genotypes called using the GenCall algorithm (GenomeStudio). Quality control criteria were applied and included gender, ancestry (principal components analysis (EIGENSTRAT) [25]) and relatedness (Genome-wide Complex Trait Analysis (GCTA) [26]) checks. Samples with a genotype call rate <99 % were excluded as were any that were not of Northern European ancestry or that failed the above checks. Variants with a call rate <90 % or Hardy-Weinberg P ≤ 10 −6 were also excluded. The genotype data was pre-phased using SHA-PEIT2 [27] before imputation was performed using the IMPUTE2 software package [28] in conjunction with the 1000 Genomes Project Phase 1 and UK10K Project reference panels. Imputed variants with an "info" score <0.4 were excluded.
Genotyping in the TwinsUK cohort was completed for 5,654 individuals using the Illumina HumanHap300, HumanHap610, 1 M-Duo and 1.2 M-Duo arrays, as described previously [29,30]. Imputation was performed using IMPUTE2 [28] in conjunction with the 1000 Genomes Project Phase 1 reference panel. Imputed variants with an info score <0.4 were excluded.

Statistical analysis
Genome-wide association analyses for BMD at the lumbar spine, femoral neck and total hip sites was performed in each cohort using GEMMA (Genome-wide Efficient Mixed Model Association) [31], which controls for familial relatedness within a cohort. Only variants with a minor allele frequency (MAF) ≥1 % were included in the analysis. Meta-analysis of the results from the two cohorts was performed using the GWAMA (Genome-Wide Association Meta-Analysis) software package [32] a fixed effects model was applied, combining estimates of the allelic effect size and standard error from each cohort. Genome-wide significant and suggestive thresholds were set at 1.17 × 10 −08 [33,34] and 5 × 10 −07 respectively. Assuming an additive genetic model, our combined cohorts (total n = 6,696) have an estimated 88.7 % power to detect a variant with a MAF of 1 % that accounts for 0.2 % of the trait variance [35]. Any variants associated with BMD at the genome-wide significance level were tested for association with fracture rate in the Twin-sUK cohort using logistic regression adjusted for age, age 2 , gender, height and weight (SPSS Statistics 22). Gene-wide (+/− 10Kb) tests of association with BMD were performed on the GWAS meta-analysis results using the VErsatile Gene-based Association Study 2 (VEGAS2) software [36], which assigns variants to genes and calculates gene-based empirical association p-values while accounting for the LD structure within the gene. We have found the +/− 10 Kb option to be a good balance between incorporating shortrange regulatory variants while maintaining the specificity of the result for a specific gene, as variants associated with neighbouring genes can influence the test statistic for a gene of interest. A genome-wide Bonferroni-corrected significance threshold of 2.14 × 10 −06 for 23,390 gene tests was used, with suggestive significance set at 1 × 10 −05 .

Bioinformatics analysis
Analysis of the linkage disequilibrium (LD) surrounding variants of interest was performed using LDlink (1000 Genomes Project Phase 3 EUR population) [37]. Prediction of histone marks, DNAse hypersensitivity sites and expression quantitative trait locus (eQTL) associations was performed using HaploReg v4.0 [38] and genomic evolutionary rate profiling (GERP) scores were obtained using GWAVA (Genome Wide Annotation of Variants) [39]. Variants of interest were also queried using the human osteoblast eQTL dataset generated by Grundberg et al. [40].

Results
Descriptive statistics for the two cohorts are presented in Table 1 and quantile-quantile plots generated for the meta-analysis results are shown in Fig. 1a, b and c (spine BMD λ = 0.99, femoral neck BMD λ = 0.99, total hip BMD λ = 1.00). Genome-wide association analysis of the family-based discovery cohort (GENOS), enriched for individuals with low BMD, did not yield any genome-wide significant results. Therefore, we next performed a metaanalysis of both discovery and replication cohorts.

Spine BMD
We observed a single variant, rs2566752, associated with spine BMD at the genome-wide significance level in the meta-analysis (P = 3.36 × 10 −09 ) (Fig. 2a) ( Table 2). This is an intronic variant (C/T) located in the wntless Wnt ligand secretion mediator (WLS) gene (1p31.3) (Fig. 3), the less common C allele being associated with an increased spine BMD. This allele was also found to be associated with a decreased risk of fracture in the TwinsUK cohort (odds ratio (OR) = 0.86 (95 % confidence interval (CI): 0.77-0.97), P = 0.017).

Bioinformatics analysis
Analysis of LD in the WLS gene region showed there were no variants in strong LD (r 2 > 0.8) and 10 variants in moderate LD (r 2 > 0.5) with rs2566752 ( Table 3). All of these variants were associated with spine BMD P < 5 × 10 −05 in the meta-analysis, apart from rs1430738 (P = 0.002) and rs36009202 (not present in genotype dataset). Bioinformatics analysis of these variants suggested the presence of various regulatory features including promoter histone marks, enhancer histone marks and DNAse hypersensitivity sites in multiple cell types for the variants rs72670452, rs17130567 and rs1430738 (Table 3). According to the HaploReg v4.0 eQTL database [38], which contains eQTL data for a variety of tissues including results from the pilot phase of the Genotype-Tissue Expression (GTEx) project [41], the two variants rs12568456 and rs17130567 were also associated with expression of the WLS gene in whole blood, cerebellum and temporal cortex brain tissue (P = 0.034-1.19 × 10 −23 ) [42,43]. In all three tissues, the less common allele at these two variants was associated with increased expression of the WLS gene. The variant rs72670452 was found to have a GERP score of 3.95 ( Table 3), suggesting that the site may be under evolutionary constraint. GERP scores are used to quantify nucleotide substitution deficits, which represent "rejected substitutions" that reflect past purifying selection [44]. Apart from rs36009202 and rs75334237, which were not present in the dataset, none of the variants in Table 3 were found to be associated with expression of the WLS gene in primary human osteoblasts (P = 0.16-0.92) [40]. Analysis of the LD between the maximally associated variants from the CCDC170 gene for each BMD phenotype suggested that the rs6557156 and rs62444275 variants are in moderate LD (r 2 = 0.31), whereas the rs1038304 variant is in low LD with rs6557156 and rs62444275 (r 2 = 0.05 and 0.02 respectively).

Discussion
We undertook this study to identify genes significantly associated with BMD adjusted for only age and gender in order to investigate all genes regulating BMD including those with possible pleiotropic effects extending to aspects of body size. We have demonstrated a genomewide significant association between a variant in the WLS gene (also known as GPR177) and BMD at the spine in this meta-analysis, as well as an association with fracture rate in the TwinsUK cohort. The product of the WLS gene is a chaperone protein that is an integral component of the Wnt ligand secretion pathway. There are 19 Wnt proteins in the mammalian genome, and they represent an evolutionarily conserved family of secreted signalling molecules. Both the canonical (β-   . The red bars represent the gene-wide region tested and association P value (−log 10 ). The recombination rate (blue line) and position of genes, their exons and direction of transcription is also indicated [69] catenin dependent) and non-canonical (β-catenin independent) Wnt signalling pathways have been shown to play important roles in prenatal and postnatal bone development [45]. The product of the WLS gene has been shown to be required for the activity of virtually all of the Wnt proteins [46] and Wls-knockout mice display early embryonic lethality due to impaired body axis formation [47]. Conditional deletion of the Wls gene in mice has been found to severely impair the development of the craniofacial and body skeletons, demonstrating a role in intramembranous and endochondral ossification respectively [48]. Genetic variation within the WLS gene has been identified as associated with BMD in several previous GWAS [17,[49][50][51], some of which have observed more than one independent association signal originating from the locus [17,51]. The most recent of these studies, a large metaanalysis that included a subset of the TwinsUK cohort and used a combination of whole-genome sequence, wholeexome sequence and deeply imputed genotype data in its discovery study (n ≈ 33,000), also identified the rs2566752 single nucleotide polymorphism (SNP) as the maximally associated variant from the WLS gene region for both lumbar spine and femoral neck BMD [51]. Consistent with our findings, this variant was most strongly associated with BMD at the spine and the C allele was associated with an increased BMD [51]. Interestingly, pleiotropic effects on BMD and bone geometric parameters have been previously observed at the WLS locus [52], although it should be noted that to our knowledge variation in the WLS gene has not been previously associated with body size related traits such as height, weight or BMI. Liu et al. [15] published the results from a bivariate GWAS for BMI and hip BMD, identifying significant associations for two SNPs in the SOX6 gene in males. No significant associations were seen for variants in the WLS locus, however only 380,000 variants were tested genomewide and spine BMD was not analysed [15].
It would appear that the associations seen between rs2566752 and BMD are likely mediated through regulatory effects on the WLS gene. The bioinformatics analysis produced strong evidence for a regulatory role for the variants rs72670452, rs17130567 and rs1430738, which are in moderate LD with rs2566752. It should be noted, however, that there are limitations to using histone modification data when identifying regulatory elements and further study is required. eQTL data from the HaploReg v4.0 eQTL database were only available for variants in the HapMap2 variant set, however this data suggested that the variants rs12568456 and rs17130567 are significantly associated with expression of the WLS gene, with the less common allele at each variant associated with an increase in expression. Judging from the findings in mice described above, increased expression of WLS would likely lead to increased BMD, consistent with the observations for the less common alleles at these variants in this study. These eQTL findings should be interpreted with caution, however, as none of the WLS variants of interest were found to be associated with expression of the WLS gene in primary human osteoblasts derived from 95 donors [40].
Numerous genome-wide suggestive associations were seen for individual variants in this study, representing a mix of novel and known bone loci. Some of the more well-replicated BMD loci include 6q25.1 (CCDC170/ ESR1) [17,49,51,53,54] and 10q22.3 (KCNMA1) [17,51]. An interesting novel suggestive association was seen for spine BMD in the vicinity of the FAT atypical cadherin 4 (FAT4) gene. Mutations in this gene have been shown to cause Van Maldergem Syndrome 2 [55], which is characterised by intellectual disability, characteristic craniofacial features and skeletal and limb malformations. Additional evidence for a role for this gene in bone comes from Fat4deficient mice, which are born runted with a curved body axis, curved tail, abnormal vertebrae morphology and abnormal sternum ossification [56,57]. Another interesting suggestive association was seen between femoral neck BMD and a variant in the catenin (cadherin-associated protein), alpha 2 (CTNNA2) gene, which encodes an αcatenin that appears to have a role as a cell-cell adhesion protein by promoting the linking of β-catenin to the cytoskeleton and inhibiting its downstream signalling [58]. Genome-wide suggestive associations have been previously identified between variation in this gene and forearm BMD [59], while mice carrying a mutation affecting the Ctnna2 gene weigh 25-50 % less than littermate controls [60], which suggests a potential role in growth or body size regulation.
Gene-wide association testing demonstrated associations between the coiled-coil domain containing 170 (CCDC170) gene and BMD at the spine, femoral neck and total hip sites. The fact that this gene was identified as genome-wide significant by the VEGAS2 software, which corrects for LD within a gene, is likely indicative of the presence of multiple independent association signals for BMD originating from this locus. This phenomenon has previously been reported at this locus for bone phenotypes [17,61,62] and is supported by the fact that the maximally associated variant from the CCDC170 gene region for each phenotype in this study were not in strong LD with each other. Gene-wide associations with spine and femoral neck BMD have been previously reported for this locus using HapMap Phase II LD data [63]. The CCDC170 gene encodes a protein with unknown function and is located adjacent to the ESR1 gene. The ESR1 gene encodes the oestrogen receptor 1, a DNA-binding transcription factor that regulates the expression of many different genes. Oestrogen has a well-established protective effect on the skeleton by slowing the rate of bone remodelling and resorption while maintaining the rate of bone formation [64]. Esr1-knockout mice display decreased longitudinal bone growth, increased body weight [65] and obesity [66]. Loss of function mutations in the human ESR1 gene result in a variety of skeletal phenotypes including tall stature, reduced BMD and cortical thinning, as well as impaired glucose tolerance and hyperinsulinemia [67,68]. It is possible that the associations seen in this study between the CCDC170 gene and BMD reflect the presence of regulatory elements relevant to the ESR1 gene.

Conclusions
In conclusion, we performed a GWAS for BMD adjusted for only age and gender using two family-based cohorts. The size of each cohort was limited in comparison to current standards for well-powered GWAS, and we were not able to detect any genome-wide significant loci using the discovery GENOS cohort alone. By conducting a meta-analysis of the two family-based cohorts, we confirmed that genetic variation at the WLS locus is significantly associated with BMD at the genome-wide level. Bioinformatics and eQTL analyses suggest that the association seen is likely caused by regulatory effects on the WLS gene. Gene-wide association testing revealed associations between the CCDC170 gene and BMD at each site studied, although these associations may be due to the presence of regulatory elements relevant to the adjacent ESR1 gene.

Competing interests
The authors declare that they have no competing interests.