Genome-wide association studies for canine hip dysplasia in single and multiple populations – implications and potential novel risk loci

Background Association mapping studies of quantitative trait loci (QTL) for canine hip dysplasia (CHD) can contribute to the understanding of the genetic background of this common and debilitating disease and might contribute to its genetic improvement. The power of association studies for CHD is limited by relatively small sample numbers for CHD records within countries, suggesting potential benefits of joining data across countries. However, this is complicated due to the use of different scoring systems across countries. In this study, we incorporated routinely assessed CHD records and genotype data of German Shepherd dogs from two countries (UK and Sweden) to perform genome-wide association studies (GWAS) within populations using different variations of CHD phenotypes. As phenotypes, dogs were either classified into cases and controls based on the Fédération Cynologique Internationale (FCI) five-level grading of the worst hip or the FCI grade was treated as an ordinal trait. In a subsequent meta-analysis, we added publicly available data from a Finnish population and performed the GWAS across all populations. Genetic associations for the CHD phenotypes were evaluated in a linear mixed model using 62,089 SNPs. Results Multiple SNPs with genome-wide significant and suggestive associations were detected in single-population GWAS and the meta-analysis. Few of these SNPs overlapped between populations or between single-population GWAS and the meta-analysis, suggesting that many CHD-related QTL are population-specific. More significant or suggestive SNPs were identified when FCI grades were used as phenotypes in comparison to the case-control approach. MED13 (Chr 9) and PLEKHA7 (Chr 21) emerged as novel positional candidate genes associated with hip dysplasia. Conclusions Our findings confirm the complex genetic nature of hip dysplasia in dogs, with multiple loci associated with the trait, most of which are population-specific. Routinely assessed CHD information collected across countries provide an opportunity to increase sample sizes and statistical power for association studies. While the lack of standardisation of CHD assessment schemes across countries poses a challenge, we showed that conversion of traits can be utilised to overcome this obstacle. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07945-z.


Background
Canine hip dysplasia (CHD), a condition involving abnormal development of the coxofemoral (hip) joint, is one of the most common orthopaedic disorders in dogs and has been reported for more than 177 dog breeds, with a prevalence varying from 0.9 to 75.3%, according to the Orthopedic Foundation for Animals (OFA) [1]. CHD can lead to lameness, arthritis and hip pain and thus has a profound effect on animal welfare. CHD is recognised as a heritable disease, and several studies aimed to identify underlying genetic risk factors [2][3][4]. The inheritance pattern of CHD has been shown to be polygenic in GSDs [5] and regional and chromosomal heritability analyses of CHD in Labrador retrievers also indicated that the genetic architecture of CHD is based on many genes with small or moderate effect [4]. Identification of potential risk loci involved in CHD and a better understanding of the genetic architecture of this disease could help to address the severe welfare consequences, e.g. by promoting early genetic screening and improving breeding schemes [6,7].
In a previous genome-wide association study (GWAS) on CHD and hip osteoarthritis in 721 dogs from eight breeds, SNPs associated with CHD were detected, suggesting several positional candidate genes [3]. Other studies using GWAS for CHD have not detected consistent genomic positions of quantitative trait loci (QTL) [4,[8][9][10][11], which might be explained by differing features across studies, including sample sizes, the CHD-related trait used as phenotype and population structure, in addition to different breeds. Merging CHD records and genotype data across countries might improve the statistical power of GWAS by increasing the currently limited sample size of genotyped animals within countries. Furthermore, the use of multiple populations, and thus increased diversity, might also enable the detection of further QTL for this complex disease, as shown in studies of human complex traits [12]. However, CHD breeding and assessment schemes vary between countries; the OFA screening system is used in the United States and Canada, while most European countries record CHD using the five-grade approach proposed by the FCI (Fédération Cynologique Internationale) and others measure the severity of CHD using hip scores established by the BVA/KC (British Veterinary Association/Kennel Club) (e.g. UK, Australia) [13]. The challenges and caveats of analysing CHD data from different countries have been previously addressed in cross-country genetic evaluation studies of CHD [14] in which conversion of BVA/KC scores into FCI grades slightly improved prediction accuracy of estimated breeding values for CHD, thus indicating the potential benefits of such approaches.
In the current study, we combined CHD phenotype and genotype data from British and Swedish German Shepherd dogs (GSDs) with publicly available data from a Finnish GSD population [11] to study the genetic architecture of CHD. We performed both singlepopulation genome-wide association studies (GWAS) and a meta-analysis (GWAS across populations), with two aims: (I) to compare the consistency of results from different datasets and the performance of different phenotype classifications and (II) to identify potential novel loci influencing CHD by utilising multiple populations.

Population structure
The underlying genetic population structure of GSDs (from UK, Sweden and Finland) within and across populations was examined by PCA using a pruned genotype data set (comprising 5167 variants). In the multi-population PCA, including the UK, Swedish and Finnish dogs analysed for the metaanalysis, the first two PCs explained 2.89 and 0.94% of the variance, respectively. Figure 1 shows that overlaps exist between the populations, but a separation of clusters for the three countries is also visible, particularly associated with PC2.

Association mapping
The associations between 62,089 SNPs and CHD records within populations and in a subsequent meta-analysis across populations was analysed by genome-wide association studies (GWAS). We took two approaches, the first was based on hip scores and the second was a casecontrol analysis, where dogs with the lowest scores (FCI-A or BVA/KC score 0-10) were treated as controls and all other dogs were considered cases. The quantilequantile (QQ) plots (Fig. S1) and genomic inflation factor (λ) (ranging from 0.96 to 1.03), indicate that both within-population and meta-analysis GWAS were adequately controlled for population stratification.
FCI grades for the worse hip were used as phenotypes for the within-population GWAS of Swedish and Finnish dogs. For the UK dogs, the hip status was screened by the BVA/KC scheme, where aggregated scores for bilateral joints (total hip score; HS) are given. To enable a combined analysis with Swedish and Finnish FCI grades in a meta-analysis, we transformed UK HS to FCI grades. UK single-population GWAS were performed for both HS and FCI grades. The results were highly similar (e.g. more than half of the top 1% significant SNPs overlapped) and therefore the results for UK HS are presented in the supplement (Fig. S2, Table S1).
Five genome-wide significant associations were identified for the UK FCI approach on Chr 21 (39. 2-39.4 Mbp) ( Table 1, Fig. 2). In addition, across all withinpopulation and meta-analyses, 32 suggestive associations (allowing one false-positive per genome scan) were identified, with most of them exclusively associated with a single population and phenotype approach (Table 1). Direct overlaps (shared SNPs) were only found between the FCI and case-control approaches for the Finnish population (on Chr 1) and between the FCI approach in the Finnish and meta-analysis GWAS (on Chr 9). Repeating the GWAS for the Finnish population [11] with altered phenotypes, regions with suggestive associations to CHD were detected on Chr 1 (45.1-46.6 Mbp) and on Chr 9 (31.3-36.8 Mbp). For the UK population, in addition to the genome-wide significant region on Chr 21, a region with suggestive association to FCI was detected on Chr 7 (58. . For the Swedish population, only single SNPs on Chr 8 and 25 showed suggestive associations with CHD. In the meta-analysis GWAS, for the FCI grade approach, the region on Chr 9 was detected (in the same region as the Finnish FCI grade GWAS), while for the case-control approach one SNP showed suggestive association on Chr 7, 5 Mbp upstream from the region showing a suggestive association in the UK population.
The locations of SNPs with significant and suggestive association were mapped to the CanFam3.1 assembly and an area of 200 kb around these SNPs was scanned for genes, revealing 39 positional candidate genes (6 genes with an intragenic SNP; Table 1). Analysis of these 39 genes in Enrichr revealed that the trait "Body mass index" in the GWAS Catalog 2019 was the fifth most significantly enriched term (adjusted p-value = 0.009; genes: GGNBP2, DHRS11, ZNHIT3, LHX1, PIGW, AATF, MRM1, MYO19, PLEKHA7, Table S3).

Comparison of populations
To identify whether there was evidence of common genetic architecture of CHD in the individual GSD populations, we examined the overlap between SNPs identified in GWAS for individual populations (UK, SWE and FIN). There were no overlaps in the top 0.1% SNPs  Table 1 SNPs with suggestive association identified within all CHD phenotype approaches in single-population and meta-analysis GWAS. GWAS were performed for 180 (UK), 402 (Swedish), 775 dogs (Finnish) and 1357 (meta-analysis) dogs, respectively. The Finnish-population and meta-analysis GWAS are based on data provided by Mikkola et al. [11] and the Finnish-population GWAS is a repetition of the original GWAS with altered phenotypes (see Materials

Discussion
GSDs represent one of the largest purebred dog populations in Europe and as a large-sized breed prone to CHD, long-term routine CHD screens (by various organisations) provide a unique opportunity to analyse the genetic architecture of CHD in genetic association studies. However, the inconsistent CHD scoring systems across countries requires specific strategies to enable joint GWAS. In this study, we combined CHD phenotype and SNP genotype data from two countries (UK and Sweden) with publicly available data from a Finnish GSD population [11] to perform single-population and meta-analysis GWAS for CHD, employing various approaches to define the phenotype. We used these analyses to investigate the performance of combined analyses and to identify novel risk loci for CHD.

QTLs for CHD
We identified five genome-wide significant and several suggestive associations between different CHD phenotypes and SNPs in GWAS of UK, Swedish and Finnish populations, where the Finnish population GWAS is a repetition of the original GWAS in Mikkola et al. [11] using altered phenotypes. There were no SNPs with significant or suggestive associations that overlapped between the single-population analyses and the top regions were located on different chromosomes for each population (e.g. Chr 1 and 9 for FIN, Chr 7 and 21 for UK). Similarly, comparison with reported associations between SNPs and CHD in two previous studies of German GSD populations (FCI screening scheme using a case-control approach: FCI grades A vs. C-D) also indicated primarily population-specific associations; these were found on Chr 19, 24, 26 and 34 [9] and on Chr 1, 3,4,8,9,16,19,26, and 33 [15]. Most of these regions did not overlap with the regions identified in our study, but a QTL was found by Marschall and Distl [15] on Chr 9 at 37.4 Mb, which may be the same locus as the one identified at 36.6 Mb in our study for the repeated GWAS of the Finnish population (FIN-FCI). The limited overlap between genomic regions identified for populations examined in this study and in the literature suggests a primarily population-specific genetic architecture of CHD. The fact that allele frequencies at significant and suggestive SNPs were similar across populations suggests that the lack of consistency was not due to limited power of the genotype datasets (Table S2). However, the lack of consistency could be an artifact of small sample sizes, especially for the UK population, which decreased the power to detect QTL. The inconsistency of identified SNPs associated with CHD is also seen in comparisons with studies of other dog breeds. For example in Bernese mountain dogs, SNPs significantly associated with FCI grades were identified on Chr 14 and 37 [8], while in Labrador retriever dogs, Sánchez-Molano et al. [4] identified SNPs with chromosome-wide significance associated with HS on Chr 1, 2, 11, 15, 21 and 23 [4]; and a study on the Norberg angle phenotype in 69 breeds found only one significant SNP on Chr 28 [10]. While we found no signals on most of these chromosomes, we did identify genome-wide significant SNPs for UK FCI on Chr 21 in the same region of the QTL for HS and various other CHD-associated traits reported by Sánchez-Molano et al. [4]. There is however some evidence for a common genetic basis of CHD between dog breeds, as Mikkola et al. [16] demonstrated a significant across-breed association with CHD at four out of 46 risk loci identified in previous studies. As described above, CHD is a complex trait and the growing evidence points to a polygenic architecture. There were a few overlaps between highly significant SNPs in the different populations covered by this study but reviewing this and previously published studies, the identified genomic regions generally differ between populations and breeds and furthermore, a major risk loci for CHD has yet to be identified. If the genetic architecture of CHD is largely populationspecific, the benefit of multi-population panels is questionable. In our study, most of the suggestive SNPs identified in the meta-analysis overlapped with suggestive SNPs identified in the Finnish population (Chr 1 and 9). This is not surprising since the number of Finnish GSDs exceeded the sample sizes of the UK and Swedish populations and thus, findings might be biased towards the Finnish population. However, other suggestive associations on Chr 7 and 15 were identified in the metaanalysis case-control and FCI analyses, respectively, but were not picked up in any of the single-population analyses, suggesting that the multi-population approach can in some cases provide increased power for QTL detection.

Technical implications for single and multi-population approaches
The approach for routine CHD scoring (e.g. FCI grades or BVA/KC scores), and thus the assessed CHD phenotype, depends on the geographical location of the dog population. For multi-population approaches with differing scoring schemes, we suggest the conversion of BVA/ KC scores into FCI grades, as previously described in the literature [13]. We tested the conversion of both BVA/ KC HS and worst hip score (WHS) into FCI grades and found that the two phenotypes were highly correlated (r = 0.997; only one dog was grouped into a different FCI grade between the two approaches). Therefore, we conclude that both BVA/KC score phenotypes are equally suitable for multi-population GWAS. The GWAS results for UK FCI and UK HS were similar, indicating that the conversion from BVA/KC scores to FCI grades was successful. In addition to the choice of BVA/KC phenotype, the CHD phenotypes can be treated as ordinal or casecontrol response variables. In this study, we performed single and multi-population (meta-analysis) GWAS for both response variable types, but unlike some previous studies [8,15,17], we did not exclude intermediate phenotypes (FCI-B dogs) in order to maximize the sample size. Despite this difference, we were able to replicate the identified regions previously reported in the original GWAS for the Finnish population [11,17], which indicates that including this category is a valid option.
There was not strong consistency between the analyses of ordinal variables and the case-control approach for significant or suggestive SNPs; aside from suggestive SNPs for the Finnish population on Chr 1, no overlaps were found between the two CHD phenotype approaches. Furthermore, more significant or suggestive SNPs were found using HS or FCI grades than using the case-control classification. In the case of the UK population (the smallest sample), there is a peak on Chr 21 for both FCI grades and the case-control classification, although there was no suggestive association for the latter. The significant region on Chr 7 was only detected for FCI. The selection of extreme phenotypes of a quantitative trait to generate a case-control approach in GWAS has previously been shown to increase the power of association studies [18]. Because we included all individuals and did not focus on the extremes, the case-control approach may have been less informative in our study. We also explored the use of an alternative case-control approach in which dogs with FCI scores A and B were classified as controls (instead of only the 'A' dogs). No suggestive associations were found with the metaanalysis but the previously noted regions on Chr 1 and 7 were identified in the Finnish and UK populations, respectively (results not shown). We conclude that neither of the case-control approaches is as sensitive as the approach based directly on FCI scores.
Finally, we compared significant and suggestive SNPs found for single population GWAS with the results from the meta-analysis (multi-population) GWAS. The advantage of using multiple populations has been shown in humans for a number of phenotypes [12]. Those authors hypothesized that genetic loci underlying a specific phenotype were the same across populations but there was heterogeneity in effect size. Consequently, they concluded that incorporating multiple populations in the genetic discovery of complex traits would be beneficial in terms of identifying causal (rare) variants. As discussed above, most suggestive SNPs identified in our meta-analysis overlapped with suggestive SNPs identified in the Finnish population (Chr 1 and 9), presumably due to its larger sample size, but two additional regions were only detected in the meta-analysis analysis.

Potential candidate genes and mechanisms
The 200 kb regions surrounding suggestive SNPs were investigated for potential candidate genes. In the following discussion, we focus primarily on those in which suggestive or genome-wide significant SNPs are located. The regions identified on Chr 1 and Chr 9 include the NADPH Oxidase 3 (NOX3) and Ankyrin Repeat And Fibronectin Type III Domain Containing 1 (ANKFN1) genes previously reported in the original GWAS by Mikkola et al. [17].
In addition to these previously highlighted candidates, we also identified a nearby candidate gene on Chr 9 associated with CHD: the Mediator Complex Subunit 13 (MED13) gene, which encodes the mediator complex subunit 13 and plays a key role in transcription regulation, harboured six suggestive SNPs for the metaanalysis and repeated Finnish FCI GWAS. In mice, a transcriptional regulatory mechanism for the control of skeletal muscle glucose homeostasis controlled by MED13 was shown through skeletal muscle-specific deletion of the gene [19]. Furthermore, it was previously reported that hip dysplasia was one of the clinical features of a human patient with MED13 mutations [20]. Interestingly, targeted sequencing of a 7-Mb region on Chr 9 revealed a variant in MED13 that segregated between CHD cases and controls in the original analysis in the Finnish population [17]. The functional prediction of the variant showed no significant effect in the original work, however, in our study, using the FCI phenotype for both the Finnish population and in the meta-analysis resulted in the detection of multiple suggestive, intragenic SNPs. The meta-analysis case-control GWAS also revealed a candidate region on Chr 7, with a suggestive SNP within the Vasohibin 2 (VASH2) gene. Among other functions, VASH2 has been reported to be involved in mammary tumour development in dogs [21] and to enhance angiogenesis in mice [22]. In a study on soft tissue gene expression in CHD, the biological function "angiogenesis" was enriched by differentially expressed genes between CHD affected and unaffected dogs [23]. In humans, enhanced angiogenesis is a wellknown consequence of osteoarthritis [24]. It is possible that the association with the variant in the VASH2 gene reflects the osteoarthritis aspect of the FCI score (reflected in the higher grades).
A broader analysis of the genes located within 200 kb of suggestive SNPs revealed that several of them (GGNBP2, DHRS11, ZNHIT3, LHX1, PIGW, AATF, MRM1, MYO19, PLEKHA7) were previously associated with body-mass index (BMI) in GWAS studies. Pleckstrin Homology Domain Containing A7 (PLEKHA7) is particularly interesting because a genome-wide significant SNP (on Chr 21) for UK FCI was located within this gene. The prevalence of hip dysplasia was previously associated with BMI in dogs [25], suggesting obesity as a possible risk factor for CHD. Thus, it is possible that some of the heritable component of HD is related to BMI. Unfortunately, as we did not have information on body size for all of the dogs, we could not fit this as a fixed effect in our models.

Conclusions
Despite the efforts from various breeding programs, CHD remains a common disease in dogs with a large impact on animal welfare, but the biological basis is not well-understood. In this study, we confirmed the complex genetic nature of the trait, with multiple loci associated with CHD in German Shepherd dogs and also observed that most associated SNPs are populationspecific. However, some genomic regions were only identified in the meta-analysis of three populations, thus indicating that routinely assessed CHD information collected across countries provides an opportunity to increase sample sizes and statistical power for association studies. The lack of standardisation of CHD assessment schemes poses a challenge, but conversion of traits can be utilised to overcome this obstacle. Further investigation into the population-specific nature of CHD will help to uncover the biological basis of this disease and will inform selection schemes.

Genotypes
For the UK and Swedish populations, DNA was extracted from saliva samples collected with Performagene PG-100 swabs (UK dogs) and blood (Swedish dogs). The dogs were genotyped using the Illumina CanineHD Whole-Genome Genotyping BeadChip featuring 172,115 SNPs. Quality control procedures were carried out, as previously described in Friedrich et al. [26]. Filtering was imposed in GenomeStudio version 2.0 for sample call rate > 90%, SNP call rate > 98%, reproducibility (GTS) > 0.6 and low or confounded signal characterised by AB R mean (mean normalized intensity of the AB cluster) > 0.3. SNPs were also filtered using PLINK version 1.9 [27,28] to remove those with minor allele frequency (MAF) < 0.05 and significant deviations from Hardy-Weinberg equilibrium (HWE) (Bonferroni-corrected pvalue of 0.05 = 4.5 × 10 -7 ), resulting in 78,088 autosomal SNPs. For the meta-analysis, we used publicly available data for a Finnish GSDs population from a study by Mikkola et al. [11]. In their study, DNA was extracted from preserved blood samples and dogs were genotyped with the same genotype array as described above. For quality control, we filtered their data for a sample call rate > 90%, a SNP call rate > 98%, MAF > 0.05 and deviations from HWE in PLINK version 1.9 [27,28], which resulted in 75,271 autosomal SNPs. As final step, we extracted SNPs that overlapped between the UK, Swedish and Finnish populations (n = 62,089), which were used for all subsequent analyses. Most of the SNPs that did not overlap between the two datasets had been removed due to the very stringent filtering for SNP call rate.

Canine hip dysplasia (CHD) phenotypes
CHD records for genotyped UK dogs were provided by the British Kennel Club (KC) and for Swedish dogs by Svenska Kennelklubben (SKK). GSDs from the UK population were bred by multiple breeders and primarily were pet dogs. All GSDs from the Swedish population were bred within the breeding program of the Swedish Armed Forces (SAF), which was founded in 2004 with the purpose of breeding working dogs. The CHD records for the Finnish dogs we used for the meta-analysis are publicly available from the study of Mikkola et al. [11] and they were chosen based on Finnish Kennel Club data with balanced sampling for the dog's function (working dogs, show dogs, working and show dogs). For the UK dogs, the hip status was screened by the BVA/ KC scheme. In this scheme, determined by the severity of HD-related measurements from normal to severe, aggregated scores for bilateral joints (total hip score; HS) are given from 0 to 106 (0 to 53 for each joint). The Swedish and Finnish dogs were scored according to the five-grade FCI scheme, where the CHD severity of the hip joint is classified into A, B, C, D or E grades. The FCI score was only available for the worst hip joint in the Swedish population, thus for data compatibility, we also used the FCI score for the worst hip joint in the Finnish population in this analysis.
To analyse the performance of different CHD phenotypes for the single and multi-populations GWAS, various approaches were used ( Table 2). As a first approach for the single population analysis, HS and FCI-transformed HS as described below (UK population) and FCI grades (Swedish and Finnish populations) were used as the response variables. We repeated the single-population GWAS for the Finnish population because we used a different CHD phenotype (FCI of the worst hip joint only, inclusion of FCI B dogs) in contrast to the original GWAS performed by Mikkola et al. [11]. In addition to HS, dogs in the UK also had scores for their worst hip (WHS). In an initial analysis, we tested both HS and WHS and found a high correlation between the GWAS results (r for p-values = 0.93 and r for effect sizes = 0.98). Therefore, in the following, we focus on HS. In a second approach, dogs were classified into cases and controls. Based on a previous comparison of different CHD scoring systems [29], GSDs with a total hip score ≤ 10 (UK population) or grade A (Swedish and Finnish population) were considered as controls. The remaining dogs were treated as cases. In contrast to the methodology described in Mikkola et al. [11], we did not exclude intermediate dogs (FCI-B), as this would have decreased the sample size substantially for the UK population. The number of dogs in the different categories can be found in Table 3.
The meta-analyses required alignment of the FCI and BVA/KC scoring schemes. BVA/KC total hip scores for UK dogs were converted into FCI five-level grades following a recommended conversion (0-10 = A, 11-25 = B, 26-35 = C, 36-50 = D, 51-106 = E) [13]. As CHD phenotypes for the meta-analysis GWAS, we again used two approaches: (I) the FCI grade (using the above conversion for UK GSDs) and (II) a grouping into cases and controls as described above (A = controls, B-E = cases). CHD records and genotypes were available for 180 UK, 402 Swedish and 775 Finnish GSDs (1357 in total) ( Table 3).

Analysis of population structure
To identify underlying population structure, principle component analysis (PCA) was performed within and across all three populations using PLINK version 1.9 [27,28]. A pruned SNP dataset was used to account for linkage disequilibrium as recommended in the PLINK documentation. Therefore, the combined SNP dataset was pruned based on the variance inflation factor with default parameters set by PLINK (windows size = 50, shift steps of SNP numbers = 5, the variance inflation factor threshold = 2), resulting in 5167 SNPs. Then, the PCA was performed using this pruned dataset separately within and across populations.

Building the genetic model
To build the genetic model, the genetic structure and non-genetic factors were tested for their effect on CHD phenotypes. In addition to the first two principal components (PCs) obtained from the PCA, the following non-  genetic factors were tested for each CHD phenotype within populations and for the meta-analysis (across populations): sex, birth year, birth month and age at radiographing. Additionally, for the meta-analyses, 'population' was also fitted in the model. All factors were fitted as fixed effects in a linear model in R [30] and backward elimination was implemented using the 'ste-pAIC' function of the R package 'MASS' to remove one factor at a time and select the model with the lowest Akaike information criterion (AIC). The final genetic models for the CHD phenotypes are shown in Table 4.

Genome-wide association study (GWAS)
The GWAS to identify associations between markers and HD was performed using GEMMA [31] on the 62,089 common SNPs for the single and meta-analysis approaches. The univariate linear mixed model was fitted as below: with the following terms: y is a vector of CHD phenotypes (depending on the approach), μ is the overall mean, b is a vector of fixed effects (as described in Table  4) with X as the corresponding incidence matrix, c is a vector of filtered SNPs (alleles coded as 0/1) with β as the corresponding regression coefficients, Z is the incidence matrix for the vector of random polygenic effects, α, and e is a vector of residuals. The vectors of polygenic effects and residuals follow multivariate normal (MVN) distributions given by MVNð0; σ 2 α GÞ and MVNð0; σ 2 e IÞ; respectively, where G is the genomic relationship matrix composed of the filtered, common SNPs, I is the identity matrix and σ 2 α and σ 2 e are the genetic variance associated with G and the environmental variance, respectively.
To correct for multiple testing, the Bonferroni correction was applied to account for the number of tests carried out per marker. Accordingly, genome-wide significant markers had P-values < 8.1E-07 (adjusted Pvalue = 0.05/62,089). Since the Bonferroni correction is very stringent, especially for genetic studies in pedigree dogs, which have high levels of linkage disequilibrium across the genome, we also identified markers with suggestive associations. Markers with suggestive associations were determined by allowing one false-positive per genome scan (adjusted P-value < 1.6E-05 = 1/62,089; suggestive association). To compare the results across datasets, we further examined the overlaps between SNPs identified across analyses.

Availability of data and materials
Genotype and phenotype data for the UK dogs is available from the Dryad Digital Repository: doi:https://doi.org/10.5061/dryad.h44j0zpkf. The data for the Swedish dogs are restricted by the SAF for reasons of national security.

Declarations
Ethics approval and consent to participate We received approval for the study from the Veterinary Ethical Review Committee (VERC) and Human Ethical Review Committee (HERC) at the Royal (Dick) School of Veterinary Studies, The University of Edinburgh. Dog owners gave informed consent for inclusion in this study. All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication
Not applicable.