  • Research article
  • Open access

A genome-wide association study demonstrates significant genetic variation for fracture risk in Thoroughbred racehorses



Thoroughbred racehorses are subject to non-traumatic distal limb bone fractures that occur during racing and exercise. Susceptibility to fracture may be due to underlying disturbances in bone metabolism which have a genetic cause. Fracture risk has been shown to be heritable in several species but this study is the first genetic analysis of fracture risk in the horse.


Fracture cases (n = 269) were horses that sustained catastrophic distal limb fractures while racing on UK racecourses, necessitating euthanasia. Control horses (n = 253) were over 4 years of age, were racing during the same time period as the cases, and had no history of fracture at the time the study was carried out. The horses sampled were bred for both flat and National Hunt (NH) jump racing. A total of 43,417 SNPs were used to perform a genome-wide association analysis and to estimate the proportion of genetic variance attributable to the SNPs on each chromosome using restricted maximum likelihood (REML). Significant genetic variation associated with fracture risk was found on chromosomes 9, 18, 22 and 31. Three SNPs on chromosome 18 (62.05 Mb – 62.15 Mb) and one SNP on chromosome 1 (14.17 Mb) reached genome-wide significance (p < 0.05) in a genome-wide association study (GWAS). Two of the SNPs on ECA 18 were located in a haplotype block containing the gene zinc finger protein 804A (ZNF804A). One haplotype within this block has a protective effect (controls at 1.95 times lower risk of fracture than cases, p = 1 × 10⁻⁴), while a second haplotype increases fracture risk (cases at 3.39 times higher risk of fracture than controls, p = 0.042).


Fracture risk in the Thoroughbred horse is a complex condition with an underlying genetic basis. Multiple genomic regions contribute to susceptibility to fracture risk. This suggests there is the potential to develop SNP-based estimators for genetic risk of fracture in the Thoroughbred racehorse, using methods pioneered in livestock genetics such as genomic selection. This information would be useful to racehorse breeders and owners, enabling them to reduce the risk of injury in their horses.


Metabolic bone disorders are often a cause of bone fragility and increased risk of fracture. A common example of a bone metabolic disorder in humans is osteoporosis, a late-onset disease characterized by low bone mineral density, structural deterioration of bone tissue and an elevated risk of fracture in affected individuals. Bone fragility has an estimated heritability of 16–54% [1–3] in humans, depending on fracture site and type, and several genes associated with bone mineral density and fracture risk have been identified in both humans and other species [4–7], although the genes underlying each of these traits appear to be different [8, 9].

Bone fractures with non-traumatic origin occur in Thoroughbred racehorses, with the majority of fractures occurring in the distal limbs; bones subject to high impact and load during exercise and racing. Fracture is the main reason for euthanasia of horses on the racecourse [10], with an average of 60 horses per year suffering a fatal distal limb fracture during racing in the UK (both flat and National Hunt jump racing) [11]. The prevalence of all fatal and non-fatal fractures occurring during training is between 10-20% [12, 13]. Studies of the pathology of equine fracture indicate evidence of stress-related damage to the bone prior to fracture, which may be related to metabolic disturbances in bone re-modelling [14, 15].

Fracture risk has been demonstrated to be heritable in several species but its heritability in the horse has not been previously investigated. In this study, we have identified candidate genome regions associated with fracture risk in the Thoroughbred horse by carrying out a genome-wide association study (GWAS) with 43,417 SNPs genotyped on 269 fracture cases and 253 controls. We have also demonstrated that there is significant genetic variation for fracture risk in the Thoroughbred horse, distributed among several chromosomes.



Fracture cases were horses that sustained catastrophic distal limb fractures while racing on UK racecourses, which necessitated euthanasia. A total of 276 fracture case samples were obtained from an archive of bone and tissue collected during a previous study between February 1999 and May 2005 [11]. The exact fracture site and type were identified by post-mortem examination. The frequency of fracture locations is shown in Additional file 1: Table S1. All fracture sites were in bones below and including the knee and hock. No cases with fractures in other bones, for example, the pelvis, neck or skull, were included in the study. The phenotype can, therefore, be regarded as a sub-set of fracture types involving only the distal limbs. Control samples (n = 269) came from a mixture of uninjured horses originally selected from the same race as the case (n = 66) and uninjured horses sampled as part of a previous study (n = 203). Control horses were over 4 years of age, were racing during the same time period as the cases, and had no history of fracture at the time the study was carried out. After genotyping quality control 522 horses remained in the analysis. Horses were bred for both flat and National Hunt (NH) jump racing: of the cases 135 were flat-bred, 110 NH-bred and 24 of unknown status, of the controls 117 were flat-bred, 135 NH-bred and 1 of unknown status (Table 1). Horse identities were anonymised, and no pedigree information was available.

Table 1 Distribution of fracture cases and controls by background and sex

DNA extraction and quantification

Samples consisted of either tissue or bone marrow biopsies (cases) or blood samples collected in EDTA (controls). DNA was extracted using Nucleon BACC DNA extraction kits by Gen-Probe Life Sciences Ltd. DNA samples were quantified in duplicate using Quant-iT PicoGreen dsDNA kits (Invitrogen, Carlsbad, CA) and 10% of the samples were run on a 1% agarose gel to check for the presence of high molecular weight DNA. DNA aliquots were adjusted to a concentration of 70 ng/µl for genotyping.

SNP genotyping and quality control

Samples were genotyped with the Equine SNP50 BeadChip (Illumina, San Diego, CA) by Cambridge Genomic Services (University of Cambridge, UK). The Equine SNP50 BeadChip contains 54,602 SNP assays (average density one SNP per 43.2 kb) selected from the database of over one million SNPs generated during the sequencing of the horse genome. The initial association results indicated a genome-wide significant region on ECA 18. To refine (fine-map) the location of the associated gene, an additional 78 SNPs located on ECA 18 between 61.89 Mb and 71.17 Mb (listed in Additional file 2: Table S2), and not present on the BeadChip, were genotyped using a GoldenGate assay (Illumina, San Diego, CA) by Gen-Probe Life Sciences Ltd.

The genotyping data were analysed with GenomeStudio software (Illumina, San Diego, CA). A cluster file was generated directly from the fracture dataset (n = 545) together with an additional 797 Thoroughbred samples genotyped at the same time. All genotyping data were clustered de novo for the 1,342 samples. The average SNP call frequency was 98.82%, with 150 SNPs not called. Nineteen samples (1.4%) had a call rate less than 95% and these were discarded. The remaining samples were then re-clustered. The average SNP call frequency had increased to 99.17%, with only 143 SNPs (0.26%) not called from the 54,602 on the EquineSNP50 BeadChip.

All SNPs were then subjected to a number of editing steps with GenomeStudio software, during which thresholds were applied for a number of metrics following the chip manufacturer’s guidelines. This resulted in the removal of 190 SNPs with low intensity data (AB R Mean), 1,265 SNPs with inadequately defined clusters (cluster separation), 2,279 SNPs with call rates less than 98%, 297 SNPs where the heterozygote cluster was not well separated from the homozygote clusters (AB T Mean), 119 SNPs where genotypes differ significantly from Hardy-Weinberg equilibrium and 51 SNPs where X chromosome SNPs were heterozygous in males. A total of 4,201 SNPs were removed. The mean call frequency in the remaining SNPs was 99.83%. Markers with a minor allele frequency (MAF) less than 2% were excluded from the analysis (n = 11,124), as were markers that failed the Hardy-Weinberg equilibrium test (p < 0.001) (n = 96), and markers with more than 10% of genotypes missing (n = 4,223): 43,417 SNPs remained in the analysis.
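The per-SNP thresholds described above (MAF below 2%, more than 10% missing genotypes, Hardy-Weinberg p < 0.001) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `snp_qc`, the 0/1/2 genotype coding with `None` for missing calls, and the use of a 1-df chi-square Hardy-Weinberg test (rather than an exact test) are all assumptions made for illustration.

```python
from math import erfc, sqrt

def snp_qc(genotypes, maf_min=0.02, max_missing=0.10, hwe_p_min=0.001):
    """Apply missingness, MAF and HWE filters to one SNP (hypothetical helper).

    genotypes: list of allele counts (0, 1 or 2), with None for a missing call.
    Returns (passes, reason).
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > max_missing:
        return False, "missingness"

    n = len(called)
    p = sum(called) / (2 * n)          # frequency of the counted allele
    if min(p, 1 - p) < maf_min:
        return False, "maf"

    # Chi-square goodness-of-fit to Hardy-Weinberg proportions (1 df).
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_hwe = erfc(sqrt(chi2 / 2))       # upper tail of chi-square with 1 df
    if p_hwe < hwe_p_min:
        return False, "hwe"
    return True, "pass"

# The six GenomeStudio exclusion counts quoted above sum to the stated total.
genomestudio_removed = sum([190, 1265, 2279, 297, 119, 51])
```

The filters are applied in a fixed order here, so a SNP failing several criteria is counted only under the first; overlapping failures are one reason why per-filter counts need not add up to the overall number of SNPs removed.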

Further quality control procedures on the samples, such as estimation of sample gender based on X chromosome genotypes and identification of duplicated samples based on genotype identity, resulted in 39 samples out of 1,342 being discarded. From the fracture study samples 269 cases and 253 controls passed the quality control procedures and 7 cases and 16 control samples failed.

Statistical analyses

Population stratification

Possible population stratification was assessed by calculating identity-by-state (IBS) sharing among all pairs of individuals. A permutation test for between-group IBS differences (where the null hypothesis is no similarity between groups) showed a significant (p < 2 × 10⁻⁵) degree of similarity (IBS) between cases and controls. There was also significant IBS between flat-bred and National Hunt-bred horses (p < 1 × 10⁻⁵), with evidence for more similarity within the flat-bred and National Hunt-bred groups than between the groups (within-group p < 1 × 10⁻⁵, between-group p < 7 × 10⁻⁵). These results suggest that both the cases and controls are drawn from genetically related populations. Similarly, flat-bred and National Hunt-bred horses are from related populations, although there is some clustering (probably of family groups) within the flat and National Hunt populations. Additional file 3: Figure S1 shows multi-dimensional scaling plots based on IBS sharing between cases and controls, and flat-bred and National Hunt-bred horses. Cases and controls are evenly distributed through both the flat and National Hunt-bred populations.
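As an illustration of the IBS measure, the following sketch computes the proportion of alleles shared identical-by-state between two individuals. The function name `ibs_similarity` and the 0/1/2 genotype coding are assumptions for this example; the permutation test on between-group differences is not reproduced here.

```python
def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical-by-state between two individuals.

    g1, g2: equal-length lists of allele counts (0, 1 or 2); None marks a
    missing genotype, and SNPs missing in either individual are skipped.
    """
    shared = 0
    total = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)   # 2, 1 or 0 alleles shared at this SNP
        total += 2
    return shared / total
```

Averaging this quantity over all within-group pairs and over all between-group pairs gives the two summaries that a permutation of group labels would then compare.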

Whole genome Cochran-Mantel-Haenszel (CMH) association

The Cochran-Mantel-Haenszel (CMH) association test (2 × 2 × K, where K = 52 clusters) in PLINK v1.07 [16] was used in order to correct for the potential confounding of population stratification (ppc 0.001). The CMH association test allows for comparison of cases and controls while controlling for clusters within the data, where the clusters are defined by IBS sharing among individuals (52 clusters were identified in the fracture data set). The CMH analysis tests each single SNP independently. Empirical p-values were calculated using 1,000 permutations with the adaptive permutation option in PLINK v1.07 [16].
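The 2 × 2 × K structure of the CMH test can be sketched as follows, assuming one 2 × 2 table of allele counts (case/control by allele) per cluster. This is a plain-Python illustration without continuity correction, not the PLINK implementation; the function name `cmh_test` and the table layout are assumptions.

```python
from math import erfc, sqrt

def cmh_test(tables):
    """Cochran-Mantel-Haenszel chi-square (1 df) over K strata.

    tables: list of (a, b, c, d) tuples, one per cluster, where
      a = allele 1 count in cases,    b = allele 2 count in cases,
      c = allele 1 count in controls, d = allele 2 count in controls.
    Returns (chi2, p_value). No continuity correction is applied.
    """
    deviation = 0.0   # sum over strata of observed minus expected 'a'
    variance = 0.0    # sum over strata of the hypergeometric variance of 'a'
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # stratum carries no information
        deviation += a - (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = deviation ** 2 / variance
    return chi2, erfc(sqrt(chi2 / 2))
```

Because observed and expected counts are compared within each cluster before being summed, an allele frequency difference that exists only between clusters does not contribute to the statistic.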

Haplotype logistic regression analysis of ECA 18

Haplotype blocks in the region of interest on ECA 18 were identified based on the value of r² and visualized using HAPLOVIEW [17]. Blocks containing significant SNPs were further analysed using haplotype logistic regression with PLINK v1.07, with sex and background (flat/National Hunt-bred) fitted as co-variates. Corrected p-values for the logistic regressions were obtained with 10,000 permutations. Haplotype frequencies within case and control groups were also determined.
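For reference, the pairwise LD statistic used to define blocks can be computed from two-locus haplotype counts as sketched below. The sketch assumes phased haplotype counts are already available; software working from unphased genotypes has to estimate these frequencies first. The function name `ld_r2` and its arguments are illustrative.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """r-squared between two biallelic SNPs from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab: counts of the four two-locus haplotypes.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n            # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / n            # frequency of allele B at SNP 2
    d = n_AB / n - p_A * p_B           # disequilibrium coefficient D
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
```

Both SNPs must be polymorphic in the sample, otherwise the denominator is zero.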

Estimation of the genetic variance due to SNPs

Estimates of the genetic variance explained by all SNPs and by SNPs on individual chromosomes were obtained with Restricted Maximum Likelihood (REML) analysis using the GCTA program [18]. GCTA allows the proportion of variance explained by SNPs to be estimated for a complex disease based on case-control genome-wide association study (GWAS) data [19]. The method takes account of the binary (0-1) nature of case-control data and estimates the genetic variance on the, more interpretable, underlying liability scale. It also takes account of the bias in ascertainment, due to the proportion of cases being larger than the disease prevalence in the population.
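The conversion from the observed (0-1) scale to the liability scale, including the ascertainment correction, follows the transformation given by Lee et al. [19]. A minimal sketch is shown below; the function name is illustrative, and the population prevalence is an input that must be supplied by the analyst (no value is assumed here for this study).

```python
from statistics import NormalDist

def liability_h2(h2_observed, prevalence, case_fraction):
    """Transform observed-scale SNP heritability to the liability scale.

    h2_observed:   estimate on the 0-1 scale in the ascertained sample
    prevalence:    K, disease prevalence in the population
    case_fraction: P, proportion of cases in the sample
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - prevalence)   # liability threshold t
    z = std_normal.pdf(threshold)                    # normal density at t
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))

# Case fraction in this data set; the prevalence and observed-scale
# estimate passed below are hypothetical inputs.
case_fraction = 269 / (269 + 253)
example = liability_h2(0.3, prevalence=0.01, case_fraction=case_fraction)
```

The result is sensitive to the prevalence assumed, so liability-scale estimates are best reported together with the value of K used.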

A genetic relationship matrix was derived with GCTA [18, 19] from the 43,417 genotyped SNPs. Sex was fitted as a fixed effect, as it had previously been determined to be a significant effect with an ANOVA analysis. GCTA accounts for relationships among individuals through the genetic relationship matrix, but also permits principal component analysis (PCA) eigenvectors to be included as covariates to capture variance due to population structure. In this analysis the first 20 eigenvectors were included as co-variates, in order to account for the structure of the flat and National Hunt-bred populations.
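To make the genetic relationship matrix concrete, the sketch below computes relationships as the average over SNPs of standardized genotype cross-products, following the off-diagonal definition of Yang et al. [18]. It is a simplified illustration: allele frequencies are estimated from the same sample, missing genotypes are not handled, and the same cross-product formula is used for the diagonal, which differs from the diagonal estimator described for GCTA. The function name `grm` is an assumption.

```python
def grm(genotypes):
    """SNP-based genetic relationship matrix (simplified sketch).

    genotypes[j][i] is the allele count (0, 1 or 2) of individual j at SNP i.
    Monomorphic SNPs are skipped because they cannot be standardized.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    freqs = [sum(row[i] for row in genotypes) / (2 * n_ind) for i in range(n_snp)]
    keep = [i for i, p in enumerate(freqs) if 0 < p < 1]

    matrix = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j, n_ind):
            total = 0.0
            for i in keep:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            matrix[j][k] = matrix[k][j] = total / len(keep)
    return matrix
```

With tens of thousands of SNPs and several hundred horses, an array library would be the practical choice; the nested loops are kept only to show the definition.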

Results and discussion

Population stratification

IBS sharing among individuals indicated there was some genetic differentiation between the flat and National Hunt-bred horses, with clustering of families within the groups (Additional file 3: Figure S1 a and b). Pedigree analysis of 120 leading (ranked by offspring earnings) flat and NH sires in the UK in 2012 showed the mean coancestry among the sires was 0.0295, minimum 0.0024 and maximum 0.276. This confirms the existence of gene flow or shared ancestry between the groups. A multidimensional scaling plot derived from the pedigree-based coancestry (Additional file 3: Figure S1 c) shows a similar pattern to the DNA-based IBS sharing; there is family clustering within NH or flat-bred groups, but also evidence for some differentiation between the family lines used to produce flat or NH horses. In addition to genetic differences between the populations there are also differences in the environmental risks experienced. The risk of fracture depends on the type of racing: flat turf racing is the safest (0.4 fatal fractures/1000 starts) whilst National Hunt racing is associated with the highest risk (2.2 fatal fractures/1000 starts) [11]. The increase in environmental risk for National Hunt racehorses could make the ascertainment of genetically susceptible horses from this population more difficult, potentially decreasing the power of a genetic study.

Prior to correction for population stratification the genomic inflation factor (λ) for the genome-wide association study was 1.15, and after correction with the CMH test λ was 1.05. Additional file 4: Figure S2 shows the quantile-quantile (Q-Q) plot obtained after the CMH analysis. The reduction in λ suggests the CMH analysis is effectively accounting for population structure within the data. Inclusion of background (National Hunt or flat-bred) into a genome-wide logistic regression model gave a genomic inflation factor (λ) of 1.04, also suggesting this classification of horses and its inclusion as a co-variate in the analysis corrects sufficiently for population stratification. Classification of background (with two levels) as a fixed effect in the logistic model is simpler than including PCA eigenvectors or CMH clusters where there are many levels (in this case, either 20 or 52), and thus permits large numbers of permutations to be computed in a reasonable time-frame.
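The genomic inflation factor is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median of the chi-square distribution with 1 df (about 0.455). A minimal sketch starting from p-values is given below; the function name is illustrative and p-values are assumed to lie in (0, 1].

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364   # median of the chi-square distribution, 1 df

def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from 1-df association p-values."""
    std_normal = NormalDist()
    # A two-sided p-value maps back to a 1-df chi-square via the squared
    # normal quantile: chi2 = (Phi^-1(1 - p / 2))^2.
    chi2 = [std_normal.inv_cdf(1 - p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Values close to 1 indicate that the bulk of the test statistics follows the null distribution, which is the basis for reading the drop from 1.15 to 1.05 as adequate control of stratification.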

Genome-wide association study

Three SNPs on ECA 18 and one on ECA 1 reached genome-wide significance after correction for multiple testing (p_genome < 0.05; Table 2). ECA 18 showed evidence for more than one SNP associated with distal limb fracture. A number of supporting SNPs are seen, with the peak localizing to around 62 Mb. There is also evidence of suggestive signals on ECA 3, 8, 9, 15, 21 and 22, although they do not reach the genome-wide significance level. Figure 1 shows Manhattan plots of (a) the raw p-values from the genome-wide association (Cochran-Mantel-Haenszel) scan for distal limb fracture, (b) empirical p-values, calculated after 1000 permutations, and (c) empirical p-values for ECA 18. The additional 78 SNPs on ECA 18, genotyped for fine mapping purposes, showed no significant associations with fracture risk and did not explain any more of the genetic variation in the heritability analysis than the SNPs included on the Equine SNP50 BeadChip.

Table 2 Unadjusted (raw) and corrected p-values for the 25 top-ranking SNPs from the genome-wide Cochran-Mantel-Haenszel (CMH) association analysis
Figure 1

Manhattan plots of raw and corrected p-values from the genome-wide association (CMH test) for whole genome and ECA 18. (a) Manhattan plot of raw p-values from the genome-wide association analysis (CMH test) with flat and National Hunt-bred horses combined. (b) Manhattan plot of empirical p-values, calculated after 1000 permutations. (c) Empirical p-values for ECA 18 plotted against SNP position on the chromosome (Mb). The 5% genome-wide significance threshold is shown as a red line.

Linkage disequilibrium and haplotype-block analysis

Examination of the linkage disequilibrium (LD) among SNPs showed that the most significant SNPs on ECA 18 fall into an LD block containing 10 SNPs in total and spanning 140 kb (haplotype block 1 in Figure 2). All SNPs within the block are in high LD with each other, with pair-wise r² of at least 0.76. The haplotype GGAGGCTAAA is at higher frequency in the controls and has a protective effect, with logistic regression (Table 3) showing that controls are at 1.95 times lower risk of fracture than cases (p = 1 × 10⁻⁴). TGGAATTAAG, a risk haplotype, is at low frequency in the cases and, at least in this data set, absent from the controls. Cases with this haplotype are at 3.39 times higher risk of fracture than controls (p = 0.042).

Figure 2

Linkage disequilibrium (LD) and haplotype frequencies for LD blocks around significant SNPs on ECA 18. (a) Linkage disequilibrium (LD) around significant SNPs on ECA 18. Haplotype block 1 (62.01 – 62.15 Mb) contains the two most significant SNPs from the genome-wide association study (BIEC2-416680 and BIEC2-416681). There is only one known gene within this haplotype block, ZNF804A. Haplotype block 2 (62.15 – 62.76 Mb) contains the candidate gene FSIP2, while haplotype block 3 (62.76 – 65.87 Mb) contains the candidate genes ITGAV, CALCRL, COL3A1 and COL5A2. SNP BIEC2-417495 in haplotype block 4 (67.18 – 67.20 Mb) is in linkage disequilibrium (r² = 0.8) with the myostatin (MSTN) gene, believed to be associated with racing performance [25–27], but there is only moderate LD (r² < 0.3) between this SNP and the SNPs in haplotype block 1 which are significantly associated with catastrophic fracture risk. (b) Observed haplotypes and their frequencies for the four haplotype blocks observed in the ECA 18 fracture associated region.

Table 3 Logistic regression results for ECA 18 haplotype block 1

There is low LD (r² < 0.1) between adjacent haplotype blocks in the ECA 18 region. Haplotype block 1 contains 39.5 kb of the zinc finger protein 804A gene (ZNF804A) and the two most significant SNPs in the GWAS are located in this haplotype block, 2.2 kb from the end of the gene. ZNF804A has been reported to have a variant associated with schizophrenia in humans [20, 21], and regulates expression of genes such as the catechol O-methyl transferase gene (COMT) [22] which has been associated with increased fracture risk in males [23]. An elevated risk of fracture has been noted in schizophrenics [24], but no genes directly associated with fracture risk in schizophrenics have previously been reported.

Other candidate genes in ECA 18 haplotype block 3 are integrin alpha-V (ITGAV, ECA 18: 63,417,718 – 63,498,794), a receptor binding to a variety of extracellular matrix proteins including osteopontin and bone sialoprotein, the calcitonin receptor-like gene (CALCRL, ECA 18: 64,065,062 – 64,102,603), collagen type III alpha 1 (COL3A1, ECA 18: 65,487,247 – 65,526,274), and collagen type V alpha 2 (COL5A2, ECA 18: 65,549,357 – 65,689,370). LD between ECA 18 haplotype blocks 1 and 3 is generally low (r² < 0.3), apart from SNPs BIEC2-417210 and BIEC2-417274 which are in moderate LD (r² = 0.38 – 0.46) with SNPs in block 1. The LD observed may have arisen due to a combination of selected alleles at different genes in this region. For example, there is evidence that racing performance and optimal racing distance in the Thoroughbred horse are influenced by the nearby myostatin (MSTN) locus [25–27] and the extent of LD observed in this region may be the result of a selective sweep [28].

Genetic variance explained by SNPs

Genetic variance explained by SNPs for fracture risk was estimated to be 0.479 (s.e. 0.124). A log-likelihood of 110.6 for the full model compared with a log-likelihood of 103.4 for the null model (genetic variance σ_g² = 0) and likelihood ratio test (LRT) of 14.32 (p = 0.00015) confirms the variance is significantly different from zero. Genetic variance estimates for each individual chromosome showed significant variance on chromosomes 9, 18, 22 and 31 (Figure 3 and Additional file 5: Table S3). Chromosomes 9 and 18 accounted for the largest genetic variance, around 0.19, followed by chromosomes 22 and 31. Together these chromosomes account for 61.8% of the total estimated genetic variance.
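The likelihood ratio test above can be checked with a few lines of arithmetic. The sketch assumes a chi-square reference distribution with 1 df, which reproduces the reported p-value; the rounded log-likelihoods give a statistic of 14.4, and the small difference from the reported 14.32 is presumably due to rounding of the log-likelihoods. Variable and function names are illustrative.

```python
from math import erfc, sqrt

def chi2_1df_p_value(statistic):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    return erfc(sqrt(statistic / 2))

# LRT = 2 * (logL_full - logL_null), from the rounded values quoted above.
lrt_from_rounded_loglik = 2 * (110.6 - 103.4)

# p-value for the LRT statistic as reported in the text.
p_reported_lrt = chi2_1df_p_value(14.32)

# Genetic variance attributed to chromosomes 9, 18, 22 and 31 together:
# 61.8% of the total estimate of 0.479.
variance_four_chromosomes = 0.618 * 0.479
```

On these figures the four chromosomes jointly account for a genetic variance of roughly 0.30 out of the total of 0.479.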

Figure 3

Heritability of fracture risk by chromosome. Estimates of the genetic variance explained by SNPs on individual chromosomes were obtained with Restricted Maximum Likelihood (REML) analysis using the GCTA program.

The highest individual chromosome genetic variance estimates correspond with some, but not all, of the chromosomes identified as showing significant association with fracture risk in the genome-wide association study (GWAS). REML analysis to estimate the genetic variances accounts for both genetic relatedness among individuals, through the SNP-based genetic relationship matrix, and population structure [18, 19]. In contrast, association methods rely on individuals being unrelated and where there is cryptic relatedness among individuals this can result in an inflation of type I errors or false positives. We have used both approaches in this study, giving increased confidence in chromosomes where the results are concordant. GWAS methods are conservative in their approach, as stringent significance thresholds must be reached before a result is declared significant. Large sample sizes are generally required (> 1000 cases and controls) before these significance thresholds are reached. SNPs that do not reach the significance threshold for a GWAS may, nevertheless, still have genetic effects on the disease. Whether or not they reach the significance threshold depends on the size of the allele effect, the allele frequencies and the degree of linkage disequilibrium between the SNPs and the causal mutations.


Significant genetic variation for fracture risk in the Thoroughbred horse was detected on chromosomes 9, 18, 22 and 31 using REML analysis. In a related genome-wide association study SNPs on chromosomes 1 and 18 reached genome-wide significance. Several plausible candidate genes involved in bone development are located in these regions. However, the identification of further candidate regions for fracture risk is likely to require larger sample sizes. This study has demonstrated that fracture risk in the horse is heritable and that there is the potential to develop SNP-based estimators for genetic risk of fracture in the Thoroughbred racehorse.


  1. MacGregor A, Snieder H, Spector TD: Genetic factors and osteoporotic fractures in elderly people. Twin data support genetic contribution to risk of fracture. BMJ. 2000, 320: 1669-1670. 10.1136/bmj.320.7250.1669.

  2. Andrew T, Antoniades L, Scurrah KJ, Macgregor AJ, Spector TD: Risk of wrist fracture in women is heritable and is influenced by genes that are largely independent of those influencing BMD. J Bone Miner Res. 2005, 20: 67-74. 10.1359/JBMR.041015.

  3. Deng HW, Chen WM, Recker S, Stegman MR, Li JL, Davies KM, Zhou Y, Deng H, Heaney R, Recker RR: Genetic determination of Colles’ fracture and differential bone mass in women with and without Colles’ fracture. J Bone Miner Res. 2000, 15: 1243-1252. 10.1359/jbmr.2000.15.7.1243.

  4. Duncan EL, Danoy P, Kemp JP, Leo PJ, McCloskey E, Nicholson GC, Eastell R, Prince RL, Eisman JA, Jones G, Sambrook PN, Reid IR, Dennison EM, Wark J, Richards JB, Uitterlinden AG, Spector TD, Esapa C, Cox RD, Brown SD, Thakker RV, Addison KA, Bradbury LA, Center JR, Cooper C, Cremin C, Estrada K, Felsenberg D, Glüer CC, Hadler J: Genome-wide association study using extreme truncate selection identifies novel genes affecting bone mineral density and fracture risk. PLoS Genet. 2011, 7: e1001372-10.1371/journal.pgen.1001372.

  5. Li X, Masinde G, Gu W, Wergedal J, Mohan S, Baylink DJ: Genetic dissection of femur breaking strength in a large population (MRL/MpJ x SJL/J) of F2 mice: single QTL effects, epistasis and pleiotropy. Genomics. 2002, 79: 734-740. 10.1006/geno.2002.6760.

  6. Lagerholm S, Park HB, Luthman H, Grynpas M, McGuigan F, Swanberg M, Åkesson K: Identification of candidate gene regions in the rat by co-localization of QTLs for bone density, size, structure and strength. PLoS One. 2011, 6: e22462-10.1371/journal.pone.0022462.

  7. Campbell AW, Bain WE, McRae AF, Broad TE, Johnstone PD, Dodds KG, Veenvliet BA, Greer GJ, Glass BC, Beattie AE, Jopson NB, McEwan JC: Bone density in sheep: genetic variation and quantitative trait loci localisation. Bone. 2003, 33: 540-548. 10.1016/S8756-3282(03)00228-X.

  8. Duncan EL, Brown MA: Genetic determinants of bone density and fracture risk – state of the art and future directions. J Clin Endocrinol Metab. 2010, 95: 2576-2587. 10.1210/jc.2009-2406.

  9. Ralston SH: Genetics of osteoporosis. Proc Nutr Soc. 2007, 66: 158-165. 10.1017/S002966510700540X.

  10. McKee SL: An update on racing fatalities in the UK. Equine Vet Educ. 1995, 7: 202-204. 10.1111/j.2042-3292.1995.tb01225.x.

  11. Parkin TD, Clegg PD, French NP, Proudman CJ, Riggs CM, Singer ER, Webbon PM, Morgan KL: Horse-level risk factors for fatal distal limb fracture in racing Thoroughbreds in the UK. Equine Vet J. 2004, 36: 513-519.

  12. Verheyen KL, Wood JL: Descriptive epidemiology of fractures in British Thoroughbred racehorses in training. Equine Vet J. 2004, 36: 167-173.

  13. Ramzan PHL, Palmer L: Musculoskeletal injuries in thoroughbred racehorses: a study of three large training yards in Newmarket (2005-2007). Vet J. 2011, 187: 325-329. 10.1016/j.tvjl.2009.12.019.

  14. Clegg PD: Musculoskeletal disease and injury, now and in the future. Part 1: fractures and fatalities. Equine Vet J. 2011, 43: 643-649. 10.1111/j.2042-3306.2011.00457.x.

  15. Stover SM: The epidemiology of thoroughbred racehorse injuries. Clinical Techniques in Equine Practice. 2003, 2: 312-322. 10.1053/j.ctep.2004.04.003.

  16. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, Sham PC: PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007, 81: 559-575. 10.1086/519795.

  17. Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21: 263-265. 10.1093/bioinformatics/bth457.

  18. Yang J, Lee SH, Goddard ME, Visscher PM: GCTA: a tool for Genome-wide Complex Trait Analysis. Am J Hum Genet. 2011, 88: 76-82. 10.1016/j.ajhg.2010.11.011.

  19. Lee SH, Wray NR, Goddard ME, Visscher PM: Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet. 2011, 88: 294-305. 10.1016/j.ajhg.2011.02.002.

  20. O’Donovan MC, Carroll L, Georgieva L, Dwyer S, Holmans P, Marchini JL, Spencer CC, Howie B, Leung HT, Hartmann AM, Möller HJ, Morris DW, Shi Y, Feng G, Hoffmann P, Propping P, Vasilescu C, Maier W, Rietschel M, Zammit S, Schumacher J, Quinn EM, Schulze TG, Williams NM, Giegling I, Iwata N, Ikeda M, Darvasi A, Shifman S, He L: Identification of loci associated with schizophrenia by genome-wide association and follow-up. Nat Genet. 2008, 40: 1053-1055. 10.1038/ng.201.

  21. Esslinger C, Walter H, Kirsch P, Erk S, Schnell K, Arnold C, Haddad L, Mier D, Opitz von Boberfeld C, Raab K, Witt SH, Rietschel M, Cichon S, Meyer-Lindenberg A: Neural mechanisms of a genome-wide supported psychosis variant. Science. 2009, 324: 605-10.1126/science.1167768.

  22. Girgenti MJ, LoTurco JJ, Maher BJ: ZNF804A regulates expression of the schizophrenia-associated genes PRSS16, COMT, PDE4B and DRD2. PLoS One. 2012, 7: e32404-10.1371/journal.pone.0032404.

  23. Eriksson AL, Mellström D, Lorentzon M, Orwoll ES, Redlund-Johnell I, Grundberg E, Holmberg A, Ljunggren O, Karlsson MK, Ohlsson C: The COMT val158met polymorphism is associated with prevalent fractures in Swedish men. Bone. 2008, 42: 107-112. 10.1016/j.bone.2007.08.045.

  24. Bolton JM, Metge C, Lix L, Prior H, Sareen J, Leslie WD: Fracture risk from psychotropic medications: a population-based analysis. J Clin Psychopharmacol. 2008, 28: 384-391. 10.1097/JCP.0b013e31817d5943.

  25. Hill EW, McGivney BA, Gu J, Whiston R, MacHugh DE: A genome-wide SNP-association study confirms a sequence variant (g. 66493737C > T) in the equine myostatin (MSTN) gene as the most powerful predictor of optimum racing distance for Thoroughbred racehorses. BMC Genomics. 2010, 11: 552-10.1186/1471-2164-11-552.

  26. Hill EW, Gu J, Eivers SS, Fonseca RG, McGivney BA, Govindarajan P, Orr N, Katz LM, MacHugh DE: A sequence polymorphism in MSTN predicts sprinting ability and racing stamina in thoroughbred horses. PLoS ONE. 2010, 5: e8645-10.1371/journal.pone.0008645.

  27. Tozaki T, Miyake T, Kakoi H, Gawahara H, Sugita S, Hasegawa T, Ishida N, Hirota K, Nakano Y: A genome-wide association study for racing performances in Thoroughbreds clarifies a candidate region near the MSTN gene. Anim Genet. 2010, 41 (Suppl 2): 28-35.

  28. Petersen JL, Mickelson JR, Rendahl AK, Valberg SJ, Andersson LS: Genome-wide analysis reveals selection for important traits in domestic horse breeds. PLoS Genet. 2013, 9 (1): e1003211-10.1371/journal.pgen.1003211.



The study was funded by the Horserace Betting Levy Board and the Thoroughbred Breeders’ Association.

Author information

Corresponding author

Correspondence to Sarah C Blott.

Additional information

Competing interests

A patent application has been filed: UK patent application No. 1314131.2.

Authors’ contributions

TDHP, MV and SCB designed the study. MH, JRN, TDHP and LH collected samples. CS, LYF-C and JES prepared the data and performed quality control analyses. SCB performed statistical analyses. SCB, JES, TDHP, LH, MV contributed to writing the paper. All authors read and approved the final manuscript.

Electronic supplementary material

Additional file 1: Table S1: Distribution of fracture locations among the cases. (DOC 30 KB)

Additional file 2: Table S2: List of additional 78 SNPs genotyped on ECA 18 between 61.89 Mb and 71.17 Mb. (DOC 132 KB)


Additional file 3: Figure S1: Multidimensional scaling (MDS) plots showing National Hunt and flat-bred horses and fracture cases and controls. (DOC 170 KB)


Additional file 4: Figure S2: Quantile-quantile (Q-Q) plot obtained after the Cochran-Mantel-Haenszel (CMH) association test. (DOC 44 KB)

Additional file 5: Table S3: Genetic variance and heritability estimates for fracture risk by chromosome. (DOC 67 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

About this article

Cite this article

Blott, S.C., Swinburne, J.E., Sibbons, C. et al. A genome-wide association study demonstrates significant genetic variation for fracture risk in Thoroughbred racehorses. BMC Genomics 15, 147 (2014).
