Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle
BMC Genomics volume 15, Article number: 837 (2014)
Four traits related to carcass performance have been identified as economically important in beef production: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Although Holstein-Friesian cattle are primarily utilized for milk production, they are also an important source of meat for beef production and export. Because of this, there is great interest in understanding the underlying genomic structure influencing these traits. Several genome-wide association studies have identified regions of the bovine genome associated with growth or carcass traits; however, little is known about the mechanisms or underlying biological pathways involved. This study aims to detect regions of the bovine genome associated with carcass performance traits (employing a panel of 54,001 SNPs) using measures of genetic merit (as predicted transmitting abilities) for 5,705 Irish Holstein-Friesian animals. Candidate genes and biological pathways were then identified for each trait under investigation.
Following adjustment for false discovery (q-value < 0.05), 479 quantitative trait loci (QTL) were associated with at least one of the four carcass traits using a single SNP regression approach. Using a Bayesian approach, 46 QTL were associated (posterior probability > 0.5) with at least one of the four traits. In total, 557 unique bovine genes, which mapped to 426 human orthologs, were within 500 kb of QTL found to be associated with a trait using the Bayesian approach. Using this information, 24 significantly over-represented pathways were identified across all traits. The most significantly over-represented biological pathway was the peroxisome proliferator-activated receptor (PPAR) signaling pathway.
A large number of genomic regions putatively associated with bovine carcass traits were detected using two different statistical approaches. Notably, several significant associations were detected in close proximity to genes with a known role in animal growth such as glucagon and leptin. Several biological pathways, including PPAR signaling, were shown to be involved in various aspects of bovine carcass performance. These core genes and biological processes may form the foundation for further investigation to identify causative mutations involved in each trait. Results reported here support previous findings suggesting conservation of key biological processes involved in growth and metabolism.
Animal growth is an economically important trait for livestock raised for meat production. Carcass traits, related to animal growth, are critical to the biological and economical efficiency of cattle production and, as such, there is great interest in understanding the underlying genomic architecture influencing these traits. Quantitative trait loci (QTL) associated with a particular trait can be used to predict disease risk or genetic merit of an animal [1, 2]. This information may also be used to investigate the molecular mechanisms and biological pathways involved in phenotypic variation between animals. Investigating complex traits in domestic animals may also provide insights into mechanisms underlying similar traits, such as growth and fat deposition, in humans.
Holstein-Friesian cattle are a popular breed primarily used for their ability to produce large quantities of milk. However, Holstein-Friesian cattle are also an important source of meat for beef production. Several studies in cattle have identified associations between carcass traits and regions of the bovine genome. Carcass trait QTL have been reported most often on chromosomes 2, 3, 6, 14, 20 and 29 [3–7]. However, most studies reporting carcass QTL have been performed using beef breeds such as Aberdeen Angus. There have been no studies to date that have investigated the association of SNP genotypes with carcass performance utilizing measures of genetic merit estimated in dairy breeds. Although many studies have reported carcass QTL in regions containing genes with a known role in animal growth (such as the myostatin gene on bovine chromosome 2 [8, 9]), little is known about the mechanisms or underlying biological pathways involved in growth or carcass traits. Moreover, many of the reported QTL have been identified using raw phenotypic data, and how well such data reflect the underlying genetic merit of the animal depends on the heritability of the trait. By using estimates of individual animal genetic merit, generated from the accumulation of phenotypic information on relatives, the accuracy of the phenotype can be considerably greater, and thus the statistical power of the association study is greater for the same number of genotyped individuals.
The objective of this study was to identify regions of the bovine genome associated with carcass performance traits using two statistical approaches: a single marker regression and multi-locus Bayesian approach. Regions detected as associated with a trait were then further investigated to identify the potential causal pathways and biological processes underlying each trait.
Semen samples for genotyping were collected by the Irish Cattle Breeding Federation and partner artificial insemination organizations. All animal procedures were carried out according to the provisions of the Irish Cruelty to Animals Act (licenses issued by the Department of Health and Children).
Genotypes of 54,001 biallelic single nucleotide polymorphism (SNP) markers from 5,706 Holstein-Friesian sires were available for use in this study. All genotyping was carried out using the Illumina Bovine SNP50 version 1 Beadchip (Illumina Inc., San Diego, CA). SNP positions were based on the Btau 4.0 assembly of the bovine genome. All SNPs on the X-chromosome or with an unknown position in the genome were removed from the dataset. Quality filtering was then undertaken to remove SNPs with inconsistent Mendelian inheritance patterns from sire to progeny. SNPs that had a minor allele frequency of less than 5% were also discarded. If a SNP had greater than 5% of calls missing, it was excluded from further analysis. Also, SNPs that failed to distinctly cluster into homozygous and heterozygous calls were removed. A total of 42,477 SNPs remained for analysis after quality filtering.
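The two numeric filters described above (minor allele frequency and missing-call rate) can be illustrated with a minimal Python sketch. The function names and the genotype encoding (allele counts 0, 1, 2, with None for a missing call) are assumptions made for illustration and are not taken from the study; the Mendelian-consistency and clustering checks are not covered.

```python
def maf_and_missing(genotypes):
    """Return (minor allele frequency, missing-call rate) for one SNP.

    genotypes: list of allele counts (0, 1 or 2), with None for a missing call.
    This encoding is an assumption made for the example.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if not called:
        return 0.0, missing_rate
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1 - freq), missing_rate


def passes_qc(genotypes, min_maf=0.05, max_missing=0.05):
    """Keep a SNP only if MAF is at least 5% and at most 5% of calls are missing."""
    maf, missing_rate = maf_and_missing(genotypes)
    return maf >= min_maf and missing_rate <= max_missing
```

For example, `passes_qc([0] * 19 + [1])` is false because the minor allele frequency is 1/40, below the 5% threshold.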
Phenotypes for four economically important carcass traits were used in this study: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Carcass weight refers to the cold weight (measured in kg) of the carcass taken within 2 hours of slaughter after being bled and eviscerated, and after removal of skin, external genitalia, the limbs at the carpus and tarsus, head, tail, kidneys and kidney fats and the udder. Progeny carcass weight is the carcass weight of a sire’s progeny measured on males from 300–1200 days of age and on females (which have not produced a calf) from 300–875 days of age. Carcass fat and conformation phenotypes have been assessed since 2005 by video image analysis of the outside of the carcass on a 15-point scale. Progeny carcass fat is the quantity of subcutaneous fat on the carcass of the slaughtered animal, varying from 1 (leanest) to 15 (fattest). Progeny carcass conformation is the thickness of muscle on the carcass of the slaughtered animal, scored on a scale of 1 (poor conformation) to 15 (excellent conformation). Cull cow carcass weight refers to the carcass weight of a dairy or beef cow slaughtered for meat at the end of her productive life. Cows are between 875 and 4000 days of age. Phenotypes for each of these traits are published as predicted transmitting abilities (PTAs), which are measures of sire genetic merit based not on the performance of the sires themselves but on that of their descendants across multiple generations. Each PTA is accompanied by a respective reliability, which is the confidence in the estimated PTA (on a scale of 0–99%). As more information is included in an animal’s genetic evaluation, the reliability of the evaluation increases and the likelihood that the animal’s PTA will change in the future is reduced.
PTAs calculated by the Irish Cattle Breeding Federation, and their respective reliabilities, were available for all animals used in this study. Genotypic and phenotypic data for all animals utilized in this study can be requested from the Irish Cattle Breeding Federation. The Irish Cattle Breeding Federation database identifiers for all animals used in this study are contained in Additional file 1. These animals were representative of the Holstein-Friesian population in Ireland. Phenotypic edits were then applied to the animal data. An adjusted reliability was estimated for each animal by removing the parental contribution to reliability as described by Harris and Johnson. To ensure accurate phenotypes, for each trait separately, animals with an adjusted reliability of <70% were discarded. Following removal of animals with a low adjusted reliability, 1,061 animals remained for further analysis. Summary statistics for each of the phenotypes (as de-regressed PTA), following removal of animals with an adjusted reliability of <70%, are in Table 1.
Two statistical approaches, a frequentist (single SNP regression) and Bayesian approach (BayesB), were used to estimate associations between SNPs and each trait separately.
Single SNP regression
The single SNP regression (SSR) model included each SNP separately as a continuous variable (i.e., count of a given allele) in a linear animal mixed model using ASReml. The individual animal was included as a random effect. Relationships between animals were accounted for using the additive genetic relationship matrix. Pedigree information consisted of 6,854 animals. The dependent variable was de-regressed PTA. Marker effects and associated P-values for each SNP were obtained from the analysis. P-values were adjusted to correct for errors arising from multiple testing using a false discovery rate (FDR) approach (FDR < 0.05) described by Storey and Tibshirani. This procedure was carried out using the qvalue package in R. Resultant q-values <0.05 were defined as significant. Adjacent SNPs, based on genomic location, that had q-values <0.05 were considered to be part of the same QTL. Genomic co-ordinates, identifier information, and q-values for all SNPs in the analysis are contained in Additional file 2.
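The multiple-testing adjustment and the merging of adjacent significant SNPs into QTL can be sketched as follows. This sketch does not reproduce the Storey and Tibshirani procedure: it substitutes Benjamini-Hochberg adjusted p-values, which correspond to q-values when the estimated proportion of true null hypotheses is fixed at 1 and are therefore somewhat more conservative. The function names and the (chromosome, position) input layout are assumptions.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values with pi0 fixed at 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def group_into_qtl(snps, qvalues, threshold=0.05):
    """Merge runs of adjacent significant SNPs on one chromosome into QTL.

    snps: list of (chromosome, position), sorted by chromosome then position.
    Returns a list of QTL, each a list of SNP indices.
    """
    qtl, current = [], []
    for i, (snp, q) in enumerate(zip(snps, qvalues)):
        significant = q < threshold
        same_chrom = bool(current) and snps[current[-1]][0] == snp[0]
        if significant and same_chrom:
            current.append(i)  # extends the run started by the previous SNP
        else:
            if current:
                qtl.append(current)
            current = [i] if significant else []
    if current:
        qtl.append(current)
    return qtl
```

A non-significant SNP always closes the current run, so two significant SNPs separated by a non-significant one are reported as separate QTL, in line with the adjacency rule stated above.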
The second statistical approach utilized the Bayesian mixture model “BayesB” as described by Meuwissen et al. Source code for the BayesB software was provided by the author (Donagh P. Berry). A local version of BayesB was compiled on an in-house Linux server, allowing us to carry out many parallel analyses efficiently. The Bayesian approach allows the incorporation of prior knowledge about the distribution of SNP effects. An inverse chi-squared distribution (v = 4.234, S = 0.0429) was included in the model as the prior distribution of the mean and genetic variation of each SNP included in the model.
A prior value was assigned to π, which determines the prior probability of association (1 - π) for each SNP. As this prior probability is assigned to all SNPs in the analysis, it reflects, a priori, the proportion of SNPs assumed to be associated with a particular trait. Analyses were run with alternative values of this prior probability (1 - π), ranging from 0.05 to 6.25 × 10⁻⁵ (specifications of (1 - π) are included in Additional file 3).
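To make the role of π and of the inverse chi-squared prior concrete, the sketch below draws SNP-effect variances from a BayesB-style mixture. It is illustrative only: the function name is hypothetical, and the scaling convention (variance drawn as S divided by a chi-square variate with v degrees of freedom) is an assumption, since the text gives only the distribution family and its two parameters.

```python
import random


def sample_prior_variance(pi, v=4.234, S=0.0429, rng=random):
    """Draw one SNP-effect variance from a BayesB-style mixture prior.

    With probability pi the SNP has no effect (variance 0). Otherwise the
    variance is drawn as S / chi-square(v); this scaling is an assumption.
    """
    if rng.random() < pi:
        return 0.0
    # chi-square(v) is a Gamma distribution with shape v/2 and scale 2.
    chi_square = rng.gammavariate(v / 2, 2)
    return S / chi_square
```

With π = 0.95, roughly 5% of draws are non-zero, which is the sense in which (1 - π) is the proportion of SNPs assumed, a priori, to be associated with the trait.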
Additional analyses were also performed using the proportion of non-significant (q ≥ 0.05) SNPs that were estimated from the SSR analysis (pSSR), and half and double this value, to determine π. This was then used to quantify a prior proportion of SNPs assumed to be associated with each trait (1 – π). A total of eleven analyses were run for each trait. Markov Chain Monte Carlo (MCMC) chains were used to sample every 500th iteration from the posterior distribution of SNP effects. Total iterations for each analysis are contained in Additional file 3.
Convergence of the model for each analysis was confirmed by three approaches. The first was a visual inspection of summed absolute log-likelihood values. All sampled iterations before convergence were discarded as burn-in. The number of iterations discarded as burn-in for each analysis is given in Additional file 4. From the remaining sampled iterations, posterior probabilities (PPs) of association were calculated. A PP is the number of sampled iterations after burn-in in which a SNP had a non-zero effect, divided by the total number of sampled iterations after burn-in. The PP is indicative of the probability that a SNP is associated with a phenotype. A PP near zero indicates a low probability of association, whereas a PP near 1 indicates a high probability of association.
The second approach used to confirm convergence was to quantify and plot the total number of SNPs that had a PP > 0.5 at each iteration. The resultant trace plot was visually inspected to determine whether the MCMC chains had run long enough to give confidence that all high PP QTL had been detected.
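The definition of the posterior probability translates directly into a short calculation. In this sketch the input layout (one list of SNP effects per sampled iteration) and the function name are assumptions.

```python
def posterior_probabilities(samples, burn_in):
    """PP per SNP: share of post-burn-in samples with a non-zero effect.

    samples: list of sampled iterations, each a list of SNP effects.
    burn_in: number of leading sampled iterations to discard.
    """
    kept = samples[burn_in:]
    if not kept:
        raise ValueError("no sampled iterations remain after burn-in")
    n_snps = len(kept[0])
    return [
        sum(1 for iteration in kept if iteration[j] != 0) / len(kept)
        for j in range(n_snps)
    ]
```

A SNP that is non-zero in one of four retained samples receives a PP of 0.25 and would not pass the PP > 0.5 threshold used later.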
Thirdly, the estimated marker effects for each SNP were checked for convergence. The combined difference between the estimated SNP effect of those SNPs with a PP > 0.5 from the Bayesian approach and the SNP effect for the same set of SNPs as estimated using the SSR approach was calculated using a Euclidean distance. Visual inspection of the trace plot produced by plotting a Euclidean distance at each iteration confirmed convergence of this model parameter.
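The distance used in this third check can be sketched as below. The text does not state how the Bayesian effect estimate at a given iteration was formed; the trace function here assumes the running mean of the sampled effects up to that iteration, and both function names are hypothetical.

```python
import math


def effect_distance(bayes_effects, ssr_effects):
    """Euclidean distance between two equal-length vectors of SNP effects."""
    if len(bayes_effects) != len(ssr_effects):
        raise ValueError("effect vectors must cover the same SNPs")
    return math.sqrt(sum((b - s) ** 2 for b, s in zip(bayes_effects, ssr_effects)))


def distance_trace(samples, ssr_effects):
    """Distance after each sampled iteration, using the running mean effect.

    Using the running mean as the Bayesian estimate is an assumption.
    """
    totals = [0.0] * len(ssr_effects)
    trace = []
    for n, iteration in enumerate(samples, start=1):
        totals = [t + e for t, e in zip(totals, iteration)]
        trace.append(effect_distance([t / n for t in totals], ssr_effects))
    return trace
```

A trace that flattens out as iterations accumulate is the pattern a visual inspection would look for.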
Identifying significant associations
For each analysis, once convergence had been confirmed and the burn-in discarded, posterior probabilities (PPs) were calculated. However, due to the effect of strong linkage disequilibrium (LD), the posterior probability of a QTL may be distributed across several adjacent SNPs. To account for this, and to accurately identify the presence of a QTL, posterior probabilities were also calculated using a sliding window of 5 adjacent SNPs based on genomic location. QTL with a resulting PP > 0.5 were defined as high PP QTL. For each trait separately, high PP QTL for each of the eleven analyses (1 - π = 1 - pSSR/2, 1 - pSSR, 1 - pSSR × 2, and 0.05 to 6.25 × 10⁻⁵) were identified. The number of analyses, out of the 11, in which a QTL had a PP > 0.5 was calculated and assigned to the QTL as its occurrence rate.
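A sliding-window posterior probability can be sketched as follows. The text does not say how the PP of a window was derived from its five SNPs; this sketch assumes a window counts as included in a sampled iteration when any SNP in it has a non-zero effect. The function name and input layout are also assumptions.

```python
def window_posterior_probabilities(samples, window=5):
    """PP for each run of `window` adjacent SNPs.

    samples: post-burn-in sampled iterations, each a list of SNP effects
    ordered by genomic position on one chromosome.
    A window is counted for an iteration if any SNP in it has a non-zero
    effect (assumed rule).
    """
    n_snps = len(samples[0])
    result = []
    for start in range(n_snps - window + 1):
        hits = sum(
            1 for iteration in samples
            if any(e != 0 for e in iteration[start:start + window])
        )
        result.append(hits / len(samples))
    return result
```

Under this rule, a signal that alternates between two SNPs in strong LD, each with an individual PP of 0.5 or less, can still yield a window PP above 0.5, which is the situation the windowing is meant to address.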
For each of the eleven analyses within a trait, an average occurrence rate was calculated by summing the individual QTL occurrence rates of QTL with a PP > 0.5 and dividing this value by the total number of QTL with a PP > 0.5. The analysis with the highest average occurrence rate was then identified (Additional file 5). All QTL with a PP > 0.5 within the analysis with the highest average occurrence rate were then considered significantly associated with the respective trait. This was done for each trait separately, resulting in 4 datasets of significantly associated QTL corresponding to each trait under investigation (Additional file 2). Each dataset represented the analysis with the largest number of frequently occurring high PP QTL for each trait.
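The selection rule based on occurrence rates can be written out as a small sketch. The data layout (a mapping from an analysis label to the set of QTL identifiers with PP > 0.5 in that analysis) and the function name are assumptions for illustration.

```python
def select_analysis(high_pp_qtl):
    """Pick the analysis whose high-PP QTL recur most often across priors.

    high_pp_qtl: dict mapping an analysis label to the set of QTL identifiers
    with PP > 0.5 in that analysis.
    Returns (best_label, average_occurrence_rate_by_label).
    """
    # Occurrence rate: number of analyses in which each QTL has PP > 0.5.
    occurrence = {}
    for qtl_set in high_pp_qtl.values():
        for qtl in qtl_set:
            occurrence[qtl] = occurrence.get(qtl, 0) + 1
    # Average occurrence rate per analysis (analyses with no QTL are skipped).
    averages = {
        label: sum(occurrence[q] for q in qtl_set) / len(qtl_set)
        for label, qtl_set in high_pp_qtl.items()
        if qtl_set
    }
    best = max(averages, key=averages.get)
    return best, averages
```

For three hypothetical analyses with QTL sets {q1, q2}, {q1, q2, q3} and {q1, q4}, the occurrence rates are 3, 2, 1 and 1, the averages are 2.5, 2.0 and 2.0, and the first analysis is selected.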
Four datasets, corresponding to each trait, were created by identifying all bovine genes within a 500 kb region upstream and downstream of SNPs located within a QTL significantly associated with a trait using the Bayesian method. To investigate the combined role that some pathways may have on each of these traits, a combined trait dataset containing all genes from each of the individual trait datasets was also created. The genes in each of these five datasets were then mapped to their human orthologs using the mapping available from version hg19 of the human genome. A background set of all possible orthologs that could be represented was created, containing all human genes that had a bovine ortholog within 500 kb of a SNP included in the analysis (17,186 human genes). For each trait dataset, the R package GOseq, without the correction for gene length bias, was used to identify the KEGG pathways that were significantly over-represented (p < 0.05) by the set of genes compared against the background set of human genes.
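The over-representation test can be illustrated with a plain hypergeometric calculation. This is a sketch rather than the GOseq implementation: it assumes that running the test without the gene-length correction amounts to a standard upper-tail hypergeometric test, and the function and parameter names are hypothetical.

```python
from math import comb


def overrepresentation_p(n_background, n_in_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap).

    n_background: genes in the background set.
    n_in_pathway: background genes annotated to the pathway.
    n_selected:   genes in the trait dataset.
    n_overlap:    trait-dataset genes that are in the pathway.
    """
    upper = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

In the setting described above, `n_background` would be 17,186 and `n_selected` the number of human orthologs in a trait dataset; the pathway size and overlap depend on the KEGG annotation and are not given in the text.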
Using the SSR method, two QTL were associated (q < 0.05) with carcass weight. These SNPs were on chromosomes 3 and 19 (Figure 1).
In the Bayesian analysis, 11 QTL were associated with carcass weight (Table 2), three of which were located on chromosome 3. Interestingly, two of these QTL, within 2.5 Mb of each other, were located either side of the leptin receptor. Only one of the 11 QTL was associated with both carcass weight and at least one of the other three carcass traits. This QTL, on chromosome 6 (~85 Mb), was associated with both carcass weight and carcass conformation using the Bayesian method. None of the QTL identified as associated with carcass weight were common to both statistical approaches.
Using the SSR approach, 24 QTL were associated (q < 0.05) with carcass fat (Table 2). The most significantly associated SNP from this analysis (q = 8.45 × 10⁻⁵), rs109514593, was located within a QTL on chromosome 8 at ~22 Mb (Figure 1), while another SNP (rs41607785), located within a separate QTL approximately 1 Mb away from rs109514593, was also associated with carcass fat. Five QTL were associated with both carcass fat and cull cow carcass weight. One SNP, rs109776183, was associated with both carcass fat and carcass conformation.
Using the Bayesian method, six QTL were associated with carcass fat. Each of these QTL was located on a different chromosome. One QTL on chromosome 3 (~105 Mb) was associated with carcass fat using both the Bayesian and SSR methods. This SNP was located approximately 600 kb away from rs43359171, which was also associated with carcass fat using the SSR approach.
A total of 414 QTL were associated (q < 0.05) with carcass conformation in the SSR analysis (Table 2). Significant QTL for carcass conformation were located on all chromosomes (Figure 1). Twenty-one QTL showed a strong association with this trait (q < 0.005), the most significant (q = 3.787 × 10⁻⁴) of which was on chromosome 20. This SNP, rs41580285, resided within a cluster of 5 strongly associated SNPs (q < 0.005), all of which were less than 1 Mb away from the growth hormone receptor (GHR) gene.
Thirteen QTL were associated with carcass conformation in the Bayesian analysis. Seven of these QTL contained at least one SNP that was also associated with carcass conformation using the SSR approach. One of these SNPs was strongly associated with carcass conformation (q < 0.005) using the SSR method. One SNP, located on chromosome 6, was also associated with carcass weight using the Bayesian method.
Cull cow carcass weight
A total of 60 QTL were associated with cull cow carcass weight using either the Bayesian or SSR method (Table 2). Of these, 46 QTL were associated (q < 0.05) with cull cow carcass weight using the SSR method (Figure 1). One SNP, rs41935177, was detected as being strongly associated with cull cow carcass weight by both the SSR (q < 0.005) and Bayesian (PP = 0.95) methods. Seven SNPs from this analysis were associated with both cull cow carcass weight and another carcass trait (5 SNPs were associated with carcass fat and 2 with carcass conformation) using the SSR approach.
Seventeen QTL were associated with cull cow carcass weight in the Bayesian analysis. Three of these QTL, centered on rs109184437, rs41935177 and rs110340777 respectively, were also significantly associated with cull cow carcass weight using the SSR approach.
Over-represented KEGG pathways
In total, 557 unique bovine genes were within 500 kb of a QTL associated with a trait using the Bayesian approach (Additional file 6). Of these, 423 mapped to 426 human orthologs. The most significantly over-represented KEGG pathway detected using these genes was the peroxisome proliferator-activated receptor (PPAR) signaling pathway (p = 1.14 × 10⁻³) (Additional file 7). This pathway was significantly over-represented in both carcass fat and the combined trait analyses. In fact, all nine pathways significantly over-represented in the combined trait analysis were also significantly over-represented for a trait when only orthologs from that trait were used in the analysis. Twenty-four different pathways were significantly over-represented across all analyses and are contained in Table 3.
The aim of the study was to identify regions of the bovine genome associated with carcass characteristics using phenotypes of four economically important carcass traits in Holstein-Friesian cattle: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Two statistical approaches, one Bayesian and one frequentist, were used to detect associations between SNPs and each of the traits separately. Detected SNP associations using either approach were distributed across all autosomes.
The Bayesian approach
The Bayesian and SSR methods differ fundamentally in their underlying approaches. The single SNP regression method tests each SNP individually, whereas the Bayesian approach tests one SNP at a time while taking cognizance of all other SNPs simultaneously. This was particularly evident where the Bayesian approach identified a single marker whereas the SSR approach identified a cluster of adjacent significant associations for the same location (e.g. chromosome 20 at ~10 Mb for cull cow carcass weight); this is a consequence of linkage disequilibrium in the genome. Also, the Bayesian approach is advantageous as there is no need to correct for Type I errors arising from many thousands of tests. This allowed us to detect associations that might have been removed as false positives by the multiple testing correction method applied to the SSR approach. In fact, 40 of 47 QTL identified from the Bayesian approach were also significantly associated (p < 0.05) with the same trait using the SSR before correction for multiple tests. After correcting for multiple testing, this number dropped to 11. Furthermore, as complex traits are likely to be influenced by a large number of mutations, models that analyze all markers simultaneously should provide more accurate results than models that analyze one or a few markers independently. Bayesian approaches may therefore have greater power to detect SNPs with moderate effects on a trait of interest. Additionally, the ability to incorporate information a priori into the model would appear to be advantageous for complex traits that are influenced by many variants. Although inclusion of a prior may bias results to fit that prior, it is likely that SNPs with the strongest association will be identified irrespective of the prior proportion of SNPs assumed to have an effect. However, this cannot be guaranteed and, as such, should be investigated, as was done in this study.
Our choice of prior would appear to be robust, as it represents the most frequently occurring high PP QTL across different prior specifications.
In MCMC Bayesian approaches it is necessary to ensure that the chains have converged before calculating posterior probabilities [22–24]. This can be done in several ways. For instance, previous GWAS analyses using Bayesian approaches have used the convergence of the SNP effect for selected SNPs as evidence of model convergence. This, however, is only a single parameter and its convergence may not represent the convergence of the entire model. As discussed by Cowles and Carlin (1996), there is no one conclusive diagnostic that can provide assurance of convergence. Convergence of all parameters, not just those of interest, should be checked before making any posterior inferences. With some models, certain parameters can appear to exhibit good convergence behavior. This, however, can be misleading due to the slow convergence of other parameters. To tackle this problem, we used the convergence of the sum of log-likelihoods for all SNPs at each iteration and identified when this converged. In addition to this, the total number of high PP (PP > 0.5) SNPs and the Euclidean distance between SNP effects estimated from the Bayesian and SSR approaches for these SNPs were plotted at every sampled iteration. This was to ensure that the MCMC chains had run long enough and that the model had successfully converged.
A large number of associations (514 QTL) were detected across all traits when the results of the two statistical approaches were combined. However, most of these were detected for carcass conformation (414) using the SSR approach (q < 0.05). This may be due to biological noise caused by an increased complexity of this trait compared to the others analyzed, or because the trait may be more greatly influenced by several other unmeasured physical characteristics such as bone size and carcass frame. At a significance of q < 0.005, a total of 21 QTL were associated with carcass conformation using the SSR approach. This figure was much more similar to the results from the other three traits. Using this significance threshold for carcass conformation and a significance threshold of q < 0.05 for the other three traits, 90 QTL were associated with at least one trait using the SSR approach. This meant that 129 QTL were associated with at least one of the traits when the SSR and Bayesian results were combined.
Using both statistical approaches, a number of associations detected for each trait were in close proximity to genes with a known role in animal growth (e.g. growth hormone receptor (GHR), Insulin and Insulin-like growth factor 2 (IGF2)). As well as this, a number of novel candidate genes were identified. For example, significant QTL on chromosome 20 were detected within 1 Mb of fibroblast growth factor 11 (FGF11) and on chromosome 6 approximately 500 kb away from Gonadotropin-releasing hormone receptor.
Three novel associations with carcass fat were detected on chromosome 2, all of which were within a 3.5 Mb region upstream of the glucagon gene. In the same region, 5 SNPs that were associated with cull cow carcass weight were all within a 2.9 Mb region of the glucagon gene. The glucagon gene plays an important role in a number of biological processes related to metabolism and energy homeostasis. Glucagon is known to regulate fat metabolism via cAMP-dependent mechanisms in animals.
A number of associations detected by the Bayesian approach, but not by the SSR approach, occurred in regions containing genes previously reported to be associated with growth in Holstein cows (e.g. the leptin gene). Interestingly, associations from the Bayesian method that were not detected using the SSR approach also occurred in close proximity to the leptin receptor (approx. 300 kb upstream). A mutation in the leptin receptor has previously been reported to cause obesity in humans. Leptin is involved in the hypothalamic control of energy homeostasis, and is an indicator of body fat reserves and a regulator of energy expenditure. In ruminants, such as cattle, a positive correlation has been demonstrated between circulating concentrations of leptin and fat accumulation.
Over-represented KEGG pathways
Carcass traits are governed by many complex biological systems, reflecting the combined influence of many genetic factors. However, there may be central biological processes that link together the genetic regulation of all of these traits. The combined trait analysis detected biological pathways that were not found using the individual trait datasets (e.g. peroxisome proliferator-activated receptor signaling pathway). These biological pathways contained genes associated with several different carcass traits, and are thus likely involved in different aspects of each of these traits.
Peroxisome proliferator-activated receptor
The peroxisome proliferator-activated receptor (PPAR) signaling pathway was the most significantly over-represented pathway in both the carcass fat analysis (p = 0.00114) and the combined trait dataset (p = 0.00672). PPARs are a group of transcription factors that play an essential physiological role in the regulation of adipocyte tissue development, lipogenesis and skeletal muscle lipid metabolism [32–34]. There are three members of the PPAR family, PPARα, PPARγ and PPARδ, each of which is encoded by a separate gene. PPARs regulate transcription by binding with retinoid X receptors. This heterodimer binds to peroxisome proliferator response elements in the promoter region of target genes, which then stimulates expression. Both PPARα and retinoid X receptor α were identified as candidate genes involved in regulating carcass weight and carcass conformation, respectively. Furthermore, PPARα is also involved in controlling the expression of fatty acid binding proteins, which are a family of carrier proteins involved in mediating intracellular uptake and transport of long-chain fatty acids within the cell [38, 39]. Fatty acid binding proteins also play an important role in systemic energy homeostasis. Interestingly, genes from the carcass weight, carcass fat and carcass conformation gene datasets were also in this pathway, suggesting that PPAR may also play a role in each of these traits (Figure 2). This was not unexpected given the known genetic associations among these traits.
Phosphatidylinositol signaling system
Phosphoinositides are a family of minor membrane lipids involved in signal transduction, which play important roles in several signaling pathways within the cell. Phosphoinositides initiate signaling by specifically interacting with a large number of proteins, which can result in relocalization of the protein from one area of the cell to another, or induce conformational changes in the protein [42, 43]. The immediate precursor to all phosphoinositides is phosphatidylinositol [44, 45]. Signaling through various phosphoinositides has also been implicated in a wide range of cellular processes including cell growth and proliferation, apoptosis and intracellular vesicle trafficking [42, 43, 46, 47]. The phosphatidylinositol signaling system is initiated in response to environmental stimuli such as stress and diet. This pathway was significantly over-represented for both the carcass conformation (p = 0.0303) and the combined trait datasets (p = 0.01284) (Additional file 8). Interestingly, candidate genes from the carcass weight, carcass fat and carcass conformation gene datasets were also in this pathway. This is not surprising given the wide range of functions in which phosphatidylinositol signaling has been implicated. Furthermore, genes involved in the phosphatidylinositol signaling system have been found to be differentially expressed in studies examining growth and fatness traits in pigs. This pathway, along with pathways significantly over-represented from the combined trait dataset, may contain core biological processes linked to phenotypic variation observed in each of the traits under investigation.
Conserved biological functions
There are numerous examples of single genes (or mutations in a gene) influencing similar phenotypes in different species. Some well known examples include mutations in the myostatin gene that lead to the “double muscling” phenotype in humans, mice and cattle [8, 9]. Another example is the control of hair color by the melanocortin receptor gene (Mc1r) in humans, with similar effects on coat color in species such as cattle, pigs and horses. For complex traits, there is little known on the conservation of genes with low to moderate effects on a phenotype across species. However, there are a number of examples that suggest a degree of conservation of gene classes between mammalian species (e.g. stature and milk proteins) exists. From our study, we have identified some well-known biological processes that influence similar traits in humans such as PPAR signaling and its influence in fat deposition and metabolism. In fact, several of the pathways identified in our study have reported roles in similar traits in other organisms. For example, arachidonic acid metabolism has been linked to increased adipose tissue development in infant mice. In addition, levels of arachidonic acid content in adipose tissue have been shown to be higher in overweight and obese children. It is not surprising then, that this pathway was significantly over-represented for carcass fat. A number of pathways with a novel association in cattle, but with known effects in other organisms have also been identified (e.g. Jak-STAT signaling pathway). The Jak-STAT signaling pathway plays an important role in several processes related to cell proliferation, differentiation, migration and apoptosis. This pathway is also highly conserved across species, and has been linked to skeletal muscle development in mice and humans. This would suggest that a number of the biological processes influencing growth characteristics that are conserved in organisms such as humans are also conserved in cattle.
In the present study, a large number of significant associations, candidate regions, and biological pathways were identified using two different statistical approaches. The use of a Bayesian approach facilitated the identification of associations that might otherwise have been discarded from the single SNP regression (SSR) analysis as false positives after correcting for multiple testing.
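The way the two decision rules can disagree may be illustrated with a minimal sketch. This is not the pipeline used in this study: it assumes Benjamini-Hochberg adjusted p-values as a conservative stand-in for the q-values used in the single SNP regression analysis, the p-values and posterior probabilities are invented for illustration, and the function names `bh_adjust` and `bayes_only_hits` are hypothetical.

```python
import numpy as np

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (stand-in for q-values; an assumption)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

def bayes_only_hits(pvalues, posterior_probs, q_cut=0.05, pp_cut=0.5):
    """Indices of SNPs passing the posterior threshold but failing the FDR threshold."""
    q = bh_adjust(pvalues)
    pp = np.asarray(posterior_probs, dtype=float)
    return np.flatnonzero((pp > pp_cut) & (q >= q_cut)).tolist()

# Invented values for four SNPs (not data from this study).
p_example = [0.001, 0.02, 0.03, 0.2]
pp_example = [0.9, 0.2, 0.6, 0.7]
adjusted = bh_adjust(p_example)
only_bayes = bayes_only_hits(p_example, pp_example)
```

In this toy example the last SNP is removed by the multiple-testing correction but retained under a posterior probability threshold of 0.5, which is the kind of association referred to above.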
Bayesian approaches would seem to have merit in future association studies, as they provide several advantages over linear regression approaches, such as avoiding many thousands of tests by fitting all of the data at once and allowing the inclusion of information a priori. However, including information a priori may create bias that influences posterior inferences. As such, exploring a range of prior specifications and combining this information may reduce bias towards a single arbitrarily chosen prior. Furthermore, correctly identifying convergence of a Bayesian approach will remain a contentious subject. Monitoring the behavior of numerous model parameters, not just those of interest, as was done in this study, will provide the best opportunity to confidently confirm convergence when using a Bayesian approach.
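As a minimal sketch of monitoring several parameters at once, the example below computes the Gelman-Rubin potential scale reduction factor (R-hat) for each monitored parameter across multiple chains. This is one common diagnostic and is not necessarily the one used in this study; the parameter names, the simulated draws and the threshold of 1.1 (a widely used rule of thumb) are assumptions for illustration.

```python
import numpy as np

def gelman_rubin(chains):
    """R-hat for one parameter; chains has shape (n_chains, n_samples), post burn-in."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    # W: mean within-chain variance; B: between-chain variance scaled by n.
    W = chains.var(axis=1, ddof=1).mean()
    B = n * chains.mean(axis=1).var(ddof=1)
    var_hat = (n - 1) / n * W + B / n
    return float(np.sqrt(var_hat / W))

def check_convergence(draws, threshold=1.1):
    """draws maps parameter name -> (n_chains, n_samples) array of samples."""
    rhat = {name: gelman_rubin(c) for name, c in draws.items()}
    flagged = [name for name, r in rhat.items() if r > threshold]
    return rhat, flagged

# Simulated draws (hypothetical parameters, not output from this study).
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 2000))                       # chains agree
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])   # two chains offset
rhat, flagged = check_convergence({"pi": mixed, "residual_variance": stuck})
```

Here a parameter of direct interest could appear converged while a nuisance parameter is flagged, which is why checking all monitored parameters, rather than only those of interest, is the safer practice.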
A large number of significant associations were detected in this analysis. These associations can help to further refine known large QTL regions and support the identification of any underlying causative mutations. In addition, the gene datasets created within this study may form the basis of further investigation, using next-generation sequencing technologies for targeted re-sequencing, which may yield a panel of potential causative mutations. Furthermore, a number of biological pathways with a known role in organisms such as humans and mice were identified as having a function in analogous traits in Bos taurus. This supports previous findings which suggest that several core biological processes involved in growth and metabolism are highly conserved across species. In particular, the PPAR signaling pathway would appear to have a key role in controlling several aspects of bovine growth. However, further investigation is warranted to understand the cumulative influence of gene interactions and the multi-faceted role that PPAR and other core biological pathways have on the phenotypic expression of growth and carcass traits.
Wray NR, Goddard ME, Visscher PM: Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 2007, 17 (10): 1520-1528. 10.1101/gr.6665407.
de Roos AP, Schrooten C, Veerkamp RF, van Arendonk JA: Effects of genomic selection on genetic improvement, inbreeding, and merit of young versus proven bulls. J Dairy Sci. 2011, 94 (3): 1559-1567. 10.3168/jds.2010-3354.
McClure MC, Morsci NS, Schnabel RD, Kim JW, Yao P, Rolf MM, McKay SD, Gregg SJ, Chapple RH, Northcutt SL, Taylor JF: A genome scan for quantitative trait loci influencing carcass, post-natal growth and reproductive traits in commercial Angus cattle. Anim Genet. 2010, 41 (6): 597-607. 10.1111/j.1365-2052.2010.02063.x.
Nishimura S, Watanabe T, Mizoshita K, Tatsuda K, Fujita T, Watanabe N, Sugimoto Y, Takasuga A: Genome-wide association study identified three major QTL for carcass weight including the PLAG1-CHCHD7 QTN for stature in Japanese Black cattle. BMC Genet. 2012, 13: 40.
Setoguchi K, Furuta M, Hirano T, Nagao T, Watanabe T, Sugimoto Y, Takasuga A: Cross-breed comparisons identified a critical 591-kb region for bovine carcass weight QTL (CW-2) on chromosome 6 and the Ile-442-Met substitution in NCAPG as a positional candidate. BMC Genet. 2009, 10: 43.
Takasuga A, Watanabe T, Mizoguchi Y, Hirano T, Ihara N, Takano A, Yokouchi K, Fujikawa A, Chiba K, Kobayashi N, Tatsuda K, Oe T, Furukawa-Kuroiwa M, Nishimura-Abe A, Fujita T, Inoue K, Mizoshita K, Ogino A, Sugimoto Y: Identification of bovine QTL for growth and carcass traits in Japanese Black cattle by replication and identical-by-descent mapping. Mamm Genome. 2007, 18 (2): 125-136. 10.1007/s00335-006-0096-5.
Lindholm-Perry AK, Sexten AK, Kuehn LA, Smith TP, King DA, Shackelford SD, Wheeler TL, Ferrell CL, Jenkins TG, Snelling WM, Freetly HC: Association, effects and validation of polymorphisms within the NCAPG - LCORL locus located on BTA6 with feed intake, gain, meat and carcass traits in beef cattle. BMC Genet. 2011, 12: 103.
Grobet L, Martin LJ, Poncelet D, Pirottin D, Brouwers B, Riquet J, Schoeberlein A, Dunner S, Menissier F, Massabanda J, Fries R, Hanset R, Georges M: A deletion in the bovine myostatin gene causes the double-muscled phenotype in cattle. Nat Genet. 1997, 17 (1): 71-74. 10.1038/ng0997-71.
McPherron AC, Lee SJ: Double muscling in cattle due to mutations in the myostatin gene. Proc Natl Acad Sci U S A. 1997, 94 (23): 12457-12461. 10.1073/pnas.94.23.12457.
Irish Cattle Breeding Federation: Irish Cattle Breeding Federation. http://www.icbf.com
Matukumalli LK, Lawley CT, Schnabel RD, Taylor JF, Allan MF, Heaton MP, O'Connell J, Moore SS, Smith TP, Sonstegard TS, Van Tassell CP: Development and characterization of a high density SNP genotyping assay for cattle. PLoS One. 2009, 4 (4): e5350. 10.1371/journal.pone.0005350.
Pabiou T, Fikse WF, Cromie AR, Keane MG, Nasholm A, Berry DP: Use of digital images to predict carcass cut yields in cattle. Livest Sci. 2011, 137 (1–3): 130-140.
Harris B, Johnson D: Approximate reliability of genetic evaluations under an animal model. J Dairy Sci. 1998, 81 (10): 2723-2728. 10.3168/jds.S0022-0302(98)75829-1.
Berry D, Kearney F, Harris B: Genomic selection in Ireland. Proceedings of the Interbull International Workshop: January 26–29, Uppsala, Sweden. 2009, 29-34.
Gilmour AR, Cullis BR, Gogel BJ, Welham SJ, Thompson R: ASReml User Guide Release 3.0. 2009, UK: VSN International Ltd, Hemel Hempstead, HP1 1ES
Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 2003, 100 (16): 9440-9445. 10.1073/pnas.1530509100.
Meuwissen TH, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157 (4): 1819-1829.
R Development Core Team: R: A Language and Environment for Statistical Computing. 2011, Vienna, Austria: R Foundation for Statistical Computing
Young MD, Wakefield MJ, Smyth GK, Oshlack A: Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 2010, 11 (2): R14. 10.1186/gb-2010-11-2-r14.
van den Berg I, Fritz S, Boichard D: QTL fine mapping with Bayes C(pi): a simulation study. Genet Sel Evol. 2013, 45: 19. 10.1186/1297-9686-45-19.
Gianola D, de los Campos G, Hill WG, Manfredi E, Fernando R: Additive genetic variability and the Bayesian alphabet. Genetics. 2009, 183 (1): 347-363. 10.1534/genetics.109.103952.
Nylander JA, Wilgenbusch JC, Warren DL, Swofford DL: AWTY (are we there yet?): a system for graphical exploration of MCMC convergence in Bayesian phylogenetics. Bioinformatics. 2008, 24 (4): 581-583. 10.1093/bioinformatics/btm388.
Cowles MK, Carlin BP: Markov chain Monte Carlo convergence diagnostics: a comparative review. J Am Stat Assoc. 1996, 91 (434): 883-904. 10.1080/01621459.1996.10476956.
Oszkiewicz D, Muinonen K, Virtanen J, Granvik M, Bowell E: Modeling collision probability for Earth-impactor 2008 TC3. Planet Space Sci. 2012, 73 (1): 30-38. 10.1016/j.pss.2012.05.005.
Daetwyler HD, Pong-Wong R, Villanueva B, Woolliams JA: The impact of genetic architecture on genome-wide evaluation methods. Genetics. 2010, 185 (3): 1021-1031. 10.1534/genetics.110.116855.
Nylander JA, Ronquist F, Huelsenbeck JP, Nieves-Aldrey JL: Bayesian phylogenetic analysis of combined data. Syst Biol. 2004, 53 (1): 47-67. 10.1080/10635150490264699.
Tan B, Yin Y, Liu Z, Li X, Xu H, Kong X, Huang R, Tang W, Shinzato I, Smith SB, Wu G: Dietary L-arginine supplementation increases muscle gain and reduces body fat mass in growing-finishing pigs. Amino Acids. 2009, 37 (1): 169-175. 10.1007/s00726-008-0148-0.
Clempson AM, Pollott GE, Brickell JS, Bourne NE, Munce N, Wathes DC: Evidence that leptin genotype is associated with fertility, growth, and milk production in Holstein cows. J Dairy Sci. 2011, 94 (7): 3618-3628. 10.3168/jds.2010-3626.
Clement K, Vaisse C, Lahlou N, Cabrol S, Pelloux V, Cassuto D, Gourmelen M, Dina C, Chambaz J, Lacorte JM, Basdevant A, Bougneres P, Lebouc Y, Froguel P, Guy-Grand B: A mutation in the human leptin receptor gene causes obesity and pituitary dysfunction. Nature. 1998, 392 (6674): 398-401. 10.1038/32911.
Delavaud C, Ferlay A, Faulconnier Y, Bocquier F, Kann G, Chilliard Y: Plasma leptin concentration in adult cattle: effects of breed, adiposity, feeding level, and meal intake. J Anim Sci. 2002, 80 (5): 1317-1328.
Geary TW, McFadin EL, MacNeil MD, Grings EE, Short RE, Funston RN, Keisler DH: Leptin as a predictor of carcass composition in beef cattle. J Anim Sci. 2003, 81 (1): 1-8.
Berger J, Moller DE: The mechanisms of action of PPARs. Annu Rev Med. 2002, 53: 409-435. 10.1146/annurev.med.53.082901.104018.
Ehrenborg E, Krook A: Regulation of skeletal muscle physiology and metabolism by peroxisome proliferator-activated receptor delta. Pharmacol Rev. 2009, 61 (3): 373-393. 10.1124/pr.109.001560.
Canovas A, Quintanilla R, Amills M, Pena RN: Muscle transcriptomic profiles in pigs with divergent phenotypes for fatness traits. BMC Genomics. 2010, 11: 372. 10.1186/1471-2164-11-372.
Abbott BD: Review of the expression of peroxisome proliferator-activated receptors alpha (PPAR alpha), beta (PPAR beta), and gamma (PPAR gamma) in rodent and human development. Reprod Toxicol. 2009, 27 (3–4): 246-257.
Tien ES, Hannon DB, Thompson JT, Vanden Heuvel JP: Examination of Ligand-Dependent Coactivator Recruitment by Peroxisome Proliferator-Activated Receptor-alpha (PPARalpha). PPAR Res. 2006, 2006: 69612.
Tan NS, Michalik L, Desvergne B, Wahli W: Multiple expression control mechanisms of peroxisome proliferator-activated receptors and their target genes. J Steroid Biochem. 2005, 93 (2–5): 99-105.
Furuhashi M, Hotamisligil GS: Fatty acid-binding proteins: role in metabolic diseases and potential as drug targets. Nat Rev Drug Discov. 2008, 7 (6): 489-503. 10.1038/nrd2589.
Hertzel AV, Bernlohr DA: The mammalian fatty acid-binding protein multigene family: molecular and genetic insights into function. Trends Endocrinol Metab. 2000, 11 (5): 175-180. 10.1016/S1043-2760(00)00257-5.
Storch J, McDermott L: Structural and functional analysis of fatty acid-binding proteins. J Lipid Res. 2009, 50 (Suppl): S126-S131.
Pabiou T, Fikse WF, Amer PR, Cromie AR, Nasholm A, Berry DP: Genetic relationships between carcass cut weights predicted from video image analysis and other performance traits in cattle. Animal. 2012, 6 (9): 1389-1397. 10.1017/S1751731112000705.
Toker A: Phosphoinositides and signal transduction. Cell Mol Life Sci. 2002, 59 (5): 761-779. 10.1007/s00018-002-8465-z.
Falkenburger BH, Jensen JB, Dickson EJ, Suh BC, Hille B: Phosphoinositides: lipid regulators of membrane proteins. J Physiol Lond. 2010, 588 (17): 3179-3185. 10.1113/jphysiol.2010.192153.
Payrastre B, Missy K, Giuriato S, Bodin S, Plantavid M, Gratacap M: Phosphoinositides: key players in cell signalling, in time and space. Cell Signal. 2001, 13 (6): 377-387. 10.1016/S0898-6568(01)00158-9.
Sasaki T, Sasaki J, Sakai T, Takasuga S, Suzuki A: The physiology of phosphoinositides. Biol Pharm Bull. 2007, 30 (9): 1599-1604. 10.1248/bpb.30.1599.
Bridges D, Saltiel AR: Phosphoinositides in insulin action and diabetes. Curr Top Microbiol Immunol. 2012, 362: 61-85.
Cantrell DA: Phosphoinositide 3-kinase signalling pathways. J Cell Sci. 2001, 114 (8): 1439-1445.
Schuelke M, Wagner KR, Stolz LE, Hubner C, Riebel T, Komen W, Braun T, Tobin JF, Lee SJ: Myostatin mutation associated with gross muscle hypertrophy in a child. N Engl J Med. 2004, 350 (26): 2682-2688. 10.1056/NEJMoa040933.
McPherron AC, Lawler AM, Lee SJ: Regulation of skeletal muscle mass in mice by a new TGF-beta superfamily member. Nature. 1997, 387 (6628): 83-90. 10.1038/387083a0.
Yamaguchi Y, Hearing VJ: Physiological factors that regulate skin pigmentation. Biofactors. 2009, 35 (2): 193-199. 10.1002/biof.29.
Klungland H, Vage DI, Gomez-Raya L, Adalsteinsson S, Lien S: The role of melanocyte-stimulating hormone (MSH) receptor in bovine coat color determination. Mamm Genome. 1995, 6 (9): 636-639. 10.1007/BF00352371.
Kijas JM, Wales R, Tornsten A, Chardon P, Moller M, Andersson L: Melanocortin receptor 1 (MC1R) mutations and coat color in pigs. Genetics. 1998, 150 (3): 1177-1185.
Marklund L, Moller MJ, Sandberg K, Andersson L: A missense mutation in the gene for melanocyte-stimulating hormone receptor (MC1R) is associated with the chestnut coat color in horses. Mamm Genome. 1996, 7 (12): 895-899. 10.1007/s003359900264.
Pryce JE, Hayes BJ, Bolormaa S, Goddard ME: Polymorphic regions affecting human height also control stature in cattle. Genetics. 2011, 187 (3): 981-984. 10.1534/genetics.110.123943.
Lemay DG, Lynn DJ, Martin WF, Neville MC, Casey TM, Rincon G, Kriventseva EV, Barris WC, Hinrichs AS, Molenaar AJ, Pollard KS, Maqbool NJ, Singh K, Murney R, Zdobnov EM, Tellam RL, Medrano JF, German JB, Rijnkels M: The bovine lactation genome: insights into the evolution of mammalian milk. Genome Biol. 2009, 10 (4): R43. 10.1186/gb-2009-10-4-r43.
Muoio DM, Way JM, Tanner CJ, Winegar DA, Kliewer SA, Houmard JA, Kraus WE, Dohm GL: Peroxisome proliferator-activated receptor-alpha regulates fatty acid utilization in primary human skeletal muscle cells. Diabetes. 2002, 51 (4): 901-909. 10.2337/diabetes.51.4.901.
Massiera F, Saint-Marc P, Seydoux J, Murata T, Kobayashi T, Narumiya S, Guesnet P, Amri EZ, Negrel R, Ailhaud G: Arachidonic acid and prostacyclin signaling promote adipose tissue development: a human health concern?. J Lipid Res. 2003, 44 (2): 271-279. 10.1194/jlr.M200346-JLR200.
Savva SC, Chadjigeorgiou C, Hatzis C, Kyriakakis M, Tsimbinos G, Tornaritis M, Kafatos A: Association of adipose tissue arachidonic acid content with BMI and overweight status in children from Cyprus and Crete. Br J Nutr. 2004, 91 (4): 643-649. 10.1079/BJN20031084.
Rawlings JS, Rosler KM, Harrison DA: The JAK/STAT signaling pathway. J Cell Sci. 2004, 117 (Pt 8): 1281-1283.
Aaronson DS, Horvath CM: A road map for those who don't know JAK-STAT. Science. 2002, 296 (5573): 1653-1655. 10.1126/science.1071545.
Sun L, Ma K, Wang H, Xiao F, Gao Y, Zhang W, Wang K, Gao X, Ip N, Wu Z: JAK1-STAT1-STAT3, a key pathway promoting proliferation and preventing premature differentiation of myoblasts. J Cell Biol. 2007, 179 (1): 129-138. 10.1083/jcb.200703184.
Trenerry MK, Della Gatta PA, Cameron-Smith D: JAK/STAT signaling and human in vitro myogenesis. BMC Physiol. 2011, 11: 6. 10.1186/1472-6793-11-6.
Knurr T, Laara E, Sillanpaa MJ: Impact of prior specifications in a shrinkage-inducing Bayesian model for quantitative trait mapping and genomic prediction. Genet Sel Evol. 2013, 45 (1): 24. 10.1186/1297-9686-45-24.
Financial support from the Irish Cattle Breeding Federation, Teagasc and the Research Stimulus Fund (RSF-06-0353; 11/S/112) is gratefully acknowledged. AGD is funded under the Teagasc Walsh Fellowship Scheme (number: 2009183); CJC is funded under the Science Foundation Ireland (SFI) Stokes lecturer scheme (number: 07/SK/B1236A).
The authors declare that they have no competing interests.
DPB and CJC conceived and designed the project; AGD performed the research, analyzed data and wrote the paper with DPB and CJC. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Animal Identifiers for all animals used in this study. These identifiers can be used to request information from the Irish Cattle Breeding Federation database (http://www.icbf.com). (XLS 70 KB)
Additional file 2: SNP information for all SNPs included in the analysis. This file contains the chromosome number, position and dbSNP identifier for all SNPs included in either the single SNP regression or Bayesian analyses. The q-values for all SNPs included in the single SNP regression analysis for each trait are also included. Posterior probabilities, estimated using a 5 SNP sliding window, for the Bayesian analysis are also included in adjacent columns. (CSV 4 MB)
Additional file 6: Ensembl gene IDs for all bovine genes within 500 kb of a QTL associated with a trait using the Bayesian approach. This file also contains the human ortholog(s) for each bovine gene and the trait datasets to which it was assigned. (CSV 21 KB)
Additional file 7: The peroxisome proliferator-activated receptor (PPAR) signaling pathway. PPAR signaling was the most significantly over-represented KEGG pathway in the combined trait analysis. Genes in this pathway that were in regions surrounding QTL associated with three different traits using the Bayesian approach are colored in red. (PDF 558 KB)
Additional file 8: The phosphatidylinositol signaling system. This pathway was significantly over-represented in the carcass conformation and combined trait analyses. Genes from this pathway that were within 500 kb of significantly associated QTL using the Bayesian approach are highlighted in red. (PDF 534 KB)
Cite this article
Doran, A.G., Berry, D.P. & Creevey, C.J. Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle. BMC Genomics 15, 837 (2014). https://doi.org/10.1186/1471-2164-15-837