- Research article
- Open Access
Genome-wide association study of milk and reproductive traits in dual-purpose Xinjiang Brown cattle
BMC Genomics volume 20, Article number: 827 (2019)
Dual-purpose cattle are more adaptive to environmental challenges than single-purpose dairy or beef cattle. Balance among milk, reproductive, and mastitis resistance traits in breeding programs is therefore more critical for dual-purpose cattle to increase net income and maintain well-being. With dual-purpose Xinjiang Brown cattle adapted to the Xinjiang Region in northwestern China, we conducted genome-wide association studies (GWAS) to dissect the genetic architecture related to milk, reproductive, and mastitis resistance traits. Phenotypic data were collected for 2410 individuals measured during 1995–2017. By adding another 445 ancestors, a total of 2855 related individuals were used to derive estimated breeding values for all individuals, including the 2410 individuals with phenotypes. Among phenotyped individuals, we genotyped 403 cows with the Illumina 150 K Bovine BeadChip.
GWAS were conducted with the FarmCPU (Fixed and random model circulating probability unification) method. We identified 12 markers significantly associated with six of the 10 traits under the threshold of 5% after a Bonferroni multiple test correction. Seven of these SNPs were in QTL regions previously identified to be associated with related traits. One identified SNP, BovineHD1600006691, was significantly associated with both age at first service and age at first calving. This SNP directly overlapped a QTL previously reported to be associated with calving ease. Within 160 Kb upstream and downstream of each significant SNP identified, we speculated candidate genes based on functionality. Four of the SNPs were located within four candidate genes, including CDH2, which is linked to milk fat percentage, and GABRG2, which is associated with milk protein yield.
These findings are beneficial not only for breeding through marker-assisted selection, but also for genome editing underlying the related traits to enhance the overall performance of dual-purpose cattle.
The Xinjiang Brown was recognized as a new dual-purpose cattle breed in China in 1983 . Xinjiang Brown cattle have strong adaptability and resistance under extreme weather conditions. For example, these cattle can graze in temperatures below -40 °C and in snow up to 20 cm deep . Because of these superior characteristics, the breed has spread widely across the northern area of Xinjiang. By the end of 2017, the population had reached nearly 1.5 million, including hybrid progeny . Similar to breeders of other dual-purpose cattle breeds, Xinjiang Brown breeders took both dairy and beef traits into consideration to achieve comprehensive breeding objectives. Characteristics unique to dual-purpose cattle must be preserved, including the capacity to produce multiple products that can adapt to market demands. This product flexibility is particularly beneficial to small-scale herdsman who are more financially vulnerable to the whims of market changes and consumer preferences.
With the development of genotyping technologies and new genetic analysis methods, the genetic architecture of economically important traits have been explored across different cattle breeds and populations. Substantial genomic regions have been identified [3,4,5,6]. According to Release 36 in the Animal Quantitative Trait Loci (QTL) Database , 41,234 QTL are associated with 154 milk traits, 42,648 QTL with 71 reproductive traits, and 4081 QTL with 92 health traits. Potential candidate genes were also identified for these traits. For example, the DGAT1 gene associates with milk composition and yield traits [8, 9] and has been validated as a major gene in Holstein populations across multiple countries . FASN has a significant effect on milk fat component traits [11, 12]. BRCA1 has an effect on somatic cell score (SCC), which influences mastitis disease in dairy cows [13, 14]. For reproductive traits, the GH-L127 V mutation was reported to be associated with calving interval in a Jersey cattle population .
Although many genome-wide association studies (GWAS) and genomic functional validation studies on dairy and beef cattle traits have been performed, few studies have focused on dual-purpose breeds and populations. For Xinjiang Brown, only a few genetic polymorphisms have been reported for milk composition, somatic cell score, and early growth traits [16,17,18,19]. Studies on the dual-purpose cattle breed, German Fleckvieh, reported a QTL on the Bos taurus (bovine) autosome (BTA) 5 associated with milk production  and two loci on BTA 14 and 21 associated with calving ease and growth-related traits . Another study reported several SNPs associated with milk and functional traits in a population of the dual-purpose breed, Italian Simmental . A few selection signature studies revealed several genetic variations in both dairy and beef cattle (Gir) populations [23, 24], and a few genetic polymorphism studies discussed the genetic architecture of milk production traits in the Italian Simmental breed [25, 26]. Despite the valuable information provided by these previous genomic studies, GWAS using high-density SNPs are still limited in dual-purpose breeds. Because the genetic linkage phase could be different across breeds and populations, using the previously identified markers to conduct marker-assisted selection is problematic, especially when marker density was low during the discoveries. Therefore, GWAS with high-density SNPs are needed to understand the genetic architecture of important, complex traits in dual-purpose cattle breeds.
In this study, we evaluated five milk production traits: milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), and protein percentage (PP); four reproductive traits: age at first service (AFS), age at first calving (AFC), gestation length (GL), and calving interval (CI); and one health trait: somatic cell score (SCS) in the Chinese dual-purpose cattle breed, Xinjiang Brown. We used milk production, reproductive, and health data records, collected during 1995–2017 on 2410 individuals, from four different breeding herds raised in the Xinjiang region of northwestern China. We used another 445 ancestors to obtain a total of 2855 individuals connected by pedigree to estimate variance components and breeding values. Ultimately, a total of 403 cattle were selected for genotyping with the 150 K Bovine BeadChip, which resulted in a total of 139,376 markers. Our objective was to identify SNPs associated with milk, reproductive, and health traits in the Xinjiang Brown for the benefit of marker-assisted selection and dissection of genetic architecture of these complex traits.
A total of 2410 individuals with 6811 reproductive records and 5441 milk records were used in this study. The descriptive statistics results of milk, health, and reproductive traits in Xinjiang Brown Cattle are shown in the Table 1. Based on the milk records, the mean 305-day milk yield (MY) was 4216.49 kg. This mean MY value is within the normal range compared with Chinese dual-purpose Sanhe cattle, Simental cattle, and Chinese Range Red cattle , but less than European dual-purpose Fleckvieh and Braunvieh breeds . In our Xinjiang Brown population, mean milk fat percentage (FP) was 3.93%, similar to Fleckvieh and Braunvieh; mean milk protein percentage (PP) was 3.37%, higher than these two breeds . The population’s mean milk fat yield (FY) and protein yield (PY) were 168.53 kg and 143.79 kg, respectively, which are both less than Fleckvieh and Braunvieh .
Somatic cell score (SCS) was used as an indicator trait for udder health; the smaller the SCS, the lower the risk for mastitis . SCS is not only important in dairy cattle, but is also crucial in dual-purpose breeds. In the study population, mean SCS was moderate, 4.98, with a heritability of 0.08.
Most reproductive traits are difficult to measure and vary across environmental conditions . We selected age at first service (AFS), age at first calving (AFC), gestation length (GL), and calving interval (CI) because they are relatively easy to record and analyze. The averages were 571.89 days, 877.65 days, 437.51 days, and 284.56 days for AFS, AFC, GL, and CI, respectively. Heritabilities were low for all four traits, ranging from 0.01 to 0.08, which is consistent with findings from other studies on dairy and beef cattle [31, 32]. Together, these traits can reflect a cow’s production efficiency and body condition and are also important breeding objectives for the Xinjiang Brown.
Phenotypic, genetic and residual correlation
The correlations and distributions of phenotypes, estimated breeding values (EBV), and residuals for each of the 10 study traits are shown in Additional file 1: Figure S1. The EBVs of all traits followed a normal distribution. We found strong correlations among MY, FY, and PY phenotypes, with correlation coefficients ranging from 0.78 to 0.92. The genetic correlation coefficients among EBVs were medium to high, ranging from 0.54 to 0.70. The correlation between MY and both FP and PP were negative and weak (genetic and phenotypic), which have also been reported in other studies . Among the reproductive traits, the strongest phenotypic and genetic correlations were found between AFS and AFC, with correlation coefficients of 0.94 and 0.92, respectively. The smaller the AFS, the smaller the AFC. We were particularly interested in traits with high genetic correlations and focused on whether they shared common markers.
The PCA scatterplots illustrate a clear population structure for the 396 individuals in the four cattle herds that comprised our study population (Fig. 1). In the scatterplot of PC1 and PC2, the majority of cattle in herd 3 are completely separated from the majority of individuals in herd 4 (Fig. 1a). Similarly, most individuals from herd 1 and herd 2 split into another two distinct groups. Furthermore, several clusters of individuals, either from the same or from different herds, were observed in the scatterplot of PC1 and PC3 (Fig. 1b). Clusters of the same color represent closely related individuals from same herd. In contrast, we identified three distinct clusters of herd 2 (green) and herd 4 (red) individuals and two clusters of herd 2 (green) and herd 1 (black) individuals. These mixed clusters indicate that, although individuals may come from different herds, they still retain close genetic relationships. We further explored the relationships between the first three principal components (PCs) and the phenotypes of the 10 study traits with additional scatterplots (Additional file 4: Figure S4), but found no strong correlations.
Genome-wide association studies
The FarmCPU method was used to perform the genome-wide association analysis. Because population structure can cause false positive results in GWAS, the first three PCs were added into our GWAS model. Ultimately, 12 SNPs passed the 5% threshold after a Bonferroni correction and were associated with six of the 10 study traits (Fig. 2). For milk traits, two significant SNPs were detected on Bos taurus autosome (BTA) 24 (BovineHD2400007916) and BTA 7 (BTB-01731924) and were associated with FP and PY, respectively. For the health trait, mastitis resistance, three significant SNPs were found on BTA 8 (BovineHD0800007286), BTA 22 (BovineHD2200012261), and BTA 5 (BovineHD0500013296) and were associated with SCS. For reproductive traits, three SNPs located on BTA 14 (BovineHD1400016327), BTA 3 (BovineHD0300035237) and BTA 16 (BovineHD1600006691) were significantly associated with AFS; two SNPs located on BTA 14 (BovineHD1400021729) and BTA 17 (ARS-USMARC_528) were significantly associated with GL; and two SNPs located on BTA 19 (Bovine HD1900002007) and BTA 25 (BovineHD2500003462) were significantly associated with CI. We found no significant markers associated with MY, FY, PP, or AFC (Additional file 5: Figure S5).
To check for overlaps among the SNPs significantly associated with milk, reproductive, or health traits, we created a heat map using different bin sizes and several significant p thresholds (Additional file 3: Figure S3). The visual effect of Additional file 3: Figure S3 is a combination of both the strength of signals and the bandwidth. For a small bin, the band is visible only when the signal is strong. For the same level of signals, a band becomes visible when it is wide enough. We found one overlapping SNP (BovineHD1600006691) at 24.2 Mb on BTA 16 that associated with both AFS and AFC. This SNP has also been reported in a QTL region associated with calving ease . Additionally, most of the SNPs we identified have been previously located in QTL regions that are associated with traits related to our study traits. We mapped 12 candidate genes on 11 autosomes, based on the physical position of the significant SNPs (Fig. 2, Table 2). Four SNPs are within genes, including CDH2, which is linked to FP, and GABRG2, which is associated with PY. The other SNPs are within 156 kb or less of a gene.
Population stratification is an important issue in population-based association studies [35, 36]. Because allele frequency may differ in sample individuals due to systematic ancestry differences , hidden population structure may cause spurious results and reduce the statistical power in GWAS. Consequently, stratification in the experimental population must be corrected [38,39,40]. In this study, our Xinjiang Brown experimental cattle were selected from four different commercial herds. Each year, foreign blood was introduced into each herd to improve population productivity, and sometimes cattle were transferred among herds. Thus, we hypothesized that some hidden structure should be inherent in our experimental population. Population structure is one of the major cause spurious association and must be accounted through stratified analyses such as genomic control, structured associations, and PCA . We used PCA to detect the stratification and found a clear subpopulation structure (Fig. 1). For example, herd 3 and herd 4 exhibited an obvious clustering pattern and were completely separated by the first PC. Herd 2 and herd 4 exhibited an overlapping pattern, indicating that individuals from these two herds have a closer genetic relationship than individuals from other herds.
Cryptic relationships among individuals is another major source of spurious associations. Several methods have been developed to correct both population stratification and cryptic relationships to screen markers across genomes. Ideally, a one-step approach would perform the best by optimization over population structure, cryptic relationships, and genetic markers simultaneously; however, the associated computational burden prevents full optimization for practical uses. Furthermore, robust approximation was achieved with a dramatic reduction in computing time. For example, the EMMAx and P3D algorithms deliver almost identical results for full optimization of genetic and residual variance estimates for every testing marker, using the fixed and random effects mixed linear model (MLM).
The computing time of the MLM was further improved by splitting the model into a fixed effect model and a random effect model. The fixed effect model is used for testing markers, one at a time. The random effect model is used to select markers that are used as covariates in the fixed effect model. The fixed effect model and the random effect model are used iteratively until no change occurs in the covariates. Compared to the kinship based on all the available markers, the kinship based on the selected markers has the best likelihood for the specific trait of interest. This method was named the Fixed and random model Circulating Probability Unification (FarmCPU). Both simulation and analyses on real traits demonstrated that FarmCPU has higher statistical power than the regular mixed method using all available markers to build kinship.
Given this population stratification, we used two models to perform GWAS using FarmCPU, with and without the first three PCs as covariates. Without including the PCs, we found 20 significant markers associated with eight of the 10 traits (Additional file 6: Figure S6). After including the PCs, 18 of these 20 significant markers disappeared and 10 new SNPs surfaced. We calculated the inflation factor to check whether significant population structure remained (Additional file 7: Table S1). The result showed minimal inflation using FarmCPU. Both quantile-quantile plots (Q-Q plot) and the inflation factor exhibited the same trend. In fact, FarmCPU is conservative, which even led to minor deflation. Because the previous study  suggested including PCs to ensure population structure is incorporated when performing FarmCPU, we used the model with PCs fitted as covariates. In total, the combined SNP-PCA model identified 12 significant markers associated with six of the 10 traits (Fig. 2).
Comparison of GWAS results
We found 12 significant markers associated with six important, complex traits in Xinjiang Brown cattle, based on a high-density SNP chip. Among them, two SNPs overlapped in both the SNP model and the combined SNP-PCA model. One SNP is seated on BTA 8 and significantly associated with SCS; the other SNP is on BTA 16 and significantly associated with AFS. Four SNPs were significantly associated with MY, FY, PP, and AFC when we used the SNP model, but these SNPs failed to pass the 5% threshold after a Bonferroni correction in the combined SNP-PCA model. Still, SNPs associated with FY (Bovine HD1600007977), PP (Bovine HD2300015096), and AFC (Bovine HD1600006691) are the most significant SNPs in both models. Our study is the first GWAS on milk, reproductive, and mastitis resistance traits in the Xinjiang Brown dual-purpose cattle breed. Only a limited number of studies have reported on similar traits in other dual-purpose breeds [20,21,22,23,24,25,26]; therefore, we compared our results with studies of single-purpose dairy and beef cattle breeds.
Milk composition traits are important breeding traits in both dairy and dual-purpose cattle breeds, especially in modern animal husbandry environments. We found two highly significant SNPs associated with milk composition traits. One SNP is associated with FP and is positioned within the cadherin-2 (CDH2) gene at 29.1 Mbp on BTA 24. CDH2 is a protein encoding gene and participates in adipogenesis . Knocking down CDH2 to block the epithelial-mesenchymal transition-like response could weaken adipocyte lineage commitment . Several previous studies have reported QTL near this SNP. For example, one study found a QTL region spanning 18.1–21.8 Mbp on BTA 24 that was associated with FP in a Danish Holstein population . Another study mapped a QTL at 33.4 Mbp on BTA 24 that was associated with FP in another Holstein cattle population . Furthermore, the cattle QTL database  reports an additional 14 QTL on either side of the FP-associated SNP we identified. These 14 QTL are associated with health, production, reproductive, and meat and carcass traits. One of the QTL that spans 21.8–31.0 Mbp on BTA 24 is significantly associated with SCS in Danish Holstein .
The other milk-related SNP we identified was significantly associated with PY and mapped at 75.8 Mbp on BTA 7, which is within a gene named Gamma-aminobutyric Acid Type A Receptor Gamma2 Subunit (GABRG2). GABRG2 primarily contributes to gamma-aminobutyric acid (GABA)-gated chloride ion channel activity and participates in GABA-A receptor activity  and has been studied mostly in association with human idiopathic epilepsy [49, 50]. Among cattle genomic studies, a potential supporting study reported a nearby QTL region spanning 71.9–73.8 Mbp on BTA7 that was associated with PY in a US Holstein population . Additionally, we found six other QTL in the cattle QTL database  that contained the PY-associated SNP we identified. Three of these QTLs are associated with milk FY in Holstein and Jersey cow populations . One QTL is significantly associated with meat fat content in Nellore beef cattle . Another QTL is linked to cold tolerance in a crossed beef cattle population . And, the sixth one is linked to meat tenderness traits in five taurine cattle breeds .
SCS is highly correlated with mastitis in cattle populations [56, 57] and is usually selected as an indicator trait to reflect udder health status and mastitis resistance . In this study, we mapped three highly significant, SCS-associated SNPs on BTA 5 (46.3 Mbp), BTA 22 (42.3 Mbp), and BTA 8 (24.2 Mbp). Three candidate genes were found nearby these three SNPs. One of the genes, named Dual Specificity Tyrosine Phosphorylation Regulated Kinase 2 (DYRK2), was reported to be related to udder support score trait in crossbred Bos indicus-Bos taurus cows . Many QTL been reported for SCS. For example, a peak QTL region was found at 28.2–44.5 Mbp on BTA 5 in one Holstein population . And, in another Holstein population, several QTL were found on BTA 22 within 1 Mbp of our identified SNP . Two separate studies, performed in different years, reported the same QTL at 24.8 Mbp on BTA 8 that was related to SCS in Norwegian Red  and Red Pied dairy cattle . The position of this QTL is close to the SNP we found on the same chromosome. We also found other studies that identified QTL regions associated with traits related to SCS and also contained the SCS-associated SNPs we identified in this study.
Before reproductive traits became important breeding objectives, most breeders focused on production traits . However, to maintain balanced breeding, fertility traits have gained more and more attention in breeding schemes. Understanding the genetic architecture of low heritability traits, such as fertility traits, helps improve selection; thus, many GWAS on fertility traits have been performed [63,64,65,66,67]. In our GWAS, we found three highly significant SNPs associated with AFS. The first SNP is mapped at 120.4 Mbp on BTA 3; the nearby gene is Kinesin Family Member 1A (KFM1A). The second SNP is seated at 58.7 Mbp on BTA 14; the closest gene is a pseudo gene LOC511981. The third SNP is located at 24.2 Mbp on BTA 16 and within the Glutamyl-prolyl-tRNA Synthetase (EPRS) gene. Several QTL on BTA 16 contain the AFS-associated SNP we found. One of these QTL was previously reported to be related to calving ease in US Holstein cattle ; the other QTLs were related to weaning weight in Blonde d’Aquitaine beef cattle , birth weight in Angus beef cattle , and hip height in Qinchuan and Jiaxian Red beef cattle . Both calving ease and body size traits are highly correlated with AFS.
For GL, we found two significant SNPs, one mapped at 77.5 Mbp on BTA 14 and the other mapped at 34.8 Mbp within the Sprouty RTK Signaling Antagonist 1(SPRY1) gene on BTA 17. The two SNPs we found significantly associated with CI were located at 7.6 Mbp on BTA 19 and at 12.4 Mbp on BTA 25. The nearest genes to these SNPs are Ankyrin-repeat and Fibronectin Type III Domain Containing 1 (ANKFN1) on BTA 19 and Shisa Family Member 9 (SHISA9) on BTA 25. A previously reported QTL region at 6.3–13.8 Mbp on BTA 25 was found to affect dystocia in a dairy population . Another study reported a QTL at 6.3–17.7 Mbp on BTA 25 linked to no-return rate in Danish and Sweden Holstein cattle . Both dystocia and no-return rate are fertility traits and, thus, related to the reproductive traits we studied.
This study used a high-density SNP chip to perform GWAS with milk, reproductive, and mastitis traits in the Chinese dual-purpose cattle breed, Xinjiang Brown. We found 12 significant SNPs associated with six of the 10 traits studied. Seven of these SNPs overlap with QTL regions previously reported in studies of other cattle populations. The candidate gene, CDH2, participates in adipogenesis and may affect milk fat production. These results enhance our understanding of important, complex traits in the dual-purpose Xinjiang Brown cattle breed and contribute to further studies on validation of gene function and genomic selection.
Animals and phenotyping
Phenotypic data used in this study were collected during 1995–2017 from 2410 Xinjiang Brown cow individuals from four different breeding herds, they are Tacheng Area Xinjiang Brown Cattle Breeding Farm, Yili Xinhe Xinjiang Brown Cattle Breeding Farm, Urumqi Xinjiang Brown Cattle Breeding Farm, and the Xinjiang Tianshan Animal Husbandry and Bio-engineering Co., Ltd., located in Tacheng city, Yining city, Urumqi city and Changji city, respectively. Blood sample were collected from the coccygeal vine of the tail-head of cows by the Vacuum Blood Collector, cleaned the area before sampling and pressed the sample wound for a while to let it recover after extraction. The tail-head blood collection method we took is very quick, lower stress and almost painless for the cattle. We used an additional 445 ancestors, for a total of 2855 individuals connected by pedigree, to estimate the breeding values of five milk traits, four reproductive traits, and one health trait (Additional file 1: Figure S1, Additional file 2: Figure S2). Milk traits included milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), and protein percentage (PP). Reproductive traits were age at first service (AFS), age at first calving (AFC), gestation length (GL) and calving interval (CI). And, the health trait was somatic cell score (SCS).
Genotyping and quality control
In total, 403 female cattle were selected for genotyping by using the Illumina 150 K Bovine BeadChip. Quality control was conducted by using Plink software  with criteria as follows: (1) individual call rate > 95%; (2) genotype call rate > 90%; (3) Hardy-Weinberg equilibrium p-value >1e-6; and (4) minor allele frequency (MAF) > 0.05. After quality control, 396 cows and 139,376 markers remained. The genotypes in A/T/G/C format were converted to numeric genotypes by iPat software . The distribution of SNPs on each chromosome was relatively uniform, although a few chromosomes contained relatively large blank areas, especially chromosome X (Fig. 3). Chromosome 1 contained the greatest number of SNPs, whereas chromosome 25 contained the fewest. The distribution of MAF revealed that the MAF frequency increased with MAF, suggesting that SNPs on the Illumina 150 K Bovine BeadChip were selected for common SNPs. Xinjiang Brown had the same property as other breeds used for developing the chip. LD decay fell off quickly within 10 kb physical distance and then decreased slowly afterwards.
Principal component analysis
The experimental Xinjiang Brown population came from four breeding herds. We used the Prcomp function in R to perform a principal component analysis (PCA). The PCA showed a clear population structure (Fig. 1). PC 1 showed the separation between the individuals of herd 3 (blue) and 4 (red). Some individuals from herd 4 and herd 2 (green) exhibited close relationships. Most individuals from herd 1 (black) clustered far away from the other herds.
Estimated breeding values
Genetic analysis was carried out using DMU  software with the animal model as follows:
where yijklm is the phenotype in the jth year, kth season, and lth parity of the mth individual from ith herd; u is overall mean of population, Herdi is the herd effect according to a cow’s origin from one of the four herds; Yearj is the jth year effect, Seasonk is the kth season effect, and Parity is the effect of lth parity; a is the additive effect of mth individual and e is the residual in the jth year, kth season, and lth parity of the mth individual from ith herd. All effects were treated as random except the overall mean.
Genome-wide association studies
The fixed and random model circulating probability unification (FarmCPU) method was used to carry out the genome-wide association analysis in this study . The method uses a fixed effect model and a random effect model iteratively. The fixed effect model tests SNPs one at a time. The significant SNPs are evaluated in the random effect model and the validated SNPs are fitted as covariates in the fixed effect model to control population structure. These SNPs are selected based on the likelihood of using them to build the cryptic relationships among individuals. The iteration stops when no validated SNPs can be added as covariates. Both real data and simulated data has demonstrated that FarmCPU has higher statistical power than other methods, including the random effect model with kinship derived from all the markers, to conduct association tests .
Availability of data and materials
The data and material used in this research are available from the corresponding author on request.
Bos Taurus autosome
Estimated breeding value
Fixed and random model circulating probability unification
Genome-wide association study
Principal component analysis
Quantitative trait loci
Single nucleotide polymorphism
Chen TT. Study on carcass classification and beef cut quality of Xinjiang Brown cattle (Master's thesis). Qingdao Agriculture University. Qingdao. 2014.
Zhou JH, Li P, Liu LY, Zhao GL, Huang XX, Nai BH, et al. The current status of Xinjiang Brown cattle Germplasm resources and suggestions for population genetic improvement. Chinese J Anim Sci. 2017;8:38–43.
Weller JI, Ron M. Invited review: quantitative trait nucleotide determination in the era of genomic selection. J Dairy Sci. 2011;94:1082–90.
Gutiérrez-Gil B, Arranz J, Wiener P. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 2015;6:167.
Sharma A, Lee JS, Dang CG, Sudrajad P, Kim HC, Yeon SH, et al. Stories and challenges of genome wide association studies in livestock—a review. Asian-Australas J Anim Sci. 2015;28:1371–9.
Ma L, Cole JB, VanRaden PM. Symposium review: genetics, genome-wide association study, and genetic improvement of dairy fertility traits. J Dairy Sci. 2018;102:3735–43.
The Animal Quantitative Trait Loci (QTL) Database. https://www.animalgenome.org/cgi-bin/QTLdb/BT/index. Accessed 10 Dec 2018.
Bernard G, Coppieters W, Farnir F, Karim L, Ford C, Berzi P, et al. Positional candidate cloning of a QTL in dairy cattle: identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition. Genome Res. 2002;12:222–31.
Thaller G, Kuhn C, Winter A, Ewald G, Bellmann O, Wegner J, et al. DGAT1, a new positional and functional candidate gene for intramuscular fat deposition in cattle. Anim Genet. 2003;34:354–7.
Bennewitz J, Reinsch N, Pual S, Looft C, Kaupe B, Weimann C, et al. The DGAT1 K232A mutation is not solely responsible for the milk production quantitative trait locus on the bovine chromosome 14. J Dairy Sci. 2004;87:431–42.
Roy R, Ordovas L, Zaragoza P, Romero A, Moreno C, Altarriba J, et al. Association of polymorphisms in the bovine FASN gene with milk-fat content. Anim Genet. 2006;37:215–8.
Morris CA, Cullen NG, Glass BC, Hyndman DL, Manley TR, Hickey SM, et al. Fatty acid synthase effects on bovine adipose fat and milk fat. Mamm Genome. 2007;18:64–74.
Yuan Z, Chu G, Dan Y, Li J, Zhang L, Gao X, et al. BRCA1: a new candidate gene for bovine mastitis and its association analysis between single nucleotide polymorphisms and milk somatic cell score. Mol Biol Rep. 2012;39:6625–31.
Yuan Z, Li J, Li J, Zhang L, Gao X, Gao HJ, et al. Investigation on BRCA1 SNPs and its effects on mastitis in Chinese commercial cattle. Gene. 2012;505:190–4.
Komisarek J, Michalak A, Walendowska A. The effects of polymorphisms in DGAT 1, GH and GHR genes on reproduction and production traits in Jersey cows. Anim Sci Pap Rep. 2011;29:29–36.
Jie B, Lin J, Li W, Liu M. Association of toll-like receptor 2 polymorphisms with somatic cell score in Xinjiang Brown cattle. Anim Sci J. 2012;83:23–30.
Niu ZG, Cai HS, Jun ML, Yong ZZ, Yang Z. Correlation between GH gene polymorphism of the 5~(th) exon AluI site and early growth traits in Xinjiang Brown cattle. J South Agric. 2012;43:688–91.
Yuan LL, Hang JZ, Hua MZ, Xia JL, Qing JF, Xin ST, et al. Genetic effect analysis of SNPs from 6 genes on SCS and milk production traits in Xinjiang Brown cattle. China Agr Sci. 2017;50:2592–603.
Sheng CJ, Yang Z, Wei YS, Fang CZ, Hong CL, Sheng TY, et al. Study on the polymorphism of Leptin exon 2’s E2JW and E2FB locus in Xinjiang Brown cattle. J Xinjiang Agric Univ. 2009;5:6–9.
Awad A, Russ I, Emmerling R, Forster M, Medugorac I. Confirmation and refinement of a QTL on BTA5 affecting milk production traits in the Fleckvieh dual purpose cattle breed. Anim Genet. 2010;41:1–11.
Pausch H, Flisikowski K, Jung S, Emmerling R, Edel C, Götz KU, et al. Genome-wide association study identifies two major loci affecting calving ease and growth-related traits in cattle. Genetics. 2011;187:289–97.
Chessa S, Nicolazzi LE, Nicoloso L, Negrini R, Marino R, Vicario D, et al. Analysis of candidate SNPs affecting milk and functional traits in the dual-purpose Italian Simmental cattle. Livest Sci. 2015;173:1–8.
Maiorano AM, Lourenco DL, Tsuruta S, Ospina AMT, Stafuzza NB, Masuda Y, et al. Assessing genetic architecture and signatures of selection of dual purpose Gir cattle populations using genomic information. PLoS One. 2018;13:e0200694.
Liao X, Peng F, Forni S, Mclaren D, Plastow G, Stothard P. Whole genome sequencing of Gir cattle for identifying polymorphisms and loci under selection. Genome. 2013;56:592–8.
Scotti E, Fontanesi L, Schiavini F, Mattina LV, Bagnato A, Russo V. DGAT1 p. K232A polymorphism in dairy and dual purpose Italian cattle breeds. Italian J Anim Sci. 2010;9:e16.
Fontanesi L, Scotti E, Schiavini F, Mattina LV, Davoli R, Bagnato A, et al. Investigation of the DGAT1 K232A and VNTR mutations in dairy and dual purpose cattle breeds. Italian J Anim Sci. 2007;6:137.
Meng QM, Hua CQ, Chun YW, Gang YS, Li SZ, Yuan Z. Chinese and international situation, progresses and perspectives of breeding strategies in dual purpose cattle. China Dairy Cattle. 2013;13:18–21.
Zółkiewski P, Stanek P, Janus E. Productivity of Simmental and Monbeliarde cows culled ub 2005-2006 taking into account the reasons for their culling. Acta Sci Pol Zootech. 2018;17:15–22.
Jørgen Ø, Klemetsdal G, Heringstad B. Genetic improvement of mastitis resistance: validation of somatic cell score and clinical mastitis as selection criteria. J Dairy Sci. 2003;86:4129–36.
Ma L, Cole JB, Da Y, VanRaden PM. Symposium review: genetics, genome-wide association study, and genetic improvement of dairy fertility traits. J Dairy Sci. 2019;104:3735–43.
Ben ZH, Gara AB, Hammami H, Jemmali B, Ferchichi MA, Rekik B. Genetic parameters of reproductive traits in Tunisian Holsteins. Arch Anim Breed. 2016;59:209–13.
Berry DP, Wall E, Pryce JE. Genetics and genomics of reproductive performance in dairy and beef cattle. Animal. 2014;8:105–21.
Cassandro M, Comin A, Ojala M, Dal ZR, De MM, Gallo L, et al. Genetic parameters of milk coagulation properties and their relationships with milk yield and quality traits in Italian Holstein cows. J Dairy Sci. 2008;91:371–6.
Schnabel RD, Songstegard TS, Taylor JF, Ashwell MS. Whole-genome scan to detect QTL for milk production, conformation, fertility and functional traits in two US Holstein families. Anim Genet. 2005;36:408–16.
Wu C, DeWan A, Hoh J, Wang Z. A comparison of association methods correcting for population stratification in case–control studies. Ann Hum Genet. 2011;75:418–27.
Li Q, Yu K. Improved correction for population stratification in genome-wide association studies by identifying hidden population structures. Genet Epidemiol. 2008;32:215–26.
Uemoto Y, Abe T, Tameoka N, Hasebe H, Inoue K, Nakajima H, et al. Whole-genome association study for fatty acid composition of oleic acid in Japanese black cattle. Anim Genet. 2011;42:141–8.
Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–9.
Price AL, Zaitlen NA, Reich D, Patterson N. New approaches to population stratification in genome-wide association studies. Nat Rev Genet. 2010;11:459–63.
Freedman ML, Reich D, Penney KL, McDonald GJ, Mignault AA, Patterson N, et al. Assessing the impact of population stratification on genetic association studies. Nat Genet. 2004;36:388–93.
Sillanpää MJ. Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses. Heredity. 2011;106:511–9.
Liu X, Huang M, Fan B, Buckler ES, Zhang Z. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 2016;12:e1005767.
Alimperti S, Andreadis ST. CDH2 and CDH11 act as regulators of stem cell fate decisions. Stem Cell Res. 2015;14:270–82.
Huang HY, Chen SZ, Zhang WT, Wang SS, Liu Y, Li X, et al. Induction of EMT-like response by BMP4 via up-regulation of lysyl oxidase is required for adipocyte lineage commitment. Stem Cell Res. 2013;10:278–87.
Höglund JK, Buitenhuis AJ, Guldbrabdtsen B, Su G, Thomsen B, Lund MS. Overlapping chromosomal regions for fertility traits and production traits in the Danish Holstein population. J Dairy Sci. 2009;92:5712–9.
Pimentel EC, Bauersachs S, Tietze M, Simimaner H, Tetens J, Thaller G, et al. Exploration of relationships between production and fertility traits in dairy cattle via association studies of SNPs within candidate genes derived by expression profiling. Anim Genet. 2011;4:251–62.
Lund MS, Guldbandtsen B, Buitenhuis AJ, Thomsen B, Bendixen C. Detection of quantitative trait loci in Danish Holstein cattle affecting clinical mastitis, somatic cell score, udder conformation traits, and assessment of associated effects on milk yield. J Dairy Sci. 2008;91:4028–36.
National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/gene/282240. Accessed 20 Jan 2019.
Kananura C, Haug K, Sander T, Runge U, Gu W, Hallmann K, et al. A splice-site mutation in GABRG2 associated with childhood absence epilepsy and febrile convulsions. Arch Neurol. 2002;59:1137–41.
Nakayama J, Hamano K, Noguchi E, Horiuchi Y, Iwasaki N, Ohta M, et al. Failure to find causal mutations in the GABAA-receptor γ2 subunit (GABRG2) gene in Japanese febrile seizure patients. Neurosci Lett. 2003;343:117–20.
Cole JB, Wiggans GR, Li M, Tad S, Thomos JL, Brian AC, et al. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary US Holstein cows. BMC Genomics. 2011;12:408.
Pryce JE, Bolormaa S, Chamberlain AJ, Bowman PJ, Savin K, Goddard ME, et al. A validated genome-wide association study in 2 dairy cattle breeds for milk production and fertility traits using variable length haplotypes. J Dairy Sci. 2010;93:3331–45.
Cesar AS, Regitano LC, , Mourão GB, Tullio RR, Lanna DP, Nassu RT, et al. Genome-wide association study for intramuscular fat deposition and composition in Nellore cattle. BMC Genet 2014;15:39.
Howard JT, Kachman SD, Snelling WM, Pollak EJ, Ciobanu DC, Kuehn, et al. Beef cattle body temperature during climatic stress: a genome-wide association study. Int J Biometeorol. 2014;58:1665–72.
McClure MC, Ramey HR, Rolf MM, McKay SD, Decker JE, Chapple RH, et al. Genome-wide association analysis for quantitative trait loci influencing Warner–B ratzler shear force in five taurine cattle breeds. Anim Genet. 2012;43:662–73.
Carlén E, Strandberg E, Roth A. Genetic parameters for clinical mastitis, somatic cell score, and production in the first three lactations of Swedish Holstein cows. J Dairy Sci. 2004;87:3062–70.
Dal ZR, De Marchi M, Dalvit C, Cassandro M, Gallo L, Carnier P, et al. Heritabilities and genetic correlations of body condition score and calving interval with yield, somatic cell score, and linear type traits in Brown Swiss cattle. J Dairy Sci. 2007;90:5737–43.
Rodriguez SL, Daniel G, George ES. Evaluation of models for somatic cell score lactation patterns in Holsteins. Livest Prod Sci. 2000;67:19–30.
Tolleson MW, Gill CA, Herring AD, Riggs PK, Sawyer JE, Sanders JO, et al. Association of udder traits with single nucleotide polymorphisms in crossbred Bos indicus–Bos taurus cows. J Anim Sci. 2017;95:2399–407.
Ashwell MS, Heyen DW, Sonstegard TS, Van Tassell CP, Da Y, VanRaden PM, et al. Detection of quantitative trait loci affecting milk production, health, and reproductive traits in Holstein cattle. J Dairy Sci. 2004;87:468–75.
Sodeland M, Kent MP, Olsen HG, Opsal MA, Svendsen M, Sehested E, et al. Quantitative trait loci for clinical mastitis on chromosomes 2, 6, 14 and 20 in Norwegian red cattle. Anim Genet. 2011;42:457–65.
Strillacci MG, Frigo E, Schiavini F, Samoré AB, Canavesi F, Vevey M, et al. Genome-wide association study for somatic cell score in Valdostana red pied cattle breed using pooled DNA. BMC Genet. 2014;15:106.
McClure MC, Morsci NS, Schnabel RD, Kim JW, Yao P, Rolf MM, et al. A genome scan for quantitative trait loci influencing carcass, post-natal growth and reproductive traits in commercial Angus cattle. Anim Genet. 2010;41:597–607.
Thomasen JR, Guldbrandtsen B, Sørensen P, Thomsen B, Lund MS. Quantitative trait loci affecting calving traits in Danish Holstein cattle. J Dairy Sci. 2008;91:2098–105.
Seidenspinner T, Bennewitz J, Reinhardt F, Thaller G. Need for sharp phenotypes in QTL detection for calving traits in dairy cattle. J Anim Breed Genet. 2009;126:455–62.
Höglund JK, Guldbrandtsen B, Su G, Thomsen B, Lund MS. Genome scan detects quantitative trait loci affecting female fertility traits in Danish and Swedish Holstein cattle. J Dairy Sci. 2009;92:2136–43.
Fang LZ, Jiang JC, Li BJ, Zhou Y, Freebern E, Vanraden PM, Cole JB, Liu GE, Li M. Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle. Commun Biol. 2019;2:100.
Michenet A, Barbat M, Saintilan R, Venot E, Phocas F. Detection of quantitative trait loci for maternal traits using high-density genotypes of Blonde d’Aquitaine beef cattle. BMC Genet. 2016;17:88.
McClure MC, Morsci NS, Schnabel RD, Kim JW, Yao P, Rolf MM, et al. A genome scan for quantitative trait loci influencing carcass, postnatal growth and reproductive traits in commercial Angus cattle. Anim Genet. 2010;41:597–607.
Zhang CL, Yuan J, Wang Q, Wang YH, Fang XT, Lei CZ, et al. The association of bovine T1R family of receptors polymorphisms with cattle growth traits. Res Vet Sci. 2012;93:783–787.
Purcell S, Neale B, Todd BK, Thomas L, Ferreira MA, Bender D, et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. The American Journal of Human Genetics. 2007;81:559–575.
Chen CC, Zhang ZZ. iPat: intelligent prediction and association tool for genomic research. Bioinformatics. 2018;34:1925–1927.
Madsen P, Milkevych V, Ding HD, Christensen FO, Jensen J. DMU-a package for analyzing multivariate mixed models in quantitative genetics and genomics. In: Proceedings of the 10th world congress of genetics applied to livestock production; 2014.
We thank the Tacheng Area Xinjiang Brown Cattle Breeding Farm, Yili Xinhe Xinjiang Brown Cattle Breeding Farm, Urumqi Xinjiang Brown Cattle Breeding Farm, and the Xinjiang Tianshan Animal Husbandry and Bio-engineering Co., Ltd. for their cooperation and support. We also thank Dr. Linda R. Klein for valuable writing advice and editing the manuscript.
This project was partially supported by the China Agricultural Research System (CARS-36), Science & Technology Department of Xinjiang Uygur Autonomous Region (2018E02052), Department of Education of Xinjiang Uygur Autonomous Region (XJEDU2017005), the USDA National Institute of Food and Agriculture Hatch project (1014919), and the Washington Grain Commission (Endowment and Award # 126593). The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Ethics approval and consent to participate
This study was approved and certificated by the School of Agriculture, Ningxia University (China, Ningxia). All the farms involved in this study had signed an Experimental Animal Sampling Certificate to agree to participate in this research. The phenotype and genotype data in this study were obtained with the joint efforts of the farms and our research members.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Figure S1. Correlations and distributions of phenotypes, EBVs (estimated breeding values), and residuals. The histograms on the diagonal are the distributions for each trait: MY = milk yield, FY = fat yield, PY = protein yield, FP = fat percentage, PP = protein percentage, SCS = somatic cell score, AFS = age at first service, AFC = age of first calving, GL = gestation length, and CI = calving interval. The upper triangle is comprised of the correlation coefficients among traits. The lower triangle is comprised of the pairwise scatter plots. Graphs a, b, c illustrate milk trait phenotypes, EBVs, and residuals, respectively. Graphs d, e, f illustrate reproduction trait phenotypes, EBVs, and residuals, respectively.
Additional file 2: Figure S2. Heatmap of milk and reproductive traits. Individuals are sorted row wise and traits column wise based on their similarity. The trait values were standardized and illustrated as heat map with red indicating highest and yellow the lowest. MY = milk yield, FY = fat yield, PY = protein yield, FP = the fat percentage, PP = protein percentage, SCS = somatic cell score, AFS = age at first service, AFC = age of first calving, GL = gestation length, and CI = calving interval.
Additional file 3: Figure S3. Display of significant markers as visible bands at different width. The number significant markers were determined by the P-value cut off with three levels illustrated on the left. The significant markers are displayed as bands with width indicated on the top starting from 100 kb to 10,000 kb. More bands are visible with wider bands than narrow bands. Wide band and less stringent P value threshold (e.g. to right) demonstrate pleiotropy of significant markers across traits. These traits include milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), protein percentage (PP), and somatic cell score (SCS). Reproductive traits include age at first service (AFS), age at first calving (AFC), gestation length (GL), and calving interval (CI).
Additional file 4: Figure S4. Scatter plot between principal components and trait phenotypes. We used these plot to determine which traits were correlated with population structure, represented by principal components (PC). Columns represent the first three principal components, rows represent each trait. Milk traits include milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), protein percentage (PP). and somatic cell score (SCS). Reproductive traits include age at first service (AFS), age at first calving (AFC), gestation length (GL), and calving interval (CI).
Additional file 5: Figure S5. Manhattan and Q-Q plots of non-significant GWAS results. GWAS was performed with FarmCPU software and a significant p-value threshold set at P = 10–7. Four of the 10 traits studied, milk yield (MY), fat yield (FY), protein percentage (PP), and age at first calving (AFC), resulted in no SNPs passing the Bonferroni threshold, as illustrated by the Manhattan plots on the left. On the right, Q-Q plots are displayed as scatter plots of true and expected log p-values.
Additional file 6: Figure S6. Manhattan and Q-Q plots of GWAS results, without considering population structure. These Manhattan plots (left) illustrate results from an association analysis model that did not consider population structure. GWAS was performed with FarmCPU software and a significant p-value threshold set at P = 10–7. Q-Q plots (right) are displayed as scatter plots of true and expected log p-values. Milk traits include milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), protein percentage (PP), and somatic cell score (SCS). Reproductive traits include age at first service (AFS), age at first calving (AFC), gestation length (GL), and calving interval (CI).
About this article
Cite this article
Zhou, J., Liu, L., Chen, C.J. et al. Genome-wide association study of milk and reproductive traits in dual-purpose Xinjiang Brown cattle. BMC Genomics 20, 827 (2019) doi:10.1186/s12864-019-6224-x