Animals and phenotype
Between 2013 and 2017, 3770 American Duroc and 2090 Canadian Duroc pigs were raised in two breeding farms in Guangdong Wen’s Foodstuffs Co., Ltd. (Guangdong, China). When these 5860 Duroc pigs reached the body weight of 30 ± 5 kg, the pigs were transferred to the test station. The pigs were freely fed, provided with drinking water, and measured with a final weight of 100 ± 5 kg in the test station. Phenotypic records included ADG and LMP. When their body weight approximately reached 100 kg, the ADG and age to 100 kg (AGE) traits were measured, and then adjusted to 100 kg. The corrected 100 kg AGE was calculated by the following formula [32]:
$$ AGE\ (day)= Measured\ age-\left[\left( Measured\ body\ weight-100\ kg\right)/ CF\right] $$
where correction factors (CF) are different for male and female, and the CF was used in the following formula:
$$ Male: CF= Measured\ body\ weight/ Measured\ age\ast 1.826 $$
$$ Female: CF= Measured\ body\ weight/ Measured\ age\ast 1.715 $$
the corrected 100 kg ADG was calculated by the following formula [32]:
$$ ADG\ \left( kg/ day\right)=100 kg/ AGE $$
In addition, when their body weight reached 100 ± 5 kg, the backfat thickness (BF) and loin muscle depth (LMD) of the pigs were measured by an Aloka 500 V SSD B ultrasound (Corometrics Medical Systems, USA) between the 10th and 11th rib of the pig. The phenotypic values of LMP were calculated based on the BF and LMD as following described [33]:
$$ LMP\ \left(\%\right)=61.21920-0.77665\ast BF+0.15239\ast LMD $$
where LMP denotes the corrected 100 kg LMP; BF represents the corrected 100 kg BF; LMD signifies the corrected 100 kg LMD. The corrected 100 kg BF and LMD were obtained based on the Canadian Centre for Swine Improvement (http://www.ccsi.ca/Reports/Reports_2007/Update_of_weight_adjustment_factors_for_fat_and_lean_depth.pdf) as the following formula:
$$ BF\ (mm)= Measured\ BF\ast \frac{A}{A+B\ast \left( Measured\ body\ weight-100\ kg\right)} $$
where A and B are different for male and female, and the values represented by A and B are as follows:
$$ male:A=13.47;B=0.1115 $$
$$ female:A=15.65;B=0.156 $$
(6)
the corrected 100 kg LMD was calculated by the following formula:
$$ LMD\ (mm)= Measured\ LMD\ast \frac{a}{a+b\ast \left( Measured\ body\ weight-100\ kg\right)} $$
where a and b vary by the gender of the individual, and the values specified by a and b are as follows:
$$ male:a=50.52;b=0.228 $$
$$ female:a=52.01;b=0.228 $$
Genotype data acquisition and quality control
Collected ear tissues were used to extract genomic DNA by applying standard protocols. In this study, all animals were still raised until elimination after phenotypic measurement and ear tissue collection. These animals were not anesthetized during the ear tissue collection. Genotyping was conducted as described by Ding et al. [34]. DNA quality was measured by electrophoresis and a light absorption ratio (A260/280). All DNA samples were diluted to a concentration of 50 ng/μL. After DNA QC, 5860 Duroc and pigs were genotyped with the Geneseek Porcine 50 K SNP chip (Neogen, Lincoln, NE, United States). The genotype QC was conducted by PLINK v 1.9 [35]. The following criteria were used to filter SNPs prior to conducting association analysis: animal call rates < 95%, SNPs with call rates of < 90%, minorallele frequency < 1%, and P-value < 10− 6 for the Hardy-Weinberg equilibrium test were excluded. This study excluded the SNPs located on the sex chromosomes and unmapped regions. In addition, the two Duroc pig populations follow the same QC criteria. In the meta-analysis of GWAS, a common set of SNPs that passed QC across the two populations were later used.
Assessing the power of GWAS
In this study, the sample size calculation was based on the following two formulas, in which the correlation (r) between the marker and the trait was calculated by the following formula [36]:
$$ r\left(t,m\right)=r\left(m,q\right)\ast r\left(q,g\right)\ast r\left(g,t\right) $$
where m denotes the marker genotype; q represents the QTL genotype; g is the genetic value; t is the phenotypic value. r2(m,q) specifies the conventional r2 measure of linkage disequilibrium; r2(q,g) stands for the proportion of genetic variance explained by the QTL; r2(g,t) refers to the heritability of the trait. In this study, we assumed that r2(m,q) = 0.35, r2(q,g) = 0.04. r2(g,t) is the heritability value estimated by GCTA software.
The number of animals (N) required for detecting a QTL was calculated by the following formula [36, 37]:
$$ N={\left(\frac{1-{r}^2\left(t,m\right)}{r\left(t,m\right)\left(1/{Z}_{\left(1-\alpha \right)}\right)}\right)}^2 $$
where Z denotes the normal score; α represents the Bonferroni-corrected type I error rate for K independent tests, in which K is the number of effective SNPs.
According to the formulas, the number of animals required for ADG and LMP are 4490 and 3889, respectively. Herein, we conducted GWASs for ADG and LMP in 3770 American and 2090 Canadian Duroc pigs. To avoid the insufficient detection power of single-population-based GWAS and improve the efficiency of identifying QTL, we conducted meta-analyses in 5860 pigs from the two Duroc pig populations.
Population structure analysis
Population stratification vastly affects GWAS reliability, so software R and GCTA were used to evaluate the population structure of two Duroc populations [38, 39]. The Q–Q plot is a commonly used tool for scanning population stratification in GWAS studies [40]. In this study, the Q–Q plot was constructed by R v3.6.1 software. Given that the experimental animals originated from two groups in this study, PCA was used to evaluate the similarity of the genetic background between American and Canadian Duroc pigs. PCA was generated by software GCTA v1.92.4beta.
Single-population GWAS
GEMMA software was applied to a univariate linear mixed model to execute GWAS, and the single-population analysis of the two Duroc pig populations used the same univariate linear mixed model [13]. Before GWAS, the genomic relatedness matrix (GRM) between individuals was estimated by GEMMA. The matrix form was used in the following statistical model:
$$ y= W\alpha + X\beta +u+\varepsilon $$
where y refers to a vector of phenotypic values for all animals; W denotes the incidence matrices of covariates (fixed effects), including sex, live weight, and the top three eigenvectors of PCA; α represents the vector of corresponding coefficients with the intercept; X corresponds to the vector of marker genotypes; β specifies the corresponding effect size of the marker; u stands for the vector of random effects with u ~ MVNn (0, λ τ− 1K); ε is the vector of random residuals with ε ~ MVNn (0, τ− 1In); λ signifies the ratio between two variance components; τ−1 is the variance of the residual errors; K serves as GRM; I is an n × n identity matrix; MVNn denotes the n-dimensional multivariate normal distribution.
Moreover, to prove the superiority of the combination strategy (included single-population GWAS and meta-analysis), we conducted GWAS for ADG and LMP traits in a mixed-population (included these two Duroc pig populations) and added population as a covariate into the univariate linear mixed model.
Meta-analysis of GWAS
Meta-analysis was conducted by combining single-population GWAS analysis using METAL software [41]. In this study, METAL combined the results of the two single-population GWASs by calculating the pooled inverse-variance-weighted β-coefficients, standard errors, and Z-scores, and the formulas were as follows:
$$ se=\sqrt{1/\sum \limits_i{w}_i} $$
$$ \beta =\sum \limits_i{\beta}_i{w}_i/\sum \limits_i{w}_i $$
where βi is the β-coefficients for study i; SE corresponds to the standard errors for study i.
In the single-population GWAS and meta-analysis, the genome-wide significant (0.05/N) and suggestive (1/N) thresholds by Bonferroni correction, in which N is the number of SNPs, were used in the analysis. In this study, the meta-analysis combined the β-coefficients and standard errors of the SNPs in the results of single-population GWAS that were common to American and Canadian Duroc pigs.
Haplotype block analysis
Haplotype block analysis was carried out for chromosomal regions with multiple significant SNPs clustered around the top SNP to evaluate the LD pattern of the regions. The haplotype blocks were generated by the software PLINK v1.9 and Haploview v4.2 [42].
Estimation of genetic parameters and the explained phenotypic variance
The restricted maximum likelihood method was used to estimate the phenotypic variance explained by the significant SNPs for ADG and LMP traits using GCTA software. To adjust the structure of the stratified population and cryptic relatedness, the top three eigenvectors of PCA were included as covariates in this analysis. The phenotypic variance explained by the significant SNPs were calculated in the following model [39, 43]:
$$ y= X\beta +g+\varepsilon \kern0.4em with\kern0.4em \operatorname{var}(y)={A}_g{\sigma}_g^2+I{\sigma}_{\varepsilon}^2 $$
where y refers to the vector of phenotype value; β is a vector of fixed effects; X is an incidence matrix for β; g represents the vector of the aggregate effects of all the qualified SNPs for the pigs within one population; I is the identity matrix; Ag is the GRM; \( {\sigma}_g^2 \) corresponds to the additive genetic variance captured by either the genome-wide SNPs or the selected SNPs, and \( {\sigma}_{\varepsilon}^2 \) refers to the residual variance. GCTA was used to estimate genetic correlation in the bivariate mode, and was also used to compute the genomic heritability.
Functional annotation of candidate genes
All SNP positions from the most recent Sus scrofa genome (version 11.1) were downloaded from Ensembl [44]. The Ensembl annotation of the Sus scrofa 11.1 genome version was used to find the genes which were nearest the significant SNPs (http://ensembl.org/Sus_scrofa/Info/Index). KEGG and GO analyses were carried out using the KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/kobas3) [45]. Fisher’s exact test was used to assess the significance of the enriched terms with the criteria of P < 0.05 to explore the genes involved in pathways and biological processes [46].