### Genotype and phenotype data

The large dairy cattle database maintained by CDCB and USDA-AGIL includes more than one million genotyped animals with complete pedigree. The data were collected on a continuous basis, and this study included all the Holstein data available until September, 2015. From the database, we extracted 262,757 genotyped females whose sire and dam were also genotyped. The genotypes were generated from 16 different SNP arrays with SNP number ranging from 7 K to 50 K. The SNP genotypes of all 262,757 females were phased to determine the parent-of-origin of each allele. We first used parent genotypes to phase a SNP genotype of a cow [26]. If this step failed, we then applied a population-based phasing approach using FindHap version 3.0 [28]. After phasing, all individuals were imputed to 50 K SNP data. When building genomic relationship matrices (GRMs), we further filled a small portion of genotypes that were still missing after imputation from FindHap by randomly sampling genotypes from a multinomial distribution with probabilities of the three genotypes derived under an assumption of Hardy–Weinberg equilibrium.

Among the 262,757 Holstein cows, 42,701 of them had yield deviation (YD) phenotypic data. YD phenotypes were adjusted for appropriate covariates, including farm, year, and season effects. Eight traits were analyzed, including milk yield (MY), fat yield (FY), protein yield (PY), somatic cell score (SCS; a measure of mammary gland health), standardized productive life (STPL; a measure of longevity), daughter pregnancy rate (DPR; a measure of fertility), cow conception rate (CCR; a measure of fertility), and heifer conception rate (HCR; a measure of fertility). Since many cows were not measured for all the phenotypes, the final sample size for the eight traits ranged from 12,911 (STPL) to 29,811 (MY, FY and PY), as shown in Table 1.

### Variance decomposition with additive, dominance, and imprinting components

Genetic effects of SNPs can be decomposed into three components (i.e., genotypic additive, dominance, and imprinting values), following their evident biological meanings:

$$ \left[\begin{array}{l}{G}_{11}\\ {}{G}_{12}\\ {}{G}_{21}\\ {}{G}_{22}\end{array}\right]= R+ a\left[\begin{array}{l}0\\ {}1\\ {}1\\ {}2\end{array}\right]+ d\left[\begin{array}{l}0\\ {}1\\ {}1\\ {}0\end{array}\right]+ i\left[\begin{array}{c}0\\ {}-1\\ {}1\\ {}0\end{array}\right]= R+ A+ D+ I, $$

(1)

where *G*
_{12} is the genetic value for the genotype 12 with a paternal allele 1 and a maternal allele 2 (similar for *G*
_{11}, *G*
_{21} and *G*
_{22}), *R* is the overall mean, *a* is additive effect, *d* is dominance effect, *i* is imprinting effect, *A* is genotypic additive value arising from *a*, *D* is genotypic dominance value arising from *d*, and *I* is genotypic imprinting value arising from *i*. Under Hardy-Weinberg equilibrium, eq. (2) can be further centralized regarding *a* and *d* into

$$ \left[\begin{array}{l}{G}_{11}\\ {}{G}_{12}\\ {}{G}_{21}\\ {}{G}_{22}\end{array}\right]={R}^{\ast }+ a\left[\begin{array}{c}-2 p\\ {} q- p\\ {} q- p\\ {}2 q\end{array}\right]+ d\left[\begin{array}{c}-2 p q\\ {}1-2 p q\\ {}1-2 p q\\ {}-2 p q\end{array}\right]+ i\left[\begin{array}{c}0\\ {}-1\\ {}1\\ {}0\end{array}\right]={R}^{\ast }+{A}^{\ast }+{D}^{\ast }+ I, $$

(2)

where *R*
^{*} is the overall mean after centering, *p* is the frequency of allele 2 and *q* is the frequency of allele 1, and *A*
^{*} (*D*
^{*}) is genotypic additive (dominance) value after centralization. Note that in eq. (2), genotypic additive value (*A*
^{*}) is not independent of genotypic dominance value (*D*
^{*}), or *Cov*(*A*
^{∗}, *D*
^{∗}) ≠ 0. To address the issue, we can use the extended natural and orthogonal interactions (NOIA) model [21] under Hardy-Weinberg equilibrium,

$$ \left[\begin{array}{l}{G}_{11}\\ {}{G}_{12}\\ {}{G}_{21}\\ {}{G}_{22}\end{array}\right]={R}^{\ast \ast }+\beta \left[\begin{array}{c}-2 p\\ {} q- p\\ {} q- p\\ {}2 q\end{array}\right]+ d\left[\begin{array}{c}-2{p}^2\\ {}2 p q\\ {}2 p q\\ {}-2{q}^2\end{array}\right]+ i\left[\begin{array}{c}0\\ {}-1\\ {}1\\ {}0\end{array}\right]={R}^{\ast \ast }+{A}^{\ast \ast }+{D}^{\ast \ast }+ I, $$

(3)

where *R*
^{**} is the overall mean and *β* is allele substitution effect. Despite its similarity to eq. (2), eq. (3) results in different variance decomposition. The three components for *β*, *d*, and *i* correspond to breeding value (*A*
^{**}), dominance deviation (*D*
^{**}), and genotypic imprinting value (*I*), respectively.

The differences and relationships between eqs. (2) and (3) have been thoroughly discussed in a previous study [6], although that study did not include imprinting effects. The equation still holds when imprinting effects are included because the genotypic imprinting value is independent of the other two components in both eqs. (2) and (3). In theory, the sum of individual breeding value and dominance deviation in eq. (3) is equal to the sum of individual genotypic additive and dominance values in eq. (2); and when ignoring the covariance between additive and dominance effects, the sum of additive and dominance genetic variances resulting from the decomposition by eq. (3) is equal to the sum of the variances of genotypic additive and dominance values resulting from the decomposition by eq. (2). Additionally, individual genotypic imprinting value in eq. (2) is the same as in eq. (3), thus asserting the equivalence of imprinting variance components in the two equations. The theory holds for multiple loci when assuming linkage equilibrium and independent marker effects [6].

Although it is possible to directly fit SNP effects in a model [29], fitting individual-level genetic components is more efficient, especially for a large dataset with many SNP markers. In this study, we used the following model

$$ \mathbf{y}=\mathbf{Xb}+\mathbf{a}+\mathbf{d}+\mathbf{i}+\mathbf{e}\kern0.5em \mathrm{with}\ \mathbf{a}\sim N\left(0,{\sigma}_a^2\mathbf{A}\right),\mathbf{d}\sim N\left(0,{\sigma}_d^2\mathbf{D}\right),\mathbf{i}\sim N\left(0,{\sigma}_p^2\mathbf{P}\right),\mathbf{e}\sim N\left(0,{\sigma}_e^2\mathbf{I}\right), $$

(4)

where the phenotypic value of individuals (*y*) was decomposed into fixed effects (*b*), genotypic additive value (*a*), genotypic dominance value (*d*), genotypic imprinting value (*i*), and residual (*e*). Eq. (4) can be readily solved by a multi-component restricted maximum likelihood (REML) approach as implemented in GCTA [30], as long as we know the covariance structures of the three components, **A**, **D**, and **P**. Different forms of additive genomic relationship matrix (GRM) have been proposed. We used a version with pooled variance across all markers [31],

$$ {A}_{ij}={\sum}_k\left({Z}_{ik}-2{p}_k\right)\left({Z}_{jk}-2{p}_k\right)/{\sum}_k2{p}_k\left(1-{p}_k\right), $$

(5)

where *Z*
_{
ik
} (*Z*
_{
jk
}) is the additive genotype code for marker *k* of individual *i* (*j*) as shown in the vector corresponding to *a* in eq. (1) and *p*
_{
k
} is the population frequency of allele 2. Similarly, based on the equivalence of SNP-BLUP and GBLUP [7, 32], we can obtain corresponding GRMs for dominance (**D**) and imprinting (**P**), which are shown as following:

$$ {D}_{ij}={\sum}_k\left[{H}_{ik}-2{p}_k\left(1-{p}_k\right)\right]\left[{H}_{jk}-2{p}_k\left(1-{p}_k\right)\right]/{\sum}_k2{p}_k\left(1-{p}_k\right)\left[1-2{p}_k\left(1-{p}_k\right)\right] $$

(6)

$$ {P}_{ij}={\sum}_k{S}_{ik}{S}_{jk}/{\sum}_k2{p}_k\left(1-{p}_k\right) $$

(7)

where *H* and *S* are the genotype codes for dominance and imprinting effects as shown in the corresponding vectors in eq. (1), respectively. Equation (6) has been used in previous studies [5, 10]. When building GRMs, we used whole-genome markers with minor allele frequency (MAF) ≥0.01. Finally, the software MMAP [33], which efficiently implements REML, was used to fit model (4).

For comparison purposes, we also performed variance decomposition based on eq. (3). In this case, we need to use a different dominance GRM (**D**
^{*}),

$$ {D}_{ij}^{\ast }={\sum}_k{H}_{ik}^{\ast }{H}_{jk}^{\ast }/{\sum}_k{\left(2{p}_k\left(1-{p}_k\right)\right)}^2, $$

(8)

where *H** is the dominance genotype code as shown in the vector corresponding to *d* in eq. (3). Accordingly, the total genetic variance is decomposed to classical additive and dominance genetic variances and variance of genotypic imprinting effect. We further compared the two different kinds of variance decompositions regarding estimates of individual effects and variance components to verify the theory on their equivalence of explaining phenotypes.

### Simulation study for validating variance decomposition

Note that when building the GRMs, we assumed that the traits are highly polygenic for the additive, dominance, and imprinting effects. Although the polygenic architecture of additive effects is commonly used for complex traits [22], we have less knowledge on whether dominance and imprinting effects are also polygenic. To address this issue, we simulated a number of datasets to investigate whether model (4) can capture dominance and imprinting effects when there are a small number of corresponding QTLs. Specifically, we first obtained a random subsample of 10,000 from the 42,000 cows being analyzed, and then randomly selected markers from the 50 k SNPs as additive, dominance, or imprinting QTLs. We simulated QTL effects using a normal distribution and added them up to obtain *a*, *d*, and *i* for each of the 10,000 cows. Thereafter we calculated \( {\sigma}_a^2=\operatorname{var}(a) \), \( {\sigma}_d^2=\operatorname{var}(d) \), and \( {\sigma}_p^2=\operatorname{var}(i) \) using corresponding simulated genetic values. Based on the heritability we set to simulate, we calculated \( {\sigma}_e^2 \) and simulated *e* by sampling it from \( N\left(0,{\sigma}_e^2\right) \). The phenotype for each individual animal was simulated by adding up *a*, *d*, *i*, and *e*.

To ensure realistic simulations, we picked variance of the normal distribution for simulating effect sizes so the variance decomposition was the same between simulated and real data. Our simulation scenarios included two representative traits, milk yield and DPR, separately. Three scenarios were simulated for either trait by varying QTL numbers, including 1000 + 10 + 10 (1000, 10 and 10 QTLs for additive, dominance, and imprinting effects, respectively), 1000 + 100 + 100, and 1000 + 1000 + 1000. Simulation for each scenario was repeated 100 times. We fitted model (4) for each simulated data set and compared variance component estimation between the three scenarios.

### Genome-wide association study of non-additive effects

To increase computational efficiency, we used a two-step strategy for genome-wide association study, similar to the GRAMMAR approach [34]. First, we fitted model (4), and obtained the residuals to adjust for polygenic effects. Second, we used the residuals as response variable to fit a multiple linear regression model for each SNP,

$$ \mathbf{e}=\mu +{a}_k{\mathbf{Z}}_k+{d}_k{\mathbf{H}}_k+{i}_k{\mathbf{S}}_k+\boldsymbol{\upvarepsilon}, $$

(9)

where **Z**
_{
k
}, **H**
_{
k
} and **S**
_{
k
} are the genotype codes of marker *k* for additive, dominance and imprinting effects, respectively, as described in eqs. (5,6,7), and *a*
_{
k
}, *d*
_{
k
}, and *i*
_{
k
} are corresponding SNP effects. SNPs were filtered by MAF ≥0.01 and *P*-value of Chi-square test for Hardy–Weinberg equilibrium ≥1 × 10^{−6}. Association *P*-values were calculated from t-tests for the three types of SNP effects.

For association signals with sufficient statistical evidence from the two-step analysis, we further used the full mixed model,

$$ \begin{array}{c}\mathbf{y}=\mu +{a}_k{\mathbf{Z}}_k+{d}_k{\mathbf{H}}_k+{i}_k{\mathbf{S}}_k+\mathbf{a}+\mathbf{d}+\mathbf{i}+\mathbf{e}\\ {}\mathrm{with}\ \mathbf{a}\sim N\left(0,{\sigma}_a^2\mathbf{A}\right),\mathbf{d}\sim N\left(0,{\sigma}_d^2\mathbf{D}\right),\mathbf{i}\sim N\left(0,{\sigma}_p^2\mathbf{P}\right),\mathbf{e}\sim N\left(0,{\sigma}_e^2\mathbf{I}\right),\end{array} $$

(10)

or its reduced version,

$$ \mathbf{y}=\mu +{a}_k{\mathbf{Z}}_k+{d}_k{\mathbf{H}}_k+{i}_k{\mathbf{S}}_k+\mathbf{a}+\mathbf{e}\kern0.5em \mathrm{with}\ \mathbf{a}\sim N\left(0,{\sigma}_a^2\mathbf{A}\right)\ \mathrm{and}\ \mathbf{e}\sim N\left(0,{\sigma}_e^2\mathbf{I}\right), $$

(11)

to rerun the association analysis, depending on whether the additive effects can explain a majority of total genetic variance on the trait being analyzed. Here, the response variables in eq. (10) and (11) are yield deviations. Again, we applied the software MMAP [33] to fit the mixed models.

### Validation of non-additive association signals using independent data

Our discovery GWAS used the data available until September, 2015. From then to April, 2016, we assembled a new dataset to validate the signal found in the initial GWAS. The validation data consisted of 5514 cows with both genotypes and milk phenotypes. The genotypes in the validation data were phased with the same procedures as used for the discovery data set. With the validation data, model (11) was used to analyze associations between milk and 50 SNP markers around the *RUNX2* signal. The GRM was built using all chip SNPs except those on chromosome 23, which resulted in a leave-one-chromosome-out analysis (LOCO) [35]. We also built the GRM using all genome-wide SNPs and compared it with the LOCO analysis. The validation data were also used to analyze the significant dominance associations around Chr5:107,000,000 with both fat and protein. The three SNPs with the smallest discovery *P*-value were analyzed with model (11) for fat and protein, respectively.

### Fine mapping for the *RUNX2* dominance signal

First, we used the sequence data of 443 Holstein bulls from the 1000 Bull Genomes project [24] (Run 5.0) to check LD levels between the targeted SNP (Chr23:18,600,456) and SNPs/biallelic indels around it. Based on the LD decay pattern, we chose the region of ±500 kb from the targeted SNP for fine mapping. Then, we used the sequence genotypes of the 443 bulls as reference to impute the 50 k genotypes of 29,811 cows to sequence genotypes. Beagle version 4 [36] was used for the imputation with default parameters. To increase accuracy, our imputation covered a larger region of ±1 Mb from the targeted SNP. After imputation, we removed non-informative SNPs, i.e. SNPs with a MAF <0.01, SNPs with a *P*-value of Chi-square test for Hardy–Weinberg equilibrium <1 × 10^{−6} and SNPs with an allelic *R*
^{2} (AR^{2}) <0.05. AR^{2}, reported by Beagle software, is the estimated squared correlation between the most probable alternative allele dose and the true alternative allele dose and serves as a good metric for estimating imputation accuracy [37]. The analysis of associations between milk and the imputed sequence variants within the targeted region (Chr23:18,100,456–19,100,456) was performed with a two-step method as described in our GWAS section.

### Genomic prediction

We estimated the values of the three effects for individuals in the training population from fitting model (4) in MMAP. The genomic predictions for new individuals can be calculated by

$$ {\widehat{\mathbf{g}}}_n={\widehat{\boldsymbol{\upalpha}}}_n+{\widehat{\mathbf{d}}}_n+{\widehat{\mathbf{i}}}_n={\mathbf{A}}_{n\times t}{\mathbf{A}}_{t\times t}^{-1}{\widehat{\mathbf{a}}}_t+{\mathbf{D}}_{n\times t}{\mathbf{D}}_{t\times t}^{-1}{\widehat{\mathbf{d}}}_t+{\mathbf{P}}_{n\times t}{\mathbf{P}}_{n\times t}^{-1}\widehat{{\mathbf{i}}_t} $$

(12)

where the subscripts *n* and *t* indicate the sets of new individuals and training population, respectively. Besides model (4) (ADD + DOM + IMP), we also considered two reduced models, the additive model (ADD) and the additive-plus-dominance model (ADD + DOM), and compared the prediction performance between the three models. Ten-fold cross validation was used to assess 1) prediction accuracy, defined as the Person correlation between genomic estimated breeding value (GEBV) and phenotype, and 2) unbiasedness, defined as the regression coefficient of phenotype on GEBV in the validation population.