Existence of missing heritability for porcine serum lipid traits
As shown in Table 1, the amount of phenotypic variance explained by the genotyped SNPs was in general lower than the heritability values described by Casellas et al. in the same population. This phenomenon of “missing heritability” has been frequently described in GWAS. In particular, GWAS have limited power to identify rare variants with small effects on the phenotype, which might be the case for many traits with a polygenic architecture. One additional limiting factor of GWAS performed in livestock is that sample sizes are usually much smaller than those employed in humans. Although the size of our Duroc population is comparable to those described in previous porcine GWAS [26–28], the detection of loci with small effects or rare variants with strong effects might be feasible only with larger sample sizes. Despite this limitation, much larger studies performed in humans (on the order of 60,000-100,000 individuals) are consistent with the data outlined in our work. For instance, Asselbergs et al. carried out a meta-analysis of 32 GWAS encompassing 50,000 SNP markers and 66,240 European individuals and found that the proportion of phenotypic variance attributable to the genotyped SNPs was 10.3% for CHOL, 9.9% for HDL, 9.5% for LDL and 8.0% for TRIG. Similarly, Teslovich et al. demonstrated that around 25-30% of the genetic variance of plasma lipids could be explained by the variation of SNPs located at 95 loci. Failure to detect additional sources of genetic variance can have multiple causes. For instance, commercial genotyping arrays might not contain all of the common or rare variants with moderate to large effects on the trait under analysis, so these alleles will be systematically missed in GWAS (unless they are in linkage disequilibrium with one or several markers of the array). This can be especially problematic if there is ascertainment bias, i.e. when the populations used to build the array are distantly related to the one being studied. Imprecise phenotyping, improper statistical analyses and ignoring other sources of genetic variability (e.g. structural variation) can also mask part of the genetic variance (VG).
The amount of phenotypic variance (VP) explained by the SNPs for HDL45 (0%) and HDL190 (2%) was very low. This observation is consistent with the small Bayes factors (BF) obtained by Casellas et al., in the same Duroc population, when comparing two models with and without additive polygenic effects, i.e. BF = 2.2 and 2.1 for HDL45 and HDL190, respectively. Such results, according to the scale of Jeffreys, are barely worth mentioning. In strong contrast, Bayes factors for CHOL190, LDL190, TRIG45 and TRIG190 ranged from 8.7 to 47.9 (substantial to very strong evidence favoring the model with polygenic effects). These results imply that the genetic determinism of HDL45 and HDL190 in the Lipgen population is much weaker than that of other serum lipid traits, or that the genetic architecture of these two traits relies on a large number of loci with very small effects that cannot be captured efficiently with the experimental design and methods used in the current work.
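The verbal labels attached to the Bayes factors above can be reproduced with a small lookup. The sketch below is illustrative only: it assumes the commonly quoted form of Jeffreys' scale, with category boundaries at half-unit steps of log10(BF), and the function name `jeffreys_category` is hypothetical rather than taken from any software used in this work.

```python
import math

# Upper bounds on log10(BF) for each evidence category, assuming the
# commonly quoted half-unit steps of Jeffreys' scale.
_JEFFREYS_SCALE = [
    (0.0, "negative (supports the simpler model)"),
    (0.5, "barely worth mentioning"),
    (1.0, "substantial"),
    (1.5, "strong"),
    (2.0, "very strong"),
]


def jeffreys_category(bayes_factor):
    """Return the Jeffreys evidence category for a positive Bayes factor."""
    if bayes_factor <= 0:
        raise ValueError("Bayes factor must be positive")
    log_bf = math.log10(bayes_factor)
    for upper_bound, label in _JEFFREYS_SCALE:
        if log_bf < upper_bound:
            return label
    return "decisive"


# Bayes factors quoted in the text for HDL45/HDL190 and for the two
# extremes of the range reported for the remaining traits.
for bf in (2.2, 2.1, 8.7, 47.9):
    print(bf, jeffreys_category(bf))
```

Under these assumed boundaries, the HDL values fall in the lowest positive category, whereas the extremes of the 8.7 to 47.9 range fall in the "substantial" and "very strong" categories, in line with the interpretation given above.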
Genetic determinants of porcine serum lipids are modulated by age-specific factors
Identifying TARs for blood lipid concentrations is particularly difficult because their genetic architecture consists of hundreds of genetic determinants with small effect sizes [24, 25]. The discovery of these TARs in humans requires population sizes of tens or even hundreds of thousands of individuals, which are unavailable in non-model organisms such as pigs. Pigs are nevertheless particularly interesting because of their physiological similarity to humans and the relative ease with which tissue samples can be retrieved to analyse gene expression under different experimental conditions. The main trend that emerges from the inspection of the data presented in Table 2 is the complete lack of concordance between the genotype-phenotype associations detected in 45- and 190-day-old Duroc pigs (Table 3 and Additional file 2: Table S1). Moreover, we have observed important differences in the h2SNP estimates obtained in 45- and 190-day-old pigs (in general, older pigs have higher values), as shown in Table 1 and Additional file 1: Figure S1. This result may indicate that the genetic architecture of porcine serum lipid traits is modulated by age-specific factors.
Classical studies performed in humans support this latter conclusion. In a longitudinal study, it was shown that heritability estimates were relatively constant across generations, but the expression patterns of genes affecting CHOL, LDL, HDL and TRIG differed between adolescent and middle-aged people, e.g. only 46% (TRIG) to 80% (CHOL) of the genetic variance was shared by both age groups. Indeed, heritability estimates of age-related variations in LDL (h2 = 0.25-0.36) and HDL (h2 = 0.23-0.58) concentrations are moderate, meaning that the relative contributions of their genetic determinants change over time. Moreover, a comparison of GWAS data obtained in young and adult people revealed that no single association was significant in both groups, implying that age is an important modifier of the genetic determinism of circulating lipids.
Three genomic regions in SSC3, SSC6 and SSC16 display consistent associations with porcine serum lipid concentrations
Three regions at SSC3 (~124 Mb, associated with LDL45, CHOL190 and LDL190), SSC6 (~135 Mb, CHOL190, TRIG190) and SSC16 (~17 Mb, CHOL45) were consistently detected with GEMMA, EMMAX and GenABEL, while several others were method-specific (Table 2). This substantial concordance was, to a certain extent, unexpected because Zhou and Stephens showed that, in the presence of a marked sample structure, approximate methods tend to underestimate the significance of associations (i.e. their P-values are less significant) and involve a substantial loss of power. Although in general nominal and corrected P-values obtained with GEMMA were more significant than those retrieved with EMMAX and GenABEL (Table 2), we saw neither important P-value departures among methods nor a poorer performance of GenABEL (in generating deflated P-values) when compared with EMMAX. It is also true, however, that GEMMA was the method that yielded the most method-specific associations (CHOL45 at SSC3 and SSC10, LDL45 at SSC10 and SSC13, TRIG190 at two SSC6 regions), something that might be explained by an increase in statistical power associated with the use of exact instead of approximate significance tests.
Genome-wide association analyses carried out with PLINK identified four of the most significant TARs also found with mixed-model methods, plus a long list of additional TARs. We believe these differences are explained by the fact that PLINK adopts a completely different approach to handle population structure. Instead of capturing infinitesimal polygenic effects, PLINK relies on standard linear models where family-related effects (i.e. sire-mean-adjusted) must be accounted for by appropriate regression coefficients. Alternatively, some specific tests are available for case-control studies when population stratification has been previously identified, although they cannot be generalized to quantitative traits. Given that our analyses focused on non-discrete traits, potential population structure was partially accounted for by including sire-specific effects in the linear model (without considering dam-related contributions). This was mainly due to the limitations of the PLINK program in taking into account infinitesimal additive genetic effects under non-homogeneous covariance structures, and to the fact that sow-related contributions could not be addressed when a single offspring was retained from each litter. Although the inclusion of sire-specific effects in the model can be viewed as a reasonable way to account for hidden population structure in the Lipgen population, results must be taken with caution given the risk of false positives linked to partially undetected sample structure.
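The effect of including sire-specific effects in a linear model can be illustrated with a toy example. The sketch below is not PLINK's implementation. It relies on the standard result that fitting one fixed effect per sire is equivalent to centering both genotypes and phenotypes within each sire family before estimating the SNP slope. All function names and the data are hypothetical and were chosen only to show how an unadjusted regression can inflate the apparent SNP effect when families differ in both mean phenotype and allele frequency.

```python
from collections import defaultdict


def center_within_groups(values, groups):
    """Subtract from each value the mean of its group (here, its sire family)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def sire_adjusted_slope(genotypes, phenotypes, sires):
    """SNP slope after removing sire means from genotypes and phenotypes."""
    x = center_within_groups(genotypes, sires)
    y = center_within_groups(phenotypes, sires)
    return ols_slope(x, y)


# Hypothetical toy data: two sire families whose phenotypic means differ by
# 10 units and whose allele frequencies differ; the true SNP effect is 1.
genotypes = [0, 0, 1, 1, 2, 2]
phenotypes = [0.0, 0.0, 1.0, 11.0, 12.0, 12.0]
sires = ["A", "A", "A", "B", "B", "B"]

naive = ols_slope(genotypes, phenotypes)
adjusted = sire_adjusted_slope(genotypes, phenotypes, sires)
print(naive, adjusted)
```

In this constructed example the unadjusted slope absorbs the between-family difference, whereas the within-sire slope recovers the simulated effect. Any structure not captured by the sire term (e.g. dam-related relationships) would remain uncorrected, which is the source of the false-positive risk mentioned above.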
Analysis of the gene content of 1 Mb genomic windows around the most significant SNP within each of the TARs detected with mixed-model methods revealed the presence of several loci involved in lipid metabolism. As previously mentioned, one of the most promising candidate genes is apolipoprotein B (APOB, located at SSC3 125.2 Mb), which has been identified in our study as well as in the GWAS performed by Chen et al. Apolipoprotein B is essential for the correct assembly of chylomicrons and the synthesis of very low density lipoproteins (VLDL), which transport TRIG from the intestine to other body tissues. Meanwhile, VLDL become progressively lipolyzed into LDL. Since APOB mediates the binding and endocytosis of LDL by their receptors, the knockout of this gene translates into hypercholesterolemia. Close to APOB, there is also the syndecan 1 gene (SDC1, located at SSC3 125.9 Mb), which encodes a membrane proteoglycan that mediates the clearance of TRIG-rich lipoproteins.
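The window-based search for candidate genes described above can be sketched as a simple interval query. This is a minimal illustration under stated assumptions: the function `genes_in_window`, the `flank_mb` parameter and the gene records are hypothetical, and the size of the flank on each side of the peak SNP is left as a parameter because the exact window definition is not restated here.

```python
def genes_in_window(snp_chrom, snp_pos_mb, genes, flank_mb=0.5):
    """Return names of genes overlapping a window around a peak SNP.

    genes is an iterable of (name, chromosome, start_mb, end_mb) tuples.
    The window spans snp_pos_mb - flank_mb to snp_pos_mb + flank_mb.
    """
    window_start = snp_pos_mb - flank_mb
    window_end = snp_pos_mb + flank_mb
    return [
        name
        for name, chrom, start_mb, end_mb in genes
        if chrom == snp_chrom and start_mb <= window_end and end_mb >= window_start
    ]


# Hypothetical annotation records (not real pig gene coordinates).
toy_genes = [
    ("GENE_A", "chr1", 10.20, 10.25),
    ("GENE_B", "chr1", 10.90, 10.95),
    ("GENE_C", "chr2", 10.30, 10.32),
]

narrow_hits = genes_in_window("chr1", 10.0, toy_genes)                # 0.5 Mb each side
wide_hits = genes_in_window("chr1", 10.0, toy_genes, flank_mb=1.0)    # 1 Mb each side
print(narrow_hits, wide_hits)
```

The choice of flank matters: in this toy example the second gene is only reported with the wider window, so the list of positional candidates should always be read together with the window definition.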
The SSC6 region (peak SNP at ~135 Mb) associated with CHOL190 contains the leptin receptor (LEPR) and the leptin receptor overlapping transcript (LEPROT) genes, both mapping to 135.3 Mb. Leptin plays key roles in (i) the regulation of food intake and energy expenditure, (ii) the modulation of APOB levels and triglyceridemia and (iii) the intestinal absorption of cholesterol [37, 38]. Finally, it is worth mentioning the ATP-binding cassette sub-family G (WHITE), member 1 gene (ABCG1), which maps to SSC13 (215.8 Mb) and controls tissue lipid levels and the efflux of cellular cholesterol to HDL.
The list of genes within TARs detected with PLINK was very large (Additional file 6: Table S5), so we mapped them to the Reactome database to obtain a global view of their biological functions. Loci mapping to TARs identified with PLINK and included in the “Metabolism of lipids and lipoproteins” Reactome category encompassed genes related to a variety of processes such as lipid transport (APOA1, APOA4, APOB, APOC3, ABCB11, SCP2) and clearance (SDC1), cholesterol synthesis (DHCR24, CH25H), fatty acid β-oxidation (ACOX1, ACADM) and phospholipid synthesis (AGPAT5).
Positional concordance for GWAS and QTL data generated in the Lipgen population
We have compared our GWAS data with the QTL previously reported by Gallardo et al. in the same population. Regarding mixed-model methods, the most prominent coincidence was an SSC3 region containing chromosome-wide QTL for CHOL190, LDL190 and TRIG190. The QTL peak at marker SW2408 (approximately 122 Mb) matched TARs for CHOL190, LDL45 and LDL190 (SSC3, ~124 Mb, confirmed with the three programs). Remarkably, Chen et al. identified the same TAR as significantly associated with CHOL and LDL concentrations in F2 Erhualian x Duroc pigs. This specific region contains the APOB gene, which in GWAS performed in humans has been consistently associated with CHOL and LDL plasma levels. Apolipoprotein B is the main structural component of chylomicrons and very-low-density lipoproteins (VLDL, the precursor of LDL) and plays an essential role in TRIG homeostasis. Interestingly, Pena et al. genotyped a polymorphic 230 bp intronic insertion at the pig APOB gene in the Lipgen population and reported associations with CHOL190, HDL190 and LDL190 concentrations. Taken together, these results suggest that APOB genotype might be a major determinant of CHOL and lipoprotein levels in both humans and pigs.
We also observed some concordance between a QTL for LDL45 at SSC13 (104 cM) and a TAR detected with GEMMA at 215 Mb, as well as between CHOL190 and LDL190 QTL found at SSC13 (72-74 cM) and TARs detected with PLINK at the 180-181 Mb and 207-210 Mb regions (Table 4). The existence of a genetic determinant for serum lipids on SSC13 is supported by results from previous genome scans, where QTL for CHOL (SSC13, 212 Mb approx.) and LDL (SSC13, 194 Mb) were detected by Yoo et al. and Uddin et al.
The limited concordance between QTL scan and GWAS data obtained from the Lipgen population may be explained by differences in marker density, type of polymorphisms and statistical methods used to carry out genome-wide analyses. For instance, the analysis of a Chinese Erhualian × White Duroc three-generation population yielded QTL and TAR maps that were remarkably different, i.e. in the GWAS the main associations mapped to SSC1 (63 Mb, LDL) and SSC3 (124 Mb, CHOL and LDL), whilst in the QTL scan SSC2 (67-73 cM, CHOL, LDL and TRIG), SSC5 (70 cM, TRIG), SSC7 (134 cM, HDL) and SSC8 (87 cM, LDL) encompassed the most significant associations. Similarly, Ramayo-Caldas et al. reported that only 53% of the TARs detected in their GWAS coincided with previously reported porcine QTL.
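A concordance percentage of this kind can be obtained by counting the TARs that overlap at least one QTL interval on the same chromosome. The following sketch is a hedged illustration, not the procedure used in any of the cited studies: the function names and interval data are hypothetical, and it assumes that QTL and TAR positions have already been expressed in the same coordinate system (QTL reported in cM would first need to be anchored to physical positions through their flanking markers).

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """Return True if the closed intervals [a_start, a_end] and [b_start, b_end] overlap."""
    return a_start <= b_end and b_start <= a_end


def concordance_fraction(tars, qtls):
    """Fraction of TARs overlapping at least one QTL on the same chromosome.

    tars and qtls are lists of (chromosome, start, end) tuples in the same units.
    """
    if not tars:
        raise ValueError("at least one TAR is required")
    concordant = sum(
        1
        for t_chrom, t_start, t_end in tars
        if any(
            t_chrom == q_chrom and intervals_overlap(t_start, t_end, q_start, q_end)
            for q_chrom, q_start, q_end in qtls
        )
    )
    return concordant / len(tars)


# Hypothetical intervals in Mb (not taken from any table in this work).
toy_tars = [
    ("chr1", 20.0, 22.0),
    ("chr1", 80.0, 81.0),
    ("chr2", 40.0, 42.0),
    ("chr3", 5.0, 6.0),
]
toy_qtls = [
    ("chr1", 15.0, 21.0),
    ("chr2", 41.5, 60.0),
]

fraction = concordance_fraction(toy_tars, toy_qtls)
print(fraction)
```

Because QTL confidence intervals are typically much wider than TARs, the resulting fraction depends strongly on how the intervals are defined, which is one reason why concordance figures from different studies are not directly comparable.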
Evidence of positional concordance between trait-associated regions in humans and pigs
Gallardo et al. reported that there is a remarkable level of correspondence between lipid QTL found in humans and pigs. However, the resolution of this study was severely limited by the fact that QTL intervals were defined on the basis of 109 microsatellites spaced approximately every 20 cM. Comparison of orthologous relationships between TARs generated in our study and those published in the NHGRI GWAS Catalog revealed few concordances. The most obvious one affected the APOB gene, which maps to SSC3 (125 Mb) and Hsa2 (21 Mb) in pigs and humans, respectively. In the study of Teslovich et al., this locus showed pleiotropic effects on the lipid profile, being highly associated (P = 4 × 10^-114) with cholesterol and LDL levels. Another potential correspondence was detected for ANGPTL3 and DOCK7. Loss-of-function mutations in the ANGPTL3 gene are known to be associated with decreased levels of LDL, HDL and TRIG. The associations observed for the DOCK7 locus, which is involved in neurogenesis, myelination and axon formation but not in lipid metabolism, probably reflect the co-localization of this gene with ANGPTL3. The ABCA1 gene also lies close to the SSC1 (264-271 Mb) TAR for CHOL190 (only detected with PLINK), a result that makes sense from a biological point of view because this gene has a major role in cholesterol homeostasis.
Several considerations need to be taken into account to explain the limited concordance between human and porcine TARs. First, our Duroc commercial line is by no means representative of the whole porcine diversity, so it is quite possible that the analysis of further swine populations might uncover additional orthologous associations with humans. Besides, complex traits are known to have a considerable degree of genetic heterogeneity. A recent review highlighted that the level of correspondence between TARs observed in East Asians and Europeans, two populations that diverged 23 kya, ranged between 32% and 100%, with a mean of 65%. Moreover, a significant part of these shared European-East Asian associations was explained by different SNPs. Since humans and pigs diverged around 94 million years ago, it is reasonable to infer that the level of concordance of GWAS signals between species must necessarily be much lower.
Variation within several TARs is associated with the hepatic expression of lipid metabolism genes
We have discussed the genomic distribution and gene content of blood lipid TARs detected in a Duroc commercial line. Moreover, we have analysed the positional concordance of these TARs with previous data reported in pigs and in humans. In order to gain additional insights into the mechanisms that may explain the associations found, we have examined whether SNPs mapping to TARs are also associated with hepatic gene expression levels. Indeed, in a recent study Nicolae et al. concluded that TARs are mostly explained by the segregation of expression QTL (eQTL), thus suggesting that causal mutations exert their effects mainly through the regulation of gene expression. This approach allowed us to identify several genes related to lipid metabolism that deserve to be further explored (Table 5). For instance, SNPs within the SSC13 TAR for LDL45 were also associated with PARP2 mRNA expression (nominal P-value = 1.50 × 10^-7). Interestingly, the deletion of this gene leads to an increase in the accumulation of cholesterol in the liver by enhancing SREBP1 expression. Other genes of interest were SLC19A1, which in humans is associated with HDL levels; SYCP3, whose knockdown affects the expression of genes related to lipid metabolism; CISD2, which inhibits muscle fat infiltration; and DPP4, a gene that is overexpressed in the visceral fat of severely obese individuals. All of these associations involved trans-effects, where SNPs within TARs affect the expression of loci mapping to distant locations. According to Cheung et al., trans-eQTL are more abundant than those with cis-effects and they often involve interactions mediated by molecules other than transcription factors.