High-resolution association mapping of number of teats in pigs reveals regions controlling vertebral development

Background: Selection pressure on the number of teats has been applied to be able to provide enough teats for the increase in litter size in pigs. Although many QTL were reported, they cover large chromosomal regions and the functional mutations and their underlying biological mechanisms have not yet been identified. To gain a better insight into the genetic architecture of the trait number of teats, we performed a genome-wide association study by genotyping 936 Large White pigs using the Illumina PorcineSNP60 Beadchip. The analysis is based on deregressed breeding values to account for the dense family structure and a Bayesian approach for estimation of the SNP effects.
Results: The genome-wide association study resulted in 212 significant SNPs. In total, 39 QTL regions were defined including 170 SNPs on 13 Sus scrofa chromosomes (SSC) of which 5 regions on SSC7, 9, 10, 12 and 14 were highly significant. All significantly associated regions together explain 9.5% of the genetic variance where a QTL on SSC7 explains the most genetic variance (2.5%). For the five highly significant QTL regions, a search for candidate genes was performed. The most convincing candidate genes were VRTN and Prox2 on SSC7, MPP7, ARMC4, and MKX on SSC10, and vertebrae δ-EF1 on SSC12. All three of these QTL contain candidate genes which are known to be associated with vertebral development. In the new QTL regions on SSC9 and SSC14, no obvious candidate genes were identified.
Conclusions: Five major QTL were found at high resolution on SSC7, 9, 10, 12, and 14 of which the QTL on SSC9 and SSC14 are the first ones to be reported on these chromosomes. The significant SNPs found in this study could be used in selection to increase number of teats in pigs, so that the increasing number of live-born piglets can be nursed by the sow. This study points to common genetic mechanisms regulating number of vertebrae and number of teats.
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-542) contains supplementary material, which is available to authorized users.


Background
A favorable genetic trend for total number of piglets born has been observed in the last decade [1,2]. Therefore, selection pressure on the number of teats has been applied in order to provide enough teats for the larger litters. Given the heritable nature of variation in number of teats in sheep as described by Bell in 1898 [3], and cases of familial supernumerary breasts (polymastia) or nipples (polythelia) in humans [4], genomic loci affecting teat number must exist. Indeed, the use of Best Linear Unbiased Prediction (BLUP) for estimating breeding values (EBVs) from phenotypes of both females and males has resulted in an increase in the number of teats (data not shown), and heritability estimates are moderate, ranging from 0.2 to 0.47 [5,6]. Besides the use of quantitative genetics to select the best sows, many studies have used genetic markers (mainly microsatellites) to identify QTL (Quantitative Trait Loci). The QTL studies on number of teats (NTE) listed in the Pig QTL database [7,8] report QTL on all porcine chromosomes except SSC9, SSC13, SSC14, SSC18 and SSCY. Although many QTL were reported [9][10][11][12][13][14][15][16][17][18], they cover large regions and the functional mutations and underlying biological mechanisms have not yet been identified. Interestingly, some QTL for NTE seem to overlap with those for number of vertebrae [11].
Teats (or nipples) develop each as an appendage to previously formed mammary gland rudiments (MRs) during pre-natal life [19,20]. Therefore, the number of teats correlates with the number of MRs induced and maintained at least until teat formation. The number of mammary glands varies among mammalian species, but even in humans, who normally form one pair of breasts, there are at least 6 other positions that additional breasts can randomly occupy on either side of the body [21]. Their positions range from armpit (axilla) to groin (inguen), thus span the same region where pigs form their mammary glands and teats. On both sides lateral to the ventral midline, one can draw imaginary fluent lines from both axillae to both inguenae, called mammary lines or milk lines. During embryonic life, these lines exist as histologically and molecularly distinct bands in the surface ectoderm, connecting all positions where mammary glands may form on either side of the body in any given mammalian species [22][23][24].
Studies on genetically engineered mice (GEMs) have revealed some insights into the genetic and cellular mechanisms of mammary gland and nipple formation. Whereas wild type mice normally form five pairs of mammary glands along the mammary lines [22], modification of certain genes can alter this number [25]. For example, loss of either Nrg3, Pax3, Gli3, Fgf10, or Hoxc6 [26][27][28][29] abolishes the formation of different, gene-specific, subsets of MRs [25]. Deletion of e.g. p63 [30,31] or Tbx3 (T-box gene 3) [32] may abolish the formation of all five MRs, while reduction of Wnt signaling by means of deletion of e.g. Lef1 (Lymphoid enhancer factor 1) [33] or Pygo2 (Pygopus 2) allows MR induction but leads to MR regression prior to nipple formation [34]. Even if the MRs are maintained, nipple formation may not occur due to e.g. a lack of PTHrP (parathyroid hormone related peptide) signaling [35]. Conversely, ectodermal overexpression of the genes encoding the receptor ligands Eda-A1 (Ectodysplasin-A1) [36] or Nrg3 [37], or suppression of the Wnt-antagonists Lrp4 or Wise/Sostdc1, may lead to formation of one or several supernumerary mammary glands in a gene-specific pattern or region along the mammary line [38,39]. These data provide evidence for genetic determinants for the number of mammary glands, and moreover, for the positional variation in activity of, or requirement for, these genetic determinants along the mammary line [25].
Studies carried out in the 1960's on rabbit and mouse embryos had revealed that mammary gland formation in the surface ectoderm is initiated by factors in the dermal mesenchyme underlying the surface ectoderm [27,40,41]. The dermal mesenchyme is derived from the somites, which are also the precursor structures for the vertebrae and ribs. Interestingly, some of the genes mentioned above (e.g. Gli3, Fgf10, Nrg3, Pax3, Hoxc6), are in wild type mice expressed in the somites and/or in the dermal mesenchyme. It is now known that induction of the third pair of MRs in mice, MR3, depends on Gli3-mediated Fgf10 expression in the somites [27]. Somitic expression of Raldh2, an enzyme involved in retinoic acid synthesis, has also been associated with induction of MR3 [42]. With this relationship between somitic gene expression and the number of MRs induced in mouse embryos; and knowing that somites also give rise to the vertebrae, we may expect that some QTL and candidate genes for NTE in pigs are associated with the number of vertebrae in pigs, or vertebrae development in mammalian species in general.
The availability of the Illumina PorcineSNP60 Beadchip [43] allowed us to perform a Genome-Wide Association Study (GWAS) for NTE by genotyping 936 Large White pigs. The better coverage of the whole pig genome by high-density Single Nucleotide Polymorphisms (SNPs) on these Beadchips, combined with advanced statistical methods can improve fine mapping of specific QTL as demonstrated previously for other traits [44][45][46]. In the current study, we based our analysis on deregressed breeding values [47] to account for the dense family structure and a Bayesian approach for estimation of the SNP effects. A total of five highly significant QTL were identified at high resolution including new regions on Sus scrofa chromosome (SSC) 9 and SSC14. Interestingly, in three of these regions we identified genes regulating vertebrae development as candidate genes determining the number of teats.

Identification of QTL for teat number
In total 949 pigs were originally genotyped, but 13 animals were removed during quality control of the data. The remaining 936 animals had an average call rate of 0.995. The mean teat number was 15.3 (SD = 0.94) with a minimum of 14 and a maximum of 19 teats. A minimum of 14 teats is applied as a selection threshold in breeding decisions, which results in a slightly skewed distribution of the phenotype in this dataset. The (deregressed) EBVs are normally distributed (results not shown). The distribution of the weighting factors calculated for NTE to account for heterogeneous variances is shown in Figure 1. The two distributions observed reflect the difference between animals with and without offspring information. Animals with a weighting factor below 1 have no offspring, whereas animals with a weighting factor above 1 have on average 143 offspring with records on NTE. The estimated heritability was 0.42. The GWAS resulted in 212 SNPs with a BF > 10, of which 6 SNPs had a BF > 100 (Figure 2). In total 39 QTL regions were defined containing 170 SNPs on 13 chromosomes. One candidate QTL region showed an r² higher than 0.7 with another candidate QTL region, although the distance between them exceeded 2 Mb (4 Mb). These regions were combined into one region (QTL number 30 on SSC14). The 39 significantly associated regions explain 9.5% of the genetic variance, with QTL number 13 on SSC7 explaining most of the genetic variance (2.5%, see Table 1). The most significant SNP within this region is ALGA0122954. The allele substitution effect of the A allele (major allele) to the G allele constitutes an additional 0.21 teat on a phenotypic level. This is an indication for overdominance because the heterozygous animals have on average 15.45 teats compared to 15.16 for AA animals and 15.37 for GG animals. The second largest QTL, located on SSC12, explains more than 1% of the genetic variance, with ALGA0066876 being the SNP which explains most of the variance within this region. The allele substitution effect of the A allele (major allele) to the G allele is almost 0.09 teat, indicating that the effect is completely additive (see Table 1). In total there are 14 QTL which each explain more than 0.2% of the genetic variance. In Figure 3, the explained variance per chromosome is shown. SSC7, which contains the largest QTL, is also the chromosome which explains most of the genetic variance. The second largest contributing chromosome is SSC1, but this is only due to the large number of SNPs each contributing a small variance (σ²_g0), because SSC1 is by far the longest autosome in pigs. The total attributed SNP variance when placed in the null distribution was 69%.
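The overdominance argument for ALGA0122954 can be checked with a short calculation. The sketch below uses the three genotype means quoted in the text and the textbook quantitative-genetic definitions of the additive effect (a) and dominance deviation (d); the variable names are illustrative and this is not the authors' analysis code.

```python
# Genotype means for ALGA0122954 as quoted in the text (number of teats).
mean_AA = 15.16
mean_AG = 15.45
mean_GG = 15.37

# Additive effect a: half the difference between the two homozygotes.
a = (mean_GG - mean_AA) / 2
# Dominance deviation d: heterozygote mean minus the homozygote midpoint.
d = mean_AG - (mean_AA + mean_GG) / 2
# Degree of dominance: a ratio above 1 means the heterozygote lies
# outside the range spanned by the two homozygotes (overdominance).
degree = d / a

print(f"a = {a:.3f}, d = {d:.3f}, d/a = {degree:.2f}")
```

Because the heterozygote mean (15.45) exceeds both homozygote means, d is larger than a, which is the pattern described as overdominance above.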

Gene identification within QTL regions
Within the QTL regions, 489 genes (unique Ensembl Gene IDs) were mapped (see Additional file 1 for the genes). For the five highly significant QTL regions on SSC7, 9, 10, 12 and 14 with a BF near or larger than 100, a search for candidate genes was performed. On SSC7 between 102 and 105 Mb, a gene named Vertnin (VRTN), encoding a potential DNA binding factor, is located at 103.45 Mb. The number-increase allele (Q) of this gene was shown to add an additional thoracic segment to the pig compared to the wild-type (wt) [48]. Among the 14 genes annotated to the QTL area on SSC12 explaining more than 1% of the genetic variance is δ-EF1, mapping to 51.96 Mb. We consider this the most likely candidate gene, because it encodes a transcriptional repressor involved in skeletal development [49]. On SSC10, three candidate genes were identified between 52.21 and 53.94 Mb. Of these, MPP7 is located at 53.01 Mb. It encodes a MAGUK (peripheral membrane-associated guanylate kinases) p55 subfamily member 7-like transcript, which is required for the establishment of cell polarity in the developing Drosophila embryo [50]. Also in cultured cells, MPP7 promotes epithelial cell polarity and tight junction formation [51].
Figure 1 Distribution of the weights of the breeding values calculated according to the methodology by Garrick et al. [47]. The x-axis shows the weighting factors used to account for differences in phenotypic information density for each animal.
Table 1 footnotes: (1) The position in Mb of the significant (BF >10) left and right flanking markers. (2) The allele substitution effect is the regression coefficient of the most significant SNP of the QTL on number of teats corrected for fixed effects (sex of the animal and farm). The minor allele is counted. (3) The genetic variance explained by the QTL region expressed in %.

Strength of GWAS methodology
In the present study, a Bayesian Variable Selection approach was used to estimate SNP effects. With a relatively stringent prior (π1 = 0.001), on average 42 SNPs per cycle are selected to have an effect on the trait. To ensure that many SNPs were sampled in the π1 distribution, a large number of cycles was run (500,000). The advantage is that fewer SNPs are given a large variance and therefore we expect a clearer distinction between SNPs with a small effect and SNPs with a larger effect [52]. Instead of a sliding window of N SNPs, which is often used to account for linkage disequilibrium (LD) [47], regions were defined for post-analyses based on distances between significant SNPs (<2 Mb) and LD. The number of SNPs within a region varied from a single SNP up to 16 SNPs. The defined region was used to simultaneously estimate the explained variance of the SNPs within the QTL region. The use of deregressed EBVs should also give more reliable results from the GWAS, resulting in a better estimation of the size of the SNP effects [53]. Using only the phenotypes of the genotyped animals in a GWAS, without considering the information of offspring, parents and other family members, reduces the power of the study. Deregressed EBVs have properties similar to the daughter yield deviations (DYD) that are often used in GWAS in dairy cattle breeding [54]. Deregressing breeding values is used to avoid selecting SNPs which explain family relatedness rather than associated genes. This is done by removing the contribution of information from relatives.

Reliability of identified QTL areas
Figure 3 caption (fragment): [...] Table 1 are shown in blue (σ²_QTL). The variances explained by SNPs when placed in the second distribution (π1) are shown in orange (σ²_g1). The variances explained by SNPs when placed in the null distribution (π0) are shown in grey (σ²_g0). The variances of the SNPs were summed per chromosome.
In total 39 QTL regions were identified with relatively small effects, which suggests that NTE is controlled by many genes. This is in agreement with earlier linkage studies, in which many QTL were found on almost all chromosomes across different pig populations.
Including the results from this study and published QTL, all chromosomes with the exception of SSC13 and the Y chromosome carry QTL for teat number. Figure 3 also clearly shows that some QTL found in this study only have a limited contribution compared to the genetic variance explained by SNPs which were in the null distribution (π0). This helps to distinguish whether the variance explained per chromosome is expected because SNPs have a small variance, or alternatively, whether there is a large QTL contributing to the chromosomal genetic variance. The expected variance explained per chromosome is proportional to the number of SNPs on the chromosome, under the assumption that no QTL is located on the chromosome, as for example on SSC13. The difference between the total chromosomal genetic variance and the expected variance when SNPs are in the null distribution plus the QTL variance can be caused by SNPs that were selected (π1) but did not reach the significance level to be assigned to a QTL region. The genetic variance explained by the defined QTL region on SSC7 might even be an underestimate, due to SNPs which are in LD with the QTL region but did not reach the significance threshold of a BF >10. In general, the variance explained by the identified QTL regions is small. Several factors could underlie this relatively small explained variance. The size of this study is moderate for livestock and small compared to human studies [55]. A larger sample size could pick up more rare variants. In addition, a larger SNP set (>500,000) would place SNPs closer to the causative mutation and give more statistical power to the association study [55]. The trait NTE has also been under selection for at least 10 years (E.F. Knol, personal observation), which could have resulted in rapid fixation of large segregating QTL. Additionally, NTE is a categorical trait and does not directly measure the variation of the underlying physiological factors of the developmental signalling chain. Most of the QTL regions identified in this study reside within published QTL for NTE on several chromosomes (SSC2, 3, 7, 8, 10, 11, 12, 15 and 16). In Figure 4, a detailed overview of all the published QTL compared to the results from this study is shown (see Additional file 2 for the details of the studies published on NTE). Overall, the regions identified in this study are much smaller, as a result of using high-density SNP information in a GWAS instead of microsatellites as in a linkage study.
Especially on SSC7, 8, 10 and 12 some QTL are located relatively close to each other. We identify them as small individual QTL regions rather than one large QTL, because the LD (measured by r²) between the SNPs is less than 0.7. Apart from region 30 on SSC14, the r² between closely located QTL was always below 0.2, which suggests the QTL should be considered as independent genomic regions.
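To make the LD criterion concrete, the following sketch computes r² between two SNPs as the squared Pearson correlation of their 0/1/2 genotype codes. This genotype-based estimator is an assumption, since the text does not state how r² was computed, and the function name `ld_r2` and the example genotypes are hypothetical.

```python
from statistics import mean

def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two SNPs coded as 0/1/2 allele counts."""
    if len(geno_a) != len(geno_b) or len(geno_a) < 2:
        raise ValueError("need two equally long genotype vectors")
    mean_a, mean_b = mean(geno_a), mean(geno_b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(geno_a, geno_b))
    ss_a = sum((x - mean_a) ** 2 for x in geno_a)
    ss_b = sum((y - mean_b) ** 2 for y in geno_b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return cov * cov / (ss_a * ss_b)

# Hypothetical genotypes for six animals at three SNPs.
snp1 = [0, 1, 2, 1, 0, 2]
snp2 = [0, 1, 2, 1, 0, 2]  # identical to snp1, so in complete LD
snp3 = [1, 1, 1, 0, 2, 1]  # only weakly correlated with snp1

print(ld_r2(snp1, snp2))
print(ld_r2(snp1, snp3))
```

Under the thresholds used in the text, the first pair (r² above 0.7) would be treated as one region, and the second pair (r² below 0.2) as independent regions.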

Candidate QTL and genes
The QTL which explains the most genetic variation is located on SSC7 and has been identified in other studies [9,11,12,17,18]. While Mikawa et al. [48] reported an additive effect of this QTL on number of vertebrae, we observed a dominant or even overdominant effect on the number of teats in our population. This is in accordance with a previous study reporting that breeding for increased body length also resulted in increased teat number, leading to the speculation that the number of vertebrae is genetically linked to the number of teats [11]. Within this QTL we identified a gene named Vertnin (VRTN) at 103.45 Mb, which likely provides such a genetic link. The likeliness of Vertnin's candidature is strongly supported by the previous demonstration that structural integrity of the somites, as well as somitic expression of genes such as Gli3 and Fgf10, are required for the proper formation of the mammary line and glands at the axial level of thoracic/abdominal transition in mice [27]. Mechanistically, the genetic linkage between numbers of vertebrae and teats can be explained by the somites being precursors for vertebrae as well as dermal mesenchyme, the latter inducing mammary gland formation in the overlying surface ectoderm [27,39,40].
Another candidate gene in the region is Prox2, a vertebrate homolog of the Drosophila homeobox-containing gene prospero. Prox2 belongs to a family of transcription factors whose function has not yet been characterized in detail in mammals. In zebrafish, Prox2 is mainly expressed in the brain and involved in eye development [56]. Notably, both Vertnin and Prox2 have recently been proposed as candidates for the number of vertebrae in a White Duroc x Chinese Erhualian intercross that also carries two different haplotypes for increased teat number [57].
The second largest QTL effect explaining more than 1% of the genetic variance is located on SSC12 between 51.95 and 52.62 Mb. Although many studies have found a QTL for NTE on SSC12 [9,11,12,17,18], only Guo et al. [12] found a large QTL interval similar to ours in an F2 cross between Large White and Chinese Meishan. Within this region, we find the δ-EF1 gene is of particular interest because its expression in the somites [58] may suggest a role in vertebrae and mammary development in a mechanism similar to Vertnin, as described above.
On SSC10, we found a significant QTL in the same region as in other studies [12,13,15] or neighbouring a region found in other studies [16,59]. Between 52.21 and 53.94 Mb within this region, we identified three candidate genes, namely MPP7, ARMC4, and MKX. In Drosophila and in cultured cells, MPP7 promotes cell polarity and tight junction formation [50,51]. While a role for MPP7 and tight junctions in early mammary gland development has not yet been studied, cell polarization and tissue stratification are an integral part of mammary induction and early growth, as observed in mice [23], which supports MPP7 as a candidate for involvement in establishing gland/teat number. As a member of the β-catenin gene family, ARMC4 could be involved in mammary gland formation via its role in transcriptional transduction of Wnt/β-catenin signaling. A role for this signaling pathway in early mammary gland formation is demonstrated by compromised mammary gland formation in the presence of pathway-inhibitory mutations, and supernumerary mammary gland formation in the context of excessive Wnt signaling [33,34,[60][61][62][63]. Alternatively, Wnt signaling is implicated downstream of PTHrP signaling in nipple formation, which is a process that occurs secondary to the formation of mammary rudiments [64].
Most convincing, the third gene MKX is in mice expressed in the condensing mesenchyme that will ultimately become the proximal ribs and vertebral bodies [65].
The elongation of the back and an increased number of vertebrae in pigs as a consequence of domestication was already observed by Charles Darwin [67]. Signatures of selection have also been found for numbers of vertebrae and body length in the domestic pig [68] and for number of teats [69]. In accordance, some QTL for number of vertebrae and number of teats in pigs overlap [11]. A mutation increasing number of teats in the gene NR6A1 (an orphan nuclear receptor) on SSC1 has been fixed in commercial breeds [70] and therefore, this QTL is not segregating in our population either. Nevertheless, selection for increased carcass length still provides variation for genetic improvement of reproductive traits such as number of teats in the sow, which is in turn very relevant for the survival of piglets.
Mechanistically, the link between carcass length (number of vertebrae) and number of teats can be explained by results from studies in mice, revealing a role for the somites, i.e. precursor structures for vertebrae, in the induction of mammary gland development [22,25,27]. Although to our knowledge variation in the number of somites (vertebrae) has not been studied in mice, and certainly not in relation to the number of mammary glands or nipples/teats, it is clear that altered somitic development or gene expression can alter the number of mammary glands (thus nipples/teats) in mice [22,25,27]. In agreement with this biological mechanism of mammary gland development, we identified in our current study several candidate genes with a known association to vertebrae development. To date, none of these genes have a reported role in mammary gland or nipple/teat development. Studies of such a role could certainly be helpful to get closer to the causative genetic variant which can be used in breeding programs for increased NTE in pigs.

Conclusion
Although NTE has been under selection for many generations, this study found many QTL controlling NTE. We could considerably narrow down some of the earlier published QTL regions, making it easier to select candidate genes. Five major QTL were found at high resolution on SSC7, 9, 10, 12, and 14, of which the QTL on SSC9 and SSC14 are the first ones to be reported on these chromosomes. The confirmed major QTL found on SSC7 contains the candidate gene Vertnin, which has been reported to control the number of thoracic vertebrae. Interestingly, the two other regions on SSC12 and SSC10 also contain genes that have a suspected (δ-EF1) or demonstrated (MKX) involvement in vertebrae development. The genetic relation between teat number and number of vertebrae can be explained by the somites being precursors for both vertebrae and dermal mesenchyme, while both the somites and dermal mesenchyme have been shown to contain inductive signals for mammary gland formation. All significant QTL together explain almost 10% of the genetic variance. Nevertheless, the results clearly show the polygenic nature and genetic complexity of the trait. The significant SNPs found in this study could be used in selection to increase NTE in pigs, so that the increasing number of live-born piglets can be nursed by the sow.

Animals
This study was conducted strictly in line with the regulations of the Dutch law on the protection of animals (Gezondheids- en welzijnswet voor dieren). Animal Care and Use Committee approval was not obtained for this study, because the data were obtained from an existing database. Phenotypic measurements of the number of teats (NTE) were obtained from 936 pigs, of which 230 were boars and 706 sows. All pigs were purebred Large White, were born between 2006 and 2009 and originated from 17 farms. The number of teats was counted at birth and recorded on both sexes. In this study, NTE was the only recorded trait; the number of left and right teats and teat malformations were not considered.

Genotyping and SNP quality
Genotyping was performed using the Illumina PorcineSNP60 Beadchip. Samples collected for DNA extraction were used only for routine diagnostic purposes of the breeding program, and sampling was strictly in line with the Dutch law on the protection of animals (Gezondheids- en welzijnswet voor dieren). DNA was extracted from blood, hair and ear punches and commercially genotyped at Service XS (Leiden, The Netherlands) or at Geneseek, Inc. (Lincoln, NE, USA). SNPs with a GenCall score <0.7, call rate <95%, minor allele frequency <0.01 and SNPs with no physical position on the pig genome (pig genome build10.2) were removed. After these quality control measures, 42,654 SNPs out of 64,232 SNPs remained for the genome-wide association (GWA).
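The SNP quality-control rules above can be expressed as a simple filter. The sketch below is illustrative only: the record layout (a dict with `gencall`, `call_rate`, `maf` and `position` keys), the function name `passes_qc` and the example SNPs are hypothetical, and only the thresholds are taken from the text.

```python
def passes_qc(snp):
    """Keep a SNP only if none of the removal criteria from the text applies."""
    return (
        snp["gencall"] >= 0.7             # removed if GenCall score < 0.7
        and snp["call_rate"] >= 0.95      # removed if call rate < 95%
        and snp["maf"] >= 0.01            # removed if minor allele frequency < 0.01
        and snp["position"] is not None   # removed if unmapped on build 10.2
    )

# Hypothetical SNP records, one failing each criterion and one passing all.
snps = [
    {"name": "snp_ok", "gencall": 0.85, "call_rate": 0.99, "maf": 0.25, "position": 103_450_000},
    {"name": "snp_low_gencall", "gencall": 0.60, "call_rate": 0.99, "maf": 0.25, "position": 1_000},
    {"name": "snp_low_callrate", "gencall": 0.85, "call_rate": 0.90, "maf": 0.25, "position": 2_000},
    {"name": "snp_rare", "gencall": 0.85, "call_rate": 0.99, "maf": 0.005, "position": 3_000},
    {"name": "snp_unmapped", "gencall": 0.85, "call_rate": 0.99, "maf": 0.25, "position": None},
]

kept = [snp["name"] for snp in snps if passes_qc(snp)]
print(kept)
```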

Statistical method for GWA analyses
The estimated breeding value (EBV) for every animal was obtained via routine genetic evaluation using MiXBLUP [71] in a multitrait model. The model for obtaining the EBV for NTE included fixed effects for herd-year-season, sex and line and an additive genetic effect (animal) as a random effect. Reliabilities per animal were extracted from the genetic evaluation and were based on the methodology of Tier and Meyer [72]. The EBV was deregressed using the methodology proposed by Garrick et al. [47].
The deregressed EBV only contains information on the animals' own performance and that of their descendants, which was achieved by removing the parent average. The reliabilities (information sources) vary considerably between animals and therefore the deregressed EBVs have heterogeneous variances. This is resolved by weighting the residuals as in Garrick et al. [47]. Deregression of the EBV was applied to account for the dense family structure in the data and the large difference in the number of information sources, and to avoid double counting.
A Bayesian Variable Selection model [73] was fitted for NTE by estimating the marker effects with all SNPs simultaneously in the model:
$$ y = \mu + X\beta + e $$
where y is an n-vector of phenotypes on n animals, μ is an n-vector equal to the mean, X is an n by p matrix in which the p SNPs are coded as 0, 1, or 2 copies of a particular allele, and β is a p-vector with the marker effects. A Bernoulli distribution is applied to the marker effects:
$$ \beta_j \sim \begin{cases} N(0, \sigma^2_{g_0}) & \text{with probability } \pi_0 \\ N(0, \sigma^2_{g_1}) & \text{with probability } \pi_1 = 1 - \pi_0 \end{cases} $$
where the first distribution is referred to as the null distribution, in which SNPs are assumed to have small effects (σ²_g0), and the second distribution contains SNPs assumed to have a large effect explaining a large variance (σ²_g1) of the phenotype. The probability to be in the null distribution (π0) was set to 0.999, meaning that only 1 in every 1000 SNPs will be in the second distribution, which is on average 42 SNPs per cycle. The term e is an n-vector with random residual effects assumed to be normally distributed but weighted, $e \sim N(0, \sigma^2_e W)$, where W is a diagonal matrix with elements w_1, …, w_n. The model was implemented in the program Bayz [74].
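The figure of about 42 SNPs per cycle follows directly from the prior and the number of SNPs that passed quality control. A minimal arithmetic check, with illustrative variable names:

```python
n_snps = 42_654   # SNPs remaining after quality control (from the text)
pi_0 = 0.999      # prior probability of the null (small-effect) distribution
pi_1 = 1 - pi_0   # prior probability of the second (large-effect) distribution

# Prior expectation of the number of SNPs in the second distribution per cycle.
expected_large = n_snps * pi_1
print(f"expected SNPs in the second distribution per cycle: {expected_large:.1f}")
```

This is a prior expectation only; the number actually sampled in the second distribution varies from cycle to cycle.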
A total of 500,000 MCMC cycles were run with a burn-in of 5,000 cycles, and a Metropolis-Hastings sampler was applied to obtain good convergence, which was assessed by visual inspection of the trace and by Gelman and Rubin's convergence diagnostic based on the deviance [75], using the R package CODA [76].
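For readers unfamiliar with the Gelman and Rubin diagnostic, the sketch below computes its basic form (the potential scale reduction factor) for several chains of a scalar quantity such as the deviance. It is a simplified illustration in Python, not the CODA implementation used in the study, and it omits refinements that some implementations apply; the function name `gelman_rubin` and the example chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for m chains of equal length n."""
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(chain) for chain in chains]
    # W: average within-chain variance.
    w = mean([variance(chain) for chain in chains])
    # B: between-chain variance of the chain means, scaled by chain length.
    b = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * w + b / n
    return sqrt(var_hat / w)

# Hypothetical traces: two chains that agree, and two that clearly do not.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

print(gelman_rubin(agreeing))
print(gelman_rubin(disagreeing))
```

Values close to 1 indicate that the chains are sampling the same distribution, while values well above 1 (as for the second example) indicate a lack of convergence.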

Identification of associated regions
To determine which SNPs are significantly associated, a Bayes Factor (BF) was calculated for every SNP using the prior probabilities (π0 and π1) and the posterior probability (p_i) by calculating an odds ratio as:
$$ BF = \frac{p_i / (1 - p_i)}{\pi_1 / \pi_0} $$
A BF >10 is referred to as 'strong' and a value above 100 as 'decisive' [77].
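As an illustration of this odds ratio, the sketch below assumes that the Bayes Factor is the posterior odds of a SNP being in the second distribution divided by the corresponding prior odds (π1/π0). The function names and the example posterior probabilities are hypothetical; only the prior (π1 = 0.001) and the thresholds of 10 and 100 come from the text.

```python
def bayes_factor(posterior_prob, pi_1=0.001):
    """Posterior odds divided by prior odds of being in the second distribution."""
    if not 0 <= posterior_prob < 1:
        raise ValueError("posterior probability must be in [0, 1)")
    posterior_odds = posterior_prob / (1 - posterior_prob)
    prior_odds = pi_1 / (1 - pi_1)
    return posterior_odds / prior_odds

def evidence_label(bf):
    """Classify a Bayes Factor using the thresholds quoted in the text."""
    if bf > 100:
        return "decisive"
    if bf > 10:
        return "strong"
    return "not significant"

# Hypothetical posterior inclusion probabilities for three SNPs.
for p in (0.005, 0.01, 0.1):
    bf = bayes_factor(p)
    print(f"p_i = {p}: BF = {bf:.1f} ({evidence_label(bf)})")
```

With such a stringent prior, a posterior probability of only about 1% already corresponds to a BF just above 10.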
When at least two SNPs in a region (<2 Mb) showed a BF >10, this region was defined as a candidate QTL region and for regions with a BF around or over 100, a gene search was conducted. To define a QTL for NTE in this analysis, linkage disequilibrium (LD) was taken into account. When r² was >0.7 between two QTL regions, but the distance was larger than 2 Mb, the regions were still combined into one common region.
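The distance-based part of this region definition can be sketched as follows. The code assumes that consecutive significant SNPs on the same chromosome lying less than 2 Mb apart are chained into one region; this is one reading of the rule, not necessarily the authors' exact procedure. The LD-based merging of distant regions (r² > 0.7) is not implemented, and the function name and example SNPs are hypothetical.

```python
MAX_GAP_BP = 2_000_000   # significant SNPs closer than this are chained together
BF_THRESHOLD = 10        # significance threshold on the Bayes Factor
MIN_SNPS = 2             # a candidate region needs at least two significant SNPs

def candidate_regions(snps):
    """snps: iterable of (chromosome, position_bp, bayes_factor) tuples.

    Returns a list of (chromosome, start_bp, end_bp, n_snps) tuples.
    """
    significant = sorted((chrom, pos) for chrom, pos, bf in snps if bf > BF_THRESHOLD)
    regions = []
    current = []

    def close(cluster):
        # Store the cluster as a region only if it holds enough SNPs.
        if len(cluster) >= MIN_SNPS:
            regions.append((cluster[0][0], cluster[0][1], cluster[-1][1], len(cluster)))

    for chrom, pos in significant:
        if current:
            last_chrom, last_pos = current[-1]
            if chrom != last_chrom or pos - last_pos >= MAX_GAP_BP:
                close(current)
                current = []
        current.append((chrom, pos))
    close(current)
    return regions

# Hypothetical SNPs: (chromosome, position in bp, Bayes Factor).
example = [
    ("SSC7", 102_500_000, 150), ("SSC7", 103_400_000, 40), ("SSC7", 104_900_000, 12),
    ("SSC7", 110_000_000, 15),   # significant but isolated, so not a region
    ("SSC12", 51_950_000, 30), ("SSC12", 52_620_000, 11),
    ("SSC12", 52_000_000, 3),    # below the BF threshold, ignored
]

for region in candidate_regions(example):
    print(region)
```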
The variance explained by the QTL region was calculated by simultaneously estimating the variance explained by all the regions and all the other non-significant SNPs. To get insight in the chromosomal partitioning of the genetic variance, SNPs within a QTL region (BF > 10) and SNPs on different chromosomes were placed in different groups. The sum of the variance explained per chromosome was the sum of variances of the QTL (if detected) on a chromosome plus the explained variance by the other SNPs on the chromosome. The variance expected per chromosome was calculated as the average variance explained by SNPs when placed in the null distribution (π 0 ).

QTL comparisons and candidate genes
All earlier reported QTL found for NTE were available at the PigQTLdb (http://www.animalgenome.org/QTLdb/pig.html). Flanking markers of the QTL were searched in the reference genome (build10.2) to find the physical position of the markers. If one of the markers was not found, a BLAST search was performed. If the physical position of a marker could not be identified, the closest neighboring marker according to the linkage map from MARC USDA (http://www.marc.usda.gov/genome) was mapped. When none of these markers could be placed on the physical map, the QTL was not included in the comparison.
For gene searches, the left and right flanking markers of the defined candidate QTL regions were used. Pig genome build10.2 was used for the position of the SNPs. Gene annotation for QTL regions was performed with BIOMART software in Ensembl Sscrofa 10.2 (http://www.ensembl.org). Ensembl Gene IDs were used to count the number of genes within the QTL regions.

Additional files
Additional file 1: All 489 genes found in QTL regions as reported in Table 1. The genes are mapped on build10.2. For each gene, the Ensembl gene ID, the start and end of the gene, the status of the transcript and gene and, when known, the gene name and function are given.

Competing interest
Although TOPIGS Research Center B.V., employer of ND, EFK and BH, is a research institute closely related to one of the funders (TOPIGS), the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors declare that they have no competing interests.
Authors' contributions
ND was involved in conducting the statistical analyses, prepared figures and tables, and drafted the paper. JMV was a discussion partner and involved in writing the paper. BH was involved in the selection of the candidate genes and writing the paper. All authors have read and approved the final manuscript.