This GWAS identified 46 chromosomal regions affecting the four analysed boar taint traits. The highly associated SNPs around each QTL peak were used to identify haplotypes describing the genetic variation affecting the QTL. The SNP sets were analysed with a probability-based phasing algorithm to infer haplotypes. Of the 46 suggestive QTL, 28 were confirmed by a haplotype analysis that contrasted the most abundant haplotype with the remaining haplotypes in an additive effects model. The QTL and haplotype analyses were performed separately in the three Danish breeds Duroc, Landrace and Yorkshire. A total of 10 haplotypes were found to significantly affect androstenone, two to affect pure skatole, six to affect pure indole and seven to affect a skatole equivalent (S/I index) containing both skatole and indole. In addition, slaughter weight and meat content were analysed and were in some cases found to be significantly affected by the QTL haplotype. By including slaughter weight and meat content as covariates in the mixed model, the fixed effects for the haplotypes were adjusted. The sample size of each breed was limited (132 to 331). Therefore, the QTL discovered here should be validated using larger samples of the same populations. However, the number of half-sib families analysed resembles the number found in a previous study. Also, as at most two offspring from each nucleus family were included, we are convinced that a large part of the genetic variation within each breed was represented in the data. In addition, by introducing haplotypes capturing the total genetic variance within the QTL region, and not only within haplotype blocks split by the four-gamete rule as is a more conventional approach, it was possible to select and combine information from markers likely to be in strong linkage disequilibrium with whatever regulates the trait. These two factors in combination might explain the relatively large effects found in this study.
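The contrast between the most abundant haplotype and the remaining haplotypes can be illustrated with a minimal sketch. It is not the mixed model used in this study: it assumes already phased haplotype pairs, codes each animal by its number of copies (0, 1 or 2) of the most abundant haplotype, and estimates the per-copy additive effect by ordinary least squares. The function names `haplotype_dosage` and `additive_effect` and the toy data are hypothetical and serve illustration only.

```python
from collections import Counter

import numpy as np


def haplotype_dosage(hap_pairs):
    """Return the most abundant haplotype and each animal's copy number (0, 1, 2).

    hap_pairs: list of (haplotype, haplotype) tuples, one tuple per animal,
    assumed to come from an upstream phasing step (not shown here).
    """
    counts = Counter(h for pair in hap_pairs for h in pair)
    top = counts.most_common(1)[0][0]
    dosage = np.array([sum(h == top for h in pair) for pair in hap_pairs],
                      dtype=float)
    return top, dosage


def additive_effect(dosage, phenotype):
    """Least-squares estimate of the per-copy (additive) haplotype effect.

    Simplification: fixed effects only, no random animal or family effect.
    """
    dosage = np.asarray(dosage, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    X = np.column_stack([np.ones_like(dosage), dosage])  # intercept + dosage
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    return beta[1]


# Toy example with made-up haplotype labels and trait values
pairs = [("A", "A"), ("A", "B"), ("B", "C"), ("A", "C")]
top, dosage = haplotype_dosage(pairs)
trait = 1.0 + 0.5 * dosage  # noiseless trait with a per-copy effect of 0.5
effect = additive_effect(dosage, trait)
```

In this coding, a negative estimate means that animals carrying more copies of the most abundant haplotype have a lower trait value on average.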
Previous studies of QTL affecting androstenone in fat identified genomic regions on SSC 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, 15 and 18 [7, 29, 32, 33]. In the current study, a substantial number of genome regions affecting fat androstenone were identified in the three purebred populations analysed. The affected chromosomal regions that were confirmed by the haplotype analysis were located on the following chromosomes: 2, 3, 5, 6, 7, 11, 12 and 14.
Few studies have been conducted to identify genome regions affecting skatole in fat. However, the two most recent studies by Grindflek et al. (2011) [7, 33] detected QTL regions affecting both skatole and indole on nine different chromosomes. In total, ten chromosomes have been identified as harbouring QTL affecting skatole, namely SSC 1, 2, 3, 5, 6, 7, 10, 11, 13 and 14 [7, 29, 30, 33]. In the current study, regions affecting skatole were identified on SSC 3 and 9. For indole, the current study identified seven chromosomal regions affecting the trait. These were located on SSC 6, 10, 11, 14 and 15. In addition, regions affecting the S/I index were found on SSC 5, 6, 7, 8 and 14.
Analysis within and between breeds
The QTL affecting indole and the S/I index, identified on SSC 14 in L, were the only QTL found to overlap within a breed. As indole is part of the S/I index, an overlap in QTL region between these two traits is not surprising. Overlaps were also observed between breeds, mostly within the same trait. The QTL on SSC 12, found to affect androstenone, was segregating in all breeds, but it was only explained by a single SNP in L. An overlap could also be observed between the QTL for androstenone in D and the QTL for the S/I index in Y on SSC 6. In addition, the QTL on SSC 14 previously described in L also overlaps a QTL within the same region in Y. The relatively small overlap in QTL could be expected given the different distributions of the boar taint traits observed within the three breeds and the possibility of fixed alleles.
Haplotype analysis in relation to androstenone
A suggestive QTL reducing the level of androstenone was observed in D on SSC 3 and was confirmed by the haplotype analysis. The most frequent haplotype showed a fixed effect of 0.844 μg/g, which is relatively small compared to other effects identified in the present study. However, as the frequency of the haplotype is around 34%, it is possible to increase the population frequency by selection. One of the genes previously identified as being involved in androstenone biosynthesis was located within the QTL region, namely ST5AR2 (steroid 5-alpha-reductase 2), also called SRD5A2. A number of steroids are catabolised by this enzyme, along with SRD5A1, into their 5α-reduced metabolites. In addition, SRD5A2 is involved in androstenone formation by catalysing the final step from 4,16-androstadien-3-one. Variation of SNPs within the SRD5A2 3'UTR has previously been shown by a haplotype analysis to be associated with plasma levels of androstenone in a Norwegian Landrace population. As a low androstenone level was associated with a low level of estrone sulphate, the authors concluded that this haplotype is less desirable for selection purposes because of potentially reduced reproduction.
On SSC 6, the QTL identified to affect androstenone in Duroc overlaps the one identified in a Norwegian Duroc population. To maintain the level of fertility in the population, the authors of that study also investigated how the QTL would affect the levels of testosterone and estrogens. They found that most of the QTL shown to affect androstenone also affected one or more of the other hormones, except the one identified on SSC 6. In the current study, the fixed effect was found to be 0.775 μg/g and highly significant, and the haplotype was rather abundant (56%). An analysis of the proportion of phenotypic variance explained by the haplotype showed that it explained around 11.7% of the total phenotypic variance. High levels of variance explained for SNPs in a number of different genes have previously been described in a Norwegian Duroc population. As 99.4% of the animals were included in the test and only four haplotypes were identified within this QTL, the haplotype appears to be a possible candidate for use in breeding efforts.
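How a proportion of explained phenotypic variance can be obtained is sketched below. The sketch rests on an assumption: it uses the coefficient of determination (R²) from a simple regression of the trait on haplotype copy number, which is not necessarily the estimator applied in this study. The function name `variance_explained` and the example values are hypothetical.

```python
import numpy as np


def variance_explained(dosage, phenotype):
    """Share of phenotypic variance captured by haplotype copy number (R^2).

    Assumption: simple linear regression with an intercept and no random
    effects or further covariates.
    """
    dosage = np.asarray(dosage, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    X = np.column_stack([np.ones_like(dosage), dosage])
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    residual = phenotype - X @ beta
    # R^2 = 1 - residual variance / total phenotypic variance
    return 1.0 - residual.var() / phenotype.var()


# Made-up copy numbers and trait values for illustration
r2 = variance_explained([0, 1, 2, 1, 0, 2], [0.9, 1.4, 2.1, 1.6, 1.1, 1.9])
```

Multiplying the returned value by 100 gives a percentage comparable in form to the figures quoted in the text.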
The QTL for androstenone identified on SSC 14 in D was also confirmed by the haplotype analysis. The haplotype showed a relatively large fixed effect (1.183 μg/g) that explained around 8.1% of the total variance. A QTL analysis conducted in a Meishan × Large White population revealed a genome-wide significant peak within the same area. In the current study the peak SNP was found to be located within an intron of the gene LOC100518755 (polypeptide N-acetylgalactosaminyltransferase-like 6-like). Previously, NAT12 (N-acetyltransferase 12), which is involved in phase II metabolism in the liver, has been shown to be differentially expressed between Duroc and Norwegian Landrace animals with extremely high and extremely low androstenone levels.
The QTL identified in D on SSC 12 that regulates the level of androstenone was also identified in Y and by a single SNP in L. All suggestive QTL were confirmed by the haplotype analysis. In D the most abundant haplotype (51.5%) showed a fixed effect (1.29 μg/g), which was the highest effect on androstenone in the entire study. Further investigation revealed that this particular haplotype explained about 13.2% of the total variance. The fixed effect in Y and L was not as high, 0.827 and 0.945, respectively, and no significant result on the variance could be achieved for these haplotypes.
In L, four additional QTL regions affecting the level of androstenone were confirmed by the haplotype analysis. On SSC 7 a very broad region spanning 38,292 kb was identified. In the study by Grindflek et al., three QTL regions were identified on this chromosome. One of these QTL, referred to as 7a in their paper, overlaps the region found here. However, the region identified in this study also overlaps another QTL previously detected. In that earlier work the CYP21 gene was selected as a candidate gene and further analysed, but no SNP in the coding part of the CYP21 gene could explain the association. The CYP21 gene is involved in formation of 4,16-androstadien-3-one from progesterone. In this study we found that the SNP UMB10000108, which was located within an intron of the CYP21A2 (cytochrome P450, family 21, subfamily A, polypeptide 2) gene, was associated with the QTL (p = 0.0017). The gene CYP21 has previously been shown to be downregulated in high androstenone Norwegian Landrace boars.
On SSC 1 in L we identified another suggestive QTL. In this case the analysis of the most common haplotype within the QTL did not show a significant association between the haplotype and the trait. However, the second most frequent haplotype seemed to deviate more from the total mean and hence was analysed instead. This haplotype showed a large effect (1.279 μg/g) and explained about 16.8% of the phenotypic variance. A candidate gene located within the region is CYB5A (cytochrome b5 type A). Pregnenolone is converted to 5,16-androstadien-3β-ol by the andien-β synthase. The andien-β synthase activity is catalyzed by cytochrome P450c17 and depends on adequate levels of CYB5A. The level of CYB5A in testis hence has an indirect impact on the amount of androstenone produced. A SNP (G > T) at base -8 upstream of ATG in the 5' untranslated region of CYB5A was studied in relation to plasma and back fat traits related to boar taint in Swedish Landrace × Yorkshire crossbred pigs. The analysis showed a weight-dependent positive effect of the T allele on both plasma androstenone and back fat skatole. However, the authors concluded that it should not be implemented because of the lack of change in back fat androstenone, which might be due to the low frequency of the T allele in the population.
One of the QTL identified on SSC 2 to affect androstenone was found to overlap a region affecting skatole reported by Grindflek et al. (2011). Further investigations of the area will be needed before including this QTL in a breeding scheme.
Haplotype analysis in relation to the indolic compounds
A total of five QTL verified by the haplotype analysis in relation to indole and the S/I index seemed to be influenced by slaughter weight as well as the trait. Hence, the haplotypes were re-analysed to account for the effect of slaughter weight. For the three S/I index QTL on SSC 6, 8 and 14, identified in Y, Y and L, the fixed effect dropped by 0.011, 0.010 and 0.027 μg/g, respectively, when slaughter weight was included as a covariate. The amount of variance explained by the haplotype was analysed to see how it was affected by including slaughter weight in the model. Surprisingly, the values were still high: 8.9% in Y on SSC 6a, 7.63% in Y on SSC 8 and 11.21% in L on SSC 14, where only the one on SSC 6 had changed markedly (-2.7%). Even though the fixed effect decreased in these cases, the explained variance changed only marginally. The fixed effect of the haplotype identified from the chromosome 10b QTL in D also showed a decrease (0.005 μg/g). This QTL overlaps a QTL for both skatole and indole previously identified in the Landrace population.
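The re-analysis with slaughter weight as a covariate can be illustrated as follows. The sketch is a simplified, fixed-effects-only stand-in for the mixed model: it compares the least-squares haplotype effect estimated without and with a weight covariate. The function name `effect_unadjusted_and_adjusted` and all numbers are hypothetical and were constructed so that weight is correlated with haplotype copy number.

```python
import numpy as np


def effect_unadjusted_and_adjusted(dosage, phenotype, weight):
    """Per-copy haplotype effect without and with slaughter weight in the model.

    Assumption: ordinary least squares; the random effects of a mixed model
    are left out.
    """
    dosage = np.asarray(dosage, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    weight = np.asarray(weight, dtype=float)
    ones = np.ones_like(dosage)

    X_plain = np.column_stack([ones, dosage])
    X_adj = np.column_stack([ones, dosage, weight])

    beta_plain, *_ = np.linalg.lstsq(X_plain, phenotype, rcond=None)
    beta_adj, *_ = np.linalg.lstsq(X_adj, phenotype, rcond=None)
    return beta_plain[1], beta_adj[1]


# Toy data: the trait depends on both haplotype copies and weight,
# and heavier animals tend to carry more copies of the haplotype.
dosage = np.array([0, 1, 2, 1, 0, 2], dtype=float)
weight = np.array([90, 100, 110, 95, 92, 108], dtype=float)
trait = 0.5 * dosage + 0.1 * weight
unadjusted, adjusted = effect_unadjusted_and_adjusted(dosage, trait, weight)
```

In this constructed example the unadjusted estimate absorbs part of the weight effect, so the estimate shrinks once weight enters the model, which mirrors the direction of the change reported above.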
Only two QTL identified to affect skatole were confirmed by the haplotype analysis, namely the one on SSC 9 in Y and the one on SSC 3 in D. The QTL identified in D on SSC 3 overlaps a QTL identified to affect both testosterone and estrone sulphate in a Norwegian Landrace population. In addition, that study identified a QTL affecting both skatole and androstenone in a Norwegian Duroc population, located closer to the centromere than the one identified in this study. In the current study, a suggestive QTL for androstenone was also detected (P = 0.0004). A candidate gene for the QTL on SSC 3 is SULT1A1 (sulfotransferase family, cytosolic, 1A, phenol-preferring, member 1). This gene has not yet been aligned to the porcine assembly, but comparative mapping of human genes within the area of SULT1A1 suggests that the gene is likely to be located around 16.6-17.0 Mb on SSC 3 in Sscrofa9. In addition, the peak SNP (DIAS0001357) was found to be located within the XPO6 gene, which is situated only 0.4 Mb from SULT1A1 in humans. Furthermore, SULT1A1 has been mapped by linkage analysis to SSC 3. This gene is known to be involved in phase II metabolism of skatole. In addition, a suggestive QTL on SSC 7 identified to affect skatole in the Y breed, which could almost be distinguished by the haplotype analysis (p = 0.088), overlaps the region reported in a Norwegian Landrace population. This region was also reported by Quintanilla et al. (2003) to affect back fat androstenone.
The QTL identified and confirmed by the haplotype analysis on SSC 14 in both L and Y affecting indole, and in L also the S/I index, could be regulated by the candidate gene CYP2E1 (cytochrome P450, family 2, subfamily E, polypeptide 1). This gene is a phase I liver metaboliser of skatole (reviewed by Zamaratskaia et al. 2008). In addition, the CYP2E1 gene has previously been shown to be a major metaboliser of indole in rats. The UMB10000045 SNP, situated within an intron of the gene, was found to be significantly associated with the QTL (p < 1e-5) in L for both traits. However, this SNP was not segregating in the Y population; instead, another SNP was identified in relation to the gene. The SNP SIRI0000194 flanked CYP2E1 on the 5' side of the gene (587 bp) and was found to be significantly associated with the QTL (p = 8e-6). For both of the haplotypes identified in relation to indole the fixed effects seemed small (-0.004 and -0.007). The negative values indicate that for the most frequent haplotype in these QTL the average level of indole was higher. By analysing the explained variance for the haplotypes it was found that the level was similar to previously identified values. The variance explained by the haplotype regulating indole in L was 4.2%, but the level identified in Y was much higher: 13.9%. The variance explained by the haplotype for the S/I index in L was 11.5%.
None of the breeds showed any association with the pure skatole measure within the chromosome region on SSC 14. One of the issues that has been discussed during the past decade is the inability to detect a skatole QTL in relation to CYP2E1. Differences in bacterial load between animals, and hence in the synthesis of the compounds, might explain the lack of a QTL. Indole is produced in a single step by a large variety of bacteria, whereas the two-step formation of skatole is specific and dependent on certain bacterial strains. The first step, from tryptophan to indole-3-acetic acid, is carried out by Escherichia coli and Clostridium [45, 46]. The second step converts the indole-3-acetic acid into skatole. This is done by the genus Clostridium and a Lactobacillus strain [47–50]. The genetic composition of an animal does not depend on whether the skatole is produced or not. If a proportion of the animals have less skatole due to a low bacterial load, their genetic ability to metabolise the compound can still be rather poor but give a "false" low trait value in relation to their genetics. In addition, indole and skatole are both broken down by CYP2E1 in the liver as described above, but as the level of indole is not affected by the composition of bacteria, we see a perfect association with the genetics in the area around the gene. The same applies to the skatole equivalent trait. As indole is part of the trait, we get a more balanced trait in relation to the genetics. In a study by Lanthier et al. (2007), intestinal skatole was measured to distinguish high and low producers. They found that only the high producers showed a strong correlation between fat skatole and the level of SULT1A1. This might be due to the same principle as described above. However, the more recent studies on Norwegian purebred animals showed an association of both skatole and indole with this genomic region in both Landrace and Duroc [7, 33].