A selective sweep of >8 Mb on chromosome 26 in the Boxer genome
© Quilez et al; licensee BioMed Central Ltd. 2011
Received: 10 December 2010
Accepted: 1 July 2011
Published: 1 July 2011
Modern dog breeds display traits that are either breed-specific or shared by a few breeds as a result of genetic bottlenecks during the breed creation process and artificial selection for breed standards. Selective sweeps in the genome result from strong selection and can be detected as a reduction or elimination of polymorphism in a given region of the genome.
Extended regions of homozygosity, indicative of selective sweeps, were identified in a genome-wide scan dataset of 25 Boxers from the United Kingdom genotyped at ~20,000 single-nucleotide polymorphisms (SNPs). These regions were further examined in a second dataset of Boxers collected from a different geographical location and genotyped using higher density SNP arrays (~170,000 SNPs). A selective sweep previously associated with canine brachycephaly was detected on chromosome 1. A novel selective sweep of over 8 Mb was observed on chromosome 26 in Boxer and for a shorter region in English and French bulldogs. It was absent in 171 samples from eight other dog breeds and 7 Iberian wolf samples. A region of extended increased heterozygosity on chromosome 9 overlapped with a previously reported copy number variant (CNV) which was polymorphic in multiple dog breeds.
A selective sweep of more than 8 Mb on chromosome 26 was identified in the Boxer genome. This sweep is likely caused by strong artificial selection for a trait of interest and could have inadvertently led to undesired health implications for this breed. Furthermore, we provide supporting evidence for two previously described regions: a selective sweep on chromosome 1 associated with canine brachycephaly and a CNV on chromosome 9 polymorphic in multiple dog breeds.
It has been proposed that the majority of modern dog breeds recognised today have resulted from two population bottlenecks in dog evolution [1, 2]. During the first genetic bottleneck, pre-domestic breeds diverged from wolves some 15,000 years ago, probably through multiple domestication events. The second bottleneck for most breeds occurred within the last few hundred years, when the breed creation process resulted in the loss of genetic variation due to strong bottleneck events which occurred in parallel with strong artificial selection for behavioural and physical characteristics favoured by humans.
The same bottlenecks and artificial selection forces that generated these breed-specific features have, in some instances, provoked undesired health effects. Random fixation of detrimental variants can occur during bottlenecks. Similarly, risk alleles may be in linkage disequilibrium with selected phenotypic variants or these may have pleiotropic effects [3, 4].
Several studies have previously aimed to identify genomic regions involved in defined traits and their relationship with disease using association mapping (reviewed in Karlsson and Lindblad-Toh). However, phenotypic traits that have been driven to fixation by genetic drift or artificial selection within a dog breed cannot be mapped within that breed with this approach. An alternative in these cases is selection mapping, which searches for selective sweeps (a reduction or elimination of genetic polymorphism in a region owing to strong selection) [2, 5–8]. The aim of this work was to identify selective sweeps in the Boxer genome resulting from the breed creation process using high-density genome-wide SNP data. These regions are likely to govern phenotypic traits of interest and may be linked to the overrepresentation of certain genetic disorders in this breed.
Samples genotyped with call rate > 90%.
Boxer (denoted as set A)
Illumina's CanineSNP20 (~20,000 SNPs)
Canarian warren hound
Illumina's CanineHD (~170,000 SNPs)
Boxer (denoted as set B)
German shepherd dog (GSD)
Regions of homozygosity (ROHs) detected in sets A and B.
The selective sweep on CFA 1 was present, with alleles matching those of the Boxer (data not shown), in other brachycephalic breeds such as English bulldog, Pug and French bulldog, although in the latter the reduction in heterozygosity was less extended than in the other two breeds and seemed to be located slightly upstream on the chromosome (Figure 3). The selective sweep on CFA 26 was detected with allele-matching in English bulldog (CFA 26:6,785,609-11,282,297 bp; only 3 out of 368 SNPs with non-zero MAF) and French bulldog (CFA 26:8,548,679-9,227,043 bp; 48 SNPs with MAF equal to zero) (Figure 3, Additional file 4). Although the ROH was not apparent in the Pug in Figure 3, within the segment of the ROH shared between Boxer and English and French bulldogs many SNPs in Pug samples showed the same genotypes as the other three brachycephalic breeds. In addition, the SNPs located between 8,565,784 and 8,620,015 bp (~55 Kb) were nearly fixed in the Pug (Additional file 4). In contrast to the Boxer, genotypes at BICF2G630807104 (CFA 26:4,222,068 bp) were distributed according to HWE in the bulldog breeds studied, whereas this SNP was fixed in the Pug (data not shown).
Of the SNPs on Illumina's CanineSNP20 array, only two covered the CNV on CFA 9 (Additional file 5). Both were highly monomorphic, with the exception of the SNP at position 20,274,406 bp in the Shar pei samples, for which an excess of heterozygous genotypes (HWE test p-value < 0.001) was observed. A pattern of excess heterozygous genotypes was observed across the region corresponding to the CNV on CFA 9 in the breeds genotyped with Illumina's CanineHD BeadChip (Additional file 6).
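To illustrate how an excess of heterozygous genotypes at a single SNP can be flagged, the following is a minimal Python sketch of a Hardy-Weinberg test. It uses a chi-square goodness-of-fit statistic with one degree of freedom, which is an assumption made for simplicity: the variant of the HWE test actually used in the study is not stated here, and tools such as PLINK can report an exact test instead. The function name and the genotype counts in the usage note are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (chi2, p_value, het_excess), where het_excess is True when more
    heterozygotes are observed than expected under HWE.
    Assumption: a 1-df chi-square approximation, not an exact test.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no called genotypes")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs give chi2 = 0).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value, n_ab > expected[1]
```

For example, with hypothetical counts of 0, 20 and 0 for the three genotypes, `hwe_chi_square(0, 20, 0)` gives a statistic of 20 and a p-value below 0.001 with `het_excess` set to True, the kind of pattern expected when a SNP probe sits inside a duplicated segment.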
The region of decreased heterozygosity which was observed on CFA 1 in our study overlapped with a region previously associated with canine brachycephaly. This was detected using dogs from brachycephalic breeds and non-brachycephalic breeds to perform across-breed association and selection mapping. Both strategies identified a region on CFA 1 at 59 Mb. The decrease in the averaged observed heterozygosity of brachycephalic dogs relative to non-brachycephalic dogs is indicative of a selective sweep at this position. Genes which have been associated with brachycephaly on CFA 1 include: THBS2 [Ensembl:ENSCAFG00000000874], which is expressed in bone and cartilage during development and in the adult skeleton, and SMOC2 [Ensembl:ENSCAFG00000000868], similar in sequence to BM-40 [Ensembl:ENSCAFG00000017855] which is expressed primarily during embryogenesis and in adult bone tissue.
Dog genes and human orthologs within the selective sweep on CFA 26.
Ensembl Gene ID
Development of vertebral column
Contraction (and relaxation) of cardiac muscle
Bone mineral density of skeleton
Formation of perichondral bone
Resorption of trabecular bone
Contraction (and relaxation) of cardiac muscle
Contraction and relaxation of papillary muscle
Elongation of cardiomyocytes
Contraction of cardiac muscle
Development of sarcomere
Cardiomyopathy of heart ventricle
Boxer, English and French bulldogs
Ensembl Gene ID
Amino Acid Metabolism
Small Molecule Biochemistry
Wrinkly skin syndrome
Leukoencephalopathy with vanishing white matter
Infection Mechanism (by HIV)
Infection Mechanism (by HIV)
Tissue development and cell cycle
CFA 26:8,548,679-9,227,043 bp defined the region where extended homozygosity was common to Boxer, English bulldog and French bulldog (Figure 4a), and comprised six genes significant in the functional annotation analysis (Table 3). Of these genes, ATP6V0A2 [Ensembl:ENSCAFG00000007234] and EIF2B1 [Ensembl:ENSCAFG00000007434] are of special interest because they are involved in genetic disorders in humans. Various loss-of-function mutations of ATP6V0A2 [Ensembl:ENSCAFG00000007234], which encodes the alpha-2 subunit of the V-type H+ ATPase, impair glycosylation of proteins during synthesis and cause autosomal recessive cutis laxa (ARCL) type II [OMIM:219200] and some cases of wrinkly skin syndrome [OMIM:278250]. Mutations in each of the five subunits of the translation initiation factor eIF2B, including the alpha subunit encoded by EIF2B1 [Ensembl:ENSCAFG00000007434], can cause leukoencephalopathy with vanishing white matter (VWM, [OMIM:603896]). VWM is a neurological disorder manifesting as progressive cerebellar ataxia, spasticity, inconstant optic atrophy and relatively preserved mental abilities. SETD8 [Ensembl:ENSG00000183955] encodes a lysine methyltransferase that regulates the tumor suppressor protein p53. Of note, the region of ~55 Kb of nearly complete homozygosity in the Pug is both upstream of these six genes and within the region of extended homozygosity shared amongst the three breeds. Interestingly, (i) the genes involved in skeletal and muscular system development and tissue morphology that were significant in the functional annotation analysis in the Boxer only (with the exception of POLE [Ensembl:ENSCAFG00000006215]) and (ii) the region shared in Boxer and English and French bulldogs, are both located within the region on CFA 26 that showed the greatest decay in the averaged observed heterozygosity (Figure 4b).
The region of alternating heterozygous/homozygous genotype patterns observed on CFA 9 in the Boxers overlapped perfectly with a CNV previously described as being polymorphic in a number of dog breeds [19, 20]. This 1.5-Mb CNV region contained three protein-coding genes, one of which had two reported transcript variants, and four non-coding RNA genes (Additional file 8). Analysis of Gene Ontology Biological Process (GO BP) terms revealed that the two transcript variants of the protein-coding gene VPS13D [Ensembl:ENSCAFG00000016397] (CFA 9:21,079,541-21,164,823 bp) were associated with processes of protein localization and viral envelope fusion with host membrane (GO:0008104 and GO:0019064, respectively). This gene was also mapped to Hs 1:12,290,124-12,572,099 bp, but there only the protein localization term was associated (GO:0008104). The remaining annotated elements had neither associated GO BP terms nor homology in H. sapiens.
The substitution of a strongly selected mutation produces a selective sweep on the frequency of neutral alleles at linked loci characterised by a reduction of the local genetic variation [21–23]. Two selective sweeps detected on CFAs 1 and 26 in the Boxer genome were replicated in a larger sample size of the same breed obtained from a different geographical location and genotyped for a panel of SNPs of higher density. Assessing both the presence of these regions in other breeds and their genetic content can provide information on how they affect the phenotype and relate to the ancestral origin of breeds.
In our study, the selective sweep previously associated with brachycephaly on CFA 1 was replicated in a larger sample of Boxers and in samples from other brachycephalic breeds. Moreover, samples in this study came from a different geographic area than those of the previous work (Europe and US, respectively), suggesting that the selective sweep is shared by the two populations within each breed.
The selective sweep on CFA 26 indicates strong artificial selection of a trait of interest in the Boxer, although the phenotypic trait resulting from this particular selective sweep is unknown. The sweep was not present in the Iberian wolf, one ancient breed (Shar pei), Labrador retrievers, German shepherd dogs or four hound breeds. On the other hand, it was present, although over a shorter length, in English and French bulldogs, breeds that share with the Boxer the brachycephalic trait and a related breed creation process. Altogether, these results suggest that the selection of the sweep predated the formation of Boxer and both bulldog breeds. It is known that the English bulldog contributed to the breed creation of both Boxer and French bulldog breeds. The Boxer is believed to have originated from a long-existing and now extinct German breed, the Bullenbeisser, which was crossed with a small number of English bulldog exemplars exported from the UK. Likewise, the French bulldog originated from toy varieties of English bulldog that were more popular in France. Moreover, it is interesting that the region of the selective sweep common to the three breeds coincides with the greatest reduction in heterozygosity along the sequence (Figure 4). In selective sweeps, the reduction of genetic variation is greatest at the site of directional selection and less pronounced at distant sites due to recombination, although asymmetry in the valleys of reduced heterozygosity may provide imprecise information about the location of the sweep. Based on our data, it is hard to assess whether the selective sweep on CFA 26 shared by Boxer and English and French bulldogs was also present in the Pug, also a brachycephalic breed, because only a short segment of reduced polymorphism within the sweep was observed in this breed (~55 Kb). Nonetheless, the history of the Pug differs from that of the three breeds mentioned above. The Pug dates to ancient China and it is suggested that interbreeding with Pekingese, Japanese chin and possibly Shih tzu contributed to the breed creation process. Pugs were imported to Europe through Holland around the 1600s.
A possible scenario is that the standing neutral variation on CFA 26 present in the original English bulldog was passed to both Boxer and French bulldog during the breed creation process. Some variants would have become beneficial thereafter, when selection for brachycephaly started; it is reasonable to think that this happened during the breed creation process, since brachycephaly is a breed standard in these three types of dogs. Thus, strong selection of variants close to position 8-10 Mb on CFA 26 contributing to brachycephaly might have swept nearby genetic variation. The variable length of the selective sweep in the three breeds would reflect different breed histories, as it depends on the strength of selection, the amount of recombination and the population size [21, 22, 25]. Therefore, if one assumes the recombination rate to be similar across breeds for a given chromosome region, differences across breeds in strength of selection and population size have probably caused the variable length of the sweeps on CFA 26, which in the Boxer is more than ten times larger than in the French bulldog.
Altogether we suggest that CFA 26 may contain a footprint of selection for brachycephaly, especially in the Boxer. A brachycephalic head with a distinctive broad and blunt muzzle is a unique phenotype of the Boxer and particular attention is given to this trait by the Boxer breeding community. Although brachycephaly has been mapped to CFA 1, where the strongest association was more than 100 times more significant than the second highest, Bannasch et al. suggested that the complex nature of the brachycephalic head phenotype may be the result of associations across multiple chromosomes. Verifying whether the previously reported genome-wide significant markers on CFA 26, the second highest association, lie within the ROH on CFA 26 in our data would provide some support for a link between this region and selection for brachycephaly. Genes on CFA 1 which have been associated with brachycephaly are involved in skeletal development [7, 10, 11]. Similarly, genes significant in the functional annotation analysis of our data were associated with skeletal and muscular system development and function as well as tissue morphology (Table 3).
It is possible that breed-specific loci under selection are in linkage disequilibrium with detrimental variants at other genes. Interestingly, some of the genes in the selective sweep region on CFA 26 could be related to diseases that are reported to be more common in the Boxer breed, particularly cancer (lymphoblastic lymphoma) and cardiovascular disorders (CMD).
In addition, we observed in our data a previously reported CNV on CFA 9 that is polymorphic in multiple dog breeds [19, 20], providing evidence for within-breed variation in the number of segment copies. Our data suggest that the CNV on CFA 9 is present and variable in the 273 Boxers used in this study (Figure 1b, 2), as well as in other breeds such as German shepherd dog, Pug and English and French bulldogs (Additional file 6). We suggest this CNV may also be present in the Shar pei (Additional file 5), although in this breed it should be confirmed with a panel of higher SNP density. Variable numbers of copies of a gene contained within the CNV, such as VPS13D [Ensembl:ENSCAFG00000016397], which is involved in the entry of virus into the host cell, might influence susceptibility to viral infection. Likewise, the non-coding RNAs (ncRNAs) and small nuclear RNAs (snRNAs) which precede a region relatively rich in genes (data not shown) could be functional in regulatory processes.
We have identified a selective sweep in excess of 8 Mb on CFA 26 in the Boxer which is not present in Iberian wolves or non-brachycephalic dog breeds. This region is a candidate for strong artificial selection in the Boxer for a trait of interest, possibly brachycephaly, and the inadvertent selection of genes during the enrichment for a certain phenotype may have given rise to an increased incidence of certain related afflictions in the breed. The fact that the selective sweep is also present in English and French bulldogs provides genetic evidence of a shared history of the three breeds.
Furthermore, we provide supporting evidence for two previously described regions: a selective sweep on CFA 1 associated with canine brachycephaly and a CNV on CFA 9 which is polymorphic in multiple dog breeds and contains genetic elements with potential biological implications.
A set of 27 Boxer samples from the UK (denoted as set A) were collected as residual samples from dogs taken for clinical investigation. They were selected from a large archive of DNA samples (UK Companion Animal DNA Archive, University of Manchester) and all samples had informed owner consent. A second set of 274 Boxer samples were collected from Spain, Greece, Italy and Portugal (denoted as set B); samples from Spain represented > 90% of set B. Dogs in this set and the remaining breeds in Table 1 came from the Hospital Clínic Veterinari of the Universitat Autònoma de Barcelona, veterinary clinics or dog owners.
DNA was extracted from peripheral blood or bone marrow samples using either QIAamp® DNA Blood Mini Kit (QIAGEN) or PureLink™ Genomic DNA (Invitrogen). Set A was genotyped at 22,362 SNPs with Illumina's CanineSNP20 BeadChip at The Genome Centre, Queen Mary University of London, UK. Set B and German shepherd dog samples were genotyped at 174,376 markers using Illumina's CanineHD BeadChip at The Centre National de Génotypage, France. The remaining samples were genotyped as indicated in Table 1 at the Universitat Autònoma de Barcelona, Spain.
Data cleaning was conducted using PLINK and R packages [26, 27]. Set A was filtered to have individual and marker call rates > 90%, resulting in 25 Boxers and 22,300 SNPs left for analysis. The same filters were applied to set B; in this set we also excluded intensity probes, markers on the boundary autosomal region on CFA X, and those SNPs on the non-pseudoautosomal region of CFA X for which heterozygous genotypes were observed in male samples. All samples in set B had an individual call rate > 90%, but one sample was excluded because it appeared as an outlier when the first two dimensions of the multidimensional scaling analysis were plotted (Additional file 9). This left 273 individuals, each with 171,772 SNPs, for analysis.
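The call-rate filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual cleaning was done with PLINK and R, whose exact commands are not given here, and the function names, the genotype encoding (0, 1 or 2 copies of an allele, with None for a missing call) and the order of filtering (samples first, then markers) are all hypothetical choices made for the example.

```python
def call_rate(calls):
    """Fraction of non-missing genotype calls in a sequence (None = missing)."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def filter_by_call_rate(matrix, threshold=0.90):
    """Keep samples, then SNPs, whose call rate exceeds the threshold.

    matrix[i][j] is the genotype of sample i at SNP j.
    Returns (kept_sample_indices, kept_snp_indices).
    Assumption: samples are filtered before markers.
    """
    kept_samples = [i for i, row in enumerate(matrix)
                    if call_rate(row) > threshold]
    n_snps = len(matrix[0]) if matrix else 0
    kept_snps = [
        j for j in range(n_snps)
        if call_rate([matrix[i][j] for i in kept_samples]) > threshold
    ]
    return kept_samples, kept_snps
```

With the default threshold of 0.90, a sample or marker is retained only if more than 90% of its genotypes were called, mirroring the criterion stated for sets A and B.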
Averaged observed heterozygosity was calculated as the moving average of the observed heterozygosity using 50-SNP windows for both set A (20,451 windows) and set B (169,812 windows). In each set, the 1% of windows with the lowest averaged observed heterozygosity was selected (Additional file 2); windows spaced less than fifty times the mean SNP density (bp/SNP) of the BeadChip used were merged and considered as single regions of homozygosity. ROHs common to both sets were defined as those overlapping in at least one SNP. The analysis was also performed on the dataset restricted to SNPs with a Hardy-Weinberg equilibrium test p-value > 0.005, and the ROHs presented in Table 2 correspond to this second analysis.
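The window-based scan described above can be sketched as follows. This Python example is a simplified illustration for a single chromosome, not the code used in the study: all function names are hypothetical, genotypes are assumed to be encoded as 0/1/2 with None for missing calls, and "spacing" between windows is interpreted here as the distance from the end of one selected window to the start of the next, which is an assumption.

```python
def observed_heterozygosity(genotypes):
    """Fraction of heterozygous calls (coded 1) among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("SNP has no called genotypes")
    return sum(g == 1 for g in called) / len(called)


def moving_average(values, window=50):
    """Average of each run of `window` consecutive values (one per SNP)."""
    if len(values) < window:
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def lowest_fraction_windows(averages, fraction=0.01):
    """Indices of the windows with the lowest averages (at least one)."""
    n = max(1, int(len(averages) * fraction))
    order = sorted(range(len(averages)), key=lambda i: averages[i])
    return sorted(order[:n])


def merge_windows(window_starts, positions, window, max_gap_bp):
    """Merge selected windows lying closer than max_gap_bp into regions.

    Window k spans positions[k] .. positions[k + window - 1] (in bp).
    Assumption: the gap is measured from the end of the previous region
    to the start of the next window.
    """
    regions = []
    for k in sorted(window_starts):
        start, end = positions[k], positions[k + window - 1]
        if regions and start - regions[-1][1] < max_gap_bp:
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(r) for r in regions]
```

In this sketch `max_gap_bp` would be set to fifty times the mean bp/SNP spacing of the array in question, and the scan would be run per chromosome so that windows never span two chromosomes.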
The position of the CNV detected on CFA 9 was defined as the union resulting from our data and the positions annotated in the Ensembl database in two previous works describing this CNV [19, 20]. This resulted in a region at CFA 9:19,778,695-21,332,928 bp that was searched for Gene Ontology biological process (GO BP) terms using BioMart and for regions of synteny with H. sapiens. For the ROH on CFA 26, Ensembl IDs of the annotated elements in the syntenic region at Hs 12:108,311,620-133,784,108 bp were retrieved using BioMart and used as input for functional annotation analysis. The annotated genes in Hs 12:108,311,620-133,784,108 bp were tested for enrichment of certain biological functions or diseases by comparison with the annotations from the Ingenuity database for mouse, rat and human genomes. A right-tailed Fisher's exact test was used to calculate a p-value giving the probability that each biological function and/or disease assigned to that data set was due to chance alone. The categories of diseases associated with the region of interest were compared with the reported inherited diseases in the Boxer breed.
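For readers unfamiliar with the enrichment statistic, a right-tailed Fisher's exact test for a single functional category reduces to the upper tail of a hypergeometric distribution. The Python sketch below shows that calculation under stated assumptions: the function and parameter names are hypothetical, and the background gene set used by the Ingenuity software is not described here, so the numbers in the usage note are invented purely for illustration.

```python
from math import comb


def fisher_right_tailed(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set (assumed reference annotation)
    K: background genes annotated with the category
    n: genes in the region of interest
    k: region genes annotated with the category
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("counts must satisfy 0 <= K, n <= N")
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the tail.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(max(k, 0), upper + 1))
    return tail / total
```

For instance, with a hypothetical background of 10 genes of which 4 carry an annotation, drawing 3 genes and seeing at least 2 annotated gives `fisher_right_tailed(2, 3, 4, 10)`, which equals 40/120, or one third. A small p-value indicates that the category is overrepresented in the region relative to the background.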
This work was funded by the European Commission (LUPA, GA-201370) and the Grant Number RR016466 from the National Center for Research Resources (NCRR), a component of the NIH and the American Kennel Club grant 0876-A. We would like to thank the referring clinicians, dog owners who gave permission for their dogs to participate in this study and the UK Companion Animal DNA Archive for providing the DNA from the UK samples.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.