A high density recombination map of the pig reveals a correlation between sex-specific recombination and GC content
BMC Genomics volume 13, Article number: 586 (2012)
The availability of a high-density SNP genotyping chip and a reference genome sequence of the pig (Sus scrofa) enabled the construction of a high-density linkage map. A high-density linkage map is an essential tool for further fine-mapping of quantitative trait loci (QTL) for a variety of traits in the pig and for a better understanding of mechanisms underlying genome evolution.
Four different pig pedigrees were genotyped using the Illumina PorcineSNP60 BeadChip. Recombination maps for the autosomes were computed for each individual pedigree using a common set of markers. The resulting genetic maps comprised 38,599 SNPs, including 928 SNPs not positioned on a chromosome in the current assembly of the pig genome (build 10.2). The total genetic length varied according to the pedigree, from 1797 to 2149 cM. Female maps were longer than male maps, with a notable exception for SSC1 where male maps are characterized by a higher recombination rate than females in the region between 91–250 Mb. The recombination rates varied among chromosomes and along individual chromosomes, regions with high recombination rates tending to cluster close to the chromosome ends, irrespective of the position of the centromere. Correlations between main sequence features and recombination rates were investigated and significant correlations were obtained for all the studied motifs. Regions characterized by high recombination rates were enriched for specific GC-rich sequence motifs as compared to low recombinant regions. These correlations were higher in females than in males, and females were found to be more recombinant than males at regions where the GC content was greater than 0.4.
The analysis of the recombination rate along the pig genome highlighted that the regions exhibiting higher levels of recombination tend to cluster around the ends of the chromosomes irrespective of the location of the centromere. Major sex-differences in recombination were observed: females had a higher recombination rate within GC-rich regions and exhibited a stronger correlation between recombination rates and specific sequence features.
Linkage maps have been widely used to identify genomic regions that influence phenotypic traits. In addition to the expected advances in fine-mapping of Quantitative Trait Loci (QTL)[1, 2], high-density linkage maps provide a framework for checking the assembly of genome sequences and for studies of the evolution of these genomes through the analysis of recombination. Indeed, recombination lies at the heart of every genetic analysis, and whereas linkage maps in the past were constructed primarily to aid in the generation of a physical map, linkage maps are currently being recognized as indispensable tools to study virtually every aspect of genome biology. Genomic features that have been shown to correlate with recombination rate include GC content, gene density, gene expression, epigenetic modifications, nucleosome formation, repetitive element composition, isochore structure, but also patterns of genetic variation and differentiation within and between populations. For this reason, increasingly dense recombination maps have been constructed in the so called ‘post-genomic era’ for species such as human and mouse, focussing on identifying hotspots of recombination, and, recently, variation in the use of these hotspots between populations and between sexes.
Despite the evident importance of accurate and comprehensive linkage maps in the post-genomic era, comprehensive maps are currently only available for a handful of vertebrate species (human, mouse, rat, cattle, dog, zebra finch and chicken). This limited coverage of the recombination landscape severely limits the possibility of drawing general conclusions about the recombination rates in genomes, particularly now that it is becoming increasingly clear that various mechanisms can work together in creating a very dynamic use of recombination hotspots over time[3–6].
In swine, the first linkage map covering all the autosomes plus the X chromosome of the pig was established in 1995 and a denser map comprising about 1,200 markers was published in 1996. Two other linkage maps comprising around 240 loci were published in the late 1990s[9, 10]. These four maps were mainly based on microsatellites, Restriction Fragment Length Polymorphisms (RFLPs) and protein polymorphisms. More recently, SNPs were added to these maps, but the resolution remained low with an average inter-SNP distance of 3.94 cM. With the advent of genome-wide high-density SNP chips, genetic maps can comprise an increasing number of markers. Until now, such high-density genetic maps, based on microsatellites and SNPs, have been computed for human, mouse, chicken[14, 15], cattle and dog. With the release of Illumina's Porcine SNP60 BeadChip, it became possible to construct a high-density recombination map of the porcine genome. In this work, we present four recombination maps for four different pedigrees. A single set of SNPs was used, each SNP being informative in at least one of the four pedigrees. The recombination maps were estimated using a priori knowledge of the SNPs' order. This physical order of the SNPs was based on the position of the SNPs on the porcine Radiation Hybrid (RH) map and on the positions of the SNPs in the pig genome sequence (build 10.2).
The Illumina PorcineSNP60 BeadChip, which provides assays for 64,232 SNPs, was used to genotype the four studied pedigrees (ILL, UIUC, USDA, ROS; Table 1). The a priori order used to compute the recombination map comprised 44,760 SNPs: 35,098 from the RH order, and 9,662 derived from the sequence assembly. Of the 44,760 SNPs, 5,980 SNPs were discarded because of their low call-rate (<97%), and a set of 181 SNPs was removed because they exhibited a large number of Mendelian inconsistencies in several families. When Mendelian inconsistencies were limited to one particular family per pedigree, genotypes were considered as missing in this family. A total of 168 individuals were removed from the four pedigrees because of their high proportion of incorrect genotypes due to either pedigree or genotyping errors. Finally, the average number of informative meioses per marker was 432 for ILL, 200 for UIUC, 670 for USDA and 120 for ROS.
The a priori order, on which the recombination analyses were based, comprised 44,760 SNPs, including 556 SNPs mapped to unplaced scaffolds and 480 SNPs with no sequence match on the genome assembly. Finally, we were able to construct a genetic map with a total of 38,599 SNPs including 508 from unplaced scaffolds and 420 that had no match on the assembly. On average, there were 2,144 SNPs per chromosome, ranging from 1,011 (SSC18) to 5,293 (SSC1) (Table2). This set of SNPs was chosen as being valid for all four pedigrees; recombination maps were calculated separately for each of them. The rates of phase reconstruction differed for the four pedigrees. For the complete genome, the highest rate was obtained for the UIUC pedigree (99.0%) and the lowest rate was obtained for the ROS pedigree (87.0%). The ILL and USDA pedigrees were intermediate with phase reconstruction rates of 96.5% and 92.0%, respectively.
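The marker counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers quoted in the text; the variable names are illustrative. That the subtraction reproduces the reported total suggests, without proving it, that the two filters removed disjoint sets of SNPs.

```python
# Marker accounting, using only the counts quoted in the text.
a_priori_snps = 35_098 + 9_662   # RH order + sequence assembly
low_call_rate = 5_980            # discarded: call rate < 97%
mendelian_errors = 181           # discarded: inconsistencies in several families

retained = a_priori_snps - low_call_rate - mendelian_errors
unplaced_in_map = 508 + 420      # unplaced scaffolds + no match on the assembly

print(a_priori_snps)    # 44760
print(retained)         # 38599
print(unplaced_in_map)  # 928
```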
The details of the genetic maps calculated for each of the four pedigrees are presented in Table2. The estimates of the total genetic length of the 18 autosomes were 2,012 cM for ILL, 2,149 cM for UIUC, 1,797 cM for USDA and 1,858 cM for ROS. The largest chromosome was SSC6 for ILL, UIUC and ROS pedigrees with 148, 151 and 148 cM, respectively; whereas it was SSC1 for the USDA pedigree with 130 cM. SSC18 was the smallest chromosome for all the pedigrees, its length varying from 44 cM for the ROS pedigree to 71 cM for the UIUC pedigree. Estimates of the size of linkage maps are influenced by many factors. Recombination events are stochastic and different sub-sets of the markers (SNPs) are informative in the different pedigrees. Although potential genotyping errors were removed from the analysis, specific SNPs segregating only in particular pedigrees might still result in increased map length if they have a higher error rate. However, our observed difference in size between the ILL and UIUC maps versus the USDA and ROS maps, is consistently seen for most of the chromosomes, indicating a true biological difference in the recombination rate for these different crosses. Because within the USDA and ROS pedigrees female recombination was not well taken into account (due to the low number of offspring per dam or because of missing genotypes), male and female recombination maps were described separately only for the ILL and UIUC pedigrees (Table3). Consistent with findings in other mammals, the total lengths were longer for the female maps (2,244 and 2,545 cM for ILL and UIUC respectively) than for the male maps (1,782 and 1,747 cM for ILL and UIUC respectively). SSC1 stands out as an exception, with the male maps being longer than the female maps. This difference is due to a low recombination rate in the females in the region between 90 and 250 Mb (Figure1). 
In this 90–250 Mb region, the average recombination rate in females was 0.056 and 0.031 cM/Mb for ILL and UIUC respectively whereas it was 0.286 and 0.290 for males in ILL and UIUC pedigrees respectively.
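As a worked illustration of the sex differences quoted above, the following Python sketch computes how much longer the female maps are than the male maps, and the male-to-female ratio of recombination rates in the 90–250 Mb region of SSC1. It uses only the values given in the text; the function and variable names are hypothetical.

```python
def percent_longer(female_cm, male_cm):
    """Percentage by which the female map exceeds the male map."""
    return 100.0 * (female_cm / male_cm - 1.0)

# (female, male) total autosomal map lengths in cM
map_lengths_cm = {"ILL": (2244, 1782), "UIUC": (2545, 1747)}
# (female, male) average recombination rates in the SSC1 90-250 Mb region
ssc1_rates = {"ILL": (0.056, 0.286), "UIUC": (0.031, 0.290)}

for pedigree, (female, male) in map_lengths_cm.items():
    print(pedigree, "female map longer by", round(percent_longer(female, male)), "%")

for pedigree, (female, male) in ssc1_rates.items():
    print(pedigree, "male/female rate ratio on SSC1:", round(male / female, 1))
```

Under these figures the female maps are about 26% (ILL) and 46% (UIUC) longer genome-wide, while in the SSC1 region male recombination is roughly five-fold (ILL) and nine-fold (UIUC) higher than female recombination.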
Recombination rates were calculated for non-overlapping bins of 1 Mb with marker positions delimiting the intervals (Additional file1). At the level of the genome, the highest average recombination rate was obtained for the UIUC pedigree with 0.85 cM/Mb, the lowest being obtained for the USDA pedigree with 0.70 cM/Mb (Table2). This ratio was highly variable depending on the physical length of the chromosomes, the shortest ones having higher ratios than the longest ones (Figure2).
For the four pedigrees, the highest recombination rate was observed for SSC12 with values of 1.33, 1.30, 1.11 and 1.24 cM/Mb for ILL, UIUC, USDA and ROS, respectively. The lowest recombination rate was obtained on SSC1 with 0.37, 0.38, 0.33 and 0.37 cM/Mb for ILL, UIUC, USDA and ROS respectively (Table2). At the genome level, recombination rates were higher in females than in males. At the chromosome levels, only SSC1 displayed higher recombination rates in males than in females, for ILL and UIUC pedigrees (Table3). The distribution of recombination rates was not constant along the chromosomes with high recombination rates mostly concentrated around the end of the chromosomes (Figure1 and Figure3). This is seen both in male and female recombination but the effect is somewhat stronger in female recombination. Overall, the recombination maps for the 4 pedigrees are in good agreement, although small local differences can be detected.
On SSC9, the large gap observed is due to the absence of SNPs that could be reliably included for the four pedigrees in the genetic maps. The distribution of the recombination rates plotted against the physical distance to the closest chromosome end confirms that high recombination rates tend to cluster around the chromosome ends, irrespective of the position of the centromere (Figure 4). For the sex-averaged map, the correlation between the recombination rate and the physical distance to the closest chromosome end was estimated to be -0.48 (p-value < 0.0001), and the correlations for the separate male and female maps were identical.
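The correlation with distance to the closest chromosome end can be illustrated with a minimal Python sketch. The data below are toy values for a hypothetical 10 Mb chromosome, not values from this study, and the helper names are assumptions made for illustration. The original analysis used the CORR procedure in SAS.

```python
import math

def distance_to_closest_end(pos_bp, chrom_length_bp):
    """Physical distance from a position to the nearer chromosome end."""
    return min(pos_bp, chrom_length_bp - pos_bp)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy example: ten 1 Mb bins with higher rates towards both ends.
chrom_length = 10_000_000
bin_midpoints = [500_000 + 1_000_000 * i for i in range(10)]
toy_rates = [2.0, 1.2, 0.6, 0.3, 0.2, 0.2, 0.3, 0.6, 1.2, 2.0]  # cM/Mb

distances = [distance_to_closest_end(p, chrom_length) for p in bin_midpoints]
r = pearson_r(distances, toy_rates)
print(r < 0)  # a negative r means rates fall with distance from the ends
```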
Correlation of recombination with sequence parameters
Correlations between recombination rates and various sequence parameters (GC content, repetitive element content and short sequences) have previously been observed in human, chicken, dog and mouse. The occurrence of these sequence parameters was calculated within bins of 1 Mb and the correlations with the recombination rates were estimated. With the sex-averaged map, all sequence features were highly significantly correlated with the recombination rate (p-value < 0.05). However, the correlations were weaker for LINEs and LTRs, with Pearson correlation coefficients of -0.05 and 0.06, respectively. The comparison of the sequence composition of recombination ‘jungles’ and ‘deserts’ (1 Mb intervals with the 10% highest and 10% lowest recombination rates respectively) also highlights this link between the occurrence of specific sequence features and recombination rate (Table 4). Recombination jungles were enriched in specific GC-rich motifs as compared to the deserts. The largest difference was observed for the CCCCACCCC sequence, this sequence being almost three times more frequent in recombination jungles than in deserts.
Male and female recombination rates were also analysed separately and large differences were observed. The correlation of the recombination rate with GC content was higher in females (0.44) than in males (0.15) (Table4). In agreement with this is the observation that in females recombination is higher only when the GC content of the region is higher than 0.40 whereas it is lower for regions where the GC ratio is smaller than 0.39 (Figure5).
Jungle/desert ratios were also highly different between sexes for SINEs and short sequence motifs. In females, this ratio reached 3.41 for the CTCF consensus sequence (CCNCCNGGNGG), whereas it only reached 1.52 in males.
The reliability of a recombination map is of major importance for linkage and genome-wide association analyses. The presented recombination maps were computed for four different pedigrees, with a subset of SNPs being optimal for all of them, finally comprising 38,599 SNPs. Because only SNPs for which sequence and RH positions were in agreement were included in the analyses and because the recombination maps confirmed the a priori order, the map presented in this study is expected to be as accurate as possible with currently available data. The map presented in this paper is the densest recombination map ever computed for the porcine genome. Until now, the shortest average marker interval on a genetic map was reached by the USDA MARC map with an average interval of 2.23 cM. The large number of SNPs as well as the high number of informative meioses included in the present analysis enabled the computation of a high-density recombination map of the porcine genome with a consequent substantial increase in resolution (around 0.1 cM) compared to previous maps. The total length of the genetic map varied between the four pedigrees, from 1,797 cM to 2,149 cM, which is shorter than the previously published genetic maps. This decrease in the total length of the map can in part be explained by the lower rate of genotyping errors with SNP chip genotyping as compared to microsatellite or RFLP genotyping. Another factor that contributes to the decreased map size is the fact that male meioses contributed most to the current map, while the USDA maps[8, 20] were based primarily on female meioses. Compared with the map computed with gene-associated SNPs, the sex-averaged genetic maps presented in our study are 15 to 45% shorter, if we take into account only the regions covered in both studies. The same is observed for the sex-specific maps. Female maps are 21 to 33% shorter in our study, and the two male genetic maps are around 18-19% shorter than the one presented by Vingborg et al.
Recently, two genetic maps based on the 60 k SNP chip have been published for Landrace and Duroc, with similar chromosome lengths as in our study except for SSC1 where a length of 199.8 cM was obtained in Landrace, very different from all the others.
The recombination map of the porcine genome described in this paper, revealed major chromosomal as well as regional differences in recombination rates. The four pedigrees clustered into two different groups, ILL and UIUC having recombination rates close to 0.8 cM/Mb whereas the two other pedigrees had lower recombination rates close to 0.7 cM/Mb. All these values are in the range of previous findings in mammals (from 0.6 cM/Mb in mouse to 1.25 cM/Mb in cattle). In birds, the observed recombination rate is higher with a value of 1.5 cM/Mb in the zebra finch and up to 2.7 to 3.4 cM/Mb in chicken. Differences in recombination rate within a species have already been described in mice and chicken[14, 15]. Differences in recombination rate observed in this study among the four pedigrees are partly explained by the percentage of phases that could be reconstructed. A lower number of phases could be reconstructed in the two pedigrees in which family sizes were small (USDA) or where several mother genotypes were missing (ROS). Another potential cause for the observed differences are sequence variations within the individuals used, and in particular structural variants like copy number variants and local inversions. In particular the UIUC and ROS crosses involving Chinese (Meishan) and European (Large White/Yorkshire) breeds which diverged around 1 million years ago, are likely to have local inversions that would affect recombination at these positions.
In addition to these differences among the four pedigrees studied, the recombination rate also varied among chromosomes (Table2 and Figure2) as well as within chromosomes (Figure1). The distribution of the recombination rate according to the physical size of the chromosomes obtained with the pig was in agreement with the distributions observed in other mammalian species and birds: shortest chromosomes exhibiting higher recombination rates. This result is in line with the observation of at least one cross-over occurring per meiosis per chromosome. It is noteworthy that for the longest chromosomes in pig, the overall recombination fraction (cM/Mbp) is much lower than for any other mammalian species for which recombination maps have been developed to date (Figure2).
The distribution of the recombination rate according to the distance to the closest chromosome end showed that higher recombination rates were mostly observed towards the ends of the pig chromosomes. Moreover, the position of the centromere did not seem to influence this distribution: e.g., SSC13 is an acrocentric chromosome and the distribution of the recombination rate along this chromosome is very similar to the distribution along metacentric or submetacentric chromosomes (pig chromosomes 1 to 12 being meta- or submetacentric chromosomes, the others being acrocentric chromosomes). Other species with acrocentric chromosomes, such as the dog, show a marked increase in recombination fraction at the medial and centromeric parts of most chromosomes. The general absence of this pattern in the acrocentric chromosomes in pigs raises questions on how and particularly when the pig chromosomes became acrocentric. The evolution of centromere positions can be highly dynamic, and the current apparent disparity between centromere position and recombination rate may hint at a recent shift of the position of the centromere in several pig chromosomes.
In human and rat, recombination rates were also found to be higher in the telomeric regions and reduced close to the centre of the chromosomes, but this pattern is not as pronounced as in the pig. This preferential distribution of crossovers at the chromosomal ends is even more striking in zebra finch, with long central regions where the recombination rate remains extremely low. However, in the zebra finch, and also in chicken, these telomeric regions of exceptionally high recombination compared to the other parts of the chromosomes seem to be much more confined to the extreme edges of the chromosomes, whereas in the pig these distal regions of high recombination are less pronounced but much greater in size. In some species, however, this particular distribution of recombination rate along a chromosome is not observed. In the mouse, the correlation estimated between recombination rate and the distance to the centre of the chromosome does not differ from the one estimated with respect to the distance to the telomere, which is in agreement with the distribution of the recombination rate estimated from the sex-averaged genetic map. Similarly, the plot of the genetic map against the physical map of the bovine genome does not show the sigmoid-like pattern that indicates higher recombination rates at the chromosome ends. What is particularly striking in the pig is that this elevated recombination towards the ends of the chromosomes is also seen for the acrocentric chromosomes. Previous observations in other mammals were interpreted to mean that recombination at centromeric regions was low because recombination would interfere with kinetochore assembly at the centromeres. Unless the pig has evolved specific features to overcome such interference, which does not seem very likely, other as yet unknown structures of mammalian chromosomes underlie these observed differences.
Recombination and sequence features
In this study, we show that recombination rates vary with the distance to the closest chromosome end. In human, the GC content was negatively correlated with the distance to the chromosome end, and the porcine genome exhibits the same negative correlation. The GC content has also been shown to be strongly positively correlated with recombination rates in human[12, 30, 31], mice, chicken and zebra finch, and this was also confirmed in this study. This seemingly universal positive correlation between GC content and recombination is thought to signify a shared underlying mechanism determining recombination rates[32, 33], although it has been proposed that higher GC content can conversely be the result of high recombination rate[34, 35].
Mechanisms explaining the direct relationship between GC content and recombination rate identify the presence of certain recognition motifs for DNA binding proteins that have a known function in meiosis or the recombination process directly, such as cohesin and PR domain-containing protein 9. In other mammalian and avian species, high-density linkage maps have shown strong correlations between recombination rates and various sequences such as the consensus cohesin binding site, the 7-nucleotide oligomer CCTCCCT[4, 13] and a 13-nucleotide oligomer described in human, CCNCCNTNNCCNC. Recently, it was shown that this 13-nucleotide sequence is recognized in vitro by the human PR domain-containing protein 9, encoded by the PRDM9 gene. The PR domain-containing protein 9 is known to regulate recombination hotspot activity in human. GC-rich motifs have been investigated in this study and all of them are overrepresented in recombination jungles and underrepresented in deserts. The sequences CCTCCCT and CCCCACCCC, overrepresented in about 10% of human hotspots, are also correlated with higher recombination rates in mouse and chicken, jungle/desert ratios being close to 2 or higher. The same is observed in this study, with a ratio close to 2 or higher (Table 4).
In our study, male and female maps were analysed separately for the ILL and UIUC pedigrees. In both designs, female meioses were better sampled than in the two other pedigrees, for which dams were not always genotyped or had too few offspring. The ROS and USDA maps are thus closer to male maps, which could explain their shorter lengths as compared to the sex-averaged maps of ILL and UIUC. It should also be noted that the lengths of the female maps reported here are close to that of the original MARC map, which was based primarily on female meioses.
In most species, the heterogametic sex is expected to have a lower recombination rate than the homogametic sex. This was confirmed in this study at the level of the genome with female maps being longer than male maps by 26% or 46% for ILL and UIUC pedigrees, respectively. However, SSC1 stood out with more recombination events described within males than within females. As shown in Figure1, females displayed a 160 Mb region with a very low recombination frequency. Vingborg et al. found that SSC1 was longer in females than in males, but the 70–100 cM region of SSC1 also displayed higher recombination in males than in females. The greater genetic length of SSC1 in males as compared to females was already observed in previous pig genetic maps[7, 37–39]. All these previous maps were based on crosses between genetically diverse founder/grandparental animals including Wild Boars and European commercial breeds and Chinese and European breeds[8, 39] or combinations thereof. The current study also included highly diverse pedigree origins, which makes breed effects therefore unlikely to be the major explanation for this locally low recombination rate. For the ILL pedigree, we observed a small difference between the male and female maps of SSC13 and this was also reported by Guo et al. who observed a female to male ratio of 0.98 for this chromosome. In the linkage map computed with gene-associated SNPs, SSC13 was also found to be rather similar in males and females. For this chromosome, we did not observe such large sex-differences in the distribution of the recombination rates along the chromosome as for SSC1. To better understand this apparent discrepancy in recombination rates between male and female on different chromosomes, we plotted the recombination rates as a function of GC content for male and female separately (Figure5). 
Although in both sexes higher average recombination frequencies were observed for regions exhibiting a higher GC content, this correlation was much greater in females than in males. This also explains why, contrary to what is observed in most other mammals, there is a tendency of females to show even more elevated recombination towards the ends of the chromosomes than the males. In fact, males showed a clear lower recombination rate at AT rich regions, but females showed an even lower recombination at AT rich regions relative to males. This resulted in an overall lower recombination rate in females in AT rich regions than observed in males. This may explain the observation on SSC1, where the recombination was higher in males due to the 90–250 Mb region being relatively AT rich (GC content of 0.39 compared to the genome average of 0.42). This effect was only clearly observed on SSC1 since the other chromosomes lack such long regions of low GC content. A positive correlation between recombination rates in female and GC content had already been reported in human, and this was confirmed in the present analysis (Table4). Recombination in males appeared to be less sensitive to the frequency of the GC rich motifs and the observed jungle / desert ratios are much higher in females.
The positive relationship between GC content and female recombination does not appear to be universal. Sex-specific, GC-related recombination rates have, for instance, been observed in dogs, but the relationship appears to be the opposite in this species: higher GC content appears to be negatively correlated with female recombination rate. Since the study on dog recombination did not dissect the precise relationship of male and female recombination rates as a function of GC content as done in the present study, it is difficult to compare the results. However, this opposite relationship in dogs may hint at specific recombination mechanisms that apply to acrocentric vs. metacentric karyotypes, and demonstrates the importance of having detailed recombination maps for many different species for comparative genome biology purposes.
Even if the mechanisms underlying sex differences in recombination are largely unknown, a number of mechanisms for sex-specific differences have been proposed: difference in time allotted for so called bouquet formation in meiosis, difference in the compactness of the chromosomes at pachytene phase of meiosis, genomic imprinting, or differences in the use of specific recombination-hotspot specific motifs[12, 41]. For instance, it has been shown that different alleles of the RNF212 gene can have opposite effects on male and female recombination rate. In mice, a QTL analysis was carried out to detect regions of the genome underlying recombination rate and the most significant QTLs were observed on chromosome X. This raises the possibility that chromosomes X and/or Y may be involved in the observed striking difference of recombination rates between males and females. However, the analysis included only males, so no sex-specific QTL could be analysed. This study in mice indicated that genomic variations on the X chromosome influenced the recombination rate, but it did not provide further explanation of why females recombine more than males. Finally, in mice, the analysis of meiocytes from XX females, XY males, XY sex-reversed and XO females indicated that recombination patterns depend more on being a male or a female than on the true chromosomal genotype. All of these mechanisms may be compatible with the patterns observed in the present paper. In fact, the evolution of recombination and recombination hotspots seems highly dynamic, and may involve universal (e.g. chromosome compactness at the pachytene phase at meiosis) and species specific mechanisms (e.g. use of sex specific hotspots). The importance of each of these mechanisms will need to be tested for various species using higher density linkage maps in the future.
In this study we present the first high-density recombination map of the porcine genome, with a resolution substantially higher than that of previously published maps. This high resolution enabled us to focus on the differences between low- and high-recombining regions of the genome, and on the large differences that we observed between males and females. As expected, at the genome level, female maps were longer than male maps. The unexpectedly higher recombination rates in males observed on SSC1 could be explained by a large region of low GC content where females showed very low recombination rates. The higher correlation between recombination rate and GC content (as well as GC-rich motifs) in females as compared to males was confirmed at the genome level. Until now, this high correlation between recombination rates in females and GC content has only been reported in human. Further analyses are needed to identify the molecular mechanism underlying this observed difference. The increased insight into the porcine recombination landscape will help future studies aimed at understanding the evolution of the pig genome and at fine-mapping identified QTLs for economically important traits.
Mapping populations and SNP genotyping
The animals used to compute the recombination maps belong to four independent pedigrees. Three were based on an F2 design (including one reciprocal cross) and one was based on multi-stage crosses. Details about the four pedigrees are presented in Table1.
To compute recombination maps, only families with more than four full-sibs were retained in the analysis. Therefore, recombination maps were calculated based on the information from 573 animals of the ILL pedigree, 247 from the UIUC pedigree, 204 from the ROS pedigree and 1298 from the USDA pedigree. The four pig pedigrees were genotyped using the Illumina PorcineSNP60 BeadChip (San Diego, CA, USA). Each pedigree was genotyped independently, and a total of 664 samples from ILL, 337 from UIUC, 208 from ROS and 1337 from USDA were genotyped. To carry out the computation of recombination maps, only SNPs with a call rate higher than 97% were retained. In addition, all the genotypes were checked for Mendelian inheritance and erroneous genotypes were set as missing. Double recombinants at specific markers were considered as genotyping errors and the corresponding genotypes were therefore set as missing.
Recombination map calculation
Recombination maps were computed for each pedigree independently using a single set of SNPs, each SNP being informative in at least one of the four pedigrees. The first step of the recombination map calculation was to determine the best physical order of the markers based on the RH mapping and in silico mapping of the SNPs to the pig genome sequence. The genotyping of the two RH panels of the porcine genome on the PorcineSNP60 BeadChip enabled the computation of a physical map. SNPs were positioned on the current pig genome sequence build 10.2 (ftp://ftp.ncbi.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Sus_scrofa/Sscrofa10.2/) by aligning the 200 bp sequence adjacent to the SNP against build 10.2 using BLAT. The RH order was considered as the basic order and when it was consistent with the sequence assembly, SNPs from the assembly were included in the best physical order.
The second step was the estimation of the recombination rates along chromosomes using the method described by Coop et al. Briefly, haplotypes transmitted by a parent to each of its offspring were inferred based on informative SNPs. Then, within a given nuclear family, one of the offspring (the template) was successively compared to each of the others: at each marker, it was deduced whether the two offspring were Identical By Descent (IBD) or not. Any switch from an IBD to a non-IBD status indicated a recombination event. Regions where the majority of offspring showed a recombination were considered as indicative of a recombination in the template offspring. Finally, the parental phases were partially reconstructed, allowing identification of recombination events that occurred in each meiosis. Recombination rates were transformed into centimorgans (cM) using the Haldane mapping function.
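Two pieces of this procedure lend themselves to a short illustration: detecting changes in IBD status between the template and one sib, and converting a recombination fraction to cM with the Haldane mapping function, d = -50 ln(1 - 2r). The sketch below is written under stated assumptions: the 0/1 coding of transmitted parental haplotypes, the function names, and the choice to count a change of IBD status in either direction are illustrative and do not reproduce the authors' program. The majority rule that assigns an event to the template is not implemented here.

```python
import math
from typing import List, Optional


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM for a recombination fraction r, 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def ibd_switches(template: List[Optional[int]],
                 sib: List[Optional[int]]) -> List[int]:
    """Indices of markers at which the IBD status of template and sib changes.

    Each list gives, for ordered markers, which haplotype (0 or 1) of one
    parent the offspring inherited; None marks an uninformative marker.
    """
    switches = []
    previous = None  # IBD status at the last informative marker
    for i, (t, s) in enumerate(zip(template, sib)):
        if t is None or s is None:
            continue  # skip markers that are uninformative in either offspring
        status = (t == s)
        if previous is not None and status != previous:
            switches.append(i)
        previous = status
    return switches
```

Each reported index marks the first informative marker after a change of status, so the recombination event itself lies between that marker and the previous informative one.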
As a result, four recombination maps were computed and recombination rates in cM/Mb were calculated for each pedigree along the genome. These recombination rates were estimated in non-overlapping bins of approximately 1 Mb, with the exact SNP positions used as the delimiters of the bins. An average recombination rate over the four pedigrees was also estimated along the genome and was used in the subsequent analyses of correlation with sequence features. Similarly, female and male recombination rates were estimated along the genome.
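One possible way to build bins of approximately 1 Mb delimited by SNP positions is sketched below. The rule used here (close a bin at the first SNP lying at least 1 Mb from the bin start) is an assumption, since the text does not state how the delimiting SNPs were chosen; the function name and inputs are likewise hypothetical.

```python
from typing import List, Tuple


def binned_rates(positions_bp: List[int],
                 positions_cm: List[float],
                 target_bp: int = 1_000_000) -> List[Tuple[int, int, float]]:
    """Recombination rate (cM/Mb) in non-overlapping bins delimited by SNPs.

    positions_bp: ordered physical SNP positions on one chromosome (bp).
    positions_cm: cumulative genetic positions of the same SNPs (cM).
    Returns (bin_start_bp, bin_end_bp, rate_cM_per_Mb) tuples.
    """
    bins = []
    start = 0  # index of the SNP that opens the current bin
    for i in range(1, len(positions_bp)):
        span_bp = positions_bp[i] - positions_bp[start]
        if span_bp >= target_bp:
            span_cm = positions_cm[i] - positions_cm[start]
            bins.append((positions_bp[start], positions_bp[i],
                         span_cm / (span_bp / 1e6)))
            start = i  # the closing SNP opens the next bin
    return bins
```

With this rule every bin spans at least 1 Mb, and any SNPs left after the last closed bin are not reported.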
Correlation of recombination with sequence parameters
The average recombination rate was compared to the distribution of various sequence motifs including repetitive elements (LINEs, SINEs, LTRs, simple repeats and low-complexity repeats), GC content, and GC-rich motifs previously shown to be correlated with high recombination rates (CCTCCT, CCTCCCT, CTCTCCC, CCCCCCC, CCCCACCCC, the CTCF consensus sequence CCNCCNGGNGG and the PRDM9 consensus binding sequence CCNCCNTNNCCNC). The distribution of sequence motifs and the GC content were calculated for bins of 1 Mb using the current assembly (build 10.2), and the correlations with recombination rates were tested using Pearson's correlation coefficient with the CORR procedure in SAS (SAS® 9.1, SAS Institute, Inc.). Similar results were obtained using the more conservative Spearman test (data not shown). To further investigate the link between sequence features and recombination rate, the sequence composition of jungle and desert regions was compared to detect any enrichment of particular motifs in one of the two types of region. Jungle regions were defined as the 1 Mb intervals with the 10% highest recombination rates and, conversely, desert regions were defined as the 1 Mb intervals with the 10% lowest recombination rates. A jungle/desert (J/D) ratio higher than one indicates that the motif is more frequent in regions with high recombination rates than in regions with low recombination rates. Conversely, a ratio lower than one indicates that the motif is more frequent in regions with low recombination rates. These ratios were also estimated independently in males and females. Finally, the correlation between recombination rate and the physical distance to the closest chromosome end was also estimated.
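The two statistics used in this section can be sketched in plain Python as follows. The original analysis was run with the CORR procedure in SAS; the code below is only an illustrative equivalent. The J/D ratio is computed here as the mean motif count per bin in jungle regions divided by the mean in desert regions, which is an assumption about the exact definition, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def jungle_desert_ratio(rates: Sequence[float],
                        motif_counts: Sequence[float],
                        fraction: float = 0.10) -> float:
    """Mean motif count in the top `fraction` of bins by recombination rate
    (jungle) divided by the mean in the bottom `fraction` (desert)."""
    order: List[int] = sorted(range(len(rates)), key=lambda i: rates[i])
    k = max(1, int(len(rates) * fraction))
    desert, jungle = order[:k], order[-k:]
    desert_mean = sum(motif_counts[i] for i in desert) / k
    jungle_mean = sum(motif_counts[i] for i in jungle) / k
    if desert_mean == 0:
        raise ValueError("motif absent from desert bins; ratio undefined")
    return jungle_mean / desert_mean
```

Applying the same functions to the female and male rate series separately would give the sex-specific correlations and ratios.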
IBD: Identical By Descent
QTL: Quantitative Trait Loci
RFLP: Restriction Fragment Length Polymorphism
SNP: Single Nucleotide Polymorphism.
Daw EW, Thompson EA, Wijsman EM: Bias in multipoint linkage analysis arising from map misspecification. Genet Epidemiol. 2000, 19: 366-380. 10.1002/1098-2272(200012)19:4<366::AID-GEPI8>3.0.CO;2-F.
Fingerlin TE, Abecasis GR, Boehnke M: Using sex-averaged genetic maps in multipoint linkage analysis when identity-by-descent status is incompletely known. Genet Epidemiol. 2006, 30: 384-396. 10.1002/gepi.20151.
Myers S, Freeman C, Auton A, Donnelly P, McVean G: A common sequence motif associated with recombination hot spots and genome instability in humans. Nat Genet. 2008, 40: 1124-1129. 10.1038/ng.213.
Baudat F, Buard J, Grey C, Fledel-Alon A, Ober C, Przeworski M, Coop G, de Massy B: PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science. 2010, 327: 836-840. 10.1126/science.1183439.
Berg IL, Neumann R, Lam KW, Sarbajna S, Odenthal-Hesse L, May CA, Jeffreys AJ: PRDM9 variation strongly influences recombination hot-spot activity and meiotic instability in Humans. Nat Genet. 2010, 42: 859-863. 10.1038/ng.658.
Paigen K, Petkov PM: Mammalian recombination hot spots: properties, control and evolution. Nat Rev Genet. 2010, 11: 221-233.
Archibald AL, Haley CS, Brown JF, Couperwhite S, McQueen HA, Nicholson D, Coppieters W, Van de Weghe A, Stratil A, Winterø AK, et al: The PiGMaP consortium linkage map of the pig (Sus scrofa). Mamm Genome. 1995, 6: 157-175. 10.1007/BF00293008.
Rohrer GA, Alexander LJ, Hu Z, Smith TP, Keele JW, Beattie CW: A comprehensive map of the porcine genome. Genome Res. 1996, 6: 371-391. 10.1101/gr.6.5.371.
Marklund L, Johansson Moller M, Høyheim B, Davies W, Fredholm M, Juneja RK, Mariani P, Coppieters W, Ellegren H, Andersson L: A comprehensive linkage map of the pig based on a wild pig-Large White intercross. Anim Genet. 1996, 27: 255-269.
Mikawa S, Akita T, Hisamatsu N, Inage Y, Ito Y, Kobayashi E, Kusumoto H, Matsumoto T, Mikami H, Minezawa M, Miyake M, Shimanuki S, Sugiyama C, Uchida Y, Wada Y, Yanai S, Yasue H: A linkage map of 243 DNA markers in an intercross of Gottingen miniature and Meishan pigs. Anim Genet. 1999, 30: 407-417. 10.1046/j.1365-2052.1999.00493.x.
Vingborg RK, Gregersen VR, Zhan B, Panitz F, Høj A, Sørensen KK, Madsen LB, Larsen K, Hornshøj H, Wang X, Bendixen C: A robust linkage map of the porcine autosomes based on gene-associated SNPs. BMC Genomics. 2009, 10: 134-10.1186/1471-2164-10-134.
Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, et al: A high-resolution recombination map of the human genome. Nat Genet. 2002, 31: 241-247.
Shifman S, Bell JT, Copley RR, Taylor MS, Williams RW, Mott R, Flint J: A high-resolution single nucleotide polymorphism genetic map of the mouse genome. PLoS Biol. 2006, 4: e395-10.1371/journal.pbio.0040395.
Groenen MAM, Wahlberg P, Foglio M, Cheng HH, Megens HJ, Crooijmans RP, Besnier F, Lathrop M, Muir WM, Wong GK, Gut I, Andersson L: A high-density SNP-based linkage map of the chicken genome reveals sequence features correlated with recombination rate. Genome Res. 2009, 19: 510-519.
van Elferink M, As GP, Veenendaal T, Crooijmans RPMA, Groenen MAM: Regional differences in recombination hotspots between two chicken populations. BMC Genet. 2010, 11: 11-
Arias JA, Keehan M, Fisher P, Coppieters W, Spelman R: A high density linkage map of the bovine genome. BMC Genet. 2009, 10: 18-
Wong AK, Ruhe AL, Dumont BL, Robertson KR, Guerrero G, Shull SM, Ziegle JS, Millon LV, Broman KW, Payseur BA, Neff MW: A comprehensive linkage map of the dog genome. Genetics. 2010, 184: 595-605. 10.1534/genetics.109.106831.
Ramos AM, Crooijmans RPMA, Affara NA, Amaral AJ, Archibald AL, Beever JE, Bendixen C, Churcher C, Clark R, Dehais P, Hansen MS, Hedegaard J, Hu ZL, Kerstens HH, Law AS, Megens HJ, Milan D, Nonneman DJ, Rohrer GA, Rothschild MF, Smith TP, Schnabel RD, Van Tassell CP, Taylor JF, Wiedmann RT, Schook LB, Groenen MAM: Design of a high density SNP genotyping assay in the pig using SNPs identified and characterized by next generation sequencing technology. PLoS One. 2009, 4: e6524-10.1371/journal.pone.0006524.
Servin B, Milan D: High resolution map of the porcine genome using radiated-hybrid genotyping of the Illumina porcineSNP60 BeadChip: analysis and validation of the Pig genome assembly. in preparation
Rohrer GA, Alexander LJ, Keele JW, Smith TP, Beattie CW: A microsatellite linkage map of the porcine genome. Genetics. 1994, 136: 231-245.
Grindflek E, Lien S, Hamland H, Hansen MH, Kent M, van Son M, Meuwissen TH: Large scale genome-wide association and LDLA mapping study identifies QTLs for boar taint and related sex steroids. BMC Genomics. 2011, 12: 362-10.1186/1471-2164-12-362.
Backström N, Forstmeier W, Schielzeth H, Mellenius H, Nam K, Bolund E, Webster MT, Ost T, Schneider M, Kempenaers B, Ellegren H: The recombination landscape of the zebra finch Taeniopygia guttata genome. Genome research. 2010, 20: 485-495. 10.1101/gr.101410.109.
Dumont BL, Broman KW, Payseur BA: Variation in genomic recombination rates among heterogeneous stock mice. Genetics. 2009, 182: 1345-1349. 10.1534/genetics.109.105114.
Groenen MAM, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, Rogel-Gaillard C, Park C, Milan D, Megens H-J, Li S, Larkin D, Kim H, Frantz LAF, Caccamo M, Ahn H, Aken BL, Anselmo A, Anthon C, Auvil L, Badaoui B, Beattie CW, Bendixen C, Berman D, Blecha F, Blomberg J, Bolund L, Bosse M, Botti S, Bujie Z, et al: Pig genomes provide insight into porcine demography and evolution. 2012, Submitted
Jones GH, Franklin FC: Meiotic crossing-over: obligation and interference. Cell. 2006, 126: 246-248. 10.1016/j.cell.2006.07.010.
Ford CE, Pollock DL, Gustavsson I: Proceedings of the First International Conference for the Standardisation of Banded Karyotypes of Domestic Animals. University of Reading Reading, England 2nd-6th August 1976. Hereditas. 1980, 92: 145-162.
Jensen-Seaman MI, Furey TS, Payseur BA, Lu Y, Roskin KM, Chen CF, Thomas MA, Haussler D, Jacob HJ: Comparative recombination rates in the rat, mouse, and human genomes. Genome research. 2004, 14: 528-538. 10.1101/gr.1970304.
Cheeseman IM, Desai A: Molecular architecture of the kinetochore-microtubule interface. Nature Reviews Molecular Cell Biology. 2008, 9: 33-46. 10.1038/nrm2310.
Duret L, Arndt PF: The impact of recombination on nucleotide substitutions in the human genome. PLoS genetics. 2008, 4: e1000071-10.1371/journal.pgen.1000071.
Birdsell JA: Integrating genomics, bioinformatics, and classical genetics to study the effects of recombination on genome evolution. Molecular biology and evolution. 2002, 19: 1181-1197. 10.1093/oxfordjournals.molbev.a004176.
Marais G: Biased gene conversion: implications for genome and sex evolution. Trends in Genetics. 2003, 19: 330-338. 10.1016/S0168-9525(03)00116-1.
Petes TD: Meiotic recombination hot spots and cold spots. Nature Reviews Genetics. 2001, 2: 360-369. 10.1038/35072078.
Petes TD, Merker JD: Context dependence of meiotic recombination hotspots in yeast: the relationship between recombination activity of a reporter construct and base composition. Genetics. 2002, 162: 2049-2052.
Galtier N, Piganeau G, Mouchiroud D, Duret L: GC-content evolution in mammalian genomes: the biased gene conversion hypothesis. Genetics. 2001, 159: 907-911.
Duret L, Galtier N: Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet. 2009, 10: 285-311. 10.1146/annurev-genom-082908-150001.
Haldane JB: Sex-ratio and unisexual sterility in hybrid animals. J Genet. 1922, 12: 101-109. 10.1007/BF02983075.
Beeckmann P, Schröffel J, Moser G, Bartenschlager H, Reiner G, Geldermann H: Linkage and QTL mapping for Sus scrofa chromosome 1. J Anim Breed Genet. 2003, 120 (suppl1): 1-10.
Ellegren H, Chowdhary BP, Fredholm M, Høyheim B, Johansson M, Bräuner Nielsen PB, Thomsen PD, Andersson L: A physically anchored linkage map of pig chromosome 1 uncovers sex- and position-specific recombination rates. Genomics. 1994, 24: 342-350. 10.1006/geno.1994.1625.
Guo Y, Mao H, Ren J, Yan X, Duan Y, Yang G, Ren D, Zhang Z, Yang B, Ouyang J, Brenig B, Haley C, Huang L: A linkage map of the porcine genome from a large-scale White Duroc x Erhualian resource population and evaluation of factors affecting recombination rates. Animal Genetics. 2009, 40: 47-52. 10.1111/j.1365-2052.2008.01802.x.
Meunier J, Duret L: Recombination drives the evolution of GC-content in the human genome. Mol Biol Evol. 2004, 21: 984-990. 10.1093/molbev/msh070.
Petkov PM, Broman KW, Szatkiewicz JP, Paigen K: Crossover interference underlies sex differences in recombination rates. Trends Genet. 2007, 23: 539-42. 10.1016/j.tig.2007.08.015.
Dumont BL, Payseur BA: Genetic analysis of genome-scale recombination rate evolution in house mice. PLoS Genetics. 2011, 7: e1002116-10.1371/journal.pgen.1002116.
Lynn A, Schrump S, Cherry J, Hassold T, Hunt P: Sex, not genotype, determines recombination levels in mice. Am J Hum Genet. 2005, 77: 670-675. 10.1086/491718.
Kent WJ: BLAT-the BLAST-like alignment tool. Genome Res. 2002, 12: 656-664.
Coop G, Wen X, Ober C, Pritchard JK, Przeworski M: High-resolution mapping of crossovers reveals extensive variation in fine-scale recombination patterns among humans. Science. 2008, 319: 1395-1398. 10.1126/science.1151851.
This work was supported by USDA AG 2008-34480-19328 and USDA-ARS 538 AG58-5438-7-317 l. ALA was supported by a Biotechnology and Biological Sciences Research Council (UK) Institute Strategic Programme grant and the ROS pedigrees with earlier funding from the Ministry of Agriculture, Fisheries and Food (UK); the ROS pedigrees were genotyped by ARK-Genomics.
The authors declare that they have no competing interests.
FT calculated recombination frequencies and wrote the paper; BT developed the program to calculate the recombination frequencies and RH mapping of the SNPs; LF analysed correlations with sequence features and GC percentage; HJM was involved in the analysis and discussion of the correlation between the genomic landscape and recombination frequency; DM performed RH mapping of SNPs; RW and GR SNP genotypes of the USDA pedigree; JB SNP genotypes of the ILL pedigree; ALA SNP genotypes of the ROS pedigree; LBS genotypes of the UIUC pedigree; MAMG overall coordination and finalizing the paper. All authors were involved in improving the manuscript. The final manuscript version was reviewed and approved by all the authors.
Tortereau, F., Servin, B., Frantz, L. et al. A high density recombination map of the pig reveals a correlation between sex-specific recombination and GC content. BMC Genomics 13, 586 (2012). https://doi.org/10.1186/1471-2164-13-586