Genomic organization of duplicated major histocompatibility complex class I regions in Atlantic salmon (Salmo salar)

Background We have previously identified associations between major histocompatibility complex (MHC) class I and resistance towards bacterial and viral pathogens in Atlantic salmon. To evaluate if only MHC or also closely linked genes contributed to the observed resistance we ventured into sequencing of the duplicated MHC class I regions of Atlantic salmon. Results Nine BACs covering more than 500 kb of the two duplicated MHC class I regions of Atlantic salmon were sequenced and the gene organizations characterized. Both regions contained the proteasome components PSMB8, PSMB9, PSMB9-like and PSMB10 in addition to the transporter for antigen processing TAP2, as well as genes for KIFC1, ZBTB22, DAXX, TAPBP, BRD2, COL11A2, RXRB and SLC39A7. The IA region contained the recently reported MHC class I Sasa-ULA locus residing approximately 50 kb upstream of the major Sasa-UBA locus. The duplicated class IB region contained an MHC class I locus resembling the rainbow trout UCA locus, but although transcribed it was a pseudogene. No other MHC class I-like genes were detected in the two duplicated regions. Two allelic BACs spanning the UBA locus had 99.2% identity over 125 kb, while the IA region showed 82.5% identity over 136 kb to the IB region. The Atlantic salmon IB region had an insert of 220 kb in comparison to the IA region containing three chitin synthase genes. Conclusion We have characterized the gene organization of more than 500 kb of the two duplicated MHC class I regions in Atlantic salmon. Although Atlantic salmon and rainbow trout are closely related, the gene organization of their IB region has undergone extensive gene rearrangements. The Atlantic salmon has only one class I UCA pseudogene in the IB region while trout contains the four MHC UCA, UDA, UEA and UFA class I loci. 
The large differences in gene content and most likely function of the salmon and trout class IB regions clearly argue that sequencing of salmon will not necessarily provide information relevant for trout and vice versa.


Background
Major histocompatibility complex (MHC) class I and class II molecules are vital parts of the cellular immune system presenting self and/or foreign peptides to CD8 positive and CD4 positive T cells. Both classes of genes reside in a 4 Mb gene dense region on human chromosome 6 shared with many other immune genes [1].
Atlantic salmon and rainbow trout genomes encode one major MHC class I locus designated UBA in addition to the major MHC class II alpha and beta genes designated DAA and DAB respectively [2][3][4]. For UBA, the main polymorphism resides in the alpha 1 and alpha 2 domains with up to 60% sequence divergence between these antigen binding domains. Added variability for UBA is produced by shuffling of exon 2 onto different exon 3 and downstream regions through recombination occurring in intron 2 [4]. Additional class I loci and lineages have been described in both Atlantic salmon and rainbow trout. The majority of reported salmonid MHC class I molecules are classified into a U-lineage comprising both UBA and non-classical MHC molecules [5,6]. Two other described MHC class I-like lineages are ZE described by Miller et al. [5] and L described by Dijkstra et al. [7].
In all teleosts studied so far including salmonids the MHC class I and class II regions are unlinked [3,8]. Sequence data on the MHC class I region is available from zebrafish [9], fugu [10], medaka [11,12] and rainbow trout [6]. A general feature of these four MHC class I regions is a core region containing genes for the proteasome components (PSMBs) and the transporter for antigen processing (TAP2) being flanked by various numbers of MHC class I loci in addition to many other genes also residing in the human MHC region located on chromosome 6. Data from medaka and zebrafish indicate that other fish orthologs of the mammalian MHC-encoded genes are dispersed on several different chromosomes [13][14][15][16], similar to the paralogue MHC regions described on human chromosomes 1, 9 and 19 [17]. Salmonids are seen as partially tetraploid with a unique whole genome duplication occurring between 25 and 125 million years ago (mya) with remnants of tetraploidy visible also today [18][19][20]. Shiina et al. [6] sequenced two duplicated core MHC regions of rainbow trout. Based on sequence divergence they estimated the duplication event to have taken place approx. 60 mya, in agreement with the salmonid whole genome duplication theory. The classical or IA region contained the major expressed classical MHC class I UBA locus while the duplicated region denoted IB contained the four Onmy-UCA, -UDA, -UEA and -UFA class I loci. Based on expression and polymorphism data, Onmy-UCA, -UDA and -UEA were defined as non-classical loci and -UFA as a pseudogene due to an incapacitating mutation in exon 3 [6].
Data is rapidly emerging on associations between MHC and resistance to salmonid pathogens. In Atlantic salmon, UBA genotypes have been found to provide resistance towards Aeromonas salmonicida and Infectious Salmon Anaemia Virus [21,22]. Class IB, but not class IA was found associated with susceptibility towards infectious hematopoietic necrosis virus (IHNV) in Atlantic salmon and towards infectious pancreatic necrosis virus (IPNV) in rainbow trout [23,24].
Both trout and salmon are main aquaculture species, and understanding how their MHC regions influence disease resistance will help improve breeding schemes for this trait. Atlantic salmon and rainbow trout are estimated to have split approx. 20 mya [25]. As Atlantic salmon is a major aquaculture species and displays some differences in response to pathogens when compared to rainbow trout [26], we ventured into sequencing of the two duplicated MHC class I regions of Atlantic salmon. Here we describe the gene organization of these two MHC class I regions comprising approx. 500 kb each and compare our results to data from other teleosts.

Results and discussion
The aim of this study was to characterize the gene organization and identify new genes potentially contributing to disease resistance in the two MHC class I regions of Atlantic salmon.

Characterization and sequencing of BAC clones
Sasa-UBA and TAP2 probes hybridized to 74 BAC clones, where 18 clones were positive for both probes. The 74 BAC clones were ordered into three contigs using restriction fragment analysis together with GRASP HindIII fingerprint information [27].
The two contigs that were positive for UBA, TAP2, PSMB9 and PSMB8 by Southern hybridization were tested for presence of a polymorphic dinucleotide repeat located in the 3'UTR of the UBA locus [3]. Only BAC clones from one of the two contigs gave PCR-products, thus this contig was defined as the IA region, and the other contig remained a candidate for the duplicated IB region. The BAC clones in the third contig hybridized to the UBA probe as well as a mixed UBA exon 2 probe. These clones also tested positive for a U-lineage ULA locus that has previously been found closely linked to UBA [5].
Three BACs were sequenced from the IA region. The BAC clones 92I04 and 714P22 indicated allelic variants based on variation in the UBA 3'UTR marker (data not shown) with 523M19 as a continuation of 714P22. From the duplicated IB region we chose 8I14, 424M17, 15L20 and 189M18 for sequencing. 30C23 was chosen as a candidate from the third contig and was extended 5 kb with the sequence of 868O01. The selected BAC clones were subcloned, sequenced, and assembled into continuous sequences. The Atlantic salmon IA region consisted of the BAC clones 30C23, 868O01, 92I04, 714P22 and 523M19 covering 502869 bp, while the IB region consisted of 8I14, 424M17, 15L20 and 189M18 totaling 522617 bp.

Gene organization of the Atlantic salmon MHC class I regions
We have adopted the nomenclature described by Shiina et al. [6] with IA covering the UBA locus region and IB for the duplicated region. Thus the genes identified in the regions will be named accordingly; the IA proteasome subunits are given an extension of a (PSMB9a) and the IB genes have an extension of b (PSMB9b). The previous symbol ABCB3 has been withdrawn for the transporter for antigen processing 2, so we have used the current symbol TAP2 [28].
The gene organization of the IA and IB MHC regions is shown in Fig. 1. A core region was identified in both regions which included MHC class I genes, together with the proteasome subunit genes PSMB8 (LMP7), PSMB10 (MECL-1), PSMB9-like (LMP2-δ), PSMB9 (LMP2) and TAP2. The gene order and orientation of the Atlantic salmon PSMBs and TAP2 were very similar to those found in rainbow trout and other teleosts (Fig. 2). For the IA region, the main difference between Atlantic salmon and rainbow trout is that the rainbow trout PSMB8a gene is a pseudogene.
The IA region contained the major MHC class I Sasa-UBA locus and the recently reported Sasa-ULA locus residing approximately 50 kb upstream. The duplicated class IB region contained an MHC class I locus resembling the rainbow trout UCA locus, but although transcribed it was a pseudogene. No other MHC class I-like genes were detected in the two duplicated regions.
Outside the core region we found 12 Atlantic salmon orthologs of genes residing in the extended human MHC class II region. Alternative nomenclature for these genes is described in Table 1. The following genes were found in both the IA and IB regions; KIFC1, ZBTB22, DAXX, TAPBP, BRD2, COL11A2, RXRB and SLC39A7. Three orthologs found in the IA region only were RING1, RPS18 and VPS52. For other teleosts, the gene organization of the extended MHC class I region is partly known for zebrafish [9,16], fugu [10,15] and medaka [11,12]. The TAPBP, DAXX, ZBTB22 and KIFC1 genes are conserved in the same order in both fish and human (Fig. 2). As described in medaka we also found a gene for ZNF384 in the IA region, which is located on Chromosome 12 in human.
HSD17B8, which resides in between SLC39A7 and RING1 in the extended human class II region, was found in the IB region only and showed more than 81% identity towards counterparts in tilapia [Genbank:AAV74184], zebrafish [Genbank:CAK04961] and medaka [Genbank:BAB83840]. As HSD17B8 is also present in other fish MHC class I regions (Fig. 2), it has most likely been deleted from the Atlantic salmon IA region.
Three orthologs of genes located in the human class I region were identified in the IA region; TCF19, TUBB and FLOT1. Atlantic salmon tubulin is highly conserved and showed more than 94% identity towards mammalian counterparts. Another highly conserved gene is RPS18, which showed 98% identity towards mammalian sequences.
A gene that was predicted by DIGIT in the IA region had one EST match [Genbank:DW569240], but no homology to annotated proteins and is thus denoted unknown in Fig. 1. However, some sequence identity was found towards a protein in zebrafish located on chromosome 19 [Genbank:XP_001344849] as well as to a Tetraodon nigroviridis protein [Genbank:CAF97811], which could indicate a molecule unique to teleosts.
In addition to the genes described above we identified genes for PVRL2, RT, a VSHV-induced gene and a novel gene similar to a non-vertebrate chitin synthase protein, none of which are MHC linked in humans. The human PVRL2 is located on chromosome 19 (19q13.2-q13.4). A homologue of this gene is also found on zebrafish chromosome 19 [Genbank:XP_689425]. A 220 kb insertion was found in the IB region in between the RXRB and SLC39A7 genes containing three copies of a chitin synthase gene approx. 45 kb apart (Figs. 1 and 3). Chitin synthase is involved in the synthesis of chitin, which is a main structural component of the fungal cell wall. A similar protein has also been identified in zebrafish [Genbank:CAK04859]. No chitin synthase genes were present in the IA region, nor are chitin synthase genes found in any other teleost MHC region, suggesting a single insertion of this gene in the IB region followed by two duplications (Fig. 2).
Most genes in both regions are supported by matching cDNAs apart from TCF19 and COL11A2 where no match has been found so far (Table 1). Other open reading frames were also identified, but were associated with transposon related repetitive elements.

Comparison of the IA and IB regions
The two allelic BACs 92I04 and 714P22 had an overall sequence identity of 99.2% over 124574 bp, with similar exon intron organization for all genes. The major differences between the two allelic regions resided in the UBA α2 and α3 exons and in the number of repeats (data not shown). Dotplot analysis of 714P22 or 92I04 against themselves showed no extended regions of local similarity, with the exception of the TAP2 region which showed similarity due to a duplicated TAP2 exon 11 (data not shown).
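As an illustration of how a figure such as "99.2% identity over 124574 bp" can be derived from a pairwise alignment, the following minimal Python sketch counts matching positions. It assumes that the two sequences are already aligned and that columns containing a gap are excluded from the comparison; the text does not state how gaps were treated, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are ignored; other conventions count them
    as mismatches and would give a lower value.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (not salmon sequence): 7 of 8 ungapped columns match.
toy_identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
```

With real BAC sequences the alignment itself would come from a dedicated alignment program; only the counting step is sketched here.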
A dot plot analysis of more than 500 kb of the IA and IB regions showed four regions with high sequence similarity consisting of subregion one containing genes for KIFC1 to TAPBP, subregion two ranging from PSMB8 to TAP2, subregion three covering BRD2 to RXRB and subregion four containing SLC39A7 (Fig. 3). The conserved regions in IA and IB have 82.5% identity over 136104 bp. In total, repeats constituted approximately 24% of the sequence in both regions, and 17% of the repeats were fish-specific DNA elements.
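The principle behind a dot plot can be sketched in a few lines of Python: every position pair at which the two sequences share an identical word is recorded as a dot, so conserved subregions appear as diagonal runs and internal repeats as off-diagonal dots in a self-comparison. This is a simplified exact-word version for illustration only; the program and parameters used for Fig. 3 are not given in this section, and the function name `dotplot_matches` and the word size are assumptions.

```python
from collections import defaultdict


def dotplot_matches(seq_a: str, seq_b: str, word: int = 4):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word]."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)
    # Look up every word of seq_a in the index.
    hits = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            hits.append((i, j))
    return hits


# Self-comparison of a toy sequence with an internal repeat of "ACGT":
# the main diagonal plus the off-diagonal dots (0, 4) and (4, 0).
self_hits = dotplot_matches("ACGTACGT", "ACGTACGT", word=4)
```

For sequences of several hundred kb a larger word size would normally be chosen to suppress chance matches.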

MHC Class I genes
Sasa-UBA
The promoter, leader and α1 exons of Sasa-UBA were identified in 30C23/868O01, while the remaining exons of Sasa-UBA were found in 92I04 and 714P22. The leader and α1 exons found in 30C23/868O01 were identical to the PCR amplified UBA*0201 allele [Genbank:AF504023] as well as to the leader and α1 exons of another salmon full-length cDNA [Genbank:DY698957]. Together with the α2 and α3 exons of 92I04 they collectively provide a bona fide UBA*0201 allele. The UBA α2 exon and downstream sequences of the two allelic BACs 92I04 and 714P22 have complete sequence identity to the Sasa-UBA*0201/*0301 and Sasa-UBA*0601 alleles respectively. UBA*0201 and UBA*0301 are prime examples of the recombination shown to occur within intron 2 of salmonid UBA alleles [4], showing complete sequence identity in the α2 and downstream regions, but highly divergent α1 exons. The predicted amino acid sequences of UBA, ULA and the two open reading frames of UCAψ were aligned for comparison of the MHC class I genes encoded in the two regions (Fig. 4).

Analysis of the promoter sequence of UBA*0201 in 30C23/868O01 showed high similarity to a rainbow trout UBA*1501 promoter [29], both containing similar regulatory elements typical for MHC class I promoters such as an interferon stimulated response element (ISRE), W/S-box and enhancer B (enhB) (Fig. 5). The UBA*0201 promoter contains a potential site α element according to the core sequence (TGACGC) [30] while a sequence more resembling an X2-box has been found in rainbow trout (TGAG-GCA). Both the site α and the homologous X2-box found in mammalian MHC class I and MHC class II promoters respectively, are involved in regulation of transcription and bind ATF/CREB family transcription factors [31]. A potential TATA-box was also identified in the promoter sequence of UBA*0201 [32]. A salmon UBA*0301 promoter published by Jorgensen et al. [33] had lower sequence identity to the UBA*0201 promoter, but both promoters are supported by complete identity to 5'UTR cDNA sequences of bona fide UBA alleles, suggesting that Atlantic salmon UBA alleles have different promoters. The functional consequences of these differences are being investigated.
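Locating a short core element such as the site α core (TGACGC) in a promoter sequence amounts to an exact-match scan of both strands. The Python sketch below shows one way to do this; it is not the procedure used for Fig. 5, the function names are hypothetical, and the sequence in the example is a toy string rather than the UBA*0201 promoter.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str):
    """Return (position, strand) for exact motif matches on either strand.

    Positions are 0-based and refer to the forward strand.
    """
    seq, motif = seq.upper(), motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif and rc_motif != motif:
            hits.append((i, "-"))
    return hits


# Toy sequence with one forward and one reverse-strand copy of the core.
core_hits = find_motif("AATGACGCTTGCGTCAAA", "TGACGC")
```

Exact matching is deliberately strict; degenerate elements such as the ISRE would need a pattern that tolerates mismatches.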
The 30C23/868O01 and 92I04 BACs jointly have an intron sequence of 7 kb while in rainbow trout, the intron between the UBA α1 and α2 exons is 18 kb [6] suggesting we lack approximately 11 kb to have a continuous genomic sequence of the entire UBA region. PCR and cloning of the gap was performed multiple times, but despite successful PCR amplification no fragments covering the gap have been cloned suggestive of an unclonable region. The amplified products support an intron sequence of approx. 18 kb. Unfortunately no mRNA or cDNA is available from the BAC library fish, preventing verification of expressed UBA alleles in this animal. To verify the linkage between 92I04 and 30C23 fluorescent in situ hybridization was undertaken and showed that both BACs hybridized to the same region of one of the smallest chromosomes, potentially chromosome 27 (Fig. 6). The close linkage described by Miller et al. [5] between ULA and UBA also supported 30C23/868O01 being an extension of the IA region.

Figure 2. Comparison of the human, Atlantic salmon, rainbow trout, medaka, zebrafish and fugu MHC class I. Color code: red is MHC class I genes, orange is MHC class II genes, yellow is TAP genes, green is proteasome genes, blue is human extended MHC class II region genes, purple is human class I region genes, grey is non-human class I region genes, black is genes unique to the medaka HN1 strain [12]. Pseudogenes are striped. Human class III region genes are not shown. References are: zebrafish [9,16], fugu [10], medaka [11,12], rainbow trout [6] and human [1].
Another EST in the cGRASP database [34,35] provided us with a full-length match [Genbank:DY699730]. The exon encoding the transmembrane domain is missing, suggestive of a secreted MHC class I molecule (Fig. 4). Secreted class I molecules are also found in humans, where the potential role of secretory HLA-G is currently being deciphered. The 30C23 ULA gene has an α1 exon with highest sequence identity to UBA*0301 while α2 and downstream exons have highest identity to UBA*0801.
No ESTs for ULA have been identified in rainbow trout, and a negative PCR-based survey for this gene in rainbow trout by Miller et al. [5] suggest this gene may be unique to Atlantic salmon.

Sasa-UCAψ
Only one MHC class I locus was identified in the four BACs representing the IB region. This locus, found in 8I14, showed highest sequence identity to the Onmy UCA*0301 allele and was thus denoted Sasa-UCA. Multiple salmon ESTs with a polymorphic pattern resembling that of Onmy-UCA sequences were found in databases. However, both the 8I14 UCA ORF sequence and the matching ESTs (Table 1) contained an internal stop codon in exon 3, making Sasa-UCA an expressed pseudogene. The exon intron organization of the UBA, ULA and UCAψ loci is quite similar apart from the enlarged first intron in ULA, the even larger second intron of UBA and the missing transmembrane exon of ULA (Fig. 7).
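The check that defines an expressed pseudogene here, an in-frame stop codon upstream of the normal end of the reading frame, can be sketched as follows. The code assumes the standard genetic code and a coding sequence that starts in frame; the function name `first_internal_stop` and the example sequences are hypothetical and are not Sasa-UCA sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def first_internal_stop(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    for k in range(n_codons - 1):  # the final codon is allowed to be a stop
        if cds[3 * k:3 * k + 3] in STOP_CODONS:
            return k
    return None


# Toy reading frames: the first has a premature TGA at codon index 2.
disrupted = first_internal_stop("ATGGCCTGAGGATAA")
intact = first_internal_stop("ATGGCCGGATAA")
```

Applied to a spliced cDNA or EST, a non-None result flags a transcript that cannot encode the full-length protein, which is the situation described for Sasa-UCA.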

Antigen presenting genes
Previously reported cDNAs for TAP2, which were assumed to reside in the IA and IB region and denoted TAP2B [Gen-  (Fig. 8).
Atlantic salmon IA and IB TAP2 sequences have more than 90% aa sequence identity and a similar identity to the rainbow trout IA and IB TAP2 sequences described by Shiina et al. [6].
Other TAP2 ESTs that were difficult to define as TAP2a or TAP2b variants were also found in databases, such as GraspTAP2-1 in Fig. 8. Attempts to decipher locus origin including rainbow trout information show that the TAP2a and TAP2b sequences described by Shiina et al. [6] resemble the TAP2b sequence identified in Atlantic salmon, containing for instance the characteristic FCA motif at position 25 and the two aa deletion at position 110 (Fig. 8). A rainbow trout TAP2a (previously denoted TAP2B) [Genbank:AAD53035] sequence described by Hansen et al. [37], shown by in situ hybridization to reside in the IA region [8], resembles the Atlantic salmon IA TAP2a sequence and does not contain the motifs mentioned above. Thus, rainbow trout has a polymorphic TAP2a locus and the confusing sequence identities between the two TAP2 loci may suggest that these genes are exposed to recombination or gene conversion mechanisms. Locus designation of either salmon or trout TAP2 sequences therefore cannot be performed on sequence alone, but must be verified by linkage mapping. Other more divergent Atlantic salmon TAP2 ESTs [Genbank:DW580644 and Genbank:DW577601] have approx. 50% sequence identity to all above described IA and IB TAP2 sequences (GraspTAP2-2 in Fig. 8), but have 94% sequence identity to a rainbow trout TAP2 variant described by Hansen et al. [37] (previously denoted TAP2A) [Genbank:AF115537]. Whether these sequences represent an additional TAP2 locus, i.e. a TAP2c locus, or are allelic variants of the TAP2a/b loci is currently unknown. Ancient lineages of divergent MHC class I, TAP1, TAP2 and LMP7 haplotypes have been described in Xenopus where the sequence identity between allelic TAP2s was less than 76% [38]. Similar ancient lineages of UBA and TAP2a may also exist in salmonids, where we were unfortunate enough to sequence allelic variants belonging to similar lineages.

Figure 3. Dot-plot analysis of the Atlantic salmon MHC IA and IB regions.

Salmonid MHC evolution and function
In the Atlantic salmon IB region we found only one MHC class I pseudo locus denoted UCAΨ, which is still being transcribed and shows a polymorphic pattern similar to that of rainbow trout UCA and UDA [42]. The rainbow trout IB region contained four MHC class I loci denoted UCA, UDA, UEA and UFAΨ [6]. As suggested by Shiina et al. [6] there has been a primordial salmonid MHC region containing three MHC class I loci (UCA-, UEA- and UBA-like) where UEA and UBA have been deleted from the Atlantic salmon IB region and UCA and UEA have been deleted from the Atlantic salmon IA region. The trout IB UDA locus is a duplication of UCA that occurred in trout only. Once the extended trout IA region is sequenced we will see if the UBA to ULA duplication occurred in both species and if the UCA and UEA homologues have been retained in this region of trout.
The salmonid whole-genome duplication was estimated to have occurred between 25 and 125 mya [18] while the study of Shiina et al. [6] estimates the duplication to have occurred 60 mya based on sequence identity of the MHC class I regions. Evolving from a tetraploid to a diploid state includes not only accumulation of mutations, but also random rearrangements and recombinations as exemplified by the multiple deletions that have occurred in the Atlantic salmon IA and IB regions. With a sequence identity between the Atlantic salmon IA and IB regions of approximately 82 percent, recombination may even be occurring between the two duplicates today. Salmonids are also known for using recombination within the second intron of the UBA locus to generate "new" alleles using exons already tested for functionality [2][3][4]. As recombination was not observed in 800 siblings [43], the recombination frequency is probably low. One way of reducing the risk of recombination between duplicates may be insertions such as the 220 kb insertion with three copies of chitin synthase genes in the IB region.
Another example of differences between Atlantic salmon and rainbow trout is the chromosomal location of the IA region. In both species, the IB region is located on chromosome 14 [6,8] (data not shown for salmon), while the IA region is located on chromosome 18 in rainbow trout and on one of the smaller chromosomes, potentially chromosome 27, in Atlantic salmon (Fig. 6) [8,44]. In Atlantic salmon, the IA and IB regions map to linkage groups 15 and 3 respectively [45], while in rainbow trout they map to linkage groups 16 and 3 [20] supporting the differences. Atlantic salmon and rainbow trout have diploid chromosome numbers ranging from 58 to 64 [46,47].

The class IB region has been found associated with susceptibility towards IHNV in Atlantic salmon and IPNV in rainbow trout, where the polymorphic UCA, UDA or UEA loci were suggested as prime candidates for the observed effects [23,24]. As our study indicates that the Atlantic salmon IB region only contains a UCA pseudolocus, there must either be other genes flanking our BACs which contribute to resistance or there could be haplotype variation in number of class I loci between Norwegian and Canadian Atlantic salmon.
The IA region was not found associated with resistance towards either IHNV in Atlantic salmon or IPNV in rainbow trout. Atlantic salmon UBA genotypes have however been shown to provide resistance towards the viral pathogen causing Infectious Salmon Anaemia (ISA) [21,22]. An ongoing study will identify the role of Atlantic salmon IA and IB in providing resistance towards IPNV, enabling us to distinguish between effects of pathogen differences and effects of genetic organization. Apart from the potential TAP2a and UBA lineages, limited polymorphism in PSMBs and other linked loci suggests that the observed linkage between Sasa-UBA and disease resistance in Norwegian Atlantic salmon [21,22] is caused by Sasa-UBA alleles or genotypes and not closely linked genes. However, the PSMBs and TAP2 molecules residing in the IB region might still influence the overall peptide repertoire available for presentation by UBA alleles. Due to the pseudo status of Sasa-UCA, the PSMBs and TAP2b in the IB region will most likely degenerate over time.

Conclusion
We have characterized the gene organization of more than 500 kb of the two duplicated MHC regions in Atlantic salmon. Although Atlantic salmon and rainbow trout are closely related, the gene organization of their IB region has undergone extensive gene rearrangements. The Atlantic salmon had only one identified MHC class I UCA pseudogene in the IB region while this region in trout contained the four MHC class I loci UCA, UDA, UEA and UFAψ. The Atlantic salmon IB region also contained a 220 kb insertion as compared to the IA region, potentially limiting recombination between the two regions. The large difference in gene content and most likely function of salmon and trout class IB regions clearly argues that sequencing of salmon will not necessarily provide information relevant for trout and vice versa.

Characterization of BACs
MHC class I positive BAC clones were ordered into contigs using restriction fragment analysis together with GRASP HindIII fingerprint information [27]. Southern blot analysis of NotI (NEB) and NruI (NEB) digested BAC DNA was performed to characterize the clones. The digested DNA was electrophoresed for 16 h and then transferred to Hybond membranes (Amersham). The MHC class I and TAP2 probes described earlier, together with probes for PSMB8 and PSMB9 (unpublished data), were used for hybridization to the Southern blots. A mixed probe containing 5 UBA leader to alpha1 exons amplified from the alleles UBA*0201, *0301, *0801, *0901 and *1001 cDNAs (primers listed in Table 3) was also used. Hybridization with end-labeled Sp6 and T7 oligos was used to orient end-fragments of BAC inserts.
Blots were prehybridized at 65°C for 30 minutes in hybridization buffer (5× SSC, 5× Denhardt's solution and 1% SDS) with. This was followed by replacement with fresh, preheated (65°C) hybridization buffer and the addition of the radiolabeled probes. Hybridization was allowed to proceed overnight. Following hybridization, the membranes were washed three times with 20 ml of 2× SSC, 0.1% SDS at 65°C for 30 min. Prehybridization, hybridization and wash conditions were the same for all probes. To further characterize the BACs we used primers spanning a polymorphic (CA)n repeat located in the 3'UTR of the UBA locus [3], both on individual BAC DNA and on genomic DNA from the animal the library was made from. PCR on genomic DNA from the BAC library animal was performed with GAP-primers (Table 3) and Herculase Enhanced polymerase (Stratagene) according to protocol. Amplified products were ligated into three different vectors using TOPO-TA Cloning Kit with pCR2.1-TOPO (Invitrogen), TOPO-XL PCR Cloning Kit with pCR-XL-TOPO (Invitrogen) and CloneSmart LCKan Blunt Cloning Kit with pSMART LCKan (Lucigen Corporation) and subsequently transformed into XL-10 Gold cells (Stratagene).
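The polymorphic (CA)n marker mentioned above distinguishes alleles by the number of CA units, which can also be read directly from sequence. The following Python sketch locates (CA)n tracts with a regular expression; it is an in silico illustration only and not part of the PCR-based typing described here. The function name `find_ca_repeats`, the minimum unit count and the example sequence are assumptions.

```python
import re


def find_ca_repeats(seq: str, min_units: int = 5):
    """Return (start, n_units) for every run of at least min_units CA units.

    start is the 0-based position of the first C of the run.
    """
    pattern = re.compile(r"(?:CA){%d,}" % min_units)
    return [(m.start(), len(m.group()) // 2)
            for m in pattern.finditer(seq.upper())]


# Toy sequence: a (CA)6 tract followed by a shorter (CA)2 tract.
toy_seq = "GGT" + "CA" * 6 + "TT" + "CA" * 2 + "GG"
long_tracts = find_ca_repeats(toy_seq)             # only the (CA)6 tract
all_tracts = find_ca_repeats(toy_seq, min_units=2)  # both tracts
```

Two alleles that differ in repeat number would return different unit counts at the corresponding position, mirroring the fragment-length difference seen after PCR.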