Refined physical map of the human PAX2/HOX11/NFKB2 cancer gene region at 10q24 and relocalization of the HPV6AI1 viral integration site to 14q13.3-q21.1

Background: Chromosome band 10q24 is a gene-rich domain and host to a number of cancer, developmental, and neurological genes. Recurring translocations, deletions and mutations involving this chromosome band have been observed in different human cancers and other disease conditions, but the breakpoint sites have yet to be precisely identified, and the genetic basis and mechanisms that underlie many of these rearrangements have yet to be characterized in detail. Towards this end it is vital to establish a definitive genetic map of this region, which to date has shown considerable volatility over time in the published literature, within different builds of the same international genomic database, and across the differently constructed databases.

Results: Using a combination of chromosome and interphase fluorescent in situ hybridization (FISH), BAC end-sequencing and genomic database analysis we present a physical map showing that the order and chromosomal orientation of selected genes within 10q24 is CEN-CYP2C9-PAX2-HOX11-NFKB2-TEL. Our analysis has resolved the orientation of an otherwise dynamically evolving assembly of larger contigs upstream of this region, and in so doing verifies the order and orientation of a further 9 cancer-related genes and GOT1. This study further shows that the previously reported human papillomavirus type 6a DNA integration site HPV6AI1 does not map to 10q24, but instead maps at the interface of chromosome bands 14q13.3-q21.1.

Conclusions: This revised map will allow more precise localization of chromosome rearrangements involving chromosome band 10q24, and will serve as a useful baseline to better understand the molecular aetiology of chromosomal instability in this region.
In particular, the relocation of HPV6AI1 is important to report because this HPV6a integration site, originally isolated from a tonsillar carcinoma, was shown to be rearranged in other HPV6a-related malignancies, including 2 of 25 genital condylomas, and 2 of 7 head and neck tumors tested. Our finding shifts the focus of this genomic interest from 10q24 to the chromosome 14 site.


Background
The genomic interval that spans chromosome bands 10q24-q25 contains a high frequency of structural chromosome aberrations associated with cancer and development. However, existing published maps around the PAX2/HOX11/NFKB2 region of 10q24 have differed with respect to marker and gene content, relative order and chromosomal orientation [1][2][3][4]. More recent inconsistencies in the positioning of selected genes with respect to known markers have also been evident within the human genome databases (Figure 1). An accurate physical map of this region is essential so that disease-related chromosome aberrations can be precisely localized, and so that the interplay of regulatory elements associated with genes that map within this domain can be better understood.
The 10q24 region harbors several loci of fundamental biological importance, including the HOX11 (TLX1,TCL3), PAX2 and NFKB2/LYT-10 genes. HOX11 is an orphan homeobox gene that maps outside of the four recognized homeobox gene clusters. It encodes a DNA-binding nuclear transcription factor [5], and is involved in spleen organogenesis [6,7]. About 5% of T-cell acute lymphoblastic leukaemias (T-ALL) show chromosome translocations involving break-prone regions immediately upstream of the HOX11 gene at 10q24 and the TCR genes at 7q35 or 14q11. HOX11 is activated by TCR α/δ regulatory elements as a result of these rearrangements, and inappropriately expresses a 2.3 kb transcript [8][9][10][11][12][13]. PAX2 is important during embryogenesis, and is aberrantly expressed in renal, breast and prostate tumors [14][15][16]. Mutations in PAX2 are also associated with defects in eye, ear, urogenital tract and CNS development [17,18]. The nuclear factor kappa-B 2 (NFKB2) gene is a member of the NFKB/Rel gene family, the signaling pathways of which are recognized as pivotal to the regulation of acute inflammatory and immune responses, and increasingly to tumor development [19,20]. Translocations or structural alterations of NFKB2 (alias LYT-10) have been associated with 2% of various lymphoid malignancies, and usually result in the partial or total deletion of the carboxyl-terminal region encoding the ankyrin domain [21,22].
As one part of our investigations of a novel leukaemia-related chromosomal translocation (Gough et al, unpublished), we have used a combination of chromosome and interphase FISH, BAC end-sequencing, PCR and genome database analysis to map the order and chromosomal orientation of selected cancer-related genes and the HPV6a viral integration site HPV6AI1 within 10q24. Here we present new data that confirm the order CEN-CYP2C9-PAX2-HOX11-NFKB2-TEL. This finding resolves the physical arrangement of larger DNA sequence contigs containing these and at least 9 other genes that have so far been implicated in different cancer subtypes. Our studies further show that HPV6AI1 does not map within 10q24, but at a different chromosomal site, namely at the interface of chromosome bands 14q13.3-q21.1.

Two-Color FISH Establishes Order and Orientation of Genes and Markers at 10q24
Selected large-insert clones (see Methods) were biotin- or digoxigenin-labeled, pooled in combinations of three, then hybridized to G1-arrested interphase nuclei. For each experiment, after signal amplification using FITC- (green) or Texas Red- (red) tagged immunofluorescent reagents, the order of signals from at least 55 chromosomes, and usually 100, was recorded. After exclusion of chromosomes showing ambiguous signal patterns, the predominant order of signals was taken to represent the physical order of the probes along the chromosome (Table 1, Figure 2a-l). For many combinations, the same three probes were pooled and hybridized twice. For the first of these paired experiments, two probes were labeled with biotin and a third with digoxigenin. For the second, the probe label and detection color were reversed for two of the probes. By this alternation, and through a sequence of different probe combinations, a final arrangement of genes and markers across the 10q24 region was derived (Table 1, Figures 1d,2).

Figure 1
Genes and sites relevant to reagents used in this study are shown in bold or in grey-filled boxes, and if not marked, were not found in that particular Build. Also indicated are key STS sites common between previously published maps of this region [1][2][3][4] (C,D). For A and B, only D10S571, D10S1266, LGI1, GOT1 and WNT8B are shown, in addition to markers relevant to the reagents of this study (if found), to illustrate major rearrangements of database contigs within this region through the different NCBI Builds. The locations of 7 additional cancer-related genes are indicated in C and D, with the relative locations of clones used in the present study shown aligned with the recently released Build 31 (D) (see Results for further details). Arrowheads above maps B-D correspond to cancer-related chromosome breakpoint sites associated with disruption of the LGI1 gene [23,46], HOX11 [8][9][10][11][12][13] and NFKB2/LYT-10 [21,22].
The cancer-related gene MXI1 [30]

Figure 2
Metaphase (c,j) and interphase (a,b,d-i,k,l) FISH analysis determines order and relative orientation of selected cancer-related gene probes within 10q24 (m). For interphase figures (a-l), locus identities are indicated in text beneath each representative image, with text color corresponding to the FISH detection color used. Observed frequencies (%) of the probe orders shown are also indicated. Note that the triangular pattern of probes visible on one homologue in a) was classified ambiguous.
BAC clones 10E3, 10D14 and 10D15 were each first hybridized to normal metaphase chromosomes and confirmed to map to 10q24 (data not shown). Interphase FISH experiments subsequently revealed the clone order 10D15-10D14-10E3 (Table 1, Figure 2a,2b). The chromosomal orientation of this map was determined when the three pooled biotinylated BAC clones were shown to map centromeric of the digoxigenin-labeled YAC clone 29CD5 (PAX2) on the majority of metaphase chromosomes 10 analyzed (Figure 2c, left), and when BAC 10D15 was similarly shown to map centromeric of BAC 165f21 (PAX2) (Figure 2c, right). The final centromere-telomere orientation of the BAC clones relative to PAX2 was established when BACs 10E3 and 10D14 were hybridized in combination with YAC 29CD5 (PAX2) or BAC 165f21 (PAX2) to interphase nuclei in a series of three different experiments (Table 1, Figure 2d,2e,2f). From these combined data we concluded the preliminary map order centromere-10D15-10D14-10E3-29CD5/165f21/PAX2-telomere.
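The scoring logic behind the paired three-probe interphase experiments can be illustrated with a minimal Python sketch. It assumes that each scored chromosome is recorded as a tuple of detection colors ("R" or "G"), that ambiguous chromosomes are recorded as `None`, and that a pattern and its mirror image are equivalent because an interphase signal row has no intrinsic direction. The probe names A, B and C, the color assignments and the counts are hypothetical and are not data from this study.

```python
from collections import Counter
from itertools import permutations

def canonical(pattern):
    """Treat a colour pattern and its mirror image as the same observation."""
    return min(pattern, pattern[::-1])

def predominant_pattern(scored):
    """Return the most frequent pattern and its frequency, ignoring ambiguous (None) entries."""
    counts = Counter(canonical(p) for p in scored if p is not None)
    pattern, n = counts.most_common(1)[0]
    return pattern, n / sum(counts.values())

def consistent_orders(experiments):
    """List probe orders (up to mirror image) that agree with every experiment.

    experiments: list of (colours, scored) pairs, where colours maps each
    probe name to "R" or "G" and scored is the list of observed patterns.
    """
    probes = sorted(experiments[0][0])
    observed = [(colours, predominant_pattern(scored)[0])
                for colours, scored in experiments]
    orders = set()
    for perm in permutations(probes):
        if perm > perm[::-1]:
            continue  # keep only one representative of each mirror pair
        if all(canonical(tuple(colours[p] for p in perm)) == pattern
               for colours, pattern in observed):
            orders.add(perm)
    return orders

# Hypothetical paired experiments: the colours of B and C are swapped in the second.
exp1 = ({"A": "R", "B": "R", "C": "G"},
        [("R", "R", "G")] * 70 + [("G", "R", "R")] * 5
        + [("R", "G", "R")] * 15 + [None] * 10)
exp2 = ({"A": "R", "B": "G", "C": "R"},
        [("R", "G", "R")] * 80 + [("R", "R", "G")] * 12 + [None] * 8)

orders = consistent_orders([exp1, exp2])
print(orders)
```

In this toy case the first experiment alone only places the green probe at one end, leaving two candidate orders; the second experiment, with two labels reversed, places B in the middle and leaves a single order. Centromere-telomere orientation still requires a separate anchor, as with the metaphase hybridizations against the PAX2 clones.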

Chromosome 10 clone CosC2 does not contain the human papillomavirus integration site HPV6AI1 which instead maps to chromosome 14
The close proximity of CosC2 and λSh3F (HOX11) probes, indicated by a high percentage of overlapping red and green signals (Figure 2i and data not shown), was noteworthy in this analysis and suggested that the HPV6AI1 integration site (CosC2) maps immediately centromeric of HOX11, probably within 200 kb [43,44].

From these studies we conclude that CosC2 does not contain the 2746 bp viral integration site HPV6AI1 of NCBI Accession X77607, but that this site instead maps to the chromosome band 14q13.3-q21.1 interface. We are unable to explain the 209 bp of sequence that is present on chromosome 14q but not in X77607 (Figure 3c). The chromosome 14 "insertion" occurs within an L1PREC2 nontransposable repeat element, and may correspond to an insertion/deletion polymorphism, although sequence artifact cannot be excluded.
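A difference of this kind, where one sequence carries a contiguous block that is absent from an otherwise identical sequence, can be located with a simple prefix/suffix comparison. The sketch below is a minimal Python illustration under that assumption (a single contiguous insertion and no other mismatches); it is not the alignment procedure used in this study, and the short sequences are invented for the example.

```python
def single_insertion(ref, alt):
    """Locate one contiguous insertion in alt relative to ref.

    Returns (position, inserted_sequence), where position is the 0-based
    offset in ref at which the extra bases appear, or None if the two
    sequences do not differ by exactly one contiguous insertion.
    """
    if len(alt) <= len(ref):
        return None
    # Length of the shared prefix.
    prefix = 0
    while prefix < len(ref) and ref[prefix] == alt[prefix]:
        prefix += 1
    # Length of the shared suffix, not overlapping the prefix within ref.
    suffix = 0
    while (suffix < len(ref) - prefix
           and ref[len(ref) - 1 - suffix] == alt[len(alt) - 1 - suffix]):
        suffix += 1
    if prefix + suffix != len(ref):
        return None  # more than a single insertion separates the sequences
    return prefix, alt[prefix:len(alt) - suffix]

# Invented toy sequences: alt carries three extra bases after position 4.
ref = "ACGTTGCA"
alt = "ACGT" + "GGG" + "TGCA"
print(single_insertion(ref, alt))
```

Because the prefix is matched greedily, an insertion that begins with the same bases as the sequence following it is reported at the rightmost equivalent position; for real genomic comparisons a gapped alignment tool would be more appropriate.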

Discussion
Using interphase FISH we have determined the relative order of selected cancer-related genes that map within chromosome band 10q24. This analysis, complemented by BAC end-sequencing and genomic database analysis, has also established the chromosomal orientation of NCBI nucleotide contigs that contain these genes, and which span ~11 Mb. The gene and marker order that we have determined is centromere-BAC10D15-BAC10D14/CYP2C9-

There have been several published studies designed to clarify the physical map across 10q24 [1], although none has specifically focussed on the order and orientation of cancer-related genes or sites of inter- and intra-chromosomal rearrangement involving this region. In one of these reports, Gray et al applied a combination of metaphase FISH (to confirm 10q24 map location) and PCR amplification of microsatellite, STS and other known gene markers to the analysis of a series of overlapping YACs spanning ~15 Mb of DNA [1]. Of particular interest, Gray et al noted a high frequency of rearranged YACs (4/5 analysed) distal to D10S574 and incorporating the GOT1, WNT8B and PAX2 gene cluster. That feature, combined with the inability to isolate YAC clones immediately telomeric of this region, a known BrdU-inducible fragile site in 10q24-q25, and chromosome translocation breakpoint clusters associated with ALL and other lymphoproliferative neoplasms, is suggestive of inherent instability of this chromosomal segment [1]. To date, although chromosomal imbalances involving this region are prevalent across a wide variety of different cancer subtypes, the genomic basis for this instability has yet to be explained.
In the separate study of Nikali and colleagues, which was designed to refine the map location of the IOSCA locus at 10q24, a similar map configuration was derived [2]. For this study, radiation hybrid analysis was used to determine the order of selected microsatellite and other known STS or gene markers, including PAX2, across a >500 kb region between D10S198 and D10S222 [2]. Fibre-FISH was also used to confirm the order and orientation of selected clones within the IOSCA region but did not confirm chromosomal orientation of the contig under study [2].
Neither of the above two studies, nor others reported subsequently [4][54][55][56], has provided unequivocal evidence of the centromere-telomere orientation of PAX2, HOX11 and NFKB2/LYT-10 within the 10q24 region that we have targeted. Nor do they include analyses that convincingly demonstrate the relative location of these and other cancer-related genes across this region. Most published investigations to date have used YAC clones and microsatellite analysis to map the 10q24 region, or have referenced maps constructed with YACs. However, the reported propensity for YACs derived from this region to undergo internal rearrangement and deletions may have contributed to map inconsistencies. The information we have presented is based mostly on smaller-insert BAC, cosmid and bacteriophage clones. Our clarification of the 10q24 physical map is complementary to the International Human Genome Sequencing Consortium BAC-derived assemblies hosted by the NCBI, and will enable more precise mapping of chromosome translocation breakpoints or other structural rearrangements associated with malignant and nonmalignant disease conditions.
Our FISH studies show that the cosmid clone CosC2 maps close, centromeric and upstream of the homeobox gene HOX11. This finding was of interest because CosC2 reportedly contains the HPV6a viral integration site HPV6AI1 previously associated with an infiltrating squamous cell carcinoma of the tonsil [42], and suggested that deregulation of HOX11 by the integration process may have been implicated in this malignancy. However, genomic database analysis of sequences that did not derive specifically from the germline CosC2 insert but which directly flank the previously described HPV6a integration site indicates that HPV6AI1 maps not to 10q24 but to the interface of chromosome band 14q13.3-q21.1. Further database analysis showed that HPV6a integration occurred within a L1PB4 nontransposable repeat, and between the predicted coding domains of FLJ30803 and LOC122529 (data not shown). This location was confirmed by subsequent PCR and somatic cell hybrid investigations. We presume that the cosmid clone C2 was spuriously selected during the original screening process, for which a probe derived from the sequenced region X77607 was used [42].

Table 2 (fragment)
Alu Sb2 f R-GGCCGGACTGCGGACTGCA 65 1.5
a, F = forward, R = reverse; b, as represented in Figure 3; c,d derived from NCBI Accession No X77607 and NT_025892.8, respectively; e,f derived from Alu-repeat consensus and AluYb8 (Sb2) subfamily-specific consensus sequences, respectively (Labuda et al, unpublished).
The human papillomaviruses (HPVs) are a diverse family that type-specifically infects distinct subsets of epithelial cells. The HPV genome persists as an extrachromosomal episome in non- or premalignant tissues, whereas in invasive cancers, the viral DNA typically integrates into the host genome. This event has been shown to increase stability of HPV16 E6 and E7 mRNAs [57]. Recent evidence suggests that HPV integration sites associated with malignancy are distributed widely across the human genome rather than preferentially localized [58][59][60]. However, there is a recognized propensity for HPV DNA integration to occur in or near fragile sites, translocation breakpoint sites, oncogenes or the coding regions of as yet poorly characterized genes that may have relevance to cancer progression [59][60][61][62][63][64][65]. Although HPV6AI1 does not appear to interrupt a coding domain, it is of interest that the HPV6AI1 site falls within a repeat element and close to a possible insertion/deletion polymorphism site on chromosome 14. These findings are consistent with the highly recombinogenic/unstable regions in which papillomavirus integrations are predicted to occur.
Previously published studies of the HPV6a integration site we have relocalized showed that the corresponding normal allele was not present in the originally characterized tonsillar carcinoma cells [42]. Rearrangements in the region of the HPV6a integration site were also implicated by Southern blot analysis of a limited number of other tumors, including 2 of 25 genital condylomas, and 2 (1 tonsillar, 1 hypopharynx) of 7 head and neck tumors tested [42]. Together, these findings suggested that the HPV6a genome had disrupted a novel cancer-related gene or gene regulatory region at 10q24. However, analyses reported here indicate that this site instead maps to 14q13.3-q21.1, and that loci FLJ30803 and LOC122529 may be two such candidate genes. Further investigations of the chromosome 14 integration site are necessary to verify its functional relevance to HPV-related malignancies.

Conclusions
We have applied chromosome and interphase fluorescent in situ hybridization (FISH), BAC end-sequencing, PCR and genome database analysis to map the order and chromosomal orientation of selected cancer-related genes within chromosome band 10q24, and the human papillomavirus Type 6a (HPV6a) viral integration site HPV6AI1. The 10q24 region is already known to harbor several well studied cancer-related genes, but published cytogenetic, comparative genomic hybridization, allelotyping and other molecular studies suggest that yet to be identified genes relevant to cancer initiation and progression map within this domain. Towards this end it is vital to establish a definitive genetic map of this region, which to date has shown considerable volatility over time in the published literature, within different builds of the same international genomic database, and across the differently constructed databases. This volatility has proved an ongoing frustration for positional cloning efforts such as those we are pursuing, and it is for this reason that we found the present study necessary. We describe new data that confirm the order CEN-CYP2C9-PAX2-HOX11-NFKB2-TEL. This finding resolves the physical arrangement of larger DNA sequence contigs containing these and at least 10 other genes that have so far been implicated in different cancer subtypes. Furthermore, our studies show that the viral integration site HPV6AI1 does not map within 10q24, but at a different chromosomal site, namely at the interface of chromosome bands 14q13.3-q21.1. This is a significant finding and important to report because the HPV6AI1 integration site, originally isolated from a tonsillar carcinoma, was shown to be rearranged in other HPV6a-related malignancies, including 2 of 25 genital condylomas, and 2 of 7 head and neck tumors tested (Kahn et al, 1994, Cancer Res. 54: 1305-1312).
Our finding shifts the focus of this genomic interest from 10q24 to a site on chromosome 14 that may potentially harbor a gene with an as yet undetermined role in the etiology of carcinogenesis.

Genomic clones
Recombinant YAC, BAC and bacteriophage lambda clones were used to determine the relative order and orientation of the CYP2C9, NFKB2/LYT-10, HOX11 and PAX2 gene loci, and the HPV6AI1 integration site. BAC clones 10D15, 10D14 and 10E3 were selected from the Cedars-Sinai Medical Center Genomic Reagents Resource database http://www.csmc.edu/genetics/korenberg/intbac-sts.html based on their preliminary map location at 10q24 [66]. BAC clone 165f21, which contains the PAX2 gene, was kindly gifted by Dr K. Nikali (National Public Health Institute, Helsinki, Finland), and YAC clone 29CD5 was isolated from the ICRF YAC library using PCR primers specific for PAX2 (Eccles et al, unpublished). Genomic lambda (λ) clones of PAX2 (λPAX2) and HOX11 (λSh3F) were also isolated from a genomic liver library constructed in λGem-11 (Promega, Madison, WI). Radioactively labelled probes used to screen the λ library as previously described were pG3a [67] and Sh3F [9] for PAX2 and HOX11, respectively. The phage clone λlyt-10 contains a 12 kb genomic fragment including the entire NFKB2/LYT-10 coding sequence [68]. The previously described cosmid clone CosC2 is reported to contain the viral integration site HPV6AI1 [42].

Fluorescent In Situ Hybridization (FISH)
Procedures used, including preparation of metaphase or interphase nuclei from peripheral blood or G1 fibroblasts, slide pre-treatment, probe labeling with biotin or digoxigenin, repeat suppression, hybridization parameters, and single- or two-color immunofluorescent detection of labeled probes, were essentially as described [69,70]. Fluorescent images from metaphase or interphase cells were captured at selective bandwidths to computer using a Leitz Aristoplan microscope fitted with a Photometrics KAF1400 CCD camera and QUIPS Smartcapture software (version 1.3; Vysis Inc, Downers Grove, IL, USA). Probes were mapped precisely to chromosome bands in metaphase cells, or relative to each other in interphase cells, using color-joined DAPI, FITC and Texas Red images, and QUIPS CGH/Karyotyping software (version 3.0.2). To avoid damaged cells, metaphase and G1 interphase nuclei were initially selected for analysis under a DAPI-selective filter. G1 interphase nuclei were subsequently scored only if red or green signals corresponding to all probes hybridized were distinguishable on both homologues.

Polymerase Chain Reaction (PCR)
Primer sequences designed to amplify HPV6AI1, the homologous region on chromosome 14q and a control internal Alu-specific 260 bp fragment are listed in Table 2. [...] Table 2), 30 sec, extension 72°C, 30 sec], the PCR products were electrophoresed in 1.5% agarose gels and visualized by ethidium bromide staining. Products were transferred to Hybond N+ nylon membrane (Amersham, Buckinghamshire, UK) and hybridized with Rediprime random-prime 32P-labeled probes (Amersham) following the manufacturer's instructions. Control human genomic DNA for PCR studies was extracted from healthy subjects using methods previously described [71].
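The expected size of a PCR product for a given primer pair can be checked in silico before amplification. The following Python sketch is a minimal illustration that assumes exact primer matches on a single template strand; the function names, primer sequences and template are hypothetical and do not correspond to the primers of Table 2.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def amplicon_length(template, forward, reverse):
    """Predict the product size for exact primer matches, or None if either primer fails.

    The forward primer is searched on the given strand; the reverse primer
    anneals to the opposite strand, so its reverse complement is searched
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start

# Hypothetical primers and template, for illustration only.
forward = "ATGGCC"
reverse = "TTAGGC"
template = "GG" + forward + "A" * 10 + reverse_complement(reverse) + "TT"
print(amplicon_length(template, forward, reverse))
```

A real primer check would also allow for mismatches, multiple binding sites and both template strands, for which dedicated primer analysis tools are better suited.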

Somatic Cell Hybrid Analysis
A Southern blot containing PstI-digested genomic DNA from a human-rodent somatic cell hybrid panel was obtained from Oncor (Gaithersburg, MD) and hybridized with 32P-dCTP-labeled DNA fragments as outlined above.

BAC DNA End-Sequencing
End-sequencing of BAC clones 10D14, 10D15 and 10E3 (cloning vector pBAC108L; Shizuya et al., 1992) was performed using ThermoSequenase (Amersham) and a universal T7 IRD41-labeled primer (MWG Biotech, Germany) essentially according to the manufacturers' recommendations. Reactions were separated on a LICOR 4000L semi-automated DNA sequencer, and nucleotide sequences were subsequently submitted to BLASTN for homology determination.

Genomic Database Resources
Genomic database resources utilized early in this study include the GTC Integrated STS respectively, were additional sources of data complementary to our findings presented in Figure 1. Repeat sequences were identified within the BAC end-sequences and in the HPV6AI1 integration site using CENSOR http://www.girinst.org/Censor_Server.html prior to BLASTN homology searches http://ncbi.nlm.nih.gov/BLAST.

Authors' contributions
SG performed most of the FISH experiments, contributed to experimental design, collected and collated most of the final data, and contributed significantly to the final draft of the manuscript including preparation of the final tables and figures; MMcD assisted with FISH experiments including hybridisations and analysis; XNC and JK provided BAC clones 10D15, 10D14 and 10E3 and offered guidance on the culture and purification of DNA from these clones; AN provided the λlyt-10 clone; TK provided the cosmid clone CosC2, and assisted with final interpretation of the revised HPV6a viral integration site findings; ME provided the YAC clone 29CD5, purified DNA from λPAX2 and λSh3F, and sourced the BAC clone 165f21; CM first conceptualised the study, participated in its design and coordination, and assisted with preparation of the final manuscript. All authors have read and approved the final manuscript.