Genomic organization and alternative splicing of the human and mouse RPTPρ genes

Background: Receptor protein tyrosine phosphatase rho (RPTPρ, gene symbol PTPRT) is a member of the type IIB RPTP family. These transmembrane molecules have been linked to signal transduction, cell adhesion and neurite extension. The extracellular segment contains MAM, Ig-like and fibronectin type III domains, and the intracellular segment contains two phosphatase domains. The human RPTPρ gene is located on chromosome 20q12-13.1, and the mouse gene is located on a syntenic region of chromosome 2. RPTPρ expression is restricted to the central nervous system.
Results: The cloning of the mouse cDNA, identification of alternatively spliced exons, detection of an 8 kb 3'-UTR, and the genomic organization of human and mouse RPTPρ genes are described. The two genes are comprised of at least 33 exons. Both RPTPρ genes span over 1 Mbp and are the largest RPTP genes characterized. Exons encoding the extracellular segment through the intracellular juxtamembrane 'wedge' region are widely spaced, with introns ranging from 9.7 to 303.7 kb. In contrast, exons encoding the two phosphatase domains are more tightly clustered, with 15 exons spanning ∼60 kb, and introns ranging in size from 0.6 kb to 13.1 kb. Phase 0 introns predominate in the intracellular segment, and phase 1 introns in the extracellular segment.
Conclusions: We report the first genomic characterization of an RPTP type IIB gene. Alternatively spliced variants may result in different RPTPρ isoforms. Our findings suggest that RPTPρ extracellular and intracellular segments originated as separate modular proteins that fused into a single transmembrane molecule during a later evolutionary period.


Background
Protein tyrosine phosphorylation regulates many important cellular functions including signal transduction, growth, differentiation, cell adhesion and axon guidance. The balance between protein tyrosine kinase and phosphatase activity is an integral part of this regulatory mechanism. A large number of protein tyrosine phosphatases have been identified, which fall into the broad categories of cytoplasmic and receptor-like molecules. All receptor-like protein tyrosine phosphatases (RPTPs) contain an extracellular region, a single transmembrane segment and at least one intracellular catalytic domain. They have been subdivided into several classes based on the structure of their extracellular segments (Figure 1). A combination of immunoglobulin-like (Ig) domains and fibronectin type III (FN-III) repeats in the ectodomain defines the type II class of RPTPs. An additional feature of type II RPTPs is a potential proteolytic cleavage site within the membrane-proximal FN-III repeat. Upon cleavage, extracellular N-terminal and predominantly intracellular, membrane-bound C-terminal segments are generated, which remain non-covalently associated [1]. A subset of the type II class, identified previously as type IIB RPTPs [2], is characterized by the presence of an N-terminal MAM domain.
RPTPρ is the most recently isolated member of the IIB family [6,7]. Northern blot and in situ hybridization studies have shown that RPTPρ is largely restricted to the central nervous system [6]. Within the CNS, expression is developmentally regulated and, in the mouse, delineates a unique boundary region in the granule cell layer of the cerebellar cortex [7]. Motifs in the RPTPρ extracellular segment (MAM, Ig and FN-III domains) are commonly found in cell adhesion molecules. The two phosphatase domains in the intracellular segment suggest that RPTPρ, like other members of the RPTP family, is involved in signal transduction through protein tyrosine dephosphorylation.
The human RPTPρ gene has been mapped to chromosome 20q12-13.1 [6]; it is located between anchor markers D20S99 and D20S96, and is flanked by the phospholipase C gamma 1 and splicing factor SRp55-2 genes. The mouse gene maps to a syntenic region at 93 cM on mouse chromosome 2, a region closely linked to Pltp and flanked by the markers D2Mit22 and D2Mit52. To date, only portions of the human RPTPκ, RPTPµ and PCP-2 genes have been sequenced. In contrast, the region encompassing the human RPTPρ gene has been sequenced in its entirety (Chromosome 20 sequencing group, Sanger Centre), although it is not yet fully assembled and annotated. The mouse chromosomal region containing the RPTPρ gene has also been sequenced (Celera Discovery System), but it too is largely unassembled. In this report, we describe the cloning of the mouse cDNA, the identification of an unusually long 3' UTR, the identification of alternatively spliced exons, and the genomic organization of the human and mouse RPTPρ genes.

Figure 1
Domain structure of the receptor-like protein tyrosine phosphatase family. Variations in the extracellular domain structure separate the RPTP family of transmembrane proteins into five major classes (I-V). RPTPρ is a member of the type IIB subfamily of RPTPs that includes RPTPκ, µ and PCP-2.

Table 1
Columns (left to right): Exon number, protein domain, exon size, exon/intron junctional sequences, and intron phases are shown. Amino acids (standard one-letter code) are listed below the coding nucleotides. D1 and D2 represent the first and second phosphatase domains, respectively. The a-i designations indicate the individual exons within a single domain; ** intron size is not determined due to lack of contiguity of clones.

The putative proteolytic cleavage site is located at AA 632-635, in the fourth fibronectin repeat. The transmembrane segment is located at AA 765-785. The intracellular region contains a juxtamembrane 'wedge' region (AA 888-920), and two highly conserved phosphatase domains (AA 1061-1162 and 1351-1456). The 11 hallmark amino acids that define the catalytic core of the first phosphatase domain are located at AA 1104-1114. The stop codon is found after residue 1463 of the amino acid sequence.

Human RPTPρ genomic organization
We have determined that the region encompassing human RPTPρ is contained within 10 contiguous PAC clones and 1 BAC clone (dJ269M15, dJ47A22, dJ753D4, dJ914M10, bA32G22, dJ232N11, dJ3E5, dJ230I19, dJ81G23, dJ707K17, and dJ1121H13; Sanger Centre, chromosome 20 group) (Figure 2). We have ordered these clones by identifying RPTPρ exons within each of them. The RPTPρ gene spans a minimum of 1 Mbp, and the RPTPρ coding sequence is comprised of at least 33 exons, several of which are alternatively spliced. A prominent feature of the RPTPρ gene structure is the considerable variability of exon spacing (Figure 2). Exons 1-19 extend over the initial ∼1000 kbp of the gene; exons 1-10 are widely separated, while exons 10-19 are more closely spaced. Of particular note are introns 1 and 7, which are ∼300 and ∼200 kbp long, respectively, considerably longer than the next largest intron. In contrast, exons 20-28 and 29-32 form two tight clusters, which together span approximately 60 kbp. In general, this pattern of exon organization appears to be characteristic of most RPTPs, as it is also observed in RPTPγ [8], LAR [9], CD45 [10] and RPTPα [11]. Each of these phosphatases has at least one very large intron in the 5'-region of the gene. This feature is not restricted to receptor-like phosphatases as it is also present in a number of adhesion receptor genes, including E-cadherin, N-cadherin, P-cadherin, N-CAM, deleted in colorectal cancer (DCC), axonin-1 and F11 (discussed in [12]).
The exon and intron sizes and exon/intron junctional sequences of the human RPTPρ gene are detailed in Table 6. The majority of 5' and 3' splice sites are consensus sequences. There is some variation in the length of exons, which range from 30 to 297 bp. Approximately one third of the exons are less than 100 bp, while the remaining two thirds are in the 100-300 bp range. Greater variation occurs in the size of the introns, which range from 725 to 303,715 bp. The largest number of introns (15) falls into the 10^4 to 10^5 bp bin, and somewhat fewer (12) fall into the 10^3 to 10^4 bp bin. Only 5 introns lie outside this range: three of these fall into the 10^2 to 10^3 bp range, and two unusually long introns in the extracellular domain are over 10^5 bp.
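The order-of-magnitude binning of intron sizes described above can be illustrated with a minimal Python sketch. The function names `decade_bin` and `bin_introns` are hypothetical and are not part of the original analysis. Only the two extreme intron sizes and the per-bin counts quoted in the text are used, since the full size list is in the tables rather than in this paragraph.

```python
from collections import Counter


def decade_bin(size_bp: int) -> int:
    """Return k such that 10**k <= size_bp < 10**(k + 1)."""
    if size_bp <= 0:
        raise ValueError("intron size must be a positive integer")
    # Counting digits is exact for integers and avoids float log10 edge cases.
    return len(str(size_bp)) - 1


def bin_introns(sizes_bp):
    """Count introns per power-of-ten size bin."""
    return Counter(decade_bin(size) for size in sizes_bp)


# The smallest and largest introns reported in the text.
smallest_bp, largest_bp = 725, 303_715
print(decade_bin(smallest_bp), decade_bin(largest_bp))

# Per-bin intron counts as quoted in the text (bin exponent -> count).
reported_counts = {2: 3, 3: 12, 4: 15, 5: 2}
total_introns = sum(reported_counts.values())
print(total_introns)  # 32 introns, consistent with a gene of 33 exons
```

Given a full list of intron sizes, `bin_introns` would return the same kind of per-bin tally that the paragraph reports.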
The RPTP extracellular segment is comprised of protein domains; the borders of these modules correspond to the boundaries of exon-clusters. There are three possible junctional phases between exons and introns: Phase 0 refers to introns with junctions between the triplet codons, whereas phase 1 and 2 introns separate within the triplet after the first and second nucleotides, respectively. Figure 3A shows the distribution of intron phases relative to the domain structure of RPTPρ. Within the RPTPρ gene, the number of phase 0 and phase 1 introns is comparable at 15 and 12, respectively. In contrast, there are only five phase 2 introns in the entire gene. A notable feature of RPTPρ gene structure is that phase 1 introns appear to be preferentially associated with the extracellular segment, where they flank each of the protein domain exon modules. The intracellular segment is almost devoid of phase 1 introns. In contrast, phase 0 introns are primarily associated with the intracellular segment, and are only infrequently represented in the extracellular region.
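The definition of intron phase given above amounts to taking the number of coding nucleotides upstream of each intron modulo 3. The following Python sketch shows this under stated assumptions: the function name `intron_phases` is hypothetical, the sizes 126, 272 and 82 nt are the MAM exon sizes reported later in the text for exons 2-4, and the coding lengths used for the first and last exons (61 and 150 nt) are illustrative placeholders, not measured values.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase of each intron in a transcript.

    Phase = (number of coding nucleotides upstream of the intron) mod 3:
    0 = between codons, 1 = after the first base, 2 = after the second base.
    """
    phases = []
    upstream_nt = 0
    # There is no intron after the last exon, so skip it.
    for length in coding_exon_lengths[:-1]:
        upstream_nt += length
        phases.append(upstream_nt % 3)
    return phases


# Exon 1 coding length (61 nt) and the final exon (150 nt) are hypothetical;
# 126, 272 and 82 nt are the MAM exon sizes given in the text.
toy_exons = [61, 126, 272, 82, 150]
print(intron_phases(toy_exons))  # [1, 1, 0, 1]
```

With these inputs the sketch gives phase 1 for the introns flanking the MAM exons and for the first internal intron, and phase 0 for the second internal intron, which is the pattern the text describes for the MAM domain. The phase of an intron depends only on the cumulative coding length, so an exon whose length is a multiple of 3 (such as the 126 nt exon) leaves the phase unchanged.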
Recently, RPTPs have been examined in sponges [13,14], the phylogenetically oldest extant metazoan. Although sponges are multicellular organisms, they lack the cellular cohesiveness of the higher eukaryotes. When RPTPs from yeast, sponge and human were aligned and rooted cladograms constructed, the common early ancestor of the phosphatase domains appeared to be yeast. The second phosphatase domain arose as a duplication of the first [13]. The RPTP extracellular domain was acquired during the transition from single-celled to multicellular organisms. In RPTPρ, the extracellular and intracellular exon modules are separated by phase 1 and phase 0 introns, respectively. Furthermore, intracellular introns are much smaller than those in the extracellular segment. Together, these observations suggest that the RPTPρ extracellular and intracellular segments originated as separate modular proteins that evolved by exon shuffling and duplication, respectively [13,15]. The two segments became linked to form a functional transmembrane molecule during the transition from single to multicellular organisms.
Over fifty percent of the human genome is comprised of repeat sequences [16], making it the first repeat-rich genome to be sequenced. Analysis of these numerous segments can provide important indications of the evolutionary history of a particular region, or gene. Transposon-derived elements form the largest category of repeats, and include LINEs, SINEs, LTRs and DNA elements. In the RPTPρ gene, the most common of these are: LINE1 (7.6%) and LINE2 (2.0%); the SINEs Alu (4.2%), MIR (3.6%) and THE (0.65%); LTR (0.7%); and the DNA elements MLT (2.5%), MER (2.5%), and MST (0.5%). Less common elements found in the RPTPρ gene include Tiggers in introns 2, 7 and 9 (0.5%), HAL in introns 2 and 7 (0.28%), MAD in introns 1 and 16 (0.013%), and U2 in intron 2 (0.006%). There is also a Charlie repeat in intron 7 (0.005%). In addition to the transposon-derived repeats, there is a pseudogene in intron 7, a tRNA-derived repeat in intron 30, and 133 variable length nucleotide tandem repeats (VNTRs/microsatellites) found in the gene. The G/C content of the RPTPρ gene is approximately 42%. Descriptions of the above repeat elements may be found on Repbase at [http://www.girinst.org./]. The overall percentage of the RPTPρ gene comprised of repeat sequences is lower (by 45%) than that of the entire human genome. In the human genome, LINEs comprise 21% of repetitive sequences, SINEs 13%, LTRs 8%, and DNA elements 3% [16]. In RPTPρ, LINEs comprise 9.6% of repetitive sequences, SINEs 8.4%; LTRs 0.7%; and DNA elements 6.3%. The significance of this deviation in RPTPρ from the normal range is unknown.
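The arithmetic behind the repeat-content comparison can be checked with a short Python sketch. All percentages are taken from the paragraph above; the variable names are hypothetical, and the interpretation of "lower (by 45%)" as a relative reduction of the summed class totals is an assumption rather than a statement of how the authors computed it.

```python
# Family-level percentages for the two classes whose members are fully listed.
line_families = {"LINE1": 7.6, "LINE2": 2.0}
sine_families = {"Alu": 4.2, "MIR": 3.6, "THE": 0.65}

# Class-level percentages as quoted in the text.
rptp_rho = {"LINE": 9.6, "SINE": 8.4, "LTR": 0.7, "DNA": 6.3}
human_genome = {"LINE": 21.0, "SINE": 13.0, "LTR": 8.0, "DNA": 3.0}

line_total = sum(line_families.values())   # about 9.6
sine_total = sum(sine_families.values())   # about 8.45, quoted as 8.4

rho_total = sum(rptp_rho.values())         # about 25
genome_total = sum(human_genome.values())  # 45

# Assumed reading of "lower by 45%": relative reduction of the summed totals.
relative_reduction = (genome_total - rho_total) / genome_total

print(f"RPTP-rho repeats: {rho_total:.1f}%  genome: {genome_total:.1f}%")
print(f"relative reduction: {relative_reduction:.0%}")
```

Under this reading the four transposon-derived classes sum to roughly 25% of the RPTPρ gene against 45% of the genome, a relative reduction of about 44%, which is close to the figure quoted in the text.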

Table 2
Columns (left to right): Exon number, protein domain, exon size, exon/intron junctional sequences, and intron phases are shown. Amino acids (standard one-letter code) are listed below the coding nucleotides. D1 and D2 represent the first and second phosphatase domains, respectively. The a-i designations indicate the individual exons within a single domain; ** intron size is not determined due to lack of contiguity of clones.

cDNA cloning and genomic structure of mouse RPTPρ
The mouse RPTPρ cDNA was cloned using a combination of PCR and 5'-RACE. The mouse cDNA (Genbank accession #AF152556) encodes a 1451 AA polypeptide that is 96% identical to that of the human protein and predicts an analogous domain structure (Figure 3A). The Celera Discovery System mouse genomic database was used to identify clones containing RPTPρ exons. These clones were then ordered and analyzed to identify exon/intron junctions. Exon and intron sizes, exon/intron junctional sequences, and intron phases of the mouse RPTPρ gene are shown in Table 7. In general, the exon/intron splice sites in the mRPTPρ correspond to expected GT-AG intron consensus splicing sequences, and the intron phases in mouse (Table 7) are identical to those in the human gene (Table 6). Although the two species share approximately 89% nucleotide identity overall, when examined exon by exon, the degree of identity varies slightly between the extracellular and intracellular segments (Figure 3B). The overall identity of the mouse and human extracellular and intracellular segments is 89% and 92%, respectively. In general, there is slightly greater variance between the two species in the extracellular segment; for example, mouse and human exons 1 and 9 share 78% and 95% identity, respectively. Within the intracellular segment, mouse exon 21 is 86% identical to that of the human, and exon 24, which contains the first half of the catalytic core, is 96% identical. Notably, the alternatively spliced exons 14, 16 and 22a (discussed below) are 100%, 97% and 95% identical, respectively, indicating a high degree of conservation between mouse and human. In summary, the mouse and human genes are virtually identical in terms of the number and size of exons, and the exons differ only slightly with respect to the nucleotide sequence.
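An exon-by-exon identity comparison of the kind reported above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the function name `percent_identity` is hypothetical, the two sequences are invented toy strings rather than RPTPρ exons, and the sketch assumes the sequences have already been aligned to equal length without gaps.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare case-insensitively, position by position.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Invented 20 nt toy sequences that differ at two positions.
toy_human = "ATGGCCAAGCTGACCGTGAA"
toy_mouse = "ATGGCTAAGCTGACAGTGAA"
print(percent_identity(toy_human, toy_mouse))  # 90.0
```

Applied to each pair of orthologous exons in turn, a function like this would yield per-exon values comparable to those quoted in the text. Real comparisons would also need an alignment step and a rule for scoring gaps, both of which are omitted here.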

Exon/intron organization of the RPTPρ extracellular segment
MAM domain
The relationship between RPTPρ exon organization and protein domain boundaries is shown in Figure 3A and in Tables 6 and 7. Within the extracellular segment, exon 1 encodes the signal peptide, and exons 2, 3 and 4 encode the single N-terminal MAM domain, a distinguishing feature of all type IIB phosphatases. Although the function of the RPTPρ MAM domain is unclear, other type IIB phosphatases have shown homophilic binding properties: When heterologously expressed in non-adherent cells, both RPTPµ and RPTPκ bind homophilically to induce the formation of large, calcium-independent aggregates [17,18]. Furthermore, when the RPTPµ MAM domain was deleted, aggregation was eliminated [19], implying that the domain had a crucial role in homophilic cellular interactions.
The three RPTPρ MAM exons differ widely in size: 126 bp (exon 2), 272 bp (exon 3) and 82 bp (exon 4). All MAM-associated introns are in phase 1, with the exception of the second internal intron, which is in phase 0. MAM domains have been identified in a variety of cell adhesion molecules. We have determined the exon structure of the MAM domain in all four human RPTP IIB genes, and in human zonadhesin and human enteropeptidase (NCBI database). The genomic organization of the MAM domain in all four IIB phosphatases is identical. In all RPTP IIB proteins (Genbank #NM_002844; NM_002845; NM_005704; NM_007050) and in human zonadhesin (Genbank #AF312032) there is a MAM domain at the N-terminus, the genomic structure of which is highly conserved. In zonadhesin, there are two additional and adjacent MAM domains. The genomic organization of the latter two domains differs from that of the first. The single MAM domain in the human enteropeptidase gene (Genbank #Y19124) is more internally located than that of RPTPρ, close to the transmembrane region. It is comprised of four exons that are 150, 135, 89 and 125 bp in length, and is unlike any of the IIB and zonadhesin MAM domains. In summary, all known MAM domains are located within the extracellular segment, but within this region, their location, exon number and exon size can vary considerably. The size and structure of exons comprising the most N-terminal MAM domain appear to be unique. Because the nucleotide sequence of the RPTPρ MAM domain predicts a protein similar to that found in the other type IIB RPTPs, it might be expected that the RPTPρ MAM domain also participates in homophilic interactions, as was shown for RPTPµ [19].

Ig domain
Adjacent to the MAM domain, the single Ig-like domain is split into two similarly sized exons (5 and 6) by one intron in phase 0 (Figure 3A). Introns flanking the Ig-like domain are in phase 1. In the majority of genes encoding Ig-like domains, only one exon encodes each domain, while in others such as N-CAM, two exons encode each domain [20]. The single Ig-like domain of the RPTPρ gene falls into the latter category, suggesting a closer relationship to N-CAM-like molecules. LAR has characteristics of both groups [9], a feature which it shares with several other genes, such as perlecan [21] and DCC [22]. Within the RPTP IIB family, the Ig-like domain appears to act in conjunction with the MAM domain to bring about homophilic cell-cell interactions [23].

FN-III domains
Following the Ig domain are four FN-III repeats (Figure 3A), each of which begins with a highly conserved proline residue. FN-III domains are found in a wide range of proteins, and recently, have been shown to be involved in retinal axon target selection [24]. As a general rule, FN-III domains are encoded either by 1 or 2 exons [25]. Within genes that encode multiple FN-III domains, exon organization may be of one type, or a combination of the two types. For example, N-CAM has 2 exons for each FN-III domain [26], whereas tenascin [27] and LAR [9] have a mixture of both types. In the RPTPρ gene, there is a good correlation between exon structure and FN-III boundaries (Figure 3A), although there is some variation in the number of exons per domain: each of the first two FN-III repeats is encoded by a single exon (exons 7 and 8, respectively). In contrast, the third FN-III repeat is encoded by two exons (9 and 10). Somewhat atypically, the fourth FN-III repeat is encoded by three exons (11, 12 and 13). This domain contains a putative proteolytic cleavage site. RPTPρ FN-III repeats share high sequence similarity with those of N-CAM, but only the third FN-III domain in RPTPρ is encoded by two exons. In contrast to the type IIA phosphatase LAR, the RPTPρ gene does not contain exons encoding more than one fibronectin domain; however, like LAR, it has an FN-III domain encoded by three exons.
In the majority of known cases, the exon/intron junctions corresponding to the FN-III domain boundaries are in phase 1. When two exons encode a FN-III domain, an intron interrupts the coding region in a central, relatively non-conserved, part of the domain, and the exon/ intron junction may be in any phase. In the RPTPρ gene, introns separating the individual FN-III repeats are in phase 1; the intron internal to the third repeat is in phase 0, and introns internal to the fourth FN-III repeat are in phase 2 and 0, respectively.

Exon/intron organization of the RPTPρ intracellular segment
Juxtamembrane region
Following the transmembrane segment (exon 15), exons 16-18 encode the juxtamembrane region (Figure 3A, Tables 6 and 7). This segment of the RPTPρ protein is similar to the membrane proximal region in the type IV phosphatase, murine RPTPα, for which the crystal structure has been determined [28]. RPTPα exists as a dimer in which the catalytic site of one molecule is blocked by contact with a 'wedge' from the other. Specifically, the 'turn' part of the helix-turn-helix motif is inserted into the active site, which maintains the WpD loop in the open state [28]. In other phosphatases [29], the WpD loop undergoes a conformational shift upon substrate binding, which appears to be crucial for catalysis. Thus, it is very likely that the dimeric form of RPTPα is unable to bind tyrosine-phosphorylated substrates, rendering it catalytically inactive. The negative charge of two adjacent residues within a highly conserved sequence in the juxtamembrane region appears to be crucial for inhibition [28,30]. In RPTPα, these two residues are negatively charged aspartates. In type IIB RPTPs, the first residue is changed to an alanine in PCP-2 and RPTPµ, and to a serine in RPTPκ. The second residue is retained as either a glutamate in PCP-2 and RPTPµ, or an aspartate in RPTPκ. These single amino acid changes may indicate a somewhat weaker level of inhibition. This is supported by the examination of the crystal structure of RPTPµ, which shows that although a wedge is formed, catalytic activity is not inhibited by its insertion into the active site on the adjacent monomer [31]. However, in the case of RPTPρ, the first residue is a glycine, and the second is the large basic residue, glutamine. Thus, the RPTPρ juxtamembrane catalytic region is likely to have a different conformation to that of the other phosphatases and a net positive charge, making the regulation of phosphatase activity by dimerization-induced wedge inhibition unlikely.

Phosphatase domains
Although the extracellular regions of receptor-like phosphatases are highly variable, the intracellular tandem phosphatase domains appear quite closely related. The structure of the CD45 gene indicates that both protein tyrosine phosphatase (PTPase) domains have a very similar exon/intron organization, which probably arose by duplication [10]. In RPTPρ, the first and second phosphatase domains are encoded by exons 19-26 and 27-32, respectively (Figure 3A). The exon structure of the RPTPρ phosphatase domains, and that of homologous domains in PCP-2 (NM_005704), RPTPκ (NM_002844), RPTPµ (NM_002845), LAR [9], CD45 [10], RPTPα [11], RPTPγ [8] and rat Esp/mOST-PTP [32,33], are compared in Figure 4. We have deduced the genomic structure of RPTPκ, RPTPµ and PCP-2 by comparing known cDNA sequences with human genomic clones (NCBI). The positions of the exon boundaries in the phosphatase domains of RPTPρ, RPTPκ, RPTPµ and PCP-2 coincide exactly, and correspond well with the five other phosphatases. LAR is somewhat anomalous in that, although the exon/intron structure of the second phosphatase domain is generally similar to that of the other RPTPs, exons in the first phosphatase domain are fewer in number, but greater in size. The final exon in all nine genes encodes the end of the second phosphatase domain, the short C-terminus and the entire 3'-untranslated region.
A striking similarity among the RPTP genes is the conservation of exon/intron junction 24/25 in the first phosphatase domain. In LAR, CD45 and RPTPα, this junction interrupts the highly conserved sequence VHCSAGV, part of the catalytic core of the phosphatase [34,35]. Although this exon/intron junction in the IIB phosphatases corresponds exactly, there is a change in the last amino acid from a valine to an alanine. Interestingly, an exon/intron junction is not observed at this position in the cytoplasmic PTPase PTP1B [36], an observation that may indicate an early evolutionary divergence of the cytoplasmic and transmembrane PTPases [37].
Although the exon/intron structure of the two phosphatase domains was remarkably similar in each of the nine RPTPs examined, there were variations in exon size and number, primarily in those close to the transmembrane domain. For example, the third exon (135 nt) in the first phosphatase domain of rat Esp/mOST-PTP and RPTPγ is replaced by two smaller exons (37 and 98 nt) in RPTPα, CD45, RPTPρ, PCP-2, RPTPκ, and RPTPµ. Two smaller exons replace a single exon at the C-terminal end of the first phosphatase domain of rat Esp/mOST-PTP. Similarly, at the start of the second phosphatase domain, the first exon (174 nt) in RPTPρ, PCP-2, RPTPκ, RPTPµ and LAR is replaced by two smaller exons in rat Esp/mOST-PTP, RPTPα, RPTPγ and CD45. In each case, the total number of nucleotides in the two smaller exons is virtually identical to that of the single larger exon at the same position. It is unclear whether these changes in exon number resulted from intron gain or exon fusion.

RPTPρ 3' untranslated region
Following the second phosphatase domain, there is a long (8.0 kb) 3' untranslated sequence. BLAST comparisons identified a region on the KIAA0283 gene (Genbank accession #AB006621) that showed 99% identity to nucleotides 3181 to 4437 of the hRPTPρ sequence. Thus, the 3'-UTR of hRPTPρ, which is contained in exon 32, was identified as KIAA0283. Polyadenylation signals were found at 12425 nt and 12663 nt (NM_007050).
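The text does not say how the polyadenylation signals were located. One common approach, shown here purely as an illustration, is to scan the 3' UTR for the canonical AATAAA hexamer. In the Python sketch below the function name `find_polya_signals` is hypothetical and the sequence is an invented toy string, not the RPTPρ 3' UTR.

```python
def find_polya_signals(seq: str, motif: str = "AATAAA"):
    """Return 1-based start positions of every occurrence of the motif.

    Overlapping occurrences are reported; RNA input (U) is converted to DNA (T).
    """
    seq = seq.upper().replace("U", "T")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(motif, start + 1)
    return positions


# Invented toy sequence containing two copies of the hexamer.
toy_utr = "GCTAATAAAGTCCGATTGCAATAAACTTG"
print(find_polya_signals(toy_utr))  # [4, 20]
```

Positions are reported relative to the start of the sequence that is passed in, so coordinates such as those quoted in the text would require scanning the full-length reference sequence. Variant hexamers are not considered in this sketch.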

Figure 4
Genomic organization of the two phosphatase domains in nine RPTPs. Boxed numbers indicate the number of nucleotides in each exon; interconnecting horizontal lines represent introns (neither is to scale). Note that exon 22a is not shown in order to preserve alignment among the type IIB RPTPs.

Alternative splicing of mouse and human RPTPρ genes
Comparison of the four RPTP type IIB (RPTPµ, RPTPκ, RPTPρ, PCP-2) nucleotide sequences predicted that at least two exons (14 and 16) are likely to be alternatively spliced. In addition, the presence of a segment (AA 826-850) in Xenopus RPTPρ that is absent in the majority of other type IIB RPTPs raised the possibility of an alternatively spliced exon between exons 17 and 18. Human fetal brain, mouse neonatal brain, and several regions (cortex, forebrain, brainstem, and cerebellum) of adult C57BL/6 mouse brain were examined for the presence of alternatively spliced regions. PCR primers were designed to amplify the regions encompassing exons 14 and 16, and the region between exons 17 and 18. An additional region between exons 22 and 23 was also examined. The identity of all PCR products was verified by sequencing.
The RPTPρ exon 14 primers yielded two products of 257 and 200 bp (Figure 5A and 5B), indicating a 57 nt alternatively spliced region at 2177 to 2233 nt. This 19 AA segment is encoded by exon 14. Both splice forms were observed in human fetal, and in neonatal and adult mouse brain mRNA. We have obtained similar results for RPTPµ (data not shown), in which exon 14 was reported to be absent (NM_002845). The RPTPρ exon 16 primers yielded two bands of 356 and 326 bp (Figure 5C and 5D). This indicates an additional 10 AA alternatively spliced region, located between the transmembrane and the first phosphatase domain (2370-2399 nt). Both transcripts were present in mouse and human brain, and were observed in all brain regions analyzed. PCR of the same region in RPTPµ yielded only one product that did not contain the exon 16 sequence (data not shown). A third alternatively spliced exon (22a) was identified in the first phosphatase domain between exons 22 and 23. Exon 22a was inserted after nucleotide 3172 in mouse, and after nucleotide 3232 in human RPTPρ, predicting an additional alternatively spliced region 20 AA in length. In each case, primers yielded two bands of 93 and 152 bp (Figure 5E and 5F) in all brain regions examined. It remains to be determined if other members of the type IIB subfamily also contain this exon, or whether the region is unique to RPTPρ.
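The size of each alternatively spliced segment follows from the difference between the two RT-PCR products, as a short Python sketch makes explicit. The function name `spliced_segment` is hypothetical; the band sizes are those quoted in the text. The sketch reports a codon count only when the difference is a multiple of 3, on the assumption that an in-frame cassette exon should satisfy this.

```python
def spliced_segment(long_bp: int, short_bp: int):
    """Return (nt difference, whole codons), with codons None if not a multiple of 3."""
    diff = long_bp - short_bp
    if diff <= 0:
        raise ValueError("the first product must be longer than the second")
    codons = diff // 3 if diff % 3 == 0 else None
    return diff, codons


# Band sizes (bp) as reported in the text: (longer product, shorter product).
bands = {
    "exon 14": (257, 200),
    "exon 16": (356, 326),
    "exon 22a": (152, 93),
}

for exon, (long_bp, short_bp) in bands.items():
    print(exon, spliced_segment(long_bp, short_bp))
```

For exons 14 and 16 the differences are 57 nt and 30 nt, matching the 19 AA and 10 AA segments given in the text. For exon 22a the quoted band sizes differ by 59 nt, one nucleotide short of the 60 nt that a 20 AA segment would require, so the sketch returns no codon count for that pair; it reports the arithmetic only and does not resolve which figure is approximate.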
Comparison of Xenopus, mouse and human type IIB RPTP nucleotide sequences indicated the possibility of a fourth alternatively spliced region located 3' to exon 17, within the wedge domain. This 75 nt segment is present in the reported sequence of human RPTPµ (2445-2520 nt) and in Xenopus RPTPρ (2448-2523 nt). It is absent in the reported sequences of human and mouse RPTPκ, RPTPρ and PCP-2. The exon 17/18 primers were designed to amplify two potential products of 209 and 134 nt. However, only a single product of 134 nt was observed in human and mouse brain regions (data not shown). This sequence appears to be unique to human RPTPµ and Xenopus RPTPρ and is unlikely to represent an alternatively spliced exon in any of the RPTP IIB genes.
Both splice variants of exons 14, 16 and 22a were present in human and mouse brain, at all ages and in all brain regions examined. Although the RPTPρ protein segments encoded by the alternatively spliced exons do not appear to contain any known motifs, different isoforms of the phosphatase, with as yet unknown functions, are likely to be present. Alternatively spliced isoforms of the related RPTPs, LAR [38] and RPTPβ/ζ [39], are spatially and temporally distinct in the central nervous system, and there is evidence that alternatively spliced exons can influence ligand binding, as is the case with LAR [9].

Conclusions
We describe the cloning of the mouse RPTPρ cDNA, the genomic structure and alternative splicing of the mouse and human genes, and the presence of an 8 kb 3'-UTR in human RPTPρ. RPTPρ is the largest RPTP gene characterized to date, extending over more than 1 megabase pair of genomic DNA. Its considerable length is due, primarily, to expanded introns in the extracellular region. The protein domains of the extracellular segment are encoded by 1 to 3 exons, which form modules that are flanked by phase 1 introns. The majority of introns in the intracellular segment are in phase 0, and are relatively small. These data suggest that the ectodomain and the phosphatase domain arose separately by exon shuffling and duplication and fused at a later evolutionary period. The MAM domain, the region characterizing type IIB phosphatases, possesses a unique genomic structure common to all such domains when located at the N-terminus. The fourth fibronectin repeat in RPTPρ is encoded by three exons, an additional feature found only in type II phosphatases. At least two alternatively spliced exons flank the transmembrane domain, the region showing the greatest variability between the four IIB phosphatases. An additional alternatively spliced exon precedes the catalytic core of the first phosphatase domain. Comparison of the genomic structure of representative members of the RPTP family (types I-V) indicates that the intron/exon organization of both phosphatase domains is highly conserved. There is considerable variation in the length of the 3' UTR in the RPTPs; at 8 kb, the RPTPρ 3' UTR is the longest characterized to date. Our results provide the first characterization of the genomic structure of an RPTP type IIB gene. This information will facilitate future studies of promoter and other regulatory elements responsible for the tissue specificity of gene expression.

Figure 5
RT-PCR products were amplified using primers flanking exon 14 (panels A and B), exon 16 (panels C and D) and exon 22a (panels E and F). Left panels: bands in lanes 1, 2, and 3 are from human fetal brain, mouse P1 brain, and mouse P60 brain total RNA, respectively. Right panels: bands in lanes 4, 5, 6 and 7 contain total RNA from cerebellum, brain stem, basal forebrain and cortex (P23), respectively. Transcripts containing both splice forms of exons 14, 16 and 22a were found in all lanes.

Cloning of mouse RPTPρ cDNA
The mouse RPTPρ cDNA was obtained using a combination of 5'-RACE and PCR by methods described in [40]. Total RNA was isolated (RNAzol, Tel-Test, Friendswood, TX) from C57BL/6 mouse brain and used to synthesize first strand cDNA (AMV-RT, Roche Molecular Biochemicals, Indianapolis), which was then amplified by PCR using degenerate primers based on the human RPTPρ sequence. PCR products were analyzed on 1% agarose gels and subcloned into the TOPO2.1 vector (Invitrogen, Carlsbad, CA). Each strand was sequenced at least twice. Sequence analysis and assembly were performed using Vector NTI Suite (Informax, Bethesda, MD). Murine RPTPρ sequences were identified by BLAST [41] using blastn, on the nr database, with all parameters set to default values. An initial 923 nt fragment was obtained, which spanned the region from the 4th FN-III repeat through the first phosphatase domain. Additional PCR was performed using new gene specific primers based on the newly isolated murine RPTPρ sequence (Genbank #AF152556), and degenerate primers based on the hRPTPρ sequence (Genbank #NM_007050).