Genomic structure and alternative splicing of murine R2B receptor protein tyrosine phosphatases (PTPκ, μ, ρ and PCP-2)

Background Four genes designated as PTPRK (PTPκ), PTPRL/U (PCP-2), PTPRM (PTPμ) and PTPRT (PTPρ) code for a subfamily (type R2B) of receptor protein tyrosine phosphatases (RPTPs) uniquely characterized by the presence of an N-terminal MAM domain. These transmembrane molecules have been implicated in homophilic cell adhesion. In the human, the PTPRK gene is located on chromosome 6, PTPRL/U on 1, PTPRM on 18 and PTPRT on 20. In the mouse, the four genes ptprk, ptprl, ptprm and ptprt are located in syntenic regions of chromosomes 10, 4, 17 and 2, respectively. Results The genomic organization of murine R2B RPTP genes is described. The four genes varied greatly in size ranging from ~64 kb to ~1 Mb, primarily due to proportional differences in intron lengths. Although there were also minor variations in exon length, the number of exons and the phases of exon/intron junctions were highly conserved. In situ hybridization with digoxigenin-labeled cRNA probes was used to localize each of the four R2B transcripts to specific cell types within the murine central nervous system. Phylogenetic analysis of complete sequences indicated that PTPρ and PTPμ were most closely related, followed by PTPκ. The most distant family member was PCP-2. Alignment of RPTP polypeptide sequences predicted putative alternatively spliced exons. PCR experiments revealed that five of these exons were alternatively spliced, and that each of the four phosphatases incorporated them differently. The greatest variability in genomic organization and the majority of alternatively spliced exons were observed in the juxtamembrane domain, a region critical for the regulation of signal transduction. Conclusions Comparison of the four R2B RPTP genes revealed virtually identical principles of genomic organization, despite great disparities in gene size due to variations in intron length. 
Although subtle differences in exon length were also observed, it is likely that functional differences among these genes arise from the specific combinations of exons generated by alternative splicing.


Background
Over the past decade, receptor protein tyrosine phosphatases (RPTPs) have emerged as integral components of signal transduction in the vertebrate and invertebrate central nervous system. RPTP domain structure suggests cell adhesive properties, and studies on Drosophila mutants have provided strong evidence that specific RPTPs act together to provide a set of partially redundant signals necessary for muscle targeting and fasciculation decisions in CNS neurons [1,2], both crucial components in the establishment and maintenance of neural circuits.
RPTPs have been divided into eight major subfamilies (Figure 1), based on phylogenetic analysis of the phosphatase domains [3]. Four of these subfamilies (R2A, R2B, R3, and R4) play critical roles in CNS development [4]. Common to all Type 2 RPTPs is an extracellular segment containing a combination of multiple fibronectin and immunoglobulin (Ig)-like domains, and a single transmembrane region. The intracellular region contains a membrane-proximal juxtamembrane domain, followed by a catalytically active tyrosine phosphatase domain and a second inactive domain. Type 2 RPTPs have been further subdivided into two distinct classes (R2A and R2B). Genes in the R2B class are differentiated from the R2A class by an additional MAM (Meprin/A5/PTP mu) domain at the N-terminus [5]. In addition to a putative role in signal transduction, R2B molecules have cell adhesive properties [6]. Because no invertebrate homologues of the four R2B molecules have been found to date [7], and no ESTs indicative of R2Bs have been isolated from invertebrates, the function(s) of these phosphatases is likely to be highly specific to vertebrate species.
Figure 1. Classification of receptor-like protein tyrosine phosphatases (RPTPs) into eight subfamilies (R1-R8), based on sequence similarity among PTP catalytic domains [3]. PTPµ, κ, ρ and PCP-2 are members of the R2B subfamily.
Previously, we have described the genomic structure of human PTPρ [8] and have shown that the transcript is expressed primarily in the central nervous system where it delineates a distinct developmental compartment in the cerebellar cortex [9,10]. In the present study, the genomic structures of all four murine R2B genes (PTPκ, PTPµ, PTPρ and PCP-2) were compared, and their expression localized to specific cell types within the central nervous system.
The 5'-genomic sequences were examined for putative promoter regions and transcription factor binding sites, and full-length sequences were used to determine the phylogenetic relationship between the four genes. Clustal-X alignment of cDNA and Genbank sequences predicted the presence of alternatively spliced exons. Five such exons were confirmed experimentally, with the majority being localized in the juxtamembrane and first phosphatase domain in each of the four genes.
Murine and human R2B cDNA sequences were used to identify the corresponding genomic DNA contigs in the Celera and NCBI genomic databases, using BLAST and MEGABLAST programs. Alignments were used to establish exon and intron size, and junction phase. The genomic structure of human PTPρ has been reported previously [8]; the human PTPµ, κ and PCP-2 annotated structures are available from the authors (rotter.1@osu.edu) upon request. The sizes and genomic organization of the mouse R2B genes are derived from Figures 2, 3, 4 and 5, and are summarized in Figure 6. The overall size of the mouse genes and their corresponding human orthologs was very similar. In general, gene size exceeded the average, especially in the case of PTPρ, which was the largest gene (~1,117,873 bp), followed by PTPµ (~686,308 bp), PTPκ (~521,813 bp) and PCP-2 (~63,884 bp) (Figure 6). The recent completion of the human chromosome 20 sequence [11] revealed that PTPρ is the largest confirmed gene on that chromosome, due primarily to expanded introns in the genomic region containing coding regions for the extracellular and juxtamembrane segments of the protein. Although the functional consequence of this large gene size is not clear, one predicted outcome is an extended time period for transcription of the corresponding mRNA.
Each of the R2B genes contained over 30 exons, which were examined pairwise to determine the overall nucleotide/exon identity between the four genes (Figure 7). Three major regions were delineated, each with varying degrees of sequence identity: exons 2-13 (the extracellular segment, comprising the MAM, Ig and four fibronectin (FN) type III domains), exons 14-18 (the juxtamembrane region), and exons 19-32 (the two phosphatase domains). Although the number of exons comprising each of the extracellular domains was identical in each of the four genes, exon size varied in some domains and remained unchanged in others. Within the extracellular segment, the MAM domain showed the most extensive variation in exon size: the first exon ranged from 123 to 132 bp, and the third from 79 to 82 bp (Figure 8). MAM domains comprise 160-170 amino acids containing four conserved cysteines; their function has been examined in some detail. When expressed in non-adherent cells, PTPµ [12-14] and PTPκ [15] proteins formed large calcium-independent clusters. Aggregation was strictly homophilic, consisting exclusively of cells expressing only a single R2B type [14-16]. Because this property had not been demonstrated with any of the other RPTP subfamilies, a crucial role for the MAM domain in this homophilic interaction was implied. However, in an in vitro binding assay in which regions of recombinant PTPµ were expressed [17], the homophilic binding site was localized to the immunoglobulin (Ig)-like domain. Subsequently, MAM and Ig domains were shown to function cooperatively in homophilic binding in both PTPµ and PTPκ [16]. It was suggested that the binding site is located in the Ig domain and the MAM domain is part of a "sorting" mechanism that confers homophilic binding specificity [6].
Figures 7 and 8 show that, when combined with the invariant 272 bp middle exon, each R2B MAM domain had a unique combination of exon sizes and low sequence identity, indicating a region of high specificity. The adjacent Ig-like domain contained exons of identical size, implying a less specific role than that of the MAM domain. These marked variations in sequence identity are consistent with the idea that the MAM domain plays a role in the mediation of homophilic binding specificity [6].
The four FN type III repeats are involved in general adhesive interactions. The size of the first and third of these domains was identical among the R2B genes, whereas the second and fourth FNIII domains differed slightly (Figure 8). In the second FNIII domain, exon sizes varied from 297 bp in ptprt, to 303 bp in ptprk, and 309 bp in ptprm and ptprl. The only difference in the fourth FNIII domain was in ptprk, in which one of the three exons comprising this domain was slightly larger (106 vs 103 bp) than in the other three genes.
Figure 2. Organization of the murine PTPρ gene based on Celera genomic sequences. Left to right: Exon number, 3' splice site, exon sequence, 5' splice site, nucleotide number, exon size, intron size, intron phases and protein domain are shown. Amino acids (standard one letter code) are listed below the encoding nucleotides. D1 and D2 represent the first and second phosphatase domains, respectively; a to i designations indicate the individual exons within a single domain.
Organization of the murine PCP-2 gene based on Celera genomic sequences.
Within the intracellular segment, the most dramatic variation in size, number and percentage nucleotide identity was observed in exons corresponding to the juxtamembrane region (Figures 7 and 8). This region consisted of six distinct exons (14-18) and is thought to be involved in substrate recognition and specificity, properties likely to show the greatest differences among the RPTPs (discussed below). Sequence comparison and exon/intron structure indicated that the two phosphatase domains (exons 19-32) were highly conserved. Furthermore, the degree of nucleotide identity was constrained to a relatively narrow range. A detailed analysis of the R2B phosphatase domains has been described previously [8].
The first intron in all four R2B genes (Figure 6) was disproportionately large, a feature shared with other cell adhesion molecules. Intron/exon junctions (Figures 2, 3, 4, 5) conformed to the AG/GT rule [18]. Precise exon boundaries were determined by the presence of consensus splice sites [19] and preservation of the cDNA reading frame. Exon/intron boundaries were identical in all four mouse and human genes. Extracellular exons were primarily in phase 1 and the boundaries of the protein domains were always demarcated by a phase 1 boundary. In contrast, intracellular exons were much smaller and the majority, including those aligned with domain boundaries, were in phase 0 (Figures 2, 3, 4, 5, 8).
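As a rough illustration of how junction phase and the AG/GT rule can be checked computationally, the following Python sketch derives intron phases from coding exon lengths and tests an intron for canonical GT...AG ends. The function names and the example exon lengths are hypothetical and are not taken from Figures 2-5; the sketch assumes the exon lengths count coding nucleotides only, starting at the first base of the start codon.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase of each intron separating consecutive coding exons.

    The phase is the number of coding bases preceding the intron, modulo 3:
    0 = intron lies between codons, 1 = after the first base of a codon,
    2 = after the second base of a codon.
    """
    cumulative = list(accumulate(coding_exon_lengths))
    # The last exon is not followed by an intron, so drop its total.
    return [total % 3 for total in cumulative[:-1]]


def follows_gt_ag_rule(intron_seq):
    """True if the intron begins with GT and ends with AG (sense strand)."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


if __name__ == "__main__":
    # Hypothetical coding exon lengths, for illustration only.
    example_lengths = [100, 50, 30]
    print(intron_phases(example_lengths))
    print(follows_gt_ag_rule("gtaagtccttttctctag"))
```

Because phase depends only on the cumulative coding length, exons whose lengths are multiples of three leave the phase of all downstream junctions unchanged.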

In situ hybridization
Previous in situ hybridization and Northern studies have shown that the four R2B family members are expressed in many tissues throughout development: PTPκ mRNA was present in brain, lung, skeletal muscle, heart, placenta, liver, kidney, and intestine; PTPµ was present in brain, lung, skeletal muscle, heart, placenta, and embryonic blood vessels [20,21], and PCP-2 was detected in the brain, lung, skeletal muscle, heart, kidney and placenta [20,22,23]. The distribution of PTPρ is somewhat anomalous in that it was almost entirely restricted to the brain and spinal cord [9,10].
Figure 6. Genomic organization of the murine RPTP R2B genes. Exons are shown as vertical bars and introns as thin horizontal lines drawn to different scales (indicated by scale bars). The size of the genomic regions encoding the extracellular and intracellular segments of each gene is not drawn proportionally. Note that exon distribution and clustering is similar for each gene.
In the present study, digoxigenin-labeled cRNA probes were used to determine the cellular localization of R2B transcripts in specific regions of the adult (P180) mouse brain: the olfactory bulb, cerebral cortex, hippocampus and cerebellum (Figure 9). Each of the four R2B transcripts was expressed at moderate to high levels in the mitral, external granule and glomerular layers of the olfactory bulb, and at lower levels in the external plexiform layer. All four R2B transcripts were distributed throughout the cerebral cortex, with the highest levels observed in layers II, IV, and V (PTPρ), IV and V (PTPµ), II to V (PTPκ), and II through VI (PCP-2). Within the hippocampus and dentate gyrus, large cells (Golgi II neurons) scattered throughout the hippocampal CA1, CA2, and CA3 regions, oriens and pyramidal layers, the hilus and subiculum, expressed PTPρ and PTPµ at very high levels. The PTPκ and PCP-2 transcripts were also present in Golgi II neurons; however, expression was restricted to cells in the hilus (PTPκ, PCP-2) and subiculum (PCP-2). Much higher expression levels were present in hippocampal pyramidal cells and dentate granule cells. Each of the four R2B transcripts was differentially expressed in the cerebellum. PTPρ mRNA was almost entirely restricted to the granule cell layer of lobules 1-6 of the cerebellar cortex and deep cerebellar neurons; very sparse labeling was also present in basket and stellate cells in the molecular layer. […] cerebellar neurons. The sense signal for each of the four genes (not shown) was very low and distributed uniformly across sections, indicating that non-specific expression was negligible. These studies show that each of the four R2B transcripts exhibits exclusive, as well as overlapping, distribution patterns.

Phylogenetic analysis of murine RPTP R2B cDNA sequences
The phylogenetic relationships among the complete R2B phosphatase sequences, encompassing both extra- and intracellular regions, were examined. Analysis of the full-length mouse cDNA nucleotide and predicted amino acid sequences indicated that the four genes originated from a common ancestor that gave rise to two separate branches (Figure 10). Of the four R2B genes, PTPρ (ptprt) and PTPµ (ptprm) were most closely related, followed by PTPκ (ptprk). The most distant member was PCP-2 (ptprl). Previous phylogenetic analyses, based solely on the comparison of the first [3,24] and second [25] phosphatase domains, provided similar results. A priori, the four type R2B phosphatases could have arisen either by a single fusion event followed by at least two rounds of duplication, or by several separate fusion events. In the first instance, the phylogenetic tree generated by comparing the first phosphatase domains should be the same as that generated by comparing the entire proteins. Different phylogenetic trees would be expected if the four R2B phosphatases were generated by separate fusion events. Our finding that the phylogenetic relationship of the four complete proteins is the same as that of the phosphatase domains argues in favor of the former explanation, and supports the contention that during the transition from single-celled to multicellular organisms, double domain phosphatases originated by duplication, followed by fusion to cell adhesion-like genes [25].
Exon sizes within the murine R2B extracellular and juxtamembrane domains.
Each of the four R2B genes expressed in the brain used the five alternatively spliced exons in a different combination: In PTPρ transcripts, exons 17a and 20a were absent, and exons 14, 16, and 22a were alternatively spliced (Figure 11). In PTPµ transcripts, exons 14, 16, 20a and 22a were absent; exon 17a was present and not alternatively spliced. The alternative use of two 5' splice consensus sites resulted in the transcription of an additional 58 bp of the intron between exons 13 and 15 (Figure 12). In PTPκ mRNA, exons 14 and 22a were absent, and exons 16, 17a and 20a were alternatively spliced (Figure 13). In PCP-2 mRNA, exon 14 was absent, exon 16 was not transcribed in brain, and exons 17a, 20a, and 22a were alternatively spliced (Figure 14). These results are summarized in Table 2. Splicing was also examined in human R2B transcripts, where the use of alternatively spliced exons was virtually identical to that observed in the mouse genes. No age-related or regional differences were observed in the CNS in any of the above studies.
The high frequency of alternatively spliced exons in the R2B juxtamembrane segment suggests that the region has highly specialized functions. The importance of alternatively spliced exons has been well documented for the closely related Type 2 RPTP, LAR, in which a small (27 bp) alternatively spliced exon (LASE-c) was identified in the fifth FN-III domain [31]. Subsequently, a 33 bp exon (LASE-a), was identified in the intracellular juxtamembrane region [32]. LASE-a, which was shown to be brain specific and developmentally regulated, was present in cell bodies of cultured granule cells, but was absent in neurites. Conversely, the LASE-c isoform was absent in cell bodies and present in neurites. Using in vitro ligand binding assays, the laminin-nidogen extracellular matrix complex was identified as a ligand for LAR, specifically interacting with the fifth FN-III domain [33]. When LAR bound the laminin-nidogen complex, cells formed long processes. Inclusion of the alternatively spliced 27 bp LASE-c exon disrupted this binding, causing changes in cell morphology. These studies imply a role for alternatively spliced exons in neurite extension through modification of cell adhesion.
The juxtamembrane region of the four R2B phosphatases shows greater variation in exon size and number, and is considerably longer, than the comparable region in other receptor-like PTPs. Furthermore, the region displays sequence similarity to the intracellular domain of cadherins, a family of calcium-dependent transmembrane proteins involved in homophilic cell adhesion. Cadherins bind catenins [34], which in turn bind the actin cytoskeleton [35], thereby influencing cell adhesiveness and changes in morphological attributes such as neurite extension and growth cone rearrangement. The intracellular domain is highly conserved among cadherin family members, and is essential for cadherin-mediated cell adhesion [36]. Both PTPµ [37] and PTPκ [38] have been shown to stimulate neurite extension in retinal explants and in cerebellar cultures, respectively. Furthermore, the intracellular segment of PTPµ binds directly to the intracellular domain of E-cadherin [39,40] in a complex with α- and β-catenin. The other R2B phosphatases have also been shown to interact with the cadherin/catenin pathway: PTPκ interacts with β- and γ-catenin at adherens junctions [41]; PCP-2 colocalizes with β-catenin and E-cadherin at cell junctions [22], and directly interacts with β-catenin [42]; and PTPρ binds cytoskeletal components including α-actinin and β-catenin [29]. More recent studies on PTPµ have further delineated this pathway: PTPµ-mediated neurite extension in retinal neurons is also dependent on PKCδ [43] and Cdc42 [44] activity. In addition, PTPµ is required for E-cadherin dependent cell adhesion [45], and for recruiting RACK1 to cell-cell contacts [46]. The physical association of PTPµ with RACK1 has been demonstrated [46]. It is likely that the juxtamembrane segment also mediates the interaction of PTPµ with these additional transduction molecules.
The preponderance of alternatively spliced exons in the juxtamembrane region may add specificity to R2B adhesive functions via regulation of juxtamembrane binding specificity.

Conclusions
Analysis of the intron/exon structure of the four R2B phosphatase genes revealed that despite considerable disparities in gene size, genomic organization was virtually identical, possibly reflecting their close phylogenetic relationship. In the central nervous system, the expression pattern of each of the four transcripts was unique, perhaps resulting from the use of different transcription factor binding sites. Considerable variation in exon utilization was seen in the juxtamembrane domain, a region shown to interact with a variety of intracellular signal transduction molecules. Alternative splicing of exons in this region could result in different functional roles for each of the R2B phosphatases.

Genomic structure of R2B genes
The genomic structure of the four murine R2B RPTP genes was determined as follows: The R2B cDNA sequences were used to identify the corresponding genomic shotgun clones in the Celera mouse genomic DNA database, using BLAST (parameters set to default values) and MEGABLAST programs. The identified individual shotgun fragments were aligned onto their respective scaffolds, and distances were calculated based on scaffold lengths. A similar approach using the NCBI [47] and Sanger Center [48] databases was used to identify the human R2B gene structure. The identified clones were superimposed onto the assembled minimal tiling paths and the size of the genes was calculated from the sizes of the individual overlapping clones. In order to determine exon/intron organization, each cDNA sequence was compared to genomic DNA sequences using Spidey [49]. The vertebrate genomic sequence was selected as input, "use large intron sizes" was enabled, and the minimum mRNA-genomic identity was set to 60%.
Figure 11. Alternative splicing of PTPρ mRNA. RT-PCR products were amplified using primers flanking exon 14 (panels A and B), exon 16 (panels C and D) and exon 22a (panels E and F). Left panels: bands in lanes 1, 2, and 3 are from human fetal brain, mouse P1 brain, and mouse P60 brain total RNA, respectively. Right panels: bands in lanes 4, 5, 6 and 7 contain total RNA from cerebellum, brain stem, basal forebrain and cortex (P23), respectively. Transcripts containing both splice forms of exons 14, 16 and 22a were found in all lanes.

Phylogenetic analysis
RPTP R2B nucleotide and amino acid sequences were aligned using Vector NTI Suite, V.6, AlignX. PAUP 4.0b10 was used to construct a phylogenetic tree of the R2B gene family. The S. cerevisiae tyrosine phosphatase PTP1, and the D. melanogaster receptor tyrosine phosphatase, DLAR, were used as outgroups. Rooted phylogenetic trees were drawn using the parsimony method with transversions weighted 10:1 over transitions, and changes in the first nucleotide of the triplet codon were weighted by a factor of 2 over changes in the second or third nucleotides. Heuristic searches were used to find the optimum tree, with the order of sequence additions randomized.
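The weighting scheme described above can be illustrated with a small pairwise example. The Python sketch below scores the differences between two aligned, in-frame, gap-free coding sequences using a 10:1 transversion:transition weight and a factor of 2 for first codon positions. It is not the PAUP implementation; the function names are hypothetical, and combining the two weights by multiplication is an assumption made here for illustration.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return {a, b} <= PURINES or {a, b} <= PYRIMIDINES


def substitution_cost(a, b, codon_position,
                      tv_weight=10, ts_weight=1, first_pos_factor=2):
    """Cost of one substitution under the assumed weighting scheme."""
    if a == b:
        return 0
    base = ts_weight if is_transition(a, b) else tv_weight
    # Assumption: the codon-position factor multiplies the base weight.
    return base * (first_pos_factor if codon_position == 1 else 1)


def weighted_distance(seq1, seq2):
    """Sum of weighted substitution costs between two aligned sequences.

    Both sequences must have equal length, start in frame, and contain
    only A, C, G and T.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    pairs = zip(seq1.upper(), seq2.upper())
    return sum(substitution_cost(a, b, i % 3 + 1)
               for i, (a, b) in enumerate(pairs))


if __name__ == "__main__":
    print(weighted_distance("ACG", "GCG"))  # transition at codon position 1
    print(weighted_distance("ACG", "ACT"))  # transversion at codon position 3
```

A parsimony search would apply costs of this kind along every branch of each candidate tree and retain the tree with the lowest total.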

Transcription factor binding sites
The genomic region to be examined for transcription factor binding sites was determined using BLAST2 [50] and FirstEF [51]. The RPTP 5' UTRs and genomic DNA sequences were aligned pairwise to detect introns. For cases where multiple 5' UTRs were reported in Genbank, the sequences were aligned and differences identified as either an incomplete reporting of the 5' UTR, or possible alternative start sites if sequences were located in different regions of the genome. The "MATCH" program [52] was used to identify potential transcription factor binding sites in the 5000 bp preceding the 5' UTR, using the Vertebrate matrix of the TRANSFAC 5.0 database, with cut off values set to "minimize false positives and false negatives".
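Selecting the 5000 bp preceding a 5' UTR depends on the strand on which the gene lies. The following Python sketch computes such a window in 1-based inclusive coordinates; the function name and the example coordinates are hypothetical and do not correspond to any of the R2B loci.

```python
def upstream_window(utr_5prime_pos, strand, length=5000):
    """Return (start, end) of the region preceding the 5' UTR.

    utr_5prime_pos is the genomic coordinate of the first transcribed base.
    Coordinates are 1-based and inclusive. On the '+' strand the window lies
    at lower coordinates; on the '-' strand it lies at higher coordinates.
    """
    if strand == "+":
        end = utr_5prime_pos - 1
        start = max(1, utr_5prime_pos - length)  # clip at the contig start
        return start, end
    if strand == "-":
        return utr_5prime_pos + 1, utr_5prime_pos + length
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(upstream_window(10001, "+"))
    print(upstream_window(10000, "-"))
```

For a minus-strand gene, the extracted window would additionally need to be reverse-complemented before being scanned for binding-site matrices.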

Riboprobe synthesis and in situ hybridization
The distribution of R2B RPTPs in the brain was determined by in situ hybridization with digoxigenin-labeled RNA probes, synthesized as follows: […] (20 µm) in the sagittal plane, and in situ hybridization was conducted as described previously [9,10]. Riboprobe-labeled sections were washed at a final stringency of 0.125x SSC, at 65°C. Following the hybridization washes, the sections were processed with an anti-digoxigenin antibody (Roche) [53], dried and coverslipped.

Alternative splicing of the four RPTP R2B genes
First strand cDNA was made from total RNA from neonatal (P1) and adult (P60) mouse whole brain using Superscript II Reverse Transcriptase (Invitrogen). In addition, cDNA was made from cerebellum, brainstem, forebrain and cortex of a P23 mouse, and a 16-24 week old human fetal brain (Clontech). The reverse primer (5' CACGCACACAGTTGAAGATGTCC), which was used in all RPTP first strand cDNA synthesis, is complementary to a region near the end of the first phosphatase domain (3580 to 3602 nt; NM_007050). PCR was performed (Platinum Taq, Invitrogen) as recommended by the manufacturer.
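As a quick consistency check on the reverse primer, the Python sketch below confirms that its length matches the stated 3580-3602 nt interval and derives the sense-strand sequence it should anneal to. The helper name is hypothetical; the check does not retrieve NM_007050, so whether the derived sequence actually occurs at those coordinates is left unverified here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer as given in the text, written 5' to 3'; any hyphen introduced by
# typesetting is removed before use.
REVERSE_PRIMER = "CACG-CACACAGTTGAAGATGTCC".replace("-", "")

# Stated target interval (inclusive coordinates).
TARGET_START, TARGET_END = 3580, 3602

if __name__ == "__main__":
    target_length = TARGET_END - TARGET_START + 1
    print(len(REVERSE_PRIMER), target_length)
    # Sense-strand sequence expected at the target site, 5' to 3'.
    print(reverse_complement(REVERSE_PRIMER))
```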
Figure 12. Alternative splicing of PTPµ mRNA. RT-PCR products were amplified using primers flanking exon 14. Panel A: Bands in lanes 1, 2, and 3 are from human fetal brain, mouse P1 brain, and mouse P60 brain total RNA, respectively. Panel B: Bands in lanes 4, 5, 6 and 7 contain total RNA from P23 cerebellum, brain stem, basal forebrain and cortex, respectively. Transcripts containing both splice forms were found in all lanes.
All primers were used at a final concentration of 250 nM. An Eppendorf Mastercycler Gradient was used with the following cycling parameters: 2 minutes at 94°C, 35 cycles of 15 seconds at 94°C, 30 seconds at 58 or 60°C, 45 seconds at 72°C, and a final extension step (5 minutes at 72°C). The PCR products were run on 3.5% NuSieve GTG agarose (Biowhittaker) gels, stained with ethidium bromide and photographed using a Kodak DC120 camera. DNA bands were isolated and gel purified using the Qiagen Gel Extraction kit. Identity of all RT-PCR products was confirmed by sequencing. Primer sequences are available from the authors upon request (rotter.1@osu.edu).
Figure 13. Alternative splicing of PTPκ mRNA. RT-PCR products were amplified using primers flanking exon 16 (panels A and B), exon 17a (panels C and D) and exon 20a (panels E and F). Left panels: bands in lanes 1, 2, and 3 are from human fetal brain, mouse P1 brain, and mouse P60 brain total RNA, respectively. Right panels: bands in lanes 4, 5, 6 and 7 contain total RNA from cerebellum, brain stem, basal forebrain and cortex (P23), respectively. Transcripts containing both splice forms of exons 16 and 20a were found in all lanes.
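For planning purposes, the programmed hold time of the thermal cycling protocol given in this section can be totalled as in the Python sketch below. The constant names are hypothetical, and the figure excludes ramping between temperatures, which depends on the instrument.

```python
# Hold times taken from the cycling parameters stated in the text.
INITIAL_DENATURATION_S = 2 * 60   # 2 minutes at 94 °C
CYCLES = 35
STEP_SECONDS = (15, 30, 45)       # denaturation, annealing, extension
FINAL_EXTENSION_S = 5 * 60        # 5 minutes at 72 °C


def programmed_hold_seconds():
    """Total programmed hold time in seconds, excluding temperature ramps."""
    per_cycle = sum(STEP_SECONDS)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    total = programmed_hold_seconds()
    print(total, "s =", total / 60, "min")
```

The 35 cycles of 90 seconds account for 3150 of the 3570 seconds, so the run lasts roughly one hour before ramp times are added.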

Figure 14. Alternative splicing of PCP-2 mRNA. RT-PCR products were amplified using primers flanking exon 17a (panels A and B), exon 20a (panels C and D) and exon 22a (panels E and F). Left panels: bands in lanes 1, 2, and 3 are from human fetal brain, mouse P1 brain, and mouse P60 brain total RNA, respectively. Right panels: bands in lanes 4, 5, 6 and 7 contain total RNA from cerebellum, brain stem, basal forebrain and cortex (P23), respectively. Transcripts containing both splice forms of exons 17a, 20a, and 22a were found in all lanes.
Eight genomic regions containing predicted exons were examined. 0 indicates that the exon was absent (one band at the smallest expected size); 1 indicates the exon was present, but not alternatively spliced (one band seen at the largest expected size); 2 indicates that the exon was present and alternatively spliced (2 bands observed). ** exon not transcribed in brain.
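The 0/1/2 scoring scheme described above can be expressed as a simple rule on the RT-PCR band sizes observed with primers flanking a candidate exon. The Python sketch below is a minimal illustration; the function name, the size tolerance and the example band sizes are hypothetical and are not measurements from Figures 11-14.

```python
def exon_usage_code(band_sizes, skipped_size, included_size, tolerance=5):
    """Score exon usage from observed RT-PCR band sizes (in bp).

    Returns 0 if only the exon-skipped product is seen, 1 if only the
    exon-included product is seen, and 2 if both are seen (alternatively
    spliced). Bands match an expected size within +/- tolerance bp.
    """
    def seen(expected):
        return any(abs(band - expected) <= tolerance for band in band_sizes)

    has_skipped = seen(skipped_size)
    has_included = seen(included_size)
    if has_skipped and has_included:
        return 2
    if has_included:
        return 1
    if has_skipped:
        return 0
    raise ValueError("no band matches either expected product size")


if __name__ == "__main__":
    # Hypothetical amplicon: 200 bp without the exon, 260 bp with it.
    print(exon_usage_code([200, 260], skipped_size=200, included_size=260))
    print(exon_usage_code([261], skipped_size=200, included_size=260))
    print(exon_usage_code([199], skipped_size=200, included_size=260))
```

Small exons produce size differences of only a few tens of base pairs, so the tolerance must stay well below the exon length for the two products to be distinguished.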

Authors' Contributions
JB conducted alternative splicing experiments and bioinformatic analysis; MP conducted in situ hybridization experiments; RD identified transcription factor binding sites; AF prepared text and figures, and assisted with data analysis; AR supervised studies and assisted with data analysis.