GPR99, a new G protein-coupled receptor with homology to a new subgroup of nucleotide receptors

Background Based on sequence similarity, the superfamily of G protein-coupled receptors (GPRs) can be subdivided into several subfamilies, the members of which often share similar ligands. The sequence data provided by the human genome project allows us to identify new GPRs by in silico homology screening, and to predict their ligands. Results By searching the human genomic database with known nucleotide receptors we discovered the gene for GPR99, a new orphan GPR. The mRNA of GPR99 was found in kidney and placenta. Phylogenetic analysis groups GPR99 into the P2Y subfamily of GPRs. Based on the phylogenetic tree we propose a new classification of P2Y nucleotide receptors into two subgroups predicting a nucleotide ligand for GPR99. By assaying known nucleotide ligands on heterologously expressed GPR99, we could not identify specifically activating substances, indicating that either they are not agonists of GPR99 or that GPR99 was not expressed at the cell surface. Analysis of the chromosomal localization of all genes of the P2Y subfamily revealed that all members of subgroup "a" are encoded by less than 370 kb on chromosome 3q24, and that the genes of subgroup "b" are clustered on one hand to chromosome 11q13.5 and on the other on chromosome 3q24-25.1 close to the subgroup "a" position. Therefore, the P2Y subfamily is a striking example for local gene amplification. Conclusions We identified a new orphan receptor, GPR99, with homology to the family of G protein-coupled nucleotide receptors. Phylogenetic analysis separates this family into different subgroups predicting a nucleotide ligand for GPR99.


Background
The superfamily of G protein-coupled receptors (GPRs) is one of the largest human gene families [1]. Many different approaches have been undertaken to identify new GPRs, both biochemically and by database searches. Recently, we have identified several new GPR genes from the database of expressed sequence tags by a complex computational strategy [2]. With the availability of the human genomic sequence another source for data mining became accessible, which is especially valuable for GPR searches, since many GPR genes contain no or only a few introns. Nevertheless, the existence of pseudogenes, many of which are not transcribed or lead to truncated proteins, makes it necessary to prove the expression of each putative gene found in the genome. For this publication we used some of the new receptors with homology to the subfamily of P2Y nucleotide receptors [2] to search for further GPRs in the human genomic database.

Identification of GPR99
Performing TBLASTN searches of the human genomic database with the known nucleotide GPRs, we identified an additional ORF, which we named GPR99. The ORF reaches from bp 140188 to 141201 of the BAC clone with the accession number AC026756.15. The putative start codon in a Kozak context is located 15 bp downstream of an in-frame stop codon. The last 20 bp of the ORF and the 3' UTR are found on an EST clone derived from thyroid epithelium (AW827323). Sequencing the entire EST clone from the I.M.A.G.E. consortium [3], which was supplied by the Resource Center of the German Human Genome Project at the Max-Planck-Institute for Molecular Genetics, confirmed the genomic sequence downstream of the coding region for transmembrane domain six. To show mRNA expression of full-length GPR99, we amplified the entire coding region together with the upstream stop codon from a human placenta cDNA library by PCR. Direct sequencing of the PCR product was in perfect accordance with the genomic data. The sequence was submitted to GenBank under accession number AF370886. mRNA expression of GPR99 was proven by northern blot analysis. A 3.0 kb mRNA was detected in kidney and, to a lesser extent, in placenta (Fig. 1).
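As a quick consistency check on the coordinates given above, the length of the ORF can be computed directly. The following is a minimal sketch that assumes the stated positions are 1-based and inclusive and that the span includes the stop codon; neither assumption is stated explicitly in the text, and the function name is purely illustrative.

```python
def orf_summary(first_bp: int, last_bp: int, includes_stop: bool = True) -> dict:
    """Summarize an ORF given its first and last base positions.

    Assumption (not stated in the text): coordinates are 1-based and inclusive.
    """
    length_bp = last_bp - first_bp + 1
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_bp // 3
    # If the span includes the stop codon, one codon does not encode a residue.
    residues = codons - 1 if includes_stop else codons
    return {"length_bp": length_bp, "codons": codons, "residues": residues}


# Coordinates of the GPR99 ORF on BAC clone AC026756.15, as given in the text.
summary = orf_summary(140188, 141201)
print(summary)
```

Under these assumptions the span covers 1014 bp, that is, 338 codons, which would correspond to a protein of 337 residues if the final codon is the stop codon.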

Sequence analysis
GPR99 shares 36% amino acid identity with each of its closest homologs, the P2Y1 receptor (57% similarity) and GPR91 (55% similarity) (Fig. 2). "Fingerprint" analysis for GPR subtypes [4] also groups GPR99 into the P2 purinoceptor subfamily. No receptors from other species with a similar degree of relatedness were identified in the databases.
Phylogenetic analysis of all human members of the P2Y subfamily of GPRs results in three subgroups designated "a", "b", and "n" (Fig. 3A). Subgroup "n" (for non-nucleotide receptors) contains P2Y5, P2Y7, P2Y9, and P2Y10, all of which were designated P2Y receptors on the basis of homology. Closer analysis revealed that none of these is activated by nucleotides, but one of them, P2Y7, binds leukotriene B4 [5]. In contrast, GPR99 belongs to subgroup "b" which, in addition to the new orphan receptor GPR91, consists of five receptors with proven nucleotide agonists, hinting at a similar ligand for GPR91 and GPR99. Subgroup "a" consists of the orphan receptors GPR87 and H963, and of GPR86/P2Y13 [6], P2Y12, and KIAA0001, which bind the nucleotide ligands ADP, ADP, and UDP-glucose, respectively.
A closer analysis of residues implicated in direct nucleotide interactions confirms the existence of two distinct nucleotide receptor subgroups. At the beginning of transmembrane domain seven, all receptors of subgroup "b", including GPR99, share the motif Y-X-V-T-R-P-L, which is not found in the other GPRs (Fig. 3B). The highly specific positively charged arginine residue of this motif was shown to play an important role in the binding of negatively charged nucleotides [7]. In the same position the subgroup "a" members share the motif K-E-X-T-L-X-L. P2Y5, P2Y7, P2Y9, P2Y10, and other non-nucleotide receptors lack both of these motifs (Fig. 3B). These motifs might, therefore, be good diagnostic tools to predict ligands for additional orphan receptors.
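To illustrate how the two motifs could serve as a diagnostic, the following minimal sketch scans a protein sequence for either pattern. It is a simplification, not the procedure used here: it searches the whole sequence rather than only the start of transmembrane domain seven, the function and variable names are hypothetical, and the example fragments are made up for illustration and are not real receptor sequences.

```python
import re

# Motifs from the text; "X" (any residue) becomes "." in the regular expression.
SUBGROUP_MOTIFS = {
    "b": re.compile(r"Y.VTRPL"),   # Y-X-V-T-R-P-L
    "a": re.compile(r"KE.TL.L"),   # K-E-X-T-L-X-L
}


def classify_by_motif(protein_seq: str) -> str:
    """Return 'a' or 'b' if exactly one motif is found, else 'neither' or 'ambiguous'."""
    seq = protein_seq.upper()
    hits = [name for name, pattern in SUBGROUP_MOTIFS.items() if pattern.search(seq)]
    if len(hits) == 1:
        return hits[0]
    return "neither" if not hits else "ambiguous"


# Made-up fragments for illustration only (NOT real receptor sequences).
examples = {
    "toy_b": "LLFAYAVTRPLASANSC",
    "toy_a": "GGSKEATLWLTSLNAC",
    "toy_none": "AAGGLLSSVVNPIIY",
}
results = {name: classify_by_motif(seq) for name, seq in examples.items()}
print(results)
```

A sequence in which neither motif is found would, by the argument above, be a less likely candidate for a nucleotide receptor, although the absence of a motif alone does not identify a ligand.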
Towards the end of transmembrane domain seven many GPRs share the motif D/N-P-X-X-Y. GPR99 lacks the highly conserved proline residue of this motif (Fig. 3B) and shares a leucine residue at this position with two GPRs encoded by Herpes simplex virus six. Since some melatonin receptors, some retinal receptors, the muscarinic acetylcholine receptor from Drosophila, and the orphan receptors GPR35 and GPR52 also lack the proline residue at this position, GPR99 might very well be a functional receptor.
All nucleotide-binding GPRs including GPR99 contain the motif H-X-X-R/K at the end of transmembrane domain six, which is shared by only a few other GPRs, like the orphan receptors GPR17 and GPR34.

Chromosomal localization
The BAC clone AC026756.15 contains the GPR99 gene and is part of the contig NT_009840, which maps to chromosome 13q32-33. The Ensembl contig view tool [8], which identified the GPR99 ORF by automated gene prediction (volatile ID 426046), maps GPR99 to chromosome 13q32.2. Two disorders, congenital microcoria and one form of schizophrenia, have been linked to the same chromosomal location.
Figure 3 (legend fragment). The proline-containing motif towards the end of transmembrane domain seven is indicated below the alignment. The accession numbers of the receptors are: GPR86, AAK01864; GPR87, AAK01858; GPR91, AF348078; H963, AAC51846; KIAA0001, BAA02791; P2Y1, CAA07339; P2Y2, XP_006367; P2Y4, P51582; P2Y5, P43657; P2Y6, Q15077; P2Y7, Q15722; P2Y9, Q99677; P2Y10, AAB57836; P2Y11, AAB88674; P2Y12, AAG48944. (C) Schematic diagram of the subgroup "a" gene cluster on chromosome 3q24. In the upper part the overlapping BAC clones, coding for the receptors indicated below, are given. The genes of P2Y12, GPR86, and GPR87 are located in a head-to-tail configuration within a distance of 46.5 kb on one contig of AC024886. The GPR87 gene is also present on AC078816, which overlaps with AC024886 in 15 kb of its sequence. The H963 gene is encoded by AC078816 and AC011103, the KIAA0001 gene by AC078816, AC011103, and AC063935. From these data the order of the genes can be deduced. Gaps in the known sequence of the BAC clones coding for H963 and KIAA0001 make it impossible to determine the orientation and exact size of the intergenic regions.
In contrast to other nucleotide GPR genes, which are often found in clusters, we did not recognize further GPR genes close to GPR99. The chromosomal localization of the other nucleotide GPRs supports the proposed subgroup classification, since all genes of the subgroup "a" receptors are located within 355 kb on two overlapping BAC clones (AC024886 and AC078816) on chromosome 3q24 (Fig. 3C). The five subgroup "a" genes are, therefore, excellent examples of gene duplication events and divergent evolution of the resulting genes. The subgroup "b" genes are more scattered throughout the genome, with clusters of P2Y1 and GPR91 on chromosome 3q24-25.1, not far away from the subgroup "a" cluster, and of P2Y2 and P2Y6 on chromosome 11q13.5 [9] on the same BAC clone consisting of 191 kb (AP002761).

Functional assays
A full-length cDNA clone of GPR99 was constructed by ligating error-free PCR products. For expression in Xenopus oocytes this cDNA was cloned into pGemHE [10]. In vitro transcribed cRNA was injected together with the cRNA of a G protein-coupled inwardly rectifying potassium channel, and the oocytes were analyzed by whole cell clamp as described [11,12]. None of the applied nucleotide ligands (ATP, CTP, GTP, UTP, ADP, GDP, UDP, AMP, CMP, GMP, cAMP, cGMP, CMP-sialic acid, GDP-fucose, GDP-mannose, UDP-N-acetylgalactosamine, UDP-glucuronate, UDP-galactose, UDP-N-acetyl glucose) evoked stronger responses in GPR99-injected than in control oocytes. As an additional mammalian expression system we used CHO cells stably transfected with aequorin and the promiscuous G protein α subunit Gα16 [13]. These cells were transiently transfected with a GPR99-pcDNA3 construct. G protein activation was analyzed by measuring the Ca2+-dependent luminescence of aequorin in a bioluminometer (Berthold). As in the oocyte system, the above-mentioned nucleotides did not stimulate the emission of light in a GPR99-dependent manner. We conclude from this that either the nucleotide derivatives assayed are not agonists of GPR99 or GPR99 cannot activate the G proteins present in the oocytes and the CHO cells. Both systems are widely used for heterologous cell-surface expression of GPRs. Nevertheless, since no GPR99-specific antiserum was available, and since we did not add an epitope tag to GPR99, we cannot prove cell-surface expression, which is a prerequisite for the ligand assays used.

Conclusions
We describe the identification of GPR99 from human genomic data and its expression in kidney and placenta. Based on a new subgroup classification of the P2Y subfamily that draws on sequence similarity, specific ligand-binding motifs, and gene clustering, we propose that GPR99 is a receptor for nucleotide ligands.

Database searches and construction of evolutionary trees
BLASTN and TBLASTN searches of GenBank were performed using the National Center for Biotechnology Information server [14]. Amino-acid sequences were aligned with Clustal W [8]. Phylogenetic analysis was performed using the program PUZZLE [15]; a support value of at least 53% was assigned to each internal branch.

Northern-blot analysis
A commercial human multiple-tissue northern blot (Clontech) was hybridized using a [32P]-labeled 400 bp NotI-GPR99 fragment of the IMAGE clone 3014233 (AW827323) according to the manufacturer's instructions. The blot contained 1 µg of poly(A)+ RNA in each lane.

PCR
Based on the genomic sequence of hGPR99 an upstream primer containing the in-frame stop and start codon (AGA TGA AAG GAG ACA ACC ATG AAT G) and a downstream primer about 50 nucleotides 3' of the stop codon (CTT AGG ATG CTA GGT AAA GTA TCA GC) were designed. Full-length GPR99 was PCR amplified from placental cDNA prepared with the Marathon kit (Clontech) using a protocol with 30 cycles at an annealing temperature of 55°C and Taq polymerase.
To obtain an error-free GPR99 cDNA, the PCR products were cloned into the pGEM-T Easy vector (Promega) and a full-length clone was constructed by ligating error-free parts of two individual clones.
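The design of the two primers given above can be checked with a few lines of code. The sketch below uses the primer sequences exactly as printed; the helper names are hypothetical, and the check only confirms that the upstream primer carries a stop codon with an ATG in the same reading frame, not that the primers perform well in PCR.

```python
# Primer sequences as printed in the text, with spaces removed.
UPSTREAM = "AGA TGA AAG GAG ACA ACC ATG AAT G".replace(" ", "")
DOWNSTREAM = "CTT AGG ATG CTA GGT AAA GTA TCA GC".replace(" ", "")

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_to_start_offset(primer: str):
    """Distance in bp from a stop codon to the first ATG in the same frame.

    Scans for the first stop codon that is followed by an in-frame ATG and
    returns the offset between the two; returns None if there is no such pair.
    """
    for stop_pos in range(len(primer) - 2):
        if primer[stop_pos:stop_pos + 3] in STOP_CODONS:
            for i in range(stop_pos + 3, len(primer) - 2, 3):
                if primer[i:i + 3] == "ATG":
                    return i - stop_pos
    return None


offset = stop_to_start_offset(UPSTREAM)
template_strand_view = reverse_complement(DOWNSTREAM)
print(offset, template_strand_view)
```

For the upstream primer the offset is 15 bp, which agrees with the distance between the in-frame stop codon and the putative start codon reported for the genomic sequence. The reverse complement of the downstream primer gives the sequence as it would read on the coding strand 3' of the stop codon.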

Ligand assays
For expression in Xenopus oocytes GPR99 was cloned into pGemHE [10]. In vitro transcribed cRNA was injected together with the cRNA of the G protein-coupled inwardly-rectifying potassium channel GIRK, and the oocytes were analyzed by whole cell clamp analysis [11,12].
As a mammalian expression system we used CHO cells stably transfected with aequorin and the promiscuous G protein α subunit Gα16 [13]. These cells were transiently transfected with GPR99 cloned into pcDNA3 (Invitrogen). Activation of G proteins was analyzed by measuring the Ca2+-dependent luminescence of aequorin in a bioluminometer (Berthold).