
Cross genome phylogenetic analysis of human and Drosophila G protein-coupled receptors: application to functional annotation of orphan receptors



The cell-membrane G protein-coupled receptors (GPCRs) form one of the largest known protein superfamilies and are a major focus of pharmaceutical research because of their key roles in cell physiology and disease. A large number of putative GPCRs are 'orphans' with no identified natural ligands. The first step in understanding the function of orphan GPCRs is to identify their ligands. Phylogenetic clustering methods have been used to elucidate the chemical nature of receptor ligands, which has led to the identification of natural ligands for many orphan receptors. We clustered human and Drosophila receptors with known ligands together with orphans through cross-genome phylogenetic analysis, hypothesizing that co-clustered members are more closely related and that this relationship would ease ligand identification, since related receptors share ligands of similar structure or class.


Cross-genome phylogenetic analyses identified eight major groups of GPCRs, divided into 32 clusters comprising 371 human and 113 Drosophila proteins (excluding olfactory, taste and gustatory receptors), and revealed unexpected levels of evolutionary conservation between human and Drosophila GPCRs. We also observe that human chemokine receptors, which are involved in the immune response, and most nucleotide-lipid receptors (except opsins) do not have counterparts in Drosophila. Similarly, a group of Drosophila GPCRs (methuselah receptors), associated with aging, is not present in humans.


Our analysis suggests ligand class associations for 52 unknown Drosophila receptors and 95 unknown human GPCRs. A higher level of phylogenetic organization was revealed, in which clusters sharing a common domain architecture, cellular localization, ligand structure or chemistry, or function are evident across the human and Drosophila genomes. Such analyses will prove valuable for identifying the natural ligands of Drosophila and human orphan receptors, which can lead to a better understanding of the physiological and pathological roles of these receptors.


G protein-coupled receptors (GPCRs) are one of the largest superfamilies of cellular receptor proteins, generally consisting of seven transmembrane helices (TMH) connected by three extracellular and three cytoplasmic loops of varying lengths. Different GPCRs respond to a wide variety of external stimuli (light, odorants, peptides, lipids, ions, nucleotides, etc.) and activate a number of different GTP-binding proteins (G proteins), thereby initiating a wide spectrum of intracellular responses. GPCRs play important roles in cellular signaling networks involving such processes as neurotransmission, taste, smell, vision, cellular metabolism, differentiation and growth, inflammatory and immune responses, and secretion. Abnormalities of signaling by GPCRs are the root cause of disorders that affect most tissues and organs in the body, such as color blindness, thrombosis, restenosis, atherosclerosis, hyperfunctioning thyroid adenoma, nephrogenic diabetes insipidus and precocious puberty. GPCRs are of major importance to the pharmaceutical industry since they play major roles in the pathogenesis of human diseases and are targets for more than half of the current therapeutic agents on the market [1]. Despite the importance of GPCRs in physiology and disease, only one high-resolution structure has been solved, that of bovine rhodopsin [2]. A majority of the identified GPCRs have no known ligand specificity (orphan receptors), which presents a challenge for identifying their native ligands and defining their function.

Characterizing the role of any GPCR involves the identification of both the activating ligand and the activated G protein. A diverse range of procedures has led to the identification of ligands for orphan receptors: (1) identifying relationships between receptor and ligand expression patterns [3], (2) testing tissue extracts in receptor-based functional assays, and (3) testing ligands of identified GPCRs on orphan GPCRs with high sequence identity [4], and in some cases randomly evaluating orphan GPCRs against arrayed families of known ligands. The physiological role of these receptors can be better understood once their natural ligands are identified, which in turn advances the design of pharmacologically active surrogate activators or inhibitors of GPCRs that have defined native ligands. The strategies described above will be facilitated by better prediction of the ligand structure or chemical class of orphan GPCRs.

Proteins similar in sequence often exhibit similar functions. Therefore, sequence homology can be used as a primary criterion for functional screening. This powerful principle can be extended to proteins that are homologous in different species, which has led to the identification of many novel GPCRs across different species [5]. Many orphan GPCRs are conserved among different species, suggesting that they should be active and thus bind novel ligands. This led to the idea that orphan GPCRs could be used as targets to identify their natural ligands and consequently led to the discovery of novel transmitters [6]. Orphan receptors that share more than 45 percent sequence identity with GPCRs with known ligands are very likely to also share common ligands [5]. Often, however, directly associating a ligand class with an orphan receptor by simple BLAST searches is non-trivial even at high sequence identity [7]: the top-ranking hits comprise GPCRs from diverse ligand classes (Metpally and Sowdhamini, unpublished results) and may not suggest a consensus from which a possible ligand class can be inferred directly. Moreover, if the sequence identity falls into the twilight zone (less than 30 percent), predictions using direct sequence search methods often fail. Phylogenetic tree building has shown that receptors that respond to the same, or similar, agonists often cluster together, even at low sequence identity. For example, most members of the prostanoid receptor subfamily share less than 30 percent amino acid identity, yet these receptors are more like one another than like any other GPCR [8]. Phylogenetic clustering methods have been used to elucidate the chemical nature of receptor ligands, which has led to the identification of natural ligands for many orphan receptors [9–14].
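The identity thresholds quoted above (more than 45 percent, less than 30 percent) can be made concrete with a small sketch. The function names and the choice to compute identity over gap-free aligned columns are assumptions made for illustration; several conventions for percent identity exist, and the article does not state which one was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: '-' marks a gap and both strings come from the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_regime(identity: float) -> str:
    """Map an identity value onto the regimes mentioned in the text."""
    if identity > 45.0:
        return "likely shared ligand"  # > 45 percent, per the text
    if identity < 30.0:
        return "twilight zone"  # < 30 percent, direct searches often fail
    return "intermediate"


# Toy aligned fragments (hypothetical, not real receptor sequences).
example = percent_identity("ACDE-F", "ACDQGF")
```

Pairs that fall into the "twilight zone" regime are the ones for which the phylogenetic clustering described in the text is expected to be more informative than a direct sequence search.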

GPCRs were previously classified into distinct families by different groups [14–18]. The classification includes rhodopsin-like receptors, secretin receptor-like receptors, metabotropic glutamate-like receptors, adhesion-like receptors and frizzled/smoothened-like receptors, as proposed by Fredriksson and coworkers [16]; in addition, other groups have proposed two more classes, viz., the fungal pheromone receptor-like family and the cyclic AMP receptor family [17, 18]. These classification schemes were generated mostly from individual genome studies [12, 16].

Studies in model organisms and cross-genome comparisons have provided major insights into the general understanding of numerous genes and pathways involved in a wide variety of physiological processes and human diseases [19]. Drosophila is a very good model organism owing to the simplicity of its genetic system and its short lifespan, which enable the screening of large numbers of individuals to identify mutations in new candidate genes that may have human counterparts involved in cellular physiology and disease [20]. Despite disparity in morphology or phenotype, Drosophila shows similarity with humans in developmental and cellular processes such as core aspects of the cell cycle, signaling pathways, apoptosis, neuronal signaling, the cytoskeleton and the core proteome (including the main protein domains and families) [21]. We therefore sought to use Drosophila GPCRs to study human gene function using comparative genomics [21–23].

A large number of Drosophila GPCRs have no characterized ligands. On the other hand, many human GPCRs are well characterized in their physiology and pharmacology. In this study, we collected a large set of GPCR sequences from the human and Drosophila genomes and performed multiple cross-genome phylogenetic analyses. The analysis reveals unexpected levels of similarity between the GPCRs of these two species and shows that phylogenetic association can be employed to predict ligands (chemical structure or class and/or function) for many Drosophila and human orphan receptors.
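The idea of transferring a ligand class from characterized receptors to co-clustered orphans can be sketched as follows. This is a deliberate simplification: the article reasons from branch proximity within each tree, whereas the sketch below uses a plain majority vote within a cluster. The function name, the input layout and the receptor names in the usage example are hypothetical.

```python
from collections import Counter


def suggest_ligand_classes(clusters, known_ligand_class):
    """Suggest a ligand class for each orphan from its cluster's known members.

    clusters: dict mapping a cluster id to a list of receptor names.
    known_ligand_class: dict mapping characterized receptors to a ligand class.
    Returns a dict: orphan name -> (suggested class, fraction of known members
    in the cluster that carry that class). Clusters without any characterized
    member yield no suggestion.
    """
    suggestions = {}
    for members in clusters.values():
        votes = Counter(
            known_ligand_class[m] for m in members if m in known_ligand_class
        )
        if not votes:
            continue  # nothing to transfer from
        top_class, count = votes.most_common(1)[0]
        support = count / sum(votes.values())
        for m in members:
            if m not in known_ligand_class:
                suggestions[m] = (top_class, support)
    return suggestions


# Hypothetical toy input; the names follow the _Hum/_Dro suffix convention only.
toy_clusters = {
    "A": ["R1_Hum", "R2_Hum", "R3_Hum", "X1_Dro"],
    "B": ["X2_Hum"],
}
toy_known = {"R1_Hum": "peptide", "R2_Hum": "peptide", "R3_Hum": "lipid"}
toy_result = suggest_ligand_classes(toy_clusters, toy_known)
```

A low support fraction flags heterogeneous clusters, where a class suggestion should be treated with more caution than in clusters whose characterized members all share one ligand class.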

Results and discussion

Cross-genome phylogenetic analysis of human and Drosophila non-olfactory receptors resulted in eight major groups: i) peptide receptors, ii) chemokine receptors, iii) nucleotide and lipid receptors, iv) biogenic amine receptors, v) secretin receptors, vi) glutamate receptors, vii) cell adhesion receptors and viii) frizzled receptors. These were further classified into 32 clusters (Table 1), with eleven clusters of peptide receptors, two clusters of chemokine receptors, six clusters of nucleotide and lipid receptors, five clusters of biogenic amine receptors, two clusters of secretin receptors, four clusters of glutamate receptors and one cluster each of cell adhesion and frizzled receptors (the combined phylogenetic and ligand analyses of human-Drosophila GPCRs are shown in Figures 1–9). About thirty-one GPCR sequences could not be assigned to any of these clusters; these are discussed separately below as unassociated GPCRs. Our method sometimes resulted in clusters whose members have ligands of different chemical structures or classes; these results are discussed in detail below.
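As a quick consistency check on the grouping above, the cluster ranges given in the figure captions of this article can be tabulated and summed. The dictionary and function names below are illustrative; only the group names and cluster numbers come from the text.

```python
# Cluster ranges per group, as stated in the figure captions (inclusive bounds).
CLUSTER_RANGES = {
    "peptide": (1, 11),
    "chemokine": (12, 13),
    "nucleotide and lipid": (14, 19),
    "biogenic amine": (20, 24),
    "secretin": (25, 26),
    "cell adhesion": (27, 27),
    "glutamate": (28, 31),
    "frizzled": (32, 32),
}


def clusters_in_group(group: str) -> int:
    """Number of clusters assigned to a group."""
    first, last = CLUSTER_RANGES[group]
    return last - first + 1


def group_of_cluster(cluster: int) -> str:
    """Look up the major group that a cluster number belongs to."""
    for group, (first, last) in CLUSTER_RANGES.items():
        if first <= cluster <= last:
            return group
    raise ValueError(f"cluster {cluster} is outside the range 1-32")


TOTAL_CLUSTERS = sum(clusters_in_group(g) for g in CLUSTER_RANGES)
```

The per-group counts reproduce the eleven, two, six, five, two, four, one and one clusters listed in the paragraph, which add up to the 32 clusters of Table 1.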

Table 1 List of GPCRs in each of the 32 clusters derived from phylogenetic analysis. The suffixes _Hum and _Dro refer to human and Drosophila sequences, respectively. Orphan receptors are shown in bold.
Figure 1

Phylogenetic trees of peptide receptors (clusters 1–11). Trees were inferred as described in Methods, using TREE-PUZZLE 5.1 corrected using the JTT substitution frequency matrix. Quartet-puzzling support percentage values from 10,000 puzzling steps are shown. The out-group is not shown in the figure. The scale bars indicate a maximum likelihood branch length of 0.1 inferred substitutions per site. Orphan receptors are shown in bold letters. Cluster numbers are marked in the top left corner.

Figure 2

Representative multiple sequence alignment of GPCR clusters. The GPCR sequences ET1R_Hum, ETAR_Hum, ETBR_Hum, ETB2_Hum, GRPR_Hum, NMBR_Hum, BRS3_Hum, GP37_Hum, Q8TDV0_Hum, Q9V858_Dro and Q9V9K3_Dro, belonging to cluster 4, were aligned with ClustalX. Only the sequence region spanning TMH-1 to TMH-7 was considered for the analysis (the alignment was modified by deleting the extremely variable amino termini upstream of the first transmembrane helix and the carboxyl termini downstream of the seventh transmembrane helix). Amino-acid residues identical in all aligned sequences are shaded in black and similar residues in gray; consensus residues are indicated below. Transmembrane helices (TMH) identified by the HMMTOP program are indicated.
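The trimming step mentioned in the caption (keeping only the region from TMH-1 to TMH-7) can be sketched as below. The sketch does not call HMMTOP; it assumes the seven helix boundaries have already been predicted and are supplied as 1-based inclusive (start, end) pairs. The function name and the toy sequence are hypothetical.

```python
def trim_to_tm_core(sequence: str, helices: list) -> str:
    """Return the region from the start of TMH-1 to the end of TMH-7.

    helices: seven (start, end) pairs, 1-based and inclusive, assumed to come
    from a transmembrane helix predictor run beforehand.
    """
    if len(helices) != 7:
        raise ValueError("expected exactly seven transmembrane helices")
    start = min(s for s, _ in helices)
    end = max(e for _, e in helices)
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("helix boundaries fall outside the sequence")
    # Drop the N-terminus upstream of TMH-1 and the C-terminus after TMH-7.
    return sequence[start - 1:end]


# Toy example: 5 N-terminal residues, a 70-residue core, 5 C-terminal residues.
toy_sequence = "NNNNN" + "ABCDEFG" * 10 + "CCCCC"
toy_helices = [(6 + 10 * i, 15 + 10 * i) for i in range(7)]
toy_core = trim_to_tm_core(toy_sequence, toy_helices)
```

Loops between the helices are kept, since only the termini outside the seven-helix core are removed.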

Figure 3

Phylogenetic trees of chemokine receptors (clusters 12 and 13). The mode of deriving phylogenetic trees is as described in Methods and indications are as in Figure 1.

Figure 4

Phylogenetic trees of nucleotide and lipid receptors (clusters 14–19). The mode of deriving phylogenetic trees is as described in Methods and indications are as in Figure 1.

Figure 5

Phylogenetic trees of biogenic amine receptors (clusters 20–24). The mode of deriving phylogenetic trees is as described in Methods and indications are as in Figure 1, except for cluster 22, where the scale bar indicates a maximum likelihood branch length of 1.0 inferred substitutions per site.

Figure 6

Phylogenetic trees of class B (secretin) receptors (clusters 25 and 26). The mode of deriving phylogenetic trees is as described in Methods and indications are as in Figure 1.

Figure 7

Phylogenetic tree of cell adhesion receptors (cluster 27). The mode of deriving the phylogenetic tree is as described in Methods and indications are as in Figure 1.

Figure 8

Phylogenetic trees of class C (glutamate) receptors (clusters 28–31). The mode of deriving phylogenetic trees is as described in Methods and indications are as in Figure 1.

Figure 9

Phylogenetic tree of frizzled/smoothened receptors (cluster 32). The mode of deriving the phylogenetic tree is as described in Methods and indications are as in Figure 1.

Peptide receptors

Clusters 1 to 11 comprise peptide receptors (Figure 1). The size of peptide ligands can vary from two amino acids to as many as 50. Some of the natural peptide ligands include apelin, bombesin, calcitonin, endothelin, galanin, gastrin, ghrelin, neurotensin, neuropeptides B, W and Y, orexin, oxytocin, relaxin, somatostatin and urocortins. These receptors are involved in many human diseases, including chronic inflammatory diseases, degenerative diseases, autoimmune diseases, cancer and cardiovascular diseases, and thus could be new therapeutic targets [24, 25].

Receptors with known ligands in cluster 1 bind galanins, kisspeptins or cyclic peptides. Drosophila allatostatin receptors (DARs) (Q9NBC8_Dro and Q9U721_Dro) are very closely related to galanin receptors [26]. The receptors Q969V1_Hum and Q96S47_Hum are closely related to the GP24_Hum receptor, which binds melanin-concentrating hormone, and may have similar cyclic peptides as their ligands. Despite its name, the orphan receptor SAPR_Hum does not bind somatostatins or angiotensins [27]; in this tree it is distantly related to the GP24_Hum and UR2R_Hum receptors and may instead bind similar cyclic peptides.

Cluster 2 consists of receptors for opioid, somatostatin and neuropeptide (NPB or NPW) ligands, which form different branches. Opioids and somatostatins are obtained by processing of larger precursor peptides. GPR7_Hum and GPR8_Hum are known to bind NPB/W ligands [28]. The branch formed by the Drosophila orphan receptors Q8ISJ9_Dro and Q8I943_Dro is close to the somatostatin receptors, and these receptors might bind ligands similar to somatostatins. Receptors for small peptides (apelin, angiotensin and bradykinin) make up cluster 3. The human orphan receptors GPR15_Hum, GPR25_Hum and Q8NGZ8_Hum are related to APJ_Hum and show significant amino acid identity, suggesting that they might bind small endogenous peptide ligands.

Cluster 4 comprises endothelin and bombesin receptors with known ligands (ET1R_Hum, ETAR_Hum and ETBR_Hum; the gastrin-releasing peptide receptor, GRPR_Hum; the neuromedin B receptor, NMBR_Hum; and the bombesin receptor, BRS3). The Drosophila orphan receptors Q9V9K3_Dro and Q9V858_Dro share a branch with the bombesin, GRPR and NMBR receptors. They share many conserved amino acids known to be important for high-affinity binding of gastrin-releasing peptide (GRP) and bombesin to GRPR and of NMB to NMB-R [29–31] (Figure 2). This suggests that Q9V9K3_Dro and Q9V858_Dro might be activated by similar neuropeptide(s). The human orphan receptor GPR37_Hum is closely related to ETB2_Hum, suggesting that it may bind endothelin-like peptides. Q8TDV0_Hum is similar in sequence to both galanin (cluster 1) and bombesin receptors, but sub-clustering of the peptide receptors by the maximum likelihood method placed it in this cluster, suggesting a close association between these two clusters.

Cluster 5 is composed of receptors for neurotensin (NT), neuromedin U (NMU), motilin, growth hormone secretagogue, thyrotropin-releasing hormone and some PRX-amide peptides. GPR39_Hum is closely related to the NT receptors and might bind neurotensin ligands. The Drosophila receptors Q8ITC7_Dro, Q9VFW5_Dro, Q9VFW6_Dro, Q8ITC9_Dro and Q9VP15_Dro form a separate branch that is closely related to the vertebrate neuromedin receptors; they bind PRXa pyrokinins, FXPRXamide, Cap2b-like peptides (FPRXamide) or ecdysis triggering hormones (PRXamide) (Park et al. 2002). Q9VDC4_Dro forms a distinct branch, is close in sequence to GHSR_Hum, TRFR_Hum, Q8ITC7_Dro and Q9VFW5_Dro, and might bind neuropeptides. The Drosophila orphan receptors Q9W4H3_Dro, Q9VT27_Dro, Q8SWR3_Dro, Q9V5T1_Dro, Q9W025_Dro and Q9W027_Dro branch out from TRFR_Hum and might form a separate family of receptors binding novel neuropeptide ligands. Supporting our analysis, Q9W025_Dro and Q9W027_Dro were reported as the first receptors specific for Drosophila myosuppressins (Drome-MS) [32], and Q9W4H3_Dro was reported as a receptor binding the neuropeptide proctolin [33]. Q9VT27_Dro is very closely related to Q9W4H3_Dro and might be activated by proctolin or similar neuropeptide ligands.

Cluster 6 consists of peptide hormone receptors binding arginine vasopressin (AVP), growth hormone releasing hormone, oxytocin, gonadotropin-releasing hormone II, crustacean cardioactive peptide (CCAP), corazonin or adipokinetic hormone (AKH) (Park et al. 2002). GP19_Hum is related to the Drosophila CCAP receptor (Q8ITD2_Dro), which is activated by CCAP and AKH but not by AVP. Thus, CCAP and AKH might also bind to and activate GP19_Hum. The Drosophila gonadotropin-releasing hormone and/or corazonin receptor (GRHR_Dro) and the putative corazonin (GRHR II) receptor cluster well with their human counterparts (GRHR_Hum and GRR2_Hum), suggesting early evolution of the GRHR receptors. Q8NGU9_Hum forms a separate branch but shares sequence similarity with the AVP receptors and might bind similar neuropeptide ligands.

Cluster 7 comprises leucine-rich repeat-containing G protein-coupled receptors (LGRs) such as glycoprotein receptors, follicle stimulating hormone receptor (FSHR_Hum), thyroid-stimulating hormone receptor (TSHR_Hum), luteinizing hormone receptor (LSHR_Hum) and receptors binding relaxin. These are unique in having a large N-terminal extracellular (ecto) domain containing leucine-rich repeats important for interaction with the glycoprotein ligands, and are classified into three sub-groups [34]. Our analysis also shows three LGR subfamilies: (i) the glycoprotein hormone receptors LSHR_Hum, FSHR_Hum, TSHR_Hum, Q8SX01_Dro and Q9NDI1_Dro; (ii) LGR4_Hum, LGR5_Hum and LGR6_Hum; and (iii) LGR5_Hum, LGR7_Hum and LGR8_Hum, Q9VBP0_Dro and Q9VYG0_Dro. The Drosophila orphan receptors Q8SX01_Dro and Q9NDI1_Dro are closely related to the human glycoprotein hormone receptors and might bind glycoprotein hormones. Q9VBP0_Dro and Q9VYG0_Dro are very similar to LGRs in their overall domain architecture, with long N-termini, and this relationship in extracellular domain arrangement is also evident from the present phylogenetic analysis, even though the N and C termini were not considered.

Cluster 8 consists of peptide receptors with known ligands such as gastrin (GAS), cholecystokinin (CCK), orexin (OXR) and neuropeptide FF (NFF) or morphine modulating peptides. GPR103_Hum (Q96P65) is closely related to the neuropeptide FF receptors, as predicted by our phylogenetic analysis and by a previous prediction on human GPCRs [12]. Subsequently, GPR103 was characterized and a novel RF-amide peptide, P52, was shown to be its ligand [35]. The Drosophila orphan receptors Q9VWR3_Dro (CCKLR-17D1) and Q9VWQ9_Dro (CCKLR-17D3) are related to each other, branch off from the cholecystokinin (CCK) receptors and might have cholecystokinin as their natural ligand. Q14439_Hum branches off from the orexin receptors, which bind two novel neuropeptides, orexin-A and orexin-B, derived from a common prepro-orexin precursor by proteolytic processing [36].

The receptors with known ligands that bind chemotactic substances (hydrophilic peptides, N-formyl-methionyls (FML) and anaphylactic complement factors) are part of cluster 9. These ligands are structurally very diverse but functionally related peptides. The human orphan receptors GP32_Hum and Q8NGA4_Hum branch out early from the FML receptors and may bind smaller hydrophilic peptides. L4R1_Hum, L4R2_Hum and Q8TDT2_Hum form a separate branch distant from the other chemotactic peptide receptors, without bootstrap support. CML1_Hum and GPR1_Hum form a separate branch distinct from the other branches, and GPR44_Hum forms an individual branch. Ligands for these receptors cannot be predicted from this phylogenetic tree, but the receptors may be activated by chemotactic substances [37].

Mas proto-oncogene, Mas-related genes (MRGs) and sensory neuron-specific G protein-coupled receptors (SNSRs) form cluster 10. Angiotensin (1–7) has been identified as an endogenous ligand for the G protein-coupled receptor Mas [38]. SNSRs are activated by proenkephalin A peptide fragments, like bovine adrenal medulla peptide 22 (BAM22). Some MRGs and SNSRs are expressed in nociceptive sensory neurons suggesting that they could be involved in pain sensation or its modulation. Previous studies also suggest that ligands for MRG receptors may include neuropeptides that modulate pain sensitivity [39]. Human orphan receptor Q8NGK7_Hum is closely related to MRG receptor.

All receptors with known ligands in cluster 11 are neuropeptide receptors. The Drosophila tachykinin-like peptide receptors (TLR1_Dro and TLR2_Dro) and the human neurokinin receptors (NK1-4R_Hum) form a closely knit branch. PKR1_Hum (Q8NFJ7) and PKR2_Hum (Q8NFJ6) form a separate branch of receptors that bind prokineticins [40]. Q9VRM0_Dro is closely related to the Drosophila receptor NYR_Dro, which binds neuropeptide Y, and might bind similar neuropeptides. The neuropeptide Y binding receptors (NY1R_Hum, NY4R_Hum, NY5R_Hum and NY6R_Hum (Q99463)) form a separate branch. The human prolactin-releasing peptide (PrRP) binding receptor GPR10_Hum forms a separate branch in this phylogenetic tree [41]. The Drosophila orphan receptors Q9VW75_Dro and Q8SZ35_Dro constitute a separate branch close to other neuropeptide receptors and might be activated by neuropeptides. Similarly, the orphan receptor GP72_Hum forms a new branch. The Drosophila orphan receptor Q9W189_Dro is a very distantly related member and was grouped into this cluster only on the basis of blastp results.

Chemokine receptors

Chemokine receptors are phylogenetically represented by two clusters, 12 and 13 (Figure 3). Chemokines are important molecules in inflammatory responses, act as immunomodulators and also have critical functions in lymphopoiesis [42]. No Drosophila members belong to this group of receptors, suggesting that these receptors might be of recent evolutionary origin. Chemokines have been divided into two subfamilies, CXC and CC, on the basis of the arrangement of the two disulphide-bond-forming N-terminal cysteine residues. Many human CXC chemokines that mainly act on neutrophils are clustered at chromosome 4q12–13, while many CC chemokines that mainly act on monocytes are located in another cluster at chromosome 17q11.2. Our phylogenetic analysis has also divided chemokine receptors into two major clusters, in agreement with the chemokine classes, suggesting co-evolution of receptors and ligands [43].
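The CXC versus CC distinction described above depends only on whether the first two N-terminal cysteines of the chemokine are adjacent or separated by one residue. A minimal sketch of that rule follows; the function name is hypothetical, the input is assumed to be a mature chemokine (ligand) sequence rather than a receptor sequence, and the example strings are toy fragments, not real chemokines.

```python
def chemokine_subfamily(mature_sequence: str) -> str:
    """Classify a chemokine by the spacing of its first two cysteines.

    Returns "CC" when the first cysteine is immediately followed by another,
    "CXC" when exactly one residue separates them, and "other" otherwise
    (other spacings are outside the two subfamilies named in the text).
    """
    seq = mature_sequence.upper()
    first = seq.find("C")
    if first == -1:
        return "other"  # no cysteine at all
    if seq[first + 1:first + 2] == "C":
        return "CC"
    if seq[first + 2:first + 3] == "C":
        return "CXC"
    return "other"


# Toy fragments for illustration only.
toy_labels = [chemokine_subfamily(s) for s in ("AKCCLG", "AKCQCLG", "AKLG")]
```

Because the rule looks only at the first cysteine pair, it is a sketch of the classification criterion rather than a validated chemokine annotator.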

Cluster 12 consists of receptors associated with CC-type chemokines. As reported previously using an earlier approach [12], O75307_Hum (CRAM-A) might bind a CC-type chemokine ligand. Cluster 13 consists of both CXC- and CC-type receptors. ADMR_Hum and Q8NE10_Hum (RDC1) form a branch, whereas the Duffy antigen and Q96CH1_Hum are distantly related to CML2_Hum. These two branches are associated with the chemokine receptors on the basis of BLASTP similarity to other members of this cluster, at E-values of 5e-04 and 7e-07, respectively.

Nucleotide and lipid receptors

Nucleotide and lipid receptors consist of six clusters (Figure 4). Except for cluster 14 (opsins) and cluster 18 (receptors whose ligands are derivatives of arachidonic acid), these clusters have no counterparts from Drosophila. Cluster 14 contains the opsins, which are activated by isoprenoid ligands. Drosophila opsins show significantly high homology to human opsins. There is strong conservation of the retinal binding site and other regions, suggesting that they are derived from a common ancestor and diverged thereafter while retaining structural and functional features [44]. The Drosophila receptor Q9VTU7_Dro is closely related to the OPS3–5_Dro receptors, which are localized in the inner cells of the Drosophila eye (either R7 or R8 cells). This suggests that Q9VTU7_Dro might also be localized in the inner cells of the Drosophila eye.

Receptors for pyrimidine or purine nucleotides, cysteinyl leukotriene, nicotinic acid (niacin; pellagra preventing factor) and short, medium and long chain fatty acids make up cluster 15. Q9BXC0_Hum (GPR81), Q8TDS5_Hum and GP31_Hum share a branch with the closely related nicotinic acid receptor (HM74_Hum) [45] and might have similar carboxylic acids as their ligands. Q8TDQ8_Hum and Q96P68_Hum are related to each other as well as to the P2Y receptors and may bind P2Y nucleotides. The GP17_Hum and GP82_Hum receptors are distantly related to the other members of this cluster and might represent potential new subfamilies binding nucleotides or lipids.

Cluster 16 is a heterogeneous group of receptors binding to lipids, nucleotides, modified nucleotides and platelet activating factor (PAF). Orphan receptor Q8TDU7_Hum (GPR86) is closely related to platelet ADP-binding receptor (P2YC_Hum). Q96JZ8_Hum (GPR87) is closely related to UDP-glucose receptor (P2YX_Hum) and might bind to a modified nucleotide ligand. GPR34_Hum forms a separate branch which is distantly related to PAFR_Hum. No prediction of ligands is possible for GPR34_Hum with this phylogenetic tree.

Cluster 17 consists of lipid receptors (cannabinoids, lysophospholipid sphingosine 1-phosphate (S1P)) and, exceptionally, some peptide receptors (melanocortin peptides derived from processing of pro-opiomelanocortin), represented in different branches. Although they bind different ligands, they identify each other during sequence searches and display 23–29% sequence identity. The functionally important motifs are fairly conserved [46] (see Additional data file 2). Indeed, this unusual branching including peptide and lipid receptors has been noted earlier by the groups of Methner and Fredriksson [12, 16].

Cluster 18 is composed of receptors binding to prostaglandins, prostacyclins and thromboxanes. All these ligands are derivatives of arachidonic acid (AA), which serves as the precursor via the cyclooxygenase (COX) pathway. Drosophila orphan receptor Q9VVJ1_Dro within this tree might bind to ligands derived from AA by the action of COX.

Cluster 19 is also a heterogeneous group of receptors, consisting of protease-activated receptors, psychosine receptors, and lysophosphatidylcholine and sphingosylphosphorylcholine receptors. Ovarian cancer G-protein-coupled receptor 1 (OGR1), previously described as a receptor for sphingosylphosphorylcholine, acts as a proton-sensing receptor stimulating inositol phosphate formation [47], whereas GPR4 is also involved in pH homeostasis but elicits cyclic AMP formation [48]. OGR1 (GPR68) and GPR4 are different from the other sphingosylphosphorylcholine binding endothelial differentiation gene (EDG) receptors. The orphan P2Y receptors in this cluster are misnomers, since they do not cluster with the classical nucleotide receptors (clusters 15 and 16) and instead appear to be co-clustered with members of this heterogeneous cluster. Either they have uncommon nucleotide(s) as natural ligands or, despite their structural similarity to the P2Y family, they are not nucleotide receptors [49]. GP35_Hum and Q8N580_Hum, EBI2_Hum and GP18_Hum and GP20_Hum cluster as separate branches and are distantly related to members of other branches, but probably bind lipids as their natural ligands.

Biogenic amine receptors

Biogenic amine receptors consist of five clusters, mainly comprising trace amine receptors; melatonin receptors; serotonin receptors; muscarinic acetylcholine, adenosine and histamine receptors; and dopamine, octopamine and adrenaline receptors (Figure 5). In these clusters a fairly good intermixing of human and Drosophila receptors is observed. This suggests that biogenic amine receptors have an ancient evolutionary origin, as they are found from invertebrates to higher vertebrates. Cluster 20 is represented mainly by trace amine (TA) receptors (Figure 5). Trace amines binding these receptors are believed to play an important role in human disorders such as depression, attention deficit disorder, schizophrenia and Parkinson's disease [50]. They form a subfamily of GPCRs distinct from, but related to, serotonin (5-HT), norepinephrine (NE) and dopamine (DA) receptors. The Drosophila orphan receptors Q9VG54_Dro and Q9VCZ3_Dro are closely related to 5H4_Hum. Q9P1P4_Hum (GPR57) and Q9P1P5_Hum (GPR58) are closely related to Q96RJ0_Hum (TA1). Similarly, O14804_Hum, a putative neurotransmitter receptor (PNR), is closely related to the trace amine receptors (Q969N4_Hum, Q96RI8_Hum and Q96RI9_Hum).

Cluster 21 consists of melatonin receptors (ML1A_Hum, ML1B_Hum and ML1X_Hum) and other related orphan receptors (O77269_Dro, O77270_Dro and Q9NQS5_Hum). Melatonin receptors bind and are activated by the biogenic amine 5-methoxy-N-acetyltryptamine (melatonin). The melatonin-related receptor (ML1X_Hum), despite sharing considerable amino acid sequence identity with the other melatonin receptors, does not bind melatonin [51]. The receptors in this cluster show greater sequence similarity to neuropeptide Y (NPY) receptors than to other biogenic amine receptors and were previously grouped with the NPY receptors [12].

All receptors with known ligands in cluster 22 are serotonin receptors. These are structurally distinct from the serotonin receptors in cluster 24. The Drosophila orphan receptors Q9VEG1_Dro and Q9VEG2_Dro form a separate branch but are closely related to the other serotonin receptors in this tree and might be activated by similar ligand(s). Q8TDV2_Hum and Q16538_Hum (Protein A-2), however, are distantly related to the other receptors in this tree and were placed here only on the basis of BLASTP similarity.

Receptors for biogenic amines (muscarinic acetylcholine, adenosine and histamine) and many orphan receptors are placed in different branches in cluster 23. The Drosophila orphan receptor Q9VHW1_Dro branches out along with the muscarinic acetylcholine and histamine receptors in this tree and might be activated by acetylcholine or histamines. Q9VAA2_Dro is closely related to the adenosine receptors. Super conserved receptors expressed in brain (SRB1-3) from vertebrate species form a separate branch and might represent a potential novel subfamily of GPCRs binding undiscovered endogenous biogenic amine ligands [52]. The high-affinity lysophosphatidic acid (LPA) receptor homologs O43898_Hum and GPR63_Hum form a distinct branch. Similarly, orphan receptors GP21_Hum and GP51_Hum, GPR62_Hum and Q8TDV4_Hum, Q8NDV2_Hum (GPR26) and Q8NGV3_Hum and Q9VMI4_Dro form a distinct branch, suggesting only a distant relationship with other members of the cluster.

Receptors for biogenic amines (dopamine, histamine, octopamine and adrenaline), a few serotonergic receptors and many orphan receptors are represented in different branches in cluster 24. The Drosophila dopamine 2-like receptor (DD2R), Q8IS45_Dro, groups well with its human counterparts, suggesting that their evolution extends back well before Drosophila. Interestingly, DOP2_Dro is grouped with the adrenaline receptors instead of with the dopaminergic receptors and shows similar sequence identity (40–48%) with vertebrate alpha 1- and beta-adrenergic, D1-like and D2-like dopaminergic, and serotonergic receptors. This Drosophila receptor has been discussed as a novel structural class of dopamine receptors [53]. The Drosophila octopamine receptor isoforms in mushroom bodies (OAMB) (O97171_Dro and O61730_Dro) branch out with the human alpha 1 adrenergic receptors (A1A(A, B and D)_Hum), since they share high sequence identity (52–55%) in the TM regions with alpha 1 adrenergic receptors [54]. Q9VE32_Dro branches out from the human alpha 2 adrenergic receptors and may be activated by adrenaline as its ligand. The orphan striatum-specific G protein-coupled receptor (STRG or Q9GZN0_Hum), though grouped with the biogenic amine receptors, may represent a novel subtype of GPCR owing to the lack of conservation of key functional residues [55]. The orphan receptors Q9W3V5_Dro and Q8TDV5_Hum, Q96P66_Hum and Q8N6U8_Hum, Q9VHP6_Dro and Q9VBG4_Dro form their own branches, sharing a distant relationship with other receptors in this tree, and might represent potential novel subfamilies of biogenic amine GPCRs.

Class B (secretin) receptors

Class B receptors are represented by two clusters (25 and 26) consisting of classical hormone receptors and Drosophila methuselah (MTH)-like proteins (Figure 6). The ligands for the receptors of cluster 25 are structurally related polypeptide hormones of 27–141 amino-acid residues: pituitary adenylate cyclase-activating polypeptide (PACAP), secretin, calcitonin, corticotropin-releasing factor (CRF), urocortins, growth-hormone-releasing hormone (GHRH), vasoactive intestinal peptide (VIP), glucagon, glucagon-like peptides (GLP-1, GLP-2) and glucose-dependent insulinotropic polypeptide (GIP). The Drosophila orphan receptors Q9V716_Dro and Q9V6C7_Dro are closely related to the human corticotropin-releasing factor (CRF) receptor, which binds urocortins. Q9V6N4_Dro, Q9VYH9_Dro and Q9NEF7_Dro are related to the calcitonin (CALR_Hum) and calcitonin gene-related peptide type 1 (CGRR_Hum) receptors. Three small accessory proteins, called receptor activity-modifying proteins (RAMPs), interact with these calcitonin receptors and can generate six pharmacologically distinct receptors. If RAMP-enabled receptor diversity also exists for other receptors, it will further complicate the ligand-receptor interactions of GPCRs, assuming they still bind structurally similar ligands. The human orphan receptor Q8NHB4_Hum is very closely related to PTRR_Hum, the receptor that binds parathyroid hormone and parathyroid hormone-related protein (PTHrP). Methuselah receptors and their Drosophila paralogs are the sole members of cluster 26. The Drosophila mutant methuselah (MTH) was identified in a screen for single-gene mutations that extend average lifespan; the mutant also shows increased resistance to several forms of stress, including starvation, heat and oxidative damage [56]. There are no obvious homologues of these receptors in the human or C. elegans genomes. The Drosophila receptors Q8INM0_Dro, Q8IPD0_Dro and Q95NU7_Dro are closely related to previously identified MTH members and may be new paralogs of these receptors.

Cell adhesion receptors

A large number of GPCRs with long extracellular N-termini containing a GPCR proteolytic site (GPS) domain are represented in cluster 27 (Figure 7). Several of these receptors also have one or more functional domains, such as epidermal growth factor (EGF), leucine-rich repeat (LRR), hormone-binding (HBD) and immunoglobulin (Ig) domains [16]. These form several distantly related branches. Except for CD97_Hum, all the receptors in this cluster are orphans with no known ligands [57]. Only four Drosophila sequences represent these receptors.

Class C (glutamate) receptors

Receptors of class C are divided mainly into four clusters (28–31): metabotropic glutamate receptors (MGR), γ-aminobutyric acid (GABA) receptors, calcium-sensing receptors (CASR) and retinoic acid-inducible G-protein-coupled receptors (RAIG) (Figure 8).

Cluster 28 consists of human and Drosophila MGRs. The MGRs are sub-grouped into three branches: the first contains MGR1_Hum and MGR5_Hum, and the second contains MGR2_Hum and MGR3_Hum. The third branch, which includes MGR4_Hum, MGR6–8_Hum and the Drosophila MGRs, represents a separate subgroup [58]. The Drosophila orphan receptor Q9V4U4_Dro is closely related to MGR_Dro and might be activated by glutamate.

The calcium-sensing receptor (CASR_Hum) forms cluster 29 along with a set of orphan receptors (Q8NHZ9_Hum, Q8NGV9_Hum, Q8NGW9_Hum and Q8NGZ7_Hum). These orphan receptors may have ligands or functions similar to those of CASR_Hum, or they may act as pheromone/olfactory receptors. A phylogenetic tree of most members of family 3 GPCRs (including olfactory, putative pheromone, and sweet and amino acid taste receptors) across different genomes (catfish (Ictalurus punctatus), Caenorhabditis elegans, Drosophila melanogaster, Japanese pufferfish (Fugu rubripes), goldfish (Carassius auratus), human (Homo sapiens sapiens), mouse (Mus musculus), rat (Rattus norvegicus) and salmon (Oncorhynchus masou)) has shown that CASR_Hum forms a separate branch within the pheromone/olfactory cluster of class C GPCRs [59]. Note that olfactory and gustatory/taste receptors are not considered in this work.

Cluster 30 consists of retinoic acid-inducible G-protein-coupled receptors (RAIG). RAIGs have short (30–50 amino acids) extracellular amino-terminal domains (ATDs), unlike the other receptors currently assigned to class C. BOSS_Dro also has a short ATD, branches out very early with the RAIG members and may represent a new single-member subfamily of class C receptors.

The GABAB receptors are present in cluster 31. The cluster is represented by four sub-branches, three of which are GABABR1-3_Hum type receptors, while the fourth comprises Drosophila orphan receptors (Q9VKA4 and Q9VR40) related to the GABA receptors. GABAB3 is present exclusively in Drosophila as a separate branch, and its function is not yet known. Previous work could functionally characterize D-GABABR1 and R2 only when the two subtypes were co-expressed in either Xenopus laevis oocytes or mammalian cell lines, whereas D-GABABR3 was inactive in any combination. This suggests that D-GABABR3 requires a partner other than D-GABABR1 and R2 to form a functional heterodimer [60]. The current clustering approach therefore suggests that Q9VKA4_Dro or Q9VR40_Dro may interact with D-GABABR3 to form a functional heterodimer.

Frizzled/smoothened receptors

Cluster 32 comprises receptors with a long (about 200 amino acids) N-terminus and conserved cysteine-rich domains (CRD), which are likely to participate in Wnt ligand binding (Figure 9). These receptors control the specification of cell fate, cell adhesion, migration, polarity and proliferation [61]. The cluster is represented by ten human (FZD1-10) and four Drosophila (FRZ1-4) frizzled receptors together with the smoothened receptors (SMO_Hum and SMO_Dro). The topology of the phylogenetic tree shows one smoothened and four frizzled branches. FRZ1_Dro is closely related to FZD3_Hum and FZD6_Hum. FRZ2_Dro is related to FZD5_Hum and FZD8_Hum, whereas FRZ3_Dro and FRZ4_Dro form separate branches distantly related to the other receptors.

Unassociated GPCRs

Thirty-one GPCR sequences could not be included in any cluster with appreciable bootstrap values or BLASTP similarity. These can be viewed as single-member clusters whose sequences contain atypical parts, which could result either from a chimeric origin of the receptors or from evolutionary pressure not shared by their closest phylogenetic neighbors [62]. We have therefore placed these receptors separately as unassociated GPCRs, although they clearly do not belong to a single group (see Additional data file 1). Most of the unassociated receptors remain orphans.


The phylogenetic analyses of human and Drosophila GPCRs suggest that the sequences can be divided into 32 clusters and reveal an unexpected level of similarity between human and Drosophila GPCRs. Twenty-one clusters group Drosophila and human GPCRs together, suggesting high evolutionary conservation of GPCR sequences across species. Ten clusters (four of nucleotide-lipid receptors, three of peptide receptors, two of chemokine receptors and one of glutamate receptors) contain no Drosophila GPCRs in our current dataset. Perhaps immune-related receptors, such as the chemokine receptors, are either not yet recognized or not present in lower organisms such as Drosophila. If such classes of receptors are truly absent, this might also suggest that immune defense in Drosophila is regulated by proteins other than GPCRs. Interestingly, there is one cluster of Drosophila secretin-class receptors with no human representation. These proteins are involved in aging in Drosophila. Furthermore, of the 21 clusters that co-cluster human and Drosophila GPCRs, the Drosophila GPCRs form isolated sub-clusters in 12, leaving only nine clusters in which the two sets of sequences intermix freely. These nine comprise three clusters each of peptide and biogenic amine receptors and one cluster each of class B, class C and frizzled receptors.

The current clustering analysis associates 52 Drosophila (Table 2) and 95 human orphan receptors with probable ligand classes, using co-clustering principles observed earlier within human GPCR sequences alone [12]. Further, similar cellular localizations are suggested for the Drosophila orphan receptors that belong to the opsin family (cluster 14). GPCRs with similar extracellular domain architecture also co-cluster, suggesting that this similarity is encoded even within the GPCR domain. The analysis also suggests a dimerizing partner (Q9VKA4_Dro or Q9VR40_Dro) for D-GABABR3 that might form a functional heterodimer with it. We have determined the relationship of the receptors within subgroups of the large GPCR superfamily by means of a cross-genome phylogenetic clustering approach. These studies also revealed a higher-level phylogenetic organization in which clusters with common ligand structure or chemistry, or a shared function, are evident across genomes. We hope that this approach proves valuable for identifying the natural ligands of Drosophila and human orphan receptors.

Table 2 List of Drosophila orphan receptors


Sequence data mining

Human (537) and Drosophila (284) GPCR amino acid sequences were downloaded from GPCRDB (7.0) [18]. Entries containing the keywords 'olfactory receptors (OR)', 'gustatory receptors (GR)' or 'taste receptors' were identified by text parsing and removed, because these sequences are extremely diverse and their inclusion degrades alignment quality. Further, to avoid polymorphisms, splice variants, pseudogenes and duplicates of these receptors, sequences sharing more than 90% sequence identity were removed from the data set using CD-HIT [63]. The resulting set comprised 371 human and 113 Drosophila sequences (Additional data file 1). GPCRs without published ligands in NCBI PubMed were considered orphan receptors. The suffixes _Hum and _Dro were added to the sequence names to denote human and Drosophila sequences, respectively.
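The keyword filtering and renaming step can be illustrated with a minimal Python sketch. It is not the authors' script: the function names, the case-insensitive substring matching and the toy records are assumptions made for illustration, and the CD-HIT redundancy step is not reproduced here.

```python
from typing import Iterable, Iterator, List, Tuple

# Keyword stems taken from the text; case-insensitive matching is an assumption.
EXCLUDED_KEYWORDS = ("olfactory", "gustatory", "taste")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_and_tag(records: Iterable[Tuple[str, str]], suffix: str) -> List[Tuple[str, str]]:
    """Drop records whose header mentions an excluded keyword and
    append a species suffix (e.g. "_Hum") to the sequence identifier."""
    kept = []
    for header, seq in records:
        if any(word in header.lower() for word in EXCLUDED_KEYWORDS):
            continue
        parts = header.split()
        seq_id = parts[0] if parts else "unnamed"
        kept.append((seq_id + suffix, seq))
    return kept


if __name__ == "__main__":
    # Hypothetical toy records, not real GPCRDB entries.
    toy = [">REC1 olfactory receptor 1", "MKTAAA",
           ">REC2 dopamine receptor", "MGS"]
    print(filter_and_tag(parse_fasta(toy), "_Hum"))
```

The filtered records would then be passed to a redundancy-removal tool at the 90% identity level described above.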

Transmembrane helix predictions

Transmembrane domains were identified using the HMMTOP program [64]. The amino termini upstream of TMH-1 and the carboxyl termini downstream of TMH-7 were removed because these regions show extreme variability. Only the sequence spanning TMH-1 to TMH-7 was considered for the analysis (Figure 2).
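The trimming step can be sketched as follows, assuming the helix boundaries are already available as 1-based, inclusive (start, end) residue positions. The function name and this coordinate format are illustrative assumptions and do not reflect the actual HMMTOP output format.

```python
from typing import List, Tuple


def trim_to_tm_core(sequence: str, helices: List[Tuple[int, int]]) -> str:
    """Return the segment from the start of TMH-1 to the end of TMH-7.

    `helices` holds 1-based, inclusive (start, end) positions of the
    seven predicted transmembrane helices (format assumed here).
    """
    if len(helices) != 7:
        raise ValueError(f"expected 7 transmembrane helices, got {len(helices)}")
    ordered = sorted(helices)
    start = ordered[0][0]   # first residue of TMH-1
    end = ordered[-1][1]    # last residue of TMH-7
    if start < 1 or end > len(sequence):
        raise ValueError("helix coordinates fall outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Hypothetical toy sequence: 5-residue N-terminus, 70-residue core, 5-residue C-terminus.
    seq = "N" * 5 + "T" * 70 + "C" * 5
    helices = [(6 + 10 * i, 15 + 10 * i) for i in range(7)]
    print(len(trim_to_tm_core(seq, helices)))
```

Sequences for which fewer or more than seven helices are predicted raise an error in this sketch; how such cases were handled in the original analysis is not stated.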

Multiple sequence alignments

ClustalX 1.83 [65] was used for multiple sequence alignment (MSA) of the receptors, with a gap opening penalty of 10, a gap extension penalty of 0.05, a delay-divergent-sequences cutoff of 35% and the BLOSUM series protein weight matrix. The slow-accurate method was used for the initial pairwise alignments, with the BLOSUM 30 protein weight matrix. When necessary, alignments were optimized by manual editing (Figure 2).

Phylogenetic analysis

An overall phylogenetic tree was inferred from the multiple sequence alignment using the PHYLIP package (v3.5) [66].

Sequence bootstrapping

The multiple sequence alignment was bootstrapped 100 times using SEQBOOT to obtain 100 different alignments. Owing to limitations of the CONSENSE program of the PHYLIP package in handling large datasets, we restricted the analysis to 100 bootstrap replicates [16].
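Bootstrapping an alignment means resampling its columns with replacement until a pseudo-alignment of the original length is obtained. The following minimal Python sketch illustrates that idea only; it is not SEQBOOT itself, and the function names, the fixed seed and the toy alignment are assumptions.

```python
import random
from typing import Dict, List


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping the alignment length."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # The same column indices are applied to every sequence so that
    # each resampled column is a complete column of the original alignment.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment: Dict[str, str],
                         n_replicates: int = 100,
                         seed: int = 1) -> List[Dict[str, str]]:
    """Generate `n_replicates` pseudo-alignments from one alignment."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Hypothetical toy alignment with placeholder names.
    toy = {"seqA": "MKT-LV", "seqB": "MRTALV", "seqC": "MKSALI"}
    replicates = bootstrap_replicates(toy, n_replicates=100)
    print(len(replicates), replicates[0])
```

Each replicate would then be used to compute one distance matrix and one tree, as described in the next subsection.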

Neighbor-joining tree

Protein distances were calculated using PROTDIST from the PHYLIP package. Trees were calculated with the Neighbor-Joining (NJ) method [67, 68] on the 100 distance matrices using NEIGHBOR from the PHYLIP 3.5 package, resulting in 100 trees. These were analyzed using CONSENSE from the PHYLIP package to derive a bootstrapped consensus tree. An unrooted tree was plotted using TREEVIEW [69]. Sequences grouped with more than 50% bootstrap support were confirmed as clusters.
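The 50% support criterion can be illustrated by counting how often a grouping recurs across replicate trees. In the sketch below each tree is simplified to a set of clades (sets of leaf names); this representation, the function names and the toy data are assumptions, and the sketch does not reproduce CONSENSE, which works on the splits of the replicate trees.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set

Clade = FrozenSet[str]


def clade_support(trees: Iterable[Set[Clade]]) -> Dict[Clade, float]:
    """Return the percentage of replicate trees that contain each clade."""
    tree_list = [set(tree) for tree in trees]
    if not tree_list:
        raise ValueError("at least one tree is required")
    counts = Counter(clade for tree in tree_list for clade in tree)
    return {clade: 100.0 * n / len(tree_list) for clade, n in counts.items()}


def supported_clades(trees: Iterable[Set[Clade]], threshold: float = 50.0) -> Dict[Clade, float]:
    """Keep only clades whose support is strictly above `threshold` percent."""
    return {c: s for c, s in clade_support(trees).items() if s > threshold}


if __name__ == "__main__":
    # Hypothetical replicate trees over four placeholder leaves.
    ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
    replicates = [{ab, cd}, {ab, cd}, {ab}, {ac}]
    print(supported_clades(replicates))
```

In this toy case the grouping of A with B occurs in three of four replicates and is retained, whereas a grouping found in exactly half of the replicates is not.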

Maximum likelihood trees

MSAs for each of the groups were obtained as described above and were used to build maximum likelihood trees [70] using TREE-PUZZLE 5.1 [71]. The method is least affected by sampling errors and is robust to many violations of the assumptions of the evolutionary model [72]. Parameters were estimated by quartet sampling and an NJ tree. The Jones-Taylor-Thornton (JTT) substitution model was used, with amino acid usage estimated from the data, site-to-site rate variation modeled by a gamma distribution with eight rate categories plus invariant sites, and the gamma distribution parameters estimated from the data. Ten thousand quartet puzzling steps were performed to obtain support values for each internal branch, and the trees with the highest likelihood were inferred. This method outperforms others such as neighbor joining or parsimony, but it is computationally intensive, extremely slow and cannot be applied to very large datasets. The Drosophila 5HTA receptor (5HTA_Dro) of family A was used as the out-group for secretin, glutamate, cell adhesion and frizzled receptors. The human receptor O75205_Hum (GPRC5B) of family B was used as the out-group for peptide, chemokine, nucleotide and lipid, and biogenic amine receptors. Trees were drawn using TREEVIEW [69] (out-groups are not shown in the figures).

BLAST searches

For sequences with lower support values, similarity measures from all-against-all BLASTP [73] searches were used to associate them with the clusters identified by the PHYLIP and maximum likelihood methods. Manual inspection of the alignments, bit score, E-value and length of the pairwise alignments were considered as measures of similarity. Such receptors may be distantly related to members of the groups but may share high structural similarity and a common functional role, possibly due to convergent evolution [74]. It is also possible that these sequences are so diverse that the clustering methods were not sensitive enough to measure these changes [17].
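A simplified version of this association step is sketched below: each unclustered sequence is assigned to the cluster of its highest-scoring clustered hit. The hit tuple layout, the function name, the identifiers and in particular the E-value cutoff are assumptions for illustration; the study reports no fixed cutoff and relied on manual inspection as well.

```python
from typing import Dict, Iterable, Tuple

# (query, subject, e_value, bit_score); this layout is assumed for illustration.
Hit = Tuple[str, str, float, float]


def assign_by_best_hit(hits: Iterable[Hit],
                       cluster_of: Dict[str, int],
                       max_evalue: float = 1e-10) -> Dict[str, int]:
    """Map each unclustered query to the cluster of its best-scoring clustered hit.

    `max_evalue` is an arbitrary illustrative cutoff, not a value from the study.
    """
    best: Dict[str, Tuple[float, int]] = {}
    for query, subject, evalue, bits in hits:
        if query == subject or query in cluster_of:
            continue  # skip self-hits and queries that already have a cluster
        cluster = cluster_of.get(subject)
        if cluster is None or evalue > max_evalue:
            continue  # subject is itself unclustered, or the hit is too weak
        if query not in best or bits > best[query][0]:
            best[query] = (bits, cluster)
    return {query: cluster for query, (_, cluster) in best.items()}


if __name__ == "__main__":
    # Hypothetical identifiers and scores.
    clusters = {"recA_Hum": 23, "recB_Dro": 24}
    hits = [("orph1_Dro", "orph1_Dro", 0.0, 500.0),
            ("orph1_Dro", "recA_Hum", 1e-40, 160.0),
            ("orph1_Dro", "recB_Dro", 1e-55, 210.0),
            ("orph2_Hum", "recA_Hum", 0.5, 30.0)]
    print(assign_by_best_hit(hits, clusters))
```

A sequence with no hit passing the cutoff stays unassigned, which corresponds to the unassociated GPCRs described in the results.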


  1. Christopoulos A: Allosteric binding sites on cell-surface receptors: novel targets for drug discovery. Nat Rev Drug Discov. 2002, 1: 198-210. 10.1038/nrd746.

  2. Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, Yamamoto M, Miyano M: Crystal structure of rhodopsin: A G protein-coupled receptor. Science. 2000, 289: 739-745. 10.1126/science.289.5480.739.

  3. Libert F, Schiffmann SN, Lefort A, Parmentier M, Gerard C, Dumont JE, Vanderhaeghen JJ, Vassart G: The orphan receptor cDNA RDC7 encodes an A1 adenosine receptor. Embo J. 1991, 10: 1677-1682.

  4. Pyne S, Pyne NJ: Sphingosine 1-phosphate signalling in mammalian cells. Biochem J. 2000, 349: 385-402. 10.1042/0264-6021:3490385.

  5. Marchese A, George SR, Kolakowski LFJ, Lynch KR, O'Dowd BF: Novel GPCRs and their endogenous ligands: expanding the boundaries of physiology and pharmacology. Trends Pharmacol Sci. 1999, 20: 370-375. 10.1016/S0165-6147(99)01366-8.

  6. Civelli O, Nothacker HP, Reinscheid R: Reverse physiology: discovery of the novel neuropeptide, orphanin FQ/nociceptin. Crit Rev Neurobiol. 1998, 12: 163-176.

  7. Gaulton A, Attwood TK: Bioinformatics approaches for the classification of G-protein-coupled receptors. Curr Opin Pharmacol. 2003, 3: 114-120. 10.1016/S1471-4892(03)00005-5.

  8. Narumiya S, Sugimoto Y, Ushikubi F: Prostanoid receptors: structures, properties, and functions. Physiol Rev. 1999, 79: 1193-1226.

  9. An S, Bleu T, Hallmark OG, Goetzl EJ: Characterization of a novel subtype of human G protein-coupled receptor for lysophosphatidic acid. J Biol Chem. 1998, 273: 7906-7910. 10.1074/jbc.273.14.7906.

  10. Im DS, Heise CE, Ancellin N, O'Dowd BF, Shei GJ, Heavens RP, Rigby MR, Hla T, Mandala S, McAllister G, George SR, Lynch KR: Characterization of a novel sphingosine 1-phosphate receptor, Edg-8. J Biol Chem. 2000, 275: 14281-14286. 10.1074/jbc.275.19.14281.

  11. Szekeres PG, Muir AI, Spinage LD, Miller JE, Butler SI, Smith A, Rennie GI, Murdock PR, Fitzgerald LR, Wu H, McMillan LJ, Guerrera S, Vawter L, Elshourbagy NA, Mooney JL, Bergsma DJ, Wilson S, Chambers JK: Neuromedin U is a potent agonist at the orphan G protein-coupled receptor FM3. J Biol Chem. 2000, 275: 20247-20250. 10.1074/jbc.C000244200.

  12. Joost P, Methner A: Phylogenetic analysis of 277 human G-protein-coupled receptors as a tool for the prediction of orphan receptor ligands. Genome Biol. 2002, 3: RESEARCH0063-10.1186/gb-2002-3-11-research0063.

  13. Ignatov A, Lintzel J, Hermans-Borgmeyer I, Kreienkamp HJ, Joost P, Thomsen S, Methner A, Schaller HC: Role of the G-protein-coupled receptor GPR12 as high-affinity receptor for sphingosylphosphorylcholine and its expression and function in brain development. J Neurosci. 2003, 23: 907-914.

  14. Metpally RPR, Sowdhamini R: Genome wide survey of G protein-coupled receptors in Tetraodon nigroviridis. BMC Evol Biol. 2005, 5: 41-10.1186/1471-2148-5-41.

  15. Attwood TK, Findlay JB: Fingerprinting G-protein-coupled receptors. Protein Eng. 1994, 7: 195-203.

  16. Fredriksson R, Lagerstrom MC, Lundin LG, Schioth HB: The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol Pharmacol. 2003, 63: 1256-1272. 10.1124/mol.63.6.1256.

  17. Josefsson LG: Evidence for kinship between diverse G-protein coupled receptors. Gene. 1999, 239: 333-340. 10.1016/S0378-1119(99)00392-3.

  18. Horn F, Weare J, Beukers MW, Horsch S, Bairoch A, Chen W, Edvardsen O, Campagne F, Vriend G: GPCRDB: an information system for G protein-coupled receptors. Nucleic Acids Res. 1998, 26: 275-279. 10.1093/nar/26.1.275.

  19. Banfi S, Borsani G, Rossi E, Bernard L, Guffanti A, Rubboli F, Marchitiello A, Giglio S, Coluccia E, Zollo M, Zuffardi O, Ballabio A: Identification and mapping of human cDNAs homologous to Drosophila mutant genes through EST database searching. Nat Genet. 1996, 13: 167-174. 10.1038/ng0696-167.

  20. Fortini ME, Skupski MP, Boguski MS, Hariharan IK: A survey of human disease gene counterparts in the Drosophila genome. J Cell Biol. 2000, 150: F23-30. 10.1083/jcb.150.2.F23.

  21. Rubin GM, Yandell MD, Wortman JR, Gabor Miklos GL, Nelson CR, Hariharan IK, Fortini ME, Li PW, Apweiler R, Fleischmann W, Cherry JM, Henikoff S, Skupski MP, Misra S, Ashburner M, Birney E, Boguski MS, Brody T, Brokstein P, Celniker SE, Chervitz SA, Coates D, Cravchik A, Gabrielian A, Galle RF, Gelbart WM, George RA, Goldstein LS, Gong F, Guan P, Harris NL, Hay BA, Hoskins RA, Li J, Li Z, Hynes RO, Jones SJ, Kuehl PM, Lemaitre B, Littleton JT, Morrison DK, Mungall C, O'Farrell PH, Pickeral OK, Shue C, Vosshall LB, Zhang J, Zhao Q, Zheng XH, Lewis S: Comparative genomics of the eukaryotes. Science. 2000, 287: 2204-2215. 10.1126/science.287.5461.2204.

  22. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, George RA, Lewis SE, Richards S, Ashburner M, Henderson SN, Sutton GG, Wortman JR, Yandell MD, Zhang Q, Chen LX, Brandon RC, Rogers YH, Blazej RG, Champe M, Pfeiffer BD, Wan KH, Doyle C, Baxter EG, Helt G, Nelson CR, Gabor GL, Abril JF, Agbayani A, An HJ, Andrews-Pfannkoch C, Baldwin D, Ballew RM, Basu A, Baxendale J, Bayraktaroglu L, Beasley EM, Beeson KY, Benos PV, Berman BP, Bhandari D, Bolshakov S, Borkova D, Botchan MR, Bouck J, Brokstein P, Brottier P, Burtis KC, Busam DA, Butler H, Cadieu E, Center A, Chandra I, Cherry JM, Cawley S, Dahlke C, Davenport LB, Davies P, de Pablos B, Delcher A, Deng Z, Mays AD, Dew I, Dietz SM, Dodson K, Doup LE, Downes M, Dugan-Rocha S, Dunkov BC, Dunn P, Durbin KJ, Evangelista CC, Ferraz C, Ferriera S, Fleischmann W, Fosler C, Gabrielian AE, Garg NS, Gelbart WM, Glasser K, Glodek A, Gong F, Gorrell JH, Gu Z, Guan P, Harris M, Harris NL, Harvey D, Heiman TJ, Hernandez JR, Houck J, Hostin D, Houston KA, Howland TJ, Wei MH, Ibegwam C, Jalali M, Kalush F, Karpen GH, Ke Z, Kennison JA, Ketchum KA, Kimmel BE, Kodira CD, Kraft C, Kravitz S, Kulp D, Lai Z, Lasko P, Lei Y, Levitsky AA, Li J, Li Z, Liang Y, Lin X, Liu X, Mattei B, McIntosh TC, McLeod MP, McPherson D, Merkulov G, Milshina NV, Mobarry C, Morris J, Moshrefi A, Mount SM, Moy M, Murphy B, Murphy L, Muzny DM, Nelson DL, Nelson DR, Nelson KA, Nixon K, Nusskern DR, Pacleb JM, Palazzolo M, Pittman GS, Pan S, Pollard J, Puri V, Reese MG, Reinert K, Remington K, Saunders RD, Scheeler F, Shen H, Shue BC, Siden-Kiamos I, Simpson M, Skupski MP, Smith T, Spier E, Spradling AC, Stapleton M, Strong R, Sun E, Svirskas R, Tector C, Turner R, Venter E, Wang AH, Wang X, Wang ZY, Wassarman DA, Weinstock GM, Weissenbach J, Williams SM, WoodageT, Worley KC, Wu D, Yang S, Yao QA, Ye J, Yeh RF, Zaveri JS, Zhan M, Zhang G, Zhao Q, Zheng L, Zheng XH, Zhong FN, Zhong W, Zhou X, Zhu S, 
Zhu X, Smith HO, Gibbs RA, Myers EW, Rubin GM, Venter JC: The genome sequence of Drosophila melanogaster. Science. 2000, 287: 2185-2195. 10.1126/science.287.5461.2185.

  23. Burdett H, van den Heuvel M: Fruits and flies: a genomics perspective of an invertebrate model organism. Brief Funct Genomic Proteomic. 2004, 3: 257-266.

  24. Davenport AP: Peptide and trace amine orphan receptors: prospects for new therapeutic targets. Curr Opin Pharmacol. 2003, 3: 127-134. 10.1016/S1471-4892(03)00003-1.

  25. Reubi JC: Peptide receptors as molecular targets for cancer diagnosis and therapy. Endocr Rev. 2003, 24: 389-427. 10.1210/er.2002-0007.

  26. Birgul N, Weise C, Kreienkamp HJ, Richter D: Reverse physiology in Drosophila: identification of a novel allatostatin-like neuropeptide and its cognate receptor structurally related to the mammalian somatostatin/galanin/opioid receptor family. Embo J. 1999, 18: 5892-5900. 10.1093/emboj/18.21.5892.

  27. Matsumoto M, Kamohara M, Sugimoto T, Hidaka K, Takasaki J, Saito T, Okada M, Yamaguchi T, Furuichi K: The novel G-protein coupled receptor SALPR shares sequence similarity with somatostatin and angiotensin receptors. Gene. 2000, 248: 183-189. 10.1016/S0378-1119(00)00123-2.

  28. Tanaka H, Yoshida T, Miyamoto N, Motoike T, Kurosu H, Shibata K, Yamanaka A, Williams SC, Richardson JA, Tsujino N, Garry MG, Lerner MR, King DS, O'Dowd BF, Sakurai T, Yanagisawa M: Characterization of a family of endogenous neuropeptide ligands for the G protein-coupled receptors GPR7 and GPR8. Proc Natl Acad Sci U S A. 2003, 100: 6251-6256. 10.1073/pnas.0837789100.

  29. Akeson M, Sainz E, Mantey SA, Jensen RT, Battey JF: Identification of four amino acids in the gastrin-releasing peptide receptor that are required for high affinity agonist binding. J Biol Chem. 1997, 272: 17405-17409. 10.1074/jbc.272.28.17405.

  30. Sainz E, Akeson M, Mantey SA, Jensen RT, Battey JF: Four amino acid residues are critical for high affinity binding of neuromedin B to the neuromedin B receptor. J Biol Chem. 1998, 273: 15927-15932. 10.1074/jbc.273.26.15927.

  31. Lin Y, Jian X, Lin Z, Kroog GS, Mantey S, Jensen RT, Battey J, Northup J: Two amino acids in the sixth transmembrane segment of the mouse gastrin-releasing peptide receptor are important for receptor activation. J Pharmacol Exp Ther. 2000, 294: 1053-1062.

  32. Egerod K, Reynisson E, Hauser F, Cazzamali G, Williamson M, Grimmelikhuijzen CJ: Molecular cloning and functional expression of the first two specific insect myosuppressin receptors. Proc Natl Acad Sci U S A. 2003, 100: 9808-9813. 10.1073/pnas.1632197100.

  33. Johnson EC, Garczynski SF, Park D, Crim JW, Nassel DR, Taghert PH: Identification and characterization of a G protein-coupled receptor for the neuropeptide proctolin in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2003, 100: 6198-6203. 10.1073/pnas.1030108100.

  34. Hsu SY, Kudo M, Chen T, Nakabayashi K, Bhalla A, van der Spek PJ, van Duin M, Hsueh AJ: The three subfamilies of leucine-rich repeat-containing G protein-coupled receptors (LGR): identification of LGR6 and LGR7 and the signaling mechanism for LGR7. Mol Endocrinol. 2000, 14: 1257-1271. 10.1210/me.14.8.1257.

  35. Jiang Y, Luo L, Gustafson EL, Yadav D, Laverty M, Murgolo N, Vassileva G, Zeng M, Laz TM, Behan J, Qiu P, Wang L, Wang S, Bayne M, Greene J, Monsma FJ, Zhang FL: Identification and characterization of a novel RF-amide peptide ligand for orphan G-protein-coupled receptor SP9155. J Biol Chem. 2003, 278: 27652-27657. 10.1074/jbc.M302945200.

  36. Sakurai T, Amemiya A, Ishii M, Matsuzaki I, Chemelli RM, Tanaka H, Williams SC, Richardson JA, Kozlowski GP, Wilson S, Arch JR, Buckingham RE, Haynes AC, Carr SA, Annan RS, McNulty DE, Liu WS, Terrett JA, Elshourbagy NA, Bergsma DJ, Yanagisawa M: Orexins and orexin receptors: a family of hypothalamic neuropeptides and G protein-coupled receptors that regulate feeding behavior. Cell. 1998, 92: 573-585. 10.1016/S0092-8674(00)80949-6.

  37. Bae YS, Park EY, Kim Y, He R, Ye RD, Kwak JY, Suh PG, Ryu SH: Novel chemoattractant peptides for human leukocytes. Biochem Pharmacol. 2003, 66: 1841-1851. 10.1016/S0006-2952(03)00552-5.

  38. Santos RA, Simoes e Silva AC, Maric C, Silva DM, Machado RP, de Buhr I, Heringer-Walther S, Pinheiro SV, Lopes MT, Bader M, Mendes EP, Lemos VS, Campagnole-Santos MJ, Schultheiss HP, Speth R, Walther T: Angiotensin-(1-7) is an endogenous ligand for the G protein-coupled receptor Mas. Proc Natl Acad Sci U S A. 2003, 100: 8258-8263. 10.1073/pnas.1432869100.

  39. Dong X, Han S, Zylka MJ, Simon MI, Anderson DJ: A diverse family of GPCRs expressed in specific subsets of nociceptive sensory neurons. Cell. 2001, 106: 619-632. 10.1016/S0092-8674(01)00483-4.

  40. Lin DC, Bullock CM, Ehlert FJ, Chen JL, Tian H, Zhou QY: Identification and molecular characterization of two closely related G protein-coupled receptors activated by prokineticins/endocrine gland vascular endothelial growth factor. J Biol Chem. 2002, 277: 19276-19280. 10.1074/jbc.M202139200.

  41. Langmead CJ, Szekeres PG, Chambers JK, Ratcliffe SJ, Jones DN, Hirst WD, Price GW, Herdon HJ: Characterization of the binding of [(125)I]-human prolactin releasing peptide (PrRP) to GPR10, a novel G protein coupled receptor. Br J Pharmacol. 2000, 131: 683-688. 10.1038/sj.bjp.0703617.

  42. Zlotnik A, Yoshie O: Chemokines: a new classification system and their role in immunity. Immunity. 2000, 12: 121-127. 10.1016/S1074-7613(00)80165-X.

  43. Park Y, Kim YJ, Adams ME: Identification of G protein-coupled receptors for Drosophila PRXamide peptides, CCAP, corazonin, and AKH supports a theory of ligand-receptor coevolution. Proc Natl Acad Sci U S A. 2002, 99: 11423-11428. 10.1073/pnas.162276199.

  44. O'Tousa JE, Baehr W, Martin RL, Hirsh J, Pak WL, Applebury ML: The Drosophila ninaE gene encodes an opsin. Cell. 1985, 40: 839-850. 10.1016/0092-8674(85)90343-5.

  45. Tunaru S, Kero J, Schaub A, Wufka C, Blaukat A, Pfeffer K, Offermanns S: PUMA-G and HM74 are receptors for nicotinic acid and mediate its anti-lipolytic effect. Nat Med. 2003, 9: 352-355. 10.1038/nm824.

  46. Montero C, Campillo NE, Goya P, Paez JA: Homology models of the cannabinoid CB1 and CB2 receptors. A docking analysis study. Eur J Med Chem. 2005, 40: 75-83. 10.1016/j.ejmech.2004.10.002.

  47. Xu Y, Zhu K, Hong G, Wu W, Baudhuin LM, Xiao Y, Damron DS: Sphingosylphosphorylcholine is a ligand for ovarian cancer G-protein-coupled receptor 1. Nat Cell Biol. 2000, 2: 261-267. 10.1038/35010529.

  48. Ludwig MG, Vanek M, Guerini D, Gasser JA, Jones CE, Junker U, Hofstetter H, Wolf RM, Seuwen K: Proton-sensing G-protein-coupled receptors. Nature. 2003, 425: 93-98. 10.1038/nature01905.

  49. Li Q, Schachter JB, Harden TK, Nicholas RA: The 6H1 orphan receptor, claimed to be the p2y5 receptor, does not mediate nucleotide-promoted second messenger responses. Biochem Biophys Res Commun. 1997, 236: 455-460. 10.1006/bbrc.1997.6984.

  50. Branchek TA, Blackburn TP: Trace amine receptors as targets for novel therapeutics: legend, myth and fact. Curr Opin Pharmacol. 2003, 3: 90-97. 10.1016/S1471-4892(02)00028-0.

  51. Barrett P, Conway S, Morgan PJ: Digging deep--structure-function relationships in the melatonin receptor family. J Pineal Res. 2003, 35: 221-230. 10.1034/j.1600-079X.2003.00090.x.

  52. Matsumoto M, Saito T, Takasaki J, Kamohara M, Sugimoto T, Kobayashi M, Tadokoro M, Matsumoto S, Ohishi T, Furuichi K: An evolutionarily conserved G-protein coupled receptor family, SREB, expressed in the central nervous system. Biochem Biophys Res Commun. 2000, 272: 576-582. 10.1006/bbrc.2000.2829.

  53. Feng G, Hannan F, Reale V, Hon YY, Kousky CT, Evans PD, Hall LM: Cloning and functional characterization of a novel dopamine receptor from Drosophila melanogaster. J Neurosci. 1996, 16: 3925-3933.

  54. Han KA, Millar NS, Davis RL: A novel octopamine receptor with preferential expression in Drosophila mushroom bodies. J Neurosci. 1998, 18: 3650-3658.

  55. Mizushima K, Miyamoto Y, Tsukahara F, Hirai M, Sakaki Y, Ito T: A novel G-protein-coupled receptor gene expressed in striatum. Genomics. 2000, 69: 314-321. 10.1006/geno.2000.6340.

  56. Lin YJ, Seroude L, Benzer S: Extended life-span and stress resistance in the Drosophila mutant methuselah. Science. 1998, 282: 943-946. 10.1126/science.282.5390.943.

  57. Foord SM, Jupe S, Holbrook J: Bioinformatics and type II G-protein-coupled receptors. Biochem Soc Trans. 2002, 30: 473-479. 10.1042/BST0300473.

  58. Parmentier ML, Galvez T, Acher F, Peyre B, Pellicciari R, Grau Y, Bockaert J, Pin JP: Conservation of the ligand recognition site of metabotropic glutamate receptors during evolution. Neuropharmacology. 2000, 39: 1119-1131. 10.1016/S0028-3908(99)00204-X.

  59. Pin JP, Galvez T, Prezeau L: Evolution, structure, and activation mechanism of family 3/C G-protein-coupled receptors. Pharmacol Ther. 2003, 98: 325-354. 10.1016/S0163-7258(03)00038-X.

  60. Mezler M, Muller T, Raming K: Cloning and functional expression of GABA(B) receptors from Drosophila. Eur J Neurosci. 2001, 13: 477-486. 10.1046/j.1460-9568.2001.01410.x.

  61. Wang HY, Malbon CC: Wnt signaling, Ca2+, and cyclic GMP: visualizing Frizzled functions. Science. 2003, 300: 1529-1530. 10.1126/science.1085259.

  62. Fredriksson R, Gloriam DE, Hoglund PJ, Lagerstrom MC, Schioth HB: There exist at least 30 human G-protein-coupled receptors with long Ser/Thr-rich N-termini. Biochem Biophys Res Commun. 2003, 301: 725-734. 10.1016/S0006-291X(03)00026-3.

  63. Li W, Jaroszewski L, Godzik A: Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics. 2001, 17: 282-283. 10.1093/bioinformatics/17.3.282.

  64. Tusnady GE, Simon I: The HMMTOP transmembrane topology prediction server. Bioinformatics. 2001, 17: 849-850. 10.1093/bioinformatics/17.9.849.

  65. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25: 4876-4882. 10.1093/nar/25.24.4876.

  66. Felsenstein J: PHYLIP, phylogenetic inference package. Department of Genetics, University of Washington, Seattle, WA. 2003.

  67. Saitou N, Nei M: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987, 4: 406-425.

  68. Kumar S, Gadagkar SR: Efficiency of the neighbor-joining method in reconstructing deep and shallow evolutionary relationships in large phylogenies. J Mol Evol. 2000, 51: 544-553.

  69. Page RD: TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci. 1996, 12: 357-358.

  70. Strimmer K, Haeseler AV: Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol Biol Evol. 1996, 13: 964-969.

  71. Schmidt HA, Strimmer K, Vingron M, von Haeseler A: TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 2002, 18: 502-504. 10.1093/bioinformatics/18.3.502.

  72. Felsenstein J: Phylogenies from molecular sequences: inference and reliability. Annu Rev Genet. 1988, 22: 521-565. 10.1146/

  73. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.

  74. Donnelly D, Findlay JB, Blundell TL: The evolution and structure of aminergic G protein-coupled receptors. Receptors Channels. 1994, 2: 61-78.

R.S. is a recipient of a Senior Research Fellowship awarded by the Wellcome Trust, UK. M.R.P. Rao is a recipient of a Senior Research Fellowship awarded by the Council of Scientific and Industrial Research (CSIR), India. We also thank NCBS-TIFR for infrastructural support.

Author information

Corresponding author

Correspondence to Ramanathan Sowdhamini.

Additional information

Authors' contributions

M.R.P. Rao carried out the work and wrote the first draft of the manuscript. R.S. initiated the idea and was involved in discussions and in drafting the final manuscript.

Electronic supplementary material


Additional data file 1: Table listing the cluster, accession numbers, Swiss-Prot codes, gene names and descriptions of the GPCR sequences used. (XLS 118 KB)

Additional data file 2: Key residues conserved among the members of cluster 17. (XLS 18 KB)

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Metpally, R.P.R., Sowdhamini, R. Cross genome phylogenetic analysis of human and Drosophila G protein-coupled receptors: application to functional annotation of orphan receptors. BMC Genomics 6, 106 (2005).
