Widespread ectopic expression of olfactory receptor genes
© Feldmesser et al. 2006
Received: 20 April 2006
Accepted: 22 May 2006
Published: 22 May 2006
Olfactory receptors (ORs) are the largest gene family in the human genome. Although they are expected to be expressed specifically in olfactory tissues, some ectopic expression has been reported, with special emphasis on sperm and testis. The present study systematically explores the expression patterns of OR genes in a large number of tissues and assesses the potential functional implication of such ectopic expression.
We analyzed the expression of hundreds of human and mouse OR transcripts, via EST and microarray data, in several dozens of human and mouse tissues. Different tissues had specific, relatively small OR gene subsets which had particularly high expression levels. In testis, average expression was not particularly high, and very few highly expressed genes were found, none corresponding to ORs previously implicated in sperm chemotaxis. Higher expression levels were more common for genes with a non-OR genomic neighbor. Importantly, no correlation in expression levels was detected for human-mouse orthologous pairs. Also, no significant difference in expression levels was seen between intact and pseudogenized ORs, except for the pseudogenes of subfamily 7E which has undergone a human-specific expansion.
The OR superfamily as a whole shows widespread, locus-dependent and heterogeneous expression, in agreement with a neutral or near-neutral evolutionary model for transcription control. These results cannot reject the possibility that small OR subsets play functional roles in different tissues; however, considerable care should be exercised when offering a functional interpretation for ectopic OR expression based only on transcription information.
Olfaction, the sense of smell, is mediated by a very large family of olfactory receptors (ORs), whose chemical interaction with a multitude of odorant ligands initiates a cascade of signal transduction events leading to smell perception [1–4]. OR proteins also participate in guiding olfactory sensory neuron axons to their glomerular targets, and have been proposed to be involved in the regulation of their own expression [6, 7].
The publication of the complete human genome sequence, as well as that of other mammals such as mouse, dog and chimpanzee, allowed the identification of entire OR repertoires in those species via computational data-mining. Currently, 853 human OR genes are known in the human genome , 1490 in the mouse genome [9, 10], 1493 in the rat genome , 971 in the dog genome [12, 13] and 1091 ORs in chimpanzee . These are represented in the Human Olfactory Receptor Exploratorium Database (HORDE) and in the Olfactory Receptor DataBase (ORDB) . In addition to this massive information that has been obtained by scrutinizing genome sequence repositories, several publications have provided information on transcription of OR genes in different tissues and species. In mouse, olfactory epithelial ESTs have been sequenced for more than 400 OR genes  and a custom mouse OR microarray was used to examine the expression levels of more than 800 genes in olfactory epithelium . In contrast, human OR expression has been investigated for only a small number of genes, as exemplified by the demonstration of transcripts for genes in an OR cluster on human chromosome 17 .
ORs are expected to be specifically expressed in the olfactory epithelium, where their expression is also highly regulated by mechanisms which allow each sensory neuron to express a single allele of a single OR gene [19–21]. OR genes are also expressed in the olfactory bulb, where they are specifically localized to the nerve and glomerular layers, potentially related to the targeting of the sensory axons [22, 23]. In the present paper we address the question to which degree OR genes are expressed in non-olfactory tissues. We use the term "ectopic" defined as "a biological event or process that occurs in an abnormal location or position within the body" . We note that the term has also been used to describe abnormal expression in malignant tumors, not studied here.
Most of the early reports about ectopic OR expression have been related to testis and germ cells, where several dozens of human and mouse ORs have been shown to be transcribed [25–32]. These results have led to the hypothesis that at least some ORs are involved in mammalian sperm chemotaxis. Evidence for the involvement of human hOR17-4 (OR1D2) [30, 33] and mouse MOR267-13 (orthologous to human OR10J5)  in sperm chemotaxis has been provided. Another hypothesis proposed that ORs linked to the major histocompatibility complex locus and expressed in testis are implicated in olfaction-driven mate choice .
Some human ORs were also shown to be expressed in tongue [36, 37], erythroid cells and prostate [39, 40]. Some murine ORs have been shown to be expressed ectopically in placenta, brain, peripheral nervous system, colon and fetal liver [41–45]. Because of their broad pattern of tissue expression during development and in adult life, ORs have been proposed to play a role in cell-cell recognition. Despite such reports regarding the potential functions of ectopically expressed ORs, this phenomenon is far from being fully understood and requires further systematic investigation.
Here we report the first systematic global analysis of spatial OR expression patterns in human and mouse. We aim to obtain transcriptional evidence for numerous ORs in a large number of tissues, so as to shed further light on the possible factors influencing OR ectopic expression. To this end, we have collected mouse and human transcriptome information from various sources, including public domain and Celera ESTs, as well as genome-wide microarray data. We found substantial expression in several dozen human and mouse tissues, but no indication of unusually high expression in testis. Our results support a neutral or near-neutral evolution model for OR transcription control, whereby functionality is rendered less likely. All the results are fully available through HORDE.
Our first aim was to obtain evidence of transcription for human and mouse OR genes. We examined expression data from a multitude of sources, including cDNAs and mRNAs information as well as microarray data . The source of microarray data is the GeneAtlas2 project, which provides whole genome expression patterns in dozens of human and mouse tissues .
A search through the GeneAtlas2 data identified (after filtration on potentially cross-reacting probesets) a total of 293 probesets matching 273 human ORs, and 397 probesets matching 371 mouse ORs. Probesets were then tested for being positively expressed applying the Cross-Gene Error Model on the full microarray data. We identified 206 human and 216 mouse OR genes as expressed in at least one tissue. This analysis included 61 human and 48 mouse tissues, among them mouse olfactory epithelium, mouse vomeronasal organ, and human and mouse olfactory bulb and testis. Every one of these tissues showed positive expression of at least one OR gene. Further information regarding expression of 221 human ORs and 587 mouse ORs was obtained from ESTs and mRNAs. This information was supported by 867 human transcripts in 78 tissues and 1965 mouse transcripts in 50 tissues.
We asked whether the distribution of OR expression intensities was different in functionally implicated tissues as compared to other tissues. Generally, the distributions were rather similar for the two classes (Figure 2C,D). The functionally implicated tissues (olfactory epithelium and bulb, as well as testis) did not display an unusual distribution, nor did they show an unusually high number of highly expressed genes. The number of highly expressed ORs in testis was smaller than in olfactory epithelium and bulb (Figure 2C,D, red colors). Interestingly, these highly expressed ORs are not biased toward a genomic location or family affiliation . In addition, highly expressed human and mouse ORs in the different tissues were not orthologous. More functionally related expression patterns were observed for a set of non-receptor genes with an established olfactory function (Figure 2E,F). Note that Olfactory Marker Protein (OMP) was expressed in mouse but not in human olfactory bulb. This might be the result of a poor probeset design since human OMP is not expressed in any tissue.
Across-tissue patterns of expression for individual OR genes were eclectic, differing widely from one gene to another (Figure 2A,B). A hierarchical clustering analysis was applied to 108 human and 141 mouse differentially expressed ORs, as determined using an ANOVA test applied to the GeneAtlas2 microarray data (see Additional file 3). The human OR gene tree ramified into two principal branches, whose major expression divergence was observed in the highly expressing tissues. In mouse, no significant clustering was observed. Principal component analysis also could not separate the ORs expression profiles into distinct groups (data not shown).
The product of the entropy and the expression intensity at each position was calculated and the mean of all products defined as the ectopic expression index, theoretically ranging from 0 (no ectopic expression) to 10 (all the examined genes are ubiquitously expressed and all the tissues are represented in every ranked position). The ectopic expression index was found to be 1.6 for human ORs and 1.3 for mouse ORs. Interestingly, a similar level of ectopic expression was observed in groups of brain specific genes (see Additional file 4) and spermatogenesis-related genes (see Additional file 5). In these gene groups, the entropy in most ectopic ranked tissue positions was above 0.7 and the ectopic expression index ranged from 1.1 and 1.5 for human and mouse brain specific genes respectively, to respectively 1.5 and 1.4 for spermatogenesis-related genes in human and mouse.
Recently, Zhang et al. published the results of a custom-made microarray, which includes all mouse OR genes, hybridized to olfactory epithelium and 6 additional tissues (vomeronasal organ, lung, heart, testis, muscle and cerebellum). An examination of the 397 OR genes represented in both our analyzed experiment and that of Zhang et al., using the same cutoff criteria, revealed a significant concordance. Thus, all 48 ORs that showed olfactory epithelial expression in our analysis were also highly expressed in the results of Zhang et al. (see Additional file 6, p = 0.0001, chi-square test = 21).
Ectopic transcription coverage for intact and pseudogene OR genes in human and mouse. Transcripts include ESTs and mRNAs. Shown are the number and percentage (in parenthesis) of OR genes for which evidence of transcription was found, relative to the entire OR repertoire. Lowest row indicates the average EST coverage per OR, calculated using the count of adult normal tissues ESTs from non-normalized libraries in each subgroup. -7E indicates pseudogenes excluding 7E subfamily members. Percentage of OR genes for which evidence of transcription was found based only on spliced ESTs was 9% for intact, 8% for pseudogenized human ORs, and 5% for intact, 4% for pseudogenized mouse ORs. The EST coverage based only on spliced ESTs dropped to about 0.04 for intact genes and pseudogenes excluding 7E members, while coverage was 0.17 for the 7E subfamily, significantly different from that of the other groups.
One OR subfamily, 7E, shows an unusually high average level of transcription, significantly different from the one of other intact and pseudogene ORs according to the Tukey-Kramer test (p < 0.05), (Table 1). Members of this subfamily were the only OR group for which the EST coverage differed significantly from a Poisson distribution (Figure 7B). In this skewed distribution, most 7E genes were under-expressed, while four genes, all belonging to one of the two 7E phylogenetic clades  were highly overexpressed. Most additional genes represented by ESTs belonged to the same over-expressed phylogenetic clade.
We examined the possibility that OR gene expression is related to genomic location, and potentially influenced by the transcription of non-OR neighboring genes, namely genes not belonging to the OR superfamily and located at a maximum distance of 0.5 Mb from an OR. We found that 98 out of 165 human ORs (59%) represented by at least one EST from a non-olfactory tissue have a non-OR neighbor. This is in contrast to 241 out of 684 genes (35%) for ORs not observed to be ectopically expressed (p < 0.0001, chi-square test = 36). In mouse, a similar but weaker trend was found: 51% (80 out of 156 genes) of the ectopically expressed ORs had a non-OR genomic neighbor, while the fraction was 42% (317 out of 749 genes) for ORs not observed to be ectopically expressed (p = 0.04, chi-square test = 4.21).
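The comparison above is a test of independence on a 2×2 table (ectopically expressed or not, non-OR neighbor or not). The following is a minimal sketch of the uncorrected Pearson chi-square statistic for such a table; the function name is hypothetical, and since the text does not say which variant of the test was used (for example, with or without a continuity correction), the sketch need not reproduce every reported value.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Mouse counts quoted in the text: 80 of 156 ectopically expressed ORs and
# 317 of 749 other ORs have a non-OR genomic neighbor.
mouse_stat = chi_square_2x2(80, 156 - 80, 317, 749 - 317)
print(round(mouse_stat, 2))  # prints 4.21
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05.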
Motivated by the need to provide evidence of OR transcription, we searched through numerous databases for transcription evidence and successfully provided such information for 371 human OR genes. Since the nominal expression site of OR genes, namely olfactory epithelium, is practically not represented in transcript databases, most of the expression evidence presented here originates from ectopic tissues. Our results generally indicate that hundreds of OR genes have active promoters capable of directing transcription in diverse cellular contexts. As the samples explored are basically unbiased, it is suggested that our results are representative of the entire olfactory receptor gene repertoire.
It could be argued that the ectopic OR expression is an artifact of microarray experiments. Indeed, microarray experiments are known to be "noisy" , requiring independent experimental validation. Such validation is presently underway for selected OR genes that show unique patterns of expression, and preliminary results show adequate concordance for two genes in terms of tissue-average expression levels, though not yet for individual tissue levels. However, the essence of the presently reported results, namely the widespread occurrence of OR expression in ectopic tissues is amply supported by previous reports showing ectopic OR transcription by PCR and sequencing methods [25, 29, 36, 37, 55]. The identification of numerous OR-related ESTs in diverse tissue origins provides further corroboration. Moreover, since we carefully selected the OR probesets to avoid possible cross-reaction between similar ORs, the observed expression levels could not be a result of summation over numerous genes, each expressed in negligible amounts. In fact, some of the OR genes were found to be highly expressed well above the background level.
Ectopic expression is a universal phenomenon that extends beyond the OR gene superfamily. Here, we corroborate this phenomenon for two additional gene groups, brain specific and spermatogenesis-related genes. Ectopic expression has been also observed previously at the protein level in enzymes and neurotransmitters and it has been suggested to be widespread in the cells of multicellular organisms .
Our findings do not indicate unusually high OR expression levels in olfactory epithelium relative to other tissues. This expression is specific for mature olfactory sensory neurons and not for other cell types in the olfactory epithelium, as previously reported [57, 58]. At face value, such result could be interpreted as contradicting a function for OR genes in olfactory epithelium. However, as each OR gene is expressed only in approximately 0.1% of the sensory neurons, OR transcripts are diluted in the total olfactory tissue mRNA.
We compared the results of our analyses to those of a custom-made expression microarray published by Zhang et al. In the GeneAtlas2 microarray we identified 397 unique mouse OR-related probesets representing 371 genes. We could not detect expression in any tissue for 181 probesets (155 genes). Of the remaining 216 OR genes, 117 were not found to be expressed in the olfactory epithelium and only 32 were expressed in the five upper quantiles. In contrast to our results, Zhang et al. found a higher fraction of OR genes that surpass the threshold and are called "enriched in the olfactory epithelium" than the fraction found to be expressed in the olfactory epithelium in the data analyzed here. Su et al. used 5 μg of cRNA in the microarray hybridization experiments, while Zhang et al. used 70 μg. This difference is a possible explanation as to why we fail to identify olfactory epithelium-specific OR transcription.
The median correlation coefficient between the two experiments is in line with those in other reports, ranging from 0.2 to 0.4 in comparisons among datasets acquired for identical RNA samples but in different laboratories and using different experimental and analysis protocols . Other factors which may account for the differences in the two experiments include the use of mRNA from different mouse strains, differences in probeset designs and differences in data processing software. The influence of the probeset design is critical, as learned from comparisons of different results in the same report (Zhang et al.).
It is widely believed that the main function of OR proteins is in mediating olfaction within the chemosensory organ. In parallel, reports published ever since the discovery of OR genes have proposed an additional OR function, namely as mediators of sperm chemotaxis. In the past decade, considerable evidence has accumulated in support of this hypothesis [25–31], but the concept has also been disputed. The present study, which addresses the expression of a large number of OR genes in olfactory tissues as well as in other tissues, including testis, could help to shed light on this issue. One of the important questions is which of the hundreds of OR genes is actually involved in sperm chemotaxis, as such a specific function is unlikely to be mediated by the entire OR repertoire. Previous functional studies have focused on two OR genes: the human OR1D2 (hOR17-4) and a non-orthologous mouse gene, MOR267-13 (MOR23). MOR267-13 was stated to be expressed at a low level in testis, which is also supported by our results. We found that OR1D2 is also expressed at a low level in testis. Our results further show that these two specific ORs have a higher expression in other tissues. Conversely, other ORs are highly expressed in testis but have not been invoked so far as chemotaxis mediators. It should be noted, however, that genes could function in an ectopic tissue (e.g. sperm) even if they are expressed at a higher level in another tissue. The notion that a gene must be expressed at the highest level in the tissue in which it is functional is plausible, but exceptions may exist.
In addition to a proposed role in sperm chemotaxis, ORs have been proposed specific roles in numerous other tissues, mostly based on the mere presence of OR transcript(s). This includes cell-cell recognition and organ construction during development [37, 41, 45, 46], taste perception , chemical detection of exogenous or endogenous ligands in the cerebral cortex  and additional functions [40, 43, 44]. Does OR expression in a given ectopic tissue indicate specific functionality? An alternative scenario would be that ectopic OR transcription is predominantly the result of neutral or nearly neutral mechanisms, e.g. small DNA sequence changes in regulatory regions, fixed in the population by random drift and not necessarily related to function or fitness [52, 60, 61].
A claim that ectopic OR expression may be in part governed by neutral changes is supported by several findings. First, the divergence in the expression patterns of human-mouse orthologous OR pairs in ectopic tissues is similar to that of randomly selected pairs. This observation generalizes a previously reported specific case for OR51E2, which was found to be over-expressed in human prostate [39, 40], although minor expression could be detected also in brain. Its mouse ortholog was found predominantly in brain, and its rat ortholog in brain and liver. Second, OR transcription was found here to be unbiased with respect to any particular OR subgroup in ectopic tissues (except for the human 7E subfamily members, see below), as indicated by the approximate fit to a Poisson distribution. We note, though, that other studies have reported up to a 300-fold difference in the levels of OR transcription in mouse olfactory epithelium [5, 16]. Last, OR pseudogenes were found here to be expressed in ectopic tissues at levels similar to those of intact genes. In contrast, in olfactory epithelium, where chemosensory functionality is expected, OR pseudogenes were found to be significantly less expressed than intact genes. In spite of this, our results can neither exclude nor confirm the possibility that some ORs do have a function in ectopic tissues, as the functionally related transcription might be concealed within the relatively noisy microarray data.
The relative weight of neutral evolution and purifying selection in shaping gene expression changes is controversial. Based on the expression data that have accumulated lately, a predominantly neutral model of evolution for gene expression has been proposed [52, 60, 61]. On the other hand, others  have recently indicated the parallel importance of selective constraints. The neutral model was previously supported by a lack of correlation between gene expression profiles and gene sequence divergence , a general observation that is in line with our own results. Also, similar rates of divergence in gene expression levels between humans and chimpanzees were found for intact and expressed pseudogenes , again echoing the results reported here.
Recently, ectopic expression and its evolutionary significance have been reviewed . The authors propose that transcription regulation is leaky due to the need to decondense chromatin and the limited number of transcription factors in the cell. They suggest that ectopic expression, which appears randomly, may have evolutionary potential and provide an opportunity to develop protein function diversification ("the marginal benefit hypothesis"). The same might be true for OR genes, a point that will have to be further explored.
The OR gene subfamily 7E has expanded extensively in the primate lineage, composing ~10% of the human and chimpanzee OR repertoires. In humans, all its members but one are pseudogenes [51, 63]. It was suggested that the subfamily expansion occurred through a complex mechanism of large segmental duplications, and that the duplication unit included one 7E subfamily member from each of the two phylogenetic clades of the subfamily . Our results suggest that one of these two genes had already a disrupted promoter when the expansion started. The unusually high expression level of the second clade may be the result of mutations enhancing expression. It is also possible that highly expressed 7E members have some potential function after all as previously suggested .
Previous reports have shown that adjacent and nearby genes show correlated expression patterns in yeast. Human genes were shown not to be randomly distributed in the genome; highly expressed genes and weakly expressed genes tend to populate different chromosomal domains [65, 66]. It has been suggested that co-expressed genes in eukaryotic genomes reflect the domain organization of chromatin. We have noticed that ORs located within 0.5 Mb of non-OR genes have a higher tendency to be expressed than others, hinting that the genomic environment may partially influence OR gene expression. This is further supported by the expression of an OR gene in erythroid cells, attributed to its genomic location in the transcriptionally active chromatin domain of the extended β-globin gene cluster.
We reported on widespread ectopic OR expression. Related observations have formed the basis for proposing specific roles for ORs in non-olfactory tissues. We proposed an additional explanation for ectopic expression, a neutral or near neutral model of evolution for at least part of OR transcription regulation in non-olfactory tissues. This is supported by the uniform overall transcription level in most tissues, the heterogeneity in its patterns, the expression level proportion similarity in OR intact and disrupted genes and the total lack of correlation between human and mouse orthologous ORs.
Human OR probesets were identified based on genomic location overlap between GeneAtlas2 targets and HORDE annotation . Mouse OR probeset identity was established using the UCSC genome annotation database table knownToGnf1m.txt.gz , which connects between probesets and GenBank accession numbers. Gene symbols and definitions were assigned to the accession number using Ensembl annotations downloaded via EnsMart .
We validated the probeset annotation using BLAT. A probe was matched to an OR gene if it aligned in the correct orientation, with no more than one mismatch, and did not align to any additional OR gene or to any other gene from the Ensembl all human/mouse transcripts library. Following the GeneAnnot algorithm described in , sensitivity and specificity scores were calculated for each probeset. The sensitivity score describes the percentage of probes in a probeset that match a gene, and the specificity score denotes how many other genes match the probeset and how many probes within the probeset they match. Probesets having sensitivity or specificity lower than 0.7 were excluded and not used in further analyses. After filtering the probesets we were left with representation for 273 human and 371 mouse ORs (293 human and 397 mouse probesets). Most remaining probesets were of high quality, with both sensitivity and specificity scores of 1 (79% in human and 91% in mouse; see Additional file 7).
Expression data for human and mouse tissues were obtained from GNF Symatlas . The human and mouse arrays data were supplied after applying the MAS5 (Affymetrix, Santa Clara, CA) algorithm and normalization using global median scaling . Sixty one human normal adult tissues and 48 mouse normal adult tissues were selected for the analysis.
Log10 of the expression values were averaged for replicas and were divided into 11 bins according to . Intensities lower than log10(200) for mouse and log10(300) for human were considered as the zero bin. The remaining intensities were divided into 10 equal density quantiles.
Throughout this work quantile scaling was utilized, unless otherwise stated.
A graded tissue specificity index, τ, ranging from 0 (ubiquitously expressed) to 1 (one-tissue specific) was calculated as described in Yanai et al. .
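The binning scheme just described can be sketched as follows. This is only an illustration: the function name is hypothetical, and the way ties and quantile boundaries are handled is an assumption, since the text does not specify them.

```python
import bisect
import math


def bin_expression(log_values, threshold, n_quantiles=10):
    """Assign each log10 intensity to bin 0 (below log10(threshold)) or to
    one of n_quantiles equal-density bins (1..n_quantiles) among the rest."""
    cutoff = math.log10(threshold)
    above = sorted(v for v in log_values if v >= cutoff)
    bins = []
    for v in log_values:
        if v < cutoff:
            bins.append(0)
        else:
            # Rank of v among above-threshold values (1-based, ties get the
            # highest rank), mapped to a quantile by integer ceiling division.
            rank = bisect.bisect_right(above, v)
            bins.append((rank * n_quantiles + len(above) - 1) // len(above))
    return bins


# Mouse threshold from the text (200); the intensities are made up.
raw = [100] + [1000 * k for k in range(1, 11)]
print(bin_expression([math.log10(x) for x in raw], threshold=200))
```

In this toy input the single sub-threshold value falls in bin 0 and the ten remaining values occupy bins 1 to 10, one per bin.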
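For orientation, the index of Yanai et al. is commonly written as τ = Σ(1 − x_i / x_max) / (N − 1), where x_i is the expression of the gene in tissue i and N is the number of tissues. The sketch below implements that formula; the function name is hypothetical, and whether the input should be raw, log-transformed or binned values is left open here because the text does not say.

```python
def tissue_specificity_tau(profile):
    """Tissue specificity index: 0 for uniform expression across tissues,
    1 for expression in exactly one tissue."""
    n = len(profile)
    x_max = max(profile)
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two tissues and a positive maximum")
    return sum(1 - x / x_max for x in profile) / (n - 1)


print(tissue_specificity_tau([5, 5, 5, 5]))  # prints 0.0
print(tissue_specificity_tau([0, 0, 0, 7]))  # prints 1.0
```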
Expression data was analyzed also applying GeneSpring software. This was used for determining the number of positively expressed genes, differentially expressed genes, and clustering. Each expression value was divided by the median of the microarray and the median of the probeset across all tissues. Unexpressed probesets were filtered out using the Cross-Gene Error Model. Differentially expressed genes were determined by an ANOVA test, p-value of 0.001. Under these conditions, 206 human and 216 mouse OR genes were found expressed, and 108 human and 141 mouse ORs were differentially expressed. Hierarchical gene clustering (gene tree) and tissue clustering (condition tree) were performed for differentially expressed ORs.
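The two-step median normalization mentioned above can be illustrated with a small sketch. It assumes that the per-array step is applied first and that the per-probeset median is then taken over the array-normalized values; that order is an assumption rather than something the text states, and the function name is hypothetical. It is not GeneSpring's implementation.

```python
from statistics import median


def median_normalize(matrix):
    """matrix[p][t] is the intensity of probeset p on array (tissue) t.
    Divide by the array median, then by the probeset median across arrays."""
    n_arrays = len(matrix[0])
    array_medians = [median(row[t] for row in matrix) for t in range(n_arrays)]
    per_array = [[row[t] / array_medians[t] for t in range(n_arrays)]
                 for row in matrix]
    normalized = []
    for row in per_array:
        probeset_median = median(row)
        normalized.append([v / probeset_median for v in row])
    return normalized


# Toy data: three probesets measured on three arrays.
toy = [[2.0, 4.0, 8.0], [1.0, 4.0, 4.0], [4.0, 2.0, 16.0]]
for row in median_normalize(toy):
    print(row)
```

After both steps every probeset has a median of 1 across arrays, so values above 1 indicate tissues in which the probeset is relatively enriched.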
To verify that expression data from all the tissues have the same distribution (normal), we calculated the distribution of the expression log values for each tissue and its skewness parameter. Tissues with a skewness parameter outside the range of average plus 2 standard deviations of all skewness values were removed from further analyses. The removed tissues were pancreas, brown fat and tongue in mouse and cervical ganglion, trigeminal ganglion, ovary and appendix in human.
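A minimal sketch of such a skewness filter is given below. It reads the criterion as a symmetric range (mean ± 2 standard deviations of the per-tissue skewness values) and uses population moments; both choices are assumptions, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def skewness(values):
    """Moment-based sample skewness of a list of log expression values."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def outlier_tissues(log_expr_by_tissue, n_sd=2.0):
    """Return tissues whose skewness lies more than n_sd standard deviations
    away from the mean skewness over all tissues."""
    skews = {t: skewness(v) for t, v in log_expr_by_tissue.items()}
    centre = mean(list(skews.values()))
    spread = pstdev(list(skews.values()))
    return sorted(t for t, s in skews.items() if abs(s - centre) > n_sd * spread)


# Toy data: nine symmetric tissues and one strongly right-skewed tissue.
toy = {f"tissue{i}": [1.0, 2.0, 3.0, 4.0, 5.0] for i in range(9)}
toy["skewed"] = [1.0, 1.0, 1.0, 1.0, 10.0]
print(outlier_tissues(toy))  # prints ['skewed']
```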
To quantify ectopic expression, we ranked for each OR gene the ectopic tissues according to their expression level; entropy and mean expression were then calculated for each ranked position. The Shannon entropy (H) formula was utilized to measure the entropy:
$$H = -\sum_{i=1}^{T} P_i \log_2 P_i$$
where P_i is the fraction of the tissue type i, and T is the total number of tissues. The entropy was normalized by dividing by log2 T, to range from 0 to 1.
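The procedure can be sketched as follows, combining the normalized entropy with the ectopic expression index defined in the Results (the mean, over ranked positions, of entropy times mean expression intensity). The data layout, the function names and the handling of ties in the ranking are assumptions made for illustration only.

```python
import math
from collections import Counter


def normalized_entropy(labels, n_tissues):
    """Shannon entropy (base 2) of the tissue labels, divided by log2(T)."""
    total = len(labels)
    h = -sum((c / total) * math.log2(c / total)
             for c in Counter(labels).values())
    return h / math.log2(n_tissues)


def ectopic_expression_index(binned):
    """binned[gene][tissue] is the expression bin (0..10) of a gene in an
    ectopic tissue; every gene must cover the same set of tissues."""
    n_tissues = len(next(iter(binned.values())))
    # Rank the tissues of each gene from highest to lowest expression.
    ranked = [sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
              for profile in binned.values()]
    products = []
    for pos in range(n_tissues):
        labels = [r[pos][0] for r in ranked]
        mean_expr = sum(r[pos][1] for r in ranked) / len(ranked)
        products.append(normalized_entropy(labels, n_tissues) * mean_expr)
    return sum(products) / n_tissues


toy = {"geneA": {"liver": 10, "lung": 9}, "geneB": {"liver": 9, "lung": 10}}
print(ectopic_expression_index(toy))  # prints 9.5
```

In the toy example both tissues occur equally often at each ranked position, so the entropy is 1 at both positions and the index equals the mean of the per-position mean intensities (10 and 9).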
Brain specific genes were selected based on scientific literature. We searched PubMed for papers including the exact phrases "brain specific" or "brain-specific" in the title or abstract. The list of genes was then manually extracted and carefully curated by inspection and reading of the title/abstract. Probeset/gene assignment was based on the GNF organization annotation. The final list included 57 human brain specific genes and 50 mouse brain specific genes. Human spermatogenesis-related genes were identified by searching with the keyword "spermatogenesis" in the GeneAtlas web site. Corresponding mouse orthologs for the two groups were extracted from GeneCards.
Human-mouse OR orthologs identification was based on a previous work, which identified human-mouse-dog three way mutual best hits . GeneCards was further used to extract 9616 human-mouse orthologous pairs represented in the mouse and human microarrays. Random pairs were selected from this list using the PERL rand function. Pearson correlations were calculated using the log10 expression values supplied by Su et al., since these values are normally distributed.
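The comparison of orthologous and randomly selected pairs can be illustrated with a short sketch. It assumes that each gene has a log10 expression profile over a common, identically ordered set of tissues, and it uses Python's `random` module in place of the PERL rand function; all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long profiles (nan if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def sample_pairs(pairs, k, seed=0):
    """Draw k pairs at random, without replacement, from a list of pairs."""
    return random.Random(seed).sample(pairs, k)


def pair_correlations(pairs, human_profiles, mouse_profiles):
    """Correlation between the human and mouse profile of each pair."""
    return [pearson(human_profiles[h], mouse_profiles[m]) for h, m in pairs]


human = {"H1": [1.0, 2.0, 3.0], "H2": [3.0, 1.0, 2.0]}
mouse = {"M1": [2.0, 4.0, 6.0], "M2": [1.0, 3.0, 2.0]}
print(pair_correlations([("H1", "M1"), ("H2", "M2")], human, mouse))
```

The distribution of correlations for OR ortholog pairs can then be compared with the distribution obtained for the sampled pairs.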
Data mining of ESTs and mRNA was performed using UCSC genome browser annotation tables chr#_est.txt.gz and chr#_mrna.txt.gz, (# represents the chromosome number) . We selected ESTs and mRNAs which were aligned to ORs coding regions.
After curation (see below), ESTs and mRNAs that included part of the UTR, were used as probes for extracting additional ESTs (in two data mining rounds). In addition, spliced ESTs which were aligned upstream of the coding region (up to 10 Kb) were collected. ESTs and mRNAs annotations regarding tissue source and type (tumor or normal) were extracted from GenBank and MEROPS .
Data mining from Celera Genomics was performed using all OR coding sequences as queries in a BLASTN search against Full Invitrogen sequences (FIS) and 5' Invitrogen clones libraries. Hits with an e-value of less than 10^-10 were downloaded. The final assignment of a particular Celera cDNA to a particular OR was based on an alignment >97% over 500 bp.
ESTs were rechecked to represent OR genes by excluding those whose genomic locations overlapped exons of the non-OR subset from the UCSC known gene table. Spliced ESTs were required to be transcribed from the same strand as the OR. This criterion was not applied to non-spliced ESTs, because their transcription strand is less reliable since they lack splice junctions that verify the strand. We applied a cutoff of maximum 500 bp length for non-coding exons and a maximum distance of 100 Kb between a non-coding exon and the coding region. The first criterion was based on previous knowledge about the typical OR UTR structure [18, 31] and was applied to avoid genomic contamination; the second was applied to avoid collection of chimeras (transcripts including very large introns are suspected to be chimeras). Redundant information from ESTs that belong to the same clone was removed. mRNAs whose GenBank definition matched non-OR genes were manually removed from the dataset.
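The strand, exon-length and distance criteria above amount to a simple filter, sketched below. The record layout and all names are hypothetical; only the three thresholds and the strand rule come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Est:
    spliced: bool
    strand: str                       # "+" or "-"
    noncoding_exon_lengths: list = field(default_factory=list)  # in bp
    max_distance_to_cds: int = 0      # farthest non-coding exon, in bp


def passes_curation(est, or_strand, max_exon_len=500, max_distance=100_000):
    """Apply the strand, exon-length and distance criteria to one EST."""
    # The strand is only checked for spliced ESTs, whose splice junctions
    # make the strand assignment reliable.
    if est.spliced and est.strand != or_strand:
        return False
    # Long non-coding exons are treated as possible genomic contamination.
    if any(length > max_exon_len for length in est.noncoding_exon_lengths):
        return False
    # Very distant non-coding exons suggest a chimeric transcript.
    return est.max_distance_to_cds <= max_distance


good = Est(spliced=True, strand="+", noncoding_exon_lengths=[120],
           max_distance_to_cds=8_000)
print(passes_curation(good, or_strand="+"))  # prints True
```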
The final dataset contained 867 human transcripts, including 721 ESTs (spliced and non-spliced) and 146 mRNAs, from 78 tissues. These provided evidence for transcription of 221 human ORs. In mouse we obtained 1860 ESTs and 105 mRNAs, from 50 tissues. Of these 1176 ESTs are the result of a single project . Together they provided transcription evidence for 587 mouse ORs.
Poisson distributions were calculated using ESTs from the first data mining round, as these constitute a true random set. Also, we filtered out ESTs whose source was not normal adult tissue as well as ESTs from normalized or subtracted libraries. Kolmogorov-Smirnov tests were performed at the web server or using the Matlab6p5 application software.
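As an illustration of the comparison with a Poisson distribution, the sketch below computes a Kolmogorov-Smirnov-type distance between the empirical distribution of per-gene EST counts and a Poisson distribution whose mean is estimated from the same counts. It is not the web server or Matlab procedure used in the study; the function name is hypothetical, and no p-value is computed, because standard Kolmogorov-Smirnov tables do not apply directly to discrete data with an estimated parameter.

```python
import math
from collections import Counter


def poisson_ks_statistic(counts):
    """Maximum absolute difference between the empirical CDF of the counts
    and the CDF of a Poisson distribution with the same mean."""
    n = len(counts)
    lam = sum(counts) / n
    freq = Counter(counts)
    empirical_cdf = poisson_cdf = 0.0
    distance = 0.0
    pmf = math.exp(-lam)  # P(X = 0)
    for k in range(max(counts) + 1):
        if k > 0:
            pmf *= lam / k  # recurrence P(X = k) = P(X = k-1) * lam / k
        poisson_cdf += pmf
        empirical_cdf += freq.get(k, 0) / n
        distance = max(distance, abs(empirical_cdf - poisson_cdf))
    return distance


# Made-up EST counts per gene: an even spread versus a skewed one.
print(poisson_ks_statistic([1, 0, 1, 2, 0, 1, 1, 0, 2, 1]))
print(poisson_ks_statistic([0, 0, 0, 0, 0, 0, 0, 0, 0, 10]))
```

A distribution such as the one reported for the 7E subfamily, with most genes at low counts and a few at very high counts, would be expected to give a larger distance than counts scattered evenly around the mean.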
We thank Marc Fellous and Ronny Aloni for critical reading of the manuscript. D. Lancet holds the Ralph and Lois Silver Chair in Human Genomics. Research was supported by the Crown Human Genome Center, by an Israel Ministry of Science and Technology grant to the National Knowledge Center in Genomics, and by the Abraham and Judith Goldwasser Foundation.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.