
The Retinome – Defining a reference transcriptome of the adult mammalian retina/retinal pigment epithelium



The mammalian retina is a valuable model system to study neuronal biology in health and disease. To obtain insight into intrinsic processes of the retina, great efforts are directed towards the identification and characterization of transcripts with functional relevance to this tissue.


With the goal of assembling a first genome-wide reference transcriptome of the adult mammalian retina, referred to as the retinome, we have extracted 13,037 non-redundant annotated genes from nearly 500,000 published datasets on redundant retina/retinal pigment epithelium (RPE) transcripts. The data were generated from 27 independent studies employing a wide range of molecular and biocomputational approaches. Comparison to known retina-/RPE-specific pathways and established retinal gene networks suggests that the reference retinome may represent up to 90% of the retinal transcripts. We show that the distribution of retinal genes along the chromosomes is not random but exhibits a higher order organization closely following the previously observed clustering of genes with increased expression.


The genome wide retinome map offers a rational basis for selecting suggestive candidate genes for hereditary as well as complex retinal diseases facilitating elaborate studies into normal and pathological pathways. To make this unique resource freely available we have built a database providing a query interface to the reference retinome [1].


The mammalian retina is a highly structured tissue developmentally originating from neuroectodermal evagination of the diencephalon and subsequent invagination processes resulting in the formation of two cellular layers which ultimately give rise to the inner neural retina and the outer retinal pigment epithelium (RPE) monolayer [2]. In the adult, the neural retina consists of approximately 55 distinct cell types histologically structured into three layers of cells (photoreceptors, intermediate neurons and ganglion cells) and two layers of neuronal interconnections (outer and inner plexiform layers) [3]. The RPE is differentiated into polarized cells with an apical and a basal orientation separating the neural retina from the underlying choroidal blood supply. With its apical microvilli-like processes, the RPE establishes an intimate contact with the photoreceptor outer segments to sustain their metabolic support and maintain photoreceptor integrity [4]. Together, the neural retina and the RPE provide the structural and functional basis for light perception by ensuring the capture of photons, the conversion of light stimuli into complex patterns of neuronal impulses and the transmission of the initially processed signals to the higher visual centers of the brain.

Recent progress in retinal research has greatly enhanced our current understanding of basic functional processes in the adult retina (e.g. [4, 5]). A great deal of effort has focused on the molecular dissection of the phototransduction pathway and the retinoid cycle (e.g. ref. [6]). Besides elucidating physiological mechanisms in normal tissue, the identification of genes involved in hereditary retinal disease has provided another valuable source of insight into functional pathways of the retina and the RPE (reviewed in [7, 8]).

Despite these advances, a remaining challenge is to obtain a reference genome-wide expression map of the retina/RPE transcriptome, further facilitating the identification of retinal susceptibility genes, but most importantly, offering an invaluable resource for functional genomics studies. Initial analyses of human [9, 10] and mouse [11] whole genome sequences and the use of more recent comparative gene prediction algorithms [12, 13] suggest an overall number of mammalian gene loci in the range of 35,000 to 45,000. These estimates have largely been validated by experimental data on gene transcription [14, 15] although alternative promoter usage, differential exon splicing during mRNA maturation, alternative usage of polyadenylation sites and other post-transcriptional modifications may further increase the genetic diversity required to encode the full complement of cellular transcripts [16, 17]. In addition, there may be a considerable number of non-coding genes unaccounted for by current annotations [18].

In recent years, a number of approaches and technologies were adopted to identify genes expressed in the retina/RPE of human, cow, dog and mouse including data-mining and assembly of publicly available expressed sequence tag (EST) information [19–23], sequencing of cDNA libraries generated via conventional methods [24–29] or via normalization techniques [30, 31], hybridization to gene arrays of various formats [32, 33] and serial analysis of gene expression (SAGE) [34, 35]. Suppression subtractive hybridization (SSH) has been shown to be an efficient technique with which differentially expressed genes can be normalized and enriched over 1000-fold in a single round of hybridization [36]. Subsequently, applications of SSH to identify retina and RPE-enriched genes have been reported [37–39].

Based on a comprehensive survey of data available from 27 independent studies applying a wide spectrum of gene identification approaches we have now assembled a first genome-wide reference transcriptome of the adult mammalian retina/RPE. This reference transcriptome comprises 13,037 non-redundant transcripts and likely reflects up to 90% of the mammalian retinome.


A total of 481,137 primary datasets on gene transcripts from the adult mammalian retina/RPE tissues have been generated in 27 independent studies (Table 1). Of these, 52,630 datasets (31,814 from retina, 11,632 from RPE and 9,184 from retina/RPE) were available and attributable to unique LocusLink identifiers (IDs). Correcting for gene redundancy within and between studies yielded a catalogue of 15,645 retinal/RPE genes. A survey of incidence and origin of each of these genes in the various studies analyzed demonstrated that 2,608 transcripts were found only once (see additional File 1) while the remaining 13,037 genes (see additional File 2) were confirmed in at least two and up to 16 independent gene identification approaches (Table 2). Thus, the latter compilation of genes may represent a more conservative description of the retinome minimizing a potential bias in data ascertainment. Of the 13K retinome, 1,411 genes were solely identified in retinal studies (see additional File 3) while 246 genes were exclusively found in the RPE datasets (see additional File 4).
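The confirmation rule described above (retain a gene only if at least two independent studies report it) can be illustrated with a minimal Python sketch. This is not the code used for the study; the function name `confirmed_genes`, the input layout (study label mapped to a list of gene identifiers) and the toy identifiers are assumptions made purely for illustration.

```python
from collections import defaultdict

def confirmed_genes(study_to_genes, min_studies=2):
    """Return the genes reported by at least `min_studies` distinct studies."""
    support = defaultdict(set)
    for study, genes in study_to_genes.items():
        for gene in set(genes):          # collapse redundancy within one study
            support[gene].add(study)     # record which studies saw this gene
    return {g for g, studies in support.items() if len(studies) >= min_studies}

# Hypothetical toy input (not real data): study label -> gene identifiers
toy = {
    "study_A": ["G1", "G2", "G2", "G3"],
    "study_B": ["G2", "G4"],
    "study_C": ["G3", "G2"],
}
retinome = confirmed_genes(toy)               # genes seen in two or more studies
single_or_more = confirmed_genes(toy, 1)      # everything, including singletons
```

With `min_studies=1` the result corresponds to the full non-redundant catalogue, and the difference between the two sets corresponds to the transcripts found only once.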

Table 1 Studies identifying adult mammalian retina / RPE transcripts and details on gene data retrieval
Table 2 Frequency of unique genes in studies

To assess the degree of completeness of the adult mammalian retinome, we compared the LocusLink IDs of the 13,037 transcripts to partial lists of genes known i) to be specifically expressed in the retina/RPE (category I, n = 43) (see additional File 5), ii) to play a role in the phototransduction pathway/vitamin A cycle (category II, n = 57) (see additional File 6), iii) to encode retinal/RPE proteins verified by immunohistochemistry (category III, n = 260) (see additional File 7), and iv) to be associated with syndromic and non-syndromic retinal disease (category IV, n = 102) (see additional File 8). The data show that the compiled retinome covers all retina/RPE-specific transcripts (43/43) and 53/57 (93%) of the phototransduction pathway/vitamin A cycle genes. Known retinal/RPE proteins are represented by 204/260 (79%) transcripts while 87/102 (85%) genes known to be involved in retinal diseases are found in the 13K retinome collection (Table 3). To further evaluate the significance of these findings, partial transcriptomes of heart (n = 3,660; see additional File 9), liver (n = 5,780; see additional File 10) and prostate (n = 7,018; see additional File 11) were assembled and compared to the four selected categories. In category I, none of the retina/RPE-specific genes were present in the heart, liver, or prostate partial transcriptomes while category II revealed 4 of 16 expected (25%, heart), 7 of 25 (28%, liver) and 9 of 31 (29%, prostate) genes. In category III, 59/73 (81%, heart), 82/115 (71%, liver) and 92/140 (66%, prostate) genes were present. Retinal disease genes (category IV) were found at a rate of 18 of 29 (62%, heart), 26 of 45 (58%, liver) and 28 of 55 (51%, prostate) (Table 3). The expected values for the partial transcriptomes were calculated by adjusting the respective transcriptome sizes relative to the total number of transcripts of the retinome.
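The size adjustment used for the expected values can be made explicit with a short Python sketch. It assumes that the expected count is the category size multiplied by the ratio of partial transcriptome size to retinome size, rounded to the nearest integer; the rounding rule and all names are assumptions, although the resulting values agree with the denominators quoted in the text.

```python
RETINOME_SIZE = 13037
PARTIAL_SIZES = {"heart": 3660, "liver": 5780, "prostate": 7018}
CATEGORY_SIZES = {"II": 57, "III": 260, "IV": 102}

def expected_count(category_size, transcriptome_size, reference_size=RETINOME_SIZE):
    """Scale a category size by transcriptome size relative to the retinome.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(category_size * transcriptome_size / reference_size)

# Expected number of category genes per partial transcriptome
expected = {
    cat: {tissue: expected_count(n, size) for tissue, size in PARTIAL_SIZES.items()}
    for cat, n in CATEGORY_SIZES.items()
}
```

For example, category II in heart gives 57 × 3,660 / 13,037 ≈ 16.0, matching the 16 expected genes against which the four observed genes are compared.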

Table 3 Representation of retinome and partial assemblies of heart, liver, and prostate transcriptomes in defined retina/RPE gene groups

A comparison of the 13K retinome with partial transcriptomes of heart, liver, and prostate suggests a high degree of overlapping expression between retina/RPE and heart (3,496/3,660), liver (5,343/5,780) and prostate (6,471/7,018). A total of 2,330 genes are expressed in all tissues and represent putative "housekeeping" genes (see additional File 12). It should be noted that the low number of ubiquitously expressed genes is largely due to the fragmentary nature of the heart, liver, and prostate transcriptomes. With increasing transcriptome complexities this number is likely to increase. Analysis of the least complete transcriptome, the heart, reveals that 2,330/3,660 (64%) transcripts can be classified as ubiquitously expressed (see additional Files 9 and 12) while a maximum of 1,330/3,660 (36%) genes may display tissue-restricted or tissue-specific expression. A comparison of more complete transcriptomes may significantly reduce the latter estimate. So far 5,051 genes are only found in the retinome representing a collection of "retinome-enriched" transcripts, while 7,986 are also present in at least one of the partial transcriptomes of the heart, liver or prostate. Thirty-two genes were found to be expressed in heart, liver and prostate but not in the retinome (see additional File 13).
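The three gene groups distinguished in this comparison follow from plain set operations, as the following Python sketch shows. The function name and the toy gene labels are hypothetical, and the sketch is not the procedure actually used to produce the reported counts.

```python
def compare_tissues(reference, others):
    """Split genes into three groups relative to a reference transcriptome.

    reference: set of gene IDs (e.g. the retinome)
    others:    dict mapping tissue name -> set of gene IDs
    """
    in_all_others = set.intersection(*(set(v) for v in others.values()))
    in_any_other = set().union(*others.values())
    shared_by_all = set(reference) & in_all_others   # putative "housekeeping"
    enriched = set(reference) - in_any_other          # reference-only genes
    missing = in_all_others - set(reference)          # in all others, not in reference
    return shared_by_all, enriched, missing

# Hypothetical toy data, not real gene lists
toy_retinome = {"A", "B", "C", "D"}
toy_others = {
    "heart": {"A", "B", "X"},
    "liver": {"A", "C", "X"},
    "prostate": {"A", "X", "Y"},
}
housekeeping, enriched, missing = compare_tissues(toy_retinome, toy_others)
```

Because the partial transcriptomes are incomplete, the reference-only group is an upper bound: a gene absent from a fragmentary list may still be expressed in that tissue.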

The distribution of the mammalian retinome across the human genome was assessed by a paired comparison of the number of reference retinome genes (13,037) versus the number of human non-redundant syntenic gene predictions (SGPs) (43,109) [40] along 618 five-Mb windows (Fig. 1a). Correction for the total number of SGPs relative to the number of retinome genes positioned on average 21.1 (median 19.9) SGPs per window compared to 21.3 (median 15.0) retinal genes per window. Based on the Wilcoxon two-sample paired signed rank test, the null hypothesis assuming similar distribution of the SGPs and the retinome genes across the five-Mb windows can be rejected at p < 0.01. While their overall distribution greatly parallels that of the SGPs, retinome genes tend to cluster in several chromosomal regions most prominently on chromosomes 6p22.1-p21.31, 11q12.2-q13.1, 16p13.3, 19p13.3 and 19q13.32-q13.33 (Fig. 1a,1b).

Figure 1

Chromosomal distribution of transcripts defining the reference retinome. (A) The distribution of 13,037 retinome genes over the human genome is shown as the difference between the number of observed and expected transcripts in window sizes of 5 Mb along the chromosomes (abscissa). The number of expected genes was based on 43,109 SGP-predicted transcripts. To correct for gene density per bin, the SGPs were adjusted by a factor of 0.30 (13,037/43,109). Positive/negative ordinate values indicate regions of enrichment/depletion of retinome-encoded transcripts. (B) Close-up of chromosomes 6 and 19 calculated for a window size of 1 Mb along the two chromosomes.

To provide positional candidates for syndromic and non-syndromic hereditary retinopathies, the 13K reference retinome as well as the "retinome-enriched" transcripts (5,051 transcripts) were superimposed onto the disease intervals of 42 thus far uncloned retinal disorders (Table 4). In many instances, this results in a significant reduction of genes in the respective intervals offering a manageable number of candidates for retinal diseases (e.g. the RP29 locus contains 28 SGPs of which 5 are present in the reference retinome including GPM6A, WDR17, FLJ22649, VEGFC, AGA). The number of possible candidates is further reduced in the "retinome-enriched" transcript category to GPM6A and VEGFC. To make the information on the reference retinome available, we have created the interactive RetinaCentral database, a research portal which collects and stores information on genes and proteins functionally relevant to the retinal tissues [1]. We have implemented an interactive data retrieval system that presently contains linked information on the 13,037 genes of the 13K reference retinome. Database scripts were programmed to synchronize the data with LocusLink index files [41] which are updated daily [42].

Table 4 Number of genes mapped to retinal disease loci


Compiling the transcriptome of a cell or tissue is arguably more demanding than establishing the number of gene loci encoded by a given genome sequence [43]. This may mainly be explained by the dynamic nature of mRNA itself which frequently produces alternative transcripts from a single gene locus by usage of tissue-specific promoters, cryptic splice sites or variable polyadenylation signals [44, 45]. In addition, variation in gene expression is known to occur within and between populations [46, 47] and allele-specific expression, even from non-imprinted genes, appears to be common [48]. Further complicating transcriptome definition are effects of gender and age on RNA expression [49] as well as agonal and postmortem factors which greatly affect RNA integrity and thus frequently influence subsequent analyses [50]. Finally, differences in experimental technologies and data post-processing add a further level of variability. Taken together, the complexities in mRNA metabolism and experimental data handling strongly suggest that there is not a single transcriptome for a given cell or tissue but rather an arbitrary number of individual transcriptomes, which need to be defined by a series of parameters such as age, gender, ethnicity, and cause and time of death of the tissue donor, among many others. It is therefore advisable to initially aim for a reference transcriptome providing a blueprint of an expression profile within a broadly defined time-frame. Following this line of reasoning, we here present a framework of a first reference transcriptome of the retina/RPE consisting of 13,037 unique transcripts which broadly characterize the mature state of expression in this tissue.

The present meta-analysis has integrated information from 27 studies employing diverse technologies to identify retinal/RPE transcripts. Among these, SAGE represents a sensitive tool to detect low level transcription [51] while the PCR-based SSH method is well suited to enrich for differentially expressed genes [36]. The combined use of these approaches together with conventional cDNA library sequencing and microarray-based techniques provides a more solid assessment of gene expression than would each method alone. For example, SAGE is based on sequencing of hundreds of thousands of short (10, 14, or 21 bp) tags, ideally derived from a unique location of a single transcript. Rare tags could originate from infrequently expressed transcripts but could also reflect minor genomic contamination or minor sequencing errors. For the assembly of the reference retinome we have addressed these concerns by including only those transcripts that have independently been confirmed in a second unrelated study. This has led to a conservative assembly of the 13K retinome. It should be kept in mind, however, that this procedure likely excludes a number of authentic transcripts. This is illustrated by the finding that the 15K retinome, which comprises 15,645 transcripts including those which were solely found in a single study (Table 2), contains an additional five of the 102 known retinal disease genes (RHOK, MTATP6, CHM, LRAT, RIMS1) not included in the 13K retinome. Similarly, an additional three genes (RHOK, LRAT, GPRK7) involved in the vitamin A/phototransduction pathway are part of the 15K but not the 13K retinome. With additional transcription data on the retina/RPE becoming available, a second generation retinome map will need to address this issue.

The estimation of transcriptome size represents one of the fundamental questions in molecular biology. Early studies using reassociation kinetics have calculated the number of distinct mRNA transcripts present in various mouse tissues to be between 11,500 and 12,500 [52]. Initial SAGE analyses have led to the conclusion that the number of different transcripts observed in normal and tumorous tissue may lie between 14,247 and 20,471 [53]. Recent data from comprehensive EST sequencing of a number of tissues including brain, breast, colon, head/neck, kidney, lung, ovary, prostate, and uterus suggest expression of between 7,500 and 13,500 distinct genes for each tissue [54]. Although the size of the reference retinome is consistent with these estimates, the question of adequate transcript representation by the current compilation remains open. We have addressed this by defining a number of gene groups with known expression in retina/RPE and comparing these to the reference retinome. Genes exclusively expressed in retina/RPE are highly represented in the retinome (100%), as are mainly tissue-specific genes known to play a role in the vitamin A/phototransduction pathway (93%) (Table 3). A partial list of 260 genes whose encoded proteins were shown by immunohistochemistry to be expressed in the retina/RPE (but may also be present in other tissues) was represented in the reference retinome at a rate of approximately 79%. Similar numbers were obtained for the retinome coverage of retinal disease genes (85%). From these data we conclude that the 13K reference retinome is highly representative of retina/RPE-expressed genes and may describe as much as 90% of the transcript complement in the adult state.

Another point of interest concerns the proportion of retinome transcripts which is uniquely expressed in this tissue. Brentani et al. [54] estimate that any two tissues may share between 73% and 84% of their transcriptomes. Comparing transcription in three tissues (breast, colon, head/neck) the authors found overlapping expression in 47% of transcripts. To investigate this in more detail, we have compiled three partial transcriptomes from heart (n = 3,660), liver (n = 5,780) and prostate (n = 7,018) by applying the same stringent criteria as defined for the retinome. Limited by the size of the partial heart transcriptome, we determined 2,330 transcripts (termed "housekeeping" genes) to be expressed in all four tissues (i.e. 64% of the heart transcriptome). Comparing the retinome to any of the partial transcriptomes revealed overlapping gene profiles between 92% and 95%. This would suggest that only a minor proportion of retinome transcripts is indeed unique to the retina/RPE. Thus far, we have identified a group of so-called "retinome-enriched" genes comprising 5,051 transcripts which are not present in the partial transcriptomes of heart, liver and prostate. This group most likely contains additional "housekeeping" or tissue-restricted transcripts and needs further adjustment by more refined in-silico normalization to comprehensive reference transcriptomes of other tissues.

Highly expressed genes, including those with a ubiquitous or a tissue-specific transcription profile, have been shown to cluster in chromosomal regions of increased gene expression (termed RIDGEs) [55, 56]. Functionally, this higher order structure has been related to transcriptional regulation [56, 57]. To search for a possible correlation, we have determined the chromosomal distribution of the reference retinome independent of gene density. Our data show good agreement with the previously established regional expression map defining approximately 30 RIDGEs within the human genome. Overlaps are most evident for chromosomes 6, 9, 11, 17, and 19. From this we conclude that the majority of transcripts assembled in the reference retinome share characteristics of the RIDGEs including moderate to high level expression. This finding may be ascribed to the stringent selection criteria we have applied to assemble the reference retinome by excluding all transcripts (n = 2,608) that were reported in only a single study. Conversely, the RIDGE-like pattern of the reference retinome could be an indication that missing transcripts may have features compatible with chromosomal domains defined as anti-RIDGEs [56]. As opposed to RIDGEs, clustering of genes in anti-RIDGEs seems associated with significantly decreased expression [56]. In contrast to their fractional occurrence in transcriptomes, the identification of such low-abundance transcripts is likely to require significant resources in order to compile more complete transcriptomes.

To provide positional candidates for retinal disease genes, we have mapped the transcripts representing the reference retinome to the minimal regions defined for 42 retinal disease loci with as yet undefined gene mutations. To further limit the number of candidate genes, in particular for loosely defined disease loci such as RP28 or VRNI, we have similarly integrated the "retinome-enriched" transcripts. This also accommodates for the fact that approximately 50% of retinal disease genes are retina/RPE-specific [58]. For 41 of 42 unknown disease genes we have now identified strong candidates although for some disease loci including AIED, COD4, CORD8, CRV, LCA5, RP28, and USH2B, the number of candidates may still exceed capacities of most laboratories for direct analysis. For other disease loci (e.g. BCD, BBS3, COD2, CYMD, MCDR4, OPA4, PRD, RNANC, RP24, RP29, RP6, WFS2 and WGN1), a restricted number of candidates are now available (see additional File 14).


We here present a first near-complete transcriptome of a defined tissue, the retinome, which may serve as a reference for further efforts to establish spatial, i.e. cell-specific, and developmental transcriptomes of the retina/RPE. A fundamental aspect of the current study was to integrate the available information on gene identification generated by a wide range of techniques. This ensures robustness and reliability of transcript data providing a stringent framework for further expression studies in systems biology. A similar approach for other tissues/cells would be advisable as this may greatly facilitate in-silico identification of tissue-specific genes to elucidate functional pathways vital for a defined cell population. In addition, the reference retinome may prove valuable for providing strong candidates for hereditary as well as genetically complex diseases and thus may help to further our understanding of retinal biology in health and disease.


Data retrieval and analysis

To assemble a list of genes expressed in the adult mammalian retina and RPE we reviewed 27 studies reporting raw or processed transcript data derived from several mammalian species including H. sapiens, B. taurus, C. familiaris and M. musculus (Table 1). The data were generated by cDNA library sequencing, microarray studies, and SAGE. Publicly available data analysing transcripts from adult mammalian retina/RPE tissues published until December 2003 were included. Excluded were studies investigating transcription in retina/RPE by using RNA sources such as fetal tissues, cell lines or non-mammalian species. Gene identifiers such as GenBank accession number, gene nomenclature symbol, gene description, UniGene cluster ID, cDNA sequences or tags were available from sources as detailed in additional File 15 and were used to retrieve the unique human LocusLink ID for each gene (as of December 2003). Only genes with an established LocusLink ID were included in the present study. For SAGE data, tag-to-gene assignment was done by querying the SAGEmap_tag_ug-rel dataset [59, 60]. Tags assigned to multiple genes were excluded from further analysis. Human orthologous genes were established via the NCBI-curated homology database [61] or by BLAST sequence comparison [62].
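The exclusion of SAGE tags assigned to multiple genes can be sketched in Python as follows. The input is assumed to be a list of (tag, gene) pairs; this layout is a simplification chosen for illustration and does not reproduce the actual format of the SAGEmap_tag_ug-rel dataset. The function name and the toy tags are likewise hypothetical.

```python
from collections import defaultdict

def unambiguous_tag_map(tag_gene_pairs):
    """Keep only tags that map to exactly one gene.

    tag_gene_pairs: iterable of (tag_sequence, gene_id) tuples
    (assumed layout, not the real SAGEmap file format).
    """
    genes_per_tag = defaultdict(set)
    for tag, gene in tag_gene_pairs:
        genes_per_tag[tag].add(gene)
    # Tags with more than one candidate gene are dropped entirely
    return {
        tag: next(iter(genes))
        for tag, genes in genes_per_tag.items()
        if len(genes) == 1
    }

# Hypothetical toy pairs: the second tag is ambiguous and is discarded
pairs = [
    ("AAAAAAAAAA", "gene1"),
    ("CCCCCCCCCC", "gene2"),
    ("CCCCCCCCCC", "gene3"),
    ("AAAAAAAAAA", "gene1"),
]
tag_map = unambiguous_tag_map(pairs)
```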

To assemble partial transcriptomes of heart, liver and prostate, for each tissue data were mined from at least one SAGE library, in addition to expressed sequence tag (EST) sources (see additional File 16). Similar to the criteria for the assembly of the retinome, genes identified in only one study were disregarded. EST retrieval was facilitated by use of the Gene Library Summarizer [63] which retrieves the known genes represented by at least one EST and generated from a tissue sample with normal histology.

Partial lists of genes known to play a role in the retina and/or the RPE were assembled from the literature (see additional Files 5, 6, 7 and 8). Additional File 5 summarizes genes known to be exclusively expressed in retina and/or RPE, while additional File 6 includes genes involved in the phototransduction cascade and the vitamin A cycle. Additional File 7 is a partial compilation of genes/proteins verified by immunohistochemistry to be present in adult mammalian retina and/or RPE. A list of 102 genes involved in retinal diseases was retrieved from the RetNet database, January 2004 [58] (see additional File 8).

Assignment of genes and disease loci to the human genome

A total of 43,109 human non-redundant syntenic gene predictions (SGP) were retrieved (as of December 2003) and chromosomally mapped to the reference sequence of the human genome (July 2003) utilizing the UCSC Genome Table Browser [64]. Based on the position of their putative transcription start sites, the SGPs were assigned to 5 Mb bins along the human chromosomes. In addition, one-megabase bins were defined for refined analysis of chromosomes 6 and 19 (Fig. 1b). Similarly, the chromosomal map positions of the retinome transcripts were determined by querying the UCSC Genome Table Browser with the respective LocusLink, UniGene or RefSeq IDs.
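The bin assignment and the observed-minus-expected comparison plotted in Fig. 1 can be sketched in Python. The sketch assumes each gene is given as a (chromosome, transcription start position) pair and that the expected count per bin is the background count scaled by the ratio of the two list sizes (13,037/43,109 in this study); function names and toy coordinates are hypothetical.

```python
from collections import Counter

BIN_SIZE = 5_000_000  # 5 Mb windows; 1_000_000 gives the 1 Mb close-up

def bin_counts(genes, bin_size=BIN_SIZE):
    """Count genes per (chromosome, bin index); genes are (chrom, tss) pairs."""
    return Counter((chrom, pos // bin_size) for chrom, pos in genes)

def observed_minus_expected(retinome_genes, all_genes, bin_size=BIN_SIZE):
    """Per-bin difference between retinome genes and scaled background genes."""
    observed = bin_counts(retinome_genes, bin_size)
    background = bin_counts(all_genes, bin_size)
    scale = len(retinome_genes) / len(all_genes)   # corrects for list size
    bins = sorted(set(observed) | set(background))
    return {b: observed[b] - background[b] * scale for b in bins}

# Hypothetical toy coordinates, not real gene positions
all_genes = [("chr1", 1_000_000), ("chr1", 2_000_000),
             ("chr1", 6_000_000), ("chr1", 7_000_000)]
retinome_genes = [("chr1", 1_000_000), ("chr1", 2_000_000)]
diff = observed_minus_expected(retinome_genes, all_genes)
```

Positive values mark bins enriched for retinome genes relative to the scaled background, negative values mark depleted bins.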

Mapped loci of retinal dystrophies with unknown genetic basis (n = 45) were taken from RetNet, January 2004 [58] and placed on the human genome sequence by querying the UCSC Genome Table Browser with DNA marker sequences shown to flank the minimal candidate region. Three disease loci (CORD1, CORD4 and RCD1) are insufficiently mapped on the respective human chromosomes and were therefore not included in the analysis.
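Once a disease locus has been placed on the genome by its flanking markers, listing positional candidates amounts to an interval query followed by an optional restriction to a gene set such as the reference retinome. The Python sketch below illustrates this under assumed inputs; the gene labels, coordinates and function name are hypothetical and do not correspond to any real locus.

```python
def candidates_in_interval(genes, chrom, start, end, allowed=None):
    """Return symbols of genes located inside a flanked interval.

    genes:   iterable of (symbol, chromosome, position) tuples
    allowed: optional set of symbols (e.g. the retinome) used as a filter
    """
    hits = [s for s, c, p in genes if c == chrom and start <= p <= end]
    if allowed is not None:
        hits = [s for s in hits if s in allowed]
    return hits

# Hypothetical toy annotation and marker positions
genes = [
    ("geneA", "chr4", 1_200_000),
    ("geneB", "chr4", 1_800_000),
    ("geneC", "chr4", 3_500_000),
    ("geneD", "chr5", 1_500_000),
]
toy_retinome = {"geneB", "geneC"}
all_hits = candidates_in_interval(genes, "chr4", 1_000_000, 2_000_000)
retinome_hits = candidates_in_interval(
    genes, "chr4", 1_000_000, 2_000_000, allowed=toy_retinome
)
```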

Statistical analysis of gene distribution

To determine if either of the two datasets, the 43,109 human non-redundant SGPs and the 13K retinome transcripts, is distributed in a non-parametric and distribution free manner over the genome, the Kolmogorov-Smirnov Goodness-of-Fit Test was used [65]. Statistical significance of the median difference in paired chromosomal distribution of retinome transcripts versus the SGPs was then evaluated by the non-parametric Wilcoxon two-sample paired signed rank test [66]. To carry out the test we calculated the difference between all genes versus retinal genes per 5-Mb bin. To correct for the total number of genes within the two groups, the SGPs per bin were adjusted by a factor of 13,037/43,109 = 0.30. Mean and median values per bin were 21.05 and 10.93 for all genes and 21.26 and 19.93 for retinal genes, respectively.
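For readers unfamiliar with the paired signed rank test, the following Python sketch computes the statistic on per-bin counts using only the standard library. It is an illustration under stated simplifications, not the software used here: zero differences are dropped, tied absolute differences receive average ranks, and the two-sided p-value comes from the normal approximation without tie or continuity correction. The function name and toy values are hypothetical.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W_plus, z, p). Requires at least one non-zero paired difference.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))              # two-sided p-value
    return w_plus, z, p

# Hypothetical toy bins: retinal counts versus scaled background counts
retinal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scaled_background = [0] * 10
w_plus, z, p = wilcoxon_signed_rank(retinal, scaled_background)
```

For real analyses an established implementation such as `scipy.stats.wilcoxon` is preferable, since it handles small samples and ties more carefully.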


  1. 1.

    RetinaCentral - A portal to the Human Retina. []

  2. 2.

    Chow RL, Lang RA: Early eye development in vertebrates. Annu Rev Cell Dev Biol. 2001, 17: 255-296. 10.1146/annurev.cellbio.17.1.255.

    CAS  Article  PubMed  Google Scholar 

  3. 3.

    Masland RH: The fundamental plan of the retina. Nat Neurosci. 2001, 4: 877-886. 10.1038/nn0901-877.

    CAS  Article  PubMed  Google Scholar 

  4. 4.

    Marmor MF, Wolfensberger TJ: The Retinal pigment Epithelium. 1998, Oxford: Oxford University Press

    Google Scholar 

  5. 5.

    Palczewski K, Polans AS, Baehr W, Ames JB: Ca(2+)-binding proteins in the retina: structure, function, and the etiology of human visual diseases. Bioessays. 2000, 22: 337-350. 10.1002/(SICI)1521-1878(200004)22:4<337::AID-BIES4>3.0.CO;2-Z.

    CAS  Article  PubMed  Google Scholar 

  6. 6.

    Thompson DA, Gal A: Genetic defects in vitamin A metabolism of the retinal pigment epithelium. Dev Ophthalmol. 2003, 37: 141-154.

    CAS  Article  PubMed  Google Scholar 

  7. 7.

    Rattner A, Sun H, Nathans J: Molecular genetics of human retinal disease. Annu Rev Genet. 1999, 33: 89-131. 10.1146/annurev.genet.33.1.89.

    CAS  Article  PubMed  Google Scholar 

  8. 8.

    Pacione LR, Szego MJ, Ikeda S, Nishina PM, McInnes RR: Progress toward understanding the genetic and biochemical mechanisms of inherited photoreceptor degenerations. Annu Rev Neurosci. 2003, 26: 657-700. 10.1146/annurev.neuro.26.041002.131416.

    CAS  Article  PubMed  Google Scholar 

  9. 9.

    Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.

    CAS  Article  PubMed  Google Scholar 

  10. 10.

    Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA: The sequence of the human genome. Science. 2001, 291: 1304-1351. 10.1126/science.1058040.

    CAS  Article  PubMed  Google Scholar 

  11. 11.

    Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420: 520-562. 10.1038/nature01262.

    CAS  Article  PubMed  Google Scholar 

  12. 12.

    Parra G, Agarwal P, Abril JF, Wiehe T, Fickett JW, Guigo R: Comparative gene prediction in human and mouse. Genome Res. 2003, 13: 108-117. 10.1101/gr.871403.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  13. 13.

    Xuan Z, Wang J, Zhang MQ: Computational comparison of two mouse draft genomes and the human golden path. Genome Biol. 2003, 4: R1-10.1186/gb-2002-4-1-r1.

    PubMed Central  Article  PubMed  Google Scholar 

  14. 14.

    Das M, Burge CB, Park E, Colinas J, Pelletier J: Assessment of the total number of human transcription units. Genomics. 2001, 77: 71-78. 10.1006/geno.2001.6620.

    CAS  Article  PubMed  Google Scholar 

  15. 15.

    Carninci P, Waki K, Shiraki T, Konno H, Shibata K, Itoh M, Aizawa K, Arakawa T, Ishii Y, Sasaki D: Targeting a complex transcriptome: the construction of the mouse full-length cDNA encyclopedia. Genome Res. 2003, 13: 1273-1289. 10.1101/gr.1119703.

    PubMed Central  Article  PubMed  Google Scholar 

  16. 16.

    Boue S, Letunic I, Bork P: Alternative splicing and evolution. Bioessays. 2003, 25: 1031-1034. 10.1002/bies.10371.

    CAS  Article  PubMed  Google Scholar 

  17. 17.

    Zavolan M, Kondo S, Schonbach C, Adachi J, Hume DA, Hayashizaki Y, Gaasterland T: Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome. Genome Res. 2003, 13: 1290-1300. 10.1101/gr.1017303.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  18. 18.

    Kampa D, Cheng J, Kapranov P, Yamanaka M, Brubaker S, Cawley S, Drenkow J, Piccolboni A, Bekiranov S, Helt G: Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004, 14: 331-342. 10.1101/gr.2094104.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  19. 19.

    Malone K, Sohocki MM, Sullivan LS, Daiger SP: Identifying and mapping novel retinal-expressed ESTs from humans. Mol Vis. 1999, 5: 5-

    PubMed Central  CAS  PubMed  Google Scholar 

  20. 20.

    Sohocki MM, Malone KA, Sullivan LS, Daiger SP: Localization of retina/pineal-expressed sequences: identification of novel candidate genes for inherited retinal disorders. Genomics. 1999, 58: 29-33. 10.1006/geno.1999.5810.

  21.

    Bortoluzzi S, d'Alessi F, Danieli GA: A novel resource for the study of genes expressed in the adult human retina. Invest Ophthalmol Vis Sci. 2000, 41: 3305-3308.

  22.

    Stohr H, Mah N, Schulz HL, Gehrig A, Frohlich S, Weber BH: EST mining of the UniGene dataset to identify retina-specific genes. Cytogenet Cell Genet. 2000, 91: 267-277.

  23.

    Katsanis N, Worley KC, Gonzalez G, Ansley SJ, Lupski JR: A computational/functional genomics approach for the enrichment of the retinal transcriptome and the identification of positional candidate retinopathy genes. Proc Natl Acad Sci U S A. 2002, 99: 14326-14331. 10.1073/pnas.222409099.

  24.

    Bernstein SL, Borst DE, Neuder ME, Wong P: Characterization of a human fovea cDNA library and regional differential gene expression in the human retina. Genomics. 1996, 32: 301-308. 10.1006/geno.1996.0123.

  25.

    Shimizu-Matsumoto A, Adachi W, Mizuno K, Inazawa J, Nishida K, Kinoshita S, Matsubara K, Okubo K: An expression profile of genes in human retina and isolation of a complementary DNA for a novel rod photoreceptor protein. Invest Ophthalmol Vis Sci. 1997, 38: 2576-2585.

  26.

    Buraczynska M, Mears AJ, Zareparsi S, Farjo R, Filippova E, Yuan Y, MacNee SP, Hughes B, Swaroop A: Gene expression profile of native human retinal pigment epithelium. Invest Ophthalmol Vis Sci. 2002, 43: 603-607.

  27.

    Wistow G, Bernstein SL, Wyatt MK, Ray S, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K: Expressed sequence tag analysis of human retina for the NEIBank Project: retbindin, an abundant, novel retinal cDNA and alternative splicing of other retina-preferred gene transcripts. Mol Vis. 2002, 8: 196-204.

  28.

    Wistow G, Bernstein SL, Wyatt MK, Fariss RN, Behal A, Touchman JW, Bouffard G, Smith D, Peterson K: Expressed sequence tag analysis of human RPE/choroid for the NEIBank Project: over 6000 non-redundant transcripts, novel genes and splice variants. Mol Vis. 2002, 8: 205-220.

  29.

    Yu J, Farjo R, MacNee SP, Baehr W, Stambolian DE, Swaroop A: Annotation and analysis of 10,000 expressed sequence tags from developing mouse eye and adult retina. Genome Biol. 2003, 4: R65. 10.1186/gb-2003-4-10-r65.

  30.

    Sinha S, Sharma A, Agarwal N, Swaroop A, Yang-Feng TL: Expression profile and chromosomal location of cDNA clones, identified from an enriched adult retina library. Invest Ophthalmol Vis Sci. 2000, 41: 24-28.

  31.

    Lin CT, Sargan DR: Generation and analysis of canine retinal ESTs: isolation and expression of retina-specific gene transcripts. Biochem Biophys Res Commun. 2001, 282: 394-403. 10.1006/bbrc.2001.4587.

  32.

    Kennan A, Aherne A, Palfi A, Humphries M, McKee A, Stitt A, Simpson DA, Demtroder K, Orntoft T, Ayuso C: Identification of an IMPDH1 mutation in autosomal dominant retinitis pigmentosa (RP10) revealed following comparative microarray analysis of transcripts derived from retinas of wild-type and Rho(-/-) mice. Hum Mol Genet. 2002, 11: 547-557. 10.1093/hmg/11.5.547.

  33.

    Chowers I, Gunatilaka TL, Farkas RH, Qian J, Hackam AS, Duh E, Kageyama M, Wang C, Vora A, Campochiaro PA: Identification of novel genes preferentially expressed in the retina using a custom human retina cDNA microarray. Invest Ophthalmol Vis Sci. 2003, 44: 3732-3741. 10.1167/iovs.02-1080.

  34.

    Blackshaw S, Fraioli RE, Furukawa T, Cepko CL: Comprehensive analysis of photoreceptor gene expression and the identification of candidate retinal disease genes. Cell. 2001, 107: 579-589. 10.1016/S0092-8674(01)00574-8.

  35.

    Sharon D, Blackshaw S, Cepko CL, Dryja TP: Profile of the genes expressed in the human peripheral retina, macula, and retinal pigment epithelium determined through serial analysis of gene expression (SAGE). Proc Natl Acad Sci U S A. 2002, 99: 315-320. 10.1073/pnas.012582799.

  36.

    Diatchenko L, Lau YF, Campbell AP, Chenchik A, Moqadam F, Huang B, Lukyanov S, Lukyanov K, Gurskaya N, Sverdlov ED: Suppression subtractive hybridization: a method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proc Natl Acad Sci U S A. 1996, 93: 6025-6030. 10.1073/pnas.93.12.6025.

  37.

    den Hollander AI, van Driel MA, de Kok YJ, van de Pol DJ, Hoyng CB, Brunner HG, Deutman AF, Cremers FP: Isolation and mapping of novel candidate genes for retinal disorders using suppression subtractive hybridization. Genomics. 1999, 58: 240-249. 10.1006/geno.1999.5823.

  38.

    Sharma S, Chang JT, Della NG, Campochiaro PA, Zack DJ: Identification of novel bovine RPE and retinal genes by subtractive hybridization. Mol Vis. 2002, 8: 251-258.

  39.

    Schulz HL, Rahman FA, Fadl El, Moule FM, Stojic J, Gehrig A, Weber BH: Identifying differentially expressed genes in the mammalian retina and the retinal pigment epithelium by suppression subtractive hybridization. Cytogenet Genome Research.

  40.

    Wiehe T, Gebauer-Jung S, Mitchell-Olds T, Guigo R: SGP-1: prediction and validation of homologous genes based on sequence alignments. Genome Res. 2001, 11: 1574-1583. 10.1101/gr.177401.

  41.

    Pruitt KD, Maglott DR: RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res. 2001, 29: 137-140. 10.1093/nar/29.1.137.

  42.

    National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health.

  43.

    Harrison PM, Kumar A, Lang N, Snyder M, Gerstein M: A question of size: the eukaryotic proteome and the problems in defining it. Nucleic Acids Res. 2002, 30: 1083-1090. 10.1093/nar/30.5.1083.

  44.

    Landry JR, Mager DL, Wilhelm BT: Complex controls: the role of alternative promoters in mammalian genomes. Trends Genet. 2003, 19: 640-648. 10.1016/j.tig.2003.09.014.

  45.

    Sorek R, Shamir R, Ast G: How prevalent is functional alternative splicing in the human genome?. Trends Genet. 2004, 20: 68-71. 10.1016/j.tig.2003.12.004.

  46.

    Enard W, Khaitovich P, Klose J, Zollner S, Heissig F, Giavalisco P, Nieselt-Struwe K, Muchmore E, Varki A, Ravid R: Intra- and interspecific variation in primate gene expression patterns. Science. 2002, 296: 340-343. 10.1126/science.1068996.

  47.

    Blackshaw S, Kuo WP, Park PJ, Tsujikawa M, Gunnersen JM, Scott HS, Boon WM, Tan SS, Cepko CL: MicroSAGE is highly representative and reproducible but reveals major differences in gene expression among samples obtained from similar tissues. Genome Biol. 2003, 4: R17. 10.1186/gb-2003-4-3-r17.

  48.

    Knight JC: Allele-specific gene expression uncovered. Trends Genet. 2004, 20: 113-116. 10.1016/j.tig.2004.01.001.

  49.

    Chowers I, Liu D, Farkas RH, Gunatilaka TL, Hackam AS, Bernstein SL, Campochiaro PA, Parmigiani G, Zack DJ: Gene expression variation in the adult human retina. Hum Mol Genet. 2003, 12: 2881-2893. 10.1093/hmg/ddg326.

  50.

    Tomita H, Vawter MP, Walsh DM, Evans SJ, Choudary PV, Li J, Overman KM, Atz ME, Myers RM, Jones EG: Effect of agonal and postmortem factors on gene expression profile: quality control in microarray analyses of postmortem human brain. Biol Psychiatry. 2004, 55: 346-352. 10.1016/j.biopsych.2003.10.013.

  51.

    Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J, Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM: Analysis of human transcriptomes. Nat Genet. 1999, 23: 387-388. 10.1038/70487.

  52.

    Hastie ND, Bishop JO: The expression of three abundance classes of messenger RNA in mouse tissues. Cell. 1976, 9: 761-774. 10.1016/0092-8674(76)90139-2.

  53.

    Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, Kinzler KW: Gene expression profiles in normal and cancer cells. Science. 1997, 276: 1268-1272. 10.1126/science.276.5316.1268.

  54.

    Brentani H, Caballero OL, Camargo AA, da Silva AM, da Silva WA, Dias Neto E, Grivet M, Gruber A, Guimaraes PE, Hide W: The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags. Proc Natl Acad Sci U S A. 2003, 100: 13418-13423. 10.1073/pnas.1233632100.

  55.

    Caron H, van Schaik B, van der Mee M, Baas F, Riggins G, van Sluis P, Hermus MC, van Asperen R, Boon K, Voute PA: The human transcriptome map: clustering of highly expressed genes in chromosomal domains. Science. 2001, 291: 1289-1292. 10.1126/science.1056794.

  56.

    Versteeg R, van Schaik BD, van Batenburg MF, Roos M, Monajemi R, Caron H, Bussemaker HJ, van Kampen AH: The human transcriptome map reveals extremes in gene density, intron length, GC content, and repeat pattern for domains of highly and weakly expressed genes. Genome Res. 2003, 13: 1998-2004. 10.1101/gr.1649303.

  57.

    Barrans JD, Ip J, Lam CW, Hwang IL, Dzau VJ, Liew CC: Chromosomal distribution of the human cardiovascular transcriptome. Genomics. 2003, 81: 519-524. 10.1016/S0888-7543(03)00008-9.

  58.

    Retinal Information Network.

  59.

    NCBI, SAGE Maps.

  60.

    NCBI, SAGE Maps.

  61.

    NCBI, HomoloGene.

  62.

    Basic local alignment search tool (BLAST).

  63.

    Gene Library Summariser (GLS).

  64.

    UCSC Genomic Bioinformatics, Table Browser.

  65.

    Chakravarti L, Roy J: Handbook of Methods of Applied Statistics. 1967, Hoboken, NJ: John Wiley and Sons, vol. I.

  66.

    Wilcoxon F: Individual comparisons by ranking methods. Biometrics. 1945, 1: 80-83.

  67.

    NCBI, Unigene.

  68.

    National Eye Institute (NEI), Retina cDNA Unnormalised Library.

  69.

    National Eye Institute (NEI), RPE/choroid cDNA Unnormalised Library.



This work was supported by grants from the Deutsche Forschungsgemeinschaft (DFG) (We1259/14-2 and 14-3) and the Bundesministerium für Bildung und Forschung (BMBF) (01KW9921/0).

Author information



Corresponding author

Correspondence to Bernhard HF Weber.

Additional information

Authors' contributions

HLS participated in the design of the study and collected and processed the information on retina and RPE genes from the published reports. TG conceptually developed the RetinaCentral database and is involved in curating the site. JK carried out the statistical analyses and helped with the computational handling of the data. BHFW was involved in all aspects of data assembly and prepared the manuscript. All authors read and approved the final manuscript.

Electronic supplementary material

Additional File 1: Retina/RPE genes reported only in a single study. List of genes expressed in the retina or RPE reported only in one study (XLS 191 KB)

Additional File 2: Gene list of the reference retina/RPE transcriptome. List of 13,037 genes expressed in the retina/RPE (XLS 1 MB)

Additional File 3: List of genes identified exclusively in retina studies (XLS 146 KB)

Additional File 4: List of genes identified exclusively in RPE studies (XLS 118 KB)

Additional File 5: Partial list of genes known to be expressed specifically in the retina and/or RPE (XLS 28 KB)

Additional File 6: Partial list of genes known to be involved in the vitamin A cycle and phototransduction pathway (XLS 23 KB)

Additional File 7: Partial list of genes known to encode retina / RPE proteins (XLS 125 KB)

Additional File 8: List of known genes involved in retinal diseases (XLS 22 KB)

Additional File 9: Partial gene list of the heart transcriptome (XLS 870 KB)

Additional File 10: Partial gene list of the liver transcriptome (XLS 926 KB)

Additional File 11: Partial gene list of the prostate transcriptome (XLS 959 KB)

Additional File 12: List of genes present in the reference retinome and partial transcriptomes of heart, liver, prostate (XLS 151 KB)

Additional File 13: List of genes present in partial transcriptomes of heart, liver, and prostate but not in reference retinome (XLS 20 KB)

Additional File 14: List of retinome transcripts mapping to retinal disease intervals (XLS 167 KB)

Additional File 15: Data sources used to compile the retina / RPE transcriptome (XLS 18 KB)

Additional File 16: Data sources used to compile partial transcriptomes of heart, liver, and prostate (XLS 14 KB)


About this article

Cite this article

Schulz, H.L., Goetz, T., Kaschkoetoe, J. et al. The Retinome – Defining a reference transcriptome of the adult mammalian retina/retinal pigment epithelium. BMC Genomics 5, 50 (2004).



  • Suppression Subtractive Hybridization
  • Retinal Disease
  • Neural Retina
  • Retinal Gene
  • cDNA Library Sequencing