Fig. 5 | BMC Genomics

Fig. 5

From: Expert curation of the human and mouse olfactory receptor gene repertoires identifies conserved coding regions split across two exons

Some OR genes have an intron within the coding sequence. (a) Example of three mouse OR loci (Olfr1174-ps, Olfr1175-ps and Olfr1177-ps) previously annotated as pseudogenes due to the lack of a conserved iATG. However, an iATG can be found in the upstream exon, producing an intact open reading frame of ~320 amino acids. Below the gene models is the coverage track for the combined RNAseq data from all mouse samples (n = 12), along with the number of reads supporting each exon junction. (b) At the top, the previous annotation shows the truncated ORFs of two OR pseudogenes. We identify an intact ORF spanning the two loci, with strong support from the RNAseq data. Below is the nucleotide sequence around the splice junction, for mouse and the corresponding orthologues in other mammals. At the bottom is the multiple alignment of the protein sequences for the same species, with conserved amino acids in blue and variable ones in red. The splice junction is conserved. (c) Boxplots of the average counts for all protein-coding and pseudogene OR genes, plus those that become protein-coding after considering ORFs containing an intron (split), for human (top) and mouse (bottom). The split OR genes are expressed at similar levels to protein-coding loci, and at significantly higher levels than pseudogenes. Wilcoxon rank sum test, one-tailed. ns = not significant; *** p-value < 1 × 10⁻⁷. (d) Boxplots of the expression levels of the five most highly expressed OR genes in each of 33 single mature OSNs. Values for each cell are indicated, coloured following the scheme from (c). Values for cells expressing split OR genes are shown as triangles. All 33 mature OSNs express one OR gene at very high levels (1st); expression then drops rapidly by several orders of magnitude. The split OR genes (orange) are expressed at the same levels as the other protein-coding genes, and can similarly induce monogenic expression. (e) Phylogenetic tree of all protein-coding human OR genes. The split OR genes are highlighted in orange and yellow; the latter correspond to genes with two protein-coding isoforms, one with a split ORF and the other with an ORF contained within a single exon. Asterisks indicate ORs within clades that previously contained only pseudogenes. (f) Same as (e) but for the mouse protein-coding OR genes.
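
The reasoning behind panels (a) and (b) can be illustrated with a minimal sketch: a downstream exon that lacks an initiating ATG is not an intact ORF on its own, but may become one once it is joined to an upstream exon that supplies the ATG. The sequences and function names below are hypothetical toy examples, not data or code from the article, and the check assumes uppercase A/C/G/T input and the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_intact_orf(cds):
    """True if cds starts with ATG, is a whole number of codons,
    ends with a stop codon and has no stop codon before the end."""
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    aas = [CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3)]
    return aas[-1] == "*" and "*" not in aas[:-1]


# Hypothetical toy exons (coding portions only, intron already removed).
upstream_exon = "ATGGCT"       # supplies the initiating ATG
downstream_exon = "GTTTGGTAA"  # no ATG of its own; ends with a stop codon

spliced = upstream_exon + downstream_exon

print(is_intact_orf(downstream_exon))  # downstream exon alone
print(is_intact_orf(spliced))          # split ORF across both exons
print(translate(spliced))
```

In this toy case the downstream exon alone fails the check, while the spliced sequence passes it. A real annotation would also have to locate the splice donor and acceptor sites and confirm the junction with RNAseq reads, as the figure does.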
