We identified 1227 feline cDNA sequences derived from tissues obtained from ten cats and performed extensive comparative and functional genomic analyses to elucidate the inferred gene expression patterns, biochemical functions and phenotypes associated with these sequences. These sequences and the associated analyses provide an initial perspective on feline biology. Although the number of protein-coding genes in the cat genome is predicted to be on the order of 20,000 to 25,000, similar to most other mammalian genomes, the number of published cat protein-coding gene sequences is much lower, at 2099 sequences (NCBI database, 2011). These 1227 cDNA/gene sequences represent a rich set of potential targets for genetic association studies and for the development of biologically relevant diets and pharmacologically active compounds to enhance the well-being of companion cats worldwide. Additionally, these sequences have value in similar applications for endangered felids.
Our strategy to identify a set of 1227 high-quality, high-confidence cDNA sequences from feline tissue samples expands the expressed sequence data for the domestic cat. Although we initially obtained over 3000 cDNA sequences, we chose to filter our sequences so that the set we describe would be of the most value for the feline genomics community. Specifically, the conservative strategy outlined in Figure 1 resulted in a set of 913 known sequences and 314 novel sequences (1227 sequences in total), from which 914 orthologous clusters across feline, human, dog and mouse were identified (844 from known cDNA sequences and 70 from novel cDNA sequences). The genes corresponding to these 914 orthologs were used as input sequences for a variety of bioinformatics and computational analyses aimed at providing an initial perspective on the physiological and pathological roles of these sequences in feline development, nutrition and health. Although we have identified a number of interesting results using computational and sequence comparison methods, our analysis identifies only the potential roles of these genes based on comparative analysis in other species; validating these results and establishing the function of these genes will require molecular and biochemical experimental analysis. The results of our inferred expression analysis provide a set of gene expression patterns consistent with the source tissues used for cDNA production. For each of the 21 source tissues used as starting material, inferred expression in the corresponding anatomical region was detected for more than 100 genes. Each of these tissues exhibited a relatively high number of genes associated with anatomical expression, which is what one would expect if the inferred expression patterns were an accurate representation of the true expression patterns of the source tissues.
Tissues such as brain (725 genes), heart (629 genes), pancreas (568 genes) and testis (703 genes) exhibit inferred expression of more than 60% of the 914 orthologous genes derived from our 1227 cDNA sequences. Inferred cellular expression patterns correlated with cell types expected in the source tissues, including glial cells and neurons (432 genes and 124 genes, respectively), retinal pigment epithelium cells (514 genes) and skeletal muscle cells (499 genes). Together, these results provide an expression framework for understanding the roles of these cDNA sequences in feline physiology and pathology. Because greater than 70% of our cDNA sequences were associated with embryological expression patterns, we were not surprised to discover that a significant number of developmental phenotypes were associated with our set of cDNA sequences. Specifically, we identified genes associated with abnormal heart morphology and abnormal cardiac blood flow, abnormal mesoderm development, abnormal developmental patterning and abnormal retinal neuronal layer morphology. These phenotypes are consistent with the expression and role of genes identified in the source tissues selected for cDNA sequencing. The fact that the inferred expression patterns exhibit greater breadth of expression than the starting tissues is in line with the notion that genes tend to be expressed in complex spatial and temporal patterns. The inferred expression patterns may include some anatomical, cellular and/or developmental patterns that are false positives; however, the overall picture of expression provided by this analysis greatly enhances the value of these cDNA sequences in genomic applications.
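As a quick arithmetic check on the tissue counts above, the following minimal sketch converts each reported gene count into a fraction of a chosen denominator. The counts are taken from the text; the choice of denominator (for example, the 914 orthologous genes used as analysis input) is an assumption that must be set explicitly, and the function names are illustrative only.

```python
# Gene counts per tissue, as reported in the text.
TISSUE_GENE_COUNTS = {"brain": 725, "heart": 629, "pancreas": 568, "testis": 703}


def expression_fraction(count, total):
    """Return the fraction of `total` genes that show inferred expression."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


def fractions_by_tissue(counts, total):
    """Map each tissue to its fraction of `total` genes."""
    return {tissue: expression_fraction(n, total) for tissue, n in counts.items()}


if __name__ == "__main__":
    # Assumption: fractions are taken relative to the 914 orthologous genes.
    for tissue, frac in fractions_by_tissue(TISSUE_GENE_COUNTS, 914).items():
        print(f"{tissue}: {frac:.1%}")
```

Changing the second argument of `fractions_by_tissue` shows how sensitive the reported percentages are to the denominator used.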
Interestingly, our analysis of gene ontology in the context of dN/dS values of individual orthologous cDNA sequences provides insight into how the domestic cat is both similar to and differs from other mammals. We detected evidence of negative selection acting on genes associated with microtubules and the actin cytoskeleton, suggesting that genes associated with these cellular structures are fairly well conserved among mammals [41, 42]. Additionally, we identified gene ontology annotation terms affiliated with the nucleus, the chromosomes and DNA replication exhibiting relatively low values of dN/dS along with orthologs associated with transcriptional regulation and translational elongation. Similar values were obtained for genes annotated as G-protein beta/gamma binding and trans-Golgi network trafficking, vesicle and endoplasmic reticulum compartment membrane and SNARE complex. This is not surprising given that the housekeeping functions of mammalian cells are relatively well conserved. All cells must transmit information from the genome into RNA and protein components in a manner that maintains the appropriate subcellular compartmentalization of molecular functions. Intracellular trafficking that diverges from cellular requirements is likely to exhibit relatively deleterious consequences leading to negative selection compared to cells that function appropriately. Microtubules are involved in cellular integrity, cell motility and cell division; all of these processes are critical for cell viability [41, 43].
In comparison to these highly conserved orthologs, which mediate the core cellular processes, we detect evidence of considerably less negative selection acting on orthologs associated with transmembrane receptors, apoptotic signals, guanyl-nucleotide exchange factors and GPCR activity. Additionally, we identified evidence of less negative selection among orthologs associated with extracellular spaces, mitochondrial membrane affiliation and integral proteins of the plasma membrane. Unlike the highly conserved orthologs with intracellular functions, these orthologs form the basis of interactions across cells, through the extracellular space into the nucleus and organelles, by a variety of signal transduction mechanisms for which multiple paralogous genes exist in each species. Such patterns of selection have been identified by others and may be associated with positive selection in different evolutionary lineages. Moreover, these cDNA sequences might encode proteins for which the extracellular environment plays a selective role during evolution.
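The contrast drawn above between strongly and weakly constrained functional categories can be illustrated with a minimal sketch that summarizes dN/dS per gene ontology term and ranks the terms. This is not the pipeline used in the study: the use of the median as the summary statistic, the function names and all numeric values below are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from statistics import median


def median_dnds_by_term(dnds, annotations):
    """Summarize dN/dS per GO term.

    dnds: dict mapping gene -> dN/dS ratio.
    annotations: dict mapping gene -> iterable of GO term names.
    """
    values = defaultdict(list)
    for gene, ratio in dnds.items():
        for term in annotations.get(gene, ()):
            values[term].append(ratio)
    return {term: median(v) for term, v in values.items()}


def rank_terms(dnds, annotations):
    """Return (term, median dN/dS) pairs, most constrained first."""
    medians = median_dnds_by_term(dnds, annotations)
    return sorted(medians.items(), key=lambda kv: kv[1])


# Toy data: gene names and ratios are made up, not results from the study.
TOY_DNDS = {"g1": 0.05, "g2": 0.08, "g3": 0.40, "g4": 0.55}
TOY_GO = {
    "g1": ["microtubule"],
    "g2": ["microtubule", "nucleus"],
    "g3": ["transmembrane receptor"],
    "g4": ["transmembrane receptor", "extracellular space"],
}

if __name__ == "__main__":
    for term, value in rank_terms(TOY_DNDS, TOY_GO):
        print(f"{term}: median dN/dS = {value:.3f}")
```

Lower values indicate stronger negative (purifying) selection, so terms at the top of the ranking correspond to the more conserved categories.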
It is well documented that paralogs diverge at a greater rate than orthologs [45, 46]. Because our analysis did not include the entire set of genes from the cat, we cannot rule out the possibility that some of our orthologs are not true orthologs. It is worthwhile to point out that our analysis included only cat, dog, mouse and human genes which effectively limits the detection of evolutionary selection using the dN/dS ratio because some of these species diverged more than 100 million years ago. Nonetheless, it is interesting that others have observed similar patterns of divergence in protein networks operating at the cellular periphery and within the extracellular space [44, 47].
Our analysis identified orthologs associated with respiratory chain and mitochondria as exhibiting relatively lower levels of negative selection. It is possible that the predatory status of cats resulted in adaptive changes in energy production and oxidative phosphorylation that facilitate the high energy requirements of predation.
It is interesting that we detect evidence of divergence within apoptotic genes in the cat compared to other mammalian species. This may underlie species specific differences in adaptation, such as what might be expected to have happened as obligate carnivores diverged from a common ancestor of omnivores and herbivores. The high protein requirements coupled with enhanced predatory fitness may have co-evolved with differences in cellular response to stress and cellular apoptosis, both within and outside of the brain.
This hypothesis is supported by the metabolic network analysis in GeneGO, where the top 25% of dN/dS values were associated with metabolic pathways implicated in non-carbohydrate roles. The analysis demonstrated that genes in the group with smaller dN/dS values are associated with metabolic networks most involved in carbohydrate metabolism, while the genes in the group with larger dN/dS values are in metabolic networks most involved in amino acid metabolism. This suggests that, depending on metabolic requirements, the rate of evolution may not be the same across all metabolic networks, and that obligate carnivores, like cats, may exhibit relatively less negative selection acting on genes involved in amino acid metabolism and more neutral selection acting on carbohydrate-associated genes. This result is in agreement with the observation that cats exhibit different dietary requirements for the amino acids taurine, arginine, cysteine and methionine. In contrast to dogs, cats are unable to synthesize taurine from cysteine; consequently, taurine deficiency in cats is associated with a variety of clinically important conditions, including cardiac, immune, neurological, platelet, reproductive and retinal dysfunctions. Additionally, cats exhibit rapid onset of ammonia toxicity resulting from arginine deficiency and, in severe cases, may die within 24 hours [23, 48].
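The grouping described above, in which genes with the top 25% of dN/dS values are compared with the remainder, can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the GeneGO analysis: the cutoff rule (rounding the group size up), the helper names and the toy data are all hypothetical.

```python
import math
from collections import Counter


def split_by_dnds(dnds, top_fraction=0.25):
    """Split genes into (high dN/dS group, remaining genes).

    dnds: dict mapping gene -> dN/dS ratio.
    The high group holds the ceil(n * top_fraction) largest values.
    """
    if not 0 < top_fraction < 1:
        raise ValueError("top_fraction must be between 0 and 1")
    ranked = sorted(dnds, key=dnds.get, reverse=True)
    n_top = math.ceil(len(ranked) * top_fraction)
    return ranked[:n_top], ranked[n_top:]


def pathway_counts(genes, pathways):
    """Count pathway labels among `genes`; unlabeled genes are skipped."""
    return Counter(pathways[g] for g in genes if g in pathways)


# Toy data: made-up genes, ratios and labels, for illustration only.
TOY_DNDS = {"g1": 0.02, "g2": 0.05, "g3": 0.07, "g4": 0.10,
            "g5": 0.12, "g6": 0.20, "g7": 0.45, "g8": 0.60}
TOY_PATHWAYS = {"g1": "carbohydrate metabolism",
                "g2": "carbohydrate metabolism",
                "g7": "amino acid metabolism",
                "g8": "amino acid metabolism"}

if __name__ == "__main__":
    high, rest = split_by_dnds(TOY_DNDS)
    print("high dN/dS:", pathway_counts(high, TOY_PATHWAYS))
    print("remaining:", pathway_counts(rest, TOY_PATHWAYS))
```

Comparing the two counters gives a simple view of which pathway categories are concentrated in the less constrained group; a formal enrichment test would be needed before drawing conclusions from real data.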
Through the use of KEGG pathway annotation, we identified domestic cat genes involved in a variety of amino acid related pathways, including the metabolism of alanine, aspartate, arginine, proline, glutamate, glycine, serine, threonine, histidine, lysine, methionine, phenylalanine, tyrosine and tryptophan. We identified specific pathways in amino acid metabolism that tend to differ between obligate carnivores and omnivorous mammals. These include six genes involved in tryptophan metabolism, which are of interest because cats, unlike omnivores, are unable to synthesize niacin from tryptophan. Additionally, we identified three genes involved in the metabolism of arginine, which is an essential amino acid in cats. We also identified genes involved in glutamate metabolism, which may provide insight into the metabolic consequences of the low levels of ornithine produced from glutamate in cats.
We also identified genes associated with pathways underlying lipid metabolism, including genes participating in the biochemical pathways of linoleic, alpha-linolenic and arachidonic acids, which is noteworthy because cats cannot use linoleic acid for the biosynthesis of arachidonic acid. Further analysis of these genes may provide clues about feline biochemistry associated with arachidonic acid, which may be important in feline reproduction. Finally, we identified genes involved in the metabolism of retinol, which represent another important gene set because cats are unable to synthesize retinol from beta-carotene.
The metabolism and biosynthesis of cofactors, vitamins and glycans are important in the nutrition and health of animals. Within these biochemical pathways, we identified three genes associated with folate metabolism, seven genes involved in glutathione metabolism, two genes associated with keratan sulfate biosynthesis, two genes associated with N-glycan biosynthesis and three genes associated with pantothenate and CoA biosynthesis. Some of these genes may provide value as important biological markers for monitoring oxidative stress, apoptosis and immune function in cats.
Collectively, many of these genes and their associated pathways are important for feline health and nutrition because they represent biochemical processes that cats have adapted to accommodate the narrow dietary range of an obligate carnivore in contrast to omnivorous mammals. The subsequent characterization of these genes and pathways may provide a genomic foundation for understanding how obligate carnivores differ from other animals in both health and disease.
Our functional and evolutionary analysis suggests that, through divergent evolutionary trajectories, different species evolve slightly different biochemical processes of cells, tissues and organs that contribute to the manifestation of species-specific adaptations and disorders. The domestic cat is known to suffer from a number of hereditary diseases, many of which have counterparts in other species like humans and dogs. As part of our investigation into the biological significance of our cDNA sequences, we employed a comparative genomics approach to discover the phenotypes associated with these sequences. Our approach leveraged the mammalian phenotype ontology that has been developed as part of the mouse genome database. We decided to select a relatively small number of genes for which a considerable number of important phenotypes may be associated.
Our phenotype data were obtained from previously published mouse phenotyping studies using transgenic or knockout mice. Consequently, they should be considered as related to, rather than identical to, the true phenotypes that might arise in the cat. Because our method relies upon orthologous relationships between cat and mouse genes, inaccurate mappings between orthologs may lead to inaccurate predictions of phenotypes. Furthermore, as we have described throughout this paper, the cat shares many general biological processes with other mammals but also has well-documented differences when compared to omnivorous animals. Therefore, one must consider the phenotype analysis as a general thematic picture of the functional consequences of our cDNA sequences rather than as a one-to-one mapping of gene-phenotype associations within our cDNA sequences.
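The ortholog-based inference described above can be sketched as a two-step lookup: each cat gene is mapped to its mouse ortholog, and the phenotypes annotated to that mouse gene are carried over as candidate phenotypes. The sketch below is a minimal illustration and not the pipeline used in the study; the gene identifiers are placeholders, the gene-phenotype pairings are hypothetical, and only the phenotype names are taken from the text.

```python
def transfer_phenotypes(cat_to_mouse, mouse_phenotypes):
    """Infer candidate cat phenotypes through mouse orthologs.

    cat_to_mouse: dict mapping cat gene -> mouse ortholog (or None).
    mouse_phenotypes: dict mapping mouse gene -> list of phenotype terms.
    Cat genes without an ortholog or without annotated phenotypes
    are omitted from the result.
    """
    inferred = {}
    for cat_gene, mouse_gene in cat_to_mouse.items():
        terms = mouse_phenotypes.get(mouse_gene)
        if terms:
            inferred[cat_gene] = sorted(set(terms))
    return inferred


# Placeholder identifiers and hypothetical pairings, for illustration only.
TOY_ORTHOLOGS = {
    "cat_gene_1": "mouse_gene_1",
    "cat_gene_2": "mouse_gene_2",  # ortholog with no phenotype annotation
    "cat_gene_3": None,            # no ortholog assigned
}
TOY_MOUSE_PHENOTYPES = {
    "mouse_gene_1": ["abnormal mesoderm development",
                     "abnormal heart morphology"],
}

if __name__ == "__main__":
    for gene, terms in transfer_phenotypes(TOY_ORTHOLOGS,
                                           TOY_MOUSE_PHENOTYPES).items():
        print(gene, "->", "; ".join(terms))
```

Because every inferred phenotype depends on the ortholog assignment, an incorrect entry in the first mapping propagates directly to the result, which is the caveat raised above.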
We identified seven phenotypic modules exhibiting 136 phenotypes arising from only 38 genes. Many of the genes we identified exhibit numerous phenotypes, both within and across modules. Such pleiotropic effects underlie the complexity of mammalian genomes and provide context for future genomic studies. We selected these gene-phenotype associations to provide a detailed yet tractable picture of how our cDNA sequences might map onto anatomical and physiological traits.
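Summary figures of the kind reported above (numbers of genes and phenotypes, and genes with multiple phenotypes within or across modules) can be derived from a flat list of module, gene and phenotype associations. The following is a minimal sketch under that assumed data layout; the gene names and pairings are hypothetical and the function name is illustrative.

```python
from collections import defaultdict


def summarize_modules(associations):
    """Summarize (module, gene, phenotype) triples.

    Returns the number of distinct genes and phenotypes, the genes with
    more than one phenotype, and the genes present in more than one module.
    """
    phenos_by_gene = defaultdict(set)
    modules_by_gene = defaultdict(set)
    for module, gene, phenotype in associations:
        phenos_by_gene[gene].add(phenotype)
        modules_by_gene[gene].add(module)
    all_phenotypes = {p for ps in phenos_by_gene.values() for p in ps}
    return {
        "n_genes": len(phenos_by_gene),
        "n_phenotypes": len(all_phenotypes),
        "pleiotropic_genes": sorted(
            g for g, ps in phenos_by_gene.items() if len(ps) > 1),
        "cross_module_genes": sorted(
            g for g, ms in modules_by_gene.items() if len(ms) > 1),
    }


# Hypothetical associations, for illustration only.
TOY_ASSOCIATIONS = [
    ("cardiac", "gene_a", "cardiac hypertrophy"),
    ("cardiac", "gene_a", "abnormal mitral valve morphology"),
    ("sensory", "gene_b", "cataracts"),
    ("energy and homeostasis", "gene_b", "impaired glucose tolerance"),
    ("developmental", "gene_c", "abnormal mesoderm development"),
]

if __name__ == "__main__":
    print(summarize_modules(TOY_ASSOCIATIONS))
```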
Within the cardiac module, we identified eight genes associated with phenotypes relating to cardiac disease in cats. Some of the genes within this module include tropomodulin 1, snail homolog 1 and an interleukin receptor antagonist. This module includes phenotypes of cardiac hypertrophy and mitral valve defects, both of which are known hereditary diseases in cats. These genes provide examples of the types of phenotypes that might arise from perturbations of cat genes underlying inherited feline cardiac diseases, such as aortic stenosis, atrial-septal defect, mitral valve dysplasia, tetralogy of Fallot and ventricular-septal defect [53, 54].
Our developmental module consists of seven genes and includes a TGFbeta induced homeobox transcription factor as well as the signaling molecule arginine-vasopressin. The phenotypes associated with this module include developmental patterning across both the proximal/distal axis and the rostral/caudal axis. The phenotypes also include cellular specification and patterning such as mesoderm development, trophoblast layer morphology and adipose tissue differentiation, to name a few. Domestic cats exhibit a variety of developmental defects, such as polydactyly, hip dysplasia, sacrococcygeal dysgenesis, portocaval shunt, open central fontanel, open lateral fontanel and thoracic hemivertebra [54–57]. The cDNA sequences we describe may include genes that are responsible for abnormal developmental conditions in domestic and endangered felids.
We identified a sensory module, which contains five genes, including NADH dehydrogenase (ubiquinone) Fe-S protein 4 and caspase 9, apoptosis-related cysteine peptidase. This module includes the phenotypes of cataracts, blindness and optic nerve atrophy. Examples of inherited sensory system disorders in the domestic cat include cataracts, corneal dystrophy (stromal and endothelial), progressive retinal atrophy and glaucoma. The overlap between retinal and ocular phenotypes and inherited feline diseases suggests that there are specific genomic regions, represented by our cDNA sequences, which may include aspects of the genetic mechanisms of these debilitating diseases in cats. It is interesting to note that our sensory module includes genes involved in energy production. This is not surprising, as retinal tissue is known to exhibit relatively high energy requirements, and depletion of energy in this tissue has been associated with blindness and other vision defects.
Within our energy and homeostasis module, we identified genes like glycerol kinase 2, NAD(P)H dehydrogenase quinone 1 and NADH dehydrogenase (ubiquinone) Fe-S protein 4. The phenotypes within this module are associated with traits of clinical and adaptive importance in the cat. For example, our comparative phenotype analysis identified phenotypes of insulin resistance, increased circulating insulin level and impaired glucose tolerance; traits associated with the feline hereditary disease of diabetes mellitus. This module also contains phenotypes such as abnormal gluconeogenesis, increased glucagon, abnormal glucose homeostasis and increased circulating ammonia level, which are important in felid nutrition as cats use gluconeogenesis as a predominant form of energy production and are susceptible to ammonia toxicity [17, 18]. The genes in this module are of value in exploring some of the fundamental metabolic and biochemical differences between obligate carnivores and omnivores. Moreover, these genes may provide a genomic basis for specific diets that can reduce the incidence of feline disorders associated with specific nutritional deficiencies.
Within other modules, we identified phenotypes associated with cancer, such as increased tumor incidence, malignant tumors and B-cell derived lymphoma, which may provide clues to the genetic susceptibility cats have for hereditary lymphoma. Among the behavioral phenotypes within the nervous system module, we identified a number of traits that may represent predator-specific adaptations of cats. For example, we identified cDNA sequences associated with spatial learning, balance, righting response, gait and motor coordination; traits that are almost synonymous with cats and of extreme adaptive value for an apex hyper predator.
The comparative genomics analysis of OMIM diseases within our cDNA sequence data set provides a final perspective on the importance of our reported sequences in the health of domestic cats. Many of the diseases identified in the OMIM mapping are also represented by phenotypes within the modules. This independent annotation demonstrates that our analysis converges even though OMIM analysis leverages human orthology relationships and the phenotype analysis leverages murine orthology relationships. It is worth noting the limitation of sequence based comparative genomics approaches. They can provide considerable insight into the functional role of our cDNA sequences, but must ultimately be proven through focused and carefully designed genomics studies in cats. Nonetheless, our cDNA sequences and associated analysis provide considerable value through the identification of many interesting clinically and nutritionally relevant feline genes.
The set of diseases and phenotypes provides a starting point for candidate gene approaches and for the selection of biomarkers for monitoring nutrition and health. By combining diverse types of annotation, we can better understand the function of a given gene across a breadth of tissues and organ systems, the biological processes it is involved in at the organismal level, and its role in disease. For example, we identified genes associated with expression in the heart and with a number of cardiac phenotypes, including cardiac hypertrophy, abnormal outflow tract and abnormal mitral valve morphology, as well as the OMIM disease annotation of dilated cardiomyopathy. These are of direct relevance to feline disease, since hypertrophic cardiomyopathy is a common clinical concern in cats.
The recent development of a 70,000 SNP feline bead array by Hill's Pet Nutrition and the Morris Animal Foundation provides an important and powerful resource for conducting gene association studies in the domestic cat, and related endangered species. However, even in the absence of whole-genome genetic association approaches, our characterization of these 1227 cDNA sequences provides an extremely valuable resource for candidate gene approaches aimed at investigating the genetic basis of feline phenotypes. It will be interesting to see how our comparative and functional analysis of these 1227 cDNA sequences compares to the data produced from high throughput sequencing and future genetic studies within and across different breeds in the domestic cat. It is likely that some of our functional annotations may turn out not to hold, and it is equally likely that some of them will. Through collaborative efforts, it will be possible to begin unravelling the genetic mechanisms underlying feline health and disease.