Global regulatory architecture of human, mouse and rat tissue transcriptomes
- Ajay Prasad†1,
- Suchitra Suresh Kumar†1,
- Christophe Dessimoz3, 4, 5,
- Stefan Bleuler2,
- Oliver Laule2,
- Tomas Hruz2, 4,
- Wilhelm Gruissem1 and
- Philip Zimmermann1, 2Email author
© Prasad et al.; licensee BioMed Central Ltd. 2013
Received: 21 January 2013
Accepted: 25 September 2013
Published: 20 October 2013
Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species.
Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species.
The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.
Mouse and rat are frequent mammalian models in biomedical research to learn more about a disease, its diagnosis and its treatment. Comparing results obtained from mammalian models with those from human samples is essential to estimate conservation of molecular mechanisms across species and to refine prediction models. Cross-species comparisons can be carried out at several levels such as genomic or protein sequences, molecular abundances, or phenotypes. On the sequence level, the conservation of protein sequence between species has been extensively studied. A more recent development has been the study of genomic alterations, some of which appear to be highly relevant in translational efforts from model organisms to human . For example, several genetic mutations in acute promyelocytic leukemia have been shown to be conserved between human and mouse and are expected to be relevant for the disease . On the level of molecular abundance, protein or transcript expression are usually assessed either by comparing absolute abundances between samples (e.g. time-course or tissue type comparisons) or abundance ratios (e.g. response to a perturbation or disease, given as a ratio of an experimental versus a control condition). Several studies exist comparing transcript abundance in various tissues. Most of these studies were restricted to comparing the human and mouse transcriptomes, thereby limiting the interpretation to a bilateral relationship without evidence from further organisms [3–7]. Comparing exclusively human and mouse lacks a measure for estimating inter-species distance in transcriptome composition. Including rat as a close relative to mouse is crucial to assess the quality of the differences measured between mouse and human.
In contrast to tissue expression studies, the comparison of responses to perturbations has been much more difficult because available data consists of experiments carried out under conditions that are not easily comparable. In fact, from all public data currently available from gene expression repositories, very few experiments can be considered equivalent between human and mouse, such as cell cultures from the same tissue of origin and treated with the same chemical at identical concentration ranges and in a similar experimental setup. Therefore, comparisons have remained primarily descriptive, for example on the level of correlation network structures between species , assuming that the composition of data has little effect on overall network structure.
Recent comparative genomics efforts have led to a better understanding of conservation of gene expression between human and rodents. Nevertheless, there is still much debate about which regulatory aspects are conserved between gene homologs of different species. Earlier studies comparing the expression patterns of orthologous gene pairs in different tissues showed conflicting results. For example, some studies suggested that orthologous genes have dissimilar expression patterns [3, 4, 9–11], while others reported congruent expression profiles [5–7, 12–17]. Reasons for this discrepancy could be local experimental effects due to low coverage of genes or conditions, technological and methodological choices, probe quality, data normalizaton issues, or different methods to identify orthologs between species.
In this work, we overcome these limitations by combining three mammalian species, a large set of tissue types, two independent datasets per organism, and a data preparation methodology that delivers highly representative and robust expression values from a very large set of original data. Specifically, we integrated high quality human, mouse and rat data from more than 33,900 Affymetrix expression arrays across six microarray platforms (3 species × 2 platforms per species) and across at least 55 overlapping tissue types between each set of platforms. The original data for this study was obtained from Genevestigator , a database of manually annotated, quality controlled and globally normalized public gene expression experiments.
The study was carried out on two distinct microarray platforms per species in order to assess the validity of the results across platforms and between independent experimental compositions. We grouped the platforms into two platform sets, SET 1 including arrays with lower transcript coverage (Affymetrix HG-U133A (20k), MG-U74Av2 (12k), and RG-U34 (8k)) and SET 2 including more recent full genome arrays (Affymetrix Human133 Plus 2.0 (47k), Mouse430 2.0 (40k) and Rat230 2.0 (31k)). To further minimize biases that can occur in our comparison between species, we carried out the following:
Overlap of tissue types: for all three species, we created data matrices with identical composition of tissues within each set.
Single vector per tissue type: for each tissue type and microarray platform, we calculated a single representative expression vector based on all samples annotated with this tissue type.
Selection of orthologs: we used OMA , a state-of-the-art orthology prediction algorithm, to obtain gene ortholog clusters between human, mouse and rat.
Selection of probe sets: we excluded, wherever possible, probe sets targeting multiple transcripts, keeping only highly specific probe sets.
Data normalization: a global normalization was performed across all data from a given microarray platform (see Methods section). Additionally, each probe set was normalized across all tissue types to yield a standardized representation of tissue specificity.
Hypothesis 1: The global architecture of tissue expression is conserved between human and rodent species.
Hypothesis 2: Orthologs of more closely related species have a higher tissue expression correlation.
Hypothesis 3: Tissue expression profiling can improve the mapping of functional orthologs.
Results and discussion
Architecture of tissue expression
We hypothesized that biologically related tissues would cluster together, while unrelated tissues would be located more distantly on the plot. This is assumed to be true if for each tissue type, a representative vector of expression for that tissue can be generated; here, we calculated an average vector from all samples annotated with a given tissue type, irrespective of the nature of the experiment. The PCA projections revealed that biologically related tissues indeed clustered close to each other. Furthermore, the clustering was highly similar between the two platform sets of a given organism, despite completely different experiment compositions. The global architecture was strikingly similar between the three species studied, both in SET 1 and SET 2. In contrast to a previous study , in which mouse had a similar overall structure as human but was scaled differently, our results revealed an almost identical architecture for human, mouse and rat, both in the scale and in the clustering of the projected individual tissue types. Considering human data alone, the results were similar to those obtained by other studies (e.g. [6, 20], although here each category in the plot represents an average vector aggregated from a population of samples rather than plotting individual samples in the PCA.
The first principal component separated distinctly all central nervous system (CNS) tissues from all other body parts (limb, muscles, cell cultures and all inner organs). This suggests that all tissues of neural origin possess a major subset of genes that are strongly differentially expressed relative to tissues of other origin. For each species, the other principal components segregated all other tissue types (i.e. not of neuronal origin) into clusters of anatomical parts that have a common origin or physiology. For example, a variety of muscle tissues formed a distinct cluster which was located close to another cluster involving heart tissues (here labelled musculoskeletal and cardiovascular systems, respectively). On the opposite side of this component, related tissue clusters from the digestive, respiratory and renal systems were located. Interestingly, in SET 2, liver appears to be more closely located to the remaining tissues for mouse and rat in comparison to human. For SET 1, this is observed when principal component 1 is plotted against component 2. This result confirms previous findings in a comparison of human and mouse [6, 21]. Finally, SET 1 also comprised tissues belonging to haemolymphoid or immune system, such as thymus, spleen, bone marrow, blood and lymph node. These tissues distinctly clustered together on the opposite side of the cardiovascular and musculoskeletal system (Figure 1A).
A particular case is the 'cell culture’ category. It appears distinct from the tissues from which these cells originated. The results suggest that bringing cells into culture causes a major shift in the transcriptome population. To assess the nature of this transformation more in detail, we identified a set of 177 genes exhibiting high expression in cell culture but minimal expression in all other normal tissues from SET 2. Gene Ontology enrichment analysis of these genes revealed that most of these genes belonged to apoptosis-related processes (see Additional file 3). The individual cell culture samples that were aggregated to obtain the average vector for the category 'cell culture’ contained primary cells from various organs, cell lines and also blood cells. We hypothesized that the apoptosis-related signature was mainly due to the presence of immune cells and immortalized cell lines. Therefore, to further refine our analysis and to compare uniquely primary cells to their organs of origin, we carried out an independent analysis using Genevestigator, in which we excluded cell lines and blood cells. This extensive search comparing 54 different cell culture types to over 200 normal tissue types revealed a set of 217 probe sets having strong expression in most cell culture categories, but minimal expression in all normal tissues (see Additional file 4). A Gene Ontology enrichment analysis of this set revealed that most of these genes were involved in biological processes related to extracellular structure organization, vasculature development, cell motility, biological adhesion and wound healing (see Additional file 5). Obviously, cells artificially extracted and isolated from their tissue context trigger the activation of processes to reestablish this context. These processes involve several hundred genes and therefore strongly influence the global transcriptome population of these cultures, as shown here and in previous work [20, 22, 23]. Interestingly, the activation of genes related to extracellular structure organization, vasculature development and cell motility was observed across a wide variety of cell types arising from functionally completely different organs. The shift between primary cells and their tissue of origin was in the same order of magnitude as between different organs and tissue types, raising questions about how cell culture can be used to model biological processes in vitro. Nevertheless, recent progress in our understanding of cell adhesion and interaction with neighboring cells are enabling cell culture models to better mimic in vivo processes [24, 25].
Conservation of expression regulation
Enriched GO biological processes of genes with Pearson correlation coefficient above 0.7 in SET 1 and SET 2
hemostasis, blood coagulation
synaptic vesicle transport
transmission of nerve impulse
Enriched GO biological processes of genes with Pearson correlation coefficient smaller than -0.2 in SET 1 and SET 2
anterograde axon cargo transport
complement activation, lectinpathway
establishment of localization
positive regulation of adenylatecyclase activity by GPCR
pyrimidine ribonucleotide metabolic process
We extended our correlation analysis to study the global topology of tissue expression correlation data by comparing the degree distributions for each species. For this, we modeled our expression network as an undirected graph, where a node represents a gene and an edge is drawn between two genes if their expression profiles are correlated beyond a Pearson correlation coefficient threshold. For each percentage level, a degree distribution was calculated. We then calculated an average of five degree distributions for three different ranges of correlation coefficient values, namely from 0.85 to 0.89, 0.90 to 0.94 and 0.95 to 0.99. Figure 3A shows that the expression correlation networks of the tissue transcriptomes follow a power law connectivity distribution, that is, n(k) ∼ k-γ. In this study, γ ranged from 2.9 to 3.1 which is in the typical range for a scale free network topology . Tissue correlation networks therefore are composed of hubs, where different sizes of gene sets are highly correlated in comparison to a random network. It is interesting to observe that the scale free properties prevail at higher degrees for the lower correlation ranges than for the highest correlation range. This is most likely an artifact due to the smaller number of genes remaining in the network after filtering for higher correlations. In fact, the scale free properties for SET 2, which comprises four times more orthologs than SET 1, were present at higher degrees for all three organisms.
Integrating expression and sequence data
The typical way of inferring genes with conserved function across species is to identify orthologous clusters-sets of genes that evolved from a single common gene in the last common ancestor of the species in question . While sequence conservation has proven its advantage in determining orthologous relationships, this type of analysis does not include-let alone model-the associated regulatory machinery. And indeed, orthology alone is no guaranty of function conservation: many cases are known where orthologs have diverged functionally . By constrast, gene expression analysis measures the dynamic, condition-specific response of complex biological systems. Furthermore, even when the ancestral function has been retained among orthologs, because of lineage-specific duplications, such clusters can contain more than one gene per organism. Indeed, approximately 2% of the orthologous clusters from human, mouse and rat are composed of such m:n:p orthologous clusters (with m, n, p all > 1). While functional redundancy of homologs within a species exists (e.g. due to gene dosage requirements), it is generally believed that most duplicated genes carry out different functions [29–32], though the difference has recently been shown to be relatively modest . We hypothesized that a combination of sequence similarity and gene expression correlation might yield the most likely correct mappings of homologous genes that carry out the same function.
Due to the importance of both mouse and rat in biomedical research, it is imperative to study the similarities and differences relative to human, both at the level of biological processes and responses to perturbations. In this study, we focus on the first aspect. We compared a highly robust set of tissue expression profiles between orthologous genes of human, mouse and rat, obtained from the well annotated and quality controlled database Genevestigator. We used two independent data sets based on 2127 and 8954 gene clusters to substantiate our results and found that there is a strong conservation of tissue expression profiles across species for both datasets. In general, earlier studies overestimated variances between species , while more recent studies using larger and more standardized datasets showed a much higher level of conservation . Although several previous studies compared tissue-specific gene expression profiles, none of them compared human, mouse and rat tissues systematically. In fact, most of the data used in earlier comparative studies were limited either to a relatively small number of tissue types (e.g. [5, 7, 16]), or to a larger but only partly overlapping set of tissues between human and mouse . Nevertheless, our results are in conjunction with some of the earlier studies [6, 16] and can be attributed to the conservation of functionally fundamental biological processes. In fact, during organism development, cells differentiate to reach highly complex but robust and distinct biological and functional end-states. The functional stability and homogeneity of these states is crucial for the function of larger organs and for the overall survival of the organism. Therefore, it is not surprising that these mechanisms are highly conserved between mammalian species. It was interesting to observe a stronger similarity between biologically related tissues across species than between unrelated tissues within species.
Several studies have tried to minimize platform and sample variation in order to perform cross-species comparisons (e.g. ). Although our analysis was performed on a dataset compiled from experiments collected from various laboratories and performed under different conditions, we observe a high conservation of tissue-specific expression. We show that variances originating from the nature of the tissue type are significantly higher than the variances arising from perturbations applied to these tissues. This has wide reaching implications on the design and sampling of biological experiments. In fact, small variations in the composition of different cell types within samples may massively bias the population of genes that appear to be responding to an external perturbation such as the challenging of cells with a chemical compound. This means that microarray or RNA-seq perturbation experiments carried out on biological samples composed of various cell types require a higher number of biological replicates than experiments carried out on single cell types such as cell culture. It also implicates that the sampling of tissue material for transcriptomic analysis should minimize the number of functionally different cell types per sample. Recent developments in single cell analytics have raised hopes of being able to eliminate such biases. However, the level of variation between neighbor cells of the same type is still a matter of debate. It seems most likely that for the measurement of cellular response to perturbations, working with a population of hundreds of identical cell types under strictly controlled conditions will yield more representative results than individual cells and require less biological replicates.
Of particular relevance are the findings about the transcriptome shift occasioned by bringing cells into culture. The use of cell cultures and cell lines to describe biological systems using RNA-seq or microarrays must be evaluated very carefully. Practically, lists of genes that are strongly biased between individual cell cultures and their tissue of origin would need to be identified, and the analysis of cell culture transcriptome data could then be significantly improved by excluding them from such analyses. It is likely that each tissue type will have a significant proportion of genes biased specifically in the corresponding cell culture, while a common set of genes, as found here, will be present for most cell types.
On the level of gene network architecture, our study showed that the tissue expression correlation networks from human, mouse and rat share common topological properties, such as scale-free connectivity distribution. Global properties like these reflect the principles underlying robustness and dynamics of these networks.
Our analysis on orthologous gene clusters having multiple orthologs revealed that gene expression profiles can improve the mapping of genes, which are functionally related. By virtue of studying tissue-specific gene expression, core sets of genes can be identified for the purpose of developing animal models of human diseases by producing transgenic rodents with tissue-specific inducible gene expression or tissue-specific gene deletions.
Finally, we report here about the spatial dimension of transcriptomes, i.e. the dimension of fundamental biological processes resulting from cell differentiation and organ development. An essential and still missing aspect in the comparative study of human, mouse and rat transcriptomes is the comparison of their response to perturbations. The generation and availability of consistent datasets from human and rodents representing a variety of perturbations carried out on the same tissue or cell types under identical conditions would be highly valuable to further our understanding and use of mouse and rat models as predictors for applications in biomedical research.
Orthologous gene cluster
Orthologous gene clusters of human, mouse and rat were generated from the OMA database . Only clusters that have OMA IDs for all three species were considered and Ensembl gene IDs for each of the corresponding OMA IDs were retrieved from the OMA server. The mapping of Ensembl IDs to Affymetrix probe set IDs was retrieved from mapping files provided by Affymetrix (http://www.affymetrix.com/estore/).
Preparation of expression data
SET 1 comprised the Affymetrix platforms Human Genome U133A, Murine Genome U74 Version 2 and Rat Genome U34. These platforms contain 22283, 12654 and 3227 probe sets for human, mouse and rat, respectively. SET 2 comprised the Affymetrix platforms Human Genome U133 Plus 2.0 arrays, Mouse Genome 430 2.0 and Rat Genome 230 2.0. These platforms contain 50855, 21391 and 6870 probe sets for human, mouse and rat, respectively.
All experiments from Genevestigator are manually annotated using ontologies to ensure the controlled use of sample descriptors. The data was quality controlled as well as normalized using Robust Multi-array Average (RMA, ) and inter-experiment scaling (see Genevestigator User Manual; http://www.genevestigator.com). Probe sets targeting multiple transcripts represented by suffix (_x_ at, _g_at, _f_at,_r_at,_b_at,_l_at and _i_at) were filtered out and only probe sets with higher specificity represented by suffix (_at and _s_at) were taken into consideration for further analysis. In cases where multiple probesets were available for a given gene, the probe set having the maximum present call percentage across the complete database was considered. Only complete gene clusters, i.e having triplets of human, mouse and rat probe sets, were considered for further analysis.
Principal Component Analysis (PCA) was carried out on a matrix, which contained the orthologous genes from the gene clusters containing 1:1:1 relationships. For each organism, expression vectors across tissue types were normalized separately using the R function norm. Principal Component Analysis was performed on these matrices using the prcomp function.
Hierarchical clustering was performed with the hclust function provided in R statistical package based on 1-[pearson correlation] as a distance measure and complete linkage clustering.
For analyzing the homologous gene clusters having m:n:p relationships between species, the gene clusters having a maximum of four related genes in each species were considered. For all possible combinations of the homologous genes in Human-Mouse, Human-Rat and Mouse-Rat, expression matrices were created, which were normalized and the pairwise Pearson correlation coefficient was calculated. The correlation values for Human-Mouse, Human-Rat and Mouse-Rat genes present in homologous gene clusters were then ranked from highest to lowest and the results were visualized using a box plot.
Correlation networks were constructed based on pairwise correlations between all genes of a given dataset. An edge between two genes represented by nodes was defined for correlation coefficients beyond a chosen threshold, resulting in a discretized matrix with 1, if the correlation coefficient was greater than that threshold and 0 otherwise. The degree for each gene was calculated from the matrix as a sum of number of 1s present in each row. The degree distribution was calculated by using the formula P(k) = n k /n, where n is the number of nodes in a network and n k is the number of nodes having degree k. A degree distribution was calculated for each percent of correlation threshold, and an average of these values was calculated for each of the ranges 0.85-0.89, 0.90-0.94 and 0.95-0.99 for each species. The degree distribution per range is represented as a log-log graph of average degree distribution versus degree.
Gene Ontology enrichment analysis was performed on orthologous gene clusters with a Pearson correlation coefficient above 0.7 for both SET 1 and SET 2 and below -0.2 for both datasets. GO enrichment analysis was performed using the GOstats package in Bioconductor, with hypergeometric distribution testing. The complete list of genes served as the universal set. A p-value of 0.001 was considered significant for genes having positive correlation, while a p-value of 0.05 was considered significant for genes having negative correlation.
The second analysis of cell cultures versus normal tissues was carried out using the GENE SEARCH Anatomy tool from Genevestigator . To allow a comparison of primary cells versus normal tissues, we excluded immortalized cell lines and blood cells from the 'target’ categories of the gene search and compared against all other normal tissues. Gene ontology enrichment was performed with GOEAST  and the long list of results obtained was trimmed using GO Trimming .
We acknowledge support of our research from the Swiss Commission for Technology and Innovation (CTI, grants 9428.1 PFLS-LS and 12396.1 PFLS-LS) and from ETH Zurich. We thank also Vincent Jaquet for performing preliminary analyzes on similar sets of data, the results of which led to designing this project.
- Bennett CN, Green JE: Unlocking the power of cross-species genomic analyses: identification of evolutionarily conserved breast cancer networks and validation of preclinical models. Breast Cancer Res. 2008, 10: 213-10.1186/bcr2125.PubMed CentralView ArticlePubMedGoogle Scholar
- Wartman LD, Larson DE, Xiang Z, Ding L, Chen K, Lin L, Cahan P, Klco JM, Welch JS, Li C, Payton JE, Uy GL, Varghese N, Ries RE, Hoock M, Koboldt DC, McLellan MD, Schmidt H, Fulton RS, Abbott RM, Cook L, McGrath SD, Fan X, Dukes AF, Vickery T, Kalicki J, Lamprecht TL, Graubert TA, Tomasson MH, Mardis ER, et al: Sequencing a mouse acute promyelocytic leukemia genome reveals genetic events relevant for disease progression. J Clin Invest. 2011, 121 (4): 1445-1455. 10.1172/JCI45284. 10.1172/JCI45284PubMed CentralView ArticlePubMedGoogle Scholar
- Yanai I, Graur D, Ophir R: Incongruent expression profiles between human and mouse orthologous genes suggest widespread neutral evolution of transcription control. Omics : J Integr Biol. 2004, 8: 15-24. 10.1089/153623104773547462.View ArticleGoogle Scholar
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A. 2004, 101 (16): 6062-6067. 10.1073/pnas.0400782101.PubMed CentralView ArticlePubMedGoogle Scholar
- Su AI, Cooke MP, Ching KA, Hakak Y, Walker JR, Wiltshire T, Orth AP, Vega RG, Sapinoso LM, Moqrich A, Patapoutian A, Hampton GM, Schultz PG, Hogenesch JB: Large-scale analysis of the human and mouse transcriptomes. Proc Natl Acad Sci U S A. 2002, 99 (7): 4465-4470. 10.1073/pnas.012025199.PubMed CentralView ArticlePubMedGoogle Scholar
- Zheng-Bradley X, Rung J, Parkinson H, Brazma A: Large scale comparison of global gene expression patterns in human and mouse. Genome Biol. 2010, 11: R124-10.1186/gb-2010-11-12-r124.PubMed CentralView ArticlePubMedGoogle Scholar
- Liao BY, Zhang J: Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol Biol Evol. 2006, 23 (3): 530-540.View ArticlePubMedGoogle Scholar
- Bergmann S, Ihmels J, Barkai N: Similarities and differences in genome-wide expression data of six organisms. PLoS Biol. 2003, 2: e9-PubMed CentralView ArticlePubMedGoogle Scholar
- Rustici G, Mata J, Kivinen K, Lio P, Penkett C, Burns G, Hayles J, Brazma A, Nurse P, Bahler J: Periodic gene expression program of the fission yeast cell cycle. Nat Genet. 2004, 36 (5): 809-17.View ArticlePubMedGoogle Scholar
- Han ES, Hickey M: Microarray evaluation of dietary restriction. J Nutr. 2005, 135 (6): 1343-1346.PubMedGoogle Scholar
- Yanai I, Korbel JO, Boue S, McWeeney SK, Bork P, Lercher MJ: Similar gene expression profiles do not imply similar tissue functions. Trends Genet. 2006, 22 (3): 132-138. 10.1016/j.tig.2006.01.006.View ArticlePubMedGoogle Scholar
- Lelandais G, Crom S: Cross-Species Comparison Using Expression Data. Introduction to Systems Biology. Edited by: Choi S. 2007, : Humana Press, 147-159. doi:10.1007/978-1-59745-531-2-8.View ArticleGoogle Scholar
- Fang H, Tong W, Perkins R, Shi L, Hong H, Cao X, Xie Q, Yim S, Ward J, Pitot H, Dragan Y: Bioinformatics approaches for cross-species liver cancer analysis based on microarray gene expression profiling. BMC bioinformatics. 2005, 6 (Suppl 2): S6-10.1186/1471-2105-6-S2-S6.PubMed CentralView ArticlePubMedGoogle Scholar
- Liao BY, Zhang J: Low rates of expression profile divergence in highly expressed genes and tissue-specific genes during mammalian evolution. Mol Biol Evol. 2006, 23 (6): 1119-1128. 10.1093/molbev/msj119.View ArticlePubMedGoogle Scholar
- Chen J, Blackwell TW, Fermin D, Menon R, Chen Y, Gao J, Lee AW, State DJ: Evolutionary-conserved gene expression response profiles across mammalian tissues. OMICS: J Integr Biol. 2007, 11: 96-122. 10.1089/omi.2006.0007.View ArticleGoogle Scholar
- Chan E, Quon G, Chua G, Babak T, Trochesset M, Zirngibl R, Aubin J, Ratcliffe M, Wilde A, Brudno M, Morris Q, Hughes T: Conservation of core gene expression in vertebrate tissues. J Biol. 2009, 8 (3): 33-10.1186/jbiol130.PubMed CentralView ArticlePubMedGoogle Scholar
- Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, Albert FW, Zeller U, Khaitovich P, Grützner F, Bergmann S, Nielsen R, Paabo S, Kaessmann H: The evolution of gene expression levels in mammalian organs. Nature. 2011, 478 (7369): 343-348. 10.1038/nature10532.View ArticlePubMedGoogle Scholar
- Hruz T, Laule O, Szabo G, Wessendorp F, Bleuler S, Oertle L, Widmayer P, Gruissem W, Zimmermann P: Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes. Adv Bioinformatics. 2008, 2008: 420747-PubMed CentralView ArticlePubMedGoogle Scholar
- Schneider A, Dessimoz C, Gonnet GH: OMA Browser–exploring orthologous relations across 352 complete genomes. Bioinformatics. 2007, 23: 2180-2182. 10.1093/bioinformatics/btm295.View ArticlePubMedGoogle Scholar
- Lukk M, Kapushesky M, Nikkila J, Parkinson H, Goncalves A, Huber W, Ukkonen E, Brazma A: A global map of human gene expression. Nat Biotechnol. 2010, 28 (4): 322-324. 10.1038/nbt0410-322.PubMed CentralView ArticlePubMedGoogle Scholar
- Somel M, Creely H, Franz H, Mueller U, Lachmann M, Khaitovich P, Paabo S: Human and chimpanzee gene expression differences replicated in mice fed different diets. PLoS ONE. 2008, 3: e1504-10.1371/journal.pone.0001504.PubMed CentralView ArticlePubMedGoogle Scholar
- Birgersdotter A, Sandberg R, Ernberg I: Gene expression perturbation in vitro–a growing case for three-dimensional (3D) culture systems. Semin Cancer Biol. 2005, 15 (5): 405-412. 10.1016/j.semcancer.2005.06.009.View ArticlePubMedGoogle Scholar
- Sandberg R, Ernberg I: The molecular portrait of in vitro growth by meta-analysis of gene-expression profiles. Genome Biol. 2005, 6 (8): R65-10.1186/gb-2005-6-8-r65.PubMed CentralView ArticlePubMedGoogle Scholar
- Underhill GH, Galie P, Chen CS, Bhatia SN: Bioengineering methods for analysis of cells in vitro. Annu Rev Cell Dev Biol. 2012, 28: 385-410. 10.1146/annurev-cellbio-101011-155709.View ArticlePubMedGoogle Scholar
- Liu JS, Gartner ZJ: Directing the assembly of spatially organized multicomponent tissues from the bottom up. Trends Cell Biol. 2012, 22 (12): 683-691. 10.1016/j.tcb.2012.09.004.PubMed CentralView ArticlePubMedGoogle Scholar
- Barabasi AL, Oltvai ZN: Network biology: understanding the cell’s functional organization. Nat Rev Genet. 2004, 5: 101-113. 10.1038/nrg1272.View ArticlePubMedGoogle Scholar
- Fitch W: Distinguishing homologous from analogous proteins. Syst Zool. 1970, 19 (2): 99-113. 10.2307/2412448.View ArticlePubMedGoogle Scholar
- Studer RA, Robinson-Rechavi M: How confident can we be that orthologs are similar, but paralogs differ?. Trends Genet : TIG. 2009, 25 (5): 210-216. 10.1016/j.tig.2009.03.004.View ArticlePubMedGoogle Scholar
- Koonin EV: Orthologs, paralogs, and evolutionary genomics. Annu Rev Genet. 2005, 39: 309-338. 10.1146/annurev.genet.39.073003.114725.View ArticlePubMedGoogle Scholar
- Huminiecki L, Wolfe KH: Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse. Genome Res. 2004, 14: 1870-1879. 10.1101/gr.2705204.PubMed CentralView ArticlePubMedGoogle Scholar
- Li WH, Yang J, Gu X: Expression divergence between duplicate genes. Trends Genet. 2005, 21: 602-607. 10.1016/j.tig.2005.08.006.View ArticlePubMedGoogle Scholar
- Makova KD, Li WH: Divergence in the spatial pattern of gene expression between human duplicate genes. Genome Res. 2003, 13: 1638-1645. 10.1101/gr.1133803.PubMed CentralView ArticlePubMedGoogle Scholar
- Altenhoff AM, Studer RA, Robinson-Rechavi M, Dessimoz C: Resolving the ortholog conjecture: orthologs tend to be weakly, but significantly, more similar in function than paralogs. PLOS Comput Biol. 2012, 8 (5): e1002514-10.1371/journal.pcbi.1002514.PubMed CentralView ArticlePubMedGoogle Scholar
- Blanc G, Wolfe KH: Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell. 2004, 16: 1679-1691. 10.1105/tpc.021410.PubMed CentralView ArticlePubMedGoogle Scholar
- Dowell R: The similarity of gene expression between human and mouse tissues. Genome Biol. 2011, 12: 101-10.1186/gb-2011-12-1-101.PubMed CentralView ArticlePubMedGoogle Scholar
- Altenhoff AM, Schneider A, Gonnet GH, Dessimoz C: OMA 2011: orthology inference among 1000 complete genomes. Nucleic Acids Res. 2011, 39 (Database issue): D289-D294.PubMed CentralView ArticlePubMedGoogle Scholar
- Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP: Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003, 31 (4): e15-10.1093/nar/gng015.PubMed CentralView ArticlePubMedGoogle Scholar
- Zheng Q, Wang XJ: GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis. Nucleic Acids Res. 2008, 36 (Web Server issue): W358-W363.PubMed CentralView ArticlePubMedGoogle Scholar
- Jantzen SG, Sutherland BJ, Minkley DR, Koop BF: GO Trimming: Systematically reducing redundancy in large Gene Ontology datasets. BMC Res Notes. 2011, 4: 267-10.1186/1756-0500-4-267.PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.