The development and progression of diseases are to a large extent linked to evolutionarily conserved genes as well as to reactions to environmental factors such as infection or nutrition. Many experimental model organisms, including mouse, worm, fruit fly and yeast, were established to study such factors where human material could not be sufficiently exploited. Such studies necessitate tracing the descendants of common ancestor genes in these organisms, where gene conservation is typically judged by sequence similarity. Genes similar in sequence can be found not only in different organisms but also within a single organism, following evolutionary events such as genome duplications or the emergence of alternative splicing isoforms (ASIs).
Today's systems biology approaches integrate experimental insights from wet labs, combining the fields of functional genomics, biomedicine and bioinformatics. The function of genes is either determined experimentally or predicted from the similarity of the respective sequences. In the same way, sequence similarity is commonly regarded as a valid measure of gene evolution. Hence, the conservation of gene sequences as well as of gene function points to a common origin, but this conservation does not exclude the possibility that gene function has changed [2–5]. On the other hand, gene function can be derived indirectly from gene expression analyses that treat expression profiles as reaction patterns. Several aspects, such as the essentiality of homologous genes in normal and disease networks, are studied quantitatively using genome-wide gene, transcript or protein expression data. Hence, it would be beneficial to infer gene function with an integrated approach that combines sequences and quantitative measurements.
The basis for such cross-species comparisons is the precise determination of orthology relationships between full-length protein sequences. Current family inference approaches aim at computing the closest proteins between two organisms, the orthologs, or within a single organism, the paralogs, but they are generally not capable of discriminating molecular individuals of similar functionality among a set of similar genes, e.g. within a single organism [8, 9]. Such approaches have been analysed and compared; essentially, they consist of a sequence similarity comparison with tools like BLAST along with specific categorization algorithms and processing pipelines that exploit the results of the sequence similarity search. Sets resulting from such categorizations are named ortholog clusters or protein families; the latter term is used for all such results in this paper. Depending on the specific algorithm for determining protein families, either a reduced [12–14] or the complete available organism range is used. Implementation strategies aim at strictly pairwise organism comparisons [16, 17], large-scale comparisons of multiple species [15, 18], or robust clustering algorithms in uncurated pipelines [15, 19]; other pipelines include reconciliations or further phylogenetic data. During the last two decades, a large number of such approaches have been created [10, 20], and it is often difficult to retrace the results of such procedures. The respective repositories were established as searchable databases and are currently the end points of the information processing queue. In the context of quantitative analytics or systems biology, it is desirable that such successful and excellent information resources are used, for example, as components of integrative data pipelines. Likewise, comparisons across particular established family sets by a stand-alone or integrative application tool are rare.
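To make the two-step scheme above concrete (a sequence similarity search followed by a categorization step), the following minimal sketch applies the reciprocal-best-hit criterion, a widely used and deliberately simple categorization rule for a strictly pairwise organism comparison. It is an illustration only and is not claimed to be the procedure of any approach cited here. The hit tuples stand in for parsed similarity-search results, and all identifiers (`humA`, `musA`, and so on) and scores are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# One similarity hit: (query_id, subject_id, bit_score).
# Assumption: hits were parsed beforehand from a similarity search result.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map every query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical toy data for two organisms A and B.
a_vs_b = [("humA", "musA", 250.0), ("humA", "musB", 180.0),
          ("humB", "musB", 300.0), ("humC", "musA", 90.0)]
b_vs_a = [("musA", "humA", 245.0), ("musB", "humB", 310.0),
          ("musB", "humA", 175.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy data, `humC` finds `musA` as its best hit, but `musA` prefers `humA`, so `humC` is left unpaired. This shows why such a rule yields putative orthologs but cannot, by itself, separate functionally distinct members of a set of similar genes.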
Because of the complexity of evolution, it becomes necessary to broaden gene expression analyses to the full set of paralogous genes. Current gene expression studies are able to quantify expressed genes in a genome-wide manner using microarray chip technology and, more recently, quantitative high-throughput sequencing. One problem of microarray analyses is that expression data are often condensed to gene-level annotation, which does not reflect alternatively spliced isoforms. It is important to note that large repositories for microarray experiments, namely the Gene Expression Omnibus (GEO), ArrayExpress and the Gene Expression ATLAS [21–23], store already normalized data besides the raw data. Such resources are used to retrieve or derive differential expression data from published research on immune response, ageing, nutrition, compound or drug treatment, or experimental gene manipulations such as knock-out or gene silencing. Helper tools already exist that facilitate cross-species queries in such repositories for finding ortholog relationships.
In this work we describe the web server Ortho2ExpressMatrix (O2EM), which integrates evolutionary information on gene (protein or microRNA) families and whole-genome expression data in order to i) compare experimentally comparable settings for two biological objects; ii) find functional orthologs in a pair of species; iii) help to infer a meaningful sub-clustering of large gene families by gene expression data; and iv) reveal a series of similarly differentially expressed genes in the light of sequence similarity. The tool represents a systematic approach for genome and proteome research and aims to contribute to functional orthology detection.