Flowchart of the data selection. The diagram shows the three parallel analyses performed in the study. (a) The human subset of the muscle-specific training set curated by Wasserman and Fickett was used to validate the methodology (46 sequences, group a). (b) Bovine cardiac tissue-specific expression (group b). Bovine contigs from all libraries were filtered using the following criteria: a contig had to consist of at least 6 ESTs, and more than 90% of those ESTs had to be present in the bovine cardiac library. The EST contigs were then passed through the orthology selection, in which their human RefSeqs were identified. Promoter regions of the resulting RefSeqs were extracted and examined for common regulatory motifs. Group b is composed of the resulting 23 human sequences. (c) Bovine and human cardiac tissue-specific expression (group c). Bovine contigs from four muscle and cardiac tissue libraries were compared to the human genome as described above. The resulting human RefSeqs were subsequently scanned for high expression in human cardiac tissue (log ratio ≤ 2). Group c consists of these 25 human sequences. Eight sequences were predicted by both methods (b and c), so the combined set has 40 sequences (23 + 25 − 8). MEME motifs in the combined group were compared (red arrow). The additional analyses performed using the bovine orthologues of the human genes in the two data sets are highlighted in the grey box. Steps common to (b) and (c) are coloured yellow. Selection steps are represented by blue boxes.
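The contig filter in (b) and the size of the combined set can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Contig` record, its field names, and both helper functions are hypothetical and introduced here only for illustration. Only the thresholds (at least 6 ESTs, more than 90% from the cardiac library) and the group sizes (23, 25, 8 shared) are taken from the caption.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    """Hypothetical record for one bovine EST contig (illustrative only)."""
    contig_id: str
    total_ests: int      # ESTs in the contig across all libraries
    cardiac_ests: int    # ESTs that come from the bovine cardiac library


def is_cardiac_specific(contig, min_ests=6, min_percent=90):
    """Apply the two criteria from (b): at least `min_ests` ESTs, and
    strictly more than `min_percent` % of them from the cardiac library."""
    if contig.total_ests < min_ests:
        return False
    # Integer arithmetic avoids floating-point edge cases at exactly 90%.
    return contig.cardiac_ests * 100 > min_percent * contig.total_ests


def combined_size(n_b, n_c, n_shared):
    """Size of the union of groups b and c (inclusion-exclusion)."""
    return n_b + n_c - n_shared


contigs = [
    Contig("ctg1", total_ests=10, cardiac_ests=10),  # passes both criteria
    Contig("ctg2", total_ests=5, cardiac_ests=5),    # too few ESTs
    Contig("ctg3", total_ests=10, cardiac_ests=9),   # exactly 90%, not > 90%
]
selected = [c.contig_id for c in contigs if is_cardiac_specific(c)]
print(selected)                   # ['ctg1']
print(combined_size(23, 25, 8))   # 40
```

The strict inequality mirrors the caption's "more than 90%" wording, so a contig with exactly 90% cardiac ESTs is excluded in this sketch.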