Transcription factors are key elements in orchestrating gene expression programs during development and differentiation and in response to the environment. A recent census of human transcription factors stated that, out of approximately 1700-1900 transcription factor genes, roughly 700 encode C2H2 zinc finger domains; with about 400 members, the KRAB-ZNF proteins form the largest subgroup of all. There is evidence to suggest that KRAB zinc finger genes, and thus KRAB-mediated transcriptional repression, first emerged 360 million years ago at a time when the first tetrapod/amphibian genomes were established. The ZNF family grew continuously during phylogenesis, particularly in the mammalian and therein the primate lineage, often with lineage-specific expansions [2, 58, 59, 51, 5, 6, 60]. The expansions are considered the result of repeated tandem gene duplication followed by diversification. Gene duplication is a key mechanism in driving evolution because it provides opportunities for the selection of new phenotypes. Upon duplication, the gene copies might diversify and thus develop new functionalities (neo- or subfunctionalization), they might contribute to genetic robustness, or one of the copies might be lost (for review see ). Neo- and subfunctionalization can be brought about on different levels. Mutations in protein coding sequences may confer novel properties, e.g. for transcription factors altered DNA binding sites and thus modified target gene lists. Changes in regulatory regions may lead to quantitative and qualitative expression differences such as altered tissue expression profiles. A recent study on a large set of human transcription factors supports these notions. Both positive selection between paralogs for altered C2H2 zinc finger DNA binding domains [59, 6] and diversification of the expression patterns between paralogs have been shown for the (KRAB) ZNF family.
In a study of coding sequence polymorphisms identified in humans compared to chimpanzee, the KRAB zinc finger gene family was classified as having an excess of rapidly evolving genes, with an enrichment for positively selected genes.
Here we focused on the human 8q24.3 zinc finger cluster comprising seven members near the telomere. Our phylogenetic analysis in mammals revealed both considerable evolutionary pressure to maintain effector domain structure and ongoing evolution. Purifying selection on the KRAB and the zinc finger domains, including the major residues influencing DNA binding specificity, indicated conservation in mammals and thus a likely functional importance of the 8q24.3 ZNF genes in biological processes. Evolutionary constraints on the zinc finger region and purifying selection between orthologs have been generally noted for the C2H2 family. In contrast, the lack of functional ZNF16, ZNF34, ZNF252 and ZNF517 ortholog genes in the rodent lineages pointed to persisting evolutionary dynamics. ZNF252 is of particular interest. It appears that even within the primate lineage it can encode a fully functional KRAB-ZNF protein (in rhesus monkey), a protein with a disrupted KRAB domain and truncated C2H2 zinc fingers (in human) or only a remnant without KRAB and ZNF sequences (in chimpanzee). Sequences homologous to LINE/SINE repeats in the human and chimpanzee orthologs might be responsible for the disruption of the open reading frame in the 5' part. ZNF16 remained the only 8q24.3 locus member without evidence of containing a KRAB domain in any mammalian species.
Since human C2H2 ZNF genes probably originated from common ancestors as products of gene duplication, they most likely retained common structural and transcriptional regulatory features that should be apparent in the family members of established ZNF clusters. Our alignments/tree building, reciprocal database searches and ZNF domain characteristics revealed that the ZNF genes from 8q24.3 generally share higher similarities with members of their own locus than with ZNF genes in other genomic loci. Thus, the seven ZNF genes are each other's closest paralogs. They appear to form a rather remote genomic locus without close ties to other ZNF clusters. Furthermore, in contrast to other clusters, e.g. the one at the 19q13.2 locus described before and used as an outlier group in the present study, the 8q24.3 ZNF genes show considerably less phylogenetic relatedness within the cluster. One explanation for this relative distance within 8q24.3 could be that the duplication events leading to the paralogs happened quite early in mammalian evolution, most likely more than 130 million years ago, before the split of Metatheria and Eutheria. The high degree of conservation in dog, cow, mouse/rat and human, in combination with the location in syntenic regions, argues that the 8q24.3 ZNF locus is as old as the Eutheria. The robustness of at least three members of this locus (ZNF7, ZNF250, ZNF251) is probably due to essential functions that have been conserved during mammalian evolution. The ZNF252 ortholog found in the marsupial opossum raises the likelihood that the locus existed even earlier, in the Theria. The fact that we were unable to define other orthologs in opossum is probably due to the preliminary state of the genome assembly of this species. Without data from species phylogenetically older than mammals, it is difficult to assess which gene might have descended from a more ancestral gene.
Furthermore, it remains unclear from which ancestral locus the 8q24.3 locus originally derived.
Expression profiles are an indirect means of comparing the regulatory regions of different genes. The recorded tissue expression profiles of the 8q24.3 ZNF genes showed relatively similar overall patterns in that mostly the same tissues displayed the highest or the lowest relative expression. This implies common regulatory principles, e.g. similar cis-acting elements or trans-acting factors. Yet subgroups could be distinguished, too. A subspecialization after duplication, with remaining overlap or even redundancy, is conceivable. In order to gain insights into the gene control regions of the seven 8q24.3 ZNF genes, we performed a computational comparison of their proximal promoter regions with a focus on common properties and elements. There was evidence that the promoters can be classified as TATA-less and CpG island-associated and, with the exception of ZNF34, as displaying multiple start points for transcriptional initiation. Thus, they would mostly fit a class of core promoters dubbed "dispersed", which are evolutionarily younger and more common in vertebrates and whose exact mechanisms of regulation by transcription factors are less well understood than those of the class termed "focused". The most prominent individual TFBS module, discovered in six of the seven ZNF promoter regions (not in that of ZNF16), was made up of EGR1 and SP1, both C2H2 zinc finger proteins. The two factors have been shown to be able to act reciprocally on promoters in this module configuration, i.e. SP1 as transactivator and EGR1 as repressor. The interrelationship between these two factors was also shown by in vivo occupancy changes in genome-wide studies during monocytic differentiation. SPI1, an ETS-domain transcription factor also known as PU.1 with essential functions during hematopoiesis, was found as part of modules in five ZNF promoter regions.
Interestingly, genome-wide experimental investigation of SP1, EGR1 and SPI1 binding sites in the above-mentioned cell model of monocytic differentiation [65, 67] corroborated the potential functionality of most of our bioinformatically determined sites. We accessed the data through the online genome explorer of the Genome Network Platform (http://genomenetwork.nig.ac.jp/). The transcription factor occupancies represented there indicate, for example, binding of SP1, EGR1 and SPI1 to the ZNF251, ZNF250 and ZNF252 promoter regions, but not to that of ZNF16. This coincides with our predictions. SP1 is a ubiquitously expressed transcription factor involved in many processes through transcriptional regulation of numerous genes. The immediate-early response gene EGR1/Zif268 has also been implicated in many processes. In light of the strong expression of 8q24.3 ZNF genes in fetal brain and also cerebellum, it is noteworthy that EGR1 has been described to play a role in spatial memory. Since transcription factors have a high potential for functional pleiotropy based on broadly defined DNA binding specificity, crosstalk with each other and dependence on cell- and tissue-specific influences, it will be pivotal to determine experimentally the roles of the TFBS detected in the 8q24.3 ZNF genes. The closest pairs with respect to their expression patterns were ZNF34/ZNF250 and ZNF16/ZNF7 (see cluster analysis in Figure 8 and expression similarity calculations in Additional file 10). Our analyses of the proximal promoter regions revealed that ZNF34 shares one TFBS module with ZNF250 and that ZNF16 shares two TFBS modules with ZNF7 (see Table 4). This could partly explain the expression similarity, but regulatory elements outside the proximal promoter regions might also contribute. Furthermore, other regulatory principles such as chromatin organization or regulation by small RNAs might also play a role.
A next step for further elucidation would be an analysis of the expression profiles of the corresponding transcription factors.
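The CpG island association of the promoters noted above is commonly assessed with simple sequence statistics. The following minimal sketch assumes the widely used Gardiner-Garden and Frommer style thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6). It is not the tool used in this study, and the function names and thresholds shown are assumptions for illustration only.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC content and obs/exp thresholds to one window."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)
```

In practice such a test would be applied to sliding windows across each proximal promoter region; the single-window check here only illustrates the criteria.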
Surprisingly, KRAB-ZNF genes from other genomic loci exhibited similar tissue signatures as well, raising the possibility that there are common underlying causes for these expression patterns of ZNF genes. It is generally assumed that higher expression of a gene in one tissue compared to another points to a more important function in the tissue with the more prominent transcript levels and may be connected to tissue function [70, 71]. Thus, our recorded profiles especially emphasize fetal brain and, for particular genes of the family, also testis, cerebellum, prostate and thyroid as tissues in which the examined ZNF genes might serve important roles (see Figures 7, 8). Interestingly, tissues like heart and liver display consistently low levels of expression of the seventeen tested ZNF genes. Therefore, one might infer that (KRAB) ZNF proteins predominantly influence morpho-/organogenic processes of organs such as the brain that have been more strongly modified during tetrapod to primate evolution than liver and heart.
Fetal brain is a tissue undergoing complex developmental differentiation processes, notably neurogenesis. Testis is likewise characterized by a major differentiation program, spermatogenesis. Compared to other tissues, both display above-average features of transcriptional regulation: fetal brain and testis belong to the tissues with the highest number of alternative splicing events; testis tissue expresses a large number of tissue-specific exons and has the highest frequency of tissue-specific putative alternative promoters. A recent detailed global transcriptome analysis of human mid-fetal brain regions revealed a high percentage of expressed genes with a large number of specific gene expression and alternative splicing patterns. The transcript patterns corresponded to anatomical and functional subdivisions of the brain. C2H2 ZNF genes were also found to be enriched in some fetal brain regions. With respect to the 8q24.3 ZNF locus, most members were not scored as over- or underrepresented in a particular brain region. Only ZNF250, along with at least 15 other C2H2 ZNF genes, was underexpressed in thalamus tissue compared to the average expression in other regions. Prominent cell types of testis comprise the epithelial Sertoli cells, the androgen-producing Leydig cells and the developing germ cells, the spermatocytes. The latter cells display transcriptional properties that are conceptually different from those of somatic cells: distinctive features include the use of alternative promoters, alternate start sites and alternate transcription factors, altered genome packaging and arrest of transcription during spermatogenesis. Involvement of KRAB-ZNF proteins in spermatogenesis can be inferred from the fact that their co-repressor protein TRIM28 was shown to be required for the maintenance of this process in mouse.
In addition to reports on prominent expression of KRAB-ZNF genes in testis in embryos and adults [77–81], there is accumulating evidence that KRAB-ZNF genes play a role in sex determination, spermatogenesis and imprinting [35, 81]. We therefore assume that KRAB zinc finger genes/proteins play especially important roles in differentiation processes in fetal brain and testis, e.g. by switching off distinctive target genes in distinct temporal and spatial patterns. In this respect it is noteworthy that KRAB-ZNF genes can be found in expression signatures of stem cells and that their expression changes in response to reprogramming and Oct4 knock-down [82–84]. A recent publication even provided evidence for the mouse KRAB-ZNF protein ZFP809 as a stem-cell-specific retroviral restriction factor.
Comparison of sequence similarities with expression pattern similarities uncovered higher positive correlations for the 8q24.3 locus KRAB-ZNF group than for the more heterogeneous non-8q24.3 group (see Table 2). These correlations probably reflect the closer phylogenetic relationships in the 8q24.3 group. Notably, the cDNA and KRAB domain similarity correlations differ markedly between the two groups. These findings imply that sequences within the cDNA, and in particular those encoding the KRAB domain of the 8q24.3 genes, are somehow involved in specifying the expression profiles observed. It is more likely that cis-regulatory sequences near the KRAB-encoding exons, rather than the KRAB exon sequences themselves, contribute to this phenomenon. Such surrounding sequences might form an evolutionarily linked unit with the KRAB exons.
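The comparison described above amounts to correlating, over all gene pairs in a group, a sequence similarity value with an expression profile similarity value. A minimal sketch follows, assuming Pearson correlation both for the similarity between two tissue profiles and for the final comparison. The similarity measures actually underlying Table 2 are not restated here, and all function names and input structures are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def expression_similarity(profiles):
    """Pairwise profile correlation; profiles maps gene -> per-tissue expression list."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def similarity_correlation(seq_sim, profiles):
    """Correlate sequence similarity with expression similarity over all gene pairs.

    seq_sim maps alphabetically ordered gene pairs (a, b) to a similarity score,
    e.g. percent identity of the cDNA or of the KRAB domain (assumed input).
    """
    expr_sim = expression_similarity(profiles)
    pairs = sorted(expr_sim)
    return pearson([seq_sim[p] for p in pairs], [expr_sim[p] for p in pairs])
```

Running `similarity_correlation` separately for each gene group and each sequence feature (cDNA, KRAB domain, zinc finger region) would yield one coefficient per combination, which is the kind of value being compared between the two groups.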
Expression profiling is constrained by the samples and methodologies used. As in many published studies on human tissue-specific expression (e.g. [70, 51, 72]), we depended on commercial RNA samples from human materials that represented pools from different individuals. With respect to the methods, the amplicons on which our quantitative PCR relied are unlikely to interrogate all possible transcripts of a gene, nor do they necessarily measure the same isoforms as microarray applications. The polyA-based priming step in many labeling protocols for microarray hybridization is an example: it leads to a considerable 3' bias. The sensitivity of quantitative PCR exceeds that of the classical microarray platforms. High sensitivity is of particular interest for the robust detection of (KRAB) ZNF transcripts, which are often of low abundance. Therefore, despite the wealth of published gene expression data, problems of bias and sensitivity affect the expression profiling of (KRAB) ZNF genes and result in an incomplete picture of their transcript patterns in cells and tissues. Detailed, highly sensitive profiling of all transcripts of a (KRAB) ZNF gene will be instrumental in understanding its spatial and temporal patterns of expression. As for most other members of the ZNF superfamily, information on the proteins and functions of the 8q24.3 ZNF genes is scarce. One study discovered a complex of human ZNF7 protein with autoantigen L7 and ribosomal protein S7. ZNF16 was proposed to have a role in erythroid and megakaryocytic differentiation and to harbor a transactivation domain N-terminal to the zinc finger domain [88, 89]. Another report associated ZNF250 with cell proliferation. Recently, a localization study on endogenous mouse ZNF250/Zfp647 uncovered a novel type of nucleoplasmic body containing KRAB-ZNF proteins and TRIM28 in differentiated cells.