The Epc-N domain: a predicted protein-protein interaction domain found in select chromatin associated proteins
© Perry; licensee BioMed Central Ltd. 2006
Received: 20 July 2005
Accepted: 16 January 2006
Published: 16 January 2006
An underlying tenet of the epigenetic code hypothesis is the existence of protein domains that can recognize various chromatin structures. To date, two major candidates have emerged: (i) the bromodomain, which can recognize certain acetylation marks and (ii) the chromodomain, which can recognize certain methylation marks.
The Epc-N (Enhancer of Polycomb-N-terminus) domain is formally defined herein. This domain is conserved across eukaryotes and is predicted to form a right-handed orthogonal four-helix bundle with extended strands at both termini. The types of amino acid residues that define the Epc-N domain suggest a role in mediating protein-protein interactions, possibly specifically in the context of chromatin binding, and the types of proteins in which it is found (known components of histone acetyltransferase complexes) strongly suggest a role in epigenetic structure formation and/or recognition. There appear to be two major Epc-N protein families that can be divided into four unique protein subfamilies. Two of these subfamilies (I and II) may be related to one another in that subfamily I can be viewed as a plant-specific expansion of subfamily II. The other two subfamilies (III and IV) appear to be related to one another by duplication events in a primordial fungal-metazoan-mycetozoan ancestor. Subfamilies III and IV are further defined by the presence of an evolutionarily conserved five-center-zinc-binding motif in the loop connecting the second and third helices of the four-helix bundle. This motif appears to consist of a PHD followed by a mononuclear Zn knuckle, followed by a PHD-like derivative, and will thus be referred to as the PZPM. All non-Epc-N proteins studied thus far that contain the PZPM have been implicated in histone methylation and/or gene silencing. In addition, an unusual phyletic distribution of Epc-N-containing proteins is observed.
The data suggest that the Epc-N domain is a protein-protein interaction module found in chromatin associated proteins. It is possible that the Epc-N domain serves as a direct link between histone acetylation and methylation statuses. The unusual phyletic distribution of Epc-N-containing proteins may provide a conduit for future insight into how different organisms form, perceive and respond to epigenetic information.
Cellular DNA is packaged as chromatin, a condensed fiber composed of nucleosome core particles. Each core particle comprises 147 base pairs of DNA wrapped nearly twice around an octamer of histone proteins, which is canonically defined by an H3/H4 tetramer flanked by two H2A/H2B heterodimers [1].
The amino- and carboxy-terminal tails of histone proteins protrude from the core particle into the solvent, and are therefore amenable to post-chromatin packaging modification. Well known histone modifications include Lys acetylation, Ser/Thr phosphorylation and Lys/Arg methylation at N-terminal tails; however, examples of histone tail ubiquitinylation, SUMOylation, ADP ribosylation, glycosylation, carbonylation and biotinylation have also been described. The dynamic composite of these modifications (the epigenetic state) determines chromatin structure, and therefore gene activity, in a manner that is not yet fully understood. It is generally thought, however, that acetylation is positively correlated with transcriptional activation and that methylation is positively correlated with gene silencing, though reciprocal examples of each paradigm have been shown (for recent reviews, see [2–4]).
There are two competing, but not necessarily mutually exclusive, models for how a cell interprets epigenetic information. The first is the "histone code" model, which states that histone tail modification, occurring in sequential, interdependent layers, specifically alters the affinity for various chromatin associated proteins in a way that influences downstream function. For example, the histone code is thought to underlie the determination of bulk chromatin properties such as the formation and maintenance of heterochromatic and euchromatic domains [5–7]. A second hypothesis likens histone modification to commonly known receptor mediated signal transduction networks [8]. This "signaling network" model more easily accounts for the apparent degeneracy amongst certain histone modifications, and suggests that such modifications serve to confer bistability, robustness and adaptability to a presumed chromatin based network.
Regardless of the operative model, certain proteins must be able to recognize particular chromatin structures or outputs, and capacitating motifs have apparently evolved for this purpose. Two well-characterized examples are the bromodomain and the chromodomain. The bromodomain, first identified in the Drosophila chromatin remodeling protein Brahma, is a left-handed four-helix bundle that binds selectively to acetyl-lysine [9, 10]. It is conserved amongst eukaryotes, and has been found distributed into three major protein families: (i) ATP-dependent chromatin remodeling factors, (ii) histone acetyltransferases (HATs, e.g., GCN5, PCAF, TAFII250) and (iii) BET (bromodomain + ET domain) transcriptional regulators. Bromodomains can occur as a single copy or in duplicate, and when they occur in tandem, as in TAFII250, they can bind selectively to diacetylated histone tails with appropriately spaced acetyl-lysine moieties [11]. By contrast, chromodomains comprise histone methylation mark recognizing motifs defined by three antiparallel β-strands reinforced with a single cross-strand helix [12]. Chromodomains are also conserved across eukaryotes, and have even been found in two Phycodnaviridae viruses. Like the bromodomain, the chromodomain has been distributed into three major protein families: (i) proteins with an amino-terminal chromodomain followed by a chromo-shadow domain [Su(var)205], (ii) proteins with a single chromodomain in conjunction with other non-related domains (e.g., Polycomb and the S. cerevisiae histone acetyltransferase Esa1) and (iii) proteins with tandem chromodomains (CHD1), all of which participate in epigenetic events.
The Epc-N (Enhancer of Polycomb-N-terminus) domain is defined and described below. The core of this domain is predicted to be a right-handed orthogonal four-helix bundle. It is further defined by the presence of β-strand extensions at both termini, and its conserved, and therefore PSI-BLAST defining, residues suggest a role in protein-protein interaction surface formation, possibly in the context of chromatin binding. The Epc-N domain occurs across eukaryotes, and two distinct Epc-N-containing protein families have been identified. Each of these families is composed of two subfamilies. Members of three of the four identified subfamilies have already been evidenced to participate in epigenetic events, most notably as components of histone acetyltransferase (HAT) complexes. Two of the four subfamilies (III and IV) are further defined by the presence of a five-center-zinc-binding motif in the loop that connects the second and third helices of the four-helix bundle. This motif is composed of a PHD followed by a mononuclear Zn-knuckle, followed by a PHD-like derivative, which will be referred to as the PZPM (PHD/Zn-knuckle/PHD motif). The PZPM is also an evolutionarily conserved translocatable module, and all proteins studied to date that contain this motif have been implicated in histone methylation and/or gene silencing. Therefore, the Epc-N domain emerges as a candidate to be another building block in the limited repertoire of domains that could have affinity for specific epigenetic signatures, and a peculiar phyletic distribution of the four protein subfamilies in which it is found seems to reflect significant discrepancies in how different organisms have evolved to form and interpret epigenetic information.
Epc-N domain discovery and annotation
The hint of a conserved sequence common to some of the proteins described below was first noted by Stankunas et al. when they cloned the Enhancer of Polycomb E(Pc) gene from Drosophila [13]. Later bioinformatic analyses by Koonin and colleagues also suggested the possibility of conserved sequences common to E(Pc) and Lin-49 during their seminal work on domain accretion [14, 15]. Because fewer sequences were available at that time, it appeared that there could be two potentially independent modules, coined EP1 and EP2 [14, 15], and these domains could not be precisely defined. With many more sequences in hand, it is now clear that EP1 and EP2 always co-occur to form a single domain, which will be referred to simply as Epc-N (below).
The five-center-zinc-binding motif of subfamily III and IV Epc-N domains
A PSI-BLAST search (E = 0.001) with the PZPM of human BRD1 [gb | AAH47508] was restricted to the first iteration of human sequences and returned 333 candidates that were analyzed manually for the presence of the entire domain. An alignment with a representative from each protein family is shown in Figure 2A. The alignment suggests that the entire PZPM comprises the verified PHD, followed by a C2HC knuckle/ribbon not described previously, which is followed by a PHD-like derivative. It should be noted that the PHD-like derivative is not identified by algorithm analysis of any sequence as a statistically significant PHD, and that the eighth ligand is an evolutionarily conserved and motif-defining histidine residue. Five groups of proteins in humans were found to contain the PZPM: (i) members of Epc-N subfamily III, (ii) members of Epc-N subfamily IV, (iii) mixed lineage leukemia (MLL) proteins (trithorax homologs) including AF10, AF17 and MLLT6, (iv) Jumonji transcription factors including GASC-1 and (v) NSD1, the nuclear receptor binding SET [Su(var), Enhancer of zeste, Trithorax] domain protein.
It should be noted that the entire PZPM is conserved across all Epc-N subfamily III and IV proteins, as well as in the other described proteins. Therefore, for functional considerations, it should not be viewed as simply a PHD, but as a single, large, evolutionarily translocatable unit that happens to exist as a subdomain in Epc-N subfamily III and IV proteins.
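The single-iteration search described above can be approximated with the NCBI BLAST+ `psiblast` program. The sketch below is illustrative only and is not the procedure used in this work: the file names, the database name (`human_proteins`, assumed to be a locally built human-only protein database) and the hit identifiers in the usage note are hypothetical, and applying E = 0.001 as the reporting threshold (`-evalue`) is an assumption, since the text does not specify which threshold was set.

```python
from typing import Iterable, List, Tuple

E_CUTOFF = 0.001  # E-value stated in the text; its use as a report threshold is an assumption


def build_psiblast_command(query_fasta: str, database: str, out_path: str) -> List[str]:
    """Return an argument list for a one-iteration psiblast run.

    The database is assumed to contain human sequences only (hypothetical setup).
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", "1",   # restrict the search to the first iteration
        "-evalue", str(E_CUTOFF),
        "-outfmt", "6",           # tabular output, default 12 columns
        "-out", out_path,
    ]


def filter_hits(tabular_lines: Iterable[str], cutoff: float = E_CUTOFF) -> List[Tuple[str, float]]:
    """Keep (subject id, E-value) pairs at or below the cutoff.

    In the default tabular layout, column 2 is the subject id and
    column 11 is the E-value.
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue <= cutoff:
            hits.append((subject, evalue))
    return hits
```

For example, `build_psiblast_command("brd1_pzpm.fasta", "human_proteins", "hits.tsv")` yields a list that can be passed to `subprocess.run`, and `filter_hits` can then be applied to the lines of `hits.tsv`. Candidates that pass the cutoff would still need the manual inspection for the complete motif described in the text.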
Domain architectures of Epc-N and PZPM containing proteins
As alluded to above, Epc-N containing proteins can be divided into two major families (based on the presence or absence of the PZPM), each with two subfamilies. Domain architecture diagrams of the four identified Epc-N domain containing protein subfamilies are shown in Figure 3A. All Epc-N domain containing proteins are predicted by PSORT to be nuclear, most contain known chromatin-associated domains, and they range in length from ~400 ([emb | CAB96695] from P. vivax) to over 3200 ([gb | AAS64921], RHINOCEROS, from D. melanogaster) amino acid residues. Epc-N subfamily I proteins are characterized by a C-terminal Epc-N domain followed by a coiled-coil domain at the extreme C-terminus of the polypeptide. Subfamily I can be further subdivided into two groups based on the presence (a) or absence (b) of the aforementioned Agenet domain derivative closer to the N-terminus. Subfamily II proteins are characterized by an N-terminal Epc-N domain and a C-terminal Epc-C (Enhancer of Polycomb C-terminal) domain. This subfamily can also be subdivided into two groups (a) and (b); however, to date a (b) protein has only been found in Drosophila (below). Though they have distinct domain architectures, subfamily I can be viewed as a plant lineage-specific expansion of subfamily II.
As described above, Epc-N subfamilies III and IV are defined by the presence of a PZPM between the second and third helices of the canonical Epc-N domain. Subfamilies III and IV also share the N-terminal positioning of the PZPM-containing Epc-N domain and the periodic occurrence of AT-hook motifs. These subfamilies likely arose from duplication events in a primordial fungal-metazoan-mycetozoan ancestor. The difference between these subfamilies is that subfamily III members are further defined by the presence of a bromodomain (an acetyl-lysine binding four-helix bundle) adjacent and C-terminal to the Epc-N domain, followed by a low complexity region leading up to a PWWP domain at the C-terminus. In addition, there is a C2H2 zinc finger at the N-terminus of select subfamily III proteins (below). By contrast, subfamily IV proteins contain only the PZPM-Epc-N domain followed by a variable length stretch of low complexity sequence.
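The subfamily definitions above amount to a small set of rules on domain composition, which can be sketched as follows. This is an illustration only, not a tool used in this work: the domain labels (for example `"Epc-N(PZPM)"` for an Epc-N domain carrying the PZPM insertion) and the function name are hypothetical, and the rules use only the features named in the text.

```python
from typing import List


def classify_epc_n(architecture: List[str]) -> str:
    """Assign an Epc-N subfamily from an N-to-C ordered list of domain labels.

    Labels are hypothetical: "Epc-N" is the canonical domain and
    "Epc-N(PZPM)" is an Epc-N domain with the PZPM between helices 2 and 3.
    """
    if "Epc-N(PZPM)" in architecture:
        # PZPM family: a bromodomain distinguishes subfamily III from IV.
        return "III" if "Bromodomain" in architecture else "IV"
    if "Epc-N" in architecture:
        if "Epc-C" in architecture:
            return "II"
        idx = architecture.index("Epc-N")
        # Subfamily I: a coiled-coil C-terminal to the Epc-N domain.
        if "Coiled-coil" in architecture[idx + 1:]:
            # Group (a) carries the Agenet derivative N-terminal to Epc-N.
            return "Ia" if "Agenet" in architecture[:idx] else "Ib"
        return "unassigned"
    return "no Epc-N domain"


if __name__ == "__main__":
    examples = {
        "subfamily III-like": ["Epc-N(PZPM)", "Bromodomain", "Low complexity", "PWWP"],
        "subfamily IV-like": ["Epc-N(PZPM)", "Low complexity"],
        "subfamily II-like": ["Epc-N", "Epc-C"],
        "subfamily Ia-like": ["Agenet", "Epc-N", "Coiled-coil"],
    }
    for name, arch in examples.items():
        print(name, "->", classify_epc_n(arch))
```

The rules deliberately ignore optional features such as AT-hooks and the N-terminal C2H2 zinc finger, because the text describes them as occurring only in select members.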
Domain architecture diagrams of PZPM-containing proteins are shown in Figure 3B. As defined above, there are five groups of PZPM proteins in humans: (i) members of Epc-N subfamily III, (ii) members of Epc-N subfamily IV, (iii) various MLL proteins, (iv) various Jumonji transcription factors, and (v) NSD1. Groups (i) and (ii) are as described above. Group (iii) members consist of an N-terminal PZPM and a C-terminal coiled-coil, while group (iv) members are defined by an N-terminal Jumonji domain with C-terminal PZPM followed immediately by two copies of a Tudor domain. Finally, NSD1 (v) is a large protein with an N-terminal PWWP domain followed by a large undefined stretch of sequence leading up to a PZPM, a lone PHD, a second PWWP domain, a SET domain and then a second lone PHD that all occur in rapid succession.
C. elegans PZPM proteins are very similar to those found in humans, and the respective genes are likely orthologs. There are Epc-N subfamily III and IV representatives, an apparent MLL-AF10/AF17 homolog, and an N-terminal Jumonji domain transcription factor. The major differences between C. elegans and humans at this level are that: (i) humans have multiple copies, potentially paralogs, of each PZPM-encoding gene except NSD1 whereas C. elegans has retained only one copy of each gene and (ii) that there is no apparent NSD1 in C. elegans but rather a unique protein (Y59A8A.2) consisting of an N-terminal PZPM and a C-terminal PHD. Since NSD1 contains lone PHD motifs in addition to its PZPM, it is possible that Y59A8A.2 and NSD1 modulate similar functions and that they are in fact encoded by orthologous genes.
As noted above and discussed below, the PZPM is conserved in plants, but not as part of an Epc-N domain. The Arabidopsis genome is completely sequenced, and seven genes were found to encode proteins with the PZPM. Five were trithorax homologs, and two types of domain architectures were apparent in this group. ATX1 and ATX2 are defined by an Agenet variant (interestingly, the same variant as that found in Epc-N subfamily Ia proteins), followed by a PWWP domain, a phenylalanine-tyrosine rich domain, the PZPM and finally a C-terminal SET domain. ATX3, ATX4 and ATX5 are somewhat different and are defined by a PWWP domain followed by a lone PHD, the PZPM and the C-terminal SET domain. The other two PZPM proteins in Arabidopsis feature the PZPM as a stand-alone motif, one with a single copy and one where it has been duplicated.
Phyletic distribution of Epc-N domain containing protein subfamilies
Data mining for the biological roles of Epc-N proteins
Epc-N subfamily III proteins are immediately connected to some level of epigenetic regulation by the presence of their family-defining bromodomains. Somewhat less is currently known about subfamily III proteins than about subfamilies II and IV, but it is known that in humans one subfamily III protein, BR140, is involved in somatic cell development while a presumed paralog, BRL, is most highly expressed in germline tissues. The subfamily III protein in C. elegans is LIN-49. Like BR140, LIN-49 has been shown to be involved in somatic cell development, specifically through the regulation of homeotic gene expression. The RNAi knockdown line of LIN-49 shows post-embryonic growth defects, sterile progeny and uncoordinated movement defects, phenotypes which are also observed in a naturally occurring mutant in which the sixteenth ligand (20 ligands total) of the PZPM has been changed from Cys to Ser, presumably disrupting the motif. Finally, the subfamily III protein of Drosophila, CG1845, was shown to have a two-hybrid interaction with CG16838, an AAA-ATPase thought to function as a chaperone in protein complex assembly-dissolution. Interestingly, the subfamily IV protein in S. cerevisiae, NTO1, also has a two-hybrid interaction with a protein complex assembly chaperone, suggesting that this is a bona fide property of HAT complex regulation.
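The ligand count quoted above is consistent with the five-center description of the PZPM. The sketch below makes the bookkeeping explicit under stated assumptions: each zinc center is taken to be coordinated by four ligands, the PHD and the PHD-like derivative are each assumed to contribute eight ligands (two zinc centers), the C2HC knuckle four (one zinc center), and ligands are assumed to be numbered sequentially from the N-terminus. The names used are hypothetical.

```python
from typing import Tuple

LIGANDS_PER_ZINC = 4  # assumed tetrahedral coordination of each zinc center

# (module name, number of zinc ligands), in N-to-C order; counts are assumptions
MODULES = [
    ("PHD", 8),
    ("Zn knuckle", 4),
    ("PHD-like derivative", 8),
]

TOTAL_LIGANDS = sum(count for _, count in MODULES)
ZINC_CENTERS = TOTAL_LIGANDS // LIGANDS_PER_ZINC


def locate_ligand(n: int) -> Tuple[str, int]:
    """Map the n-th ligand of the PZPM (1-based) to (module, position within module)."""
    if n < 1 or n > TOTAL_LIGANDS:
        raise ValueError(f"ligand index must be between 1 and {TOTAL_LIGANDS}")
    offset = 0
    for name, count in MODULES:
        if n <= offset + count:
            return name, n - offset
        offset += count
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(TOTAL_LIGANDS, ZINC_CENTERS)
    print(locate_ligand(16))
```

Under these assumptions the module totals give 20 ligands and five zinc centers, and the sixteenth ligand falls within the PHD-like derivative rather than the canonical PHD.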
As suggested, there is a bit more information available regarding Epc-N subfamily IV proteins. Subfamily IV proteins do not appear to be essential, and are thus inferred to take on more specialized roles than subfamily II proteins, whose roles seem to be more basic. The NTO1 knockout is viable, and the RNAi of the subfamily IV transcript in C. elegans has a WT phenotype. Subfamily IV proteins have been found associated with two different HATs. The human protein, JADE (of which there are three isoforms), interacts physically with the von Hippel-Lindau tumor suppressor and the H4/H2A HAT TIP60 in kidney tissue [39, 40]. By contrast, NTO1 has a two-hybrid interaction with SAS3, the catalytic subunit of the NuA3 H3 HAT complex, which is involved in gene silencing. NTO1 also has two-hybrid interactions with the protein complex assembly chaperone UMP1 and SLM6, a protein of unknown function. Finally, the subfamily IV protein of Drosophila, RHINOCEROS, regulates Ras pathway genes to restrict epidermal growth factor signaling in the eye, but its molecular mechanism is currently unknown.
Data mining for the biological roles of non-Epc-N PZPM proteins
From the sum of the analyses described above, the Epc-N domain defined here emerges as a crucial component of epigenetic regulation. The Epc-N domain appears to be a protein-protein interaction module, and can be included with the bromodomain and the chromodomain in the small cadre of potential histone code interpreters. Current evidence suggests that the core of this domain is a right-handed orthogonal four-helix bundle. Further, it can occur with a PZPM between the second and third helices, and the nature and apparent positioning of its conserved amino acid residues suggests that it may have intrinsic affinity for chromatin. Several Epc-N-containing proteins have been directly implicated as components of HAT complexes (e.g., NuA4 and TIP60 H4/H2A HATs and NuA3 H3 HAT), but intriguingly all proteins studied with PZPMs are associated with histone methylation and gene silencing. Therefore, PZPM-containing Epc-N proteins may be direct links between histone acetylation and methylation statuses.
The unusual phyletic distribution of Epc-N containing proteins likely reflects significant discrepancies in the way different organisms form, perceive and respond to epigenetic information. Most eukaryotes for which sequence information is available appear to retain at least one subfamily II gene (the notable exceptions are the kinetoplastid and Diplomonad parasites), and the available knockout lines are all homozygous lethal. Both of these observations indicate a rather fundamental role for Epc-N subfamily II proteins in epigenetic structure formation and/or recognition. However, the evolutionary peculiarities of the other three subfamilies suggest that the basic properties of the Epc-N domain can be harnessed for more specialized functions. For example, plants are strikingly different from all other eukaryotes in that they have a unique Epc-N subfamily (I, which appears to be a lineage specific expansion of subfamily II) but lack representatives of two other prominent subfamilies (III and IV). Plant cells have retained totipotency and the ability to dedifferentiate, hinting at the presence of epigenetic regulatory mechanisms different from those found in other complex multicellular organisms. Previously identified differences expected to contribute to these phenomena include a unique class of HD2-type histone deacetylases and an acetylation mark on Lys 20 of H4, which is the site of a methylation mark in animals and fungi [59, 60]. The irregular distribution of the Epc-N domain proteins in plants documented here suggests that it may also have a role in plant-specific epigenetic regulation.
Epc-N subfamily III and IV members apparently participate in specialized epigenetic processes in Fungi, Metazoa and Mycetozoa. Some effects in higher organisms, including those mediated by JADE, BRL and RHINOCEROS, appear to be tissue specific and have localized developmental consequences [36, 39–41]. Most organisms that contain one or more subfamily III proteins have also retained one or more subfamily IV proteins, so while the two subfamilies are similar in some sense (and likely arise from the duplication of a single primordial gene), it is doubtful that their functions are completely overlapping. This is further evidenced by the fact that RNAi of the subfamily III transcript in C. elegans causes severe phenotypes, while RNAi of the subfamily IV transcript does not. It is therefore interesting that Mycetozoa and Fungi Basidiomycota have retained a subfamily III gene but lack a subfamily IV gene, whereas Fungi Ascomycota have retained a subfamily IV gene but lack a subfamily III gene. In budding yeast (an ascomycote), the subfamily IV protein NTO1 is associated with a non-essential HAT involved in gene silencing, and it is therefore likely to play a role in mating-type switching and/or telomere and rDNA maintenance. Thus, the evolutionary event described here suggests that there exists some fundamental and perhaps phylum-defining difference between ascomycotes and basidiomycotes at the level of epigenetic regulation, possibly related to reproduction or telomere maintenance.
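The presence and absence statements made in this discussion can be collected into a small lookup table, which makes the complementary pattern in Fungi easy to query. The sketch below records only what is stated explicitly in the text; lineage and subfamily combinations that the text does not mention are left out rather than guessed, and the structure and function names are hypothetical.

```python
from typing import Dict, List

# True = subfamily reported as present, False = reported as absent.
# Combinations not stated in the text are intentionally omitted.
DISTRIBUTION: Dict[str, Dict[str, bool]] = {
    "Plants": {"I": True, "III": False, "IV": False},
    "Ascomycota": {"III": False, "IV": True},
    "Basidiomycota": {"III": True, "IV": False},
    "Mycetozoa": {"III": True, "IV": False},
    "Kinetoplastids": {"II": False},
    "Diplomonads": {"II": False},
}


def lineages_with(subfamily: str, present: bool = True) -> List[str]:
    """List lineages for which the given subfamily is stated as present (or absent)."""
    return sorted(
        lineage
        for lineage, calls in DISTRIBUTION.items()
        if calls.get(subfamily) is present
    )


if __name__ == "__main__":
    print("III present:", lineages_with("III"))
    print("IV present:", lineages_with("IV"))
    print("II absent:", lineages_with("II", present=False))
```

Querying subfamilies III and IV side by side returns non-overlapping sets of fungal lineages, which is the reciprocal retention pattern discussed above.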
In summary, exhaustive experimental analyses of the Epc-N domain and the proteins in which it is found are anticipated to provide significant insight into both organism and tissue specific differences in epigenetic regulation and basic universal chromosomal processes.
The Epc-N domain is a functionally uncharacterized globular domain found in proteins with known roles in epigenetic processes. This domain appears to be a right-handed orthogonal four-helix bundle that can accommodate a predicted five-center-zinc-binding motif (the PZPM) between its second and third helices. It is possible that this domain has intrinsic affinity for chromatin and that it links histone acetylation and methylation statuses.
I would like to thank Job Dekker, Marian Walhout and anonymous reviewers for very insightful comments that have made this work stronger. I would also like to thank members of the various genome-sequencing consortia whose efforts make this type of work possible.
- Kornberg RD, Thomas JO: Chromatin structure; oligomers of histones. Science. 1974, 184: 865-868.
- Loidl P: A plant dialect of the histone language. Trends Plant Sci. 2004, 9: 84-90. 10.1016/j.tplants.2003.12.007.
- Peterson CL, Laniel MA: Histones and histone modifications. Curr Biol. 2004, 14: R546-R551. 10.1016/j.cub.2004.07.007.
- Margueron R, Trojer P, Reinberg D: The key to development: interpreting the histone code?. Curr Opin Genet Dev. 2005, 15: 163-176. 10.1016/j.gde.2005.01.005.
- Strahl BD, Allis CD: The language of covalent histone modifications. Nature. 2000, 403: 41-45. 10.1038/47412.
- Turner BM: Histone acetylation and an epigenetic code. Bioessays. 2000, 22: 836-845. 10.1002/1521-1878(200009)22:9<836::AID-BIES9>3.0.CO;2-X.
- Jenuwein T, Allis CD: Translating the histone code. Science. 2001, 293: 1074-1080. 10.1126/science.1063127.
- Schreiber SL, Bernstein BE: Signaling network model of chromatin. Cell. 2002, 111: 771-778. 10.1016/S0092-8674(02)01196-0.
- Haynes SR, Dollard C, Winston F, Beck S, Trowsdale J, Dawid IB: The bromodomain: a conserved sequence found in human, Drosophila and yeast proteins. Nucleic Acids Res. 1992, 20: 2603-.
- Dhalluin C, Carlson JE, Zeng L, He C, Aggarwal AK, Zhou M: Structure and ligand of a histone acetyltransferase bromodomain. Nature. 1999, 399: 491-496. 10.1038/20974.
- Jacobson RH, Ladurner AG, King DS, Tjian R: Structure and function of a human TAFII250 double bromodomain module. Science. 2000, 288: 1422-1425. 10.1126/science.288.5470.1422.
- Jacobs SA, Khorasanizadeh S: Structure of HP1 chromodomain bound to a lysine 9-methylated histone H3 tail. Science. 2002, 295: 2080-2083. 10.1126/science.1069473.
- Stankunas K, Berger J, Ruse C, Sinclair DAR, Randazzo F, Brock HW: The enhancer of polycomb gene of Drosophila encodes a chromatin protein conserved in yeast and mammals. Development. 1998, 125: 4055-4066.
- Koonin EV, Aravind L, Kondrashov AS: The Impact of Comparative Genomics on Our Understanding of Evolution. Cell. 2000, 101: 573-576. 10.1016/S0092-8674(00)80867-3.
- International Human Genome Sequencing Consortium: Initial Sequencing and Analysis of the Human Genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Sonnhammer EL, Eddy SR, Birney E, Bateman A, Durbin R: Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res. 1998, 26: 320-322. 10.1093/nar/26.1.320.
- Schultz J, Milpetz F, Bork P, Ponting CP: SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci (USA). 1998, 95: 5857-5864. 10.1073/pnas.95.11.5857.
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ: Jpred: a consensus secondary structure prediction server. Bioinformatics. 1998, 14: 892-893. 10.1093/bioinformatics/14.10.892.
- Notredame C, Higgins DG, Heringa J: T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000, 302: 205-217. 10.1006/jmbi.2000.4042.
- Nakai K, Horton P: PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci. 1999, 24: 34-36. 10.1016/S0968-0004(98)01336-X.
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415: 141-147. 10.1038/415141a.
- Fuchs M, Gerber J, Drapkin R, Sif S, Ikura T, Ogryzko V, Lane WS, Nakatani Y, Livingston DM: The p400 complex is an essential E1A transformation target. Cell. 2001, 106: 297-307. 10.1016/S0092-8674(01)00450-0.
- Galarneau L, Nourani A, Boudreault AA, Zhang Y, Heliot L, Allard S, Savard J, Lane WS, Stillman DJ, Cote J: Multiple links between the NuA4 histone acetyltransferase complex and epigenetic control of transcription. Mol Cell. 2000, 5: 927-937. 10.1016/S1097-2765(00)80258-0.
- Brock HW, Fisher CL: Maintenance of gene expression patterns. Dev Dyn. 2005, 232: 633-655. 10.1002/dvdy.20298.
- Attwooll C, Oddi S, Cartwright P, Prosperini E, Agger K, Steensgaard P, Wagener C, Sardet C, Moroni MC, Helin K: A novel repressive E2F6 complex containing the polycomb group protein, EPC1, that interacts with EZH2 in a proliferation-specific manner. J Biol Chem. 2005, 280: 1199-1208. 10.1074/jbc.M412509200.
- McCullagh P, Chaplin T, Meerabux J, Grenzelias D, Lillington D, Poulsom R, Gregorini A, Saha V, Young BD: The cloning, mapping and expression of a novel gene, BRL, related to the AF10 leukaemia gene. Oncogene. 1999, 18: 7442-7452. 10.1038/sj.onc.1203117.
- Chamberlin HM, Thomas JH: The bromodomain protein LIN-49 and trithorax-related protein LIN-59 affect development and gene expression in Caenorhabditis elegans. Development. 2000, 127: 713-723.
- Chamberlin HM, Brown KB, Sternberg PW, Thomas JH: Characterization of seven genes affecting Caenorhabditis elegans hindgut development. Genetics. 1999, 153: 731-742.
- Zhou MI, Wang H, Ross JJ, Kuzmin I, Xu C, Cohen HT: The von Hippel-Lindau tumor suppressor stabilizes the novel plant homeodomain protein jade-1. J Biol Chem. 2002, 277: 39887-39898. 10.1074/jbc.M205040200.
- Panchenko MV, Zhou MI, Cohen HT: von Hippel-Lindau partner jade-1 is a transcriptional co-activator associated with histone acetyltransferase activity. J Biol Chem. 2004, 279: 56032-56041. 10.1074/jbc.M410487200.
- Voas MG, Rebay I: The novel plant homeodomain protein rhinoceros antagonizes Ras signaling in the Drosophila eye. Genetics. 2003, 165: 1993-2006.
- Lin Y, Ono K, Satoh S, Ishiguro H, Fujita M, Miwa N, Tanaka T, Tsunoda T, Yang K, Nakamura Y, Furukawa Y: Identification of AF17 as a downstream gene of the β-catenin/T-cell factor pathway and its involvement in colorectal carcinogenesis. Cancer Res. 2001, 61: 6345-6349.
- Yang Z, Imoto I, Fukuda Y, Pimkhaokham A, Shimada Y, Imamura M, Sugano S, Nakamura Y, Inazawa J: Identification of a novel gene, GASC-1, within an amplicon at 9p23–24 frequently detected in esophageal cancer cell lines. Cancer Res. 2000, 60: 4735-4739.
- Jaju RJ, Fidler C, Haas OA, Strickson AJ, Watkins F, Clark K, Cross NCP, Cheng J, Aplan PD, Kearney L, Boultwood J, Wainscoat JS: A novel gene, NSD1, is fused to NUP98 in the t(5;11)(q35;p15.5) in de novo childhood acute myeloid leukemia. Blood. 2001, 98: 1264-1267. 10.1182/blood.V98.4.1264.
- Kurotaki N, Imaizumi K, Harada N, Masuno M, Kondoh T, Nagai T, Ohashi H, Naritomi K, Tsukahara M, Makita Y, Sugimoto T, Sonoda T, Hasegawa T, Chinen Y, Tomita HA, Kinoshita A, Mizuguchi T, Yoshiura K, Ohta T, Kishino T, Fukushima Y, Niikawa N, Matsumoto N: Haploinsufficiency of NSD1 causes Sotos syndrome. Nature Genet. 2002, 30: 365-366. 10.1038/ng863.
- Rio M, Clech L, Amiel J, Faivre L, Lyonnet S, Le Merrer M, Odent S, Lacombe D, Edery P, Brauner R, Raoul O, Gosset P, Prieur M, Vekemans M, Munnich A, Colleaux L, Cormier-Daire V: Spectrum of NSD1 mutations in Sotos and Weaver syndromes. J Med Genet. 2003, 40: 436-440. 10.1136/jmg.40.6.436.
- Okada Y, Feng Q, Lin Y, Jiang Q, Li Y, Coffield VM, Su L, Xu G, Zhang Y: hDOT1L links histone methylation to leukemogenesis. Cell. 2005, 121: 167-178. 10.1016/j.cell.2005.02.020.
- Kim TG, Kraus JC, Chen J, Lee Y: JUMONJI, a critical factor for cardiac development, functions as a transcriptional repressor. J Biol Chem. 2003, 278: 42247-42255. 10.1074/jbc.M307386200.
- Kim TG, Chen J, Sadoshima J, Lee Y: Jumonji represses atrial natriuretic factor gene expression by inhibiting transcriptional activities of cardiac transcription factors. Mol Cell Biol. 2004, 24: 10151-10160. 10.1128/MCB.24.23.10151-10160.2004.
- Rayasam GV, Wendling O, Angrand PO, Mark M, Niederreither K, Song L, Lerouge T, Hager GL, Chambon P, Losson R: NSD1 is essential for early post-implantation development and has a catalytically active SET domain. EMBO J. 2003, 22: 3153-3163. 10.1093/emboj/cdg288.
- Alvarez-Venegas R, Pien S, Sadder M, Witmer X, Grossniklaus U, Avramova Z: ATX-1, an Arabidopsis homolog of trithorax, activates flower homeotic genes. Curr Biol. 2003, 13: 627-637. 10.1016/S0960-9822(03)00243-4.
- Kelley LA, MacCallum RM, Sternberg MJ: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol. 2000, 299: 499-520. 10.1006/jmbi.2000.3741.
- McGuffin LJ, Jones DT: Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics. 2003, 19: 874-881. 10.1093/bioinformatics/btg097.
- Shi J, Blundell TL, Mizuguchi K: FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol. 2001, 310: 243-257. 10.1006/jmbi.2001.4762.
- Gough J, Karplus K, Hughey R, Chothia C: Assignment of Homology to Genome Sequences using a Library of Hidden Markov Models that Represent all Proteins of Known Structure. J Mol Biol. 2001, 313: 903-919. 10.1006/jmbi.2001.5080.
- Ginalski K, Elofsson A, Fischer D, Rychlewski L: 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics. 2003, 19: 1015-1018. 10.1093/bioinformatics/btg124.
- Dennis CA, Videler H, Pauptit RA, Wallis R, James R, Moore GR, Kleanthous C: A structural comparison of the colicin immunity proteins Im7 and Im9 gives new insights into the molecular determinants of immunity-protein specificity. Biochem J. 1998, 333: 183-191.
- Pandey R, Muller A, Napoli CA, Selinger DA, Pikaard CS, Richards EJ, Bender J, Mount DW, Jorgensen RA: Analysis of histone acetyltransferase and histone deacetylase families of Arabidopsis thaliana suggests functional diversification of chromatin modification among multicellular eukaryotes. Nucleic Acids Res. 2002, 30: 5036-5055. 10.1093/nar/gkf660.
- Waterborg JH: Identification of five sites of acetylation in alfalfa histone H4. Biochemistry. 1992, 31: 6211-6219. 10.1021/bi00142a006.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.