Characterization of constitutive CTCF/cohesin loci: a possible role in establishing topological domains in mammalian genomes
© Li et al.; licensee BioMed Central Ltd. 2013
Received: 14 March 2013
Accepted: 26 July 2013
Published: 14 August 2013
Recent studies suggested that human/mammalian genomes are divided into large, discrete domains that are units of chromosome organization. CTCF, a CCCTC binding factor, has a diverse role in genome regulation including transcriptional regulation, chromosome-boundary insulation, DNA replication, and chromatin packaging. It remains unclear whether a subset of CTCF binding sites plays a functional role in establishing/maintaining chromatin topological domains.
We systematically analysed the genomic, transcriptomic and epigenetic profiles of the CTCF binding sites in 56 human cell lines from ENCODE. We identified ~24,000 CTCF sites (referred to as constitutive sites) that were bound in more than 90% of the cell lines. Our analysis revealed: 1) constitutive CTCF loci were located in constitutive open chromatin and often co-localized with constitutive cohesin loci; 2) most constitutive CTCF loci were distant from transcription start sites and lacked CpG islands but were enriched with the full-spectrum CTCF motifs: a recently reported 33/34-mer and two other potentially novel (22/26-mer); 3) more importantly, most constitutive CTCF loci were present in CTCF-mediated chromatin interactions detected by ChIA-PET and these pair-wise interactions occurred predominantly within, but not between, topological domains identified by Hi-C.
Our results suggest that the constitutive CTCF sites may play a role in organizing/maintaining the recently identified topological domains that are common across most human cells.
Keywords: CTCF; Cohesin; Constitutive binding site; Chromatin interaction; Topological domain
The CCCTC-binding factor (CTCF) is a C2H2-zinc finger protein with eleven zinc fingers that display close to 100% similarity between mouse, chicken, and human. CTCF has a versatile role in genome regulation including transcriptional regulation, e.g., c-Myc [2, 3], X chromosome inactivation, allele-specific silencing at imprinted loci such as Igf2/H19 [5–8], and regulation of expression of lineage-specific gene clusters such as the β-globin locus and the MHC class II locus. Recently, CTCF has been implicated in splicing through its action on local RNA polymerase II pausing, in trinucleotide repeat instability [12, 13], in DNA replication [14, 15], and in nucleosome positioning [16, 17]. Because of these diverse functional roles in genome regulation, CTCF has been dubbed the “Master Weaver” of the genome.
CTCF sometimes co-localizes with cohesin [19, 20]. Cohesin, a multi-subunit complex, consists of a heterodimer of SMC (structural maintenance of chromosomes) proteins, SMC1 and SMC3, together with Rad21 [RAD21 homolog (S. pombe), also known as Scc1] and STAG (also known as Scc3). Cohesin was initially identified for its role in sister chromatid cohesion [14, 21, 22] but has also been implicated in regulation of gene expression [19, 20, 23–27] and DNA replication [28, 29]. Schmidt et al. have shown that cohesin can also bind to thousands of sites independent of CTCF.
The first genome-wide study of the CTCF binding in human cell lines identified ~14,000 CTCF binding sites . Most of these sites were located far from the annotated transcriptional start sites (TSS). A subsequent analysis of CTCF binding in three human cell lines showed that they had around 40-60% of the CTCF binding sites in common . Recently, Chen et al.  identified ~28,000 constitutive CTCF sites in 19 human cell types and showed that a large proportion of the variable CTCF binding between different cell types is linked to differential DNA methylation. A study of the evolution of CTCF binding in six representative mammals identified thousands of highly conserved, robust and tissue-independent CTCF binding sites . Those studies suggest that, unlike many other transcription factors/proteins, a substantial portion of CTCF sites may be bound across multiple cell lines, i.e., bound constitutively.
The canonical CTCF binding motif is 16 to 20 base pairs (bp) long . Earlier studies suggested CTCF may be capable of binding to sequences of as long as 40–60 bp [2, 3]. Footprinting of CTCF binding to the amyloid precursor protein (APP) promoter confirmed that CTCF can bind to a 40-bp fragment . Recently, Schmidt et al.  identified a 33/34-mer full-spectrum CTCF motif in a subset of evolutionarily conserved CTCF binding loci.
Given its diverse functional roles in genome regulation, different “classes” of CTCF binding sites might exist where each class has a unique co-factor (or a combination of co-factors) and/or has different binding specificity (e.g., canonical vs. full-spectrum). In this computational study, we examine the functional relevance of a unique “class” of CTCF binding sites – those that were constitutively bound across multiple human cell lines and co-localized with constitutively bound cohesin loci. We operationally refer to a binding site that was bound by a protein in 90% or more of available cell lines as a “constitutive” site.
Genome-wide CTCF-mediated chromatin interactions have been mapped using Chromatin Interaction Analysis Paired-End Tags (ChIA-PET) in mouse ES cells  and in K562 and MCF7 cell lines (data available at the ENCODE portal on UCSC genome browser). ChIA-PET identifies specific long-range chromatin interactions where widely separated genomic regions are brought to spatial proximity mediated by a protein detected by ChIP .
Recently, Guelen et al.  identified ~1300 sharply defined large domains defined by interactions with nuclear lamina components. Similarly, Dixon et al.  identified two to three thousand topological domains in multiple cell and tissue types by Hi-C . Both studies suggested that human/mammalian genomes are partitioned into large, discrete domains that are units of chromosome organization. Dixon et al.  and Meuleman et al.  further proposed that these topological domains are common across different cell types and highly conserved across species. Both studies found that boundaries are enriched with CTCF sites. Those studies reinforce the notion that CTCF plays an important role in genome organization.
Using ChIP-seq (chromatin immunoprecipitation with sequencing) datasets from ENCODE  (http://genome.ucsc.edu/ENCODE/), we identified ~24,000 constitutive CTCF binding sites, ~12,000 of which further co-localized with constitutive cohesin loci. Our computational analysis further revealed that the CTCF-mediated chromatin interaction regions detected by ChIA-PET in both K562 and MCF7 cell lines were enriched with sites where constitutive CTCF and constitutive cohesin co-localized . Furthermore, we found that these CTCF-mediated chromatin interactions were predominantly within topological domains rather than between them. Those results suggest that sites with constitutive CTCF plus constitutive cohesin may be involved in establishing/maintaining global chromatin structure that is common across cell lines [23, 24].
Because all analyses involved CTCF binding sites in terms of constitutive sites, we simply refer to constitutive CTCF sites in CTCF peaks as constitutive CTCF sites, with CTCF here being the protein, not the motif. Throughout the manuscript, we use ‘c’ in front of a protein name to denote constitutive and ‘/’ between proteins to denote co-localization or overlap. For instance, a cCTCF/cRad21 site is a constitutive CTCF site within a CTCF peak that also lies within a constitutive Rad21 peak. We use ‘site’ to indicate a motif site, as distinct from a ‘peak’ from ChIP-seq, and we refer to a broader region near a binding site as a locus.
Constitutive CTCF sites
Constitutive CTCF/cohesin sites
Table 1. Total CTCF binding sites in various classes of ChIP-seq loci. Columns: number of cell lines; total binding sites. Rows include: CTCF in one cell line. (Table values not recovered.)
Among the 23,709 cCTCF sites, 12,014 overlapped with cCohesin (cRad21/cSmc3) sites. Only 925 of the remaining 11,695 cCTCF sites did not overlap with any Rad21/Smc3 (cohesin) sites in the four cell lines with Smc3 ChIP-seq data available. We refer to these 925 sites as cCTCF-non-cohesin sites (cCTCF sites without any cohesin) and contrast their properties with those of cCTCF/cCohesin sites. Those cCTCF sites that overlap with cohesin loci in one, two or three of the four cell lines are not classified as either cCTCF/cCohesin or cCTCF-non-cohesin. (Additional file 4 lists all sites that were constitutive by our criterion.)
cCTCF sites are more conserved than non-constitutive CTCF sites
Most cCTCF/cCohesin sites are distant from promoters and lack CpG islands
cCTCF/cCohesin loci are associated with constitutive open chromatin
We examined other proteins for possible association with cCTCF loci. In each cell line separately, we counted how often a protein had the center of its ChIP-seq peak within ±200 bp of CTCF sites in three classes: the 12,014 cCTCF/cCohesin sites, the 925 cCTCF-non-cohesin sites, and all CTCF sites bound in the given cell line but excluding the cCTCF sites. We used all ENCODE TFBS datasets (encodeHaib, encodeSydh, encodeUTA, and encodeUW), histone modification datasets (encodeHistoneBroad, encodeHistoneUW, encodeHistoneUTA), and open chromatin datasets (encodeDukeDNase, encodeUWDNase, and encodeUNCFaire). The total number of histone marks analysed was 11 (Additional file 1). CTCF, Rad21, and Smc3 were not included in the analysis. In total, we included 1011 ChIP-seq datasets representing 68 factors in this analysis and combined results from all cell lines for the same factor/feature.
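The counting step described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `count_sites_with_nearby_peak` is hypothetical, and the sketch assumes that all coordinates lie on a single chromosome and that each CTCF site is represented by one position.

```python
from bisect import bisect_left


def count_sites_with_nearby_peak(site_positions, peak_centers, window=200):
    """Count sites that have at least one peak center within +/- window bp.

    Assumption (not from the paper): all coordinates are on one chromosome
    and each site is summarized by a single integer position.
    """
    centers = sorted(peak_centers)
    n_hits = 0
    for pos in site_positions:
        # Index of the first peak center that is >= pos - window.
        i = bisect_left(centers, pos - window)
        if i < len(centers) and centers[i] <= pos + window:
            n_hits += 1
    return n_hits
```

In practice the same count would be made per chromosome and per site class, then combined across cell lines for each factor.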
cCTCF/cCohesin loci are enriched with the 33/34-mer and 20/26-mer motifs
The motif logo of the extended cCTCF/cCohesin sites (30 bp at each side) showed information flanking the 16-bp core CTCF motif (data not shown), indicative of the existence of additional motifs. To discover those motifs, we used the 12,014 cCTCF/cCohesin sequences. At each of the 30+30+16-5+1=72 positions, we counted the number of sequences in which each of the 1,024 possible k-mers (k=5) occurred, following a previously described approach. The counts were then ranked, and the top 50 k-mers at each side of the core CTCF motif were combined separately to create the composite motif for that side, using the position-specific k-mer frequency as the weight.
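The position-specific k-mer counting can be sketched as follows. This is an illustrative Python sketch, not the original implementation; the function name `positional_kmer_counts` is hypothetical, and the sketch assumes that all input sequences are aligned on the core motif and have the same length (76 bp in the setting above, giving 76 - 5 + 1 = 72 start positions).

```python
from collections import Counter


def positional_kmer_counts(sequences, k=5):
    """Return a list of Counters, one per start position.

    counts[i][kmer] is the number of sequences in which `kmer`
    starts at position i. All sequences must have the same length.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    n_positions = length - k + 1
    counts = [Counter() for _ in range(n_positions)]
    for seq in sequences:
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        seq = seq.upper()
        for i in range(n_positions):
            counts[i][seq[i:i + k]] += 1
    return counts
```

Ranking the (position, k-mer) counts and combining the top-ranked k-mers on each side of the core motif would then be a separate step that this sketch does not attempt.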
Table 2. Proportions of the 33/34-mer and the 20/26-mer motifs in various classes of CTCF loci. Rows include: all CTCF sites. (Table values not recovered.)
Recently, additional CTCF motifs such as (C6D, C7D, C8D, and U5C7D) have been reported . However, the proportions of those motifs in the context of the core motif are low ranging from 0.3% to 4% in CD43 cells. We estimated that the proportions of those motifs in both cCTCF and non-constitutive CTCF loci were ~0.5%.
cCTCF is enriched in CTCF-mediated chromatin interactions
Table 3. Proportions of various CTCF binding sites contained within the CTCF-mediated interaction regions from ChIA-PET. Column: sites in interaction. Rows include: all CTCF excluding cCTCF; cCTCF excluding cCohesin. (Table values not recovered.)
Table 4. Proportions of CTCF-mediated interactions involving cCTCF and non-constitutive CTCF sites. Columns: category of interaction; CTCF type in region 1/2; CTCF type in region 2/1. Panels recovered: (B) MCF7 replicate 1; (C) MCF7 replicate 2. Rows include: any CTCF excluding cCTCF. (Table values not recovered.)
Because of the strong association between cCTCF and cCohesin in K562 and MCF7 cell lines, we found that the odds of ChIA-PET detected interactions were approximately 3 times greater among the cCTCF/cCohesin sites than among the cCTCF sites without cCohesin (p-value≈0) (Table 3).
Interplay between topological domains and cCTCF sites
Two recent studies [23, 36] suggested that human/mammalian genomes are divided into large, discrete domains that are units of chromosome organization. Dixon et al.  further proposed that the topological domains are common across different cell types and highly conserved across species. Those results, together with our ChIA-PET results, suggest that chromatin topological domains and CTCF-mediated chromatin interactions may be intrinsically linked.
CTCF is a multi-functional protein that has been implicated in transcriptional regulation, insulation, DNA replication, X-chromosome inactivation, splicing, chromatin packaging and many other processes. CTCF binding sites are widespread in genomes from fly to humans [1, 18]. Earlier, several genome-wide studies identified ~14,000 to ~27,000 CTCF binding sites in several human cell lines. Those studies also showed that 40-60% of the CTCF sites in the cell lines studied were invariant across cell types [17, 31]. Many CTCF binding sites were also computationally identified and found to be conserved [17, 31, 33, 50, 51]. However, it remained unclear how many CTCF binding sites are present in the human genome and what proportion of them is constitutively bound across most cell lines/tissues. A comprehensive CTCF binding site database containing more than 15 million sequences in 10 species has been recently updated to include long-range chromatin interaction data mediated by CTCF, thereby facilitating analyses like ours in non-human species.
Our analysis of 112 ENCODE CTCF ChIP-seq datasets representing 56 human cell lines suggests that there might be as many as 450,000 CTCF binding sites in the human genome. Nearly half were found in CTCF peaks in only one of the 56 cell lines. About a quarter million of the CTCF sites were found in CTCF peaks in more than one of the 56 cell lines. Moreover, ~24,000 CTCF binding sites were found in CTCF peaks in more than 90% (at least 51 of 56) cell lines, suggesting that those constitutive CTCF sites may be implicated in some fundamental biological process/function for most or all cell lines.
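The 90% criterion can be made concrete with a short Python sketch. The function names are hypothetical and the sketch is not the code used in the study. It implements "bound in at least 90% of cell lines"; with 56 cell lines this threshold is 51, the same value quoted above.

```python
def min_cell_lines(n_cell_lines, min_percent=90):
    """Smallest number of cell lines that is at least min_percent of the total.

    Integer arithmetic is used to avoid floating-point rounding.
    """
    return (min_percent * n_cell_lines + 99) // 100


def constitutive_sites(site_to_cell_lines, n_cell_lines, min_percent=90):
    """Return the sites bound in at least min_percent of the cell lines.

    site_to_cell_lines maps a site identifier to the cell lines in which
    the site was found in a peak (hypothetical input format).
    """
    threshold = min_cell_lines(n_cell_lines, min_percent)
    return {
        site
        for site, lines in site_to_cell_lines.items()
        if len(set(lines)) >= threshold
    }
```

For example, a site found in 51 of 56 cell lines would be kept, whereas one found in 50 would not.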
Of course, the exact numbers of cCTCF sites identified by our methods depend on thresholds used for making decisions. In our analysis, we trimmed/extended all peaks to 200 bp in length from the center. Using 300 bp instead increased by 1,640 the number of CTCF sites declared constitutive. Including these additional sites in our analysis of ChIA-PET interactions yielded results substantially the same as those in Table 4. In our analysis, a p-value cut-off of 0.0005 on the PWM score identified a CTCF binding site in 80-95% of the peaks. Adjusting the cut-off would certainly affect the number of CTCF sites identified and declared constitutive; but, like changing the peak length, changing this cut-off seems unlikely to influence our results about enrichment and our overall conclusions about the role of cCTCF sites.
Because many datasets used in our analysis were from cancer cell lines which often carry genetic and chromatin aberrations, we looked for evidence that cCTCF sites might diverge between cancer and normal cell lines. We identified 27,735, 28,662, and 27,774 cCTCF sites in recently deposited CTCF ChIP-seq from 23 cancer cell lines, 20 normal cell lines, and 19 cell lines with unknown karyotypes, respectively . Not only did these three groups have similar numbers of cCTCF sites, they had 19,279 (80.5% to 83.2%) cCTCF sites in common, indicating that cell origins have little effect on the number or locations of cCTCF sites.
The nature of ChIP-seq experiments is to capture a snapshot of protein binding in time. Thus, the sites that we define as constitutive because they are bound in over 90% of cell lines are likely sites where a protein spends most of its time in the bound state: perhaps an individual binding event of long duration, or perhaps frequent bouts of binding/unbinding with the bound state predominating. Long-duration binding might be attributed to strong binding whereas frequent binding/unbinding would not be. Thus, the constitutive sites that we detect should not correspond exactly to sites with strong binding, though different binding motifs (canonical vs. full-spectrum) might be correlated with binding strength. On the other hand, one can imagine sites in the genome where a protein binds relatively briefly but where the site is bound at some time in every tissue or cell line. Such a site would theoretically meet our definition of ‘constitutive’ but would go undetected by our analysis, because ChIP-seq snapshots would be very unlikely to capture short-term binding at the same site in multiple cell lines.
Strong binding may occur at constitutive sites, but it may not be the only explanation for their existence. We recently developed an alternative method for identifying constitutive sites using peak data only (without motif search) (manuscript in preparation). We identified constitutive sites for 22 factors with ChIP-seq data in more than six cell lines. We found that the number of constitutive sites varies between factors, from a few to many thousands. It is unlikely that factors that bind to the highest number of constitutive sites (e.g., CTCF and Rad21) are strong binders whereas those that bind to the fewest constitutive sites (e.g., JunD) are weaker binders. We also found that the target genes of the constitutive Pol II sites are highly enriched for gene ontology biological processes such as metabolism and cell cycle (data not shown). Together, those results strongly suggest that the constitutive sites are biologically meaningful.
Because of CTCF’s diverse roles in genome regulation, different “classes” of CTCF binding sites might exist to carry out different functional roles. Such classes might differ in their co-factors and/or binding strength and specificity (e.g., canonical vs. full-spectrum motifs). In this study, we focused on the class of CTCF binding sites that are constitutively bound and co-localized with the constitutive cohesin loci and compared it to a class of constitutive CTCF binding sites without cohesin. We examined the genomic features, transcriptional landscape and epigenetic environments of those sites to gain insights into their functional relevance. Our analysis not only included many more datasets but also was more comprehensive than the earlier analyses of CTCF binding sites [16, 31, 32, 51, 53].
We identified ~12,000 constitutive CTCF binding sites co-localized with constitutive cohesin loci. The majority of these cCTCF/cCohesin sites were located ≥ 5 kb from the TSS in introns or in intergenic regions that lacked CpG islands. Furthermore, the cCTCF/cCohesin loci were enriched in the H3K4me1 mark with well-positioned nucleosomes (Additional file 1). A substantial number of the cCTCF sites overlapped with cohesin in one or more cell lines without meeting the criterion that the corresponding Rad21 and Smc3 peaks were in ≥ 90% of available cell lines. In contrast, few cCTCF sites did not co-localize with cohesin loci in any cell line.
Our analysis of the constitutive sites is limited by the number of cell lines studied; some factors have data from only a limited number of cell lines. As data from additional cell lines become available, some of the cCTCF/cCohesin sites will no longer be designated as constitutive. Although the cCTCF sites were found in at least 51 of the 56 cell lines, constitutive cohesin was defined via Rad21 and Smc3 peaks, which were identified in only 6 and 4 cell lines, respectively.
Numerous studies have shown that CTCF cooperates with cohesin to contribute to DNA loop formation and thereby to regulate gene expression and chromatin interactions [18–20, 23–26, 48, 49, 54], DNA replication, and RNA pol II pausing. Our computational analysis revealed that the strength of association between CTCF and cohesin increases when both sites/loci are constitutive; the same holds for CTCF and Znf143 (Additional file 1 and Additional file 2: Table S2), and for CTCF, cohesin, and Znf143 (Additional file 1 and Additional file 2: Table S3).
A footprinting study of CTCF binding to the promoter of the APP gene showed that the binding of the full-length CTCF protein generated a DNase I protected region covering 40 bp . Subsequent motif analysis  in a set of evolutionarily conserved CTCF sites identified ~5,000 33/34-mer full-spectrum CTCF binding sites. We independently identified the same 33/34-mer motifs in the set of cCTCF/cCohesin loci. Furthermore, we also identified two potentially novel 20/26-mer CTCF motifs (Figure 6). Whether those full-spectrum motifs function in transcriptional regulation or in mediating chromatin-chromatin interactions, or both, remains unclear.
Using ENCODE ChIP-seq data we identified ~450,000 CTCF binding sites in CTCF peaks from 56 cell lines. We also identified ~24,000 cCTCF and ~12,000 cCTCF/cCohesin binding sites. The cCTCF sites were located in unique genomic environments and were over-represented in CTCF-mediated global chromatin interactions that are predominantly within, but not between, the proposed topological domains. We suggest that CTCF and cohesin cooperate in those loci to establish/maintain the “common” chromatin structure in most human cell lines.
We downloaded all ChIP-seq data defined as either narrow or broad peaks from the ENCODE portal at the UCSC genome browser (http://genome.ucsc.edu/ENCODE/downloads.html). We extended/trimmed all TFBS ChIP-seq peaks to 200 bp in length from the center of the peak. All genomic sequence data, such as CpG islands, were downloaded from the UCSC genome browser. All data were in GRCh37 (hg19) assembly.
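The peak extension/trimming step can be sketched in Python as follows. This is an illustration only: the function name `resize_peak` is hypothetical, and the sketch assumes 0-based, half-open (BED-style) coordinates and interprets "200 bp in length from the center" as a total length of 200 bp centered on the peak midpoint.

```python
def resize_peak(start, end, length=200):
    """Return a fixed-length (start, end) interval centered on the peak midpoint.

    Assumptions (not stated in the paper): coordinates are 0-based and
    half-open, and `length` is the total width of the resized peak.
    """
    center = (start + end) // 2
    # Clamp at the chromosome start so coordinates never become negative.
    new_start = max(0, center - length // 2)
    return new_start, new_start + length
```

A 500-bp peak is therefore trimmed and a 50-bp peak is extended, both to 200 bp around the same center.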
Predicting CTCF binding sites in ChIP-seq data
For each ChIP-seq dataset, we predicted the location of a CTCF binding site in each peak using the GADEM software . The position weight matrix (PWM) model for CTCF was obtained from earlier de novo analysis  of the CTCF ChIP-seq datasets . The p-value for the PWM score cutoff was set to 0.0005 to identify well-defined CTCF sites. For each ChIP-seq peak, we selected the single highest scoring site that passed the p-value cutoff for the PWM score as the binding site. We found a CTCF binding site in 80-95% of the CTCF ChIP-seq peaks in each dataset. We combined binding sites from replicate experiments for any cell line and retained only the unique ones. Similarly, we predicted the CTCF binding sites in all other ChIP-seq datasets from ENCODE. The coordinates of all unique CTCF binding sites identified in the 112 CTCF ChIP-seq datasets in 56 cell lines are provided in Additional file 5.
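The rule of keeping the single highest-scoring site per peak can be illustrated with a minimal Python sketch. It does not call GADEM and is not the original code: the function name `best_site_per_peak` and the input format (tuples of peak identifier, position, PWM score and p-value) are hypothetical, and the sketch assumes that a hit passes when its p-value is at or below the cut-off.

```python
def best_site_per_peak(hits, p_cutoff=0.0005):
    """Select the highest-scoring motif hit in each peak.

    hits: iterable of (peak_id, position, score, p_value) tuples
    (hypothetical format). Hits with p_value above p_cutoff are ignored.
    Returns a dict mapping peak_id to (position, score).
    """
    best = {}
    for peak_id, position, score, p_value in hits:
        if p_value > p_cutoff:
            continue
        if peak_id not in best or score > best[peak_id][1]:
            best[peak_id] = (position, score)
    return best
```

Peaks with no hit passing the cut-off are simply absent from the result, which corresponds to the 5-20% of peaks in which no CTCF site was found.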
Motif conservation analysis
We extracted the 46-way multiZ alignments (hg19) for the 23,709 cCTCF binding sites (16 bp) plus the 10 bp flanking regions using the Galaxy “Extract MAF blocks given a set of genomic intervals” tool (http://main.g2.bx.psu.edu/). Multiple blocks in the Galaxy output were merged using custom Python code. For each multiZ alignment, we scanned each sequence in the alignment for a CTCF binding site using the same PWM and p-value cutoff (0.0005) as before. We then counted the number of sequences (equivalently, species) in each alignment containing a CTCF binding site and used that number as a surrogate for conservation. Similarly, we randomly selected 23,709 non-constitutive CTCF binding sites that were identified in 2–10 cell lines and repeated the above analysis.
Overlap with ChIA-PET interaction data
We downloaded genome-wide CTCF-mediated chromatin interactions identified by ChIA-PET in K562 and MCF7 cell lines from ENCODE/GIS-Ruan (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGisChiaPet/). The CTCF binding sites used were those we predicted from ENCODE CTCF ChIP-seq peaks in K562 and MCF7 as described above. There were 25,304 unique CTCF-mediated interactions from cell line K562. For cell line MCF7, two replicate experiments yielded 50,498 and 20,140 unique CTCF-mediated interactions, respectively. Each interaction is defined by a pair of genomic coordinates, referred to herein as region1 and region2, respectively. Since an interaction has no directionality, the order of the two regions is irrelevant. It is worth pointing out that a region in one interaction pair may overlap with region(s) in another.
Proportions of CTCF sites in ChIA-PET detected interactions
We found 23,577 cCTCF and 81,464 CTCF excluding cCTCF (non-constitutive CTCF) binding sites in cell line K562 based on the ChIP-seq data. We then counted the number of cCTCF and non-constitutive CTCF sites contained within any regions in the ChIA-PET interaction data.
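Counting the sites contained within any interaction region can be sketched as below. The function name `count_sites_in_regions` and the tuple format are hypothetical, and the sketch assumes that "contained within" means the whole site interval lies inside a single region. A simple linear scan is used for clarity; an interval index would be preferable for genome-scale data.

```python
from collections import defaultdict


def count_sites_in_regions(sites, regions):
    """Count sites fully contained in at least one interaction region.

    sites, regions: iterables of (chrom, start, end) tuples
    (hypothetical format). Regions may overlap one another.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    n_contained = 0
    for chrom, start, end in sites:
        if any(r_start <= start and end <= r_end
               for r_start, r_end in by_chrom.get(chrom, ())):
            n_contained += 1
    return n_contained
```

Applying the function separately to the cCTCF and the non-constitutive CTCF sites, and dividing by the respective totals, would give proportions of the kind reported in Table 3.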
Proportion of ChIA-PET detected interactions involving CTCF sites
A ChIA-PET interaction region may contain a cCTCF site, a non-constitutive CTCF site, or neither. Thus, a ChIA-PET interaction pair may be one of six possible types: cCTCF and cCTCF, cCTCF and non-constitutive CTCF, cCTCF and neither, non-constitutive CTCF and non-constitutive CTCF, non-constitutive CTCF and neither, neither and neither (Table 4). Since a region in the interaction pair may contain multiple cCTCF and/or non-constitutive CTCF sites, we assigned all types of interaction possible for the pair of regions and counted them proportionally. For example, if region 1 contained one cCTCF and one non-constitutive CTCF site and region 2 contained one non-constitutive CTCF site, we assigned one-half count to the cCTCF and non-constitutive CTCF interaction and one-half count to the non-constitutive CTCF and non-constitutive CTCF interaction. This way, each interaction pair in the original ChIA-PET data contributes equally and the sum of the counts equals the total number of interaction pairs in the original ChIA-PET data.
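The proportional counting can be illustrated with a short Python sketch. It is one possible reading of the description above, not the original code: each region is reduced to the set of site types it contains, every combination of one type from each region receives an equal share of the pair's single count, and unordered type pairs are pooled. The function names and the input format are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def region_types(n_constitutive, n_nonconstitutive):
    """List the site types present in a region, or ['neither'] if it has none."""
    types = []
    if n_constitutive > 0:
        types.append("cCTCF")
    if n_nonconstitutive > 0:
        types.append("non-cCTCF")
    return types or ["neither"]


def interaction_type_counts(pairs):
    """Distribute one count per interaction pair over its possible types.

    pairs: iterable of ((c1, nc1), (c2, nc2)), the numbers of cCTCF and
    non-constitutive CTCF sites in region 1 and region 2 (hypothetical format).
    Returns a Counter keyed by sorted (type, type) tuples with Fraction values.
    """
    totals = Counter()
    for region1, region2 in pairs:
        combos = list(product(region_types(*region1), region_types(*region2)))
        weight = Fraction(1, len(combos))
        for combo in combos:
            # Interactions have no direction, so the type pair is sorted.
            totals[tuple(sorted(combo))] += weight
    return totals
```

For the worked example in the text, `interaction_type_counts([((1, 1), (0, 1))])` assigns one-half to the cCTCF and non-constitutive pair and one-half to the non-constitutive and non-constitutive pair, and the counts always sum to the number of interaction pairs supplied.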
CTCF: CCCTC binding factor
Rad21: RAD21 homolog (S. pombe)
SMC1: Structural maintenance of chromosomes 1
SMC3: Structural maintenance of chromosomes 3
cCTCF: Constitutive CTCF sites
cRad21: Constitutive Rad21 site
cSmc3: Constitutive Smc3 site
cCTCF/cCohesin: Constitutive CTCF sites that overlap with constitutive cohesin loci
cCTCF-non-cohesin: Constitutive CTCF sites that did not overlap with either Rad21 or Smc3 loci
We thank Karen Adelman, Kasia Bebenek, Tom Kunkel, Daniel Menendez, Mike Resnick, and Paul Wade for critical comments and suggestions in the earlier version of the manuscript. We thank the Computational Biology Facility at NIEHS for computing time and support.
This research was supported by Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (ES101765).
- Ohlsson R, Renkawitz R, Lobanenkov V: CTCF is a uniquely versatile transcription regulator linked to epigenetics and disease. Trends Genet. 2001, 17 (9): 520-527. 10.1016/S0168-9525(01)02366-6.View ArticlePubMedGoogle Scholar
- Klenova EM, Nicolas RH, Paterson HF, Carne AF, Heath CM, Goodwin GH, Neiman PE, Lobanenkov VV: CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol Cell Biol. 1993, 13 (12): 7612-7624.PubMed CentralView ArticlePubMedGoogle Scholar
- Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, Goodwin GH: A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5′-flanking sequence of the chicken c-myc gene. Oncogene. 1990, 5 (12): 1743-1753.PubMedGoogle Scholar
- Xu N, Donohoe ME, Silva SS, Lee JT: Evidence that homologous X-chromosome pairing requires transcription and Ctcf protein. Nat Genet. 2007, 39 (11): 1390-1396. 10.1038/ng.2007.5.View ArticlePubMedGoogle Scholar
- Bell AC, Felsenfeld G: Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature. 2000, 405 (6785): 482-485. 10.1038/35013100.View ArticlePubMedGoogle Scholar
- Kurukuti S, Tiwari VK, Tavoosidana G, Pugacheva E, Murrell A, Zhao Z, Lobanenkov V, Reik W, Ohlsson R: CTCF binding at the H19 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proc Natl Acad Sci USA. 2006, 103 (28): 10684-10689. 10.1073/pnas.0600326103.PubMed CentralView ArticlePubMedGoogle Scholar
- Murrell A, Heeson S, Reik W: Interaction between differentially methylated regions partitions the imprinted genes Igf2 and H19 into parent-specific chromatin loops. Nat Genet. 2004, 36 (8): 889-893. 10.1038/ng1402.View ArticlePubMedGoogle Scholar
- Yoon YS, Jeong S, Rong Q, Park KY, Chung JH, Pfeifer K: Analysis of the H19ICR insulator. Mol Cell Biol. 2007, 27 (9): 3499-3510. 10.1128/MCB.02170-06.PubMed CentralView ArticlePubMedGoogle Scholar
- Splinter E, Heath H, Kooren J, Palstra RJ, Klous P, Grosveld F, Galjart N, de Laat W: CTCF mediates long-range chromatin looping and local histone modification in the beta-globin locus. Genes Dev. 2006, 20 (17): 2349-2354. 10.1101/gad.399506.PubMed CentralView ArticlePubMedGoogle Scholar
- Majumder P, Gomez JA, Chadwick BP, Boss JM: The insulator factor CTCF controls MHC class II gene expression and is required for the formation of long-distance chromatin interactions. J Exp Med. 2008, 205 (4): 785-798. 10.1084/jem.20071843.PubMed CentralView ArticlePubMedGoogle Scholar
- Shukla S, Kavak E, Gregory M, Imashimizu M, Shutinoski B, Kashlev M, Oberdoerffer P, Sandberg R, Oberdoerffer S: CTCF-promoted RNA polymerase II pausing links DNA methylation to splicing. Nature. 2011, 479 (7371): 74-79. 10.1038/nature10442.View ArticlePubMedGoogle Scholar
- Cleary JD, Tome S, Lopez Castel A, Panigrahi GB, Foiry L, Hagerman KA, Sroka H, Chitayat D, Gourdon G, Pearson CE: Tissue- and age-specific DNA replication patterns at the CTG/CAG-expanded human myotonic dystrophy type 1 locus. Nat Struct Mol Biol. 2010, 17 (9): 1079-1087.View ArticlePubMedGoogle Scholar
- Libby RT, Hagerman KA, Pineda VV, Lau R, Cho DH, Baccam SL, Axford MM, Cleary JD, Moore JM, Sopher BL: CTCF cis-regulates trinucleotide repeat instability in an epigenetic manner: a novel basis for mutational hot spot determination. PLoS Genet. 2008, 4 (11): e1000257-10.1371/journal.pgen.1000257.PubMed CentralView ArticlePubMedGoogle Scholar
- Guillou E, Ibarra A, Coulon V, Casado-Vela J, Rico D, Casal I, Schwob E, Losada A, Mendez J: Cohesin organizes chromatin loops at DNA replication factories. Genes Dev. 2010, 24 (24): 2812-2822. 10.1101/gad.608210.PubMed CentralView ArticlePubMedGoogle Scholar
- Sherwood R, Takahashi TS, Jallepalli PV: Sister acts: coordinating DNA replication and cohesion establishment. Genes Dev. 2010, 24 (24): 2723-2731. 10.1101/gad.1976710.PubMed CentralView ArticlePubMedGoogle Scholar
- Fu Y, Sinha M, Peterson CL, Weng Z: The insulator binding protein CTCF positions 20 nucleosomes around its binding sites across the human genome. PLoS Genet. 2008, 4 (7): e1000138-10.1371/journal.pgen.1000138.PubMed CentralView ArticlePubMedGoogle Scholar
- Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, Ren B: Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007, 128 (6): 1231-1245. 10.1016/j.cell.2006.12.048.
- Phillips JE, Corces VG: CTCF: master weaver of the genome. Cell. 2009, 137 (7): 1194-1211. 10.1016/j.cell.2009.06.001.
- Parelho V, Hadjur S, Spivakov M, Leleu M, Sauer S, Gregson HC, Jarmuz A, Canzonetta C, Webster Z, Nesterova T: Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell. 2008, 132 (3): 422-433. 10.1016/j.cell.2008.01.011.
- Rubio ED, Reiss DJ, Welcsh PL, Disteche CM, Filippova GN, Baliga NS, Aebersold R, Ranish JA, Krumm A: CTCF physically links cohesin to chromatin. Proc Natl Acad Sci USA. 2008, 105 (24): 8309-8314. 10.1073/pnas.0801273105.
- Dorsett D: Cohesin: genomic insights into controlling gene transcription and development. Curr Opin Genet Dev. 2011, 21 (2): 199-206. 10.1016/j.gde.2011.01.018.
- Michaelis C, Ciosk R, Nasmyth K: Cohesins: chromosomal proteins that prevent premature separation of sister chromatids. Cell. 1997, 91 (1): 35-45. 10.1016/S0092-8674(01)80007-6.
- Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B: Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012, 485 (7398): 376-380. 10.1038/nature11082.
- Handoko L, Xu H, Li G, Ngan CY, Chew E, Schnapp M, Lee CW, Ye C, Ping JL, Mulawadi F: CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet. 2011, 43 (7): 630-638. 10.1038/ng.857.
- Hou C, Dale R, Dean A: Cell type specificity of chromatin organization mediated by CTCF and cohesin. Proc Natl Acad Sci USA. 2010, 107 (8): 3651-3656. 10.1073/pnas.0912087107.
- Wendt KS, Yoshida K, Itoh T, Bando M, Koch B, Schirghuber E, Tsutsumi S, Nagae G, Ishihara K, Mishiro T: Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature. 2008, 451 (7180): 796-801. 10.1038/nature06634.
- Faure AJ, Schmidt D, Watt S, Schwalie PC, Wilson MD, Xu H, Ramsay RG, Odom DT, Flicek P: Cohesin regulates tissue-specific expression by stabilising highly occupied cis-regulatory modules. Genome Res. 2012, 22 (11): 2163-2175. 10.1101/gr.136507.111.
- Nasmyth K: Cohesin: a catenase with separate entry and exit gates?. Nat Cell Biol. 2011, 13 (10): 1170-1177. 10.1038/ncb2349.
- Seitan VC, Merkenschlager M: Cohesin and chromatin organisation. Curr Opin Genet Dev. 2011, 22 (2): 93-100.
- Schmidt D, Schwalie PC, Ross-Innes CS, Hurtado A, Brown GD, Carroll JS, Flicek P, Odom DT: A CTCF-independent role for cohesin in tissue-specific transcription. Genome Res. 2010, 20 (5): 578-588. 10.1101/gr.100479.109.
- Cuddapah S, Jothi R, Schones DE, Roh TY, Cui K, Zhao K: Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 2009, 19 (1): 24-32.
- Chen H, Tian Y, Shu W, Bo X, Wang S: Comprehensive identification and annotation of cell type-specific and ubiquitous CTCF-binding sites in the human genome. PLoS One. 2012, 7 (7): e41374. 10.1371/journal.pone.0041374.
- Schmidt D, Schwalie PC, Wilson MD, Ballester B, Goncalves A, Kutter C, Brown GD, Marshall A, Flicek P, Odom DT: Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages. Cell. 2012, 148 (1–2): 335-348.
- Quitschke WW, Taheny MJ, Fochtmann LJ, Vostrov AA: Differential effect of zinc finger deletions on the binding of CTCF to the promoter of the amyloid precursor protein gene. Nucleic Acids Res. 2000, 28 (17): 3370-3378. 10.1093/nar/28.17.3370.
- Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed YB, Orlov YL, Velkov S, Ho A, Mei PH: An oestrogen-receptor-alpha-bound human chromatin interactome. Nature. 2009, 462 (7269): 58-64. 10.1038/nature08497.
- Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W: Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature. 2008, 453 (7197): 948-951. 10.1038/nature06947.
- Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science (New York, NY). 2009, 326 (5950): 289-293. 10.1126/science.1181369.
- Meuleman W, Peric-Hupkes D, Kind J, Beaudry JB, Pagie L, Kellis M, Reinders M, Wessels L, van Steensel B: Constitutive nuclear lamina-genome interactions are highly conserved and associated with A/T-rich sequence. Genome Res. 2013, 23 (2): 270-280. 10.1101/gr.141028.112.
- The ENCODE Project Consortium: The ENCODE (ENCyclopedia Of DNA Elements) project. Science (New York, NY). 2004, 306 (5696): 636-640.
- Wang H, Maurano MT, Qu H, Varley KE, Gertz J, Pauli F, Lee K, Canfield T, Weaver M, Sandstrom R: Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res. 2012, 22 (9): 1680-1688. 10.1101/gr.136101.111.
- Lai AY, Fatemi M, Dhasarathy A, Malone C, Sobol SE, Geigerman C, Jaye DL, Mav D, Shah R, Li L: DNA methylation prevents CTCF-mediated silencing of the oncogene BCL6 in B cell lymphomas. J Exp Med. 2010, 207 (9): 1939-1950. 10.1084/jem.20100204.
- Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, Davis S, Margulies EH, Chen Y, Bernat JA, Ginsburg D: Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 2006, 16 (1): 123-131.
- Follows GA, Dhami P, Gottgens B, Bruce AW, Campbell PJ, Dillon SC, Smith AM, Koch C, Donaldson IJ, Scott MA: Identifying gene regulatory elements by genomic microarray mapping of DNaseI hypersensitive sites. Genome Res. 2006, 16 (10): 1310-1319. 10.1101/gr.5373606.
- Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD: FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin. Genome Res. 2007, 17 (6): 877-885. 10.1101/gr.5533506.
- Simon JM, Giresi PG, Davis IJ, Lieb JD: Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA. Nat Protoc. 2012, 7 (2): 256-267. 10.1038/nprot.2011.444.
- FitzGerald PC, Sturgill D, Shyakhtenko A, Oliver B, Vinson C: Comparative genomics of Drosophila and human core promoters. Genome Biol. 2006, 7 (7): R53. 10.1186/gb-2006-7-7-r53.
- Nakahashi H, Kwon KR, Resch W, Vian L, Dose M, Stavreva D, Hakim O, Pruett N, Nelson S, Yamane A: A genome-wide map of CTCF multivalency redefines the CTCF code. Cell reports. 2013, 3 (5): 1678-1689. 10.1016/j.celrep.2013.04.024.
- Bourque G, Leong B, Vega VB, Chen X, Lee YL, Srinivasan KG, Chew JL, Ruan Y, Wei CL, Ng HH: Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res. 2008, 18 (11): 1752-1762. 10.1101/gr.080663.108.
- Monahan K, Rudnick ND, Kehayova PD, Pauli F, Newberry KM, Myers RM, Maniatis T: Role of CCCTC binding factor (CTCF) and cohesin in the generation of single-cell diversity of protocadherin-alpha gene expression. Proc Natl Acad Sci USA. 2012, 109 (23): 9125-9130. 10.1073/pnas.1205074109.
- Xie X, Mikkelsen TS, Gnirke A, Lindblad-Toh K, Kellis M, Lander ES: Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites. Proc Natl Acad Sci USA. 2007, 104 (17): 7145-7150. 10.1073/pnas.0701811104.
- Essien K, Vigneau S, Apreleva S, Singh LN, Bartolomei MS, Hannenhalli S: CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features. Genome Biol. 2009, 10 (11): R131. 10.1186/gb-2009-10-11-r131.
- Ziebarth JD, Bhattacharya A, Cui Y: CTCFBSDB 2.0: a database for CTCF-binding sites and genome organization. Nucleic Acids Res. 2013, 41 (Database issue): D188-D194.
- Rach EA, Winter DR, Benjamin AM, Corcoran DL, Ni T, Zhu J, Ohler U: Transcription initiation patterns indicate divergent strategies for gene regulation at the chromatin level. PLoS Genet. 2011, 7 (1): e1001274. 10.1371/journal.pgen.1001274.
- Sanyal A, Lajoie BR, Jain G, Dekker J: The long-range interaction landscape of gene promoters. Nature. 2012, 489 (7414): 109-113. 10.1038/nature11279.
- Hu M, Deng K, Qin Z, Dixon J, Selvaraj S, Fang J, Ren B, Liu JS: Bayesian inference of spatial organizations of chromosomes. PLoS Comput Biol. 2013, 9 (1): e1002893. 10.1371/journal.pcbi.1002893.
- Li L: GADEM: a genetic algorithm guided formation of spaced dyads coupled with an EM algorithm for motif discovery. J Comput Biol. 2009, 16 (2): 317-329. 10.1089/cmb.2008.16TT.
- Kannan MB, Solovieva V, Blank V: The small MAF transcription factors MAFF. Biochimica et biophysica acta. 2012.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.