Skip to main content

Exploring the changing landscape of cell-to-cell variation after CTCF knockdown via single cell RNA-seq

Abstract

Background

CCCTC-Binding Factor (CTCF), also known as 11-zinc finger protein, participates in many cellular processes, including insulator activity, transcriptional regulation and organization of chromatin architecture. Based on single cell flow cytometry and single cell RNA-FISH analyses, our previous study showed that deletion of CTCF binding site led to a significantly increase of cellular variation of its target gene. However, the effect of CTCF on genome-wide landscape of cell-to-cell variation remains unclear.

Results

We knocked down CTCF in EL4 cells using shRNA, and conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating that gene expression profiles of WT and CTCF-KD cells were systematically different. Interestingly, GO terms including regulation of transcription, DNA binding, zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, implying tissue-specific genes such as transcription factors were highly sensitive to CTCF level. The dysregulation of transcription factors potentially explains why knockdown of CTCF leads to systematic change of gene expression. In contrast, housekeeping genes such as rRNA processing, DNA repair and tRNA processing were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes.

Conclusions

To our knowledge, this is the first attempt to explore genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.

Background

CCCTC-binding factor (CTCF) is an 11-zinc finger protein that directionally binds to a well-defined DNA motif [1,2,3]. Although CTCF was initially reported as a transcription factor [4, 5], subsequent studies found it served as an insulator [6,7,8]. Nowadays, CTCF has been reported being involved in multiple cellular processes, such as transcriptional regulation, insulator activity, epigenetic regulation, organization of chromatin architecture and X chromosome inactivation [1, 2, 9,10,11,12,13,14,15]. CTCF activates and silences gene expression by preventing the spread of heterochromatin and blocking of unrelated enhancer-promoter interactions [1, 16]. Interestingly, mammalian genome is organized into thousands of highly self-interacting topologically associated domains (TADs) with CTCF demarcating individual TAD boundary [17]. Analyses based on high-resolution interaction matrix further identified ~ 10,000 chromatin loops ranging ~185Kb in human genome, anchoring by convergent CTCF-binding motif-pair at TAD boundaries [14]. In addition, chromatin loops mediated by CTCF and cohesin can tether distal enhancers to gene promoters and regulate its target gene expression [14, 15, 18, 19]. Further studies showed CTCF-mediated DNA loop could determine the chromatin architecture, with anchor-genes almost exclusively being housekeeping genes, while loop-genes being tissue-specific genes [15, 20]. For instance, inversion of a CTCF-binding site reconfigured the topology of chromatin loops and activated gene expression by creating a new chromatin loop [21, 22].

Recent studies showed that disruption of CTCF binding in mammalian cells resulted in loss of TADs [19], genomic instability [23], developmental failure [24, 25] and other malfunctions. Our recent study demonstrated that CTCF played an important role in stabilizing enhancer-promoter interaction and reducing the gene expression noises in mammalian cells [18]. In particular, we found that CTCF-KD or deletion of CTCF binding sites led to increased variation of cellular expression of GATA3, CD28, CD90 and CD5 [18]. However, the genome-wide change of cell-to-cell variation after CTCF-KD remains unknown. In this study, we conducted single cell RNA-seq on both WT and CTCF-KD cells to investigate the changing landscape of cell-to-cell variation at a genome-wide scale. Interestingly, GO terms including regulation of transcription, DNA binding and Zinc finger were significantly enriched in CTCF-KD specific highly variable genes. We also found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating knockdown of CTCF simultaneously reduced the expression level and increased the expression noises of its regulated genes.

Results

Efficient CTCF knockdown and single cell RNA-seq

We knocked down CTCF in EL4 cells by short hairpin RNA (shRNA). Western blotting showed a dramatic decrease of the CTCF protein level in shCTCF #1 and shCTCF #2 compared to shRNA luciferase controls (shLuc) (Fig. 1a). Quantitative polymerase chain reaction (qPCR) revealed the mRNA levels in shCTCF #1 and shCTCF #2 had been reduced to 38 and 40% of that in shLuc, respectively (Fig. 1b). These results confirmed the efficient knockdown of CTCF expression in shCTCF#1 and shCTCF#2, with a significant reduction consistently at both RNA and protein level.

Fig. 1
figure1

Knockdown of CTCF and schema of single cell sequencing. a. Western blot analysis of CTCF in luciferase control (shLuc) and CTCF-KD cells (shCTCF#1 and shCTCF#2). EL4 cells were infected with retroviral particles encoding GFP and an shRNA targeting CTCF or a control sequence for 5 days. b. Real-time quantitative PCR (RT-qPCR) analysis of CTCF expression in luciferase control (shLuc) and knockdown (shCTCF#1 and shCTCF#2) cells. The expression level of CTCF was normalized to GAPDH. c. Schema of single cell RNA sequencing using Fluidigm C1 system

In order to investigate the changing landscape of cell-to-cell variation after CTCF knockdown, we successfully conducted single cell RNA-seq for shLuc#1, shLuc #2, shCTCF #1 and shCTCF#2 using 4 integrated fluidics circuits (IFCs) (Fig. 1c). We noticed the gene expression levels of pooled shLuc single cells were highly correlated with that of bulk data from our previous study [18] (r2 = 0.86; Additional file 1: Figure S1A). The gene expression of pooled single cells from shLuc #1 was also highly correlated with that of shLuc #2 (r2 = 0.87; Additional file 1: Figure S1B). In addition, the gene expression of pooled single cell repeats and bulk cell repeats in CTCF-KD cells were highly correlated (Additional file 1: Figure S1C-S1D).

Systematic differences between WT cells and CTCF-KD cells

A total of 95 cells, including 24 cells from shLuc #1, 24 cells from shCTCF #1, 23 cells from shLuc #2 and 24 cells from shCTCF #2, were kept for further analyses after quality control. We conducted principal component analysis (PCA) on 11,361 genes shared by those 95 cells (Fig. 2a). The coordination of cells from experiment1 and experiment2 on PCA projection is not significantly different (P = 0.8; student’s t-test), indicating no obvious batch effect between experiment1 and experiment2. Further analysis showed that WT cells and CTCF-KD cells were distinguishable on PCA projection and concentrated in two different clusters on PC1 (Fig. 2b,c), implying a systematic difference of gene expression profiles between CTCF-KD and WT cells. We also noticed a correlation between CTCF expression level and its coordination on the PCA projection (using the first 10 PCs), among which CTCF expression level and PC2 exhibited the highest correlation (Additional file 2: Figure S2A) (r2 = 0.18, P = 0.22 × 10− 6).

Fig. 2
figure2

Systematic difference between CTCF-KD and WT cells. a. No significant batch effect among the experimental repeats based on PCA analysis. b. CTCF-KD cells were largely distinguishable from WT cells on PCA projection. c. Distribution of individual WT cells and individual CTCF-KD cells on PC1. d. Heatmap of differentially expressed genes (TOP 20) between WT cells and CTCF-KD cells

We further calculated the differential gene expression between WT and CTCF-KD cells using edgeR [26]. In total, we identified 195 up-regulated and 107 down-regulated genes in CTCF-KD cells compared to WT cells (Additional file 2: Figure S2B). Heatmap of the most differentially expressed genes between WT cells and CTCF-KD cells exhibited a cellular heterogeneity within the same cell population (Fig. 2d). The most enriched gene categories in down-regulated genes include glycolytic processing, Prolyl 4-hydroxylase α subunit, iron-dependent dioxygenase and carbon metabolism (Additional file 2: Figure S2D). Compared to the most enriched gene categories in up-regulated genes include RNA binding, ribosome biogenesis, WD40 repeat domain and RNA processing (Additional file 2: Figure S2C), consistent to our recent study based on bulk data in some way [18].

CTCF knockdown changed the landscape of cell-to-cell variation

In order to distinguish true signals of cellular variation from technical noise, we calculated the expression noise of each gene (σ22) [27, 28]. The expression noises exhibited two distinct scaling properties: negative association with expression at low expression levels and no association at high expression levels (log2TPM > 1) (Fig. 3a). We filtered out low expressed genes (log2TPM ≤ 1) to reduce the impact of technological noise, resulting in 7843 genes for further analysis (Fig. 3a). Coefficient of variation (CV) was calculated to measure cell-to-cell variation of each gene across the cell populations. The distribution of alterations of cell-to-cell variation pre-and post-CTCF knockdown followed a normal distribution (Additional file 3: Figure S3). We identified 602 cellular variation increased genes and 890 cellular variation decreased genes after CTCF knockdown by mean ± SD (Fig. 3b).

Fig. 3
figure3

Identification and analyses of genes showing cellular variation changes after CTCF KD. a. The relationship between expression level and noise level of reference genes. Genes with low cellular variation was used for further analyses. b. Scatter plot of cellular variation-changed genes after CTCF KD. Blue and red indicate the variation decrease and variation increase, respectively. c. The top 15 gene categories enriched in variation increased genes. d. The top 15 gene categories enriched in variation decreased genes

GO analyses showed that variation-increased genes were significantly enriched in GO terms such as regulation of transcription, DNA binding, zinc finger proteins, covalent chromatin modification and transcription factor binding (Fig. 3c). In fact, almost all genes involved in DNA binding, zinc finger proteins and transcription factor binding are transcription factors. The significant enrichment of transcription factors in CTCF-KD-specific highly variable genes potentially indicates a high sensitivity of transcription factors to CTCF level. Dysregulation of certain transcription factors possibly explains why knockdown of CTCF leads to a systematic change of gene expression. In contrast, variation-decreased genes were significantly enriched in housekeeping genes related GO terms such as rRNA processing, DNA repair, tRNA processing and RNA modification (Fig. 3d). The enrichment of housekeeping genes in WT-specific highly variable genes potentially indicates a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells.

CTCF knockdown simultaneously altered expression level and cellular variation of its regulated genes

We identified 302 expression-changed genes and 1490 cellular variation-changed genes pre-and-post CTCF knockdown. Next, we were interested to examine whether those cellular variation-changed genes were enriched in expression-changed genes. Venn diagram showed that 47 genes out of total 107 down-regulated genes exhibited increased cellular variation (Fig. 4), which were significantly over-represented (P = 0.29 × 10− 23, χ2 test). These results indicate CTCF knockdown simultaneously reduced the expression level and increased the gene expression noise. Among those genes with decreased expression and increased cellular variation, EGR1 and JUNB played an important role in maintaining the cell type-specific gene regulation. For instance, EGR1 belongs to the EGR family of C2H2-type zinc-finger proteins, and encodes a nuclear protein that participates in transcriptional regulation.

Fig. 4
figure4

Genes showing cellular variation change tend to be differentially expressed genes. Genes with expression decrease and cellular variation increase were significantly over-represented (P = 0.29 × 10− 23, χ2 test). Genes with decreased cellular variation and increased expression level were significantly over-represented (P = 0.48 × 10− 23, χ2 test)

Meanwhile, there were 96 genes out of the total 195 up-regulated genes exhibiting decreased cellular variation (Fig. 4), which were also significantly over-represented (P = 0.48 × 10− 23, χ2 test). The 96 genes with decreased cellular variation and increased expression level were significantly enriched in poly(A) RNA binding, rRNA processing, WD40, purine nucleobase biosynthetic processing, rRNA methylation and RNA methyltransferase activity. It is obvious that those enriched GO terms were associated with basic cellular functions belonging to housekeeping genes. Taken together, our results clearly indicate that distortion of CTCF expression could simultaneously change the gene expression level and cell-to-cell variation of its regulated genes.

Furthermore, we identified CTCF binding sites using CTCF ChIP-seq data in WT EL4 cells from our previous study [18]. We identified each gene-associated CTCF by counting the CTCF binding sites within 20Kb of the transcriptional start site (TSS) for each gene. The numbers of gene-associated CTCF of variation-increased genes are significantly higher than that of variation-decreased genes (P = 0.0033; Wilcoxon test), and are significantly higher than that of variation-unchanged genes (P = 0.16 × 10− 6; Wilcoxon test). The numbers of gene-associated CTCF of variation-decreased genes do not show significant differences from that of variation-unchanged genes (P = 0.5; Wilcoxon test). These results suggest that genes regulated by multiple CTCF binding sites tend to possess a higher cellular variation after CTCF knockdown.

Discussion

CTCF plays an important role in chromatin structure organization and regulation of gene expression [14,15,16,17]. In this study, we used single cell RNA-seq to analyze genome-wide gene expression profiles of WT and CTCF-KD cells at single cell resolution. Indeed, WT cell population and CTCF-KD cell population showed distinct concentration on PC1, indicating that knockdown of CTCF resulted in a systematic impact on the genome-wide gene expression profile. These results further implied that CTCF contributed to key functions in controlling the genome-wide gene regulation. We generated the genome-wide landscape of cell-to-cell variation in both WT and CTCF-KD cells. After comparing cell-to-cell variations between WT and CTCF-KD cells, we identified those genes showing a significant change of cellular variation after CTCF knockdown. Interestingly, the cellular variation-increased genes are significantly enriched in expression-decreased genes, suggesting that CTCF-medicated promoter-enhancer interaction did not only play an important role in maintaining the expression of its regulated genes, but also reduced their expression noise.

In this study, we identified numerous genes with an obvious change of cellular variation after CTCF knockdown, although the knockdown efficiency is moderate. We expected more genes showing alterations of cellular variation with a more efficient CTCF knockdown, which would enhance the conclusion in this study. Interestingly, the variation-increased genes were significantly enriched in GO terms such as chromatin DNA binding, zinc finger proteins and zinc ion binding, implying the expression noise of those zinc finger proteins were strongly increased after CTCF knockdown. The increased cellular variation of zinc finger proteins potentially indicates a high sensitivity of zinc finger proteins to CTCF expression level or cellular environmental change within the cell. In fact, the majority of those zinc finger proteins were transcription factors that played an important role in the regulation of cell type-specific gene expression. Our observation that CTCF knockdown fluctuated the expression of many transcriptional factors further explained why disruption of CTCF expression led to pronounced biological effects such as developmental failure [24, 25]. Taken together, our findings provide convincing evidence that CTCF serves as a key player in stabilizing the gene expression noise of zinc finger related genes.

Conclusion

We conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating a systematic difference of gene expression profiles between WT and CTCF-KD cells. Interestingly, GO terms including regulation of transcription, DNA binding, zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, implying tissue-specific genes such as transcription factors were highly sensitive to CTCF level. Dysregulation of transcription factors possibly explains why knockdown of CTCF leads to a systematic change of gene expression. In contrast, housekeeping genes such as rRNA processing, DNA repair and tRNA processing were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further noticed that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes. To our knowledge, this is the first attempt to explore the genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.

Methods

Cell culture

EL4 (ATCC® TIB-39™), derived from lymphoma in C57BL/6 N mouse, was purchased from ATCC. EL4 cells were cultured in DMEM (GIBCO Invitrogen) supplemented with 50 IU/mL penicillin, 50 mg/mL streptomycin (GIBCO Invitrogen) and 10% heat-inactivated calf serum (Sigma, USA). Cultures were maintained by replacement of fresh medium every 3 days, and cell density was kept between 1 X 105 and 1 X 106 cells/mL.

Knockdown of CTCF by shRNA

Knockdown of CTCF was performed using Lentiviral-mediated short hairpin RNA (shRNA) in EL4 cells as described previously [18]. Briefly, 293T cells were co-transfected with an envelope plasmid (pLP/VSVG) to generate lentiviral particles. The medium containing lentiviral particles was harvested after 48 h transfection. EL4 cells were infected with the harvested shLuc and shCTCF retroviral particles packaged in GP2–293. GFP+ cells were sorted out to check the knockdown efficiency using RT-qPCR and Western blotting after 5 days of infection. The cell populations displaying efficient knockdown of CTCF were used for single cell RNA-seq.

The following shRNA sequences were used for CTCF knockdown: mouse CTCF-shRNA 1: 5′-GGTGCAATTGAGAACATTATA; mouse CTCF-shRNA 2: 5′-TGGACGATACCCAGATCATAA.

Western blot analyses

After thorough washing, the knockdown (shCTCF) cells and control (shLuc) were harvested for Western blotting analyses. Protein concentration of the cell lysates was measured using BCA kit (Pierce, Rockford, IL). Protein samples (40 μg/lane) were applied to SDS-PAGE followed by Western blotting against anti-CTCF antibody (07–729, Millipore), and anti-GAPDH antibody (sc-1616, Santa Cruz Biotechnology).

Quantitative real-time PCR

Total RNAs from the knockdown (shCTCF#1 and shCTCF#2) and control (shLuc) cells were extracted using miRNeasy Micro Kit (QIAGEN). cDNA was synthesized by using oligo (dT)20 and SuperScript III Reverse Transcriptase (Invitrogen) according to manufacturers’ instructions. RT-qPCR samples were mixed with the following Taqman probe mixture (Applied Biosystems) and run on a LightCycler 96 (Roche): Gapdh: Mm03302249_g1; CTCF: Mm00484027_m1. Results were normalized to the mRNA level of Gapdh.

Single cell RNA sequencing

Fluidigm C1™ Single-Cell Autoprep System (Fluidigm, South San Francisco, CA, USA) was used for single cell RNA-seq. In the initial experiment, shLuc#1 or shCTCF#1 were uploaded to a C1 integrated fluidics circuit (IFC) for cell capture, respectively. We checked the IFC to count the number of captured cells, and to distinguish between live and dead cells for later data processing. After successful completion of the second knockdown, cells from shLuc#2 and shCTCF#2 were treated similarly to the initial experiment. Single-cell RNA-seq with SMARTer protocol (Clontech, Mountain View, CA, USA) was prepared following Fluidigm manual ‘Using the C1 Single-Cell Auto Prep System to generate mRNA from Single Cells and Libraries for Sequencing’. The wells containing either zero or double cells were filtered out. We selected 24 cells with the highest quality from each IFC. The DNA materials obtained from the 96 single cells were sequenced on Illumina HiSeq 3000, as illustrated in Fig. 1c.

Reads mapping and quality control

Quality of the reads was assessed using FASTQC. All reads were aligned to the mouse genome (Ensemble version GRCm38.89) utilizing STAR v.2.5.2 [29, 30]. Unique mapping reads were allowed (using parameter --outFilterMultimapNmax). The alignments were used as input in HTSEQ v.0.9.1 [31] to count the number of reads mapping to each of the 24,057 ref-seq genes in each cell. We filtered out those low-quality cells from our dataset based on a threshold for a minimum of 3000 unique genes per cell. In total, the final dataset contained 95 cells (including 24 cells from shLuc #1, 24 cells from shCTCF #1, 23 cells from shLuc #2 and 24 cells from shCTCF #2) with a mean of 216,157 sequenced reads per cell. Transcripts per million (TPM) was used to normalize the gene expression level and log2 transformed. Furthermore, genes with log2(TPM + 1) > 1 in less than two individual cells were filtered out, leaving a total of 95 samples and 11,361 genes for further analyses.

Statistical analyses and gene enrichment analyses

For identification of genes with biologically significant cell-to-cell variation, we used η2 = σ22 (σ denotes standard deviation; μ denotes mean) to measure the noise of gene expression [27, 28, 32]. We filtered out those genes exhibiting a low expression (log2TPM ≤ 1), since the expression noise is inversely proportional to the expression when the gene expression level is low (log2TPM ≤ 1) (Fig. 3a), leading to 7843 genes remaining for further analyses. To examine any possible enrichment of particular gene categories and pathways in certain gene lists, GO enrichment analysis was performed using DAVID [33, 34]. Multiple comparison corrections for GO enrichment analyses were performed using Benjamini adjustment.

In this study, we used coefficient of variation (CV) to calculate the cell-to-cell variation. The variation increased genes are calculated by

$$ C{V}_{KD}-C{V}_{WT}> mea{n}_{C{V}_{KD}-C{V}_{WT}}+s{d}_{C{V}_{KD}-C{V}_{WT}} $$

while the variation decreased genes are calculated by

$$ C{V}_{KD}-C{V}_{WT}< mea{n}_{C{V}_{KD}-C{V}_{WT}}-s{d}_{C{V}_{KD}-C{V}_{WT}} $$

Availability of data and materials

The single cell RNA-seq data of EL4 cells have been deposited in the Gene Expression Omnibus database with the accession number GSE135769.

Abbreviations

CTCF:

CCCTC-Binding Factor

CTCF-KD:

CTCF-Knockdown

IFCs:

Integrated fluidics circuits

PCA:

Principal component analysis

qPCR:

Quantitative polymerase chain reaction

shLuc:

shRNA luciferase controls

shRNA:

Short hairpin RNA

TADs:

Topologically associated domains

TPM:

Transcripts per million

WT:

Wild type

References

  1. 1.

    Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, Ren B. Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007;128(6):1231–45.

    CAS  Article  Google Scholar 

  2. 2.

    Ong CT, Corces VG. CTCF: an architectural protein bridging genome topology and function. Nat Rev Genet. 2014;15(4):234–46.

    CAS  Article  Google Scholar 

  3. 3.

    Ren G, Zhao K. CTCF and cellular heterogeneity. Cell Biosci. 2019;9:83.

    Article  Google Scholar 

  4. 4.

    Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, Goodwin GH. A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5′-flanking sequence of the chicken c-myc gene. Oncogene. 1990;5(12):1743–53.

    CAS  PubMed  Google Scholar 

  5. 5.

    Klenova EM, Nicolas RH, Paterson HF, Carne AF, Heath CM, Goodwin GH, Neiman PE, Lobanenkov VV. CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol Cell Biol. 1993;13(12):7612–24.

    CAS  Article  Google Scholar 

  6. 6.

    Chung JH, Bell AC, Felsenfeld G. Characterization of the chicken beta-globin insulator. Proc Natl Acad Sci U S A. 1997;94(2):575–80.

    CAS  Article  Google Scholar 

  7. 7.

    Bell AC, Felsenfeld G. Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature. 2000;405(6785):482–5.

    CAS  Article  Google Scholar 

  8. 8.

    Yusufzai TM, Tagami H, Nakatani Y, Felsenfeld G. CTCF tethers an insulator to subnuclear sites, suggesting shared insulator mechanisms across species. Mol Cell. 2004;13(2):291–8.

    CAS  Article  Google Scholar 

  9. 9.

    Cuddapah S, Jothi R, Schones DE, Roh TY, Cui K, Zhao K. Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 2009;19(1):24–32.

    CAS  Article  Google Scholar 

  10. 10.

    Van Bortle K, Ramos E, Takenaka N, Yang J, Wahi JE, Corces VG. Drosophila CTCF tandemly aligns with other insulator proteins at the borders of H3K27me3 domains. Genome Res. 2012;22(11):2176–87.

    Article  Google Scholar 

  11. 11.

    Taslim C, Chen Z, Huang K, Huang TH, Wang Q, Lin S. Integrated analysis identifies a class of androgen-responsive genes regulated by short combinatorial long-range mechanism facilitated by CTCF. Nucleic Acids Res. 2012;40(11):4754–64.

    CAS  Article  Google Scholar 

  12. 12.

    Bonora G, Plath K, Denholtz M. A mechanistic link between gene regulation and genome architecture in mammalian development. Curr Opin Genet Dev. 2014;27:92–101.

    CAS  Article  Google Scholar 

  13. 13.

    Rao SSP, Huang SC, Glenn SHB, Engreitz JM, Perez EM, Kieffer-Kwon KR, Sanborn AL, Johnstone SE, Bascom GD, Bochkov ID, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171(2):305–20.e324.

    Article  Google Scholar 

  14. 14.

    Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, Sanborn AL, Machol I, Omer AD, Lander ES, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159(7):1665–80.

    CAS  Article  Google Scholar 

  15. 15.

    Tang Z, Luo OJ, Li X, Zheng M, Zhu JJ, Szalaj P, Trzaskoma P, Magalska A, Wlodarczyk J, Ruszczycki B, et al. CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription. Cell. 2015;163(7):1611–27.

    CAS  Article  Google Scholar 

  16. 16.

    Handoko L, Xu H, Li G, Ngan CY, Chew E, Schnapp M, Lee CW, Ye C, Ping JL, Mulawadi F, et al. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet. 2011;43(7):630–8.

    CAS  Article  Google Scholar 

  17. 17.

    Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485(7398):376–80.

    CAS  Article  Google Scholar 

  18. 18.

    Ren G, Jin W, Cui K, Rodrigez J, Hu G, Zhang Z, Larson DR, Zhao K. CTCF-mediated enhancer-promoter interaction is a critical regulator of cell-to-cell variation of gene expression. Mol Cell. 2017;67(6):1049–1058.e1046.

    CAS  Article  Google Scholar 

  19. 19.

    Nora EP, Goloborodko A, Valton AL, Gibcus JH, Uebersohn A, Abdennur N, Dekker J, Mirny LA, Bruneau BG. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell. 2017;169(5):930–44 e922.

    CAS  Article  Google Scholar 

  20. 20.

    Zuin J, Dixon JR, van der Reijden MI, Ye Z, Kolovos P, Brouwer RW, van de Corput MP, van de Werken HJ, Knoch TA, van IJken WF, et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc Natl Acad Sci U S A. 2014;111(3):996–1001.

    CAS  Article  Google Scholar 

  21. 21.

    Guo Y, Xu Q, Canzio D, Shou J, Li J, Gorkin DU, Jung I, Wu H, Zhai Y, Tang Y, et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell. 2015;162(4):900–10.

    CAS  Article  Google Scholar 

  22. 22.

    Kang JY, Song SH, Yun J, Jeon MS, Kim HP, Han SW, Kim TY. Disruption of CTCF/cohesin-mediated high-order chromatin structures by DNA methylation downregulates PTGS2 expression. Oncogene. 2015;34(45):5677–84.

    CAS  Article  Google Scholar 

  23. 23.

    Lang F, Li X, Zheng W, Li Z, Lu D, Chen G, Gong D, Yang L, Fu J, Shi P, et al. CTCF prevents genomic instability by promoting homologous recombination-directed DNA double-strand break repair. Proc Natl Acad Sci U S A. 2017;114(41):10912–7.

    CAS  Article  Google Scholar 

  24. 24.

    Shin JO, Lee JJ, Kim M, Chung YW, Min H, Kim JY, Kim HP, Bok J. CTCF regulates Otic neurogenesis via histone modification in the Neurog1 locus. Mol Cell. 2018;41(7):695–702.

    CAS  Google Scholar 

  25. 25.

    Hirayama T, Tarusawa E, Yoshimura Y, Galjart N, Yagi T. CTCF is required for neural development and stochastic expression of clustered Pcdh genes in neurons. Cell Rep. 2012;2(2):345–57.

    CAS  Article  Google Scholar 

  26. 26.

    Robinson MD, McCarthy DJ, Smyth GK. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.

    CAS  Article  Google Scholar 

  27. 27.

    Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, Weissman JS. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441(7095):840–6.

    CAS  Article  Google Scholar 

  28. 28.

    Taniguchi Y, Choi PJ, Li GW, Chen H, Babu M, Hearn J, Emili A, Xie XS. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science. 2010;329(5991):533–8.

    CAS  Article  Google Scholar 

  29. 29.

    Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.

    CAS  Article  Google Scholar 

  30. 30.

    Dobin A, Gingeras TR. Mapping RNA-seq reads with STAR. Curr Protoc Bioinformatics. 2015;51:11.14.1–11.14.19.

    Article  Google Scholar 

  31. 31.

    Anders S, Pyl PT, Huber W. HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31(2):166–9.

    CAS  Article  Google Scholar 

  32. 32.

    Brennecke P, Anders S, Kim JK, Kolodziejczyk AA, Zhang X, Proserpio V, Baying B, Benes V, Teichmann SA, Marioni JC, et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods. 2013;10(11):1093–5.

    CAS  Article  Google Scholar 

  33. 33.

    Jiao X, Sherman BT, Huang da W, Stephens R, Baseler MW, Lane HC, Lempicki RA. DAVID-WS: a stateful web service to facilitate gene/protein list analysis. Bioinformatics. 2012;28(13):1805–6.

    CAS  Article  Google Scholar 

  34. 34.

    Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57.

    Article  Google Scholar 

Download references

Acknowledgements

We thank the members of our laboratory for their suggestions and encouragement. The computation was supported by Center for Computational Science and Engineering of Southern University of Science and Technology.

Funding

This study was supported by National Key R&D Program of China (2018YFC1004500), National Science Foundation of China (81872330, 31741077), Shenzhen Innovation and Technology Commission (JCYJ20170817111841427, KQTD20180411143432337) to W.J. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Affiliations

Authors

Contributions

W.J. conceived the project. R.G. performed the experiments and W.W. analyzed the data. N.H. and W.J. supervised the study. W.W., N.H. and W.J. wrote the manuscript. All authors have read and approved the manuscript.

Corresponding authors

Correspondence to Ni Hong or Wenfei Jin.

Ethics declarations

Ethics approval and consent to participate

Not Applicable

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

Figure S1. High reproducibility of single cell RNA-seq data. (A) Scatter plot showing the correlation of gene expressions between pooled single cells and bulk data in shLuc. (B) Scatter plot showing the correlation of gene expression between shLuc #1 and shLuc#2. (C) Scatter plot showing the correlation of gene expressions between pooled single cells and bulk data in CTCF-KD cells. (D) Scatter plot showing the correlation of gene expressions between shCTCF#1 and shCTCF#2.

Additional file 2:

Figure S2. Analysis of differentially expressed genes between WT cells and CTCF KD cells. (A) The expression of CTCF is correlated with the cell coordination on PC2. (B) Differentially expressed genes were plotted in MAplot. Significantly up-regulated genes and down-regulated genes were indicated red and blue, respectively. (C) The top 15 enriched GO terms in the 195 up-regulated genes. (D) The top 15 enriched GO terms in the 107 down-regulated genes.

Additional file 3:

Figure S3. Histogram showing that CV difference of gene expression between CTCF-KD cells and WT cells followed the normal distribution.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Wang, W., Ren, G., Hong, N. et al. Exploring the changing landscape of cell-to-cell variation after CTCF knockdown via single cell RNA-seq. BMC Genomics 20, 1015 (2019). https://doi.org/10.1186/s12864-019-6379-5

Download citation

Keywords

  • Single cell RNA-seq
  • Cell-to-cell variation
  • CTCF
  • Change of cellular variation
  • CTCF knockdown