Cells
Cultured MEC1 cells were stained with LIVE/DEAD fixable Aqua stain (Invitrogen), so that cells already dead before fixation could be excluded during subsequent FACS sorting, and were then fixed using 1% PFA (Pierce). Aliquots of 10 k cells were FACS sorted directly into 100 μl SDS lysis buffer (50 mM Tris/HCl, 0.5% SDS, and 10 mM EDTA) supplemented with 1x cOmplete EDTA-free protease inhibitor (Roche) and stored at −80 °C until use. For larger aliquots of cells (50 and 150 k), where the sheath fluid volume is non-negligible, cells were sorted into PBS, spun down (2000 g for 5 min) and resuspended in 100 μl SDS lysis buffer prior to freezing. Sorting was performed using a BD FACSAriaIIu cell sorter (BD Biosciences) with an 85 μm nozzle.
Chromatin immunoprecipitation and tagmentation
For ChIP, polyclonal anti-H3K27Ac (Diagenode, cat# C15410196, lot# A1723-0041D) antibody or anti-CTCF (Diagenode, cat# C15410210, lot# A2359-00234P) antibody was added to Protein G-coupled Dynabeads (ThermoFisher) in PBS with 0.5% BSA and incubated with rotation for 4 h at 4 °C (0.5 h at RT for HT-ChIPmentation samples processed in a single day). For 50–150 k cells, 10 μl beads incubated with 3 μg H3K27Ac or 1.5 μg CTCF antibody were used per ChIP. For 0.1–10 k cells, 2 μl beads incubated with 0.6 μg H3K27Ac or 0.3 μg CTCF antibody were used per ChIP. Fixed cells (FACS sorted) frozen in SDS lysis buffer were thawed at room temperature. To perform ChIP on < 10 k cells, aliquots were diluted with SDS lysis buffer and 100 μl containing the appropriate number of cells were processed. Cells were sonicated for 12 cycles of 30 s on/30 s off on high power using a Bioruptor Plus (Diagenode). To neutralize the SDS, Triton X100 was added to a final concentration of 1% along with 2 μl 50x cOmplete protease inhibitor (final 1x). Samples were incubated at room temperature for 10 min and when applicable 5% aliquots were saved for preparation of input controls. Antibody-coated Dynabeads were washed with PBS with 0.5% FCS and mixed with cell lysate in PCR tubes. Tubes were incubated rotating overnight (or 4 h for HT-ChIPmentation samples processed in a single day) at 4 °C.
Immunoprecipitated chromatin was washed with 150 μl of low-salt buffer (50 mM Tris/HCl, 150 mM NaCl, 0.1% SDS, 0.1% NaDOC, 1% Triton X-100, and 1 mM EDTA), high-salt buffer (50 mM Tris/HCl, 500 mM NaCl, 0.1% SDS, 0.1% NaDOC, 1% Triton X-100, and 1 mM EDTA) and LiCl buffer (10 mM Tris/HCl, 250 mM LiCl, 0.5% IGEPAL CA-630, 0.5% NaDOC, and 1 mM EDTA), followed by two washes with TE buffer (10 mM Tris/HCl and 1 mM EDTA) and two washes with ice-cold Tris/HCl pH 8. For tagmentation, bead-bound chromatin was resuspended in 30 μl of tagmentation buffer, 1 μl of transposase (Nextera, Illumina) was added, and samples were incubated at 37 °C for 10 min followed by two washes with low-salt buffer.
High-throughput ChIPmentation library preparation
For high-throughput ChIPmentation (HT-CM) samples, bead-bound tagmented chromatin was diluted in 20 μl of water. PCR master mix (Nextera, Illumina) and indexed amplification primers [17] (0.125 μM final concentration) were added, and libraries were prepared using the following program: 72 °C for 5 min (adapter extension); 95 °C for 5 min (reverse cross-linking); followed by 11 cycles of 98 °C for 10 s, 63 °C for 30 s and 72 °C for 3 min.
For preparation of HT-CM compatible input controls, 1 μl of 50 mM MgCl2 was added to 5 μl sonicated lysate (5% aliquot of 10 k samples) to neutralize the EDTA in the SDS lysis buffer. Thirty microliters of tagmentation buffer and 1 μl transposase (Nextera, Illumina) were added, and samples were incubated at 37 °C for 10 min. Then, 22.5 μl of the transposition reaction was combined with 15 μl of PCR master mix and 2.5 μl of primer mix (Nextera, Illumina). Libraries were subsequently amplified as described for HT-ChIPmentation samples.
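The MgCl2 volume used for the input controls can be checked with simple stoichiometry. The sketch below is illustrative only: it assumes the lysate still contains EDTA at the nominal 10 mM of the SDS lysis buffer (ignoring the slight dilution from the added Triton X-100 and protease inhibitor), and the function and variable names are hypothetical.

```python
def nanomoles(volume_ul, conc_mM):
    """Amount of solute in nmol: 1 ul of a 1 mM solution contains 1 nmol."""
    return volume_ul * conc_mM

# 5 ul of lysate at a nominal 10 mM EDTA (assumption: no dilution considered)
edta_nmol = nanomoles(5, 10)
# 1 ul of 50 mM MgCl2
mg_nmol = nanomoles(1, 50)

# Volume of the transposition reaction: lysate + MgCl2 + buffer + transposase
reaction_ul = 5 + 1 + 30 + 1
# Volume of the PCR: transposition reaction + master mix + primer mix
pcr_ul = 22.5 + 15 + 2.5

print(f"EDTA: {edta_nmol} nmol, Mg2+: {mg_nmol} nmol")
print(f"transposition reaction: {reaction_ul} ul, PCR: {pcr_ul} ul")
```

Under these assumptions the added Mg2+ is equimolar to the EDTA carried over in the lysate (50 nmol each), which is consistent with the stated purpose of neutralizing the EDTA.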
ChIPmentation library preparation
For standard reverse crosslinking, chromatin complexes were diluted with 200 μl ChIP elution buffer (10 mM Tris/HCl, 0.5% SDS, 300 mM NaCl, and 5 mM EDTA) and 2 μl of 20 μg/ml proteinase K (Thermo Scientific). Samples were vortexed and incubated with shaking overnight at 65 °C. After reverse crosslinking, 1 μl of 20 μg/ml RNase (Sigma) was added and samples were incubated at 37 °C for 30 min. After another 2 h of incubation with 2 μl of proteinase K (20 mg/ml) at 55 °C, samples were placed in a magnet to trap the magnetic beads and supernatants were collected. DNA was purified using the Qiagen MinElute PCR Purification Kit. Fifteen microliters of PCR master mix and 5 μl of primer mix (Nextera, Illumina) were added to 20 μl of eluted DNA, and libraries were amplified as described for HT-ChIPmentation libraries.
Preparation of conventional input control
Sonicated material from 50 k cells was reverse crosslinked as described for ChIPmentation. Two nanograms of DNA was used for library preparation using the ThruPLEX DNA-seq kit (Rubicon Genomics) with 11 cycles of PCR amplification.
Post-PCR library cleanup and sequencing
After PCR amplification, library cleanup was done using Agencourt AMPure XP beads (Beckman Coulter) at a ratio of 1:1. DNA concentrations in purified samples were measured using the Qubit dsDNA HS Kit (Invitrogen). Libraries were pooled and single-end sequenced (50 cycles) using the NextSeq 500 platform (Illumina).
Basic processing of ChIP-seq and input control sequencing data
Quality of the sequenced samples was assessed using FastQC v0.11.5 [18]. Samples were mapped to the human reference genome (hg19) using Bowtie2 v2.2.3 [19] with default settings. Further basic processing was performed using HOMER v4.8.3 [20]. Specifically, mapped reads were converted into tag directories with the makeTagDirectory command, using settings for the human genome (-genome hg19) and removing duplicate reads by allowing only one tag to start per base pair (-tbp 1).
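The effect of allowing only one tag per base pair can be illustrated with a minimal sketch. This is not HOMER's implementation: it assumes that a tag is identified by chromosome, 5' start position and strand, and the function name is hypothetical.

```python
def keep_tags_per_bp(tags, max_per_bp=1):
    """Keep at most `max_per_bp` tags starting at each (chrom, start, strand).

    `tags` is an iterable of (chrom, start, strand) tuples; input order is kept.
    Assumption of this sketch: strands are treated as separate positions.
    """
    seen = {}
    kept = []
    for tag in tags:
        n = seen.get(tag, 0)
        if n < max_per_bp:
            kept.append(tag)
        seen[tag] = n + 1
    return kept

tags = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),  # duplicate start position, dropped
    ("chr1", 100, "-"),  # same position, opposite strand, kept
    ("chr1", 101, "+"),
]
deduplicated = keep_tags_per_bp(tags)
print(len(tags), "->", len(deduplicated))
```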
Genome browser visualizations
Bedgraphs were created for each sample using HOMER’s makeUCSCfile. Tracks were uploaded and visualized using the UCSC genome browser [21].
Peak finding and plotting peak metrics
Peak finding was performed using the findPeaks command in HOMER. Peaks were called using default settings for histone modifications (-style histone) and transcription factors (-style factor) for H3K27Ac and CTCF, respectively, with input (-i) as a control. Visualization was done in R v3.1.0 [22], using the built-in barplot and boxplot R-functions to plot peak numbers and peak quality scores, respectively.
Making and annotating peak catalogs
Peak catalogs were created by merging the peak files of all analyzed samples using HOMER’s mergePeaks command. The setting used (-size given) ensured that peaks with literal overlap were merged into one peak, while peaks unique to one sample were added directly to the peak catalog. Subsequently, peak catalogs were annotated with unnormalized (-raw) read counts within each peak of the catalog for each individual sample using HOMER’s annotatePeaks.pl script.
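The idea of merging literally overlapping peaks into a catalog can be sketched as follows. This is a simplified illustration rather than HOMER's algorithm: it assumes half-open (chrom, start, end) coordinates and reports the union extent of each group of overlapping peaks, which may differ from the coordinates mergePeaks reports. The function name is hypothetical.

```python
def merge_peak_sets(peak_sets):
    """Merge peaks from several samples; overlapping peaks become one entry.

    `peak_sets` is a list of peak lists, each peak a (chrom, start, end) tuple
    with half-open coordinates (assumption of this sketch).
    """
    all_peaks = sorted(p for peaks in peak_sets for p in peaks)
    catalog = []
    for chrom, start, end in all_peaks:
        if catalog and catalog[-1][0] == chrom and start < catalog[-1][2]:
            # Overlaps the previous catalog entry: extend it
            last = catalog[-1]
            catalog[-1] = (chrom, last[1], max(last[2], end))
        else:
            # No overlap: the peak enters the catalog unchanged
            catalog.append((chrom, start, end))
    return catalog

sample_a = [("chr1", 100, 200), ("chr1", 500, 600)]
sample_b = [("chr1", 150, 250), ("chr2", 100, 200)]
catalog = merge_peak_sets([sample_a, sample_b])
print(catalog)
```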
Plotting peak read distributions and correlation between samples
Raw counts were log2 transformed in R after adding a pseudocount of 1, as follows: log(df[,countsCols]+1,2). Log2 counts were subsequently plotted using the built-in boxplot R-function. The same log2 counts were used to calculate sample correlations, using the built-in cor R-function with Spearman correlation. Correlation matrices were visualized with the pheatmap function from the pheatmap R-package using color scales generated with the built-in colorRampPalette R-function.
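For readers who prefer to see the two steps spelled out, the following sketch reproduces the log2(count + 1) transformation and a Spearman correlation (Pearson correlation of average ranks) using only the Python standard library. It is an illustrative equivalent of the R calls above, not the code used in the analysis; all function names are hypothetical.

```python
import math

def log2_counts(counts):
    """log2 of each raw count after adding a pseudocount of 1."""
    return [math.log2(c + 1) for c in counts]

def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

raw_a = [0, 3, 15, 120, 7]   # made-up counts for illustration
raw_b = [1, 2, 30, 90, 4]
print(spearman(log2_counts(raw_a), log2_counts(raw_b)))
```

Because log2(x + 1) is strictly increasing, it does not change ranks, so the Spearman correlation of the log2 counts equals that of the raw counts; the transformation matters for the boxplots rather than for the correlation values.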
Plotting reads within 1 kb bins for input control samples
A file containing 1 kb bins covering the whole genome was created using the makewindows command from bedtools v2.26.0 [23] with a window size of 1 kb (-w 1000). Chromosome sizes were retrieved as follows: mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from hg19.chromInfo" > hg19.genome. Raw reads in each 1 kb bin for each input control were counted using HOMER’s annotatePeaks.pl script, as described above. Raw read counts were converted to RPKM in R based on the standard RPKM formula. The resulting RPKM distributions were plotted with the built-in boxplot R-function.
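The binning and the RPKM conversion can be sketched as below. The RPKM formula used is the common definition, reads × 10^9 / (region length in bp × total mapped reads). The sketch assumes that the final window on each chromosome is truncated at the chromosome end and that the total is the number of mapped reads in the library; neither detail is specified above, and the function names are hypothetical.

```python
def make_windows(chrom_sizes, width=1000):
    """Tile each chromosome with half-open windows of `width` bp.

    Assumption: the last window is truncated at the chromosome end.
    """
    windows = []
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, width):
            windows.append((chrom, start, min(start + width, size)))
    return windows

def rpkm(count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return count * 1e9 / (region_length_bp * total_mapped_reads)

# Toy chromosome of 2500 bp gives two full bins and one truncated bin
windows = make_windows({"chrT": 2500})
print(windows)

# 5 reads in a 1 kb bin of a library with 10 million mapped reads
print(rpkm(5, 1000, 10_000_000))
```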
Determining top peak overlap
Peaks identified in individual samples were overlapped with in-house code using the IRanges [24] R-package. Top peak overlap was defined as the percentage of high-quality peaks (the 50% of peaks with the highest quality scores) in the reference sample that overlap (≥1 bp) with a peak in the second sample. For the purpose of determining peak overlap, CTCF peaks were extended by 50 bp upstream and downstream, because findPeaks with -style factor calls only a small region around the peak maximum. Peak overlaps were visualized using the pheatmap function from the pheatmap R-package using color scales generated with the built-in colorRampPalette R-function.
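The top peak overlap metric described above can be sketched in a few lines. This is not the in-house IRanges code: it uses a naive pairwise comparison, assumes half-open coordinates, applies the optional extension to the peaks of both samples, and rounds the number of top peaks down. All names are hypothetical.

```python
def top_peak_overlap(reference, other, top_fraction=0.5, extend_bp=0):
    """Percentage of top-scoring reference peaks overlapping a peak in `other`.

    reference: list of (chrom, start, end, score); other: list of (chrom, start, end).
    Coordinates are half-open, so an overlap of >= 1 bp means start < end on both sides.
    """
    if not reference:
        raise ValueError("reference sample has no peaks")
    ranked = sorted(reference, key=lambda p: p[3], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top = ranked[:n_top]
    # Assumption: the extension is applied to peaks of both samples
    others = [(c, s - extend_bp, e + extend_bp) for c, s, e in other]
    hits = 0
    for chrom, start, end, _score in top:
        start, end = start - extend_bp, end + extend_bp
        if any(c == chrom and s < end and start < e for c, s, e in others):
            hits += 1
    return 100.0 * hits / len(top)

reference = [
    ("chr1", 100, 200, 10.0),
    ("chr1", 500, 600, 9.0),
    ("chr1", 900, 1000, 1.0),
    ("chr2", 100, 200, 0.5),
]
other = [("chr1", 150, 250), ("chr1", 620, 700)]
print(top_peak_overlap(reference, other))                # without extension
print(top_peak_overlap(reference, other, extend_bp=50))  # with 50 bp extension
```

In this toy example, the second top peak only overlaps after extension, which illustrates why narrow factor-style peaks were extended before comparison.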
Comparing library complexity
To compare duplication rates between HT-ChIPmentation and ChIPmentation samples, fastq files were randomly down-sampled to the total number of reads in the smallest file for each cell number. Down-sampling was performed using the fastq-sample script from fastq-tools v0.8 [25]. The fraction of unique reads was subsequently determined for each file using FastQC v0.11.5.
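The logic of comparing library complexity at equal depth can be sketched as follows. This is a simplification: it samples reads without replacement with Python's random module instead of fastq-sample, and it counts exactly identical sequences as duplicates, which is not necessarily how FastQC estimates duplication. Function names and the toy reads are hypothetical.

```python
import random

def downsample(reads, n, seed=0):
    """Randomly draw `n` reads without replacement (seeded for reproducibility)."""
    return random.Random(seed).sample(reads, n)

def unique_fraction(reads):
    """Fraction of distinct sequences among the reads."""
    return len(set(reads)) / len(reads)

def compare_complexity(libraries, seed=0):
    """Down-sample every library to the smallest one, then report unique fractions."""
    target = min(len(reads) for reads in libraries.values())
    return {
        name: unique_fraction(downsample(reads, target, seed))
        for name, reads in libraries.items()
    }

libraries = {
    "library_a": ["ACGT", "ACGT", "CCGA", "GGTA"],  # made-up reads
    "library_b": ["ACGT", "TTGA"],
}
result = compare_complexity(libraries)
print(result)
```

Down-sampling to a common depth matters because the observed duplication rate rises with sequencing depth, so libraries of different sizes are not directly comparable.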
Motif enrichment analysis
Enrichments of known transcription factor binding motifs in peaks were identified using HOMER’s findMotifsGenome.pl script with default settings.