ChIP on SNP-chip for genome-wide analysis of human histone H4 hyperacetylation
© McCann et al; licensee BioMed Central Ltd. 2007
Received: 25 April 2007
Accepted: 14 September 2007
Published: 14 September 2007
SNP microarrays are designed to genotype Single Nucleotide Polymorphisms (SNPs). These microarrays report hybridization of DNA fragments and therefore can be used for the purpose of detecting genomic fragments.
Here, we demonstrate that a SNP microarray can be effectively used in this way to perform chromatin immunoprecipitation (ChIP) on chip as an alternative to tiling microarrays. We illustrate this novel application by mapping whole genome histone H4 hyperacetylation in human myoblasts and myotubes. We detect clusters of hyperacetylated histone H4, often spanning up to 300 kilobases of genomic sequence. Using complementary genome-wide analyses of gene expression by DNA microarray, we demonstrate that these clusters of hyperacetylated histone H4 tend to be associated with expressed genes.
The use of a SNP array for a ChIP-on-chip application (ChIP on SNP-chip) will be of great value to laboratories interested in determining general rules regarding the relationship of specific chromatin modifications to transcriptional status throughout the genome, and in examining the asymmetric modification of chromatin at heterozygous loci.
Chromatin immunoprecipitation (ChIP) is a technique widely used to study interactions of proteins with specific genomic regions. Several methodologies have been devised for the detection of the genomic fragments generated by a ChIP experiment (reviewed in ). In particular, the use of DNA microarray methodology (ChIP-on-chip) allows for high-throughput analysis of thousands of genomic sequences simultaneously. Genome tiling arrays covering entire genomes can be used to map the sites of DNA-protein interaction on a genomic scale, although at a high cost.
Here, we propose the use of SNP microarrays (SNP-chip) to evaluate ChIP products (ChIP on SNP-chip) as an alternative to genome tiling arrays. SNP microarrays are designed to genotype thousands of Single Nucleotide Polymorphisms (SNPs) by hybridization of genomic fragments to an array of short nucleotide sequences. The evaluation of SNP-chip hybridization in binary terms (a probe in the SNP-chip either does or does not hybridize) might be appropriate for ChIP-on-chip experiments where a large number of DNA-protein interaction sites have to be detected and the evaluation of the strength of each interaction is not an issue. The mapping of histone modifications throughout the genome fulfills these requirements. To illustrate the ChIP on SNP-chip methodology, we present here a genome-wide analysis of histone H4 hyperacetylation in human myoblasts and myotubes.
Histone H4 hyperacetylation has been associated with increased gene expression, and, although the molecular basis of this effect is under investigation, the precise pattern of histone acetylation and its effect on gene expression is not completely understood. The assembly of an active transcriptional complex at the promoter is an essential feature of eukaryotic gene expression [9, 10]. Histone acetylation of the promoter precedes the activation of many genes and is thought to establish a chromatin environment suitable for the assembly of the transcriptional complex [11, 12]. The genomic distribution of H4 acetylation has been studied for individual genes (reviewed in ), across human chromosomes 21 and 22, and at whole genome level in yeast, but never for a complete mammalian genome. Quantification of the pattern of histone H4 hyperacetylation in the human genome will add valuable information about the mechanisms by which this histone modification affects chromatin structure and controls gene expression.
The analysis of our results indicates significant clusters of H4 hyperacetylation at a range of 300 kilobases in human samples from both myoblasts and myotubes. Complementary analysis of gene expression of the same samples indicates that histone H4 hyperacetylation is positively associated with gene expression at a range of [-300 Kb, +300 Kb] from the start of gene transcription; a much greater range than previously reported. These results show that ChIP on SNP-chip can be used to provide biological insight into how histone H4 hyperacetylation affects both eukaryotic transcription and chromatin's structural integrity.
To study the relationship between histone H4 hyperacetylation and gene expression, we obtained gene expression data from equivalent human myoblast and myotube samples. Analysis of mRNA cellular transcripts using the Affymetrix HGU133A/B chip set produced reproducible results for 36,865 probesets (10,872 hybridized and 25,993 not hybridized) in human myoblasts, and for 36,905 probesets (10,814 hybridized and 26,091 not hybridized) in human myotubes. An overview of the genomic distribution of these results is given in Figure 1.
In contrast to our observation for the SNP-array probes, we did not observe correlation between probesets corresponding to transcripts in neighbouring regions of the genome (Figure 2B, Additional file 2). This is consistent with multiple lines of evidence showing a general lack of gene expression correlation between neighbouring genes in eukaryotic genomes, with few exceptions such as the histone genes or Hox genes (see e.g. ). Therefore, the clusters of hyperacetylated H4 that were identified cannot be explained by the correlated expression of clusters of genes.
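The comparison between neighbouring probes can be illustrated with a minimal sketch. The statistic used in the original analysis is not described in this text, so the measure below (the fraction of nearby probe pairs with the same binary call, compared with the agreement expected by chance) is an assumption chosen for illustration, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def neighbour_agreement(probes, max_distance):
    """Compare observed and chance agreement of binary calls for nearby probes.

    probes: list of (chromosome, position_bp, call) tuples, where call is
    True (hybridized) or False (not hybridized).
    Returns (observed, expected, n_pairs), or None if no pair is close enough.
    """
    # All pairs on the same chromosome separated by at most max_distance bp.
    # This is quadratic in the number of probes, which is fine for a sketch.
    pairs = [
        (a[2], b[2])
        for a, b in combinations(probes, 2)
        if a[0] == b[0] and abs(a[1] - b[1]) <= max_distance
    ]
    if not pairs:
        return None
    observed = sum(x == y for x, y in pairs) / len(pairs)
    # Agreement expected if calls were independent of genomic position.
    p = sum(1 for _, _, call in probes if call) / len(probes)
    expected = p * p + (1 - p) * (1 - p)
    return observed, expected, len(pairs)


# Hypothetical toy data: two tight clusters on chromosome 1, scattered probes on 2.
toy_probes = [
    ("1", 100, True), ("1", 200, True),
    ("1", 1_000_000, False), ("1", 1_000_100, False),
    ("2", 150, True), ("2", 5_000_000, False),
]
print(neighbour_agreement(toy_probes, max_distance=300_000))
```

An observed agreement well above the chance value would indicate clustering of calls at that distance, whereas values close to chance would correspond to the lack of correlation reported for the expression probesets.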
Imprinted loci should display allele-specific distribution of specific chromatin modifications. Therefore, we examined all heterozygous loci following determination of genotypes by SNP analysis of total DNA isolated from myoblasts to identify novel regions displaying reciprocally modified chromatin. Of the SNPs assessed, 2,024 SNPs were found to be heterozygous. 408 and 496 of the heterozygous SNPs had a relative signal above background in myoblasts and in myotubes, respectively. The contribution of each allele to the overall relative signal was equivalent in the majority of these heterozygous SNPs. However, 137 of the 408 and 67 of the 496 heterozygous SNPs had signal from only one allele in myoblasts and myotubes, respectively. In most of the 248 heterozygous SNPs with signal both in myoblasts and in myotubes, the relative contribution of each allele was maintained between the myotube and myoblast samples (20 and 188 with signal from one or two alleles, respectively). However, the other 40 SNPs suggested a change (39 of them from two alleles to one, and just 1 from one allele to the other). Together, these data indicate an unanticipated level of complexity in chromatin acetylation at numerous heterozygous loci.
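The allele-level bookkeeping described above can be sketched as follows. The way per-allele signal was separated from background is not specified in this text, so the simple threshold, the state labels and the function names are assumptions made for illustration only.

```python
from collections import Counter


def allele_state(signal_a, signal_b, background):
    """Label a heterozygous SNP by which alleles show signal above background.

    The single background threshold is an assumption of this sketch.
    """
    a = signal_a > background
    b = signal_b > background
    if a and b:
        return "both"
    if a:
        return "A only"
    if b:
        return "B only"
    return "none"


def compare_states(myoblast, myotube):
    """Count (myoblast_state, myotube_state) pairs for SNPs with signal in both.

    myoblast, myotube: dicts mapping a SNP identifier to its allele state.
    """
    shared = [
        snp for snp in myoblast
        if snp in myotube and myoblast[snp] != "none" and myotube[snp] != "none"
    ]
    return Counter((myoblast[snp], myotube[snp]) for snp in shared)


# The reported breakdown of SNPs with signal in both cell types is consistent:
# 20 (one allele, maintained) + 188 (two alleles, maintained) + 40 (changed).
assert 20 + 188 + 40 == 248

# Hypothetical toy example.
blast = {"rs1": "both", "rs2": "A only", "rs3": "both", "rs4": "none"}
tube = {"rs1": "both", "rs2": "A only", "rs3": "B only", "rs4": "both"}
print(compare_states(blast, tube))
```

Pairs with identical states correspond to SNPs whose allelic contribution was maintained between the two cell types; pairs with different states correspond to the SNPs that suggested a change.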
Discussion and conclusion
We have presented the novel use of SNP-arrays for identification of DNA fragments in a chromatin immunoprecipitation experiment (ChIP on SNP-chip). Our application of this methodology to the analysis of histone H4 hyperacetylation in human myoblasts and myotubes suggests that this histone modification happens in clusters across large genomic distances (possibly up to 300 kilobases).
The study of gene expression in equivalent samples indicates no correlation of gene expression between nearby genes, but shows that the clusters of hyperacetylated histone H4 are associated with expressed genes at ranges of up to 300 kilobases. Together, these results suggest that histone H4 hyperacetylation is a relatively imprecise mechanism that acts over genomic regions sometimes spanning multiple genes, which are then expressed or not according to the more precise activity of the transcription factor machinery. It has been suggested that such a mechanism could facilitate rapid gene transcription in response to external stimuli.
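The association between hyperacetylated probes and expressed genes within a genomic window can be illustrated with a small sketch. The exact test used in the original analysis is not given in this text; the 2 × 2 table and odds ratio below are one plausible way to express such an association, and all names and toy values are hypothetical.

```python
def window_association(genes, probes, window=300_000):
    """Cross-tabulate gene expression against nearby hyperacetylation.

    genes:  list of (chromosome, transcription_start_bp, expressed) tuples.
    probes: list of (chromosome, position_bp, hyperacetylated) tuples.
    Returns a dict keyed by (expressed, hyperacetylated_probe_nearby).
    Genes with no probe at all inside the window are skipped, because
    nothing can be said about their acetylation status.
    """
    by_chrom = {}
    for chrom, pos, hyper in probes:
        by_chrom.setdefault(chrom, []).append((pos, hyper))
    table = {(e, n): 0 for e in (True, False) for n in (True, False)}
    for chrom, tss, expressed in genes:
        nearby = [
            hyper for pos, hyper in by_chrom.get(chrom, [])
            if abs(pos - tss) <= window
        ]
        if not nearby:
            continue
        table[(expressed, any(nearby))] += 1
    return table


def odds_ratio(table):
    """Odds ratio of the 2 x 2 table, or None when it is undefined."""
    a, b = table[(True, True)], table[(True, False)]
    c, d = table[(False, True)], table[(False, False)]
    if b * c == 0:
        return None
    return (a * d) / (b * c)


# Hypothetical toy data.
toy_genes = [
    ("1", 1_000_000, True), ("1", 1_050_000, False), ("1", 5_000_000, False),
    ("2", 2_000_000, True), ("2", 9_000_000, False), ("2", 9_100_000, True),
]
toy_probes = [
    ("1", 1_100_000, True), ("1", 5_200_000, False),
    ("2", 2_250_000, True), ("2", 9_000_500, False),
]
print(odds_ratio(window_association(toy_genes, toy_probes)))
```

An odds ratio above 1 indicates that expressed genes are more likely than non-expressed genes to have a hyperacetylated probe within the window; repeating the calculation for several window sizes would show over what range the association holds.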
We have demonstrated that ChIP on SNP-chip can be used to characterize genomic sites of DNA-protein interaction. ChIP on SNP-chip makes it feasible to establish general rules of association between proteins and DNA but does not provide precise information regarding the strength of individual interaction sites because of the intrinsic noise level of the SNP-array used in this way. We recommend this methodology for experiments like the one presented in this manuscript, where the aim is to identify a large number of genomic sites of protein-DNA interaction. In particular, this methodology is well suited to the study of histone modifications. In addition, the use of SNP chips provides information about the allelic distribution of the chromatin modifications under study. This could be of value if the researcher is studying imprinting or, alternatively, the effect of genetic variants on the binding of a protein to DNA. Our analysis of heterozygous loci suggests an unanticipated level of complexity in chromatin acetylation between alleles. Since we performed the analyses presented in this manuscript, newer SNP arrays with higher resolution have been produced. For example, the new Genome-Wide Human SNP Array 6.0 (Affymetrix) features more than 906,600 single nucleotide polymorphisms (SNPs) and more than 946,000 probes for the detection of copy number variation. These new-generation SNP arrays should allow more thorough coverage of the genome for efficient analysis of ChIP products.
Human fetal skeletal myoblasts (a generous gift from Dr. Eric Shoubridge) were cultured in SKGM-2 media (Cat. # cc-3245, Cambrex, East Rutherford, NJ) containing 10% fetal bovine serum (Hyclone, Logan, Utah), 10 ng/mL human epidermal growth factor, 0.1 mg/mL insulin, 0.5 mg/mL BSA, 0.5 mg/mL fetuin, 0.39 μg/mL dexamethasone and 50 μg/mL gentamicin and maintained at 37°C and 5% CO2. Cells were grown to approximately 70% confluence and were induced to differentiate into multinucleated myotubes by mitogen withdrawal. Cells were maintained in differentiation media (DMEM supplemented with 2.5% horse serum) for five days prior to harvesting.
RNA extraction and hybridization
Total RNA was extracted from 1 × 10⁶ myoblasts and differentiated myotubes using the RNeasy Mini Kit (Qiagen, Valencia, CA) (5 μg) according to the manufacturer's instructions. RNA was quantified, quality-checked by the Bioanalyzer (Agilent) and reverse-transcribed with a cDNA synthesis kit in the presence of SuperScript II RT (Invitrogen-Life Technologies, Inc.) and an oligo dT-T7 primer (Affymetrix Inc., Santa Clara, CA). Ten microliters of purified cDNA were used for the in vitro transcription (IVT) amplification reaction, in the presence of biotinylated nucleotides (Enzo Biochem Inc.). Labeled cRNA (15 μg) was fragmented by incubation at 94°C for 35 min in fragmentation buffer (GeneChip Sample Cleanup, Qiagen) and hybridized to the Affymetrix HG-U133 microarray set. Arrays were scanned using a GeneArray 2500 scanner (Affymetrix) and analyzed using MicroArray Suite 5.0 (Affymetrix).
DNA extraction and chromatin immunoprecipitation
Total genomic DNA was extracted from myoblasts using the Qiagen DNeasy Tissue Kit (Valencia, CA) and was used as a control for the SNP-Array studies. The antibody used in chromatin immunoprecipitation was raised against an acetyl-histone H4 peptide corresponding to amino acids 2–19 of Tetrahymena histone H4 acetylated on Lys5, Lys8, Lys12 and Lys16 (Upstate Cat. # 06–866). Chromatin immunoprecipitation was performed using the ChIP Assay kit from Upstate Biotech (Charlottesville, VA) (Cat. # 17–295) as per the manufacturer's instructions. Briefly, 1 × 10⁸ cells were cross-linked for 15 minutes at room temperature with 1% formaldehyde. Cells were washed twice with ice-cold PBS, scraped from tissue culture plates and subsequently resuspended in SDS lysis buffer (1% SDS, 10 mM EDTA, 50 mM Tris-HCl, pH 8.1) supplemented with protease inhibitors (Mini Complete, Roche, Cat. # 1836153) and lysed for 10 minutes on ice. DNA was sheared to between 1 and 2 kb by sonication for three 15-second pulses with a 1.5 mm step probe equipped sonicator set to a magnitude of 30%. Sonicated lysates were cleared by centrifugation for 10 minutes at 13,000 rpm at 4°C. Cleared lysates were diluted 10-fold in 0.01% SDS, 1.1% Triton X-100, 1.2 mM EDTA, 16.7 mM Tris-HCl, pH 8.1, 167 mM NaCl and incubated with 70 μl of Salmon Sperm DNA/Protein A Agarose-50% slurry (Upstate Biotech, Cat. # 16–157C) for 1 hour at 4°C with rotation to reduce non-specific background. Five μl of anti-acetyl-H4 (Upstate Biotech, Cat. # 06–866) was added to the lysates (per 10⁶ cell equivalents). Immune complexes were recovered with Salmon Sperm DNA/Protein A Agarose and washed twice with low salt buffer (0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris-HCl, pH 8.1, 150 mM NaCl); twice with high salt buffer (0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris-HCl, pH 8.1, 500 mM NaCl); twice with LiCl buffer (0.25 M LiCl, 1% NP40, 1% deoxycholate, 1 mM EDTA, 10 mM Tris-HCl, pH 8.1), and twice with TE.
Immune complexes were eluted with an elution buffer containing 1% SDS and 0.1 M NaHCO3 for 30 min at room temperature with rotation. DNA-protein crosslinks were reversed for 4 hours at 65°C with 0.2 mM NaCl. Samples were deproteinated and DNA isolated using columns as described by the manufacturer (Qiagen). DNA was eluted from columns with 30 μl of ddH2O.
Immunoprecipitated DNA and control (whole-cell extract) DNA were assayed according to the protocol (GeneChip Mapping Assay manual) supplied by Affymetrix. Briefly, a total of 250 ng DNA was digested with XbaI then ligated to XbaI adaptor before PCR amplification. Cycling conditions were: 95°C for 3 minutes followed by 35 cycles of 95°C for 20 seconds, 59°C for 15 seconds, and 72°C for 15 seconds. Final extension was done at 72°C for 7 minutes (DNA Engine Tetrad PTC-225, MJ Research, Waltham, MA). To evaluate PCR products, 3 μL of each PCR product was mixed with 3 μL of 2× gel loading dye, loaded on a 2% Tris-borate-EDTA gel and run at 120 V for 1 hour to check for the expected product between 250 and 1,000 bp. Twenty μg of PCR product was fragmented with DNase I and biotinylated overnight at 37°C using biotin-N6-ddATP (Perkin Elmer) and terminal transferase (Promega, Madison, WI). Target hybridization, washing, staining and scanning were performed as recommended by the manufacturer.
The 10 K SNP arrays were scanned with the Affymetrix GeneChip Scanner 3000 using GeneChip Operating System 1.0 (Affymetrix). Data files were generated automatically and genotype calls were made automatically by GeneChip DNA Analysis Software 2.0 (Affymetrix). Genomic location of SNP probesets was derived using the Affymetrix GeneChip Mapping 10 K library file (Mapping10K_Xba131) and the Ensembl human genome map (v27.35a.1), based on the release 35 of the human genome sequence (May 2004). Each probeset is identified in the Affymetrix annotation by its tscID (The SNP Consortium). These IDs are mapped to rsIDs (NCBI RefSNP ID), which are used to identify SNP positions in the Ensembl database.
SNP-Array probeset selection and analysis
This study used a pre-commercial release version of the Affymetrix GeneChip Mapping 10 K SNP Array for the human genome (ax13339). This microarray contains probesets for 10,043 SNPs distributed across the 22 autosomes and the X chromosome. An initial selection process identified probesets that map to an unambiguous location in the human genome; tscIDs were mapped to rsIDs, for which genomic locations can be found in Ensembl. Screening removed 448 tscIDs, which could not be mapped to any rsID; 2,362 rsIDs not localized in Ensembl; and 33 with multiple locations. The remaining 7,200 probesets, which have a unique location, were used in our analysis.
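The screening steps above can be sketched as a simple filter. The sketch assumes that the tscID to rsID mapping and the rsID locations are available as plain dictionaries; these structures, the function name and the toy identifiers are hypothetical and are not part of the Affymetrix or Ensembl tooling.

```python
def select_unique_probesets(tsc_ids, tsc_to_rs, rs_locations):
    """Keep probesets whose SNP maps to exactly one genomic location.

    tsc_ids:      iterable of tscID strings from the array annotation.
    tsc_to_rs:    dict mapping tscID -> rsID (missing if no mapping exists).
    rs_locations: dict mapping rsID -> list of (chromosome, position) hits.
    Returns (kept, dropped): kept maps tscID -> its single location, and
    dropped counts the probesets removed at each screening step.
    """
    kept = {}
    dropped = {"no_rsid": 0, "no_location": 0, "multiple_locations": 0}
    for tsc in tsc_ids:
        rs = tsc_to_rs.get(tsc)
        if rs is None:
            dropped["no_rsid"] += 1
            continue
        locations = rs_locations.get(rs, [])
        if not locations:
            dropped["no_location"] += 1
        elif len(locations) > 1:
            dropped["multiple_locations"] += 1
        else:
            kept[tsc] = locations[0]
    return kept, dropped


# The counts reported in the text are internally consistent.
assert 10_043 - 448 - 2_362 - 33 == 7_200

# Hypothetical toy annotation.
kept, dropped = select_unique_probesets(
    ["TSC1", "TSC2", "TSC3", "TSC4"],
    {"TSC1": "rs1", "TSC2": "rs2", "TSC3": "rs3"},
    {"rs1": [("1", 1000)], "rs2": [("2", 500), ("7", 900)]},
)
print(kept, dropped)
```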
As a further selection for high quality SNP data, we performed a control genotyping experiment in triplicate using the total DNA from the same myoblast cell line. Of the 7,200 uniquely mapped probesets, only the 6,464 for which the same genotyping was obtained in at least two of the three replicates were used for our analysis.
The analysis of SNP-array data was performed in biological triplicates. We analyzed the results from SNP arrays using the Genotyping Tools V 1.0 (Affymetrix). This tool assesses whether measurable hybridization to a probeset is present, and provides heterozygosity data. Probes were classified as hybridizing or not based on calls of Present (P) (present in at least two of three replicates) or Absent (A) (absent in at least two of three replicates). All other probesets were not included in further analysis.
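The two-out-of-three rule used for replicate calls can be written as a short helper. This is a minimal sketch: the call labels other than "P" and "A" (for example "M" or "NoCall") are assumptions about how non-informative calls might be encoded, and the function names are hypothetical.

```python
from collections import Counter


def majority_call(calls, min_agree=2):
    """Return the call shared by at least min_agree replicates, else None."""
    if not calls:
        return None
    value, count = Counter(calls).most_common(1)[0]
    return value if count >= min_agree else None


def classify_probeset(calls):
    """Map triplicate Present/Absent calls to a binary hybridization status.

    Returns True (hybridized) for a consensus "P", False (not hybridized)
    for a consensus "A", and None when the probeset should be excluded.
    """
    consensus = majority_call(calls)
    if consensus == "P":
        return True
    if consensus == "A":
        return False
    return None


print(classify_probeset(["P", "P", "A"]))
print(classify_probeset(["A", "M", "A"]))
print(classify_probeset(["P", "A", "M"]))
```

The same `majority_call` helper applies to the genotype concordance filter, where the values are genotype calls rather than Present/Absent calls.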
DNA microarray probeset selection and analysis
The Affymetrix HGU133A/B gene expression array set contains 44,760 probesets from genes on the 22 autosomes and on chromosomes X and Y. We selected the probesets that were unambiguously mapped to a genomic position in the NetAffx probeset annotations (April 12, 2005, http://www.affymetrix.com/, based on NCBI release 35 of the human genome sequence). Screening removed 858 probesets without a genomic location in NetAffx, 3,114 with multiple locations, and 52 corresponding to genes in chromosome Y, which is not covered by the SNP Array. The remaining 40,736 probesets with a unique genomic location were used in our analysis.
We analyzed the results from the expression arrays using the MicroArray Suite 5.0 (Affymetrix). This tool assesses whether measurable hybridization to a probeset is present. Analysis of gene expression was performed in biological triplicates. Probesets were classified as hybridizing or not based on calls of Present (P) (present in at least two of three replicates) or Absent (A) (absent in at least two of three replicates), respectively. All other probesets were not included in further analysis.
Availability of microarray data in GEO
We have deposited all microarray data used in this work at the Gene Expression Omnibus database (National Center for Biotechnology Information), where it is available under the super-series identifier GSE4133. This super-series is composed of the following subset series: GSE4131 (triplicate Affymetrix HGU133A/B expression chips hybridized to RNA extracted from myoblasts and myotubes) and GSE4132 (triplicate Affymetrix 10 K ax13339 SNP-chips hybridized to DNA from myoblasts (control), myoblast ChIP, and myotube ChIP). The gene expression data is also available at the StemBase database of stem cell gene expression data under experiment identifier E204 (containing sample S267 for the myoblasts, and sample S273 for the myotubes).
The authors would like to thank Kathy Sheikheleslamy, Neal Sanche, and Dr. Eric Shoubridge for help with various technical aspects of the paper. MAR holds the Canada Research Chair in Molecular Genetics and is a Howard Hughes Medical Institute International Research Scholar. MAA holds a Canada Research Chair in Bioinformatics. This work was supported by funding from Genome Canada, the Canadian Institutes of Health Research, the National Institutes for Health Research, the Ontario Research Development Challenge Fund, and the Canada Research Chair Program.
- Solomon M, Larsen P, Varshavsky A: Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell. 1988, 53 (6): 937-947. 10.1016/S0092-8674(88)90469-2.
- Rodriguez B, Huang T: Tilling the chromatin landscape: emerging methods for the discovery and profiling of protein-DNA interactions. Biochem Cell Biol. 2005, 83 (4): 525-534. 10.1139/o05-055.
- Ren B, Robert F, Wyrick J, Aparicio O, Jennings E, Simon I, Zeitlinger J, Schreiber J, Hannett N, Kanin E: Genome-Wide Location and Function of DNA Binding Proteins. Science. 2000, 290 (5500): 2306-2309. 10.1126/science.290.5500.2306.
- Bertone P, Stolc V, Royce T, Rozowsky J, Urban A, Zhu X, Rinn J, Tongprasit W, Samanta M, Weissman S: Global identification of human transcribed sequences with genome tiling arrays. Science. 2004, 306 (5705): 2242-2246. 10.1126/science.1103388.
- Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J: Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science. 1998, 280 (5366): 1077-1082. 10.1126/science.280.5366.1077.
- Allfrey VG, Faulkner R, Mirsky AE: Acetylation and Methylation of Histones and Their Possible Role in the Regulation of RNA Synthesis. Proc Natl Acad Sci USA. 1964, 51: 786-794. 10.1073/pnas.51.5.786.
- Shogren-Knaak M, Ishii H, Sun JM, Pazin MJ, Davie JR, Peterson CL: Histone H4-K16 acetylation controls chromatin structure and protein interactions. Science. 2006, 311 (5762): 844-847. 10.1126/science.1124000.
- Shia WJ, Pattenden SG, Workman JL: Histone H4 lysine 16 acetylation breaks the genome's silence. Genome Biol. 2006, 7 (5): 217. 10.1186/gb-2006-7-5-217.
- Felsenfeld G, Boyes J, Chung J, Clark D, Studitsky V: Chromatin structure and gene expression. Proc Natl Acad Sci USA. 1996, 93 (18): 9384-9388. 10.1073/pnas.93.18.9384.
- Blackwood EM, Kadonaga JT: Going the distance: a current view of enhancer action. Science. 1998, 281 (5373): 60-63. 10.1126/science.281.5373.60.
- Wade PA, Wolffe AP: Transcriptional regulation: SWItching circuitry. Curr Biol. 1999, 9 (6): R221-224. 10.1016/S0960-9822(99)80134-1.
- Struhl K: Histone acetylation and transcriptional regulatory mechanisms. Genes Dev. 1998, 12 (5): 599-606.
- Bernstein BE, Kamal M, Lindblad-Toh K, Bekiranov S, Bailey DK, Huebert DJ, McMahon S, Karlsson EK, Kulbokas EJ, Gingeras TR: Genomic maps and comparative analysis of histone modifications in human and mouse. Cell. 2005, 120 (2): 169-181. 10.1016/j.cell.2005.01.001.
- Kurdistani SK, Tavazoie S, Grunstein M: Mapping global histone acetylation patterns to gene expression. Cell. 2004, 117 (6): 721-733. 10.1016/j.cell.2004.05.023.
- Huynen MA, Snel B, Bork P: Inversions and the dynamics of eukaryotic gene order. Trends Genet. 2001, 17 (6): 304-306. 10.1016/S0168-9525(01)02302-2.
- Chang S, Aune TM: Histone hyperacetylated domains across the Ifng gene region in natural killer cells and T cells. Proc Natl Acad Sci USA. 2005, 102 (47): 17095-17100. 10.1073/pnas.0502129102.
- Liu G, Loraine A, Shigeta R, Cline M, Cheng J, Valmeekam V, Sun S, Kulp D, Siani-Rose M: NetAffx: Affymetrix probesets and annotations. Nucleic Acids Res. 2003, 31 (1): 82-86. 10.1093/nar/gkg121.
- Perez-Iratxeta C, Palidwor G, Porter CJ, Sanche NA, Huska MR, Suomela BP, Muro EM, Krzyzanowski PM, Hughes E, Campbell PA: Study of stem cell function using microarray experiments. FEBS Lett. 2005, 579 (8): 1795-1801. 10.1016/j.febslet.2005.02.020.