Distributed probing of chromatin structure in vivo reveals pervasive chromatin accessibility for expressed and non-expressed genes during tissue differentiation in C. elegans
© Sha et al; licensee BioMed Central Ltd. 2010
Received: 21 July 2010
Accepted: 6 August 2010
Published: 6 August 2010
Tissue differentiation is accompanied by genome-wide changes in the underlying chromatin structure and dynamics, or epigenome. By controlling when, where, and what regulatory factors have access to the underlying genomic DNA, the epigenome influences the cell's transcriptome and ultimately its function. Existing genomic methods for analyzing cell-type-specific changes in chromatin generally involve two elements: (i) a source for purified cells (or nuclei) of distinct types, and (ii) a specific treatment that partitions or degrades chromatin by activity or structural features. For many cell types of great interest, such assays are limited by our inability to isolate the relevant cell populations in an organism or complex tissue containing an intertwined mixture of other cells. This limitation has confined available knowledge of chromatin dynamics to a narrow range of biological systems (cell types that can be sorted/separated/dissected in large numbers and tissue culture models) or to amalgamations of diverse cell types (tissue chunks, whole organisms).
Transgene-driven expression of DNA/chromatin modifying enzymes provides one opportunity to query chromatin structures in expression-defined cell subsets. In this work we combine in vivo expression of a bacterial DNA adenine methyltransferase (DAM) with high throughput sequencing to sample tissue-specific chromatin accessibility on a genome-wide scale. We have applied the method (DALEC: Direct Asymmetric Ligation End Capture) towards mapping a cell-type-specific view of genome accessibility as a function of differentiated state. Taking advantage of C. elegans strains expressing the DAM enzyme in diverse tissues (body wall muscle, gut, and hypodermis), our efforts yield a genome-wide dataset measuring chromatin accessibility at each of 538,000 DAM target sites in the C. elegans (diploid) genome.
Validating the DALEC mapping results, we observe a strong association between observed coverage by nucleosomes and low DAM accessibility. Strikingly, we observed no extended regions of inaccessible chromatin for any of the tissues examined. These results are consistent with "local choreography" models in which differential gene expression is driven by intricate local rearrangements of chromatin structure rather than gross impenetrability of large chromosomal regions.
Recent advances in sequencing technology have allowed experimentalists a global view of the relationship between chromatin structure and genomic activity during development. By combining chromatin immunoprecipitation (ChIP) with high throughput sequencing or DNA microarrays (ChIP-Seq or ChIP-chip), it is possible to query the genomic localizations of specific transcription factors, histone modifications, and chromatin remodelling factors. Chromatin state maps from ChIP-Seq and ChIP-chip experiments along with data from genome-wide nuclease accessibility studies can be used to define molecular landscapes (including transcription start sites, regions of active transcription, enhancers, euchromatin, heterochromatin, etc.) on a genome-wide scale. Implicit in this analysis is the assumption that a cell's chromatin signature, or its epigenome, will be highly diagnostic of function.
One metric of chromatin structure is accessibility of the DNA in bulk chromatin. DNA-modifying enzymes such as nucleases [1, 2] and methyltransferases [3–5] have proven to be useful tools for defining susceptible regions. Accessible DNA may in some cases define regions of "open chromatin" that allow access to DNA binding factors such as transcription factors. In contrast, less accessible DNA may define regions of relatively compact chromatin and is often characterized by transcriptional inactivity.
Cleavage of chromatin by micrococcal nuclease (MNase) has been a standard method for examining nucleosome positioning and regional accessibility, both in vitro and in situ [6, 7]. Despite the substantial information that can come from MNase studies, this enzyme is known to have specific sequence and structural preferences [8–10], generating a well-recognized need for additional reagents and methods to independently survey genome accessibility. Several alternative nucleases and other approaches have been used for localized studies of accessibility [2, 11–15]; each has its own potential advantages and potential biases.
One limitation of available methods for genome-wide analysis of chromatin structures in specific cell types has been the need to isolate the individual cell type of interest in considerable bulk. Epigenome characterization methods using nucleases and other destructive probes have thus been applied only in the narrow range of biological systems where individual cell types can be isolated (or in whole organisms or mixed-cell-type tissues, where the results represent an amalgamation of the numerous constituent cell types). For many of the most interesting biological questions, the cell groups of interest are surrounded by (and embedded in) other very different cell types, making uniform-cell-type preparations impossible on the scale currently needed for genome-wide analysis. As one means of addressing this challenge, transgene-driven expression can be used to produce a specific probe in a defined cell type. This type of approach requires a probe that can detectably modify chromatin or DNA without killing or substantially disturbing the relevant cells.
DNA adenine methyltransferase (DAM) catalyzes the addition of a methyl group to the adenine base in the sequence GATC. DAM activities are used by a number of bacterial phages and prokaryotes. In E. coli, DAM is involved in a multitude of cellular processes, including DNA replication, mismatch repair, control of gene expression, and restriction-modification immunity [17, 18].
DAM from E. coli has been used to analyze chromatin structure in eukaryotic cells [19–21] and there is good localized agreement between chromatin structure inferred using DAM and other techniques, such as nuclease hypersensitivity mapping [4, 22]. Importantly, sensitivity to DAM methylation correlates well with transcriptional activity [23–25]. These results support the use of susceptibility to DAM methylation as a measure of chromatin structure. Key advantages of using DAM to probe eukaryotic chromatin structure are (i) the ability to assay accessibility in a living cell, (ii) the ability to assay designated sets of cells or tissues in complex biological samples using transgenes with regulated promoters, and (iii) the lack of any known background of DAM activity in eukaryotes. The addition of 6-methyl adenosines at sites in eukaryotic genome is apparently well tolerated in a variety of tissues, as functional expression of DAM methyltransferase in yeast, Drosophila, and mammalian cells has not resulted in any apparent phenotypes [5, 19, 22, 26, 27].
Previous studies of chromatin structure using DAM were limited to the investigation of only a few loci due to the low throughput nature of Southern blotting experiments. In this article, we describe a method that couples DAM methylation to high throughput sequencing. We termed our method DALEC and applied it to investigate in vivo chromatin structure in three transgenic C. elegans strains, each expressing DAM from a tissue-specific promoter. We provide evidence that genic regions remain in an accessible state that can be probed with DAM activity even when not expressed, with features of chromatin structure inferred from DAM accessibility concordant with nucleosome positioning and expression data derived from independent sources.
Engineering E. coli dam methyltransferase for expression in C. elegans
We first adapted the E. coli dam gene coding region for expression in C. elegans. We have previously found that introns incorporated into the coding regions can improve expression of transgenes in C. elegans. The dam gene was cloned by PCR from E. coli strain OP50. Two introns were incorporated at blunt cutting restriction sites (Additional File 1, Figure S1). Two myo-3 promoter constructs driving dam were produced. In pPD177.01, DAM was designed to express as a fusion to GFP, while in pPD176.59, a nuclear localization signal (NLS) from SV40 was included. Each construct was then incorporated into transgenic strains using pha-1(+) as a selectable genetic marker in a pha-1(e2123ts) genetic background. The two resulting transgenic lines were designated PD3994 (harboring pPD176.59) and PD5122 (harboring pPD177.01).
DAM methyltransferase is active in C. elegans
The difference between DAM-exposed and non-exposed DNA was clear in the corresponding Southern blots. Distinct Dpn I cleavage bands that were present in the PD3994 (Figure 2b) and PD5122 (Figure 2f) lanes were absent from the N2 lanes. An important feature of these gels is the appearance of bands indicative of incomplete digestion of individual fragments by Dpn I. Such bands were reproducibly observed in digestions of DNA from DAM-expressing lines (Figure 2b,f and data not shown), consistent with DAM modification of a subset (and not all) of the GATC sites in any given chromosomal molecule.
Restriction by Mbo I likewise revealed differences between DAM-exposed and wildtype DNAs. On bulk ethidium staining, genomic DNAs from PD3994 (Figure 2c, arrow) and PD5122 (Figure 2g, arrow) were left uncut by Mbo I compared to N2 DNA. The presence of considerable levels of digested DNA (smears in the agarose gels for PD3994 and PD5122) can be explained by the fact that the transgene arrays (ccEx3994 and ccEx5122) are driven by a muscle promoter and expressed only in muscle cells; non-muscle tissues (comprising about 90% of the body mass) would have been substrates for Mbo I restriction. In the Southern blots for Mbo I digests (Figure 2d,h), wildtype DNA was essentially completely digested while DNA from transgenic animals was only partially digested. The unrestricted and partially restricted DNA (Figure 2d,h, arrows) would be expected to represent methylated DNA from muscle tissue. The ability of Sau3A I to cut GATC irrespective of DAM methylation provided a control for DNA quality and restrictability. As expected, identical Sau3A I patterns were observed in wildtype and DAM-expressing strains (compare Sau3A I lanes in Figs. 2c versus 2d, and 2i versus 2j). In combination, the Southern blot analysis demonstrates the ability of expressed DAM to modify the C. elegans genome in an extensive but limited manner.
We next carried out a pilot mapping of DAM methylation by conventional cloning and sequencing of Dpn I products. Each cloned fragment would have been expected to carry full methylation at both ends, with no methylation or hemi-methylation of intervening GATC sites which were not cleaved (Additional File 2, Figure S2). As a control, the cloning protocol was performed in parallel on wild type C. elegans DNA. Only small numbers of clones were recovered in this case, all derived from bacterial sequences or lacking the characteristic Dpn I-cleaved ends (data not shown). By contrast, large numbers of clones could be obtained from PD3994 and PD5122: 335 non-redundant Dpn I fragments were characterized from these (168 from PD3994 and 167 from PD5122), of which 314 had termini derived from Dpn I cleavage at methylated GATC sites (Additional File 2, Table S1). These sequences were distributed throughout the genome (Additional File 2, Figure S3) and spanned exons, introns, exon-intron junctions, and non-annotated (intergenic) regions (Additional File 2, Tables S2 and S3). In addition to the genomic distribution, it was of interest to observe fragments from muscle-expressed and non-muscle expressed genes (Additional File 2, Tables S4 and S5).
Profiling genome-wide chromatin accessibility using Direct Asymmetric Ligation End Capture (DALEC)
We constructed and sequenced libraries from animals expressing myo-3::dam (PD3994, muscle), rol-6::dam-GFP (PD3995, hypodermis) and vit-2::dam-GFP (PD3997, gut), with a control library from N2 DNA treated in vitro with DAM methyltransferase. Animals were staged to maximize the mass of the tissue expressing the DAM protein (L1 for muscle; L4 and young adults for hypodermis; adults for gut). A combined total of 28.1 million raw reads were obtained from two separate sequencing runs. Linker sequences could be successfully parsed out from 25,319,003 reads. We considered a parsing event successful if the resulting product had the structure 5'-N16-19GA-3'. Parsed readouts were in turn converted to a 16 nucleotide format containing only the 16 nucleotides immediately upstream of the 3' GA. Parsed tags were then mapped to a database containing all filtered DAM tags in the C. elegans WS170 genome. A "hit" was considered only if there was a 16 nucleotide perfect match between the Solexa readout and the in silico-generated tag. Since each 16 nt tag in the database actually represented a molecule of the structure 5'-(N)16GATC-3', our criterion in actuality required a 20 nt perfect match. In total, 11,229,470 reads could be aligned to the genome. After exclusion of repetitive and "proximal" tags (see Additional File 2, Figure S4 for definition and filtering of "proximal" tags), we obtained 9,651,128 tags that aligned uniquely to the genome, representing an average of 9-fold coverage (per GATC site in the diploid genome). These were used for all analyses described below.
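The parsing and matching rule described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function names (read_to_tag, count_hits) and the toy sequences are hypothetical, and the sketch assumes linker sequences have already been removed from each read.

```python
import re

# Read structure reported in the text: 16-19 arbitrary bases followed by a 3' GA.
READ_PATTERN = re.compile(r"^[ACGT]{16,19}GA$")


def read_to_tag(parsed_read):
    """Return the 16 nt immediately upstream of the 3' GA,
    or None if the read lacks the 5'-N16-19GA-3' structure."""
    read = parsed_read.upper()
    if not READ_PATTERN.match(read):
        return None
    return read[-18:-2]


def count_hits(parsed_reads, tag_database):
    """Count perfect 16 nt matches against a set of in silico tags."""
    counts = {}
    for read in parsed_reads:
        tag = read_to_tag(read)
        if tag is not None and tag in tag_database:
            counts[tag] = counts.get(tag, 0) + 1
    return counts


# Toy data: made-up sequences, not real C. elegans tags.
toy_db = {"ACGTACGTACGTACGT", "TTTTCCCCAAAAGGGG"}
toy_reads = [
    "ACGTACGTACGTACGT" + "GA",         # N16 + GA, tag is in the database
    "CC" + "ACGTACGTACGTACGT" + "GA",  # N18 + GA, same 16 nt tag
    "ACGTACGTACGTACGT" + "GG",         # no 3' GA, rejected at parsing
    "G" * 16 + "GA",                   # valid structure, tag not in database
]
hits = count_hits(toy_reads, toy_db)
print(hits)
```

Because every database tag is by construction followed by GATC in the genome, and every accepted read ends in GA, an exact 16 nt lookup of this kind is a compact way to enforce the longer match the text describes.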
Relationship between expression and DAM accessibility
A periodic DAM accessibility profile that correlates with nucleosome positioning at promoters
To characterize chromatin structure and accessibility on a subgenic level, we compared the DAM methylation profile with previously published nucleosome position datasets for C. elegans. Of particular interest are datasets of total nucleosome positioning and those in which a nucleosomal population was enriched for active chromatin by immunoprecipitation using antibodies to modified histones, in particular methylation on lysine 4 of histone H3. The H3K4me2/3 nucleosome occurs at the 5' end of actively expressed genes and is highly constrained and phased in its positioning, with bulk nucleosomes showing a much lower degree of reproducibility in positioning. A prominent "peak" H3K4me2/3 nucleosome can be readily detected at the 5' end of 3,904 genes, the majority of which are house-keeping genes (e.g. ribosomal proteins). To represent the relationship between nucleosome position and DAM accessibility, we calculated the number of DAM tags at each position relative to the peak dyad, normalized to the number of DAM sites at the same position. A signal in this analysis depends on significant local positioning of nucleosomes, and not surprisingly we obtained little signal with the less-positioned bulk nucleosomes (data not shown).
Regions that are upstream of the H3K4me2/3 peak nucleosomes have been found to display a low level of observed nucleosome coverage. Correspondingly, the overall DAM methylation level is higher in regions upstream than in regions downstream of the H3K4me2/3 peak nucleosome.
Nucleosome coverage is lowest at ≈150 bp upstream of the dyad of the H3K4me2/3 peak nucleosome ("NDR" in Figure 6). If DNA in this region tends to be in an unprotected state, one would expect to observe a high level of DAM accessibility. Instead, there is a valley at this position for DAM methylation profiles from all three tissues. This result suggests that DNA in this region is protected by additional factors.
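The normalization described above (tags at each dyad-relative position, divided by the number of DAM sites at that position) can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the function name, the input layout, the 500 bp window, and the strand handling are all assumptions, and the toy coordinates and counts are invented.

```python
from collections import defaultdict


def aggregate_profile(dyads, site_tag_counts, window=500):
    """dyads: list of (chrom, dyad_position, gene_strand).
    site_tag_counts: {(chrom, position): tag_count} for every usable
    GATC site, including sites with a count of 0.
    Returns {offset: mean tags per DAM site at that offset}."""
    sites_by_chrom = defaultdict(list)
    for (chrom, pos), n in site_tag_counts.items():
        sites_by_chrom[chrom].append((pos, n))

    tag_sum = defaultdict(int)
    site_sum = defaultdict(int)
    # Naive scan over all sites per dyad; fine for a sketch, slow genome-wide.
    for chrom, dyad, strand in dyads:
        for pos, n in sites_by_chrom[chrom]:
            offset = pos - dyad
            if strand == "-":
                offset = -offset  # keep "upstream" negative on both strands
            if -window <= offset <= window:
                tag_sum[offset] += n
                site_sum[offset] += 1
    return {off: tag_sum[off] / site_sum[off] for off in sorted(site_sum)}


# Toy data: two dyads and four GATC sites with invented tag counts.
toy_dyads = [("I", 1000, "+"), ("I", 5000, "-")]
toy_counts = {("I", 920): 6, ("I", 1000): 1, ("I", 5080): 4, ("I", 5000): 3}
profile = aggregate_profile(toy_dyads, toy_counts)
print(profile)
```

Dividing by the site count at each offset keeps positions that happen to contain many GATC sites from dominating the profile, which is the purpose of the normalization mentioned in the text.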
In this article, we presented our study of chromatin structure in differentiated C. elegans tissues by measuring DNA accessibility in living animals. To examine a series of in vivo accessibility profiles, we expressed dam driven by three tissue-specific promoters (myo-3, rol-6, vit-2) and analyzed the methylation profile of synchronized animal populations. Such an analysis would be expected to identify both dramatic and subtle differences in in vivo accessibility. Somewhat surprisingly given a number of models for developmental regulation that involve higher order inaccessibility of large regions of chromatin, we observed no genes or regions which were completely inaccessible to DAM modification in vivo. In particular, no region or gene showed a deviation in accessibility that was greater than from the genome-wide average. Analysis of quantitative differences in DAM accessibility does reveal correlations, particularly showing a relationship between accessibility and expression levels: with increasing SAGE representation numbers, we observe a corresponding increase in accessibility over many genes and in all samples and tissues analyzed. Because SAGE measures expression by capturing the pool of mature mRNAs, our results suggest that DAM accessibility at least partially reflects average transcriptional activity.
Although the data argue against "open/shut" accessibility versus inaccessibility of non-expressed chromosomal domains during development, there is certainly evidence for stable structures that protect specific sequences for extended time periods. These are evidenced by the capture of Dpn I fragments with internal (non-methylated) GATC sites (Additional File 2, Figure S2 and Table S1). These internal sites would have been protected from DAM methylation for an extended period while DAM was expressed in the relevant cells.
We observed a very strong correlation between DAM accessibility and nucleosome positioning, in agreement with previous work. As shown in Figure 6, DAM accessibility peaks at inter-nucleosomal regions, indicative of the accessibility of linker DNA. As the nucleosomes become less uniform in position (beyond 500 bp upstream and downstream of the TSS), the periodicity in DAM methylation profile is decreased.
Our results provide support for DAM-DALEC and nucleosome ChIP-Seq [36, 38] as complementary technologies in establishing and validating detailed chromatin maps. Certainly ChIP-Seq provides the highest resolution maps for chromatin; at the same time this technique is limited by the need for extensive processing of samples after chromatin extraction. DAM-DALEC provides an in vivo picture of chromatin structure that is unaffected by concerns of specificity and rearrangement on extraction but is lower resolution in terms of the numbers of sites analyzed. Certainly the combination of the two methods will be of value in defining both static architecture and developmental shifts.
We have developed an assay with the capacity to infer chromatin structure on a genome level in living organisms. We have shown its concordance to gene expression and positioned nucleosome data obtained from independent sources. Thus, DAM-DALEC can provide independent high-confidence in vivo data, which even for a fraction of the genome can be used to refine, validate, or evaluate less sparse but potentially more 'invasive' nuclease-based assays. DALEC could readily be adapted to any context in which expression of foreign coding regions (dam methyltransferase) can be engineered. Because eukaryotes do not possess an adenine methylation system, the enzyme would have a possibility of being "neutral" to the host cell and not subject to regulation. Thus, DAM-DALEC could offer an advantage of capturing snapshots of chromatin structure in living animals at defined developmental stages and can be a powerful tool that complements existing genomics methods for investigating chromatin structure and dynamics on a whole genome level.
C. elegans strains and growth conditions
Animals were reared on E. coli grown on NGM (nematode growth medium) nutrient plates. Bacterial strains used in this work as food for C. elegans are noted for each experiment.
OP50: A uracil-auxotrophic E. coli strain with wild-type dam and dcm methylation systems. This strain has been a standard laboratory food source for C. elegans.
SCS110: An E. coli strain defective in both dcm and dam methylation: rpsL (Strr) thr leu endA thi-1 lacY galK galT ara tonA tsx dam dcm supE44 Δ(lac-proAB) [F' traD36 proAB lacIqZΔM15]. This strain provides a suitable food source for C. elegans while avoiding the presence of bacterial sequences with adenine methylation in eventual sequencing libraries.
SCS110(AmpR): SCS110 made ampicillin resistant by transformation with pUC18. This strain adds the ability to "switch" bacterial food sources in a culture by cultivation of previously OP50-fed populations with SCS110(AmpR)+ and ampicillin.
All worm strains were reared at 23°C unless otherwise stated. C. elegans strains used in the experiments were as follows:
N2: wildtype strain of C. elegans (Bristol isolate)
PD5122 [pha-1(e2123ts) III; ccEx5122]: transgenic line expressing E. coli dam-GFP translational fusion and genomic C. elegans pha-1(+) gene from the extra-chromosomal array ccEx5122. Line PD5122 was established by microinjection of a mixture of plasmids pPD177.01 (Lig6682) and pC1 into pha-1(e2123ts) animals. pPD177.01 contains the myo-3 (body wall muscle) promoter driving E. coli dam fused to GFP. Two introns with C. elegans consensus sequences have been inserted into the dam gene to optimize expression in nematodes [28, 40]. A detailed description of pPD177.01 structure is shown in Additional File 1, Figure S1. pC1 carries the wildtype pha-1 coding region; non-transformed pha-1(e2123ts) animals are inviable at 23°C, while transformed animals carrying pC1 are viable, providing a strong selection.
PD3994 [pha-1(e2123ts) III; ccEx3994]: transgenic line expressing E. coli dam and genomic C. elegans pha-1(+) gene from the extra-chromosomal array ccEx3994. Line PD3994 was established by microinjection of plasmids pPD176.59 (Lig6649) and pC1 (FD142), which contains the C. elegans genomic pha-1 gene and is a selection marker for ccEx3994. pPD176.59 contains the E. coli dam gene driven by the myo-3 promoter and a single SV40 nuclear localization signal.
PD3995 [pha-1(e2123ts) III; ccEx3995]: transgenic line expressing E. coli dam and genomic C. elegans pha-1(+) gene from the extra-chromosomal array ccEx3995. Line PD3995 was generated by microinjection of a mixture of plasmids L7710, pRF4 (carrying the C. elegans rol-6[su1006]), and pC1 into pha-1(e2123ts) animals. Plasmid L7710 contains the rol-6 promoter driving the expression of a GFP-DAM translational fusion (rol-6::gfp-dam-unc-54 3' UTR). Attached to the 3' end of L7710 is the unc-54 3' UTR.
PD3997 [pha-1(e2123ts) III; ccEx3997]: transgenic line expressing E. coli dam and genomic C. elegans pha-1(+) gene from the extra-chromosomal array ccEx3997. Line PD3997 was established by microinjection of a mixture of plasmids consisting of pC1, pRF4, and L7715 (vit-2::gfp-dam-unc-54 3' UTR).
Southern hybridizations were performed according to standard protocols. Briefly, RNase A-treated genomic DNA was subjected to a one hour restriction digest by Dpn I, Mbo I, or Sau3A I, followed by phenol:chloroform extraction and ethanol precipitation. Restricted fragments were resolved on 1.4% agarose gels followed by transfer to Hybond-N+ membranes (Amersham Biosciences, Cat #RPN303B) as recommended for capillary blotting under alkaline conditions. An 808 bp radiolabeled probe containing the C. elegans 5S rDNA/SL1 (Spliced Leader sequence 1) was synthesized from a Bam HI fragment of plasmid pPD98.38 using the RadPrime DNA Labeling System (Invitrogen, Cat #18428-11) with labeled α-32P dATP (MP Biomedicals, Cat #33002HD.5). Pre-hybridization and hybridization were in roller bottles using phosphate-SDS buffer (0.5 M phosphate buffer pH 7.2, 1 mM EDTA pH 8.0, 7% (w/v) SDS, 1% (w/v) BSA).
DNA extraction from synchronized animal populations
To generate synchronized worm populations, embryos were collected by treating gravid animals with a solution containing 1 M NaOH in 10% bleach for approximately 5-7 minutes or until adult cuticles were completely disintegrated. Eggs were washed several times with M9 medium (22 mM KH2PO4, 42 mM Na2HPO4, 86 mM NaCl, 1 mM MgSO4) and distributed onto NGM plates containing a thin layer of SCS110 E. coli seeded the previous night. Synchronized populations were collected at the stage where the desired tissue mass would be greatest per animal (L1/L2 larvae for the myo-3 promoter in line PD3994; L4 larvae for the rol-6 promoter in line PD3995; and young/gravid adults for the vit-2 promoter in line PD3997). At no time were synchronized animals starved.
Synchronized animals were washed off NGM plates with chilled M9 medium, layered on a 5% sucrose solution, and pelleted by centrifugation at low speed. Pelleted animals were washed several times with chilled M9 and frozen as ≈50 μl pellets at -80°C. It is important to note that throughout the harvesting procedure, animals were kept alive until the moment of freezing.
Genomic DNA was extracted using the following procedure. To each thawed ≈50 μl pellet was added 450 μl worm lysis buffer (0.1 M Tris pH 8.5, 0.1 M NaCl, 50 mM EDTA, 1% SDS), 1 μl 10 mg/ml glycogen, and 20 μl 20 mg/ml proteinase K in TE pH 7.4. The mixture was incubated at 62°C for 45 minutes, with intermittent vortexing. The mixture was extracted with 500 μl phenol, followed by 500 μl phenol:chloroform (1:1), and 500 μl chloroform. DNA was precipitated with 20 μl saturated ammonium acetate and 1 ml 100% ethanol, washed once with 500 μl ethanol, and resuspended in 50 μl TE pH 7.4. Each 50 μl sample was treated with 1 μl 10 mg/ml RNase A for one hour at 37°C. The reaction was terminated with 1× STOP buffer (1 M NH4Ac, 10 mM EDTA, 0.2% SDS) followed by phenol:chloroform/chloroform extraction and 100% ethanol precipitation. The final product was resuspended in 40-50 μl TE and used for DALEC library preparation.
in vitro methylation of N2 genomic DNA
N2 genomic DNA was methylated using the following 200 μl reaction mixture: 30.0 μl (≈20-30 μg) N2 genomic DNA, 0.5 μl 32 mM S-adenosyl methionine, 1.0 μl E. coli DAM (8 U/μl, NEB M0222S), 20.0 μl 10× DAM buffer, 148.5 μl dH2O. Following one hour incubation at 37°C, the reaction was terminated with 1× STOP buffer. To the terminated reaction mixture was added 1 μl 10 mg/ml glycogen followed by 500 μl phenol:chloroform extraction, 500 μl chloroform extraction, 100% ethanol precipitation, 0.5 ml 100% ethanol wash.
Dpn I digestion
Dpn I digestion was carried out in a 200 μl volume consisting of the following mix: 30 μl (≈20-30 μg) genomic DNA, 20 μl 10× buffer (NEB4), 10 μl Dpn I (20 U/μl, NEB #R0176), and 140 μl dH2O. The reaction mix was incubated for 1.5 hours at 37°C and terminated with 350 μl 1× STOP buffer. 1.0 μl 10 mg/ml glycogen was added to the mix followed by 500 μl phenol:chloroform extraction, 500 μl chloroform extraction, precipitation with 100% ethanol, wash with 0.5 ml 100% ethanol, and resuspension in 10 μl TE. (NOTE: Unless otherwise indicated, all enzymatic reactions described below used the same termination, extraction, and precipitation steps).
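As a quick consistency check on the reaction mixes listed above, the component volumes can be summed and final concentrations derived with C1·V1 = C2·V2. The sketch below uses only volumes and stock concentrations stated in the text; the dictionary layout and helper names are illustrative assumptions.

```python
# Component volumes in microlitres, copied from the two 200 ul mixes above.
dam_mix = {
    "N2 genomic DNA": 30.0,
    "32 mM S-adenosyl methionine": 0.5,
    "E. coli DAM (8 U/ul)": 1.0,
    "10x DAM buffer": 20.0,
    "dH2O": 148.5,
}
dpn_mix = {
    "genomic DNA": 30.0,
    "10x buffer (NEB4)": 20.0,
    "Dpn I (20 U/ul)": 10.0,
    "dH2O": 140.0,
}


def total_volume(mix):
    """Sum of all component volumes in microlitres."""
    return sum(mix.values())


def final_concentration(stock_conc, stock_volume, total):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume / total


# S-adenosyl methionine in the methylation reaction, in mM.
sam_final_mM = final_concentration(32.0, 0.5, total_volume(dam_mix))
# Dpn I in the digestion reaction, in units per microlitre.
dpn_final_U_per_ul = final_concentration(20.0, 10.0, total_volume(dpn_mix))

print(total_volume(dam_mix), total_volume(dpn_mix))
print(sam_final_mM, dpn_final_U_per_ul)
```

Both mixes sum to the stated 200 μl; S-adenosyl methionine works out to 0.08 mM (80 μM) in the methylation reaction and Dpn I to 1 U/μl in the digest.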
Ligation to Linker A
Linker A was purchased as two separate oligonucleotides (5'OH-CAAGCAGAAGACGGCATACGATCCTGAGTACACTATGTTCCGAC-OH3', 5'P-GTCGGAACATAGTGTAGCA-OH3') and hybridized by boiling in a flask of water for five minutes and allowing the water to cool to room temperature. Ligation to Linker A was carried out in a 50 μl reaction using the following mix: 10 μl Dpn I product, 1.5 μl of 0.05 mM Linker A, 11.0 μl dH2O, 25.0 μl 2× Quick Ligase Buffer, 2.5 μl Quick Ligase (NEB #M2200). The reaction was incubated for five minutes at room temperature followed by termination, extraction, precipitation, and resuspension of ligated products in 10 μl TE.
To increase the number of ligated molecules, we added a second ligation step using the following mix: 10 μl Quick Ligase product, 7 μl dH2O, 2 μl 10× ligase buffer, 1 μl T4 DNA ligase (2,000 U/μl; NEB #M0202). The reaction was allowed to proceed for 30 minutes at room temperature followed by termination, extraction, precipitation, and resuspension in 20 μl TE.
Mme I digestion
Linker A-ligated molecules were subjected to Mme I digestion using the following 200 μl reaction mix: 20 μl Linker A-ligated product, 20.0 μl 10× NEB 4, 0.3 μl 32 mM S-adenosyl methionine, 2.0 μl Mme I (2 U/μl; NEB R0637), 157.7 μl dH2O. The reaction was allowed to proceed for 1 hour at 37°C followed by termination, extraction, precipitation, and resuspension in 10 μl TE.
Ligation to Linker B
Linker B was purchased as two separate oligonucleotides (5'P-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTCGGTGGTCGCCGTATCATT-OH3', 5'OH-TCATCTTTCCCTACACGACGCTCTTCCGATCTNN-OH3') and hybridized using the same procedure as described for Linker A. Mme I products were ligated to Linker B using the following 50 μl reaction mix: 10.0 μl Mme I product, 1.0 μl of 0.05 mM Linker B, 5.0 μl 10× ligase buffer, 3.0 μl T4 DNA ligase (2,000 U/μl; NEB #M0202), 31.0 μl dH2O. Ligations were performed overnight using a PCR machine. Reactions were initiated at 8°C and stepped up to 16°C, with each degree increase in temperature held for two hours. Ligated products were extracted, precipitated, and resuspended in 10 μl TE.
Linker A and Linker B ligated molecules were size fractionated on a 6% polyacrylamide:formamide denaturing gel (15% v/v 19:1 acrylamide:bis (40%), 1× TBE, 25% (v/v) 100% formamide, 42% w/v urea). Electrophoresis was performed in 0.5× TBE at 700 V for approximately 2.5-3 hours. The 116 nt single-stranded DNA product was cut out from the denaturing gel, using single-stranded oligonucleotides of sizes 95, 105, 114, 116, and 125 as size guides. Products were passively eluted from excised bands overnight in 0.3 M NaCl at 4°C, precipitated in 100% ethanol, and resuspended in 20 μl TE.
PCR reactions were performed in a 50 μl reaction consisting of the following mix: 5 μl PAGE-purified template, 1.0 μl each of Solexa bridge amplification primers (1 μg/μl), 5 μl dNTP mix (2 mM each), 5 μl 10× NEB ThermoPol PCR buffer, 1 μl Taq polymerase (5 U/μl; NEB M0267), 32 μl dH2O. Reaction cycles were titrated to determine the linear range, typically 11-17 cycles of 45 s at 94°C, 30 s at 55°C, and 30 s at 72°C, with an initial denaturation step of 60 s at 94°C and a final extension step of 60 s at 72°C. PCR products were separated on 3% low melting point agarose gel (NuSieve Cat #50084). For a gel of approximately 10 inches long, electrophoresis at 103 V for approximately six hours gave superior resolution. The desired 116 bp dsDNA product was excised and recovered from the gel using the following steps. To each cut band was added 400 μl 1× STOP buffer, 1 μl glycogen and incubated in a 68°C water bath until the agarose was completely melted. To each tube of melted agarose was added 350 μl of 68°C phenol, quickly vortexed, spun 5-7 minutes, and the aqueous phase extracted (typically, a second 1-2 minute spin was required to completely remove residual agarose). Following extraction with 250 μl 1:1 phenol:chloroform and 250 μl chloroform, DNA was precipitated in 1 ml 100% ethanol, washed with 0.5 ml 100% ethanol, and resuspended in 15 μl TE for each 5 μl PCR template used.
Sanger sequencing was performed by Elim Biopharmaceuticals Inc. (Hayward, CA). High throughput sequencing of captured DAM tags was performed on the Solexa Genome Analyzer I.
in silico identification of DAM tags
We generated a database of all potential DAM tags with the structure 5'-(N)16GATC-3' from C. elegans genome version WS170. There are 269,049 DAM (GATC) sites per haploid genome in C. elegans. Because DALEC captures two tags (in principle) per GATC site, there are a total of 538,098 potential tags (or half sites) per haploid genome. To reduce computation time during alignment of Solexa reads to the genome, each tag was represented by a 16 nucleotide sequence that did not include the 3' GATC.
We excluded DAM tags that occurred more than once in the genome or that mapped to vector or ribosomal sequences. We also excluded tags belonging to two adjacent GATC sites that lie within 20 bp of each other. When two fully methylated adjacent GATC sites map within 20 bp of each other, one site will always be captured at the expense of the other, resulting in an undercount of DAM accessibility in such regions. When the distance is slightly above 20 bp, it is conceivable that an inherent sequence preference of Mme I leads to preferential capture of one site over the other, again resulting in an undercount. To prevent both situations from skewing our analysis, we excluded such "proximal tags" using the criteria described in Additional File 2, Figure S4. After filtering out proximal, repetitive, and vector/ribosome-derived sequences, we were left with 370,152 in silico tags (per haploid genome) that we could use to align Solexa reads to the genome.
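The filtering logic can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the exact proximity criteria are those of Additional File 2, Figure S4 and are not reproduced here, so a single distance threshold (`min_gap`, measured between GATC start coordinates) stands in for them, and the removal of vector and ribosomal sequences is omitted. The tuple layout and the function name `filter_tags` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import List, Tuple

# (chromosome, GATC site position, strand, 16 nt tag); illustrative layout.
Tag = Tuple[str, int, str, str]


def filter_tags(tags: List[Tag], min_gap: int = 20) -> List[Tag]:
    """Drop tags that are non-unique or that belong to closely spaced sites.

    min_gap is an assumed stand-in for the criteria of Figure S4: two
    neighboring sites on the same chromosome whose start coordinates differ
    by min_gap bp or less are both treated as proximal.
    """
    # Count how often each 16 nt tag sequence occurs across all input tags.
    counts = Counter(seq for _, _, _, seq in tags)

    # Collect distinct site positions per chromosome.
    sites_by_chrom = defaultdict(set)
    for chrom, site, _, _ in tags:
        sites_by_chrom[chrom].add(site)

    # Mark both members of every closely spaced neighboring pair.
    proximal = set()
    for chrom, sites in sites_by_chrom.items():
        ordered = sorted(sites)
        for left, right in zip(ordered, ordered[1:]):
            if right - left <= min_gap:
                proximal.add((chrom, left))
                proximal.add((chrom, right))

    return [
        t for t in tags
        if counts[t[3]] == 1 and (t[0], t[1]) not in proximal
    ]


if __name__ == "__main__":
    toy = [
        ("I", 100, "+", "A" * 16),
        ("I", 110, "+", "ACGT" * 4),  # 10 bp from the site at 100
        ("I", 500, "+", "G" * 16),
    ]
    print(filter_tags(toy))
```

Both sites of a proximal pair are discarded in this sketch, so that neither contributes a biased count.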
SAGE data were obtained from the Genome BC C. elegans Gene Expression Consortium at http://elegans.bcgsc.bc.ca/. We downloaded the March 2006 C. elegans SAGE database using the following relatively standard parameters: Quality filter: 0.99, Hide ambiguous tags: ON, Tag mapping resources: CODING, Show only mapped tags: ON, Tags/page: 10, Lowest count cutoff: 1, Hide antisense tags: ON, Remove duplicate ditags: ON, Highest count cutoff: NONE, Sort order: DOWN, Resolve lowest match: ON. We used only long SAGE tags (17 nucleotides long) in our analysis. To determine a total SAGE score for each gene, we collapsed redundant annotations for each gene to a single copy and summed the SAGE scores of the remaining annotations. After applying our filtering criteria, we were left with 13,916 unique genes in our SAGE data set.
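The per-gene scoring step can be sketched as below. The input layout (gene, tag sequence, count) and the function name `total_sage_scores` are assumptions made for illustration, and "redundant annotations" is read here as repeated (gene, tag) rows. The gene names and counts in the example are invented toy values, not data from the SAGE database.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def total_sage_scores(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, int]:
    """Sum SAGE tag counts per gene, counting each (gene, tag) pair once.

    rows: (gene, tag_sequence, count) records; this layout is hypothetical.
    """
    seen = set()
    totals = defaultdict(int)
    for gene, tag, count in rows:
        key = (gene, tag)
        if key in seen:
            # Redundant annotation of a tag already counted for this gene.
            continue
        seen.add(key)
        totals[gene] += count
    return dict(totals)


if __name__ == "__main__":
    toy_rows = [
        ("geneA", "CATGAAAAAAAAAAAAAAAAA", 5),
        ("geneA", "CATGAAAAAAAAAAAAAAAAA", 5),  # redundant copy
        ("geneA", "CATGCCCCCCCCCCCCCCCCC", 2),
        ("geneB", "CATGGGGGGGGGGGGGGGGGG", 7),
    ]
    print(total_sage_scores(toy_rows))
```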
All data sets, including raw and aligned Solexa reads, SAGE data sets, in silico generated DAM tags, and gene sets used in our analyses, have been deposited in GEO under accession number GSE23042.
DALEC: Direct Asymmetric Ligation End Capture
DAM: DNA Adenine Methyltransferase
GFP: Green Fluorescent Protein
H3K4me2/3: di- or tri-methylation of lysine 4 on histone H3
NDR: Nucleosome Depleted Region
NEB: New England Biolabs
SAGE: Serial Analysis of Gene Expression
We would like to thank the following people for their help, suggestions, and discussions: Rob Tibshirani and Daniela Witten (Stanford University Dept. of Statistics), Arend Sidow, Anton Valouev, Cheryl Smith, Phil Lacroute, and Ziming Weng (Stanford University Dept. of Pathology), Norma Neff and Rick Myers (Stanford University Dept. of Genetics), Bas van Steensel (Netherlands Cancer Institute), Laurie Boyer (MIT Dept. of Biology), and Tom Tullius (Boston University Dept. of Chemistry). This work was supported by grant R01-GM37706 (AZF).