Human nucleosomes: special role of CG dinucleotides and Alu-nucleosomes
© Bettecken et al; licensee BioMed Central Ltd. 2011
Received: 28 May 2010
Accepted: 31 May 2011
Published: 31 May 2011
The periodical occurrence of dinucleotides with a period of 10.4 bases is now undeniably a hallmark of nucleosome positioning. Whereas many eukaryotic genomes contain visible and even strong signals for periodic distribution of dinucleotides, the human genome is rather featureless in this respect. The exact sequence features in the human genome that govern nucleosome positioning remain largely unknown.
When analyzing the human genome sequence with the positional autocorrelation method, we found that only the dinucleotide CG shows the 10.4 base periodicity that is indicative of the presence of nucleosomes. There is a high occurrence of CG dinucleotides that are either 31 (≈ 10.4 × 3) or 62 (≈ 10.4 × 6) base pairs apart from one another, a sequence bias known to be characteristic of Alu sequences. In a similar analysis with repetitive sequences removed, peaks of repeating CG motifs can be seen at positions 10, 21 and 31, the nearest integers to multiples of 10.4.
Although the CG dinucleotides are dominant, other elements of the standard nucleosome positioning pattern are present in the human genome as well.
The positional autocorrelation analysis of the human genome demonstrates that the CG dinucleotide is, indeed, one visible element of the human nucleosome positioning pattern, which appears both in Alu sequences and in sequences without repeats. The dominant role that CG dinucleotides play in organizing human chromatin indicates the involvement of human nucleosomes in tuning the regulation of gene expression and chromatin structure, very likely through cytosine methylation/demethylation of the CG dinucleotides contained in the human nucleosomes. This is further supported by the positions of CG-periodical nucleosomes on Alu sequences. Alu repeats appear as monomers, dimers and trimers, harboring two to six nucleosomes in a run. Considering the exceptional role CG dinucleotides play in nucleosome positioning, we hypothesize that Alu-nucleosomes, especially those that form tightly positioned runs, could serve as "anchors" in organizing the chromatin in human cells.
The periodical distribution of various dinucleotides along eukaryotic DNA sequences with a period of 10-11 bases is commonly considered a manifestation of a nucleosome positioning signal present in the sequences [1–8]. The period, the more accurate value of which is 10.4 bases [9–12], corresponds to the helical repeat of DNA in the nucleosome. The positioning signal in human nucleosomes is rather weak and lacks the periodical AA and TT dinucleotides, while in yeast and nematodes the periodical nucleosome signals are dominated by AA and TT dinucleotides [5, 6]. However, RR and YY dinucleotides, GG and CC in particular, have been shown to contribute to the human nucleosome positioning signal [13, 14]. Whole-genome calculations for 13 diverse eukaryotes confirmed the exceptional lack of visible dinucleotide periodicities in the human genome, where only CG showed a signal. Nucleosomes on Alu sequences, which are known to contain strongly periodical CG dinucleotides, apparently represent a special class.
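As a quick check of the arithmetic behind the 10.4-base period, the following minimal sketch lists the integer distances nearest to its first few multiples. The function name `expected_peak_distances` and its parameters are illustrative only and are not part of the original analysis.

```python
def expected_peak_distances(period=10.4, multiples=6):
    """Return the integers nearest to k * period for k = 1..multiples."""
    return [round(k * period) for k in range(1, multiples + 1)]


if __name__ == "__main__":
    # Distances (in bases) at which a 10.4-base periodicity would be
    # expected to produce peaks in a distance histogram.
    print(expected_peak_distances())
```

With the default arguments this yields the distances 10, 21 and 31 for the first three multiples, and 62 for the sixth.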
The full human genome sequence (build hg18) was copied from the UCSC genome server http://www.genome.ucsc.edu. The sequence had been assembled by the International Human Genome Project sequencing centers (March 2006). For filtering out repeats, the sequence data available under the label "masked" (hg18, file ChromFaMasked.zip, genome.ucsc.edu) was used.
All programs used to calculate the DNA composition and to derive the distance diagrams (autocorrelations) are original Perl scripts or C++ programs. The autocorrelation was calculated as follows. For a dinucleotide MN at a given position, the distances to all other MN dinucleotides downstream were counted, up to the size of the window. This was applied to all occurrences of the dinucleotide in the sequence. Essentially, the procedure scores all distances and reveals those that are preferred. The routine was interrupted whenever filtered repeat sequences or the end of a chromosome occurred within the window size limit. To avoid end effects on the short-range distances in the positional correlation analysis, the last dinucleotides within one window size of the sequence ends were excluded.
For the mapping of Alu sequences, the human Alu-Sx subfamily consensus sequence was matched against the full human genome using BLAST (release 2.2.21, taken from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/) with standard (default) search parameters. When the starting positions of matches were less than 250 bases apart, only the upstream copies were selected.
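The distance-counting procedure described above can be sketched in a few lines of Python. This is not the authors' Perl or C++ code, only a minimal illustration under stated assumptions: masked repeats are taken to be runs of `N` (or any non-ACGT character), a starting dinucleotide is used only if its full window fits inside the unmasked segment, and the names `distance_counts`, `dinucleotide` and `window` are hypothetical.

```python
import re
from collections import Counter


def distance_counts(sequence, dinucleotide="CG", window=100):
    """Histogram of distances between same-dinucleotide occurrences.

    Assumption: masked repeats appear as non-ACGT characters (e.g. N),
    and distances are never measured across them.
    """
    counts = Counter()
    for match in re.finditer(r"[ACGT]+", sequence.upper()):
        segment = match.group()
        positions = [i for i in range(len(segment) - 1)
                     if segment[i:i + 2] == dinucleotide]
        # Last start position whose whole window lies inside the segment;
        # later starts are skipped to avoid the end effect.
        last_start = len(segment) - 2 - window
        for idx, start in enumerate(positions):
            if start > last_start:
                break
            for other in positions[idx + 1:]:
                distance = other - start
                if distance > window:
                    break
                counts[distance] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence with CG every 10 bases; real input would be a chromosome.
    toy = ("CG" + "A" * 8) * 20
    for distance, n in sorted(distance_counts(toy, window=30).items()):
        print(distance, n)
```

Splitting the sequence into unmasked segments first keeps the inner loop simple, at the cost of discarding start positions that lie within one window of a masked stretch, which is how the exclusion rule is interpreted here.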
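The selection rule for closely spaced matches can be illustrated as follows. This is a sketch under assumptions not stated in the text: match start positions are taken per chromosome as plain integers, "upstream" is read as the smaller coordinate regardless of strand, and each match is compared with the most recently kept one. The function name `keep_upstream_hits` and the parameter `min_gap` are hypothetical.

```python
def keep_upstream_hits(starts, min_gap=250):
    """Keep only the upstream copy among matches closer than min_gap bases.

    Assumption: a match is dropped when its start lies less than min_gap
    bases downstream of the last match that was kept.
    """
    kept = []
    for start in sorted(starts):
        if not kept or start - kept[-1] >= min_gap:
            kept.append(start)
    return kept


if __name__ == "__main__":
    # Hypothetical start coordinates of consensus matches on one chromosome.
    print(keep_upstream_hits([1000, 1100, 1249, 1250, 2000]))
```

Comparing against the last kept match, rather than the immediately preceding one, prevents a chain of overlapping matches from being removed entirely; whether the original programs made the same choice is not stated.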
Results and Discussion
CG-periodicities in the human whole-genome sequence
A possible chromatin organizing role of Alu sequences
A model chromatin built from weak nucleosomes would very likely be unstable, having a loose structure and allowing for nucleosome sliding to alternative positions. One possible arrangement to avoid such instabilities would be the introduction of a certain number of strong, uniquely positioned "anchoring" nucleosomes. These would serve as chromatin organizers, thus limiting the freedom of sliding of the other nucleosomes in between. Such a hypothetical arrangement has been previously described as the "parking lot model".
The dual role of CG dinucleotides
It was not until recently that evidence emerged on the role CG dinucleotides may have in nucleosome positioning. Their appearance in Alu sequences at distances of multiples of 10.4 bases (31 or 32 bases) was the first indication of their phasing function. Next, the analysis of the nucleosome DNA sequences of C. elegans showed that CG dinucleotides do have an unusually high positional preference within the 10-matrix of DNA bendability. Finally, a spectacular 10.4 base periodicity of CG in the genome of A. mellifera, the honey bee, was discovered. It turned out that the CG dinucleotide is, actually, among the strongest periodical elements (after AA and TT) in eukaryotic genomes.
The second obvious role of the CG dinucleotide is its potential to undergo C-methylation/-demethylation in many eukaryotic organisms. This modification is known to crucially impact gene expression and leads to epigenetic phenomena [26, 27]. It is also known that DNA methyltransferases preferentially target nucleosomes [28, 29], so that the methylated CpGs are distributed with a period of ~10 bases along the nucleosome DNA. Nucleosomes containing CG dinucleotides in positions that are key for nucleosome stability, in the minor grooves at the DNA/histone interface [7, 30], could be called epigenetic nucleosomes. The C-methylation in CG dinucleotides may tune the stability of the nucleosomes in promoter regions, and modulate the stability of the proposed anchor nucleosomes, e.g., Alu-nucleosomes, which contain many CG dinucleotides. It has been demonstrated for the first time that CpG methylation renders compactness to nucleosomes, with DNA bound more tightly to the histone octamers.
Are the weak nucleosomes phased?
The poor manifestation of dinucleotide periodicities in the human genome, namely CG only, suggests that the majority of human nucleosomes are rather weak. This means that there is only weak pressure by the nucleosome positioning signal. As experiment and calculations (Gabdank et al., unpublished data) show, even in the highly periodical genome of C. elegans, the majority of the nucleosomes are as weak as the "nucleosomes" mapped on random sequences. The mouse genome, in which no dinucleotide periodicity emerges in autocorrelation calculations, is especially interesting in this respect. This is a nightmare case for signal hunters, although there definitely must be a certain sequence specificity for chromatin organization. After all, the DNA in the mouse genome is packed into nucleosomes as well, and the mouse chromatin is not known to be any different from typical mammalian chromatin. It would be incorrect, though, to conclude that weak nucleosomes are randomly distributed along the sequences. Let us consider a hypothetical natural sequence in which the positioning signal is not introduced. In that sense, the sequence would be "random". But the histone octamers would still bind to those segments of the sequence that do have some resemblance to the standard positioning pattern. They will form weak nucleosomes at specific positions along the sequence. The non-randomness of nucleosome positioning in natural genomes is evidenced by the existence of the "nucleosome repeat lengths", from 160 to 240 bases, depending on the species.
For detection of the periodical repetition of the DNA bendability pattern, whole-genome sequences with very weak or invisible periodicities are not suitable. The periodical signal extraction will probably succeed when it is applied instead to the comprehensive nucleosome DNA database sequences (work in progress). Due to the affinity of histone octamers for the segments with highest bendability, the sequences of the databases will contain the signal. For its extraction, the previously described signal regeneration procedure can be used. This study and others [18, 34] show that, no matter how weak the nucleosome positioning signal is, it can be traced and even characterized by one or another signal processing technique. It also shows that, due to apparently species-specific sequence preferences, different components of the general nucleosome positioning pattern can be used by different organisms. The preferential use of CG dinucleotides in human chromatin is an illustration. At the same time, since the physics of nucleosome positioning should be the same everywhere, the same universal pattern should be used by all species. This does not exclude, though, that there can be species-specific biases towards one or another selection of dinucleotides predominantly used for positioning of nucleosomes. Finally, with the identification of nucleosome positioning CG and other dinucleotides, it seems very natural to extend these considerations to the variable sites in eukaryotic genomes and (especially) the human genome. Single nucleotide polymorphisms (SNPs), SNP haplotypes and repetitive sequences, whether stable or subject to expansion or contraction, appear in a new light, as the respective nucleosomes involved may vary in strength and/or position.
We appreciate the help of H. Goldbrunner and F. Hochmann with setting up the Linux genome server. We greatly appreciate discussions with and advice of G. Bernardi. We are also grateful to the anonymous reviewers for their thoughtful comments and suggestions towards improvement of the manuscript. The work has been supported by grant 222/09 of the Israel Science Foundation and by a fellowship of SoMoPro (Southern Moravian Program, Czech Republic) with financial contribution of the European Union within the 7th framework program (FP/2007-2013, grant agreement No. 229603).
- Trifonov EN, Sussman JL: The pitch of chromatin DNA is reflected in its nucleotide sequence. Proceedings of the National Academy of Sciences USA. 1980, 77: 3816-3820. 10.1073/pnas.77.7.3816.
- Mengeritsky G, Trifonov EN: Nucleotide sequence-directed mapping of the nucleosomes. Nucleic Acids Research. 1983, 11: 3833-3851. 10.1093/nar/11.11.3833.
- Satchwell SC, Drew HR, Travers AA: Sequence periodicities in chicken nucleosome core DNA. Journal of Molecular Biology. 1986, 191: 659-675. 10.1016/0022-2836(86)90452-3.
- Ioshikhes I, Bolshoy A, Derenshteyn K, Borodovsky M, Trifonov EN: Nucleosome DNA sequence pattern revealed by multiple alignment of experimentally mapped sequences. Journal of Molecular Biology. 1996, 262: 129-139. 10.1006/jmbi.1996.0503.
- Cohanim AB, Kashi Y, Trifonov EN: Yeast nucleosome DNA pattern: deconvolution from genome sequences of S. cerevisiae. Journal of Biomolecular Structure and Dynamics. 2005, 22: 687-694.
- Johnson SM, Tan FJ, McCullough HL, Riordan DP, Fire AZ: Flexibility and constraint in the nucleosome core landscape of Caenorhabditis elegans chromatin. Genome Research. 2006, 16: 1505-1516. 10.1101/gr.5560806.
- Gabdank I, Barash D, Trifonov EN: Nucleosome DNA bendability matrix (C. elegans). Journal of Biomolecular Structure and Dynamics. 2009, 26: 403-412.
- Bettecken T, Trifonov EN: Repertoires of the nucleosome-positioning dinucleotides. PLoS ONE. 2009, 4: e7654. 10.1371/journal.pone.0007654.
- Trifonov EN, Bettecken T: Noninteger pitch and nuclease sensitivity of chromatin DNA. Biochemistry. 1979, 18: 454-456. 10.1021/bi00570a011.
- Prunell A, Kornberg R, Lutter L, Klug A, Levitt M, Crick F: Periodicity of deoxyribonuclease-I digestion of chromatin. Science. 1979, 204: 855-858. 10.1126/science.441739.
- Ulanovsky LE, Trifonov EN: Superhelicity of nucleosomal DNA changes its double-helical repeat. Cell Biophysics. 1983, 5: 281-283.
- Cohanim AB, Kashi Y, Trifonov EN: Three sequence rules for chromatin. Journal of Biomolecular Structure and Dynamics. 2006, 23: 559-566.
- Kato M, Onishi Y, Wada-Kiyama Y, Abe T, Ikemura T, Kogan S, Bolshoy A, Trifonov EN, Kiyama R: Dinucleosome DNA of human K562 cells: experimental and computational characterizations. Journal of Molecular Biology. 2003, 332: 111-125. 10.1016/S0022-2836(03)00838-6.
- Kogan SB, Kato M, Kiyama R, Trifonov EN: Sequence structure of human nucleosome DNA. Journal of Biomolecular Structure and Dynamics. 2006, 24: 43-48.
- Jurka J, Milosavljevic A: Reconstruction and analysis of human Alu genes. Journal of Molecular Evolution. 1991, 32: 105-121. 10.1007/BF02515383.
- Salih F, Salih B, Kogan S, Trifonov EN: Epigenetic nucleosomes: Alu sequences and CG as nucleosome positioning element. Journal of Biomolecular Structure and Dynamics. 2008, 26: 9-16.
- Shannon CE: A mathematical theory of communication. The Bell System Technical Journal. 1948, 27: 379-423.
- Rapoport A, Frenkel ZM, Trifonov EN: Nucleosome positioning pattern derived from oligonucleotide compositions of genomic sequences. Journal of Biomolecular Structure and Dynamics. 2011, 28: 567-574.
- Kiyama R, Trifonov EN: What positions nucleosomes? - A model. FEBS Letters. 2002, 523: 7-11. 10.1016/S0014-5793(02)02937-X.
- Englander EW, Wolffe AP, Howard BH: Nucleosome interactions with a human Alu element. Transcriptional repression and effects of template methylation. Journal of Biological Chemistry. 1993, 268: 19565-19573.
- Englander EW, Howard BH: Nucleosome positioning by human Alu elements in chromatin. Journal of Biological Chemistry. 1995, 270: 10091-10096. 10.1074/jbc.270.17.10091.
- Tanaka Y, Yamashita R, Suzuki Y, Nakai K: Effects of Alu elements on global nucleosome positioning in the human genome. BMC Genomics. 2010, 11: 309-318. 10.1186/1471-2164-11-309.
- Benson G: Tandem cyclic alignment. Lecture Notes in Computational Sciences. 2001, 2089: 118-130.
- Zhang XY, Fittler F, Hörz W: Eight different highly specific nucleosome phases on alpha-satellite DNA in the African green monkey. Nucleic Acids Research. 1983, 11: 4287-4306. 10.1093/nar/11.13.4287.
- Linxweiler W, Hörz W: Reconstitution of mononucleosomes: characterization of distinct particles that differ in the position of the histone core. Nucleic Acids Research. 1984, 12: 9395-9413. 10.1093/nar/12.24.9395.
- Bernstein BE, Meissner A, Lander ES: The mammalian epigenome. Cell. 2007, 128: 669-681. 10.1016/j.cell.2007.01.033.
- Pennings S, Allan J, Davey CS: DNA methylation, nucleosome formation and positioning. Briefings in Functional Genomics and Proteomics. 2005, 3: 351-361. 10.1093/bfgp/3.4.351.
- Jeong S, Liang G, Sharma S, Lin JC, Choi SH, Han H, Yoo CB, Egger G, Yang AS, Jones PA: Selective Anchoring of DNA Methyltransferases 3A and 3B to Nucleosomes Containing Methylated DNA. Molecular and Cellular Biology. 2009, 29: 5366-5376. 10.1128/MCB.00484-09.
- Chodavarapu RK, Feng S, Bernatavichute YV, Chen P-Y, Stroud H, Yu Y, Hetzel JA, Kuo F, Kim J, Cokus SJ, Casero D, Bernal M, Huijser P, Clark AT, Krämer U, Merchant SS, Zhang X, Jacobsen SE, Pellegrini M: Relationship between nucleosome positioning and DNA methylation. Nature (London). 2010, 466: 388-392. 10.1038/nature09147.
- Gabdank I, Barash D, Trifonov EN: Single-base resolution nucleosome mapping on DNA sequences. Journal of Biomolecular Structure and Dynamics. 2010, 28: 107-121.
- Choy JS, Wei S, Lee JY, Tan S, Chu S, Lee TH: DNA Methylation Increases Nucleosome Compaction and Rigidity. Journal of the American Chemical Society. 2010, 132: 1782-1783. 10.1021/ja910264z.
- Valouev A, Ichikawa J, Tonthat T, Stuart J, Ranade S, Peckham H, Zeng K, Malek JA, Costa G, McKernan K, Sidow A, Fire A, Johnson SM: A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning. Genome Research. 2008, 18: 1051-1063. 10.1101/gr.076463.108.
- Kornberg RD: Structure of chromatin. Annual Reviews of Biochemistry. 1977, 46: 931-954. 10.1146/annurev.bi.46.070177.004435.
- Frenkel ZM, Bettecken T, Trifonov EN: Nucleosome DNA sequence structure of isochores. BMC Genomics. 2011, 12: 203.
- Trifonov EN: Base pair stacking in nucleosome DNA and bendability sequence pattern. Journal of Theoretical Biology. 2010, 263: 337-339. 10.1016/j.jtbi.2009.11.020.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.