Definition, conservation and epigenetics of housekeeping and tissue-enriched genes
© She et al; licensee BioMed Central Ltd. 2009
Received: 08 November 2008
Accepted: 17 June 2009
Published: 17 June 2009
Housekeeping genes (HKG) are constitutively expressed in all tissues while tissue-enriched genes (TEG) are expressed at a much higher level in a single tissue type than in others. HKGs serve as valuable experimental controls in gene and protein expression experiments, while TEGs tend to represent distinct physiological processes and are frequently candidates for biomarkers or drug targets. The genomic features of these two groups of genes expressed in opposing patterns may shed light on the mechanisms by which cells maintain basic and tissue-specific functions.
Here, we generate gene expression profiles of 42 normal human tissues on custom high-density microarrays to systematically identify 1,522 HKGs and 975 TEGs and compile a small subset of 20 housekeeping genes which are highly expressed in all tissues with lower variance than many commonly used HKGs. Cross-species comparison shows that both the functions and expression patterns of HKGs are conserved. TEGs are enriched with respect to both segmental duplication and copy number variation, while no such enrichment is observed for HKGs, suggesting that the high expression of HKGs is not due to high copy numbers. Analysis of genomic and epigenetic features of HKGs and TEGs reveals that the high expression of HKGs across different tissues is associated with decreased nucleosome occupancy at the transcription start site, as indicated by enhanced DNase hypersensitivity. Additionally, we systematically and quantitatively demonstrate that CpG islands are enriched at the transcription start sites (TSS) of HKGs and depleted at the TSS of TEGs. Histone methylation patterns differ significantly between HKGs and TEGs, suggesting that methylation contributes to the differential expression patterns as well.
We have compiled a set of high quality HKGs that should provide higher and more consistent expression when used as references in laboratory experiments than currently used HKGs. The comparison of genomic features between HKGs and TEGs shows that HKGs are more conserved than TEGs in terms of functions, expression pattern and polymorphisms. In addition, our results identify chromatin structure and epigenetic features of HKGs and TEGs that are likely to play an important role in regulating their strikingly different expression patterns.
The expression of most genes varies between different cell and tissue types and between different developmental and physiological states. Some genes, however, are constitutively expressed in all tissues and their expression levels are comparatively constant across different cell types. These genes have been referred to as housekeeping genes (HKGs) and are hypothesized to constitute a small set of genes required to maintain minimum basic cellular function. In contrast to the expression pattern of HKGs, tissue-enriched genes (TEGs) are highly expressed in one particular tissue type and are either not expressed or are expressed at much lower levels in other tissues. TEGs are generally responsible for the specialized functions of the particular tissues or cell types in which they are expressed and can therefore serve as biomarkers of specific biological processes or tissues. Since many diseases involve tissue- or organ-specific processes, TEGs may also be good candidate drug targets. HKGs, in contrast, have been widely used as experimental controls and normalization references for gene transcription and expression experiments, including RT-PCR, qPCR, Western blotting and microarray studies. The expression of many of the genes currently used for such purposes, however, varies across different cell types and conditions, and consequently there is a need for a better set of HKGs that have stable, high expression levels across a large number of tissues.
The genomic organization of HKGs is comparatively compact: intronic regions, coding regions and the intergenic spaces are shorter for HKGs than for other genes [2, 3], and HKGs are strongly clustered in the human genome, suggesting selection for economy in transcription and translation and genomic co-regulation of broadly expressed genes. HKGs, as a result of their critical role in basic cell maintenance, are subject to stronger purifying selection and therefore evolve more slowly than TEGs in terms of sequence mutation. It is less well understood to what extent the functions and expression patterns of HKGs are conserved across species, whether HKGs are conserved at the genomic structure level and how polymorphic HKGs and TEGs are among different individuals within a species. To address these questions, we sought to define a high quality set of HKGs and then analyze the conservation of HKGs in terms of functions and expression patterns. We also analyzed the distribution of genomic components, such as segmental duplications, copy number variation regions and ultraconserved elements, which are closely related to conservation.
The regulatory mechanisms underlying the differential expression patterns of HKGs compared to TEGs are also poorly characterized. Chromatin structure and epigenetic modifications of genomic structure have been documented to regulate gene expression and affect replication, recombination and DNA repair [6, 7] through various mechanisms including nucleosome positioning and occupancy, histone modification (mainly acetylation and methylation) and DNA cytosine methylation [8, 9]. Abnormal changes in chromatin structure have been linked to disease, particularly cancer. Investigation of the differences in chromatin structure and epigenetic modification between HKGs and TEGs, consequently, may provide insight into epigenetic contributions to transcriptional patterns and the mechanisms of gene regulation and disease.
Here we use microarray gene expression profiling and analysis to compile a set of 1,522 high quality HKGs that are highly expressed in 42 normal tissues and show minimal fluctuations in expression level across these tissues. Similarly, we describe the identification of 975 TEGs. These genes from both categories are potentially useful laboratory experimental controls. The distinct expression patterns of HKGs and TEGs and the high quality of these sets also provide an opportunity to enhance our understanding of transcriptional and epigenetic regulatory mechanisms. We compare and contrast the genomic and epigenetic properties of HKGs and TEGs, and identify epigenetic factors that may contribute to the underlying mechanisms of expression regulation differences between HKGs and TEGs.
Results and Discussion
HKGs and TEGs
We identified 1,522 HKGs from a total of 18,149 genes in 42 normal human tissues monitored on the microarray (see Methods). This list of HKGs was used for analysis of genomic and epigenetic features (see Additional file 1). We also identified 975 TEGs from a subset of 29 representative tissues. These TEGs are expressed at a much higher level in one tissue than in any other tissue (see Methods and Additional file 2). TEGs were found in 26 tissues, while no TEGs meeting our criteria were identified in spleen, colon and CD4+ T-cells.
There have been other recent efforts to identify HKGs in human tissues to serve as experimental references or internal controls [12–16], but these studies have significant shortcomings. One study surveyed a much smaller set of 7,012 potential candidate genes, while another used a much smaller set of 15 tissues. Others were based on microarray data from heterogeneous sources lacking systematic experimental controls or were based on in silico predictions. The HKGs identified here are based on high quality microarray expression data systematically gathered from a large and diverse set of tissues. This is a systematically and experimentally defined set of HKGs that have both high expression and low fluctuation across all major organs and tissues.
The human and mouse transcriptomes in multiple tissues have been surveyed in microarray studies [17–19] that built foundations for studies on housekeeping and tissue-specific genes [13, 19, 20]. Large collections of EST and SAGE data have also been used to identify HKGs and tissue-specific genes. When the HKGs of this study are compared with those of other studies based on microarray or EST datasets, significant portions of the genes in the three HKG sets overlap, and the HKG list described here has the fewest genes unique to a single study. This suggests that our fluctuation-controlled microarray approach is more conservative than the other methods, which either depend on sampling or representation in an EST dataset or lack control of variation across different tissues (see Additional file 3). We also compared our TEGs in testis, prostate, liver and skin with tissue-specific genes from another study (see Additional file 4). A significant portion of our TEGs overlap with this human tissue-specific gene study, particularly for liver, in which 70% of the defined TEGs are identical. Other tissues are more discrepant, as a result of different tissue selection for the surveys and different criteria used to identify these genes.
Conservation of functions and expression patterns in HKGs across species
[Table: Cross-species conservation of housekeeping genes and tissue-enriched genes. Recoverable column: orthologs of all genes on array. Table body not preserved.]
[Table: Expression of human housekeeping genes in mammals. Recoverable columns: CV of intensity; all genes in array. Table body not preserved.]
Distribution of segmental duplication, copy number variation sites and ultraconserved elements in HKGs and TEGs
[Table 3: Genes with segmental duplication, copy number variation and ultraconserved elements. Recoverable columns: total number of genes; number of genes and percent of all genes for each feature (segmental duplication, copy number variation, ultraconserved elements). Table body not preserved.]
We also calculated the distribution of ultraconserved elements (UCEs), which are sequences that are absolutely conserved (100% identical) between orthologous regions of the human, rat, and mouse genomes. Consistent with the slower evolution of HKGs, UCEs are significantly enriched in HKGs and unchanged in TEGs (Table 3). HKGs have been found to evolve more slowly than TEGs at the level of sequence point mutation. The distribution of segmental duplications (SD), copy number variation (CNV) and UCEs demonstrates that HKGs are also more conserved than TEGs with respect to genomic structural changes.
Enriched CpG islands at HKG transcription start sites
[Table: CpG islands at transcription start and end sites. Recoverable columns, given separately for transcription start sites and transcription end sites: number of genes with CpG islands (ratio); CpG density (/bp). Final column: total number of genes in class. Table body not preserved.]
Chromatin structure and epigenetic modifications in HKGs
We next examined differences in chromatin structure and epigenetic modifications, including nucleosome occupancy, histone modifications, and DNA methylation, between HKGs and TEGs as possible mechanisms contributing to the differential expression patterns of these two groups of genes.
The HKGs identified in this study are highly expressed in CD4+ T cells (and all other tissues), leading to the possibility that the differences in HS site density seen for HKGs and TEGs may reflect only the overall expression level of these genes in CD4+ T cells, rather than the difference in expression patterns across tissues. To address this question, we partitioned both the HKGs and RefSeq genes into subgroups based on their expression level in CD4+ T cells: low, intermediate and high (Figure 3B). While a correlation between HS site density and expression level is still observed across the subgroups for either HKGs or RefSeq genes, the HKG-low expression subgroup (average expression intensity: 2.55) has a higher HS site density than the RefSeq-high expression subgroup (average intensity: 3.48), clearly demonstrating that the HS site density is not simply a function of gene expression level in CD4+ T cells, but also correlates with the high levels of expression across different tissues. A recent study showed the positive association of CpG density with the distribution of HS sites across different tissues, suggesting that the increase in HS sites in HKGs may be related to high CpG density. Another possible explanation for this observation is that HKGs may contain sequence elements at their TSS that inhibit formation of nucleosomes, leading to high promoter accessibility and higher expression levels of these genes across different tissues. Further investigation of TSS sequences and more HS site mapping in other tissues would be necessary to test this hypothesis.
DNA methylation in mammals occurs on cytosine residues of CpG dinucleotides, which may lead to formation of heterochromatin, imprinting and transcriptional repression. The distribution of genome-wide DNA methylation in HKGs, RefSeq genes and TEGs (see Additional file 6) shows that DNA methylation peaks at the TSS for all gene groups and that there is no significant difference in DNA methylation levels between HKGs, TEGs and RefSeq genes in either sperm cells or fibroblast cells. Additionally, comparison of the list of methylated genes from another recent study with our HKGs and TEGs did not yield any significant overlap (data not shown). Based on these two pieces of evidence, HKGs do not appear to be enriched for DNA methylation, despite enrichment for CpG islands. This observation is consistent with previous reports that CpG islands in normal tissues are protected from methylation and that methylation of CpG islands is one of the mechanisms of tumorigenesis [50–52].
Using high quality microarray gene expression profiling data, we identified a small subset of housekeeping genes that are highly expressed in 42 diverse normal tissues with small variation in expression level across these tissues. Cross-species studies indicate that the functions and expression patterns of these HKGs are conserved between different species. These features make these genes better candidates for experimental references of transcription and expression levels than currently commonly used housekeeping genes: they can be easily detected, are stable across different tissues and are likely to be HKGs in other species. To investigate the mechanisms behind transcriptional regulation of HKGs and TEGs, we compared genomic features, chromatin structure, and epigenetic modifications between a larger set of HKGs and TEGs. We find that CpG islands are enriched near the TSSs of HKGs, in line with previous studies. HKGs have lower nucleosome occupancy, as indicated by strong enrichment of DNase I hypersensitive sites in HKGs that cannot be fully explained by the high expression level of HKGs in a single tissue type (CD4+ T-cells). HKGs are enriched for DNase I hypersensitive sites relative to RefSeq genes of comparable or higher expression levels. HKGs and TEGs show significant differences in various histone methylation patterns, suggesting that histone methylation likely plays a role in the differential expression patterns, but the relationship between histone methylation patterns and expression patterns is complex. DNA methylation patterns, in contrast, are similar for both HKGs and TEGs, suggesting that DNA methylation does not play a significant role in the differential expression patterns of these different types of genes. Elevated histone acetylation is not seen for HKGs after the correlation with expression is accounted for.
Interestingly, however, histone acetylation appears to be elevated in all genes with moderate to high expression levels, suggesting that histone acetylation may serve as a general transcriptional switch to open chromatin and provide access to other transcription factors, which then regulate the extent of expression.
mRNA from human tissues was purchased from commercial vendors, including Clontech, Ambion, and Biochain. Most samples were pooled from multiple donors, typically twelve. A total of 42 normal tissues were tested, including adipose, adrenal gland, bladder, activated CD4-positive T-lymphocyte, activated CD8-positive T-lymphocyte, bone marrow, brain, fetal brain, cerebellum, cerebral cortex, hippocampus, thalamus, pituitary gland, cervix uteri, colon, epididymis, heart, kidney, fetal kidney, liver, fetal liver, lung, fetal lung, trachea, lymph node, mammary gland, skeletal muscle, ovary, placenta, prostate, retina, salivary gland, skin, duodenum, ileum, jejunum, spinal cord, spleen, stomach, testis, thymus, and thyroid gland. These selected tissues cover most major organs and normal tissue types. Four fetal tissues (brain, kidney, liver and lung) were included.
Microarray expression profiling
Human tissue microarray expression profiling was performed as described previously. In brief, purchased mRNA pooled from multiple normal individuals was amplified and labeled using a full-length amplification protocol and hybridized in duplicate against a common reference pool in a two-color dye swap experiment. Each gene is represented by 3 microarray probes placed at exon-exon junctions or in exons. Gene expression was calculated as the median probe intensity, after normalization by the pool of all data. The dataset is available at the National Center for Biotechnology Information's Gene Expression Omnibus database [GEO accession: GSE16546].
Selection of HKGs and TEGs
We used fairly conservative criteria to identify HKGs: the intensity of the gene must be greater than the median intensity of all genes in the microarray in at least 41 out of 42 tissues, and the coefficient of variation (CV, standard deviation/average) of the gene intensity across tissues must be less than 1. The intensity and CV of the 18,149 genes monitored in the microarray are distributed over a wide range, with an average intensity of all genes of 1.04 ± 1.94 (SD) and an average CV of all genes of 0.83 ± 0.77 (SD). A recent study shows that a gene's breadth of expression across tissues is positively correlated with its expression level. Therefore it is reasonable to select HKGs from among those genes with higher intensity. While the CVs of most genes (76% of all genes) are below 1, the expression of some genes is very volatile across tissues, with CVs as high as 6. Our criteria ensure that the HKGs are highly expressed in the vast majority of tissues with limited fluctuation in intensity level across tissues.
More stringent criteria were used to identify a reference HKG list for laboratory experimental controls. We required that the intensity of each HKG be greater than the median of all genes in each of the 42 tissues and CV of intensity less than 0.35. A total of 362 HKGs meet these criteria. The top 20 genes ranked by their average intensity across all 42 tissues were selected as the experimental housekeeping genes reference.
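The two sets of selection criteria above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in the study: the function names, the toy input layout (a dictionary mapping each gene to one intensity per tissue), the per-tissue reading of "median of all genes", and the use of the sample standard deviation for the CV are all assumptions.

```python
from statistics import mean, median, stdev

def coefficient_of_variation(values):
    """CV = standard deviation / average (sample SD is an assumption)."""
    avg = mean(values)
    if avg <= 0:
        # CV is not meaningful for a non-positive average; treat as maximally variable.
        return float("inf")
    return stdev(values) / avg

def select_hkgs(expr, min_tissues_above_median, max_cv):
    """Select genes above the per-tissue median in enough tissues and with CV < max_cv.

    expr maps gene name -> list of intensities, one per tissue, same tissue order for all genes.
    """
    n_tissues = len(next(iter(expr.values())))
    # Median intensity of all genes within each tissue (assumed interpretation).
    tissue_medians = [median(v[t] for v in expr.values()) for t in range(n_tissues)]
    selected = []
    for gene, values in expr.items():
        n_above = sum(1 for t, x in enumerate(values) if x > tissue_medians[t])
        if n_above >= min_tissues_above_median and coefficient_of_variation(values) < max_cv:
            selected.append(gene)
    return selected

def reference_hkgs(expr, n_tissues, max_cv=0.35, top_n=20):
    """Stricter list: above the median in every tissue, CV < 0.35, ranked by average intensity."""
    strict = select_hkgs(expr, n_tissues, max_cv)
    return sorted(strict, key=lambda g: mean(expr[g]), reverse=True)[:top_n]

# Toy data with 3 tissues; the study used 42 tissues and thresholds of 41 and 42.
toy = {
    "A": [10, 11, 10],  # high and stable
    "B": [1, 1, 1],     # low
    "C": [5, 50, 5],    # volatile
    "D": [9, 9, 10],    # high in two of three tissues
    "E": [2, 3, 2],     # low
}
broad = select_hkgs(toy, min_tissues_above_median=2, max_cv=1.0)
reference = reference_hkgs(toy, n_tissues=3, top_n=1)
```

With real data, the calls corresponding to the text would be `select_hkgs(expr, 41, 1.0)` for the 1,522-gene list and `reference_hkgs(expr, 42)` for the top-20 reference list.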
To identify TEGs, we selected 29 representative tissue types, removing fetal and redundant tissues from the set of 42 tissues described above. The resulting set was as follows: adipose, adrenal gland, bladder, bone marrow, brain, cervix uteri, colon, heart, kidney, liver, lung, trachea, mammary gland, ovary, skeletal muscle, lymph node, placenta, prostate, retina, salivary gland, skin, spinal cord, spleen, stomach, testis, thymus, thyroid gland, jejunum, and CD4-positive T-lymphocyte. To be identified as a TEG, the intensity of the gene in the relevant tissue was required to meet three criteria: 1) among the top 25% of all genes in that particular tissue; 2) greater than 50% of the sum of intensities for that gene in all other tissues in the set of 29; and 3) greater than three times the intensity of the gene in any other tissue.
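The three TEG criteria can be expressed as a small filter. The sketch below is illustrative only and rests on several assumptions not stated in the text: intensities are positive, the top-25% cutoff is taken as the 75th percentile computed by `statistics.quantiles` with its default method, and the function names and toy data are hypothetical. Because criterion 3 requires the gene to exceed three times its intensity in every other tissue, only the tissue with the highest intensity needs to be tested.

```python
from statistics import quantiles

def top_quartile_cutoffs(expr):
    """75th-percentile intensity of all genes in each tissue (percentile method is an assumption)."""
    n_tissues = len(next(iter(expr.values())))
    return [quantiles([v[t] for v in expr.values()], n=4)[2] for t in range(n_tissues)]

def enriched_tissue(values, cutoffs):
    """Return the index of the tissue in which the gene is enriched, or None."""
    top = max(range(len(values)), key=lambda t: values[t])
    x = values[top]
    others = [v for t, v in enumerate(values) if t != top]
    in_top_quartile = x >= cutoffs[top]              # criterion 1
    over_half_of_rest = x > 0.5 * sum(others)        # criterion 2
    three_fold = all(x > 3 * v for v in others)      # criterion 3
    return top if (in_top_quartile and over_half_of_rest and three_fold) else None

def identify_tegs(expr):
    """Map each tissue-enriched gene to the index of its tissue."""
    cutoffs = top_quartile_cutoffs(expr)
    result = {}
    for gene, values in expr.items():
        tissue = enriched_tissue(values, cutoffs)
        if tissue is not None:
            result[gene] = tissue
    return result

# Toy data with 3 tissues (index 0 plays the role of, say, liver).
toy = {
    "liver_gene": [100, 5, 4],
    "flat": [10, 10, 10],
    "low": [1, 1, 1],
    "mid": [20, 15, 18],
    "weak": [3, 1, 1],
}
tegs = identify_tegs(toy)
```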
Conservation of functions
We used the number of orthologs of human genes in other eukaryotic species, as identified by NCBI HomoloGene, as an indication of functional conservation across species. We mapped human HKGs, TEGs and all genes represented in the microarray to orthologs in mouse, rat, dog, fly (D. melanogaster), worm (C. elegans) and budding yeast (S. cerevisiae). The numbers of human genes that map to genes of other species through HomoloGene were counted. Student's t-tests were applied between orthologs of HKGs and all genes and between orthologs of TEGs and all genes.
Distribution of SD and CNV in genes
We required at least a quarter of the total genomic length of a gene to overlap the SD or CNV region (Table 3). The p-values, indicating the statistical significance of the overlap for HKGs and TEGs relative to all RefSeq genes, were calculated according to the hypergeometric distribution with a Bonferroni correction.
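As a rough illustration of the overlap rule and the significance test described above, the following sketch uses only the Python standard library. It is not the authors' code: the half-open coordinate convention, the assumption that SD/CNV regions do not overlap one another, the one-sided (enrichment) tail, and the number of tests used for the Bonferroni correction are all assumptions, and the function names are hypothetical.

```python
from math import comb

def overlaps_quarter(gene_start, gene_end, regions):
    """True if at least 25% of the gene length is covered by the given regions.

    Coordinates are half-open [start, end); regions are assumed not to overlap each other.
    """
    gene_len = gene_end - gene_start
    covered = sum(max(0, min(gene_end, e) - max(gene_start, s)) for s, e in regions)
    return covered >= 0.25 * gene_len

def hypergeom_upper_tail(k, total_genes, total_with_feature, group_size):
    """P(X >= k): chance that a random group of this size holds at least k feature genes."""
    denom = comb(total_genes, group_size)
    numer = sum(
        comb(total_with_feature, i) * comb(total_genes - total_with_feature, group_size - i)
        for i in range(k, min(total_with_feature, group_size) + 1)
    )
    return numer / denom

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Toy example: 10 genes in total, 5 overlap a CNV region, and a 2-gene group has 1 such gene.
p_raw = hypergeom_upper_tail(1, 10, 5, 2)
p_adj = bonferroni(p_raw, n_tests=2)
```

A depletion test would use the lower tail instead; the source does not state which tail was applied to each comparison.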
CpG island coordinates were obtained from the human CpG island track of the UCSC genome browser (http://genome.ucsc.edu). The number and length of CpG islands located within 500 bp upstream and downstream of transcription start sites and end sites were calculated for HKGs, TEGs and RefSeq genes. CpG density is given as the fraction of base pairs occupied by CpG islands. The hypergeometric distribution with Bonferroni correction was applied to determine the significance of the enrichment or depletion of CpG islands relative to the density seen for RefSeq genes.
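The density measure described above (fraction of base pairs within 500 bp of a site that fall inside CpG islands) can be sketched as follows. This is a minimal illustration under assumed conventions: half-open coordinates, islands on the same chromosome as the site and not overlapping one another, and hypothetical function names and toy coordinates.

```python
def cpg_density(site, islands, flank=500):
    """Fraction of bases in [site - flank, site + flank) covered by CpG islands."""
    win_start, win_end = site - flank, site + flank
    covered = sum(max(0, min(win_end, e) - max(win_start, s)) for s, e in islands)
    return covered / (win_end - win_start)

def summarize_group(sites, islands, flank=500):
    """Return (fraction of sites with any CpG island in the window, mean CpG density)."""
    densities = [cpg_density(site, islands, flank) for site in sites]
    with_island = sum(1 for d in densities if d > 0)
    return with_island / len(sites), sum(densities) / len(densities)

# Toy example: two transcription start sites and one CpG island spanning 900-1100.
islands = [(900, 1100)]
ratio, mean_density = summarize_group([1000, 5000], islands)
```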
Chromatin structure and epigenetics modifications
Data on DNase I hypersensitive (HS) sites, histone acetylation, histone methylation, transcription factor binding sites and DNA methylation were obtained from recent publications [35, 39, 43, 48]. The density of each feature was calculated in a 500 bp sliding window advancing 100 bp at a time near the transcription start sites of HKGs, RefSeq genes and TEGs. The average intensity of all genes in each group was plotted as a function of the distance to the transcription start site.
We thank Rosetta's Gene Expression Laboratory for performing the microarray experiments.
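The sliding-window calculation can be illustrated with a short sketch. The window size (500 bp) and step (100 bp) come from the text; everything else is an assumption made for illustration, including the span examined around the TSS, the representation of features as single positions relative to the TSS, and the function names.

```python
def window_profile(rel_positions, span=2000, window=500, step=100):
    """Feature density per window around a TSS.

    rel_positions: feature positions relative to the TSS (negative = upstream).
    Returns a list of (window centre relative to TSS, features per bp).
    """
    profile = []
    start = -span
    while start + window <= span:
        count = sum(1 for p in rel_positions if start <= p < start + window)
        profile.append((start + window // 2, count / window))
        start += step
    return profile

def average_profiles(profiles):
    """Average the per-window densities over all genes in a group."""
    n = len(profiles)
    return [
        (profiles[0][i][0], sum(p[i][1] for p in profiles) / n)
        for i in range(len(profiles[0]))
    ]

# Toy example: one gene with a single feature at the TSS, one gene with none.
gene_a = window_profile([0], span=500)
gene_b = window_profile([], span=500)
group_average = average_profiles([gene_a, gene_b])
```

Plotting the second element of each pair against the first gives a curve of average density versus distance to the TSS for the group.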
- Butte AJ, Dzau VJ, Glueck SB: Further defining housekeeping, or "maintenance," genes Focus on "A compendium of gene expression in normal human tissues". Physiol Genomics. 2001, 7 (2): 95-96.
- Eisenberg E, Levanon EY: Human housekeeping genes are compact. Trends Genet. 2003, 19 (7): 362-365. 10.1016/S0168-9525(03)00140-9.
- Vinogradov AE: Compactness of human housekeeping genes: selection for economy or genomic design?. Trends Genet. 2004, 20 (5): 248-253. 10.1016/j.tig.2004.03.006.
- Lercher MJ, Urrutia AO, Hurst LD: Clustering of housekeeping genes provides a unified model of gene order in the human genome. Nat Genet. 2002, 31 (2): 180-183. 10.1038/ng887.
- Zhang L, Li WH: Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Mol Biol Evol. 2004, 21 (2): 236-239. 10.1093/molbev/msh010.
- Quina AS, Buschbeck M, Di Croce L: Chromatin structure and epigenetics. Biochem Pharmacol. 2006, 72 (11): 1563-1569. 10.1016/j.bcp.2006.06.016.
- Herman JG, Baylin SB: Gene silencing in cancer in association with promoter hypermethylation. N Engl J Med. 2003, 349 (21): 2042-2054. 10.1056/NEJMra023075.
- Li B, Carey M, Workman JL: The role of chromatin during transcription. Cell. 2007, 128 (4): 707-719. 10.1016/j.cell.2007.01.015.
- Rando OJ: Chromatin structure in the genomics era. Trends Genet. 2007, 23 (2): 67-73. 10.1016/j.tig.2006.12.002.
- Gal-Yam EN, Saito Y, Egger G, Jones PA: Cancer Epigenetics: Modifications, Screening, and Therapy. Annual Review of Medicine. 2008, 59: 267-280. 10.1146/annurev.med.59.061606.095816.
- Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F: Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 2002, 3 (7): RESEARCH0034. 10.1186/gb-2002-3-7-research0034.
- Hsiao LL, Dangond F, Yoshida T, Hong R, Jensen RV, Misra J, Dillon W, Lee KF, Clark KE, Haverty P, et al: A compendium of gene expression in normal human tissues. Physiological Genomics. 2001, 7 (2): 97-104.
- Tu Z, Wang L, Xu M, Zhou X, Chen T, Sun F: Further understanding human disease genes by comparing with housekeeping genes and other genes. BMC Genomics. 2006, 7: 31. 10.1186/1471-2164-7-31.
- De Ferrari L, Aitken S: Mining housekeeping genes with a Naive Bayes classifier. BMC Genomics. 2006, 7: 277. 10.1186/1471-2164-7-277.
- de Jonge HJ, Fehrmann RS, de Bont ES, Hofstra RM, Gerbens F, Kamps WA, de Vries EG, Zee van der AG, te Meerman GJ, ter Elst A: Evidence based selection of housekeeping genes. PLoS ONE. 2007, 2 (9): e898. 10.1371/journal.pone.0000898.
- Kouadjo KE, Nishida Y, Cadrin-Girard JF, Yoshioka M, St-Amand J: Housekeeping and tissue-specific genes in mouse tissues. BMC Genomics. 2007, 8: 127. 10.1186/1471-2164-8-127.
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, et al: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA. 2004, 101 (16): 6062-6067. 10.1073/pnas.0400782101.
- Zhang W, Morris QD, Chang R, Shai O, Bakowski MA, Mitsakakis N, Mohammad N, Robinson MD, Zirngibl R, Somogyi E, et al: The functional landscape of mouse gene expression. J Biol. 2004, 3 (5): 21. 10.1186/jbiol16.
- Liang S, Li Y, Be X, Howes S, Liu W: Detecting and profiling tissue-selective genes. Physiol Genomics. 2006, 26 (2): 158-162. 10.1152/physiolgenomics.00313.2005.
- Farre D, Bellora N, Mularoni L, Messeguer X, Alba MM: Housekeeping genes tend to show reduced upstream sequence conservation. Genome Biol. 2007, 8 (7): R140. 10.1186/gb-2007-8-7-r140.
- Zhu J, He F, Song S, Wang J, Yu J: How many human genes can be defined as housekeeping with current expression data?. BMC Genomics. 2008, 9: 172. 10.1186/1471-2164-9-172.
- Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K, Muruganujan A, Narechania A: PANTHER: a library of protein families and subfamilies indexed by function. Genome Research. 2003, 13 (9): 2129-2141. 10.1101/gr.772403.
- Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S: Database resources of the National Center for Biotechnology Information. Nucleic Acids Research. 2007, 35 (Database issue): D5-12. 10.1093/nar/gkl1031.
- Zhu J, He F, Hu S, Yu J: On the nature of human housekeeping genes. Trends Genet. 2008, 24 (10): 481-484. 10.1016/j.tig.2008.08.004.
- Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE: Segmental duplications: organization and impact within the current human genome project assembly. Genome Research. 2001, 11 (6): 1005-1017. 10.1101/gr.GR-1871R.
- She X, Liu G, Ventura M, Zhao S, Misceo D, Roberto R, Cardone MF, Rocchi M, Green ED, Archidiacano N, et al: A preliminary comparative analysis of primate segmental duplications shows elevated substitution rates and a great-ape expansion of intrachromosomal duplications. Genome Research. 2006, 16 (5): 576-583. 10.1101/gr.4949406.
- Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, et al: Global variation in copy number in the human genome. Nature. 2006, 444 (7118): 444-454. 10.1038/nature05329.
- Watkins-Chow DE, Pavan WJ: Genomic copy number and expression variation within the C57BL/6J inbred mouse strain. Genome Research. 2008, 18 (1): 60-66. 10.1101/gr.6927808.
- McIntyre A, Summersgill B, Lu YJ, Missiaglia E, Kitazawa S, Oosterhuis JW, Looijenga LH, Shipley J: Genomic copy number and expression patterns in testicular germ cell tumours. Br J Cancer. 2007, 97 (12): 1707-1712. 10.1038/sj.bjc.6604079.
- Harada T, Chelala C, Bhakta V, Chaplin T, Caulee K, Baril P, Young BD, Lemoine NR: Genome-wide DNA copy number analysis in pancreatic cancer using high-density single nucleotide polymorphism arrays. Oncogene. 2008, 27 (13): 1951-1960. 10.1038/sj.onc.1210832.
- Bailey JA, Eichler EE: Primate segmental duplications: crucibles of evolution, diversity and disease. Nat Rev Genet. 2006, 7 (7): 552-564. 10.1038/nrg1895.
- Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D: Ultraconserved elements in the human genome. Science. 2004, 304 (5675): 1321-1325. 10.1126/science.1098119.
- Gardiner-Garden M, Frommer M: CpG islands in vertebrate genomes. J Mol Biol. 1987, 196 (2): 261-282. 10.1016/0022-2836(87)90689-9.
- Yamashita R, Suzuki Y, Sugano S, Nakai K: Genome-wide analysis reveals strong correlation between CpG islands with nearby transcription start sites of genes and their tissue specificity. Gene. 2005, 350 (2): 129-136. 10.1016/j.gene.2005.01.012.
- Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, Davis S, Margulies EH, Chen Y, Bernat JA, Ginsburg D, et al: Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Research. 2006, 16 (1): 123-131. 10.1101/gr.4074106.
- Elgin SC: The formation and function of DNase I hypersensitive sites in the process of gene activation. J Biol Chem. 1988, 263 (36): 19259-19262.
- Xi H, Shulha HP, Lin JM, Vales TR, Fu Y, Bodine DM, McKay RD, Chenoweth JG, Tesar PJ, Furey TS, et al: Identification and characterization of cell type-specific and ubiquitous chromatin regulatory structures in the human genome. PLoS Genetics. 2007, 3 (8): e136. 10.1371/journal.pgen.0030136.
- Kurdistani SK, Grunstein M: Histone acetylation and deacetylation in yeast. Nat Rev Mol Cell Biol. 2003, 4 (4): 276-284. 10.1038/nrm1075.
- Bernstein BE, Kamal M, Lindblad-Toh K, Bekiranov S, Bailey DK, Huebert DJ, McMahon S, Karlsson EK, Kulbokas EJ, Gingeras TR, et al: Genomic maps and comparative analysis of histone modifications in human and mouse. Cell. 2005, 120 (2): 169-181. 10.1016/j.cell.2005.01.001.
- Kininis M, Chen BS, Diehl AG, Isaacs GD, Zhang T, Siepel AC, Clark AG, Kraus WL: Genomic analyses of transcription factor binding, histone acetylation, and gene expression reveal mechanistically distinct classes of estrogen-regulated promoters. Molecular and Cellular Biology. 2007, 27 (14): 5090-5104. 10.1128/MCB.00083-07.
- Rada-Iglesias A, Ameur A, Kapranov P, Enroth S, Komorowski J, Gingeras TR, Wadelius C: Whole-genome maps of USF1 and USF2 binding and histone H3 acetylation reveal new aspects of promoter structure and candidate genes for common human disorders. Genome Research. 2008, 18 (3): 380-392. 10.1101/gr.6880908.
- Guo X, Tatsuoka K, Liu R: Histone acetylation and transcriptional regulation in the genome of Saccharomyces cerevisiae. Bioinformatics (Oxford, England). 2006, 22 (4): 392-399. 10.1093/bioinformatics/bti823.
- Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K: High-resolution profiling of histone methylations in the human genome. Cell. 2007, 129 (4): 823-837. 10.1016/j.cell.2007.05.009.
- Williams A, Flavell RA: The role of CTCF in regulating nuclear organization. The Journal of Experimental Medicine. 2008, 205 (4): 747-750. 10.1084/jem.20080066.
- Zhou GL, Xin L, Song W, Di LJ, Liu G, Wu XS, Liu DP, Liang CC: Active chromatin hub of the mouse alpha-globin locus forms in a transcription factory of clustered housekeeping genes. Molecular and Cellular Biology. 2006, 26 (13): 5096-5105. 10.1128/MCB.02454-05.
- Noordermeer D, Branco MR, Splinter E, Klous P, van Ijcken W, Swagemakers S, Koutsourakis M, Spek van der P, Pombo A, de Laat W: Transcription and Chromatin Organization of a Housekeeping Gene Cluster Containing an Integrated beta-Globin Locus Control Region. PLoS Genetics. 2008, 4 (3): e1000016. 10.1371/journal.pgen.1000016.
- Goldberg AD, Allis CD, Bernstein E: Epigenetics: a landscape takes shape. Cell. 2007, 128 (4): 635-638. 10.1016/j.cell.2007.02.006.
- Weber M, Hellmann I, Stadler MB, Ramos L, Paabo S, Rebhan M, Schubeler D: Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat Genet. 2007, 39 (4): 457-466. 10.1038/ng1990.
- Shen L, Kondo Y, Guo Y, Zhang J, Zhang L, Ahmed S, Shu J, Chen X, Waterland RA, Issa JP: Genome-wide profiling of DNA methylation reveals a class of normally methylated CpG island promoters. PLoS Genetics. 2007, 3 (10): 2023-2036. 10.1371/journal.pgen.0030181.
- Caiafa P, Zampieri M: DNA methylation and chromatin structure: the puzzling CpG islands. J Cell Biochem. 2005, 94 (2): 257-265. 10.1002/jcb.20325.
- Jones PA, Baylin SB: The fundamental role of epigenetic events in cancer. Nat Rev Genet. 2002, 3 (6): 415-428.
- Jones PA, Baylin SB: The epigenomics of cancer. Cell. 2007, 128 (4): 683-692. 10.1016/j.cell.2007.01.029.
- Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, Armour CD, Santos R, Schadt EE, Stoughton R, Shoemaker DD: Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science. 2003, 302 (5653): 2141-2144. 10.1126/science.1090100.
- Castle J, Garrett-Engele P, Armour CD, Duenwald SJ, Loerch PM, Meyer MR, Schadt EE, Stoughton R, Parrish ML, Shoemaker DD, et al: Optimization of oligonucleotide arrays and RNA amplification protocols for analysis of transcript structure and alternative splicing. Genome Biol. 2003, 4 (10): R66. 10.1186/gb-2003-4-10-r66.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.