Transcriptional regulation and spatial interactions of head-to-head genes
© Chen et al.; licensee BioMed Central Ltd. 2014
Received: 27 March 2014
Accepted: 19 June 2014
Published: 24 June 2014
In eukaryotic genomes, about 10% of genes are arranged in a head-to-head (H2H) orientation, in which the transcription start sites of the two genes in a pair lie less than 1 kb apart. The two genes in an H2H pair tend to be co-expressed and to share functions. Bidirectional promoters have been studied extensively. However, the mechanism by which H2H genes are regulated at the transcriptional level still needs further clarification, especially with regard to the co-regulation of H2H pairs. In this study, we first used Hi-C data on chromatin linkages to identify spatially interacting H2H pairs, and then integrated ChIP-seq data to compare the transcription factors (TFs) bound to H2H gene pairs with and without evidence of spatial interaction. Using ChIP-seq and DNase-seq data, we identified the modified histones and DNase hypersensitive sites associated with H2H pairs. Furthermore, we examined the connections between H2H genes in a human co-expression network.
We found that (i) similar to the two genes within an H2H pair (intra-H2H pair), two distinct H2H pairs that interact with each other spatially (inter-H2H pair) share common transcription factors (TFs); (ii) the TFs of intra- and inter-H2H pairs are distributed differently: factors such as HEY1, GABP, Sin3Ak-20, POL2, E2F6, and c-MYC are essential for the bidirectional transcription of intra-H2H pairs, while factors such as CTCF, BDP1, GATA2, RAD21, and POL3 play important roles in coherently regulating inter-H2H pairs; (iii) H2H gene blocks are enriched in DNase hypersensitive sites and in modified histones associated with active transcription; and (iv) H2H genes tend to be more highly connected than non-H2H genes in the human co-expression network.
Our findings shed new light on the mechanism of the transcriptional regulation of H2H genes through their linear and spatial interactions. For intra-H2H gene pairs, transcription factors regulate transcription through bidirectional promoters, whereas for inter-H2H gene pairs, transcription factors are likely to regulate their activity depending on the spatial interactions between H2H gene pairs. In this way, two distinct groups of transcription factors mediate intra- and inter-H2H gene transcription, respectively, resulting in a highly compact gene regulatory network.
It is thought that intra-H2H gene pairs are regulated by bidirectional promoters. However, which elements and transcription factors (TFs) play a vital role in this regulation is still an open question. It is well known that the regulation of eukaryotic gene expression involves combinatorial control by TFs, which can be organized in both linear and three-dimensional conformations. In our previous report, we observed positive expression correlations between H2H gene pairs as well as between the genes within an individual H2H pair. However, how TFs regulate inter-H2H gene pairs over long distances is unknown. It has been hypothesized that the establishment of close contacts or chromatin loops may facilitate this process. Using Hi-C technology, spatial proximity maps of the human genome have been built. With ChIP-seq data, the TFs and modified histones associated with H2H genes can be identified. Using these data, we identified two distinct groups of transcription factors that mediate intra- and inter-H2H gene transcription, respectively.
It has been proposed that bidirectional promoters should be treated neither as one large regulatory network under an umbrella classification nor as thousands of separate gene pairs. We hypothesized that H2H genes may contribute to the compactness of the overall gene regulatory network because they are highly co-expressed. By constructing a human co-expression network, we examined the connections between H2H genes and non-H2H genes and compared their attributes.
Results and discussion
Characterization of TFs for intra-H2H gene pairs
TF interaction network regulating H2H gene pairs
Using the transcription factor data described in Methods, we built a binary matrix categorizing whether a given H2H gene pair is regulated by a given TF. By analyzing the correlation between each TF pair (see Methods), we identified co-occurring TF pairs which bind to the same H2H gene pairs. Then we constructed a transcription factor interaction network based on TFs’ co-occurrence.
Since TFs can bridge promoters and distal cis-regulatory elements such as enhancers, insulators, and silencers while looping out the intervening DNA, we examined the TF distributions in spatially interacting inter-H2H pairs.
Inter-H2H gene pairs involved in Hi-C supported interaction loci
Table: Percentage of TFs detected in intra-H2H gene pairs and interacting inter-H2H pairs. Columns: Intra-H2H pairs; Interacting inter-H2H pairs; Enrichment (fold change).
Modified histones and DNase hypersensitive sites associated with H2H gene pairs
Table: Proportion of H2H pairs associated with DNase hypersensitive sites and modified histones. Columns: Number of H2H pairs with epigenetic markers; Proportion of H2H pairs with epigenetic markers; Proportion of all genes with epigenetic markers.
The role of H2H genes in the human co-expression network
It has been reported that co-expressed genes tend to be co-regulated by one or more common transcription factors. Since H2H genes tend to be highly connected to other genes in the co-expression network and have distinct groups of transcription factors mediating intra- and inter-H2H gene transcription, we propose that H2H genes contribute to the compactness of the overall gene regulatory network.
In this study, we systematically investigated H2H genes, their transcription factors, and the modified histones and DNase hypersensitive sites associated with them, based on Hi-C, ChIP-seq and DNase-seq data for the human genome. We confirmed and refined several known properties of H2H gene organization and also provided new observations on the spatial regulation of H2H genes. We further demonstrated that intra-H2H pairs and inter-H2H pairs are regulated by two distinct groups of transcription factors. The presence of DNase hypersensitive sites and of modified histones associated with active transcription may facilitate the high expression of H2H genes. Finally, we analysed the properties of H2H genes in a human co-expression network and found that H2H genes tend to be highly connected to other genes. We propose that the highly expressed H2H genes, regulated through both linear and spatial interactions, contribute to the compactness and thus the high efficiency of the entire gene regulatory network.
H2H gene pair information was obtained from our previous work, DBH2H. DBH2H contains information about H2H gene pairs from human, mouse, rat, chicken and fugu. There are 1447 H2H pairs in the DBH2H database. Human gene co-expression data were obtained from COXPRESdb, a database of co-expressed gene sets. The human gene expression profiles in the database are from NCBI GEO and are based on the Affymetrix GeneChip (Human Genome U133 Plus 2.0 Array). Genomic interaction regions calculated by Lan et al. were derived from a Hi-C data set for the K562 cell line using the Mixture Poisson Regression Model and a power-law decay distribution.
DNase-seq data for DNase hypersensitivity and ChIP-seq data for 9 modified histones and 45 transcription factors in the K562 cell line were downloaded from UCSC (http://genome.ucsc.edu/encode/dataMatrix/encodeDataMatrixHuman.html). The data were analyzed using W-ChIPeaks (http://motif.bmi.ohio-state.edu/W-ChIPeaks) [18, 30].
Identification of interacting inter-H2H pairs
We integrated the location data of human H2H blocks with the genomic interaction loci from Hi-C. If a locus overlapped with an H2H block, we annotated the locus with the corresponding H2H pair. If the two interacting loci overlapped with different H2H blocks, the two H2H pairs (an ‘inter-H2H’ pair) were regarded as spatially interacting with each other. A total of 546 pairs of interacting loci were fully annotated by H2H pairs, and among them, 105 pairs of interacting loci were annotated by different H2H pairs (‘inter-H2H’ pairs) (see Additional file 2).
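The annotation step above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the `Interval` type, the function names, the half-open coordinate convention and the toy coordinates are all assumptions introduced here.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic region; half-open [start, end) coordinates are assumed."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(locus: Interval, blocks: dict) -> set:
    """Names of all H2H blocks that overlap the given Hi-C locus."""
    return {name for name, block in blocks.items() if overlaps(locus, block)}


def interacting_inter_h2h(interactions, blocks) -> set:
    """Unordered pairs of distinct H2H blocks joined by an interacting locus pair."""
    pairs = set()
    for locus_a, locus_b in interactions:
        for h2h_a in annotate(locus_a, blocks):
            for h2h_b in annotate(locus_b, blocks):
                if h2h_a != h2h_b:  # same block on both sides is not an inter-H2H pair
                    pairs.add(frozenset((h2h_a, h2h_b)))
    return pairs


# Hypothetical H2H blocks and Hi-C interacting loci (toy coordinates).
blocks = {
    "H2H_1": Interval("chr1", 1000, 5000),
    "H2H_2": Interval("chr1", 90000, 95000),
    "H2H_3": Interval("chr2", 100, 900),
}
interactions = [
    (Interval("chr1", 4000, 6000), Interval("chr1", 94000, 96000)),  # H2H_1 with H2H_2
    (Interval("chr1", 1500, 2500), Interval("chr1", 3000, 4500)),    # both loci in H2H_1
    (Interval("chr2", 2000, 3000), Interval("chr1", 1000, 2000)),    # first locus unannotated
]
inter_pairs = interacting_inter_h2h(interactions, blocks)
```

Only the first toy interaction links two different blocks, so it is the only one reported as an inter-H2H pair. For genome-scale data an interval tree or a sorted sweep would replace the nested loops.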
Histone modifications and transcription factors for H2H genes
In this paper we studied 45 transcription factors, 9 different types of histone modifications and DNase hypersensitive sites. TF binding sites, modified-histone sites and DNase hypersensitive sites overlapping with H2H blocks were identified and annotated with the corresponding H2H pairs for further analysis. For an interacting ‘inter-H2H’ pair, TFs bound to either H2H pair were regarded as binding to the ‘inter-H2H’ pair. The distributions and preferences of the transcription factors and epigenetic markers associated with H2H pairs were investigated. To assess the association of epigenetic markers with background genes (all 25363 human genes from UCSC hg18), we mapped each gene region plus the 100 bp upstream of its transcription start site to the sites of modified histones and DNase hypersensitivity. We also compared the TF similarities of interacting inter-H2H pairs with those of random inter-pairs to study the characteristics of their spatial interactions. Here, the TF similarity of an inter-gene pair is defined as the ratio of the number of TFs shared by its two members to the total number of TFs bound to the inter-gene pair.
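As an illustration of the TF similarity defined above, the following Python sketch computes the ratio of shared TFs to all TFs bound to either member of an inter-gene pair, and averages it over a set of pairs. The function names and the toy binding sets are hypothetical; only the TF names are taken from the text.

```python
def tf_similarity(tfs_a: set, tfs_b: set) -> float:
    """Shared TFs divided by all TFs bound to either member of the pair."""
    total = tfs_a | tfs_b
    if not total:
        return 0.0  # no TF bound to either member
    return len(tfs_a & tfs_b) / len(total)


def mean_similarity(pairs, binding) -> float:
    """Average TF similarity over a list of (H2H pair, H2H pair) tuples."""
    scores = [tf_similarity(binding[a], binding[b]) for a, b in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical TF binding sets for three H2H pairs (toy data).
binding = {
    "H2H_1": {"CTCF", "RAD21", "GATA2"},
    "H2H_2": {"CTCF", "RAD21", "POL2"},
    "H2H_3": {"c-MYC", "E2F6"},
}
interacting = [("H2H_1", "H2H_2")]                   # spatially interacting inter-pair
background = [("H2H_1", "H2H_3"), ("H2H_2", "H2H_3")]  # stand-ins for random inter-pairs

print(mean_similarity(interacting, binding))  # 0.5
print(mean_similarity(background, binding))   # 0.0
```

In a real comparison the background pairs would be drawn at random from all H2H pairs, and the two similarity distributions compared with a statistical test rather than by their means alone.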
Construction of co-functional networks by transcription factors
We obtained a binary regulation matrix of transcription factors and H2H genes after completing the above analysis (see Additional file 4). The rows of the matrix represent transcription factors and the columns represent H2H gene pairs. Each element of the matrix indicates whether a transcription factor binds to an H2H gene pair: 1 indicates binding and 0 indicates no binding. We then analyzed the co-occurrence of each transcription factor pair regulating at least 10 gene pairs. A p-value was calculated with a binomial test to evaluate the statistical significance of co-occurrence. The overlapping ratio of a pair of transcription factors was calculated as the number of H2H pairs regulated by both TFs divided by the total number of H2H pairs bound by the TF pair. We defined a transcription factor pair as co-occurring if its p-value was < 0.01 and its overlapping ratio was > 0.6. A co-occurrence network of the transcription factors that regulate H2H genes was visualized using Cytoscape 3.0.2.
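The co-occurrence criteria can be made concrete with a short Python sketch. Several details are assumptions made here because the text does not spell them out: the "total number of H2H pairs bound to a TF pair" is read as the union of the two target sets, the "at least 10 gene pairs" filter is applied to each TF individually, and the binomial null model takes the probability of joint binding to be the product of the two TFs' binding frequencies, with a one-sided upper-tail p-value.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def co_occurrence(row_a, row_b):
    """Overlapping ratio and binomial p-value for two rows of the binary matrix."""
    n = len(row_a)                      # number of H2H gene pairs (columns)
    n_a, n_b = sum(row_a), sum(row_b)   # targets bound by each TF
    shared = sum(1 for a, b in zip(row_a, row_b) if a and b)
    union = n_a + n_b - shared
    ratio = shared / union if union else 0.0
    p_joint = (n_a / n) * (n_b / n)     # assumed null: the two TFs bind independently
    return ratio, binomial_upper_tail(shared, n, p_joint)


def is_co_occurring(row_a, row_b, min_targets=10, p_cut=0.01, ratio_cut=0.6) -> bool:
    """Apply the thresholds from the text: p < 0.01 and overlapping ratio > 0.6."""
    if sum(row_a) < min_targets or sum(row_b) < min_targets:
        return False
    ratio, p_value = co_occurrence(row_a, row_b)
    return p_value < p_cut and ratio > ratio_cut


# Toy rows over 40 hypothetical H2H pairs.
tf_a = [1] * 12 + [0] * 28
tf_b = [1] * 11 + [0] * 29   # shares 11 targets with tf_a
tf_c = [0] * 28 + [1] * 12   # shares no target with tf_a
```

With these toy rows, `is_co_occurring(tf_a, tf_b)` holds while `is_co_occurring(tf_a, tf_c)` does not. The resulting TF pairs could then be written out as an edge list for visualization.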
Human co-expression network
COXPRESdb provides co-expression data for 20280 human genes, including the Pearson’s correlation coefficient (PCC) of gene expression profiles and a relative correlation index, the mutual rank (MR), for each gene. The MR is the geometric mean of the PCC rank from gene A to gene B and that from gene B to gene A, and is considered a standard measure of the biological significance of gene co-expression. Here, we considered gene pairs with MR ≤ 20 as co-expressed pairs. We constructed a global human gene co-expression network based on the co-expressed pairs and the singleton genes (genes without co-expressed partners) from this database.
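The mutual rank and the thresholded network can be sketched as follows. This is a toy illustration under stated assumptions: the nested-dictionary PCC table, the gene names and the function names are hypothetical, ties in PCC are not handled, and a cutoff of 2 replaces the MR ≤ 20 threshold only because the example has four genes.

```python
from math import sqrt


def rank_of(pcc: dict, gene: str, other: str) -> int:
    """Rank of `other` among the partners of `gene` (1 = highest PCC)."""
    ordered = sorted(pcc[gene], key=lambda g: pcc[gene][g], reverse=True)
    return ordered.index(other) + 1


def mutual_rank(pcc: dict, a: str, b: str) -> float:
    """Geometric mean of the rank of b for a and the rank of a for b."""
    return sqrt(rank_of(pcc, a, b) * rank_of(pcc, b, a))


def build_network(pcc: dict, mr_cutoff: float = 20):
    """Edges with MR <= cutoff, plus the degree of every gene (0 = singleton)."""
    genes = sorted(pcc)
    edges = {
        frozenset((a, b))
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
        if mutual_rank(pcc, a, b) <= mr_cutoff
    }
    degree = {g: 0 for g in genes}
    for edge in edges:
        for g in edge:
            degree[g] += 1
    return edges, degree


# Hypothetical symmetric PCC table for four genes.
pcc = {
    "G1": {"G2": 0.9, "G3": 0.2, "G4": 0.1},
    "G2": {"G1": 0.9, "G3": 0.5, "G4": 0.3},
    "G3": {"G1": 0.2, "G2": 0.5, "G4": 0.8},
    "G4": {"G1": 0.1, "G2": 0.3, "G3": 0.8},
}
edges, degree = build_network(pcc, mr_cutoff=2)
```

The per-gene degree is the quantity that would be compared between H2H and non-H2H genes when asking whether H2H genes are more highly connected.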
The authors thank Dr. Hui Yu and Dr. Xiao Dong from SIBS for helpful suggestions. This work was supported by grants from the National Key Basic Research Program (grant numbers: 2014DFB30020, 2013CB910801, 2012CB316501) and the National Natural Science Foundation of China (31171268).
- Trinklein ND, Aldred SF, Hartman SJ, Schroeder DI, Otillar RP, Myers RM: An abundance of bidirectional promoters in the human genome. Genome Res. 2004, 14 (1): 62-66.
- Li YY, Yu H, Guo ZM, Guo TQ, Tu K, Li YX: Systematic analysis of head-to-head gene organization: evolutionary conservation and potential biological relevance. PLoS Comput Biol. 2006, 2 (7): e74.
- Chen YQ, Yu H, Li YX, Li YY: Sorting out inherent features of head-to-head gene pairs by evolutionary conservation. BMC Bioinformatics. 2010, 11 Suppl 11: S16.
- Lin JM, Collins PJ, Trinklein ND, Fu Y, Xi H, Myers RM, Weng Z: Transcription factor binding and modified histones in human bidirectional promoters. Genome Res. 2007, 17 (6): 818-827.
- Remenyi A, Scholer HR, Wilmanns M: Combinatorial control of gene expression. Nat Struct Mol Biol. 2004, 11 (9): 812-815.
- Dean A: In the loop: long range chromatin interactions and gene regulation. Brief Funct Genomics. 2011, 10 (1): 3-10.
- Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009, 326 (5950): 289-293.
- Yang MQ, Koehly LM, Elnitski LL: Comprehensive annotation of bidirectional promoters identifies co-regulation among breast and ovarian cancer genes. PLoS Comput Biol. 2007, 3 (4): e72.
- Yu H, Yu FD, Zhang GQ, Shen X, Chen YQ, Li YY, Li YX: DBH2H: vertebrate head-to-head gene pairs annotated at genomic and post-genomic levels. Database (Oxford). 2009, 2009: bap006.
- da Huang W, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4 (1): 44-57.
- Collins PJ, Kobayashi Y, Nguyen L, Trinklein ND, Myers RM: The ets-related transcription factor GABP directs bidirectional transcription. PLoS Genet. 2007, 3 (11): e208.
- Sims RJ, Mandal SS, Reinberg D: Recent highlights of RNA-polymerase-II-mediated transcription. Curr Opin Cell Biol. 2004, 16 (3): 263-271.
- Dang CV: c-Myc target genes involved in cell growth, apoptosis, and metabolism. Mol Cell Biol. 1999, 19 (1): 1-11.
- Stevens C, La Thangue NB: E2F and cell cycle control: a double-edged sword. Arch Biochem Biophys. 2003, 412 (2): 157-169.
- Schratt G, Weinhold B, Lundberg AS, Schuck S, Berger J, Schwarz H, Weinberg RA, Ruther U, Nordheim A: Serum response factor is required for immediate-early gene activation yet is dispensable for proliferation of embryonic stem cells. Mol Cell Biol. 2001, 21 (8): 2933-2943.
- Ward JH: Hierarchical grouping to optimize an objective function. J Am Stat Assoc. 1963, 58 (301): 236-244.
- Lee BK, Iyer VR: Genome-wide studies of CCCTC-binding factor (CTCF) and cohesin provide insight into chromatin structure and regulation. J Biol Chem. 2012, 287 (37): 30906-30913.
- Lan X, Witt H, Katsumura K, Ye Z, Wang Q, Bresnick EH, Farnham PJ, Jin VX: Integration of Hi-C and ChIP-seq data reveals distinct types of chromatin linkages. Nucleic Acids Res. 2012, 40 (16): 7690-7704.
- Kim SI, Bultman SJ, Kiefer CM, Dean A, Bresnick EH: BRG1 requirement for long-range interaction of a locus control region with a downstream promoter. Proc Natl Acad Sci U S A. 2009, 106 (7): 2259-2264.
- Kim J, Woolridge S, Biffi R, Borghi E, Lassak A, Ferrante P, Amini S, Khalili K, Safak M: Members of the AP-1 family, c-Jun and c-Fos, functionally interact with JC virus early regulatory protein large T antigen. J Virol. 2003, 77 (9): 5241-5252.
- Chavanas S, Adoue V, Mechin MC, Ying S, Dong S, Duplan H, Charveron M, Takahara H, Serre G, Simon M: Long-range enhancer associated with chromatin looping allows AP-1 regulation of the peptidylarginine deiminase 3 gene in differentiated keratinocyte. PLoS One. 2008, 3 (10): e3408.
- Splinter E, Heath H, Kooren J, Palstra RJ, Klous P, Grosveld F, Galjart N, de Laat W: CTCF mediates long-range chromatin looping and local histone modification in the beta-globin locus. Genes Dev. 2006, 20 (17): 2349-2354.
- Frietze S, O’Geen H, Blahnik KR, Jin VX, Farnham PJ: ZNF274 recruits the histone methyltransferase SETDB1 to the 3′ ends of ZNF genes. PLoS One. 2010, 5 (12): e15082.
- Cabart P, Murphy S: BRFU, a TFIIB-like factor, is directly recruited to the TATA-box of polymerase III small nuclear RNA gene promoters through its interaction with TATA-binding protein. J Biol Chem. 2001, 276 (46): 43056-43064.
- Haeusler RA, Engelke DR: Spatial organization of transcription by RNA polymerase III. Nucleic Acids Res. 2006, 34 (17): 4826-4836.
- Kouzarides T: Chromatin modifications and their function. Cell. 2007, 128 (4): 693-705.
- Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K: High-resolution profiling of histone methylations in the human genome. Cell. 2007, 129 (4): 823-837.
- Allocco DJ, Kohane IS, Butte AJ: Quantifying the relationship between co-expression, co-regulation and gene function. BMC Bioinformatics. 2004, 5: 18.
- Obayashi T, Okamura Y, Ito S, Tadaka S, Motoike IN, Kinoshita K: COXPRESdb: a database of comparative gene coexpression networks of eleven species for mammals. Nucleic Acids Res. 2013, 41 (Database issue): D1014-D1020.
- Lan X, Bonneville R, Apostolos J, Wu W, Jin VX: W-ChIPeaks: a comprehensive web application tool for processing ChIP-chip and ChIP-seq data. Bioinformatics. 2011, 27 (3): 428-430.
- Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T: Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011, 27 (3): 431-432.
- Obayashi T, Kinoshita K: Rank of correlation coefficient as a comparable measure for biological significance of gene coexpression. DNA Res. 2009, 16 (5): 249-260.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.