Bromodomain protein 4 discriminates tissue-specific super-enhancers containing disease-specific susceptibility loci in prostate and breast cancer
BMC Genomics volume 18, Article number: 270 (2017)
Epigenetic information can be used to identify clinically relevant genomic variants, such as single nucleotide polymorphisms (SNPs), that are of functional importance in cancer development. Super-enhancers are cell-specific DNA elements that determine tissue or cell identity and drive tumor progression. Although previous approaches have sought to explain the risk associated with SNPs in regulatory DNA elements, epigenetic readers such as bromodomain containing protein 4 (BRD4) and super-enhancers have not so far been used to annotate SNPs. In prostate cancer (PC), androgen receptor (AR) binding sites on chromatin have been used to inform functional annotation of SNPs.
Here we establish criteria for enhancer mapping, applicable to other diseases and traits, that achieve optimal tissue-specific enrichment of PC risk SNPs. We used stratified Q-Q plots and Fisher's test to assess the differential enrichment of SNPs mapping to specific categories of enhancers. We find that BRD4 is the key discriminant of tissue-specific enhancers: it is more powerful than AR binding information at capturing PC-specific risk loci, can be used with similar effect in breast cancer (BC), and can be applied to other diseases such as schizophrenia.
This is the first study to evaluate the enrichment of epigenetic readers in genome-wide association studies for SNPs within enhancers, and it provides a powerful tool for enriching and prioritizing PC and BC genetic risk loci. Our study represents a proof of principle, applicable to other diseases and traits, that can be used to redefine molecular mechanisms of human phenotypic variation.
Genome-wide association studies (GWASs) have linked more than ten thousand single nucleotide polymorphisms (SNPs) to human diseases and traits. A large proportion of the associated variants are located in known tissue-specific enhancers, and a recent study by Tehranchi and colleagues found that these non-coding variants affect transcription factor (TF) binding and gene expression. Although they found that CCCTC-binding factor (CTCF) is likely to play a pioneering role in translating natural genetic variation into chromosomal architecture, we still strive to understand the tumor-specific epigenetic features that make progression toward such disease possible. Previous approaches have explored the association of disease risk with regulatory DNA elements [3–6].
In prostate cancer (PC), the androgen receptor (AR) binds predominantly to gene-distal sites, and multiple groups have used AR binding to functionally annotate genetic risk loci based on overlaps with risk SNPs identified in GWASs, some of which are also predicted to affect AR binding [7, 8].
Epigenetic marks such as acetylation of histone 3 lysine 27 (H3K27ac) have been used to annotate enhancers. Moreover, regions of extended H3K27ac bound by combinations of mediator complex subunit 1 (MED1) and bromodomain containing protein 4 (BRD4) have been defined as super-enhancers, which are important in determining cell identity [10–12]. Studies with the small-molecule inhibitor JQ1 have implicated BRD4 in several diseases. In PC cells, BRD4 was recently shown to bind to the AR and affect its activity, while components of the mediator complex such as MED1 and MED12 were recently implicated in advanced PC [15, 16].
SNPs associated with common diseases have been found to lie within enhancers driving transcriptional output and have been identified using different methods. For PC, the most recent methods include genotyping matched to expression quantitative trait loci analysis and epigenetic marks such as H3K27Ac combined with chromatin accessibility [17, 18], or the further addition of binding information for key TFs such as AR and FOXA1. Here we combined the H3K27ac profile with binding site data for BRD4 and MED12 to improve the functional annotation of PC risk SNPs, based on a previously described enrichment analysis.
We show that this method captures SNPs associated not only with PC but also with Breast Cancer (BC) and Lung Cancer (LC) susceptibility. We find that BRD4 is the key discriminant of tissue-specific super-enhancers and binds disease-specific PC and BC risk SNPs with low p-values. Enrichment of disease-specific risk SNPs is higher when BRD4 binding profile information is combined with other epigenetic marks such as H3K27Ac and MED components than when binding profiles of key TFs implicated in disease development and progression, such as the AR or estrogen receptor (ER), are used. Inhibitors of BRD4 are in clinical trials. However, little is known about the contribution of BRD4 to brain diseases. To evaluate whether similar principles also apply to heritable mental disorders, we extended our framework to epigenetic marks, including BRD4 binding, derived from Schwann cells and applied the enrichment analysis to GWASs of mental disorders from the Psychiatric Genomics Consortium (PGC) [20, 21].
Data source for enhancers’ annotation
AR binding information in both LNCaP and VCaP cells was retrieved from Massie et al. (2011). Raw data were aligned with Novoalign to human genome version hg19, and peaks were called with MACS using default parameters after filtering low quality reads (score below 20). Resulting peaks were then overlapped using Bedtools. MED12 binding information and the H3K27Ac profile in LNCaP cells were retrieved from Wang et al. (2011) and re-analyzed as described above. To define the degree of overlap with super-enhancers, we also downloaded super-enhancer coordinates from the dbSUPER database. BRD4 binding information and the H3K27Ac profile in VCaP cells were retrieved from Asangani et al. (2014). ER and BRD4 binding information was retrieved from Nagarajan et al. (2014). The H3K27Ac profile in MCF7 cells was retrieved from Theodorou et al. (2013). BRD4 and MED1 binding information and H3K27Ac profiles for the small cell lung cancer (SCLC) cell line H2171 and for Schwann cells were retrieved from Cistrome. All cell-specific datasets were analyzed identically to ensure comparability within a tissue type.
Enhancers were defined in LNCaP cells based on (1) extended H3K27Ac-marked regions ranging from 3000 bp to 200 kb (Additional file 1) or (2) the intersection of these H3K27Ac-marked regions with MED12 binding sites (Additional file 2). In VCaP cells, enhancers were defined as (3) extended H3K27Ac-marked regions ranging from 3000 bp to 200 kb (Additional file 3), (4) the intersection of H3K27Ac stretches longer than 2000 bp with BRD4 binding sites (in VCaP cells cultured in the presence of androgens) (Additional file 4), or (5) BRD4 sites alone (Additional file 5). (6) To obtain a consensus map of super-enhancers in PC (Additional file 6), we selected super-enhancers found in LNCaP cells that also had H3K27Ac and BRD4 binding in VCaP cells.
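The consensus selection in step (6) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the coordinates are invented, the function names are hypothetical, and the text does not state a minimum overlap, so an overlap of at least one base pair is assumed here.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates
Interval = Tuple[str, int, int]


def overlaps_any(region: Interval, others: List[Interval]) -> bool:
    """Return True if region shares at least one base with any interval in others."""
    chrom, start, end = region
    return any(c == chrom and s < end and start < e for c, s, e in others)


def consensus_regions(
    lncap_se: List[Interval],
    vcap_h3k27ac: List[Interval],
    vcap_brd4: List[Interval],
) -> List[Interval]:
    """Keep LNCaP super-enhancers supported by both VCaP H3K27Ac and VCaP BRD4."""
    return [
        region
        for region in lncap_se
        if overlaps_any(region, vcap_h3k27ac) and overlaps_any(region, vcap_brd4)
    ]


# Toy coordinates, invented for illustration only.
lncap_se = [("chr1", 1000, 9000), ("chr1", 50000, 60000), ("chr2", 100, 5000)]
vcap_h3k27ac = [("chr1", 2000, 7000), ("chr2", 200, 4000)]
vcap_brd4 = [("chr1", 3000, 3500), ("chr1", 55000, 55500)]

consensus = consensus_regions(lncap_se, vcap_h3k27ac, vcap_brd4)
print(consensus)
```

Only the first toy region is retained, because it is the only one supported by both VCaP marks. For genome-scale data an interval tool such as Bedtools, which the study used for peak overlaps, would be the practical choice.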
Enhancers in MCF7 cells were identified following the criteria described in Hnisz et al. (2013). First, H3K27Ac peaks closer than 100 bp were merged, then only stretches longer than 2000 bp were selected (Additional file 7). Different compendia of enhancers were then created based on the presence of BRD4 binding (Additional file 8), ER binding (Additional file 9), or the combination of these features (Additional files 10 and 11). The same procedure was followed to identify enhancers in H2171 cells and in Schwann cells 90-8TL (Additional files 12, 13, 14, 15, 16, 17 and 18). DNase I hypersensitive site (DHS) profiles for LNCaP cells were retrieved from He et al. (2012) and from ENCODE (Additional files 19 and 20). A more stringent profile, based on the overlap of these two, was also included (Additional file 21).
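The merge-and-filter step described for the MCF7 enhancers can be sketched as follows. This is an illustrative Python example on invented peak coordinates for a single chromosome, with hypothetical function names. "Closer than 100 bp" is read as a gap of fewer than 100 bp between consecutive peaks, which is an assumption about the exact boundary.

```python
from typing import List, Tuple

# (start, end) on a single chromosome
Peak = Tuple[int, int]


def merge_close_peaks(peaks: List[Peak], max_gap: int = 100) -> List[Peak]:
    """Merge peaks separated from the previous merged peak by fewer than max_gap bp."""
    merged: List[Peak] = []
    for start, end in sorted(peaks):
        if merged and start - merged[-1][1] < max_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def long_stretches(peaks: List[Peak], min_length: int = 2000) -> List[Peak]:
    """Keep only stretches longer than min_length bp."""
    return [(start, end) for start, end in peaks if end - start > min_length]


# Toy H3K27Ac peaks, invented for illustration only.
peaks = [(100, 900), (950, 1800), (1850, 2600), (5000, 5400), (5600, 6000)]

merged = merge_close_peaks(peaks)
enhancers = long_stretches(merged)
print(merged)
print(enhancers)
```

The first three toy peaks fuse into one 2500 bp stretch that passes the length filter, while the two short downstream peaks stay separate (their gap is 200 bp) and are discarded.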
Data source for summary statistics of genome-wide association studies
We obtained summary statistics from large meta-analyses of the traits of interest. In particular, the summary statistics for association with PC risk were obtained from the Illumina array Collaborative Oncological Gene-environment Study (iCOGS) consortium and comprised information on 25,074 cases and 24,272 controls genotyped on a customized array including 211,155 SNPs. Additionally, we used summary statistics on 525,821 SNPs for association with PC risk derived from a smaller UK-based cohort including 1854 cases and 1854 controls, in collaboration with the PRACTICAL consortium. Genetic association with BC risk was obtained in collaboration with the BCAC consortium and was derived from a meta-analysis including 15,863 cases and 40,022 controls on ~2.5 million SNPs. We also collected summary statistics for 14,900 cases of lung cancer (LC) and 29,485 controls, including 2,433,836 SNPs, from the TRICL consortium. From the IGAP consortium we obtained summary data from 17,008 Alzheimer's disease cases and 37,154 controls genotyped on 518,871 SNPs. Finally, from the PGC consortium we used summary statistics on association with schizophrenia in 36,989 cases and 113,075 controls, including 2,540,803 SNPs, and summary statistics on association with bipolar disorder in 11,974 cases and 51,792 controls, on a total of 2,382,073 SNPs.
SNP enrichment method
Enrichment is defined by the presence of lower p-values than expected by chance. Quantile-quantile (Q-Q) plots are commonly used in genetics to visualize enrichment. Typically, the observed p-value quantiles on the y-axis are plotted against the theoretical p-value quantiles under the assumption of no association (i.e. the quantiles of the uniform distribution) on the x-axis. In the case of no association, a Q-Q plot follows a straight line from 0 to 1 starting at the origin. In the presence of association, the enrichment (of low p-values) is described by the deflection of the Q-Q plot from this theoretical line of no association. We used stratified Q-Q plots to assess differential enrichment of SNPs mapping to specific categories of enhancers. Stratified Q-Q plots have been used previously to demonstrate enrichment of general location annotation categories such as 5’UTR SNPs.
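As an illustration of how the coordinates of a stratified Q-Q plot can be computed, the Python sketch below pairs observed and expected quantiles for each SNP stratum. Two choices are assumptions rather than details taken from the text: quantiles are shown on the -log10 scale, a common convention for GWAS Q-Q plots, and the expected quantile of the SNP with a given rank is taken as (rank - 0.5)/n. The p-values and function names are invented, and p-values are assumed to be strictly positive.

```python
import math
from typing import Dict, List, Tuple


def qq_points(p_values: List[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10(p) pairs for one SNP stratum."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = (rank - 0.5) / n  # quantile of the uniform null distribution
        points.append((-math.log10(expected), -math.log10(p)))
    return points


def stratified_qq(strata: Dict[str, List[float]]) -> Dict[str, List[Tuple[float, float]]]:
    """Compute one Q-Q curve per annotation category."""
    return {name: qq_points(p_values) for name, p_values in strata.items()}


# Toy p-values, invented for illustration only.
strata = {
    "all SNPs": [0.9, 0.1, 0.5, 0.3, 0.7],
    "BRD4 sites": [1e-6, 1e-3, 0.3],
}

curves = stratified_qq(strata)
for name, points in curves.items():
    print(name, [(round(x, 2), round(y, 2)) for x, y in points])
```

In this toy example the "all SNPs" stratum lies on the line of no association, whereas the "BRD4 sites" stratum departs from it at the low p-value end, which is the pattern interpreted as enrichment.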
Quantifying SNP enrichment within sets of enhancers
To assess the significance of the association enrichment among the sets of SNPs within enhancers, we used Fisher’s hypergeometric test. More specifically, we tested for over-representation of genome-wide significant SNPs (i.e. SNPs with an association –log10 p-value > 7.3) within specific enhancers. We adjusted for multiple testing using a Bonferroni correction accounting for the number of annotations tested.
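A minimal sketch of such an over-representation test is given below, using only the Python standard library. It is an illustration under stated assumptions, not the code used in the study: the test is taken to be one-sided (upper tail of the hypergeometric distribution), the counts are invented, and the function names are hypothetical. The threshold of 7.3 on the -log10 scale corresponds to a p-value of roughly 5 × 10⁻⁸.

```python
import math
from math import comb
from typing import List


def is_genome_wide(p: float, threshold: float = 7.3) -> bool:
    """Apply the genome-wide significance cut-off, -log10(p) > 7.3."""
    return -math.log10(p) > threshold


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when n SNPs are drawn from N SNPs, K of which are significant."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each p-value by the number of annotations tested, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy counts, invented for illustration only: 20 SNPs in total, 5 of them
# genome-wide significant, 4 SNPs inside an enhancer set, 3 of those significant.
p_enrich = hypergeom_upper_tail(k=3, K=5, n=4, N=20)
adjusted = bonferroni([p_enrich, 0.5])
print(round(p_enrich, 4), [round(p, 4) for p in adjusted])
```

Exact binomial coefficients keep the sketch dependency-free, but they become slow for genome-scale counts, where a statistics library would be preferable.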
The statistical models underlying the SNP enrichment analyses carried out here generally assume independence of the data. Far from resembling independent samples, SNPs are linked by complex correlation patterns reflected in their linkage disequilibrium (LD) structure. To adhere more closely to the independence assumption, and to assess whether the capacity of functional annotations to enrich specific SNP sets was due to confounding factors such as LD, the SNPs were randomly pruned prior to the analyses by randomly selecting representatives from all 1 Mb LD blocks of SNPs with pairwise r² ≥ 0.2. Iterating the random pruning procedure 100 times and subsequently averaging the corresponding test statistics compensated for the arbitrariness in the choice of representative SNPs. The results of these analyses are shown in Additional file 22: Figures S1, S3, and S6.
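The random pruning and averaging procedure can be sketched as follows. The example assumes that LD blocks have already been computed and that each SNP carries a block label; how blocks are derived from pairwise r² is outside the sketch. The SNP identifiers, p-values, test statistic and function names are all hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List


def prune_once(snp_blocks: Dict[str, str], rng: random.Random) -> List[str]:
    """Pick one random representative SNP from every LD block."""
    by_block = defaultdict(list)
    for snp, block in snp_blocks.items():
        by_block[block].append(snp)
    return [rng.choice(sorted(members)) for _, members in sorted(by_block.items())]


def averaged_statistic(
    snp_blocks: Dict[str, str],
    statistic: Callable[[List[str]], float],
    n_iter: int = 100,
    seed: int = 0,
) -> float:
    """Average a test statistic over n_iter independent random prunings."""
    rng = random.Random(seed)
    return mean(statistic(prune_once(snp_blocks, rng)) for _ in range(n_iter))


# Toy data, invented for illustration only: SNP -> precomputed LD block label.
snp_blocks = {"rs1": "A", "rs2": "A", "rs3": "B", "rs4": "C", "rs5": "C"}
p_values = {"rs1": 0.01, "rs2": 0.5, "rs3": 0.001, "rs4": 0.6, "rs5": 0.7}


def fraction_nominal(snps: List[str]) -> float:
    """Toy statistic: fraction of pruned SNPs with p < 0.05."""
    return sum(p_values[s] < 0.05 for s in snps) / len(snps)


avg = averaged_statistic(snp_blocks, fraction_nominal)
print(round(avg, 3))
```

Each iteration keeps exactly one SNP per block, so the averaged statistic no longer depends on which representative happened to be drawn in a single pruning.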
To assess whether tissue or cell-specific enhancers could mark tissue-specific risk SNPs associated with development of PC, we analyzed datasets from two studies that profiled MED12 binding and the H3K27Ac map in LNCaP cells, and BRD4 binding and H3K27Ac in VCaP cells. MED12 is a subunit of the same chromatin-looping mediator complex as MED1; we therefore used it for our PC study, assuming that these two subunits would have similar binding profiles in the same cells.
Enhancers were defined in LNCaP cells based on (1) extended H3K27Ac-marked regions (Additional file 1) or (2) the intersection of these H3K27Ac-marked regions with MED12 binding sites (Additional file 2). In VCaP cells, enhancers were defined as (3) extended H3K27Ac-marked regions (Additional file 3), (4) the intersection of H3K27Ac stretches with BRD4 binding sites (Additional file 4), or (5) BRD4 sites alone (Additional file 5). (6) To obtain a consensus map of enhancers in PC (Additional file 6), we intersected the enhancers found in both LNCaP and VCaP cells that were characterized by all three epigenetic features and met the definition of super-enhancers (Table 1 and Fig. 1).
Enrichment of SNPs associated with prostate cancer in regions bound by MED and BRD4, marked by H3K27Ac in prostate cancer cells
First, we overlaid the genomic coordinates of enhancers in PC cell lines, as defined previously, with the genomic coordinates of all SNPs in the PC iCOGS dataset. To visualize differential enrichment patterns of specific epigenetic markers with respect to their genetic association with PC risk, we generated stratified Q-Q plots, a method for visualizing the enrichment of statistical association relative to that expected under the global null hypothesis. Q-Q plots show that SNPs within regions with different genomic features (H3K27ac, BRD4, and MED12, or a combination of these) had different enrichment patterns compared to all SNPs (Fig. 2a). The SNPs contained in common PC enhancers, which are characterized by BRD4 and MED12 binding and a long stretch of H3K27Ac, had lower p-values than SNPs contained in enhancers identified in VCaP cells by mapping long stretches of H3K27Ac and BRD4 binding. SNPs associated with PC risk were more enriched within BRD4 binding sites alone than within H3K27Ac sites or H3K27Ac/MED12 overlapping sites in LNCaP cells. In addition, we focused on SNPs achieving genome-wide significance and compared the over-representation of these SNPs mapping to the above-described enhancers (Additional file 22: Table S1). 12% and 3% of the SNPs contained in PC enhancers achieved genome-wide significance in the iCOGS and in the PRACTICAL GWAS, respectively. SNPs that achieved significance in iCOGS are listed in Additional file 22: Table S2. These results highlight that combining generic epigenetic marks such as H3K27Ac with generic epigenetic readers such as BRD4 and with MED binding increases the capacity to capture SNPs associated with PC.
Importantly, to rule out possible confounding factors, we first randomly pruned the SNPs, selecting one representative SNP per LD block. The random pruning did not change the enrichment patterns caused by the functional annotations (Additional file 22: Figure S1). Secondly, to rule out that the enrichment merely results from the non-independence of the SNPs in the enhancer regions or from other confounding features of these regions, we compared the observed enrichment to that attained on a set of SNPs numerically matching those in the enhancer regions on minor allele frequency and mutual LD r² (Additional file 22: Figure S2). The numerically matched SNP set was also used as a control set to assess the enrichment significance (Additional file 22: Tables S1, S3 and S4) by means of Fisher’s hypergeometric test (see Methods).
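The first step of this analysis, overlaying enhancer coordinates with SNP positions, can be sketched with a sorted-interval lookup. The Python example below is illustrative only: it assumes that the enhancer intervals on each chromosome do not overlap one another (for example after merging), uses half-open coordinates, and relies on invented positions, SNP identifiers and function names.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def build_index(intervals: List[Tuple[str, int, int]]) -> Dict[str, List[Tuple[int, int]]]:
    """Group (chrom, start, end) intervals by chromosome, sorted by start."""
    index: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in intervals:
        index.setdefault(chrom, []).append((start, end))
    for regions in index.values():
        regions.sort()
    return index


def snp_in_enhancer(index: Dict[str, List[Tuple[int, int]]], chrom: str, pos: int) -> bool:
    """Return True if the SNP position falls inside an enhancer interval."""
    regions = index.get(chrom, [])
    # Locate the last interval whose start is <= pos, then check its end.
    i = bisect_right(regions, (pos, float("inf"))) - 1
    return i >= 0 and regions[i][0] <= pos < regions[i][1]


# Toy coordinates, invented for illustration only.
enhancer_list = [("chr1", 1000, 5000), ("chr1", 8000, 9000), ("chr2", 100, 300)]
snps = {
    "rs_a": ("chr1", 1500),
    "rs_b": ("chr1", 6000),
    "rs_c": ("chr2", 299),
    "rs_d": ("chr3", 10),
}

index = build_index(enhancer_list)
inside = sorted(name for name, (chrom, pos) in snps.items() if snp_in_enhancer(index, chrom, pos))
print(inside)
```

The SNPs flagged in this way form the stratum whose p-values are then compared with those of all SNPs.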
Enrichment of prostate cancer associated SNPs within androgen receptor binding sites
We also compared the genomic coordinates of the SNPs to the coordinates of AR binding sites (ARBSs). Despite the use in the literature of ARBSs for functional annotation of GWAS-significant PC SNPs, intersecting enhancer information with AR binding data did not lead to any further enrichment of SNPs associated with PC compared with enhancer information alone (Fig. 2b and Additional file 22: Table S3). In particular, although intersecting AR binding information induced a slight left-shift of the Q-Q plot for enhancers marked by H3K27Ac, MED12, and BRD4 binding, and for enhancers marked by H3K27Ac and BRD4, the enrichment was caused by the same SNPs responsible for the enrichment without AR binding information (see Additional file 22: Tables S1 and S3). Furthermore, enhancer information outperformed the ARBS profile alone, or in combination with the H3K27Ac profile, in enriching for genome-wide significant SNPs in PC (Additional file 22: Table S3), and overlapping AR with BRD4 binding sites did not alter the superior capability of BRD4 (as in Fig. 2a) to enrich for disease-associated SNPs. Interestingly, although DHSs have been used to predict locations of common disease-associated variation, DHS profiles enriched less than ARBSs alone (Additional file 22: Figure S3).
Validation of the enrichment method on an independent GWAS for prostate cancer
Finally, we validated our results on the independent PC GWAS obtained from the PRACTICAL consortium, measured on a smaller UK-based cohort (Fig. 2c). Again, we observed the strongest SNP enrichment in PC super-enhancers marked by H3K27Ac, MED12, and BRD4 binding.
BRD4 binding sites derived from prostate cancer cells do not enrich for SNPs associated with breast cancer
To test the specificity of BRD4, MED12 and H3K27Ac profiles in PC cells in identifying tissue-specific SNPs, we performed a similar enrichment analysis for genetic association with BC risk measured on the genotype array content from the BCAC (Fig. 2d). Enhancers defined on the basis of the BRD4 binding profile in PC cells failed to enrich specifically for BC-associated SNPs. Whilst H3K27ac and MED12 together achieved some enrichment of BC SNPs, the addition of BRD4 depleted this enrichment entirely. Importantly, once again, randomly pruning the SNPs did not alter the results of the enrichment analysis (Additional file 22: Figure S4). These results are in stark contrast to the analysis on PC datasets, in which inclusion of BRD4 enhanced enrichment of low p-valued SNPs associated with PC, and suggest a hierarchical determination of tissue-specificity based on the sequential deposition of these epigenetic marks. Taken together, this indicates that BRD4 substantially contributes to prostate-specific SNP enrichment within super-enhancers.
Of note, the genomic distribution of the BCAC SNP array mirrored that of the SNP array used for iCOGS, with the majority of SNPs located within intronic (48% and 57%, respectively) and intergenic (48% and 34%, respectively) regions of the genome (Additional file 22: Figure S5). Thus, whilst the number of SNPs differed between the PC and BC studies, there was no genomic distribution bias for imputed SNPs. The SNPs included within the enhancers defined in this study reflected similar distributions, with the only exception of SNP lists derived from LNCaP cells, which were slightly biased toward intergenic regions: around 69% to 77% of these SNPs were located within intergenic regions (data not shown).
Enrichment of SNPs associated with breast cancer in regions bound by BRD4, marked by H3K27Ac in breast cancer cells
Next, we sought to determine whether BC-specific epigenetic profiles for the same markers, derived from the BC cell line MCF7, would reproduce the performance seen in the PC enrichment analysis. We therefore retrieved genome-wide profiles of H3K27Ac, ER, and BRD4 binding in MCF7 cells, compiled a similar list of enhancers (Table 1 and Additional file 22: Figure S6), and performed an enrichment analysis of association with BC risk on the BCAC GWAS (Fig. 2e and Additional file 22: Table S4). Information on MED binding is not available for BC cell lines. However, BRD4 binding information alone caused the strongest enrichment of SNPs associated with BC (Additional file 22: Table S5). These data confirm that BRD4 alone is an important enhancer and super-enhancer discriminant, which binds disease-specific susceptibility loci in a tissue-specific fashion. Randomly pruning the SNPs involved did not alter the capacity of BRD4 to capture disease-specific associated SNPs (Additional file 22: Figure S7). Interestingly, pruning the SNPs revealed that the capability of ER to capture disease-associated SNPs in combination with other epigenetic features was enhanced, possibly suggesting a different contribution of ER and AR to BC and PC pathogenesis, respectively.
As a counterproof, we tested whether BC epigenetic profiles caused any enrichment in iCOGS PC associations, but no such enrichment was detected (Additional file 22: Figure S8). These results are consistent with BRD4 binding being cell and tissue-specific. Moreover, these results pinpoint the tissue-specificity of risk loci and hint that BRD4 activity may be influenced by genetic variation, as is the case for TFs.
Enrichment of risk SNPs associated with lung cancer and psychiatric traits using H3K27Ac profiles, BRD4, and MED binding sites derived from relevant cell lines
To understand whether the binding of BRD4 to clinically relevant genetic risk loci is confined to PC and BC, or whether such selectivity can also be observed in other diseases and traits, we retrieved binding information for BRD4 and MED1 and H3K27Ac profiles available for the lymphoblastoid cell line H2171, derived from a metastatic site in an LC patient, and for the malignant peripheral nerve sheath tumor Schwann cells 90-8TL (Additional files 12, 13, 14, 15, 16, 17 and 18). To test associations of these epigenetic features with other phenotypes, we collected summary statistics for LC, Alzheimer’s disease, schizophrenia, and bipolar disorder.
BRD4 binding information alone caused the strongest enrichment of associations with LC, whereas combining BRD4 and MED1 binding information, even when further combined with the H3K27Ac profile, failed to improve the enrichment of low p-value SNPs (Additional file 22: Figure S9a). We speculate that the LC cell line H2171 might not reflect the characteristics of its tissue of origin as well as the PC and BC cell lines do. However, when we assessed the enrichment for the same LC GWAS using epigenetic features related to PC cells (Additional file 22: Figure S9b), we detected none, as expected, confirming that BRD4 binding information in H2171 retains some tissue-specificity and capacity to enrich for LC tissue-specific risk SNPs.
Next, we applied our enrichment method to perform an inverse analysis in which we sought to understand whether any association could be found between epigenetic features related to Schwann cells (the only brain cells for which H3K27Ac profile and BRD4 binding information were publicly available) and three diseases affecting the brain. No enrichment for SNPs associated with Alzheimer's disease or bipolar disorder was detected (Additional file 22: Figure S10a and b). However, low p-valued SNPs associated with schizophrenia were highly enriched within BRD4 binding sites (Fig. 2f). Interestingly, H3K27Ac profiles also enriched substantially for clinically relevant SNPs associated with schizophrenia. These data suggest that BRD4 activity in Schwann cells could potentially be involved in the etiology of schizophrenia, and warrant further investigation of the molecular mechanism underlying these findings.
With the discovery of significant numbers of cancer genetic risk loci through GWAS, there is now a major focus on the functional annotation of these loci to prioritize them for further biological study. So far this annotation has been undertaken post-GWAS and has often employed classifiers of open chromatin, for example DHSs, as a primary annotation, followed by genome-wide binding maps for tissue-specific transcription factors such as the AR for PC or the ER for BC, while combining this information with H3K27Ac and open chromatin in a tissue-specific manner. In this study we ask whether it is possible to use binding site data and chromatin marks upfront to enrich for genetic risk factors in a cancer type-specific manner. We show that an enhancer signature comprising a number of factors, but dominated by BRD4, allows for the enrichment of PC-specific and BC-specific genetic risk loci (Fig. 3a and b). Interestingly, these chromatin features have been previously reported to be characteristic of super-enhancer-like profiles [10–12, 35]. We found a strong degree of tissue-specificity: when profiles are derived from cell lines associated with specific cancer types, such as cancers of the breast and prostate, they become far more effective at enriching for cancer type-specific risk loci than other widely used cancer type-specific annotations such as AR or ER binding or DHSs alone. We also applied this enrichment strategy to infer that BRD4 binding information may in future allow for the upfront nomination of genomic regions for high-coverage sequencing in risk studies for schizophrenia (Fig. 3c). Functional determination of the impact of risk SNPs has been the priority of several consortia aiming to uncover the epigenetic effects mediated by clinically relevant risk variants located in non-exonic regions.
Our study implies a conserved and important relationship between enhancers and cancer-associated risk loci, which is also being pinpointed by recent work linking the effect of genetic variation to TF binding. Our approach is the first to imply an effect of such genetic variation on the activity of generic epigenetic readers. This is also the first time that such epigenetic readers have been evaluated as enrichment factors for SNPs without prior filtering based on published p-values for risk association.
We highlight the possibility that SNPs lying within super-enhancers marked by BRD4 are more likely to be associated with an increased susceptibility to BC, PC, and schizophrenia. The expression of the genes regulated by enhancers identified in these diseases could be altered by the presence of specific SNPs lying therein (Additional file 22: Figure S11). This is a concept that has recently been postulated for cancer mutations occurring in a chromatin-specific context.
In conclusion, we have discovered that BRD4-bound super-enhancers provide a powerful tool for enriching and prioritizing PC and BC genetic risk loci (Fig. 4), and have shown that key TFs such as AR or ER, despite being pivotal tissue-specific TFs, do not contribute to tissue-specific genetic risk enrichment more than epigenetic factors. We propose to refine disease-specific risk loci enrichment by identifying potential binding of BRD4 combined with key MED components and acetylation profiles. Our study will promote the use of BRD4 for SNP annotation as the genetic landscape for different diseases continues to expand.
ARBSs: AR binding sites
BCAC: Breast Cancer Association Consortium
BRD4: Bromodomain containing protein 4
ENCODE: Encyclopedia of DNA Elements
GWASs: Genome-wide association studies
H3K27ac: Acetylation on Histone 3 lysine 27
iCOGS: Illumina array Collaborative Oncological Gene-environment Study
MED1/12: Mediator complex subunit 1/12
PRACTICAL: Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome
SNPs: Single nucleotide polymorphisms
Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic acids research. 2014;42(Database issue):D1001–1006.
Tehranchi AK, Myrthil M, Martin T, Hie BL, Golan D, Fraser HB. Pooled ChIP-Seq Links Variation in Transcription Factor Binding to Complex Disease Risk. Cell. 2016;165(3):730–41.
Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science (New York, NY). 2012;337(6099):1190–5.
Coetzee SG, Shen HC, Hazelett DJ, Lawrenson K, Kuchenbaecker K, Tyrer J, Rhie SK, Levanon K, Karst A, Drapkin R, et al. Cell Type Specific Enrichment Of Risk Associated Regulatory Elements At Ovarian Cancer Susceptibility Loci. Human molecular genetics. 2015;24(13):3595–607.
Paul DS, Soranzo N, Beck S. Functional interpretation of non-coding sequence variation: concepts and challenges. Bioessays. 2014;36(2):191–9.
Ritchie GR, Dunham I, Zeggini E, Flicek P. Functional annotation of noncoding sequence variants. Nat Methods. 2014;11(3):294–6.
Huang CN, Huang SP, Pao JB, Chang TY, Lan YH, Lu TL, Lee HZ, Juang SH, Wu PP, Pu YS, et al. Genetic polymorphisms in androgen receptor-binding sites predict survival in prostate cancer patients receiving androgen-deprivation therapy. Ann Oncol. 2012;23(3):707–13.
Hazelett DJ, Rhie SK, Gaddis M, Yan C, Lakeland DL, Coetzee SG, Henderson BE, Noushmehr H, Cozen W, Kote-Jarai Z, et al. Comprehensive functional annotation of 77 prostate cancer risk loci. PLoS genetics. 2014;10(1):e1004102.
Corradin O, Scacheri PC. Enhancer variants: evaluating functions in common disease. Genome Med. 2014;6(10):85.
Loven J, Hoke HA, Lin CY, Lau A, Orlando DA, Vakoc CR, Bradner JE, Lee TI, Young RA. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153(2):320–34.
Whyte WA, Orlando DA, Hnisz D, Abraham BJ, Lin CY, Kagey MH, Rahl PB, Lee TI, Young RA. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153(2):307–19.
Hnisz D, Abraham BJ, Lee TI, Lau A, Saint-Andre V, Sigova AA, Hoke HA, Young RA. Super-enhancers in the control of cell identity and disease. Cell. 2013;155(4):934–47.
Arshad Z, Smith J, Roberts M, Lee WH, Davies B, Bure K, Hollander GA, Dopson S, Bountra C, Brindley D. Open Access Could Transform Drug Discovery: A Case Study of JQ1. Expert opinion on drug discovery. 2016;11(3):321–32.
Asangani IA, Dommeti VL, Wang X, Malik R, Cieslik M, Yang R, Escara-Wilke J, Wilder-Romans K, Dhanireddy S, Engelke C, et al. Therapeutic targeting of BET bromodomain proteins in castration-resistant prostate cancer. Nature. 2014;510(7504):278–82.
Shaikhibrahim Z, Offermann A, Braun M, Menon R, Syring I, Nowak M, Halbach R, Vogel W, Ruiz C, Zellweger T, et al. MED12 overexpression is a frequent event in castration-resistant prostate cancer. Endocr Relat Cancer. 2014;21(4):663–75.
Liu G, Sprenger C, Wu PJ, Sun S, Uo T, Haugk K, Epilepsia KS, Plymate S. MED1 mediates androgen receptor splice variant induced gene expression in the absence of ligand. Oncotarget. 2015;6(1):288–304.
Andreassen OA, Zuber V, Thompson WK, Schork AJ, Bettella F, Djurovic S, Desikan RS, Mills IG, Dale AM. Shared common variants in prostate cancer and blood lipids. International journal of epidemiology. 2014;43(4):1205–14.
Schork AJ, Thompson WK, Pham P, Torkamani A, Roddey JC, Sullivan PF, Kelsoe JR, O'Donovan MC, Furberg H, Schork NJ, et al. All SNPs Are Not Created Equal: Genome-Wide Association Studies Reveal a Consistent Pattern of Enrichment among Functionally Annotated SNPs. PLoS genetics. 2013;9(4):e1003449.
Whitington T, Gao P, Song W, Ross-Adams H, Lamb AD, Yang Y, Svezia I, Klevebring D, Mills IG, Karlsson R, et al. Gene regulatory mechanisms underpinning prostate cancer susceptibility. Nature genetics. 2016;48(4):387–97.
Psychiatric GWAS Consortium Bipolar Disorder Working Group. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nature genetics. 2011;43(10):977–83.
Hoefer J, Kern J, Ofer P, Eder IE, Schäfer G, Dietrich D, Kristiansen G, Geley S, Rainer J, Gunsilius E, et al. SOCS2 correlates with malignancy and exerts growth-promoting effects in prostate cancer. Endocr Relat Cancer. 2014;21(2):175–87.
Massie CE, Lynch A, Ramos-Montoya A, Boren J, Stark R, Fazli L, Warren A, Scott H, Madhu B, Sharma N, et al. The androgen receptor fuels prostate cancer by regulating central metabolism and biosynthesis. The EMBO journal. 2011;30(13):2719–33.
Wang D, Garcia-Bassets I, Benner C, Li W, Su X, Zhou Y, Qiu J, Liu W, Kaikkonen MU, Ohgi KA, et al. Reprogramming transcription by distinct classes of enhancers functionally defined by eRNA. Nature. 2011;474(7351):390–4.
Khan A, Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic acids research. 2016;44(D1):D164–71.
Nagarajan S, Hossan T, Alawi M, Najafova Z, Indenbirken D, Bedi U, Taipaleenmaki H, Ben-Batalla I, Scheller M, Loges S, et al. Bromodomain protein BRD4 is required for estrogen receptor-dependent enhancer activation and gene transcription. Cell Rep. 2014;8(2):460–9.
Theodorou V, Stark R, Menon S, Carroll JS. GATA3 acts upstream of FOXA1 in mediating ESR1 binding by shaping enhancer accessibility. Genome research. 2013;23(1):12–22.
Liu T, Ortiz JA, Taing L, Meyer CA, Lee B, Zhang Y, Shin H, Wong SS, Ma J, Lei Y, et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome biology. 2011;12(8):R83.
He HH, Meyer CA, Chen MW, Jordan VC, Brown M, Liu XS. Differential DNase I hypersensitivity reveals factor-dependent chromatin dynamics. Genome research. 2012;22(6):1015–25.
Eeles RA, Olama AA, Benlloch S, Saunders EJ, Leongamornlert DA, Tymrakiewicz M, Ghoussaini M, Luccarini C, Dennis J, Jugurnauth-Little S, et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nature genetics. 2013;45(4):385–91. 391e381-382.
Eeles RA, Kote-Jarai Z, Giles GG, Olama AA, Guy M, Jugurnauth SK, Mulholland S, Leongamornlert DA, Edwards SM, Morrison J, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nature genetics. 2008;40(3):316–21.
Michailidou K, Hall P, Gonzalez-Neira A, Ghoussaini M, Dennis J, Milne RL, Schmidt MK, Chang-Claude J, Bojesen SE, Bolla MK, et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nature genetics. 2013;45(4):353–61. 361e351-352.
Timofeeva MN, Hung RJ, Rafnar T, Christiani DC, Field JK, Bickeboller H, Risch A, McKay JD, Wang Y, Dai J, et al. Influence of common genetic variation on lung cancer risk: meta-analysis of 14 900 cases and 29 485 controls. Human molecular genetics. 2012;21(22):4980–95.
Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, DeStafano AL, Bis JC, Beecham GW, Grenier-Boley B, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nature genetics. 2013;45(12):1452–8.
Taatjes DJ. The human Mediator complex: a versatile, genome-wide regulator of transcription. Trends Biochem Sci. 2010;35(6):315–22.
Heinz S, Romanoski CE, Benner C, Glass CK. The selection and function of cell type-specific enhancers. Nature reviews Molecular cell biology. 2015;16(3):144–54.
De Raedt T, Beert E, Pasmant E, Luscan A, Brems H, Ortonne N, Helin K, Hornick JL, Mautner V, Kehrer-Sawatzki H, et al. PRC2 loss amplifies Ras-driven transcription and confers sensitivity to BRD4-based therapies. Nature. 2014;514(7521):247–51.
Weickert CS, Weickert TW. What's Hot in Schizophrenia Research? The Psychiatric clinics of North America. 2016;39(2):343–51.
Gusev A, Shi H, Kichaev G, Pomerantz M, Li F, Long HW, Ingles SA, Kittles RA, Strom SS, Rybicki BA, et al. Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation. Nature communications. 2016;7:10979.
Freedman ML, Monteiro AN, Gayther SA, Coetzee GA, Risch A, Plass C, Casey G, De Biasi M, Carlson C, Duggan D, et al. Principles for the post-GWAS functional characterization of cancer risk loci. Nature genetics. 2011;43(6):513–8.
Polak P, Karlic R, Koren A, Thurman R, Sandstrom R, Lawrence MS, Reynolds A, Rynes E, Vlahovicek K, Stamatoyannopoulos JA, et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature. 2015;518(7539):360–4.
We thank the COGS, PRACTICAL, TRICL and BCAC consortia for access to GWAS summary statistics data. Further details are provided below.
The PRACTICAL Consortium (http://practical.ccge.medschl.cam.ac.uk/):
Rosalind Eeles1,2, Doug Easton3, Zsofia Kote-Jarai1, Ali Amin Al Olama3, Sara Benlloch3, Kenneth Muir4, Graham G. Giles5,6, Fredrik Wiklund7, Henrik Gronberg7, Christopher A. Haiman8, Johanna Schleutker9,10, Maren Weischer11, Ruth C. Travis12, David Neal13, Paul Pharoah14, Kay-Tee Khaw15, Janet L. Stanford16,17, William J. Blot18, Stephen Thibodeau19, Christiane Maier20,21, Adam S. Kibel22,23, Cezary Cybulski24, Lisa Cannon-Albright25, Hermann Brenner26,27, Jong Park28, Radka Kaneva29, Jyotsna Batra30, Manuel R. Teixeira31, Hardev Pandha32
1The Institute of Cancer Research, London, SM2 5NG, UK, 2Royal Marsden NHS Foundation Trust, London, SW3 6JJ, UK, 3Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, UK, 4University of Warwick, Coventry, UK, 5Cancer Epidemiology Centre, Cancer Council Victoria, 615 St Kilda Road, Melbourne Victoria, Australia, 6Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Victoria, Australia, 7Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden, 8Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, USA, 9Department of Medical Biochemistry and Genetics, Institute of Biomedicine, Kiinamyllynkatu 10, FI-20014 University of Turku; and Tyks Microbiology and Genetics, Department of Medical Genetics, Turku University Hospital, 10BioMediTech, 30014 University of Tampere, Tampere, Finland, 11Department of Clinical Biochemistry, Herlev Hospital, Copenhagen University Hospital, Herlev Ringvej 75, DK-2730 Herlev, Denmark, 12Cancer Epidemiology Unit, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK, 13Surgical Oncology (Uro-Oncology: S4), University of Cambridge, Box 279, Addenbrooke’s Hospital, Hills Road, Cambridge, UK and Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK, 14Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, UK, 15Cambridge Institute of Public Health, University of Cambridge, Forvie Site, Robinson Way, Cambridge CB2 0SR, 16Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA, 17Department of Epidemiology, School of Public Health, University of Washington, Seattle, Washington, USA, 18International Epidemiology Institute, 1455 Research Blvd., Suite 550, Rockville, MD 20850, 19Mayo Clinic, Rochester, Minnesota, USA, 20Department of Urology, University Hospital Ulm, Germany, 21Institute of Human Genetics University Hospital Ulm, Germany, 22Brigham and Women's Hospital/Dana-Farber Cancer Institute, 45 Francis Street- ASB II-3, Boston, MA 02115, 23Washington University, St Louis, Missouri, 24International Hereditary Cancer Center, Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland, 25Division of Genetic Epidemiology, Department of Medicine, University of Utah School of Medicine, 26Division of Clinical Epidemiology and Aging Research & Division of Preventive Oncology, German Cancer Research Center, Heidelberg Germany, 27German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg Germany, 28Division of Cancer Prevention and Control, H. Lee Moffitt Cancer Center, 12902 Magnolia Dr., Tampa, Florida, USA, 29Molecular Medicine Center and Department of Medical Chemistry and Biochemistry, Medical University - Sofia, 2 Zdrave St, 1431, Sofia, Bulgaria, 30Australian Prostate Cancer Research Centre-Qld, Institute of Health and Biomedical Innovation and Schools of Life Science and Public Health, Queensland University of Technology, Brisbane, Australia, 31Department of Genetics, Portuguese Oncology Institute, Porto, Portugal and Biomedical Sciences Institute (ICBAS), Porto University, Porto, Portugal, 32The University of Surrey, Guildford, Surrey, GU2 7XH, UK
COGS acknowledgement and funding: This study would not have been possible without the contributions of the following: Per Hall (COGS); Douglas F. Easton, Paul Pharoah, Kyriaki Michailidou, Manjeet K. Bolla, Qin Wang (BCAC), Andrew Berchuck (OCAC), Rosalind A. Eeles, Douglas F. Easton, Ali Amin Al Olama, Zsofia Kote-Jarai, Sara Benlloch (PRACTICAL), Georgia Chenevix-Trench, Antonis Antoniou, Lesley McGuffog, Fergus Couch and Ken Offit (CIMBA), Joe Dennis, Alison M. Dunning, Andrew Lee, and Ed Dicks, Craig Luccarini and the staff of the Centre for Genetic Epidemiology Laboratory, Javier Benitez, Anna Gonzalez-Neira and the staff of the CNIO genotyping unit, Jacques Simard and Daniel C. Tessier, Francois Bacot, Daniel Vincent, Sylvie LaBoissière and Frederic Robidoux and the staff of the McGill University and Génome Québec Innovation Centre, Stig E. Bojesen, Sune F. Nielsen, Borge G. Nordestgaard, and the staff of the Copenhagen DNA laboratory, and Julie M. Cunningham, Sharon A. Windebank, Christopher A. Hilker, Jeffrey Meyer and the staff of Mayo Clinic Genotyping Core Facility. Funding for the iCOGS infrastructure came from: the European Community's Seventh Framework Programme under grant agreement n° 223175 (HEALTH-F2-2009-223175) (COGS), Cancer Research UK (C1287/A10118, C1287/A10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007, C5047/A10692, C8197/A16565), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112 - the GAME-ON initiative), the Department of Defence (W81XWH-10-1-0341), the Canadian Institutes of Health Research (CIHR) for the CIHR Team in Familial Risks of Breast Cancer, Komen Foundation for the Cure, the Breast Cancer Research Foundation, and the Ovarian Cancer Research Fund.
Breast Cancer Association Consortium (BCAC) (http://bcac.ccge.medschl.cam.ac.uk/)
BCAC Lead Investigators:
Douglas Easton, Ph.D1, Paul Pharoah, Ph.D2, Georgia Chenevix-Trench, Ph.D3, Manjeet Humphreys1
1University of Cambridge, 2Cambridge Cancer Center, 3Queensland Institute of Medical Research
Transdisciplinary Research in Cancer of the Lung (TRICL) Research Team:
Hung RJ1, Han Y2, Brennan P3, Bickeböller H4, Rosenberger A4, Houlston RS5, Caporaso N6, Landi MT6, Heinrich J7, Risch A8, Wu X9, Ye Y9, Christiani DC10,11, Amos CI2
1Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada, 2Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 3Genetic Epidemiology Group, International Agency for Research on Cancer (IARC), Lyon, France, 4Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany, 5Division of Genetics and Epidemiology, The Institute of Cancer Research, London, United Kingdom, 6Division of Cancer Epidemiology and Genetics, National Cancer, Institute, National Institutes of Health, Bethesda, MD, USA, 7Helmholtz Centre Munich, German Research Centre for Environmental Health, Institute of Epidemiology I, Neuherberg, Germany, 8Department of Molecular Biology, University of Salzburg, Salzburg, Austria, 9Department of Epidemiology, UT MD Anderson Cancer Center, Houston, TX, 10Massachusetts General Hospital, Boston, Massachusetts, 11Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts
A.U. is supported by the South-East Norway Health Authorities (Helse Sor-Ost grant ID 2014040) at the Oslo University Hospital, and the Norwegian Centre for Molecular Medicine. I.G.M. is supported by funding from the Research Council of Norway (RCN), South East Norway Health Authority (SENHA) and the University of Oslo through the Centre for Molecular Medicine (Norway), which is part of the Nordic EMBL (European Molecular Biology Laboratory) partnership and also supported by Oslo University Hospitals. I.G.M. is also supported by the Norwegian Cancer Society and by EU FP7 funding. I.G.M. holds a visiting scientist position with Cancer Research UK through the Cambridge Research Institute and a Senior Visiting Research Fellowship with Cambridge University through the Department of Oncology. V.Z. is supported by the Centre for Molecular Medicine (Norway) and, together with A.W., F.B. and O.A.A., by the Norwegian Centre of Research in Mental Disorders (NORMENT) with funding from the RCN, SENHA, Norwegian Health Association and KG Jebsen Foundation. This work was supported by the Kristian Gerhard Jebsen Foundation, Centre for Molecular Medicine Norway, Research Council of Norway (213837, 223273), South-East Norway Health Authorities (2013–123), National Institutes of Health (R01AG031224, R01EB000790 and RC2DA29475). I.G.M. and group members participate in the NIH Genetic Associations and Mechanisms in Oncology (GAME-ON): A Network of Consortia for Post-Genome Wide Association (Post-GWA) Research (prostate: 1U19CA148537-01).
This work was also supported by Cancer Research UK Grant C5047/A3354. We would also like to thank the following for funding support: the Institute of Cancer Research and the Everyman Campaign, the Prostate Cancer Research Foundation, Prostate Research Campaign UK (now known as Prostate Cancer UK), the National Cancer Research Network UK and the National Cancer Research Institute (NCRI) UK. The ProtecT study is ongoing and is funded by the Health Technology Assessment Programme (projects 96/20/06, 96/20/99). The ProtecT trial and its linked ProMPT and CAP (Comparison Arm for ProtecT) studies are supported by Department of Health, UK, Cancer Research UK grant number C522/A8649, Medical Research Council (UK) grant number G0500966, ID 75466 and the NCRI, UK. The epidemiological data for ProtecT were generated through funding from the Southwest National Health Service Research and Development.
Availability of data and materials
The data supporting the results of this research paper are included within this article and its additional supplementary files. Summary statistics from the GWAS studies used in this manuscript are available through application to the relevant consortia (PRACTICAL, BCAC, TRICL and iCOGS).
Authors' contributions
IGM, AU and VZ conceived the study, performed data analysis and wrote the manuscript. IGM helped the conceptual design of the study, the preparation of the manuscript, and the interpretation of the results. FB and AW provided scripts for random pruning and edited the manuscript, OAA edited the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent of publication
Ethics approval and consent to participate
No patient samples were collected or analysed during this study. All GWAS data were provided as summary statistics by the consortia acknowledged in this study, having been collected in accordance with ethical regulations in the partner countries and as defined in the original research publications by those consortia.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
SE_LNCaP_H3K27Ac. (XLS 84 kb)
SE_LNCaP_MED12_H3K27Ac. (XLS 45 kb)
SE_VCaP_H3K27Ac. (BED 344 kb)
SE_VCaP_BRD4_H3K27Ac. (XLS 42 kb)
SE_VCaP_BRD4. (XLS 945 kb)
SE_PC_BRD4_MED12_H3K27Ac. (XLS 28 kb)
SE_MCF7_H3K27Ac. (XLS 539 kb)
SE_MCF7_BRD4_ER. (XLS 72 kb)
SE_MCF7_H3K27Ac_ER. (XLS 87 kb)
SE_MCF7_BRD4_H3K27Ac_ER. (XLS 54 kb)
SE_MCF7_BRD4_H3K27Ac. (XLS 137 kb)
SE_H2171_BRD4_MED1_H3K27Ac. (BED 53 kb)
SE_H2171_BRD4. (BED 1857 kb)
SE_H2171_H3K27Ac. (BED 88 kb)
SE_H2171_MED1_H3K27Ac. (BED 53 kb)
SE_Schwann_BRD4_H3K27Ac. (BED 153 kb)
SE_Schwann_BRD4. (BED 916 kb)
SE_Schwann_H3K27Ac. (BED 195 kb)
DHS_consensus. (BED 3694 kb)
DHS_encode. (BED 6397 kb)
DHS_He. (BED 4673 kb)
Supplementary material, including Supplementary Figures S1–S11, Supplementary Tables S1–S5, and Supplementary References. (DOCX 2037 kb)
Keywords
- Genome-wide association studies
- Functional annotation
- Risk loci
- Prostate cancer risk
- Breast cancer risk