Genome-wide epistatic expression quantitative trait loci discovery in four human tissues reveals the importance of local chromosomal interactions governing gene expression
© Fitzpatrick et al.; licensee BioMed Central. 2015
Received: 14 April 2014
Accepted: 29 January 2015
Published: 21 February 2015
Epistasis (synergistic interaction) among SNPs governing gene expression is likely to arise within transcriptional networks. However, the power to detect it is limited by the large number of combinations to be tested and the modest sample sizes of most datasets. By limiting the interaction search space firstly to cis-trans and then cis-cis SNP pairs where both SNPs had an independent effect on the expression of the most variable transcripts in the liver and brain, we greatly reduced the size of the search space.
Within the cis-trans search space we discovered three transcripts with significant epistasis. Surprisingly, all interacting SNP pairs were located nearby each other on the chromosome (within 290 kb-2.16 Mb). Despite their proximity, the interacting SNPs were outside the range of linkage disequilibrium (LD), which was absent between the pairs (r2 < 0.01). Accordingly, we redefined the search space to detect cis-cis interactions, where a cis-SNP was located within 10 Mb of the target transcript. The results of this show evidence for the epistatic regulation of 50 transcripts across the tissues studied. Three transcripts, namely, HLA-G, PSORS1C1 and HLA-DRB5 share common regulatory SNPs in the pre-frontal cortex and their expression is significantly correlated. This pattern of epistasis is consistent with mediation via long-range chromatin structures rather than the binding of transcription factors in trans. Accordingly, some of the interactions map to regions of the genome known to physically interact in lymphoblastoid cell lines while others map to known promoter and enhancer elements. SNPs involved in interactions appear to be enriched for promoter markers.
In the context of gene expression and its regulation, our analysis indicates that the study of cis-cis or local epistatic interactions may have a more important role than interchromosomal interactions.
Genome-wide studies of gene expression have successfully identified genetic variants that contribute to the variation of gene expression within populations [1-11]. The objective of genome-wide association studies (GWAS) is to map genotypic variation to phenotypic variation. Jansen and Nap  proposed extending the GWAS paradigm to deal with quantitative endophenotypes, e.g. RNA, protein and metabolite abundance in a cell. To date, consideration of RNA abundance has received most attention in the literature [1-11]. Those variants that affect gene expression are referred to as expression quantitative trait loci (eQTLs) of which thousands have been reported [1-11]. Most studies have focused on single nucleotide polymorphisms (SNPs).
The literature reports two classes of eQTL, cis-acting SNPs and trans-acting SNPs. Cis-acting SNPs lie within a gene or near the transcription start or stop site of a gene and correlate with the expression of that gene. In contrast, trans-acting SNPs can lie anywhere else in the genome. Cis-acting variants are more numerous than trans-acting variants but not necessarily more common due to difficulties in detecting trans-SNPs related to a larger search space [3-5]. Understanding of the mechanisms of action of expression polymorphisms detected in GWAS is limited. Cis-acting variants may affect the binding of the transcriptional machinery or the stability of the transcript . The mechanics of trans-acting variants have proven more difficult to determine. Cheung et al. report that the majority of trans-SNPs do not map to transcription factors or signaling molecules . More recently, SNPs implicated in disease associations were shown to be enriched in enhancers and microRNA binding sites .
As with the traditional GWAS, the focus in studies of expression polymorphisms has been the detection of single variants that function independently to affect gene expression. However, consideration of single variants alone typically explains only a small proportion of the variance of a trait. This missing heritability is attributed by some to the fact that genetic interactions (epistasis) are generally ignored in mapping studies despite claims that they are an ubiquitous feature of biological processes [14-18]. In contrast to studies in model organisms which have reported extensive epistasis [19,20], where epistasis has been studied using GWAS, the results have been few [21,22]. However, regardless of the percentage contribution to heritability, epistatic interactions are of intrinsic interest as they reveal aspects of regulatory networks that single SNPs do not identify.
The reasons for the relative absence of reported epistasis in the GWAS literature are twofold. Firstly, exhaustive searches of all pairs of SNP-SNP interactions are computationally expensive, e.g. a screen for all two-locus interactions amongst 500,000 SNPs and 30,000 transcripts would require 3.25 × 1015 statistical tests. More importantly, the combinatorial explosion of even simple pairwise interactions necessitates stringent correction for multiple testing which eliminates all but the most striking results. Inadequate correction for the search space and reliance on assumptions of normality in gene expression can lead to false inferences of epistasis where none exists. In this study, we sought to avoid such problems by restricting the search space in the first place, and then employing careful analyses, in particular avoiding tests that assumed normality of the quantitative RNA levels. While we do manage to reduce the search space considerably, there remains substantial lack of power to fully quantify the real extent of biologically important epistasis. Our study design is secondly limited by the availability of only four tissue types analysed on a similar platform. Nonetheless, by sampling a small fraction of such epistatic effects, our approach provides insights into the nature of epistasis governing RNA expression in mammalian tissues.
We set out to discover epistasis affecting gene expression in the human liver and three brain tissues. To reduce the size of the search space, we initially considered only variable transcripts and SNP pairs that had a cis and trans effect on the same transcript. This approach revealed a small number of statistically significant epistatic intrachromosomal effects and no interchromosomal effects. As a follow on, we redefined the search space to consider only cis-cis interactions. In this more focused search, we discovered a greater number of interactions affecting a greater number of transcripts, suggesting the importance of cis-cis epistasis in transcriptional regulation.
Results and discussion
Genetic interactions in four human tissues
Bonferroni significant cis-trans interactions in the liver, pre-frontal cortex and the cerebellum
9.35 × 10−7
2.22 × 10−5
2.54 × 10−5
2.19 × 10−8
2.74 × 10−8
1.57 × 10−7
2.32 × 10−7
1.05 × 10−7
2.09 × 10−7
2.34 × 10−6
3.10 × 10−6
3.24 × 10−6
6.56 × 10−7
7.48 x 10−7
8.01 × 10−7
9.66 × 10−7
1.05 × 10−6
1.86 × 10−6
1.52 × 10−6
1.52 × 10−6
Given that the significant interactions from the cis-trans analysis were all intrachromosomal and that the SNPs involved were near (within 5 Mb) the transcript which they regulate, we next sought to identify such interactions by testing for cis-cis interactions where the cis-SNPs were within 10 Mb of the target transcript and of each other and at least 100 kb apart from each other and in linkage disequilibrium (r2 < 0.01). A total of 10,349, 135,495, 90,975 and 44,490 tests for cis-cis interactions were conducted in the liver, pre-frontal cortex, cerebellum and visual cortex, respectively. After Bonferroni correction ( α < 0.05), 34 interactions were significant in the liver (p < 4.83 × 10−6 ), 321 in the pre-frontal cortex (p < 3.69 × 10−7), 144 in the cerebellum (p < 5.50 × 10−7) and 66 in the visual cortex (p < 1.12 × 10−6). The interactions involve 2, 35, 27 and 16 transcripts in each of the four tissues, respectively. Details of the interactions for the four tissues are contained in Additional files 2, 3, 4, 5: Tables S4-S7.
The distance between SNPs involved in cis-cis interactions across the four tissues ranged from 100 kb to 1.7 Mb. Although the minimum distance allowed between interacting SNPs in the analysis was 100 kb, it is interesting that the maximum distance between two interacting SNPs was less than 2 Mb given that the maximum possible distance in the search space is 10 Mb. The empirical cumulative distributions of distances between interacting SNPs are given in Additional file 1: Figures S10-S13. Kolmogorov-Smirnov tests were used to determine whether the distribution of distances between SNPs involved in significant interactions was different from the distributions of distances between all interacting SNPs tested in each tissue. In all tissues, the two distributions were significantly different with p = 1.89 × 10-15 in the liver and p < 1 × 10−16 in the pre-frontal cortex, cerebellum and visual cortex.
In the pre-frontal cortex, there is some overlap between interactions, i.e. different transcripts share interacting SNPs. On chromosome 6 in the pre-frontal cortex, PSORSIC1, HLA-G and HLA-DRB5 share common interactors (Additional file 1: Figure S14). The overlap of interacting SNPs could be indicative of common regulation. Consistent with this, the expression of this cluster is correlated with HLA-G showing correlations with PSORS1C1 and HLA-DRB5 (ρ = −0.21, p = 4.21 × 10−7; ρ = 0.18, p = 1.83 × 10−5, respectively) (Additional file 1: Figure S15).
Interpreting the interactions
SNPs implicated in GWAS may not be the causal SNPs but rather linked to the causal SNP. In the case of mapping the genetic basis of disease susceptibility, if an associated SNP falls within a non-coding region, it can be difficult to decipher a mechanism of how such a SNP confers risk. With expression studies, associations link SNPs directly to the expression of a particular transcript but in itself, does not necessarily reveal anything of the transcriptional mechanism. Accordingly, we used auxiliary data in the form of the 3-dimensional structure of the genome  and enhancer and promoter annotations from ENCODE  and the Roadmap Epigenomics Project  in order to help interpret the functional significance of the cis-cis interactions reported.
The 3-dimensional genome
Epistatic interactions that map to HiC interactions
The regulatory genome
SNPs that map to promoter regions in brain tissues
6.9 × 10−6 *
9.75 × 10−7*
6.12 × 10−6*
1.42 × 10−7*
2.14 × 10−6*
8.02 × 10−6*
4.68 × 10−8*
SNPs that map to enhancer regions in brain tissues
We initially conducted a genome-wide screen for cis-trans epistasis governing gene expression. Interestingly, all of the interactions we report using this search space are intra-chromosomal (within 290 kb - 2.16 Mb), but are not due to detectable linkage disequilibrium. Accordingly, we looked for cis-cis interactions and identified numerous transcripts under epistatic regulation. Such cis-cis genetic interactions seem important in the regulation of gene expression and appear to be a more significant contributor to large epistatic effects than inter-chromosomal and cis-trans effects. The patterns of multiple interactions at the HLA loci may be coupled with complex looping structures bringing together multiple DNA regions. Consistent with this, there is enrichment for physical interactions amongst epistatic eQTLs as well as enrichment for promoters in SNPs involved in interactions. As such, we propose that the role of distant cis-cis interactions in the regulation of gene expression and in disease susceptibility merits careful searching.
Genotype data & quality control
Individuals from the Human Liver Cohort (HLC) were genotyped on both the Affymetrix 500 K and Illumina humanHap650Y platforms. The genotyping protocol for the HLC has been described previously . The Harvard Brain Tissue Resource Centre (HBTRC) samples were genotyped on both the Illumina HumanHap650Y array and a custom Perlegen 300 k array. Informed consent and ethical approval for the collection of data relating to the Human Liver Cohort was obtained from tissue resource centres at Vanderbilt University, the University of Pittsburgh and Merck Research Laboratories. All brain tissue was acquired by Merck Research Laboratories from the Harvard Brain Tissue Resource Center at McLean Hospital where informed consent was obtained from both the donors and their next of kin. The data was handled in accordance with the HBTRC guidelines. The study was approved by the McLean Hospital Institutional Review Board. For the purpose of this analysis, only SNPs on autosomes with a minor allele frequency of greater than 0.2 and a missing value quota of not more than 20% were considered. The high MAF threshold of 0.2 was used to avoid cells with small counts when investigating interactions. The highest missingness of SNPs involved in the finally reported epistatic effects was 14.64%, 4.17%, 2.66% and 2.48% in the liver, pre-frontal cortex, cerebellum and visual cortex, respectively. SNPs were filtered for violations of Hardy-Weinberg equilibrium using a Chi-Squared test (p < 0.05). After this, a total of 449,313 SNPs remained from the HLC samples and 309,976, 310,566 and 309,163 SNPs from the HBTRC cerebellum, visual cortex and pre-frontal cortex samples, respectively.
The smartpca program from the EIGENSOFT 4.2 package was used to compute principal components on the genotypic data to measure population stratification . The significance (p < 0.05) of the components was determined using Tracy-Widom statistics . For the liver the first principal component was statistically significant but the first three were used as covariates in all eQTL mapping analyses. For the cerebellum, visual cortex and pre-frontal cortex the first 9, 12 and 10 components were significant and used as covariates in all eQTL mapping analyses. Twenty seven individuals from the liver and 7, 6 and 7 individuals from the cerebellum, visual cortex and pre-frontal cortex were detected as outliers (smartpca default settings) and removed from all analyses. The resulting sample sizes for each of the tissues was nliver = 403, ncerebellum = 489, npre-frontal cortex = 576 and nvisual cortex = 403.
The liver tissue samples were collected from three tissue resource centres at Vanderbilt University, University of Pittsburgh and Merck Research Laboratories and the microarray analysis was conducted on a custom Agilent 44 k array. The expression profiling routine for the liver samples has been previously described . In brief, expression is measured as the mean-log ratio relative to a sex-balanced pool of samples and adjusted for age, sex and centre of origin. The brain tissue samples were collected from the Harvard Brain Tissue Resource Centre and the expression profiling conducted on a custom Agilent array. The brain expression profiling has been described previously . Similar to the liver data, the brain gene expression is measured as the mean-log ratio and has been corrected for gender, RNA integrity number, pH, post-mortem interval, batch and preservation of samples. The brain data comprised a mixture of control samples and samples with Alzheimer's and Huntington's disease.
The presence of SNPs within microarray expression probes has been shown to affect the accuracy of RNA measures . We sought to remove probes containing common SNPs by mapping probes to the SNPs from the HapMap CEU population (release 128) and removing those probes containing SNPs from the analysis. In total, 6858 autosomal probes were removed.
This study considered those RNAs which were most variable in the sample populations based on the interquartile range (IQR). The interquartile range for each RNA in the four tissues was computed and the top 5% of reporters carried forward for the analyses. The IQR was chosen as a measure of variability over the variance or standard deviation so as to avoid selecting those RNAs with extreme values for few individuals. A total of 1599 reporters in the liver and 1621 reporters for each of the brain tissues were used in the analysis.
Modeling marginal & interaction effects
Independent effects of single SNPs were estimated using rank-transform (RT) regression. RT-regression is achieved by ranking the expression values and utilising them as the dependent variable in a linear model . RT-regression has been evaluated in the context of GWAS where it performs similarly to classical regression when assumptions of normality are met but has greater power and control of family-wise error rates in the presence of non-normality . Genotypes AA, Aa and aa were encoded as 0, 1 and 2, respectively, where A denotes the major allele and a, the minor allele. Putative marginal eQTLs were designated as either cis or trans depending on their location relative to the midpoint of the expression probe coordinates. For the cis-trans study, cis-SNPs were defined as those which lie within 1 Mb upstream or downstream of the probe midpoint. Conversely, trans-SNPs were defined as those located outside of 1 Mb upstream or downstream of the probe midpoint or occupying a different chromosome. For the cis-cis study, cis-SNPs were defined as those SNPs which fall within 10 Mb of the probe midpoint. The putative cis and trans eQTLs were separately corrected for multiple testing using the Benjamini-Hochberg approach to control false discovery rates (FDR) .
Depending on the search space strategy, where an expression phenotype had both a cis and trans eQTL or two cis-eQTLs at an FDR of 50%, the interaction of those eQTLs was computed using a RT-regression model. Only pairs of SNPs with a minimum sample size of 10 individuals in each of the nine genotypic combinations of a pairwise interaction were considered. Fitting of both the marginal and interaction effect models was performed using R. Interactions were computed as the product of the two loci with possible values of 0, 1, 2 or 4 representing the nine genotypic combinations of a pairwise interaction. An additive model was chosen as it requires the fitting of just a single interaction term alongside the two marginal effects. The consideration of dominance would require fitting four main effects (2 additive and 2 dominant) and four interaction terms (additive*additive, dominant*dominant and 2 additive*dominant interactions) and as such require a larger sample size in order to detect such interactions. While not all epistatic effects are likely to follow an additive model, it was chosen primarily to maximise statistical power of detecting any epistatic effects. The significance of the interaction was determined using the p-value of the regression coefficient of the interaction term in the model. Interactions were corrected for multiple testing using the Bonferroni correction (α < 0.05) across all tests for that tissue. Step-wise multiple regression to determine the independence of interactions, where reported, was assessed using a step-wise multiple regression procedure based on the Akaike Information Criteria using the step() function in R. As an initial model, all main and interaction effect terms were fitted.
Linkage disequilibrium (LD) between interacting SNPs was measured using PLINK . To ensure that interacting SNPs on the same chromosome were independent, we only tested interactions among those intra-chromosomal SNP pairs with an r2 < 0.01, as measured in the samples for that particular tissue. In order to determine the numbers of independent interactions, linkage disequilibrium was also measured amongst sets of significant SNP pairs that were chromosomally adjacent. Differences between the distributions of the distances between interacting SNPs and the entire set of interactions tested were tested using Kolmogorov-Smirnov statistics. The potential for the cross hybridisation of probes whose expression is under epistatic control was determined using BLAST. The probe sequence was compared to the RefSeq RNA database using a discontiguous megablast. In the case of the cis-trans study, no probes had a match other than itself whereas for the cis-cis study, 9 probes representing 5 transcripts were removed from the set of significant interactions.
Enrichment for regulatory annotations
Promoter and enhancer annotations for cis-cis interacting SNPs were taken from HepG2 cells as measured by ENCODE and from 7 brain tissues (Anterior Caudate, Hippocampus Middle, Angular Gyrus, Inferior Temporal Lobe, Germinal Matrix, Mid Frontal Lobe, Substantia Nigra) as measured by the Roadmap epigenomics Project [26,27]. The annotations were gathered from HaploReg [40,41] and enrichment calculated using a Chi-squared test of independence comparing the number of SNPs involved in significant interactions that map to promoter or enhancer elements to all other SNPs tested for interactions. SNPs from the liver were tested for enrichments using the HepG2 annotations and SNPs from the three brain tissues were tested from regulatory annotations in the 7 brain tissues.
Physical interactions from Hi-C
The genomic coordinates of interacting SNP pairs were mapped to the human genome hg18 build using the UCSC LiftOver tool . Epistatic interactions were then mapped to known physically interacting regions of the genome measured in lymphoblastoid cell lines . The same criteria of Cheung et al.  was used to map epistatic eQTLs to interacting regions, i.e. a pair of epistatic eQTLs were deemed to be in a physically interacting region of the genome when both eQTLs mapped one each +/−5 kb upstream or downstream of the alignment start sites. Enrichment for epistatic interactions were computed using a Chi-squared test of independence comparing the number of epistatic interactions that map to at least one Hi-C interaction to the number of Hi-C mappings in the remaining non-epistatic interactions tested.
Microarray data for gene expression data in the liver can be downloaded from the gene expression omnibus (GEO) archive, accession no. GSE9588. Microarray data for gene expression in the pre-frontal cortex, cerebellum and visual cortex can similarly be downloaded from GEO, accession no. GSE44772, GSE44768, GSE44770, and GSE44771. The Hi-C data can also be downloaded from GEO, accession no. GSE189199.
This material is based upon works supported by the Science Foundation Ireland under Grant No. 08/SRC/I1407: Clique: Graph Network Analysis Cluster. The Harvard Brain Tissue Resource Center, which generously provided the tissue samples, is supported by federal grant R24 MH/NS068855. NS was supported by the Irish Research Council and CJR was supported by an ICON-Newman Fellowship.
- Dimas A, Deutsch S, Stranger B, Montgomery S, Borel C, Attar-Cohen H, et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science. 2009;325(5945):1246–50.View ArticlePubMed CentralPubMedGoogle Scholar
- Dixon A, Liang L, Moffatt M, Chen W, Heath S, Wong K, et al. A genome-wide association study of global gene expression. Nat Genet. 2007;39(10):1202–7.View ArticlePubMedGoogle Scholar
- Emilsson V, Thorleifsson G, Zhang B, Leonardson A, Zink F, Zhu J, et al. Genetics of gene expression and its effect on disease. Nature. 2008;27(452):423–8.View ArticleGoogle Scholar
- Goring H, Curran J, Johnson M, Dyer T, Charlesworth J, Cole S, et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat Genet. 2007;39(10):1208–16.View ArticlePubMedGoogle Scholar
- Schadt E, Molony C, Chudin E, Hao K, Yang X, Lum P, et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008;6(5):e107.View ArticlePubMed CentralPubMedGoogle Scholar
- Cheung V, Nayak R, Xiaorong Wang I, Elwun S, Cousins S, Morley M, et al. Polymorphic Cis- and trans-regulation of human gene expression. PLoS Biol. 2010;8(9):e1000480.View ArticlePubMed CentralPubMedGoogle Scholar
- Stranger B, Forrest M, Dunning M, Ingle C, Beazley C, Thorne N, et al. Relative impact of nucleotide and copy number variation on gene expression. Science. 2007;9(315):848–53.View ArticleGoogle Scholar
- Stranger B, Nica A, Forrest M, Dimas A, Bird C, Beazley C, et al. Population genomics of human gene expression. Nat Genet. 2007;39(10):1217–24.View ArticlePubMed CentralPubMedGoogle Scholar
- Idaghdour Y, Czika W, Shianna K, Lee S, Visscher P, Martin H, et al. Geographical genomics of human leukocyte gene expression variation in southern Morocco. Nat Genet. 2010;42:62–7.View ArticlePubMed CentralPubMedGoogle Scholar
- Morley M, Molony C, Weber T, Devlin J, Ewens K, Spielman R, et al. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004;12(430):743–7.View ArticleGoogle Scholar
- Veyrieras J, Kudaravalli S, Kim S, Dermitzakis E, Gilad Y, Stephens M, et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 2008;4(10):e1000214.View ArticlePubMed CentralPubMedGoogle Scholar
- Jansen R, Nap J. Genetical genomics: the added value from segregation. Trends Genet. 2001;17(7):388–91.View ArticlePubMedGoogle Scholar
- Westra H, Peters M, Esko T, Yaghootkar H, Schurmann C, Kettunen J, et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet. 2012;45(10):1238–43.View ArticleGoogle Scholar
- Van Steen K. Travelling the world of gene-gene interactions. Brief Bioinform. 2012;13:1–19.View ArticlePubMedGoogle Scholar
- Carlborg O, Haley C. Epistasis: too often neglected in complex trait studies. Nat Rev Genet. 2004;5(8):618–25.View ArticlePubMedGoogle Scholar
- Cordel H. Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet. 2009;10(6):392–404.View ArticleGoogle Scholar
- Moore J. A global view of epistasis. Nat Genet. 2005;37:13–4.View ArticlePubMedGoogle Scholar
- Moore J, Williams S. Epistasis and its implications for personal genetics. Am J Hum Genet. 2009;85(3):309–20.View ArticlePubMed CentralPubMedGoogle Scholar
- Costanzo M, Baryshnikova A, Bellay J, Kim Y, Spear E, Sevier C, et al. The genetic landscape of the cell. Science. 2010;327(5964):425–31.View ArticlePubMedGoogle Scholar
- Ryan C, Roguev A, Patrick K, Xu J, Jahari H, Tong Z, et al. Hierarchical modularity and the evolution of genetic interactomes across species. Mol Cell. 2012;46(5):691–704.View ArticlePubMed CentralPubMedGoogle Scholar
- Sapkota Y, Mackey J, Lai R, Franco-Villalobos C, Lupichuk S, Robson P, et al. Assessing SNP-SNP interactions among DNA repair, modification and metabolism related pathway genes in breast cancer susceptibility. PLoS One. 2013;8(6):e64896.View ArticlePubMed CentralPubMedGoogle Scholar
- Ma L, Brautbar A, Boerwinkle E, Sing C, Clark A, Keinan A. Knowledge-driven analysis identi es a gene-gene interaction affecting high-density lipoprotein cholesterol levels in multi-ethnic populations. PLoS Genet. 2012;8(5):e1002714.View ArticlePubMed CentralPubMedGoogle Scholar
- Brawand D, Soumillon M, Necsulea A, Julien P, Csardi G, Harrigan P, et al. The evolution of gene expression levels in mammalian organs. Nature. 2011;478:343–8.View ArticlePubMedGoogle Scholar
- Petryszak R, Burdett T, Fiorelli B, Fonseca NA, Gonzalez-Porta M, Hastings E, et al. Expression Atlas update – a database of gene and transcript expression from microarray and sequencing-based functional genomics experiements. Nucl Acids Res. 2014;42(D1):D926–32.View ArticlePubMed CentralPubMedGoogle Scholar
- Lieberman-Aiden E, van Berkum N, Williams L, Imakaev M, Ragoczy T, Telling A, et al. Comprehensive mapping of long range interactions reveals folding principles of the human genome. Science. 2009;326(5950):289–93.View ArticlePubMed CentralPubMedGoogle Scholar
- Consortium TEP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;486:57–74.Google Scholar
- Bernstein BE, Stamatoyannopoulos JA, Costello JF, Ren B, Milosavlijevic A, Meissner A, et al. The NIH roadmap epigenomics mapping consortium. Nat Biotech. 2010;28:1045–8.View ArticleGoogle Scholar
- Dekker J. Gene regulation in the third dimension. Science. 2008;319(5871):1793–4.View ArticlePubMed CentralPubMedGoogle Scholar
- Sanyal A, Lajoie B, Jain G, Dekker J. The long-range interaction landscape of gene promoters. Nature. 2012;489(7414):109–13.View ArticlePubMed CentralPubMedGoogle Scholar
- Jin F, Li Y, Dixon J, Selvaraj S, Ye Z, Lee A, et al. A high-resolution map of the three-dimensional chromatin interactome in human cells. Nature. 2013;530(7475):290–4.Google Scholar
- Thurman R, Rynes E, Humbert R, Vierstra J, Maurano M, Haugen E, et al. The accessible chromatin landscape of the human genome. Nature. 2012;6(489):75–82.View ArticleGoogle Scholar
- Price AL, Patterson NJ, Plenge RM, Weinblat ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;28:904–9.View ArticleGoogle Scholar
- Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS Genet. 2006;2(12):e190.View ArticlePubMed CentralPubMedGoogle Scholar
- Podtelezhnikov A, Tanis K, Nebozhyn M, Ray W, Stone D, Loboda A. Molecular insights into the pathogenesis of Alzheimer's disease and its relationship to normal aging. PLoS One. 2011;6(12):e29610.View ArticlePubMed CentralPubMedGoogle Scholar
- Alberts R, Terpestra P, Li Y, Breitling R, Nap J, Jansen R. Sequence polymorphisms cause many false cis eQTLs. PLoS One. 2007;2(7):e622.View ArticlePubMed CentralPubMedGoogle Scholar
- Conover W, Iman R. Rank transformations as a bridge between parametric and nonparametric statistics. Am Stat. 1981;35:121–9.Google Scholar
- Lourenco V, Pires A, Kirst M. Robust linear regression methods in association studies. Bioinformatics. 2011;27(6):815–21.View ArticlePubMedGoogle Scholar
- Benjamini Y, Hochberg Y. Contolling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B. 1995;57:289–300.Google Scholar
- Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira M, Bender D, et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet. 2007;81(3):559–75.View ArticlePubMed CentralPubMedGoogle Scholar
- Ward L, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40:D930-a.View ArticleGoogle Scholar
- HaploReg V2 [http://www.broadinstitute.org/mammals/haploreg/haploreg.php]
- UCSC Batch Coordinate Conversion [http://genome.ucsc.edu/cgi-bin/hgLiftOver]
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.