We have analyzed the role of genetic heterogeneity among human populations in the replicability of genetic association studies. To address this question, we have measured the degree of population differentiation in loci that have shown differential patterns of association with disease, as reported in the Genetic Association Database (GAD). We report three main results. First, SNPs harbored in genes associated with complex disease present lower FST values than the rest of genic SNPs in the genome; second, there is a negative correlation between the replicability of studies associating genes with disease and the FST values of the associated genes in European and East Asian populations; and, third, in the same populations, high replicability genes present increased levels of high-frequency derived alleles. These findings support the importance of the recent evolutionary history of our species in the current patterns of susceptibility to complex diseases.
Given the large number of false positives reported in association studies [4, 6–8], a relevant starting issue is the adequacy of the GAD for our analysis. In that respect, two points must be noted. First, replication studies, which are the center of our manuscript, are in fact a way to assess how likely it is that previous associations are false positives. A good part of our study would be unnecessary if every association ever reported had been a true positive. In that sense, the known presence of both false and true positives in the database prompted the particular series of analyses that we present here. The approach will be different when enough GWAS data are available since, given current standards in the field, it is false negatives that dominate in these studies [6, 25, 26]. Second, even if the "low replicability" category contains a mixture of false and true positives, the studies with the highest replication rates are expected to correspond to true positives. Indeed, it has been known for quite some time that a considerable number of genetic variants have been consistently associated with complex diseases. For example, a review of 25 associations by Lohmueller et al. found an excess of replications in classical association studies that cannot be explained by false positives. Moreover, a recent paper by Siontis et al. shows that a good number of the associations detected in non-GWAS classical association studies (mostly those extensively studied) have been replicated in recent GWAS (41 of 291 with a p < 10^-7). Neither of these results would have been obtained if highly replicated associations had been false positives.
Our first observation, lower FST values in genes associated with complex disease, is relevant to the adaptive history of these genes. It is well known that purifying selection is the main force driving the evolution of genes related to Mendelian disorders, as they tend to harbor lower levels of polymorphism. In contrast, complex-disease associated genes seem to be under different pressures, with mixed evolutionary signals. Overall, our observation of lower levels of population differentiation is suggestive of purifying selection. These findings contradict results from other authors, who did not detect differences in FST values of disease-associated variants relative to genome-wide levels [19–21]. However, these previous studies focused on variants instead of genes and, therefore, could only muster small sample sizes. Myles et al. and Lohmueller et al. studied, respectively, 25 and 48 SNPs, with the resulting lack of statistical power. More recently, the study by Adeyemo and Rotimi was able to collect 621 disease-associated SNPs. As expected, they found SNPs with both very large and very low FST values across populations. However, they focused on average FST values per disease and did not test their global average FST of 0.105 against genome-wide levels.
Nevertheless, our finding of low average FST values in 403 genes that have been associated with disease is still inconclusive. Since our data mainly come from classical (non genome-wide) association studies, our observation may have different causes, some of them spurious. Of course, a truly extensive role of purifying selection governing the evolution of these genes is a possibility; but it is also possible that certain classes of genes with particular average selective pressures tend to be involved in complex diseases, or that there has been a human bias towards the inclusion of certain categories of genes in association studies. Indeed, when tested for functional enrichment of PANTHER Biological Process categories (see Additional File 10), complex-disease genes from the Global Set showed an enrichment for the category "Immunity and defense" (corrected p < 2.11 × 10^-40) and an array of "signaling"-related categories, such as "Signal transduction", "Cell surface receptor mediated signal transduction" and "Cell communication" (corrected p-values = 9.71 × 10^-40, 2.22 × 10^-30 and 1.21 × 10^-23, respectively). However, these results can be the consequence of any one of the causes mentioned above, or of several of them.
In a previous analysis of the Genetic Association Database, Amato et al. found a trend that seems opposite to the one we report here. Namely, they detected increased levels of population differentiation in disease-associated genes when compared to genome-wide base levels. However, a careful analysis shows that our results are consistent with those of Amato et al. and that the apparent contradiction is due to their analysis criteria differing from ours in two key aspects. First, their set of "disease genes" was composed of genes positively associated with disease at least once while, to avoid noise, we only included associations that had been studied four or more times (n = 1,793 vs. n = 403). Second, Amato et al. used as the FST value representative of each gene the maximum FST value of any of the SNPs within that gene. In contrast, we averaged the FST values of all the SNPs in a gene. This second difference is crucial: when we repeat our analysis using the "maximum FST" method, we do find marginally significant increased levels of population differentiation in disease genes (FST = 0.366, n = 403 vs. FST = 0.345, n = 18,671, p-value < 0.022, Mann-Whitney test). The reverse is also true: when we analyze the gene set from Amato et al. with our "average FST" approach, we detect significantly lower population differentiation than genome-wide autosomal levels (FST = 0.097, n = 1,631 vs. FST = 0.104, n = 17,443, p-value < 4.4 × 10^-5, Mann-Whitney test).
The fact that using either "maximum FST" or "average FST" leads to different results raises the question of which approach is more accurate. We believe our method to be more precise because "disease genes" are longer on average and, as such, tend to harbor more SNPs than the average gene (34.8% more, with an average of 101.48 SNPs, n = 403 vs. an average of 75.28 SNPs, n = 18,671, p-value < 3.1 × 10^-14, Mann-Whitney test). In fact, there is a strong positive correlation between the number of SNPs a gene harbors and the maximum FST value these SNPs can reach (ρ = 0.527, p < 10^-50, n = 19,074), while the correlation is much weaker with the gene-specific average FST (ρ = 0.094, p < 10^-39, n = 19,074). As a result, the maximum FST is more biased by gene length than the average FST. Therefore, an approach based on the average FST seems to be more accurate in our data, in the sense that the average FST of a gene is a better proxy of the amount of genetic differentiation at a given locus.
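The length bias of the maximum can be reproduced in a small simulation, sketched below in Python under strong simplifying assumptions: every SNP draws its FST independently from the same Beta(1, 9) distribution (mean 0.1), and the number of SNPs per gene is uniform between 2 and 300. These distributions, the seed and the helper names (`midranks`, `spearman`) are illustrative assumptions, not features of our data.

```python
import math
import random
from statistics import mean

def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    rx, ry = midranks(x), midranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed SNP counts per gene and assumed per-SNP FST distribution.
n_snps = [rng.randint(2, 300) for _ in range(2000)]
genes = [[rng.betavariate(1, 9) for _ in range(n)] for n in n_snps]

# Correlate the number of SNPs with each per-gene summary.
rho_max = spearman(n_snps, [max(g) for g in genes])
rho_mean = spearman(n_snps, [mean(g) for g in genes])
print(f"rho(n_snps, max FST)  = {rho_max:.3f}")
print(f"rho(n_snps, mean FST) = {rho_mean:.3f}")
```

Because no gene in this simulation is truly more differentiated than another, any correlation between SNP count and the maximum arises purely from taking an extreme value over more draws; the average has no such dependence by construction. The weak positive correlation observed for the average in real data must therefore have a different origin, which this sketch does not model.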
Our second main observation is that genetic heterogeneity across human populations varies greatly among loci associated with complex diseases. These loci present different degrees of population differentiation depending on their replicability and on the consistency of replicability between Europeans and East Asians. These two populations are more similar for loci that contain variants which have been repeatedly associated with disease in different studies, while greater genetic differences are found in loci whose disease variants have not been consistently replicated. These observations can have at least three sources. First, it is possible that different statistical power in different populations is contributing to the correlation between continental replicability and FST. For this to happen, genetic variants that have been associated with disease in a given population would have to be rare in other parts of the world. However, we found no evidence of loci with low consistency of replicability having more SNPs with extreme frequencies (common in one population while rare in the other). Alternatively, recent theoretical studies demonstrate that rare variants may create spurious or synthetic associations at certain common alleles. If rare causal variants make a substantial contribution to disease risk and if different populations present different genealogies, the spurious associations detected in each population would differ and replicability patterns may differ. This scenario would point to an important role for rare variants in the etiology of complex diseases. However, it is difficult to see how highly replicated associations could be spurious, and we did observe a stronger correlation between FST and consistency of replicability for associations that have been replicated in at least 50% of the studies. The final explanation would be that certain variants are contributing to the risk of disease in some populations but not in others. The range of factors underlying this possibility is not limited to purely genetic causes. For instance, some gene-environment interactions that have appreciable joint effects in complex diseases have been described, and environmental conditions vary widely across the planet. Thus, environmental variability among populations could have a role in the differential effect of genetic variants across populations that we have detected. In any case, the evolutionary history of humans would be such that some of the variants associated with disease increase susceptibility differently in different populations.
Our study points to the heterogeneous genetic architecture of complex diseases, which, even if modulated by similar cellular and molecular pathways in all humans, may present intricate population differences regarding causal variants and loci. Although in most cases the behavior of susceptibility or protective variants is shared across populations, some differential effects for the same alleles in different populations have been established, such as the European-specific protective effect of the 32-bp deletion allele of the CCR5 gene against progression of HIV-1 infection [34–36] or the presence of two different haplotype blocks in the NRG1 gene that confer susceptibility to schizophrenia in European and East Asian populations, respectively. These differences could eventually lead to systematic differences among human populations in susceptibility to disease, and may underlie well-known cases, such as the differential susceptibility to and prevalence of asthma between individuals of Mexican and Puerto Rican ancestry [38–40].
Usually, lack of replication in association mapping is thought to be due to confounding factors such as population stratification, lack of statistical power or publication bias. Therefore, stringent replication criteria are necessary to avoid false positives and to ultimately confirm that a certain genetic variant confers susceptibility to disease. However, the fact that the allelic architecture of disease may differ across human populations raises the issue of revisiting some genetic association studies for complex diseases, since some putative false positives might hint at diseases whose etiology is geographically heterogeneous.
As to the causes of these differences, it has been previously shown that there is variation in the disease-susceptibility variants that are present in different populations. These differences have been attributed to changes in selective pressures over standing variation [41, 42] or to population-specific selective processes [43, 44]. When compared against low replicability genes, high replicability genes present lower FST values between Europeans and Asians, but high FST values between either of these populations and Africans. These results, together with the fact that derived alleles are more frequent in high replicability genes in Asian and European populations, suggest that replicability has been higher in loci whose allele frequencies changed in the ancestors of Europeans and Asians after they left Africa. It is tempting to speculate about a role of natural selection in shaping this pattern, which would fit with suggestions that selection leads, in some cases, to disease as a side effect of adaptation [41, 42]. However, our results could simply be due to the action of genetic drift relaxing purifying selection in non-African populations. In fact, it has been shown that the bottleneck due to the out-of-Africa event reduced the ability of purifying selection to purge deleterious alleles.