A new association test based on disease allele selection for case–control genome-wide association studies
 Zhongxue Chen^{1}
DOI: 10.1186/1471-2164-15-358
© Chen; licensee BioMed Central Ltd. 2014
Received: 5 February 2014
Accepted: 6 May 2014
Published: 12 May 2014
Abstract
Background
Current robust association tests for case–control genome-wide association study (GWAS) data are mainly based on the assumption of some specific genetic models. Due to the richness of the genetic models, this assumption may not be appropriate. Therefore, robust but powerful association approaches are desirable.
Results
In this paper, we propose a new approach to testing for the association between the genotype and phenotype for case–control GWAS. This method assumes a generalized genetic model and is based on the selected disease allele to obtain a p-value from the more powerful one-sided test. Through a comprehensive simulation study we assess the performance of the new test by comparing it with existing methods. Some real data applications are also used to illustrate the use of the proposed test.
Conclusions
Based on the simulation results and real data application, the proposed test is powerful and robust.
Keywords
Generalized genetic model, Robust test, Single-nucleotide polymorphism
Background
In a case–control genome-wide association study (GWAS), to detect the associated single-nucleotide polymorphisms (SNPs), we need to conduct a test for each individual SNP, whose data are summarized as a 2-by-3 table. Although Pearson’s chi-square test can be used, it is usually less powerful than the Cochran-Armitage trend test (CATT) if the genetic model is known [1–3]. However, if the genetic models are unknown or vary across SNPs, the optimal scores for the CATT are difficult or impossible to determine. If we use a CATT with fixed scores for all SNPs, we may lose power in some situations [4–13]. To circumvent this disadvantage, many robust association tests have been proposed in the literature [7, 12, 14–22]. Those tests do not rely solely on one specific genetic model; rather, they consider several possible genetic models simultaneously. In addition, many of them are based on the assumption that the underlying genetic model is one of the following three: additive, recessive, and dominant. For example, the max-min efficiency robust test (MERT) by Gastwirth [23, 24] and the maximum of the three optimal CATTs under the recessive, additive, and dominant models (MAX3) have been studied [6]. Zheng and Ng proposed a two-phase procedure called the genetic model selection (GMS) method [12], which selects a genetic model from the three models in its first stage. In contrast, Joo et al. proposed a test which eliminates genetic models [25]. Due to interactions with the environment, there are unlimited genetic models besides the three ideal models (recessive, additive, and dominant). Chen and Ng proposed a robust association test based on the generalized genetic model (GGM) [19], which includes the recessive, additive, and dominant models as special cases. Their approach obtains a p-value from a one-sided test for each of the two possible disease alleles. Because the disease allele is uncertain, the overall p-value is then approximated from the two dependent tests.
In this paper, we propose a new robust association test which utilizes the GGM and obtains a one-sided p-value based on the selected disease allele. The performance of the new test is compared with that of existing methods in terms of type I error rate control and detection power. Some real data applications are also used to demonstrate the use of the new test.
Methods
GGM and existing methods
Table 1 SNP data in a case–control GWAS
Genotype  AA  Aa  aa  Total

Case  r_{1}  r_{2}  r_{3}  r
Control  s_{1}  s_{2}  s_{3}  s
Total  n_{1}  n_{2}  n_{3}  n
Under the null hypothesis that there is no association between the genotype and phenotype, we have Λ_{1} = Λ_{2} = 1, where Λ_{1} and Λ_{2} denote the relative risks of genotypes Aa and aa, respectively, relative to AA. Regarding the alternative hypothesis, we assume the underlying genetic model is a GGM, which is also called the order-restricted relative risks model [19]. For the case where a is the disease allele, the GGM assumes Λ_{1} ≥ 1 and Λ_{2} ≥ Λ_{1}, with at least one of the inequalities being strict. It is easy to see that the aforementioned ideal models, recessive (Λ_{1} = 1, Λ_{2} > Λ_{1}), additive (Λ_{1} = (1 + Λ_{2})/2), and dominant (Λ_{1} = Λ_{2} > 1), are all special cases of the generalized model. If A, rather than a, is the disease allele, the GGM assumes 1 ≥ Λ_{1} and Λ_{1} ≥ Λ_{2}, with at least one of the inequalities being strict.
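The order restrictions above can be illustrated with a short Python sketch that labels a pair of relative risks (Λ_{1}, Λ_{2}). The function name `classify_genetic_model`, the returned labels, and the tolerance `tol` are hypothetical choices made for this illustration only; they are not part of the proposed method.

```python
from math import isclose


def classify_genetic_model(lam1, lam2, tol=1e-9):
    """Label (lam1, lam2) using the GGM order restrictions.

    Hypothetical helper for illustration; lam1 and lam2 are the relative
    risks of Aa and aa relative to AA.
    """
    def eq(x, y):
        return isclose(x, y, abs_tol=tol)

    if eq(lam1, 1.0) and eq(lam2, 1.0):
        return "null"
    # a is the disease allele: 1 <= lam1 <= lam2, at least one strict
    if lam1 >= 1.0 - tol and lam2 >= lam1 - tol:
        if eq(lam1, 1.0):
            return "recessive"
        if eq(lam1, lam2):
            return "dominant"
        if eq(lam1, (1.0 + lam2) / 2.0):
            return "additive"
        return "GGM, a is the disease allele"
    # A is the disease allele: 1 >= lam1 >= lam2, at least one strict
    if lam1 <= 1.0 + tol and lam2 <= lam1 + tol:
        return "GGM, A is the disease allele"
    # e.g. overdominant (lam1 > lam2 > 1) or underdominant (lam1 < 1 < lam2)
    return "outside GGM"
```

For example, `classify_genetic_model(1.2, 1.4)` is labelled additive, while the overdominant pair (1.5, 1.4) and the underdominant pair (0.9, 1.4) fall outside the GGM.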
Suppose the frequencies for AA, Aa, and aa are p_{ 1 }, p_{ 2 }, p_{ 3 } for cases and q_{ 1 }, q_{ 2 }, and q_{ 3 } for controls, respectively. Under the null hypothesis that there is no association between the disease and the genotype, it is easy to show p_{ 1 } = q_{ 1 }, p_{ 2 } = q_{ 2 }, and p_{ 3 } = q_{ 3 }. In this paper, we assume the cases and controls follow trinomial distributions TN(r, p_{ 1 }, p_{ 2 }, p_{ 3 }) and TN (s, q_{ 1 }, q_{ 2 }, q_{ 3 }), respectively.
The test statistics for some well-known existing methods are summarized as follows:
They are the scores assigned to the three columns.
where I is the indicator function, and the Hardy-Weinberg disequilibrium trend test (HWDTT) statistic is given by [15]:
$Z_{\text{HWDTT}} = \frac{(rs/n)^{1/2}\left(\widehat{\Delta}_{P} - \widehat{\Delta}_{Q}\right)}{\left\{1 - n_{2}/n - n_{1}/(2n)\right\}\left\{n_{2}/n + n_{1}/(2n)\right\}}$, $\widehat{\Delta}_{P} = r_{2}/r - \left(r_{2}/r + r_{1}/(2r)\right)^{2}$, $\widehat{\Delta}_{Q} = s_{2}/s - \left(s_{2}/s + s_{1}/(2s)\right)^{2}$, and c is a constant, usually chosen as 1.645. Here n_{i} = s_{i} + r_{i} (i = 1, 2, 3), and n = n_{1} + n_{2} + n_{3}.
The proposed test
It is easy to prove the following result.
From Theorem 1, Z_{1}, Z_{2} and Z_{3} are linearly dependent; it is not difficult to show that asymptotically Z_{3} = aZ_{1} + bZ_{2}, where $a = \sqrt{\frac{p_{2}p_{3}}{(1 - p_{2})(1 - p_{3})}}$ and $b = \sqrt{\frac{p_{1}}{(1 - p_{2})(1 - p_{3})}}$. It can also be shown that under the GGM, if a (or A) is the disease allele, the expectations of Z_{i} (i = 1, 2, 3) in (2) are all greater (or less) than 0. In addition, under the GGM, if z_{1} is close to 0, the genetic model is close to the recessive model. On the other hand, the genetic model is unlikely to be the recessive model if z_{1} is far from 0.
where Φ(∙) is the cumulative distribution function (CDF) of the standard normal distribution (N(0,1)) and F^{-1}(∙) is the inverse of the CDF of the chi-square distribution with one degree of freedom.
Usually it is difficult to calculate the p-value (or critical value) for the statistic C in (3) directly. However, the p-value can easily be estimated by resampling. Specifically, we first simulate two independent values, z_{1} and z_{2}, both from N(0,1). Then we calculate $z_{3} = \hat{a}z_{1} + \hat{b}z_{2}$, where $\hat{a} = \sqrt{\frac{\hat{p}_{2}\hat{p}_{3}}{(1 - \hat{p}_{2})(1 - \hat{p}_{3})}}$ and $\hat{b} = \sqrt{\frac{\hat{p}_{1}}{(1 - \hat{p}_{2})(1 - \hat{p}_{3})}}$ are the estimates of a and b, and $\hat{p}_{i} = n_{i}/n$ are the estimates of p_{i} (i = 1, 2, 3) from the data. Next we calculate
$g = \begin{cases} F^{-1}(\Phi(z_{1})) + F^{-1}(\Phi(z_{2})) & \text{if } z_{3} \ge 0 \\ F^{-1}(\Phi(-z_{1})) + F^{-1}(\Phi(-z_{2})) & \text{if } z_{3} < 0 \end{cases}$. We repeat the above steps K times and obtain K values of g. The p-value can then be estimated as (number of g's that are greater than or equal to c)/K, where c is the observed statistic calculated from the data using (3).
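The resampling steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the observed statistic `c_obs` is assumed to have been computed elsewhere from (3); the branch for z_{3} < 0 is assumed to evaluate Φ at −z_{1} and −z_{2}; and all function names are hypothetical. To stay within the standard library, the sketch uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal, so F^{-1}(u) = (Φ^{-1}((1 + u)/2))^{2}.

```python
import random
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def chi2_1_inv_of_phi(z):
    """Return F^{-1}(Phi(z)), where F is the chi-square(1) CDF."""
    u = _STD_NORMAL.cdf(z)
    # Clamp just below 1 so inv_cdf stays defined for extreme z.
    p = min((1.0 + u) / 2.0, 1.0 - 1e-16)
    return _STD_NORMAL.inv_cdf(p) ** 2


def resampled_statistic(z1, z2, z3):
    """One value of g; the sign of z3 selects the direction (assumed form)."""
    if z3 >= 0:
        return chi2_1_inv_of_phi(z1) + chi2_1_inv_of_phi(z2)
    return chi2_1_inv_of_phi(-z1) + chi2_1_inv_of_phi(-z2)


def resampling_pvalue(c_obs, n1, n2, n3, K=10000, seed=0):
    """Estimate the p-value as the share of resampled g values >= c_obs.

    n1, n2, n3 are the pooled genotype counts (column totals of Table 1).
    """
    n = n1 + n2 + n3
    p1, p2, p3 = n1 / n, n2 / n, n3 / n
    denom = (1.0 - p2) * (1.0 - p3)
    a_hat = sqrt(p2 * p3 / denom)
    b_hat = sqrt(p1 / denom)
    rng = random.Random(seed)
    hits = 0
    for _ in range(K):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        z3 = a_hat * z1 + b_hat * z2
        if resampled_statistic(z1, z2, z3) >= c_obs:
            hits += 1
    return hits / K
```

A call such as `resampling_pvalue(c_obs, n1, n2, n3, K=10000)` mirrors the setting K = 10,000; the fixed `seed` is only there to make the sketch reproducible.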
Results and discussion
Simulation study
In this section, we will assess the performance of the proposed test by comparing it with existing methods in terms of controlling type I error rate and detecting power through a comprehensive simulation study. In the simulation study we assume that cases and controls in Table 1 follow trinomial distributions with probabilities p = (p_{ 1 },p_{ 2 },p_{ 3 }) and q = (q_{ 1 },q_{ 2 },q_{ 3 }), respectively. It can be shown that, for given q_{ i }’s, and relative risks Λ_{1}, Λ_{2}, the values of the corresponding p_{ i }’s can be obtained as follows [17–21]: ${\mathit{p}}_{1}=\frac{{\mathit{q}}_{1}}{{\mathit{q}}_{1}+{\mathit{\Lambda}}_{1}{\mathit{q}}_{2}+{\mathit{\Lambda}}_{2}{\mathit{q}}_{3}}$, ${\mathit{p}}_{2}=\frac{{\mathit{\Lambda}}_{1}{\mathit{q}}_{2}}{{\mathit{q}}_{1}+{\mathit{\Lambda}}_{1}{\mathit{q}}_{2}+{\mathit{\Lambda}}_{2}{\mathit{q}}_{3}}$, and ${\mathit{p}}_{3}=\frac{{\mathit{\Lambda}}_{2}{\mathit{q}}_{3}}{{\mathit{q}}_{1}+{\mathit{\Lambda}}_{1}{\mathit{q}}_{2}+{\mathit{\Lambda}}_{2}{\mathit{q}}_{3}}$.
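The mapping from control genotype frequencies and relative risks to case genotype frequencies can be written as a short Python sketch. The function name `case_genotype_frequencies` is a hypothetical choice for illustration; the arithmetic follows the three formulas above.

```python
def case_genotype_frequencies(q, lam1, lam2):
    """Return (p1, p2, p3) for cases given control frequencies q = (q1, q2, q3).

    lam1 and lam2 are the relative risks of Aa and aa relative to AA.
    """
    q1, q2, q3 = q
    denom = q1 + lam1 * q2 + lam2 * q3
    return (q1 / denom, lam1 * q2 / denom, lam2 * q3 / denom)
```

Under the null hypothesis (Λ_{1} = Λ_{2} = 1) the function returns the control frequencies unchanged, and for any relative risks the three returned values sum to 1.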
We first assume that Hardy–Weinberg equilibrium (HWE) holds for controls and that the minor allele frequency (MAF) is 0.3 or 0.5. The disease allele is either the minor or the major allele. The numbers of cases (r) and controls (s) are both set to 1000. Different pairs of Λ_{1} and Λ_{2} are used in the simulation to compare the performance of our proposed method with those of GMS, MERT, MAX3, Pearson’s chi-square test, and the CATT with x = 0.5, which is the commonly used test under the assumption of an additive genetic model. More specifically, we fix Λ_{2} at 1.4 and let Λ_{1} vary from 1.0 to 1.4 in increments of 0.1. Therefore, the three special genetic models are included in the simulation study. To assess the robustness of the proposed test, we also simulate data from genetic models other than the GGM: the overdominant model (Λ_{1} = 1.5, Λ_{2} = 1.4) and the underdominant model (Λ_{1} = 0.9, Λ_{2} = 1.4). The significance level is set to 0.05, and the type I error rate and power are estimated by the proportions of rejections from 1000 replicates. To estimate the p-value for the proposed test, we resample 10,000 times (i.e., K = 10,000). The p-values from MAX3, GMS, and MERT are obtained by using the R package “Rassoc” [13].
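The simulation design can be sketched in Python for one of the compared methods, Pearson’s chi-square test, whose p-value has a closed form for a 2-by-3 table (2 degrees of freedom). This is an illustrative sketch only, not the code used for the reported results: the function names, the fixed seed, and the handling of empty genotype columns are assumptions, and the case frequencies `p` are supplied by the caller.

```python
import random
from math import exp


def hwe_frequencies(freq_a):
    """Genotype frequencies (AA, Aa, aa) under HWE when allele a has frequency freq_a."""
    return ((1 - freq_a) ** 2, 2 * freq_a * (1 - freq_a), freq_a ** 2)


def draw_counts(rng, size, probs):
    """One trinomial draw: counts of AA, Aa, aa among `size` subjects."""
    counts = [0, 0, 0]
    for genotype in rng.choices((0, 1, 2), weights=probs, k=size):
        counts[genotype] += 1
    return counts


def pearson_pvalue(case, control):
    """P-value of Pearson's chi-square test on a 2-by-3 table."""
    r, s = sum(case), sum(control)
    n = r + s
    stat = 0.0
    for ri, si in zip(case, control):
        ni = ri + si
        if ni == 0:
            continue  # simplification: an empty column is skipped, df kept at 2
        for observed, row_total in ((ri, r), (si, s)):
            expected = row_total * ni / n
            stat += (observed - expected) ** 2 / expected
    return exp(-stat / 2.0)  # chi-square survival function with 2 df


def rejection_rate(p, q, r=1000, s=1000, replicates=1000, alpha=0.05, seed=1):
    """Proportion of replicates in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        case = draw_counts(rng, r, p)
        control = draw_counts(rng, s, q)
        if pearson_pvalue(case, control) < alpha:
            rejections += 1
    return rejections / replicates
```

With `q = hwe_frequencies(0.3)` and `p = q`, `rejection_rate(p, q)` estimates the type I error rate; passing case frequencies derived from a pair (Λ_{1}, Λ_{2}) estimates power instead.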
Table 2 Empirical type I error rates and powers for each method from 1000 replicates at significance level 0.05 when the sample sizes are 1000 for cases and controls, HWE holds for controls, the minor allele is the disease allele, and the MAF equals 0.3. *RM, AM, DM, ODM, and UDM denote the recessive, additive, dominant, overdominant, and underdominant models, respectively.
(Λ_{1}, Λ_{2})  (1, 1)  (1, 1.4) RM*  (1.1, 1.4)  (1.2, 1.4) AM*  (1.3, 1.4)  (1.4, 1.4) DM*  (1.5, 1.4) ODM*  (0.9, 1.4) UDM*

ChiSQ  0.053  0.927  0.759  0.528  0.395  0.428  0.524  0.992 
MAX3  0.051  0.941  0.794  0.568  0.438  0.445  0.484  0.994 
GMS  0.05  0.933  0.772  0.577  0.431  0.432  0.459  0.993 
CATT  0.05  0.934  0.807  0.644  0.427  0.25  0.136  0.986 
MERT  0.051  0.879  0.744  0.642  0.479  0.377  0.261  0.938 
GGM  0.053  0.944  0.793  0.598  0.447  0.432  0.412  0.995 
New  0.049  0.927  0.761  0.574  0.455  0.46  0.512  0.982 
Table 3 Empirical type I error rates and powers for each method from 1000 replicates at significance level 0.05 when the sample sizes are 1000 for cases and controls, HWE holds for controls, the major allele is the disease allele, and the MAF equals 0.3
(Λ_{1}, Λ_{2})  (1, 1)  (1, 1.4) RM*  (1.1, 1.4)  (1.2, 1.4) AM*  (1.3, 1.4)  (1.4, 1.4) DM*  (1.5, 1.4) ODM*  (0.9, 1.4) UDM*

ChiSQ  0.046  0.538  0.496  0.615  0.805  0.933  0.988  0.752 
MAX3  0.048  0.552  0.547  0.662  0.831  0.948  0.987  0.676 
GMS  0.05  0.547  0.536  0.648  0.827  0.933  0.986  0.681 
CATT  0.048  0.375  0.555  0.716  0.854  0.926  0.963  0.195 
MERT  0.042  0.45  0.587  0.685  0.811  0.853  0.911  0.324 
GGM  0.052  0.523  0.55  0.682  0.839  0.944  0.987  0.614 
New  0.050  0.526  0.551  0.678  0.839  0.942  0.971  0.602 
Table 4 Empirical type I error rates and powers for each method from 1000 replicates at significance level 0.05 when the sample sizes are 1000 for cases and controls, HWE holds for controls, and the MAF equals 0.5
(Λ_{1}, Λ_{2})  (1, 1)  (1, 1.4) RM*  (1.1, 1.4)  (1.2, 1.4) AM*  (1.3, 1.4)  (1.4, 1.4) DM*  (1.5, 1.4) ODM*  (0.9, 1.4) UDM*

ChiSQ  0.048  0.854  0.736  0.646  0.697  0.81  0.909  0.973 
MAX3  0.045  0.868  0.778  0.704  0.732  0.818  0.909  0.962 
GMS  0.044  0.857  0.768  0.692  0.724  0.81  0.908  0.966 
CATT  0.044  0.795  0.787  0.761  0.724  0.705  0.701  0.815 
MERT  0.044  0.789  0.782  0.763  0.729  0.718  0.722  0.806 
GGM  0.045  0.865  0.795  0.735  0.746  0.816  0.901  0.962 
New  0.041  0.84  0.773  0.714  0.747  0.829  0.919  0.952 
Table 5 Empirical type I error rates and powers for each method from 1000 replicates at significance level 0.05 when the sample sizes are 1000 for cases and controls, the genotype frequencies for controls are (0.1, 0.36, 0.54), and the minor allele is the disease allele
(Λ_{1}, Λ_{2})  (1, 1)  (1, 1.4) RM*  (1.1, 1.4)  (1.2, 1.4) AM*  (1.3, 1.4)  (1.4, 1.4) DM*  (1.5, 1.4) ODM*  (0.9, 1.4) UDM*

ChiSQ  0.046  0.93  0.743  0.539  0.426  0.481  0.565  0.995 
MAX3  0.04  0.944  0.771  0.601  0.473  0.486  0.507  0.995 
GMS  0.04  0.929  0.756  0.609  0.471  0.49  0.517  0.992 
CATT  0.043  0.936  0.807  0.659  0.459  0.314  0.194  0.974 
MERT  0.044  0.88  0.763  0.656  0.517  0.409  0.302  0.933 
GGM  0.043  0.952  0.786  0.635  0.484  0.46  0.437  0.995 
New  0.049  0.926  0.755  0.616  0.485  0.514  0.556  0.988 
Table 6 Empirical type I error rates and powers for each method from 1000 replicates at significance level 0.05 when the sample sizes are 1000 for cases and controls, the genotype frequencies for controls are (0.1, 0.36, 0.54), and the major allele is the disease allele
(Λ_{1}, Λ_{2})  (1, 1)  (1, 1.4) RM*  (1.1, 1.4)  (1.2, 1.4) AM*  (1.3, 1.4)  (1.4, 1.4) DM*  (1.5, 1.4) ODM*  (0.9, 1.4) UDM*

ChiSQ  0.042  0.55  0.553  0.673  0.812  0.927  0.981  0.749 
MAX3  0.044  0.568  0.6  0.718  0.843  0.939  0.987  0.712 
GMS  0.041  0.585  0.61  0.718  0.82  0.922  0.98  0.727 
CATT  0.041  0.431  0.607  0.765  0.854  0.918  0.962  0.248 
MERT  0.042  0.512  0.635  0.755  0.808  0.866  0.896  0.385 
GGM  0.036  0.55  0.611  0.743  0.848  0.942  0.986  0.66 
New  0.043  0.552  0.597  0.734  0.849  0.941  0.981  0.654 
The GGM-based method of Chen and Ng (GGM in Tables 2–6) performs similarly to the proposed test. However, if the disease allele is the minor allele, the new test is more powerful than the GGM method when the genetic model is dominant or overdominant (see Tables 2, 4 and 5). In addition, unlike the proposed test, the GGM method does not report the disease allele.
Real data application
Table 7 Genotype count data for SNP rs181489 from different populations (data obtained from [27])
Population  Case GG  Case GA  Case AA  Control GG  Control GA  Control AA  Total n
A: Australia  400  402  99  320  307  58  1586 
B: France  244  245  57  5  61  11  623 
C: Germany  86  119  19  16  18  7  265 
D: Germany  222  176  85  133  107  25  748 
E: Germany  144  149  39  169  140  29  670 
F: Greece  44  67  17  44  37  10  219 
G: Greece  119  126  47  130  123  18  563 
H: Ireland  140  147  58  229  157  38  769 
I: Italy  78  86  21  87  71  9  352 
J: Italy  73  88  28  44  43  8  284 
K: Italy  33  47  10  41  21  9  161 
L: Norway  290  233  80  240  228  56  1127 
M: Poland  158  144  47  171  135  30  685 
N: Sweden  50  30  10  91  68  17  266 
O: USA  156  170  50  191  137  32  736 
Table 8 P-values and Z statistics from different methods based on each population of the SNP rs181489 data
Population  ChiSQ  MAX3  GMS  CATT  MERT  New  Z1  Z2  Z3 

A: Australia  0.23  0.19  0.20  0.14  0.11  0.16  0.44  1.66  1.72 
B: France  0.66  0.64  0.47  0.41  0.38  0.51  0.34  0.84  0.90 
C: Germany  0.20  0.18  0.51  0.46  0.31  0.23  0.54  1.70  1.41 
D: Germany  0.011  0.0060  0.0055  0.024  0.014  0.0088  0.090  3.02  2.82 
E: Germany  0.16  0.11  0.089  0.055  0.058  0.099  1.36  1.36  1.69 
F: Greece  0.11  0.080  0.12  0.076  0.11  0.082  2.02  0.51  1.19 
G: Greece  0.0018  0.0011  0.0011  0.0032  0.0013  0.0012  0.63  3.51  3.52 
H: Ireland  1.2e-4  5.8e-5  5.6e-5  2.2e-5  2.3e-5  7.0e-5  2.70  3.28  3.94 
I: Italy  0.055  0.043  0.036  0.020  0.016  0.033  1.35  2.00  2.29 
J: Italy  0.23  0.19  0.15  0.098  0.088  0.15  0.79  1.53  1.71 
K: Italy  0.013  0.018  0.015  0.070  0.15  0.012  2.94  0.31  0.63 
L: Norway  0.18  0.34  0.25  0.94  0.73  0.43  1.31  1.33  0.86 
M: Poland  0.12  0.10  0.081  0.049  0.041  0.077  0.88  1.88  2.05 
N: Sweden  0.69  0.79  0.80  0.78  0.89  0.96  0.78  0.37  0.16 
O: USA  0.0048  0.0031  0.0029  0.0013  0.0020  0.0026  2.66  1.90  2.61 
Conclusions
Although the CATT has been widely used in case–control GWAS under the assumption that the underlying genetic model is additive, its performance may be very poor if the true genetic model is not additive. Therefore, robust but powerful association tests are more appropriate for detecting associated SNPs. Many existing association tests assume that the genetic model is one of the three special genetic models (recessive, additive, and dominant), which may be too strong an assumption in practice. In this paper, we propose a robust association test that does not make a strong assumption about the genetic model. Our simulation results show that even when the GGM assumption is violated (e.g., under the overdominant and underdominant models), the proposed test still has reasonable power, indicating that it is robust. In terms of computational cost, the proposed test is reasonably fast. For instance, it took the author's desktop computer about 70 seconds to obtain the results in Table 8 for the real data application. Our simulation study also confirmed that the proposed test controls the type I error rate at smaller cutoff p-values, e.g., 10^{-4} and 10^{-5} (see Additional file 2: Tables S1–S2).
The test statistic in (3) is based on the idea of combining p-values from independent studies using the chi-square distribution with 1 degree of freedom [26]. Although many other approaches are available in the literature [26, 28–31], choosing the best one, if there is one, remains a research topic. However, it should be noted that for case–control GWAS, a robust method such as the proposed test is desirable because the underlying genetic models vary. In addition, when we combine the p-values from the 15 independent studies using the chi-square distribution with 1 degree of freedom [26], the overall p-value is 2.6 × 10^{-11}.
Through simulation studies and real data applications, we have shown that the proposed test is robust and powerful. In addition, the three statistics, z_{1}, z_{2}, and z_{3}, may also provide useful information about the disease allele and the genetic model.
Abbreviations
GWAS: Genome-wide association study
SNP: Single-nucleotide polymorphism
CATT: Cochran-Armitage trend test
MERT: Max-min efficiency robust test
MAX3: The maximum of the three optimal CATTs under recessive, additive, and dominant models
GMS: Genetic model selection
GGM: Generalized genetic model
CDF: Cumulative distribution function
HWDTT: Hardy-Weinberg disequilibrium trend test
HWE: Hardy–Weinberg equilibrium
Declarations
Acknowledgements
The author gratefully acknowledges the support of several faculty research funds awarded to the author by the Indiana University School of Public Health-Bloomington. The author is also grateful to the three anonymous referees for their constructive comments, which greatly improved the presentation of the manuscript.
Authors’ Affiliations
References
1. Cochran W: Some methods for strengthening the common chi-square tests. Biometrics. 1954, 10 (4): 417-451. 10.2307/3001616.
2. Armitage P: Tests for linear trends in proportions and frequencies. Biometrics. 1955, 11 (3): 375-386. 10.2307/3001775.
3. Zheng G, Freidlin B, Gastwirth JL: Comparison of robust tests for genetic association using case–control studies. IMS Lect Notes-Monogr Ser. 2006, 49: 253-265. (Optimality: The Second Erich L. Lehmann Symposium)
4. Chen Z, Zheng G: Exact robust tests for detecting candidate-gene association in case–control trio design. J Data Sci. 2005, 3: 19-33.
5. Freidlin B, Podgor MJ, Gastwirth JL: Efficiency robust tests for survival or ordered categorical data. Biometrics. 1999, 55 (3): 883-886. 10.1111/j.0006-341X.1999.00883.x.
6. Freidlin B, Zheng G, Li Z, Gastwirth JL: Trend tests for case–control studies of genetic markers: power, sample size and robustness. Hum Hered. 2002, 53 (3): 146-152. 10.1159/000064976.
7. Gonzalez JR, Carrasco JL, Dudbridge F, Armengol L, Estivill X, Moreno V: Maximizing association statistics over genetic models. Genet Epidemiol. 2008, 32 (3): 246-254. 10.1002/gepi.20299.
8. Sasieni PD: From genotypes to genes: doubling the sample size. Biometrics. 1997, 53 (4): 1253-1261. 10.2307/2533494.
9. Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, Vincent D, Belisle A, Hadjadj S, Balkau B, Heude B, Charpentier G, Hudson TJ, Montpetit A, Pshezhetsky AV, Prentki M, Posner BI, Balding DJ, Meyre D, Polychronakos C, Froguel P: A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007, 445 (7130): 881-885. 10.1038/nature05616.
10. Slager SL, Schaid DJ: Case–control studies of genetic markers: power and sample size approximations for Armitage’s test for trend. Hum Hered. 2001, 52 (3): 149-153. 10.1159/000053370.
11. Zheng G, Freidlin B, Li Z, Gastwirth JL: Choice of scores in trend tests for case–control studies of candidate gene associations. Biom J. 2003, 45: 335-348. 10.1002/bimj.200390016.
12. Zheng G, Ng HKT: Genetic model selection in two-phase analysis for case–control association studies. Biostatistics. 2008, 9 (3): 391-399. 10.1093/biostatistics/kxm039.
13. Zang Y, Fung WK, Zheng G: Simple algorithms to calculate the asymptotic null distributions of robust tests in case–control genetic association studies in R. J Stat Softw. 2010, 33 (8): 1-24.
14. Kwak M, Joo J, Zheng G: A robust test for two-stage design in genome-wide association studies. Biometrics. 2009, 65 (4): 1288-1295. 10.1111/j.1541-0420.2008.01187.x.
15. Song K, Elston RC: A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case–control studies. Stat Med. 2006, 25 (1): 105-126. 10.1002/sim.2350.
16. Wang K, Sheffield VC: A constrained-likelihood approach to marker-trait association studies. Am J Hum Genet. 2005, 77 (5): 768-780. 10.1086/497434.
17. Chen Z, Huang H, Ng HKT: Testing for association in case–control genome-wide association studies with shared controls. Stat Methods Med Res. 2013, Published online before print February 1, 2013, doi:10.1177/0962280212474061
18. Chen Z: Association tests through combining p-values for case control genome–wide association studies. Stat Probability Lett. 2013, 83 (8): 1854-1862. 10.1016/j.spl.2013.04.021.
19. Chen Z, Ng HKT: A Robust method for testing association in genome-wide association studies. Hum Hered. 2012, 73 (1): 26-34. 10.1159/000334719.
20. Chen Z, Huang H, Ng HKT: Design and analysis of multiple diseases genome-wide association studies without controls. GENE. 2012, 510 (1): 87-92. 10.1016/j.gene.2012.07.089.
21. Chen Z: A new association test based on Chi-square partition for case-control GWA studies. Genet Epidemiol. 2011, 35 (7): 658-663. 10.1002/gepi.20615.
22. Chen Z, Huang H, Ng HKT: An improved robust association test for GWAS with multiple diseases. Stat Probability Lett. 2014, 91: 153-161.
23. Gastwirth JL: On robust procedures. J Am Stat Assoc. 1966, 61: 929-948. 10.1080/01621459.1966.10482185.
24. Gastwirth JL: The use of maximin efficiency robust tests in combining contingency tables and survival analysis. J Am Stat Assoc. 1985, 80: 380-384. 10.1080/01621459.1985.10478127.
25. Joo J, Kwak M, Zheng G: Improving power for testing genetic association in case–control studies by reducing the alternative space. Biometrics. 2010, 66 (1): 266-276. 10.1111/j.1541-0420.2009.01241.x.
26. Chen Z, Nadarajah S: On the optimally weighted z-test for combining probabilities from independent studies. Comput Stat Data Anal. 2014, 70: 387-394.
27. Elbaz A, Ross OA, Ioannidis J, Soto-Ortolaza AI, Moisan F, Aasly J, Annesi G, Bozi M, Brighina L, Chartier-Harlin MC, Destée A, Ferrarese C, Ferraris A, Gibson JM, Gispert S, Hadjigeorgiou GM, Jasinska-Myga B, Klein C, Krüger R, Lambert JC, Lohmann K, van de Loo S, Loriot MA, Lynch T, Mellick GD, Mutez E, Nilsson C, Opala G, Puschmann A, Quattrone A, et al: Independent and joint effects of the MAPT and SNCA genes in Parkinson disease. Ann Neurol. 2011, 69 (5): 778-792. 10.1002/ana.22321.
28. Chen Z, Nadarajah S: Comments on ‘Choosing an optimal method to combine p‐values’ by Sungho Won, Nathan Morris, Qing Lu and Robert C. Elston, Statistics in Medicine 2009; 28: 1537–1553. Stat Med. 2011, 30 (24): 2959-2961. 10.1002/sim.4222.
29. Chen Z: Is the weighted z‐test the best method for combining probabilities from independent tests? J Evol Biol. 2011, 24 (4): 926-930. 10.1111/j.1420-9101.2010.02226.x.
30. Loughin TM: A systematic comparison of methods for combining p-values from independent tests. Comput Stat Data Anal. 2004, 47 (3): 467-485. 10.1016/j.csda.2003.11.020.
31. Fisher RA: Statistical Methods for Research Workers. 1932, Edinburgh: Oliver and Boyd
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.