 Research Article
 Open access
 Published:
A novel estimator of betweenstudy variance in randomeffects models
BMC Genomics volumeÂ 21, ArticleÂ number:Â 149 (2020)
Abstract
Background
With the rapid development of highthroughput sequencing technologies, many datasets on the same biological subject are generated. A metaanalysis is an approach that combines results from different studies on the same topic. The randomeffects model in a metaanalysis enables the modeling of differences between studies by incorporating the betweenstudy variance.
Results
This paper proposes a moments estimator of the betweenstudy variance that represents the acrossstudy variation. A new randomeffects method (DSLD2), which involves twostep estimation starting with the DSL estimate and the \(D_{g}^{2}\) in the second step, is presented. The DSLD2 method is compared with 6 other metaanalysis methods based on effect sizes across 8 aspects under three hypothesis settings. The results show that DSLD2 is a suitable method for identifying differentially expressed genes under the first hypothesis. The DSLD2 method is also applied to Alzheimerâ€™s microarray datasets. The differentially expressed genes detected by the DSLD2 method are significantly enriched in neurological diseases.
Conclusions
The results from both simulationes and an application show that DSLD2 is a suitable method for detecting differentially expressed genes under the first hypothesis.
Background
With the advances of highthroughput experimental technology, a multitude of datasets have been produced and have resulted in several public databases, such as the European Bioinformatics Institute (EBI) and Gene Expression Omnibus database (GEO) [1]. A major challenge is how to reexploit, reextract and combine the information from a large number of datasets [2]. A metaanalysis, combining data or results from independent studies on the same topic, is widely applied and the major contribution is discovering disease pathogenesis [3, 4]. The statistical power could be raised through metaanalysis by combining information from individual studies that have small sample sizes [5, 6]. Although many significantly differential gene expression lists are presented, individual conclusions tend to be discordant because of various study designs, individual treatment protocols, limited sample sizes and different genders among the study participants [7]. Metaanalysis is an important method for providing reliable and consistent differentially expressed gene lists by integrating information on the same disease [8]. As metaanalysis methods use available datasets, they are relatively inexpensive [9]. But not all datasets are usually available due to publication bias and outcome reporting bias [10].
Metaanalysis methods based on effect sizes, which contribute to the early diagnosis and treatment of diseases, can be broadly divided into two classes: the fixedeffects model (FEM) and the randomeffects model (REM) [11]. The fixedeffects model assumes that all studies in a metaanalysis have the same true effect size [12]. The randomeffects model assumes that different studies in a metaanalysis have the different true effect sizes [12]. Metaanalyses were first introduced to microarray data by Rhode et al. (2002) [13] and Choi et al. (2003) [14]. Many metaanalysis methods, including DerSimonian and Laird estimate (DSL) [15], restricted maximum likelihood estimate (RML) and Sidik and Jonkman estimate (SJ) [16], were later applied to microarray studies. Twostep estimate starting with the DSL estimate (DL2) is an iterative estimator. The twostep method DL2 and the iterative Paule and Mandel method are close [17]. The randomeffects methods in metaanalysis make possible the modeling of differences and the differences between studies often caused by the study design, sample sizes, sex/gender differences in participants and so on. The betweenstudy variance Ï„^{2} is incorporated by randomeffects methods in metaanalyses to estimate the acrossstudy variation [18]. The fixedeffects model in metaanalyses excludes the betweenstudy variance Ï„^{2} from the randomeffects model [19].
This paper develops an estimator of the betweenstudy variance \(D_{g}^{2}\) which originates from the general moments estimator described by DerSimonian and Kacker (2007). Therefore, a new randomeffects method (DSLD2) based on \(D_{g}^{2}\) is presented. In subsequent sections, three hypothesis testing frameworks were thoroughly reviewed. We observed the biases and root mean square errors (RMSE) of between study variance \(D_{g}^{2}\). The randomeffects method based on \(D_{g}^{2}\) and other metaanalysis models were applied to simulation datasets of gene expression levels. Then, we compared the DSLD2 method with other metaanalysis methods, including the DSL method, the DSLR2 method, the fixedeffects model, the PM method, the RML method and the SJ method across the following metrics: the false discovery rates (FDRs), accuracy, precision, false positive rate (FPR), sensitivity, precisionrecall curve and the receiver operating characteristic curve (ROC). DSLD2 performed well among the metaanalysis methods based on effect sizes under the first hypothesis. We also applied DSLD2 to Alzheimerâ€™s disease. The pathways of differentially expressed genes detected by the DSLD2 method indicate that Alzheimerâ€™s disease is related to the nervous system, which is obvious. The results from both the simulation and the application suggest that DSLD2 is appropriate for identifying differentially expressed genes. In addition, we prove the reasonableness of the betweenstudy variance \(D_{g}^{2}\) in Additional file 1.
Methods
Underlying hypothesis settings
Statistical hypothesis tests are primarily used in metaanalyses to identify differentially expressed genes, and three common hypothesis testing frameworks are often applied [20]. In the first hypothesis test, targeted biomarkers are differentially expressed genes with nonzero effect sizes in all studies. The null and alternative hypotheses are as follows:
where Î¸_{ig} denotes the underlying true effect size for gene g in study i (i=1,2,â‹¯,k), k is the number of studies in a metaanalysis. The second hypothesis test aims to determine a differentially expressed gene with nonzero effect sizes in one or more studies. The null and alternative hypotheses are as follows:
The third hypothesis test aims to determine a differential gene expression if it has nonzero effect sizes in the majority of studies (half or more). The null and alternative hypotheses are as follows:
where the indicator function is denoted by I(.), which takes a value of 0 if Î¸_{ig}=0 and a value of 1 if Î¸_{ig}â‰ 0. r is the number of studies that we identify a differentially expressed gene in at least r studies. r is usually set as greater than 0.5k. For instance, we can define a differentially expressed gene if it is significant in at least 4 (r=4) of 8 studies.
Metaanalysis methods based on effect sizes
Fixedeffects model
The fixedeffects model (FEM) assumes that all studies included in the metaanalysis have the same true effect size and that the difference in the observed effect between combined studies is caused by random error [21]. The observed effect sizes of each study are combined with a simple linear model.
Randomeffects model
Let Î¼_{g} be the overall mean for gene g, which is a typical parameter of interest. y_{ig} denotes the observed effect size for gene g in study i (i=1,2,â‹¯,k). The randomeffects model is given by
where Î¾_{ig} is the random effect for gene g in study i and obeys a normal distribution with mean 0 and variance \(\tau _{g}^{2}, \sigma _{ig}^{2}\) is the withinstudy variance representing the sampling error for gene g in study i, and \(\tau _{g}^{2}\) denotes the betweenstudy variance which is the variability between studies. If \(\tau _{g}^{2}=0\), then the randomeffects model reduces to a fixedeffects model. If \(\hat {\sigma }_{ig}^{2} (i=1,2,\cdots,k)\) and \(\hat {\tau }_{g}^{2}\) are the estimates of \(\sigma _{ig}^{2} (i=1,2,\cdots,k)\) and \(\tau _{g}^{2}\), the overall mean Î¼_{g} can be estimated by
DerSimonian and Laird estimate
The betweenstudy variance \(\tau _{g}^{2}\) can be estimated by the DerSimonianâ€“Laird (DSL) method
where \(Q_{g}=\sum _{i=1}^{k}\omega _{ig}(y_{ig}M)^{2}, \omega _{ig}=\hat {\sigma }_{ig}^{2}, M=\sum _{i=1}^{k}\omega _{ig}y_{i}/\sum _{i=1}^{k}\omega _{ig}\) [22]. The estimator is not unbiased, but it is the simplest [23]. The DSL estimator is the most widely used method [24].
Twostep estimation starting with the DSL estimate and the \(R_{g}^{2}\) in the second step (DSLR2)
DSLR2 is a randomeffects model based on the betweenstudy variability \(R_{g}^{2}\) [17, 25], which yields
where \(\omega _{ig}^{*}=\left (\hat {\sigma }_{ig}^{2} + \hat {\tau }_{g}^{2}(DSL)\right)^{1}, M_{g}^{*}=\sum _{i=1}^{k}\omega _{ig}^{*}y_{i}/ \sum _{i=1}^{k}\omega _{ig}^{*}.\)
Paule and Mandel estimate (PM)
\(\hat {\tau }_{g}^{2}(PM)\) is the unique solution of \(\sum _{i=1}^{k}\omega _{ig}^{*} \left (y_{ig}M_{g}^{*}\left (\hat {\tau }_{g}^{2}(PM)\right)\right)(k1)=0\), where \(\omega _{ig}^{*}=(\hat {\sigma }_{ig}^{2} + \hat {\tau }_{g}^{2}(PM))^{1}\) [26]. Negative \(\hat {\tau }_{g}^{2}(PM)\) estimates truncate to 0. \(\hat {\tau }_{g}^{2}(PM)\) is estimated and we can substitute it in \(\omega _{ig}^{*}=\left (\hat {\sigma }_{ig}^{2} + \hat {\tau }_{g}^{2}(PM)\right)^{1}\) to obtain \(M_{g}^{*}\left (\hat {\tau }_{g}^{2}(PM)\right) = \sum _{i=1}^{k}\omega _{ig}^{*}y_{i}/\sum _{i=1}^{k}\omega _{ig}^{*}\) [26].
Restricted maximum likelihood estimate
The method of restricted maximum likelihood estimate (RML) can be used to calculate the estimators of overall mean value Î¼_{g} and betweenstudies variance \(\tau _{g}^{2}\) of a randomeffects metaanalysis model [27]. The loglikelihood function based on the linear mixed effects model is
The loglikelihood can be maximized using the Fisher scoring algorithm to obtain the estimates of Î¼_{g} and \(\tau _{g}^{2}\). Negative \(\tau _{g}^{2}\) estimates are truncated to 0 [27].
Sidik and Jonkman estimate
The following twostep estimator of betweenstudy variance \(\tau _{g}^{2}\) was proposed by Sidik and Jonkman [28, 29]
where \(M_{g}^{*}(SJ)=\sum _{i=1}^{k}\omega _{ig}^{*}y_{ig}/\sum _{i=1}^{k}\omega _{ig}^{*}, \omega _{ig}^{*}=\frac {1}{1+\hat {\sigma }_{ig}^{2}/\hat {\tau }_{0g}^{2}}, \hat {\tau }_{0g}^{2}=max\left \{0.01, \frac {1}{k1}\sum _{i=1}^{k}(y_{ig}y_{Ag})^{2}\frac {1}{k}\sum _{i=1}^{k}\hat {\sigma }_{ig}^{2}\right \}, y_{Ag}=\frac {1}{k}\sum _{i=1}^{k}y_{ig}\).
Twostep estimation starting with the DSL estimate and the \(D_{g}^{2}\) in the second step (DSLD2)
The main component of the randomeffects metaanalysis model is the betweenstudy variability. We develop a betweenstudy variability estimator \(D_{g}^{2}\), which estimates the amount of conditional variance in y_{ig}, which yields
where \(Q_{g}=\sum _{i=1}^{k}\frac {(y_{ig}M_{g})^{2}}{\hat {\sigma }_{ig}^{2}}, S_{MM,g}=\sum _{i=1}^{k}\frac {(y_{ig}M_{g}^{*})^{2}}{\hat {\sigma }_{ig}^{2}+\hat {\tau }_{g}^{2}}, M_{g}=\frac {\sum _{i=1}^{k}\omega _{ig}y_{ig}}{\sum _{i=1}^{k}\omega _{ig}}, \omega _{ig}=\hat {\sigma }_{ig}^{2}\) and \(M_{g}^{*}=\frac {\sum _{i=1}^{k}y_{ig}/\left (\hat {\sigma }_{ig}^{2} + \hat {\tau }_{g}^{2}\right)}{\sum _{i=1}^{k}1/\left (\hat {\sigma }_{ig}^{2} + \hat {\tau }_{g}^{2}\right)}\).
Such an estimator of the betweenstudy variance is always greater than 0 and indicates how strong the random effects are. The algorithm of the DSLD2 method is as follows:

Calculate Q_{g} and \(\hat {\tau }_{g}^{2}\) in Eq. (2),

Calculate \(M_{g}^{*}\) in Eq. (1),

Calculate \(D_{g}^{2}\) in Eq. (3) and

Replace \(\hat {\tau }_{g}^{2}\) with \(D_{g}^{2}\) in Eq. (1).
The weights, overall mean estimator, variance of the overall mean estimator, bounds of the confidence interval and zstatistics based on the betweenstudy variance \(D_{g}^{2}\) can be obtained by
and
Simulation and application
Metaanalysis methods used in simulation datasets
Two class simulation datasets were generated to observe the performance of DSLD2 method. The methods used in simulation datasets of gene expression levels were the fixedeffects model (FEM), the randomeffects model based on DerSimonian and Laird estimate for \(\tau _{g}^{2}\) (DSL), the randomeffects model based on the betweenstudy variance estimotor \(R_{g}^{2}\) (DSLR2), the randomeffects model based on Paule and Mandel estimate for \(\tau _{g}^{2}\) (PM), the randomeffects model based on the restricted maximum likelihood estimate for \(\tau _{g}^{2}\) (RML), the randomeffects model based on Sidik and Jonkman estimate for \(\tau _{g}^{2}\) (SJ) and the randomeffects model based on the betweenstudy variance estimote \(D_{g}^{2}\) (DSLD2). We compared the performances of DSLD2 method and other 6 metaanalysis methods based on effectsizes in histograms, precision, accuracy, the false discovery rates (FDRs), false positive rate (FPR), Matthews correlation coefficient (MCC), sensitivity, receiver operating characteristic curves (ROC) and precisionrecall curves under three hypotheses using simulation datasets of gene expression levels. We reported the bias and root mean square error (RMSE) of the betweenstudy variance estimators \(D_{g}^{2}\) through Monte Carlo simulation datasets.
Simulation setting of gene expression levels
A common method was used to produce simulation data for comparing the ability of detecting DE genes among 16 metaanalysis methods under the three hypothesis settings [30]. Five studies were simulated (k=1,2,â‹¯,5). Each study contained 2000 genes and 2N samples (2N=10,20,60,100,140,180,220). In each study, the first N samples were controls, and the last N samples were cases. Each sample in each study contained 40 gene clusters (C_{g}=1,2,â‹¯,40), and each cluster included 20 genes \((\sum I(C_{g}=c)=20,c=1,2,\cdots,40).\) The remaining 1200 genes had 0 gene clusters \((\sum I(C_{g}=0)=1200)\). The first 1000 genes in each study were divided into 5 groups (k_{g}=1,2,3,4,5). The first 200 genes were put into the first group (k_{g}=1). The 201th gene to the 400th gene were put into the second group (k_{g}=2). The 401th gene to the 600th gene were put into the third group (k_{g}=3). The 601th gene to the 800th gene were put into the fourth group (k_{g}=4). The 801th gene to the 1000th gene were put into the fifth group (k_{g}=5). The 1001th gene to the 2000th gene were put into the zeroth group (k_{g}=0). The simulation algorithm is summarized as follows:

We sampled \(\sum _{ck}^{'}\sim W^{1}(\psi,60)\) for genes in cluster c (1â‰¤câ‰¤40) and study k (1â‰¤kâ‰¤5), where Ïˆ=0.5I_{20Ã—20}+0.5J_{20Ã—20},I_{20Ã—20} was the identity matrix, J_{20Ã—20} was the matrix in which all elements equal 1, and W^{âˆ’1} denoted the inverse Wishart distribution. We then standardized \(\sum _{ck}^{'}\) into \(\sum _{ck}\) with all diagonal elements equaling 1.

We sampled the expression levels of genes in clusters c and n as \(\left (X_{g_{c1}nk}^{'}, \cdots, X_{g_{c20}nk}^{'}\right)^{T} \sim MVN\left (0, \sum _{ck}\right)\), where 1â‰¤nâ‰¤2N,1â‰¤câ‰¤40 and 1â‰¤kâ‰¤5. The gene expression levels are \(g \sim N\left (0,\sigma _{k}^{2}\right)\) for the gene in cluster 0, where \(\sigma _{k}^{2} \sim U(0.8,1.2), 1 \leq n \leq 2N\) and 1â‰¤kâ‰¤5.

We randomly sampled Î´_{gk}âˆˆ{0,1} such that \(\sum _{k=1}^{5} \delta _{gk}=k_{g} (k_{g}=1,2,\cdots 5)\). When Î´_{gk}=1, the gene g in study k was DE, and we sampled Î¼_{gk}âˆ¼U(0.5,3). The expression level of the control samples remained unchanged, and the case samples were \(Y_{gnk}=X_{g(n+N)k}^{'} + \mu _{gk} \cdot \delta _{gk}\), where 1â‰¤gâ‰¤2000,1â‰¤nâ‰¤N, and 1â‰¤kâ‰¤5.
Thus, the numbers for truly differentially expressed genes were 200, 1000 and 600 under the first hypothesis, the second hypothesis and the third hypothesis, respectively.
Simulation setting using Monte Carlo method
Let \(X_{ijg}^{ctrl}\) and \(X_{ijg}^{case}\) be the observations of gth iteration for jth samples in the ith study from a control and a case group. Assume that \(X_{ijg}^{ctrl}\) was sampled \(N\left (\mu _{i}^{ctrl}, \sigma _{i}^{2}\right)\) and \(X_{ijg}^{case}\) was sampled \(N\left (\mu _{i}^{case}, \sigma _{i}^{2}\right)\). Let \(n_{i}^{ctrl}\) and \(n_{i}^{case}\) be the sample sizes in ith study. To simplify things, it was set that \(n_{i}=n_{i}^{ctrl}=n_{i}^{case}, \sigma _{i}^{2}=10\) and \(\mu _{i}^{ctrl}=0\). \(\mu _{i}^{case}\) was sampled from N(0,Ï„^{2}) The following factors were set in the simulations: \(k=(5,10,20,40,80), \tau ^{2}=(0.0, 1.0), \overline {n}_{i}=40\) and g=1,2,â‹¯,1000. The values of n_{i} were sampled from N(40,(40/3)^{2}). The standardized mean difference (SMD) and the mean difference (MD) were chosen as the effect size measures.
Results
Simulation results
The numbers of differentially expressed genes with p<0.05 (DE_{1}) or FDR<0.05(DE_{2}) identified by various metaanalysis models are presented in Table 1 [31]. More differentially expressed genes were identified by the fixedeffects model. The DSLD2 method detected fewer DE genes than the FEM and SJ methods. All methods had normal FDR_{1} levels and FDR_{2} levels except the FEM method. The FDR_{2} of FEM is 0.3808 and greater than other metaanalysis methods. The FDR_{1} value of DSLD2 method was 0.0165, which was greater than that of the DSLR2 methods. However, the FDR_{1} value of the DSLD2 method was smaller than that of the other 5 metaanalysis methods. The FDR_{2} value of the DSLD2 method was 0.0236, which was the smallest among 7 metaanalysis methods based on effect sizes.
Histograms were constructed to compare the differences in differentially expressed genes (p<0.05) among different groups detected by various metaanalysis methods (see Fig. 1). The numbers of studies that were differentially expressed for gene g in 1âˆ¼200,201âˆ¼400,401âˆ¼600,601âˆ¼800,801âˆ¼1000,1001âˆ¼2000 were 1, 2, 3, 4, 5 and 0, respectively. The DSLD2 method identified fewer DE genes in group 1, group 2 and group 0 (see Fig. 1). More differentially expressed genes were detected by the DSLD2 method in groups 3, 4 and 5 (see Fig. 1). The DE genes discovered by the DSLD2 method showed an increasing trend, and the differentially expressed genes in group 5 could be completely identified by the DSLD2 method (see Fig. 1). The numbers of DE genes identified by the DSLD2 method in every group were consistent with the data simulation method (see Fig. 1).
Precision is an important descriptor of random errors. Line graphs and tables were constructed to compare the precision among 7 metaanalysis methods (see Fig. 2, Additional file 3: Figures S1S2 and Additional file 4: Tables S1â€“S3). The precision of all the methods increased significantly from 10 to 60 samples and fluctuated slightly between 60 and 220 samples under the first hypothesis, the second hypothesis and the third hypothesis (see Fig. 2, Additional file 3: Figures S1â€“S2). The precision of the DSLD2 method was lower than other methods in 10 studies, however, the precision values of DSLD2 method went up to 1.0 when numbers of sample sizes per study were larger than 60 under the first hypothesis. Under the first hypothesis, the DSLR2 method had the lowest precision among the metaanalysis methods combining effect sizes. Under the second and third hypothesis, the FEM method had the highest precision values among 7 metaanalysis methods combining effect sizes (see Additional file 3: Figures S1, S2 and Additional file 4: Tables S2, S3).
Accuracy is a critical descriptor of systematic errors. Among the metaanalysis methods based on effect sizes, DSLD2 had the highest accuracy among 7 metaanalysis methods based on effect sizes under the first hypothesis (see Fig. 3 and Additional file 4: Table S4). Under the first hypothesis, the accuracy of the DSLD2 method experienced a decrease from 10 to 100 samples and tended to be steady between 100 and 220 samples (Fig. 3). The accuracy of FEM method is the lowest among 7 metaanalysis methods based on effect sizes under the first hypothesis (see Fig. 3 and Additional file 4: Table S4). Under the second hypothesis, the accuracy of FEM method was highest among 7 metaanalysis methods (Additional file 3: Figure S3 and Additional file 4: Table S5). Under the third hypothesis, the SJ method had the highest accuracy values among 7 metaanalysis methods when the numbers of sample sizes per study were between 60 and 220 (Additional file 3: Figure S4 and Additional file 4: Table S6).
The false positive rate (FPR) is the probability of falsely rejecting the null hypothesis of a test. Under the first hypothesis, DSLD2 had the highest FPR value when number of sample sizes per study was 10 (see Fig. 5 and Additional file 4: Table S7). However, the FPR value of DSLD2 method went down to 0.0 when numbers of sample sizes per study were larger than 60 under the first hypothesis (see Fig. 4 and Additional file 4: Table S7). Under the first hypothesis, the DSLR2 method had the highest FPR values when numbers of sample sizes per study were more than 60 (see Fig. 4 and Additional file 4: Table S7). Under the second and the third hypothesis, the FEM method had the lowest FPR values among 7 metaanalysis methods (see Additional file 3: Figures S5, S6 and Additional file 4: Tables S8, S9).
The Matthews correlation coefficient (MCC), a numerical measure of correlation, indicates a statistical relationship between the predicted and observed binary classifications. An MCC close to 1 denotes perfect prediction. Under the first hypothesis, the DSLD2 method had the highest MCC among the 7 metaanalysis methods based on effect sizes (see Fig. 5 and Additional file 4: Table S10). The FEM method had the lowest MCC values among the 7 metaanalysis methods under the first hypothesis (see Fig. 5 and Additional file 4: Table S10). Under the first hypothesis, the SJ method had the lowest MCC values among the 6 randomeffects metaanalysis methods (see Fig. 5 and Additional file 4: Table S10). Under the second, the FEM method had the highest MCC values among the 7 metaanalysis methods (see Additional file 3: Figure S7 and Additional file 4: Table S11). Under the third hypothesis, the SJ method had the highest MCC values among 7 metaanalysis methods based on effect sizes when the numbers of sample sizes per study were between 60 and 220 (see Additional file 3: Figure S8 and Additional file 4: Table S12).
Sensitivity is a statistical measure of the performance of binary classification tests. Under the first hypothesis, the DSLD2 method had the highest sensitivity values among the 7 metaanalysis methods based on effect sizes (see Fig. 6 and Additional file 4: Table S13). The FEM method had the lowest sensitivity values among the 7 metaanalysis methods under the first hypothesis (see Fig. 6 and Additional file 4: Table S13). Under the second hypothesis, the 7 metaanalysis methods had close sensitivity curves (see Additional file 3: Figure S9 and Additional file 4: Table S14). Under the third hypothesis, the randomeffect metaanalysis methods had close sensitivity curves which are higher than the curve of the FEM method. (see Additional file 3: Figure S10 and Additional file 4: Table S15).
The receiver operating characteristic curve (ROC) is a tool for selecting possibly optimal models, and the area under the curve (AUC) measures how well two diagnostic results can be distinguished. AUCâˆˆ(0.9,1.0],AUCâˆˆ(0.7,0.9] and AUCâˆˆ(0.5,0.7] represent high, moderate and low accuracy, respectively. The DSLD2 method had AUC values of 0.996, 0.940 and 0.979 under the first, second, and third hypotheses, respectively. Under the first hypothesis, the DSLD2 method had the highest roc curve among all 7 metaanalysis methods (see Fig. 7). Under the second hypothesis, the roc curve of the DSLD2 method was the highest among 6 randomeffects methods (see Additional file 3: Figure S11). Under the third hypothesis, the roc curve of the DSLD2 method was highest among the 7 metaabnalysis methods based on effect sizes (see Additional file 3: Figure S12).
When the labels are highly imbalanced, ROCAUC may give pretty good results and be misleading. Precisionrecall plots could provide the researcher with a more accurate prediction because they evaluate the proportion of true positives among positive predictions [32]. Under the first hypothesis, the precisionrecall curve of DSLD2 method was the highest among seven metaanalysis methods (see Fig. 8). The precisionrecall curves of FEM and DSLR2 were lower than other curves under the first hypothesis (see Fig. 8). The DSL, PM, RML and SJ methods had almost the same precisionrecall curve under the first hypothesis (see Fig. 8). Under the second hypothesis, the precisionrecall curve of DSLD2 method was the highest among the curves of randomeffects metaanalysis methods (see Additional file 3: Figure S13). The randomeffects metaanalysis methods had close precisionrecall curves which are lower than the curve of FEM under the third hypothesis (see Additional file 3: Figure S14).
Bias and root mean square error (RMSE) are outcomes directly related to the betweenstudy variance estimator \(D^{2}_{g}\). The DSLD2, DSL, PM and RML methods had close bias and RMSE curves when Ï„^{2} was set to 0.0 and 1.0 (see Figs. 9, 10, 11 and 12, Additional file 3: Figures S15â€“S18 and Additional file 4: Tables S16S23). The bias and RMSE curves of DSLD2, DSL, PM and RML methods were lower than that of SJ and DSLR2 methods when SMD was chosen as the effect size measure and Ï„^{2} was set to 0.0 (see Figs. 9 and 10). The DLSR2 method had the lowest bias and RMSE curves when MD was chosen as the effect size measure and Ï„^{2} was set to 0.0 (see Figs. 11 and 12). The bias and RMSE values of DSLD2, DSL, PM and RML methods were lower than that of the SJ method when MD was chosen as the effect size measure and Ï„^{2} was set to 0.0 (see Figs. 11 and 12). The DSLD2, DSL, PM and RML methods had the close bias and RMSE curves when the between study variance was set to 1.0 (Additional file 3: See Additional file 3: Figures S15â€“S18).
The DSLD2, DSL, PM and RML methods had close mean values of I^{2} (see Figs. 13 and 14, Additional file 3: Figures S19â€“S20 and Additional file 4: Tables S24S27). The I^{2} curves of DSLD2, DSL, PM and RML methods were lower than that of DSLR2 and SJ methods when SMD was chosen as the effect size measure (see Fig. 13 and Additional file 3: Figure S19). The I^{2} values of DSLD2, DSL, PM and RML methods were higher than that of DSLR2 method when MD was chosen as the effect size measure (see Fig. 14 and Additional file 3: Figure S20). The SJ method had the highest I^{2} curves when MD was chosen as the effect size measure (see Fig. 14 and Additional file 3: Figure S20).
Application in genomic data
Alzheimerâ€™s gene expression datasets
Alzheimerâ€™s disease (AD), a neurodegenerative disease, is common in elderly indiviuals [33]. The incidence of AD has increased and is increasingly diagnosed in younger individuals. However, the etiology of AD is still unknown [34]. In this section, we used the DSLD2 method to analyze Alzheimerâ€™s disease from a genetic perspective. Seven public AD gene expression datasets of the hippocampus from postmortem brain samples were used in this paper. The phenotypic and gene expression data are available through GEO accession numbers GSE36980 [35], GSE29378 [36], GSE84422 [37], GSE1297 [38], GSE5281 [39â€“41], GSE28146 and GSE48350 [42â€“48]. After withinstudy data preprocessing, filtering out genes with very low gene expression and excluding small variation genes, the metaanalysis of the DSLD2 method was conducted on 3257 target genes in 305 subjects (168 AD and 137 controls).
A Venn diagram was plotted to compare DE genes (p<0.01) detected by the DSLD2, PM, SJ and RML methods. The DSLD2, PM, SJ and RML methods identified 364, 454, 611, 410 significantly DE genes (p<0.01), respectively (Fig. 15). The four metaanalysis methods found 299 overlapping DE genes. The DE genes detected by the DSLD2 method were different from DE genes identified by the PM, SJ and RML methods.
To biologically annotate the differentially expressed genes identified by DSLD2, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was performed using overrepresentation analysis (ORA), and the first ten pathways are listed in Table 2. The differentially expressed genes with p<0.001 were significantly enriched in the neurological disease pathways, including the MAPK signaling pathway (hsa04010), the ErbB signaling pathway (hsa04012), Helicobacter pylori infectioninduced epithelial cell signaling (hsa05120) and the hippo signaling pathway (hsa04392). Many studies have shown that Alzheimerâ€™s disease is closely related to the MAPK signaling pathway. For example, Eun Kyung and EuiJu reported that deviation from the control of the MAPK signaling pathway influenced the progression of Alzheimerâ€™s disease [49]. ErbB, a key NRG1 receptor, plays a significant role in the development and plasticity of Alzheimerâ€™s disease. Woo et al. showed that the upregulation of ErbB4 immunoreactivity implicates the development of AD pathology [50]. The relationship between Helicobacter pylori infection (HpI) and Alzheimerâ€™s disease was investigated by histological diagnosis [51]. Studies have shown that the pathophysiology of AD is influenced by Helicobacter pylori infection through many mechanisms [51]. Many studies have suggested that Alzheimerâ€™s disease is related to the hippo signaling pathway [52].
Discussion and conclusion
This paper proposed a metaanalysis method (DSLD2) based on new betweenstudy variance estimator \(D_{g}^{2}\). The biases and RMSE of \(D_{g}^{2}\) were lowest among 6 meta analysis methods when Ï„^{2} was set to 0 and SMD was chosen as the effect size measure (see Figs. 9 and 10). The DSLD2, DSL, PM and RML methods had close bias and RMSE values when Ï„^{2} was set to 0 or 1 and SMD or MD was chosen as the effect size measure (see Figs. 9, 10, 11 and 12 and Additional file 3: Figures S15â€“S18). The I^{2} values of DSLD2, DSL, PM and RML methods were close when the Ï„^{2} is set to 0.0 and 1.0 (see Figs. 13 and 14 and Additional file 3: Figures S19â€“S20).
We applied 7 metaanalysis methods based on effect sizes to simulation datasets of gene expression levels and compared the performance between the DSLD2 method and the other metaanalysis models. The FDR_{1} values of DSLD2 were smaller than that of DSL, PM, FEM, RML and SJ methods (see Table 1). The DSLD2 method had the lowest FDR_{2} values among the 7 metaanalysis methods based on effect sizes (see Table 1).
Under the first hypothesis, the precision, accuracy, sensitivity, FPR and MCC of the DSLD2 method varied greatly from 10 to 20 samples but tended to be stable between 60 and 220 samples (see Figs. 2, 3, 4, 5 and 6). The accuracy, MCC, sensitivity, ROC and precisionrecall curve of the DSLD2 method were the highest among the 7 metaanalysis methods (see Figs. 3, 5, 6, 7 and 8). The precision of DSLD2 method wen up to 1.0 when the number of sample sizes per study was larger than 60 (see Fig. 2). The FPR of DSLD2 method wen down to 0.0 when the number of sample sizes per study was larger than 60 (see Fig. 4). The FEM method had the lowest curves of precision, accuracy, sensitivity, FPR and MCC (see Figs. 2, 3, 4, 5 and 6). The curves of precision, accuracy, sensitivity, FPR and MCC for the SJ method was lowest among randomeffects metaanalysis methods (see Figs. 2, 3, 4, 5 and 6). The results of this simulation show that DSLD2 is a suitable method for detecting differentially expressed genes under the first hypothesis.
Under the second hypothesis, the DSLD2 and DSLR2 methods had the highest sensitivity values of approximately 1.0 (see Additional file 3: Figure S9). The ROC curve and precisionrecall curve of DSLD2 method were the highest among 6 randomeffects methods (see Additional file 3: Figures S11 and S13). The FEM method had the highest values of the precision, accuracy, FPR and MCC among 7 metaanalysis methods based on effect sizes (see Additional file 3: Figures S1, S3, S5 and S7).
Under the third hypothesis, the DSLD2 method had the high sensitivity values of approximately 1.0 (see Additional file 3: Figure S10). The ROC curve and precisionrecall curve of DSLD2 method were the highest among 6 randomeffects methods (see Additional file 3: Figures S12 and S14). The SJ method had the highest values of accuracy and MCC when number of sample sizes per study was between 60 to 220 (see Additional file 3: Figures S4 and S8). The FEM method had the highest precision values and the lowest FPR values (see Additional file 3: Figures S2 and S6).
We also applied the DSLD2 method to microarray data of Alzheimerâ€™s disease. The differentially expressed genes with p<0.01 were significantly enriched in the neurological disease pathways, including the MAPK signaling pathway, the ErbB signaling pathway, Helicobacter pylori infectioninduced epithelial cell signaling and the hippo signaling pathway. Moreover, many previous studies suggest that Alzheimerâ€™s disease is related to pathways that DSLD2 discovered [49â€“52].
Availability of data and materials
The simulation datasets has been made publicly available (https://github.com/andwisdom/Anovelestimatorofbetweenstudyvarianceinrandomeffectsmodels). The code of simulation datasets is shared in the articleâ€™s supplementary information files. The AD datasets analysised during the current study are available under entry numbers GSE36980, GSE29378, GSE84422, GSE1297, GSE5281, GSE28146 and GSE48350 from the GEO database in NCBI (http://www.ncbi.nlm.gov/geo). The R code used in this manuscript is public available in Additional fileÂ 5.
Abbreviations
 AD:

Alzheimerâ€™s disease
 AUC:

The area under the receiver operating characteristic curve
 DSL:

DerSimonian and Laird estimate
 DSLD2:

Twostep estimation starting with the DSL estimate and the \(D_{g}^{2}\) in the second step
 DSLR2:

Twostep estimation starting with the DSL estimate and the R^{2} in the second step
 EBI:

European Bioinformatics Institute
 FDR:

False discovery rate
 FEM:

Fixedeffects model
 FPR:

False positive rate
 GEO:

Gene Expression Omnibus database
 I ^{2} :

Heterogeneity statistic I^{2}
 PM:

Paule and Mandel estimate
 REM:

Randomeffects model
 RML:

Restricted maximum likelihood estimate
 RMSE:

Root mean square error
 ROC:

Receiver operating characteristic curve
 SJ:

Sidik and Jonkman estimate
References
Sudmant PH, Alexis MS, Burge CB. Metaanalysis of RNAseq expression data across species, tissues and studies,. Genome Biol. 2015; 16:287.
Bagos PG. Genetic model selection in genomewide association studies: robust methods and the use of metaanalysis,. Stat Appl Genet Mol. 2013; 12:285â€“308.
Barendregt JJ, Doi SA, Lee YY. Metaanalysis of prevalence,. J Epidemiol Community Health. 2013; 67:974â€“378.
Panagiotou OA, Willer CJ, Hirschhorn JN. The power of metaanalysis in genomewide association studies. Annu Rev Genomics Hum Genet. 2013; 14:441â€“65.
GonzalezCastro TB, TovillaZarate AC. Metaanalysis: a tool for clinical and experimental research in psychiatry. Nord J Psychiat. 2014; 68:243â€“50.
Lee CH, Eskin E. Increasing the power of metaanalysis of genomewide association studies to detect heterogeneous effects. Bioinformatics. 2017; 33:379â€“88.
Bolanos RD, Calderon MC. Introduction to traditional metaanalysis. Rev Gastroenterol Peru. 2014; 34:45â€“51.
Ma T, Huo Z, Kuo A, Zhu L, Fang Z, Zeng X, Lin CW, Liu S, Wang L, Liu P, Rahman T, Chang LC, Kim S, Li J, Park Y, Song C, Oesterreich S, Sibille E, Tseng GC. Metaomics: analysis pipeline and browserbased software suite for transcriptomic metaanalysis. Bioinformatics. 2019; 35:1597â€“9.
Tseng GC, Feingold DG. Comprehensive literature review and statistical considerations for microarray metaanalysis. Nucleic Acids Res. 2012; 40:3785â€“99.
Lin L, Chu H. Quantifying publication bias in metaanalysis. Biometrics. 2018; 74:785â€“94.
McKenzie JE, Beller EM, Forbes AB. Introduction to systematic reviews and metaanalysis. Respirology. 2016; 21:626â€“37.
Borenstein M, Hedges L, Higgins J, Rothstein H. A basic introduction to fixedeffect and randomeffects models for metaanalysis. Res Synth Methods. 2010; 1:97â€“111.
Rhodes DR, Barrette TR, Rubin MA. Metaanalysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res. 2002; 62:4427â€“33.
Choi JK, Yu U, Kim S, Yoo OJ. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003; 19:84â€“90.
Tseng G, Ghosh D, Feingold E. Comprehensive literature review and statistical considerations for microarray metaanalysis. Nucleic Acids Res. 2012; 40:3785â€“99.
Waldron L, Riester M. Metaanalysis in gene expression studies. Methods Mol Biol. 2016; 1418:161â€“76.
Siangphoe U, Archer KJ. Estimation of random effects and identifying heterogeneous genes in metaanalysis of gene expression studies. Bioinformatics. 2016; 18:602â€“18.
Bolanos D, Calderon RMC. Introduction to the indirect metaanalyses. Rev Gastroenterol Peru. 2014; 34:151â€“4.
Borenstein M, Hedges LV, Higgins J. A basic introduction to fixedeffect and randomeffects models for metaanalysis. Res Synth Methods. 2010; 1:97â€“111.
Song C, Tseng GC. Hypothesis setting and order statistic for robust genomic metaanalysis. Ann Appl Stat. 2014; 8:777.
Evangelou E, Ioannidis JP. Metaanalysis methods for genomewide association studies and beyond. Nat Rev Genet. 2013; 14:379â€“89.
DerSimonian R, Kacker R. Randomeffects model for metaanalysis of clinical trials: An update. Contemp Clin Trials. 2007; 28:105â€“14.
van Aert R, Jackson D. Multistep estimators of the betweenstudy variance: The relationship with the paulemandel estimator. Stat Med. 2018; 37:2616â€“29.
Mavridis D, Salanti G. A practical introduction to multivariate metaanalysis. Stat Methods Med Res. 2013; 22:133â€“58.
Demidenko E, Sargent J, Onega T. Random effects coefficient of determination for mixed and metaanalysis models. Commun Stat Theory Methods. 2012; 41:953â€“69.
Langan D, Higgins J, Simmonds M. Comparative performance of heterogeneity variance estimators in metaanalysis: A review of simulation studies. Res Synth Methods. 2016; 8:181â€“98.
Veroniki A, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, Kuss O, Higgins J, Langan D, Salanti G. Methods to estimate the betweenstudy variance and its uncertainty in metaanalysis. Res Synth Methods. 2016; 7:55â€“79.
Sidik K, Jonkman J. A comparison of heterogeneity variance estimators in combining results of studies. Stat Med. 2007; 26:1964â€“81.
Sidik K, Jonkman J. Simple heterogeneity variance estimation for metaanalysis. J R Stat Soc Ser C Appl Stat. 2005; 54:367â€“84.
Chang LC, Lin HM, Sibille E. Metaanalysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline. BMC Bioinformatics. 2013; 14:368.
Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat. 2001; 29:1165â€“88.
Saito T, Rehmsmeier M. The precisionrecall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE. 2015; 10:0118432.
Swets JA. Measuring the accuracy of diagnostic systems. Science. 1988; 240:1285â€“93.
Gusareva ES, Carrasquillo MM, Bellenguez C. Genomewide association interaction analysis for alzheimerâ€™s disease. Neurobiol Aging. 2014; 35:2436â€“43.
Hokama M, Oka S, Leon J. Altered expression of diabetesrelated genes in alzheimerâ€™s disease brains: the hisayama study. Cereb Cortex. 2013; 24:2476â€“88.
Miller JA, Woltjer RL, Goodenbour JM. Genes and pathways underlying regional and cell type changes in alzheimerâ€™s disease. Genome Med. 2013; 5:48.
Wang M, Roussos P, McKenzie A. Integrative network analysis of nineteen brain regions identifies molecular signatures and networks underlying selective regional vulnerability to alzheimerâ€™s disease. Genome Med. 2016; 8:104.
Blalock EM, Geddes JW, Chen KC. Incipient alzheimerâ€™s disease: microarray correlation analyses reveal major transcriptional and tumor suppressor responses. Proc Natl Acad Sci. 2004; 101:2173â€“8.
Liang WS, Dunckley T, Beach TG. Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. Physiol Genomics. 2007; 28:311â€“22.
Liang WS, Reiman EM, Valla J. Alzheimerâ€™s disease is associated with reduced expression of energy metabolism genes in posterior cingulate neurons. Proc Natl Acad Sci. 2008; 105:4441â€“6.
Readhead B, HaureMirande JV, Funk CC. Multiscale analysis of independent alzheimerâ€™s cohorts finds disruption of molecular, genetic, and clinical networks by human herpesvirus. Neuron. 2018; 99:64â€“827.
Blalock EM, Buechel HM. Popovic jmicroarray analyses of lasercaptured hippocampus reveal distinct gray and white matter signatures associated with incipient alzheimerâ€™s disease. J Chem Neuroanat. 2011; 42:118â€“26.
Blair LJ, Nordhues BA, Hill SE. Accelerated neurodegeneration through chaperonemediated oligomerization of tau. J Clin Invest. 2013; 123:4158â€“69.
Astarita G, Jung KM, Berchtold NC. Deficient liver biosynthesis of docosahexaenoic acid correlates with cognitive impairment in alzheimerâ€™s disease. PLoS ONE. 2010; 5:12538.
Cribbs DH, Berchtold NC, Perreau V. Extensive innate immune gene activation accompanies brain aging, increasing vulnerability to cognitive decline and neurodegeneration: a microarray study. J Neuroinflamm. 2012; 9:179.
Berchtold NC, Cribbs DH, Coleman PD. Gene expression changes in the course of normal brain aging are sexually dimorphic. Proc Natl Acad Sci. 2008; 105:15605â€“10.
Sarvari M, Hrabovszky E, Kallo I. Menopause leads to elevated expression of macrophageassociated genes in the aging frontal cortex: rat and human studies identify strikingly similar changes. J Neuroinflammation. 2012; 9:264.
Berchtold NC, Coleman PD, Cribbs DH. Synaptic genes are extensively downregulated across multiple brain regions in normal human aging and alzheimerâ€™s disease. Neurobiol Aging. 2013; 34:1653â€“61.
Kim EK, Choi EJ. Pathological roles of mapk signaling pathways in human diseases. Biochim Biophys Acta. 2010; 1802:396â€“405.
Woo RS, Lee JH, Yu HN. Expression of erbb4 in the neurons of alzheimerâ€™s disease brain and app/ps1 mice, a model of alzheimerâ€™s disease. Anat Cell Biol. 2011; 44:116â€“27.
Kountouras J, Tsolaki M, Gavalas E. Relationship between helicobacter pylori infection and alzheimer disease. Neurology. 2006; 66:938â€“40.
Wang SP, Wang LH. Disease implication of hyperhippo signalling. Open Biol. 2016; 6:160119.
Acknowledgements
Not applicable.
Funding
This work is supported by the National Natural Science Foundation of China (grant no. 11971130), the Natural Science Foundation of Heilongjiang Province, China (grant no. A2015001). This work was partially supported by the National HighTech Research and Development Program of China (863 Program) (nos. 2015AA020101 and 2015AA020108).
Author information
Authors and Affiliations
Contributions
SJ and LX provided guidance for the entire study. QJ, XY and JQ collected the data. NW, BL and YT carried out the data analysis. LC, JZand YJ wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Because the data in this metaanalysis are from published articles and the data are publicly available, it was not necessary to obtain ethics approval and consent to participate.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisherâ€™s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1
Proofs. Additional file 1 proves that the betweenstudy variance \(D_{g}^{2}\) is greater than 0 and \(D_{g}^{2}\left (au^{2}\right)\) increases with Ï„^{2}.
Additional file 2
Supplementary methods. Additional file 2 gives the calculation processes of the false discovery rate, the precision, the accuracy, the false positive rate, the sensitivity and the Matthews correlation coefficient.
Additional file 3
Supplementary figures. Figure S1 Plot of the precision under the second hypothesis. Figure S2 Plot of the precision under the third hypothesis. Figure S3 Plot of the accuracy under the second hypothesis. Figure S4 Plot of the accuracy under the third hypothesis. Figure S5 Plot of the FPR under the second hypothesis. Figure S6 Plot of the FPR under the third hypothesis. Figure S7 Plot of the MCC under the second hypothesis. Figure S8 Plot of the MCC under the third hypothesis. Figure S9 Plot of the sensitivity under the second hypothesis. Figure S10 Plot of the sensitivity under the third hypothesis. Figure S11 Plot of the ROC curve and the AUC value under the second hypothesis. Figure S12 Plot of the ROC under the third hypothesis. The DSLD2 method is developed in this paper. Figure S13 Precisionrecall plot under the second hypothesis. Figure S14 Precisionrecall plot under the third hypothesis. Figure S15 Bias plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and SMD is chosen as the effect size measure. Figure S16 RMSE plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and SMD is chosen as the effect size measure. Figure S17 Bias plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and MD is chosen as the effect size measure. Figure S18 RMSE plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and MD is chosen as the effect size measure. Figure S19 Mean of I^{2} plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and SMD is chosen as the effect size measure. Figure S20 Mean of I^{2} plot of 6 metaanalysis methods when Ï„^{2} is set to 1.0 and MD is chosen as the effect size measure.
Additional file 4
Tables. Additional file 4 is the tables of the precision, accuracy, FPR, MCC and sensitivity under three hypothesis and tables of bias, RMSE and mean of I^{2}.
Additional file 5
The r code. Additional file 5 is the code of the simulation setting for gene expression levels, the code of the simulation data using Monte Carlo method and the R code of DSLD2 method.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Wang, N., Zhang, J., Xu, L. et al. A novel estimator of betweenstudy variance in randomeffects models. BMC Genomics 21, 149 (2020). https://doi.org/10.1186/s1286402065009
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1286402065009