Table 1. Performance of 18 regression modeling methods on four datasets, assessed by the coefficient of determination (R², mean ± standard deviation, n = 10) estimated from ten runs of 10-fold cross-validation. Values of the best-performing method for each dataset are shown in bold.

From: Predicting chemical bioavailability using microarray gene expression data and regression modeling: A tale of three explosive compounds

| Regression method | RDX_D4 | RDX_D14 | TNT_D4 | TNT_D14 |
| --- | --- | --- | --- | --- |
| Predictor size (gene #) | 26 | 3 | 53 | 6 |
| Linear | | | | |
| Multivariate | 0.62 ± 0.19 | 0.65 ± 0.12 | 0.42 ± 0.14 | 0.72 ± 0.18 |
| Robust | 0.63 ± 0.14 | 0.65 ± 0.13 | NA | 0.67 ± 0.15 |
| Ridge | 0.65 ± 0.15 | 0.65 ± 0.13 | 0.73 ± 0.15 | 0.71 ± 0.16 |
| LASSO | 0.65 ± 0.18 | 0.65 ± 0.14 | 0.73 ± 0.15 | 0.69 ± 0.15 |
| Elastic net | 0.66 ± 0.20 | 0.66 ± 0.13 | 0.75 ± 0.19 | 0.69 ± 0.17 |
| SVR | 0.60 ± 0.15 | 0.68 ± 0.14 | 0.74 ± 0.16 | 0.66 ± 0.16 |
| Nonlinear | | | | |
| Stepwise | 0.42 ± 0.21 | 0.69 ± 0.14 | 0.33 ± 0.21 | 0.60 ± 0.16 |
| Ridge Polynomial | 0.62 ± 0.18 | 0.71 ± 0.12 | 0.71 ± 0.14 | 0.66 ± 0.16 |
| Ridge Exponential | 0.65 ± 0.13 | 0.67 ± 0.13 | 0.68 ± 0.14 | 0.67 ± 0.17 |
| Ridge Gaussian | 0.64 ± 0.14 | 0.70 ± 0.15 | 0.43 ± 0.13 | 0.64 ± 0.16 |
| SVR Polynomial | 0.61 ± 0.15 | 0.68 ± 0.14 | 0.70 ± 0.12 | 0.63 ± 0.16 |
| SVR Gaussian | 0.63 ± 0.13 | 0.68 ± 0.14 | 0.74 ± 0.12 | 0.67 ± 0.13 |
| SVR Sigmoid | 0.17 ± 0.00 | NA | 0.08 ± 0.00 | NA |
| Nadaraya-Watson | 0.54 ± 0.09 | 0.68 ± 0.16 | 0.73 ± 0.17 | 0.67 ± 0.13 |
| Inverse | 0.44 ± 0.14 | NA | 0.31 ± 0.10 | NA |
| Loglog | NA | NA | NA | NA |
| Regression Tree | 0.53 ± 0.10 | 0.59 ± 0.13 | 0.73 ± 0.12 | 0.54 ± 0.14 |
| Random Forest | 0.60 ± 0.12 | 0.59 ± 0.16 | 0.75 ± 0.10 | 0.70 ± 0.17 |
RDX_D4: 4-day RDX exposure; RDX_D14: 14-day RDX exposure; TNT_D4: 4-day TNT exposure; TNT_D14: 14-day TNT exposure; NA: not available. See Additional file 5 for the lists and annotation of predictor genes.
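
The evaluation protocol named in the caption (ten runs of 10-fold cross-validation, summarized as mean ± standard deviation of the coefficient of determination with n = 10) can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: a single-predictor least-squares line stands in for the 18 methods, the data are synthetic, each run's coefficient of determination is computed once on the pooled out-of-fold predictions (the source does not say whether scores were pooled or averaged per fold), and the helper names `fit_line`, `r_squared` and `repeated_kfold_r2` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_kfold_r2(xs, ys, k=10, runs=10, seed=0):
    """Return (mean, sd) of the score over `runs` reshuffled k-fold CVs."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(runs):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random partition for every run
        preds = [0.0] * n
        for fold in range(k):
            test_idx = order[fold::k]  # every k-th shuffled sample is held out
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            slope, intercept = fit_line(
                [xs[i] for i in train_idx], [ys[i] for i in train_idx]
            )
            for i in test_idx:
                preds[i] = slope * xs[i] + intercept
        # Assumption: one score per run, from pooled out-of-fold predictions.
        scores.append(r_squared(ys, preds))
    return statistics.fmean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    # Synthetic stand-in data, not the microarray measurements from the study.
    gen = random.Random(42)
    x = [gen.uniform(0, 10) for _ in range(60)]
    y = [2.0 * v + 1.0 + gen.gauss(0, 2.0) for v in x]
    mean_r2, sd_r2 = repeated_kfold_r2(x, y)
    print(f"R2 = {mean_r2:.2f} ± {sd_r2:.2f} (n = 10 runs)")
```

Under this scheme the standard deviation reflects only the variation caused by re-partitioning the same samples across the ten runs, which is one plausible reading of "n = 10" in the caption.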