Table 2 Performance of 18 regression modeling methods on the three HMX exposure datasets, assessed by the coefficient of determination (R², mean ± standard deviation, n = 10) estimated from ten runs of 10-fold cross-validation. Values of the best-performing method are shown in bold.

From: Predicting chemical bioavailability using microarray gene expression data and regression modeling: A tale of three explosive compounds

| Regression method | D4 | D14 | D28 |
|---|---|---|---|
| Predictor size (gene #) | 6 | 6 | 10 |
| Linear | | | |
| Multivariate | 0.53 ± 0.15 | 0.52 ± 0.15 | 0.58 ± 0.15 |
| Robust | 0.66 ± 0.12 | 0.72 ± 0.09 | 0.79 ± 0.02 |
| Ridge | 0.67 ± 0.10 | 0.70 ± 0.11 | 0.81 ± 0.02 |
| LASSO | 0.69 ± 0.10 | 0.72 ± 0.10 | 0.81 ± 0.04 |
| Elastic net | 0.72 ± 0.09 | 0.71 ± 0.11 | 0.82 ± 0.03 |
| SVR | 0.70 ± 0.10 | 0.65 ± 0.09 | 0.81 ± 0.05 |
| Nonlinear | | | |
| Stepwise | 0.67 ± 0.07 | 0.66 ± 0.11 | 0.79 ± 0.05 |
| Ridge Polynomial | 0.63 ± 0.11 | 0.73 ± 0.08 | 0.76 ± 0.05 |
| Ridge Exponential | 0.68 ± 0.08 | 0.68 ± 0.09 | 0.79 ± 0.04 |
| Ridge Gaussian | 0.51 ± 0.16 | 0.56 ± 0.14 | 0.66 ± 0.06 |
| SVR Polynomial | 0.69 ± 0.11 | 0.64 ± 0.11 | 0.79 ± 0.06 |
| SVR Gaussian | 0.65 ± 0.09 | 0.60 ± 0.10 | 0.73 ± 0.10 |
| SVR Sigmoid | 0.48 ± 0.15 | 0.49 ± 0.15 | 0.68 ± 0.12 |
| Nadaraya-Watson | 0.68 ± 0.09 | 0.67 ± 0.09 | 0.80 ± 0.04 |
| Inverse | NA | NA | NA |
| Loglog | NA | NA | NA |
| Regression Tree | 0.56 ± 0.15 | 0.61 ± 0.14 | 0.65 ± 0.13 |
| Random Forest | 0.55 ± 0.16 | 0.60 ± 0.13 | 0.69 ± 0.10 |
D4: 4-day HMX exposure; D14: 14-day HMX exposure; D28: 28-day HMX exposure; NA: not available. See Additional file 5 for the lists and annotation of predictor genes.
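
The evaluation scheme named in the caption (ten runs of 10-fold cross-validation, reported as mean ± standard deviation of R² with n = 10) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic, a closed-form ridge regression stands in for the 18 methods, R² is computed once per run from the pooled out-of-fold predictions, and the function names (`ridge_fit_predict`, `r_squared`, `repeated_kfold_r2`) are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression in closed form on centered data and predict X_test."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    yc = y_train - y_mean
    n_features = Xc.shape[1]
    # Solve (X'X + alpha * I) w = X'y; the intercept is handled by centering.
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    return (X_test - x_mean) @ coef + y_mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def repeated_kfold_r2(X, y, n_runs=10, n_folds=10, alpha=1.0, seed=0):
    """Return mean, sample SD, and per-run R2 over repeated k-fold CV.

    Assumption: one R2 per run, computed from pooled out-of-fold predictions.
    """
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_runs):
        order = rng.permutation(len(y))          # new random split for each run
        folds = np.array_split(order, n_folds)
        y_pred = np.empty(len(y), dtype=float)
        for test_idx in folds:
            train_idx = np.setdiff1d(order, test_idx)
            y_pred[test_idx] = ridge_fit_predict(
                X[train_idx], y[train_idx], X[test_idx], alpha
            )
        scores.append(r_squared(y, y_pred))
    scores = np.array(scores)
    return scores.mean(), scores.std(ddof=1), scores


# Synthetic stand-in for an expression matrix: 120 samples, 6 predictor genes.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(120, 6))
true_coef = np.array([1.5, -2.0, 0.5, 0.0, 1.0, -0.5])
y = X @ true_coef + data_rng.normal(scale=1.0, size=120)

mean_r2, sd_r2, scores = repeated_kfold_r2(X, y)
print(f"R2 = {mean_r2:.2f} ± {sd_r2:.2f} (n = {len(scores)})")
```

Reshuffling the samples before each run is what makes the ten runs differ, so the standard deviation here reflects sensitivity to the fold assignment rather than to new data.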