
Table 1 Performance of 18 regression modeling methods on four datasets, assessed by the coefficient of determination (R², mean ± standard deviation, n = 10) estimated from ten runs of 10-fold cross-validation. Values of the best-performing method for each dataset are shown in bold

From: Predicting chemical bioavailability using microarray gene expression data and regression modeling: A tale of three explosive compounds

| Regression method | RDX_D4 | RDX_D14 | TNT_D4 | TNT_D14 |
|---|---|---|---|---|
| Predictor size (gene #) | 26 | 3 | 53 | 6 |
| Linear | | | | |
| Multivariate | 0.62 ± 0.19 | 0.65 ± 0.12 | 0.42 ± 0.14 | 0.72 ± 0.18 |
| Robust | 0.63 ± 0.14 | 0.65 ± 0.13 | NA | 0.67 ± 0.15 |
| Ridge | 0.65 ± 0.15 | 0.65 ± 0.13 | 0.73 ± 0.15 | 0.71 ± 0.16 |
| LASSO | 0.65 ± 0.18 | 0.65 ± 0.14 | 0.73 ± 0.15 | 0.69 ± 0.15 |
| Elastic net | 0.66 ± 0.20 | 0.66 ± 0.13 | 0.75 ± 0.19 | 0.69 ± 0.17 |
| SVR | 0.60 ± 0.15 | 0.68 ± 0.14 | 0.74 ± 0.16 | 0.66 ± 0.16 |
| Nonlinear | | | | |
| Stepwise | 0.42 ± 0.21 | 0.69 ± 0.14 | 0.33 ± 0.21 | 0.60 ± 0.16 |
| Ridge Polynomial | 0.62 ± 0.18 | 0.71 ± 0.12 | 0.71 ± 0.14 | 0.66 ± 0.16 |
| Ridge Exponential | 0.65 ± 0.13 | 0.67 ± 0.13 | 0.68 ± 0.14 | 0.67 ± 0.17 |
| Ridge Gaussian | 0.64 ± 0.14 | 0.70 ± 0.15 | 0.43 ± 0.13 | 0.64 ± 0.16 |
| SVR Polynomial | 0.61 ± 0.15 | 0.68 ± 0.14 | 0.70 ± 0.12 | 0.63 ± 0.16 |
| SVR Gaussian | 0.63 ± 0.13 | 0.68 ± 0.14 | 0.74 ± 0.12 | 0.67 ± 0.13 |
| SVR Sigmoid | 0.17 ± 0.00 | NA | 0.08 ± 0.00 | NA |
| Nadaraya-Watson | 0.54 ± 0.09 | 0.68 ± 0.16 | 0.73 ± 0.17 | 0.67 ± 0.13 |
| Inverse | 0.44 ± 0.14 | NA | 0.31 ± 0.10 | NA |
| Loglog | NA | NA | NA | NA |
| Regression Tree | 0.53 ± 0.10 | 0.59 ± 0.13 | 0.73 ± 0.12 | 0.54 ± 0.14 |
| Random Forest | 0.60 ± 0.12 | 0.59 ± 0.16 | 0.75 ± 0.10 | 0.70 ± 0.17 |

RDX_D4: 4-day RDX exposure; RDX_D14: 14-day RDX exposure; TNT_D4: 4-day TNT exposure; TNT_D14: 14-day TNT exposure; NA: not available. See Additional file 5 for the lists and annotation of predictor genes.
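
To make the "mean ± standard deviation, n = 10" entries concrete, the following is a minimal sketch of how one R² summary could be produced from ten runs of 10-fold cross-validation. It is not the authors' pipeline. Several things here are assumptions for illustration only: the data are synthetic, a single-predictor ordinary least squares fit stands in for the 18 methods in the table, R² is taken as 1 - SS_res/SS_tot, and each run is assumed to contribute one R² computed from its pooled out-of-fold predictions. The function names (`fit_line`, `r_squared`, `repeated_kfold_r2`) are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination, taken here as 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_kfold_r2(xs, ys, k=10, runs=10, seed=0):
    """Return (mean, sd, per-run scores) of R2 over `runs` random k-fold splits.

    Assumption: each run pools its out-of-fold predictions and yields one R2,
    so the standard deviation is taken across runs (n = runs).
    """
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(runs):
        order = list(range(n))
        rng.shuffle(order)              # a fresh random partition for every run
        preds = [0.0] * n
        for fold in range(k):
            test_idx = order[fold::k]   # every k-th shuffled sample forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            slope, intercept = fit_line(
                [xs[i] for i in train_idx], [ys[i] for i in train_idx]
            )
            for i in test_idx:          # predict only samples unseen during fitting
                preds[i] = slope * xs[i] + intercept
        scores.append(r_squared(ys, preds))
    return statistics.fmean(scores), statistics.stdev(scores), scores


# Synthetic stand-in data (not the expression data behind the table).
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(60)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 3.0) for x in xs]

mean_r2, sd_r2, scores = repeated_kfold_r2(xs, ys)
print(f"R2 = {mean_r2:.2f} ± {sd_r2:.2f} (n = {len(scores)})")
```

Because the standard deviation is taken across repeated partitions of the same samples, it reflects sensitivity to how the folds are drawn rather than sampling variability between independent datasets.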