| Feature selection method | Number of features | Classification algorithm | Training accuracy (%) (D2*) | Prediction accuracy (%) (D2 to D1*) |
|---|---|---|---|---|
| SVM-RFE | 200 | LibSVM | 100 | 64.9 |
| Relief | 500 | SL | 83.7 | 72.6 |
| Inforgain | 500 | SL | 83.7 | 66.7 |
| Chisquare | 400 | SL | 82.6 | 72.6 |
| Gainratio | 500 | SL | 82.9 | 66.1 |
| PCA | 200 | SMO | 73.3 | 66.7 |
| Gradient | 300 | SMO | 85.7 | 79.7 |

- *Dataset 1 (D1) has a total of 168 array samples that were produced in 2007, and dataset 2 (D2) includes 363 array samples that were hybridized in 2008. For each dataset, a complete set of 105 compounds was included.