Table 1 Prediction of 14 compound classes using an independent dataset

From: Identification of biomarkers that distinguish chemical contaminants based on gene expression profiles

| Feature selection method | Feature number | Classification algorithm | Training accuracy (%) (D2*) | Prediction accuracy (%) (D2 to D1*) |
|---|---|---|---|---|
| SVM-RFE | 200 | LibSVM | 100 | 64.9 |
| Relief | 500 | SL | 83.7 | 72.6 |
| Inforgain | 500 | SL | 83.7 | 66.7 |
| Chisquare | 400 | SL | 82.6 | 72.6 |
| Gainratio | 500 | SL | 82.9 | 66.1 |
| PCA | 200 | SMO | 73.3 | 66.7 |
| Gradient | 300 | SMO | 85.7 | 79.7 |

*Dataset 1 (D1) has a total of 168 array samples that were produced in 2007, and dataset 2 (D2) includes 363 array samples that were hybridized in 2008. For each dataset, a complete set of 105 compounds was included.