Table 1 Prediction of 14 class compounds using independent dataset

From: Identification of biomarkers that distinguish chemical contaminants based on gene expression profiles

| Feature selection method | Feature number | Classification algorithm | Training accuracy (%) (D2*) | Prediction accuracy (%) (D2 to D1*) |
|---|---|---|---|---|
| SVM-RFE | 200 | LibSVM | 100 | 64.9 |
| Relief | 500 | SL | 83.7 | 72.6 |
| Inforgain | 500 | SL | 83.7 | 66.7 |
| Chisquare | 400 | SL | 82.6 | 72.6 |
| Gainratio | 500 | SL | 82.9 | 66.1 |
| PCA | 200 | SMO | 73.3 | 66.7 |
| Gradient | 300 | SMO | 85.7 | 79.7 |

*Dataset 1 (D1) has a total of 168 array samples that were produced in 2007, and dataset 2 (D2) includes 363 array samples that were hybridized in 2008. For each dataset, a complete set of 105 compounds was included.