
Table 2 Entire feature space: Average testing set error rate and sparsity (in parentheses) for 10 random partitions of the data

From: Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons

Dataset             SCM           LinSVM                PolySVM               Baseline
C. difficile
  Azithromycin      0.030 (3.3)   0.050 (32 752 570)    0.048 (32 752 570)    0.446
  Ceftriaxone       0.073 (2.6)   0.079 (25 405 987)    0.076 (25 405 987)    0.306
  Clarithromycin    0.011 (3.0)   0.053 (32 752 570)    0.053 (32 752 570)    0.446
  Clindamycin       0.021 (1.4)   0.039 (30 988 214)    0.039 (30 988 214)    0.136
  Moxifloxacin      0.020 (1.0)   0.054 (32 752 570)    0.048 (32 752 570)    0.390
M. tuberculosis
  Ethambutol        0.179 (1.4)   0.215 (9 465 489)     0.221 (9 465 489)     0.351
  Isoniazid         0.021 (1.0)   0.117 (9 701 935)     0.119 (9 701 935)     0.421
  Pyrazinamide      0.318 (3.1)   0.382 (8 058 479)     0.382 (8 058 479)     0.347
  Rifampicin        0.031 (1.4)   0.200 (9 701 935)     0.204 (9 701 935)     0.452
  Streptomycin      0.050 (1.0)   0.143 (9 282 080)     0.148 (9 282 080)     0.435
P. aeruginosa
  Amikacin          0.175 (4.9)   0.184 (116 441 834)   0.179 (116 441 834)   0.216
  Doripenem         0.270 (1.4)   0.288 (122 438 059)   0.281 (122 438 059)   0.359
  Levofloxacin      0.072 (1.2)   0.221 (122 216 859)   0.225 (122 216 859)   0.463
  Meropenem         0.267 (1.6)   0.329 (123 466 989)   0.331 (123 466 989)   0.404
S. pneumoniae
  Benzylpenicillin  0.013 (1.1)   0.015 (8 968 176)     0.015 (8 968 176)     0.073
  Erythromycin      0.037 (2.0)   0.046 (9 666 898)     0.047 (9 666 898)     0.142
  Tetracycline      0.031 (1.1)   0.039 (8 657 259)     0.037 (8 657 259)     0.106
Results are shown for the SCM and the kernel methods (LinSVM and PolySVM). The baseline method predicts the most abundant class in the training set. The smallest error rates are in bold in the original table; in every row, the smallest error rate is that of the SCM.
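
The Baseline column can be illustrated with a minimal sketch of a majority-class predictor whose test error is averaged over random partitions. This is not the authors' code: the function names, the 20% test fraction, the seed, and the "R"/"S" labels are all assumptions made for illustration, since the table states only that 10 random partitions were used.

```python
import random
from collections import Counter


def majority_class_baseline(train_labels):
    """Return the most abundant class in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def error_rate(true_labels, predicted_labels):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def average_baseline_error(labels, n_partitions=10, test_fraction=0.2, seed=0):
    """Average test error of the majority-class baseline over random partitions.

    test_fraction and seed are hypothetical; the source does not give them.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_partitions):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test, train = shuffled[:n_test], shuffled[n_test:]
        prediction = majority_class_baseline(train)
        errors.append(error_rate(test, [prediction] * len(test)))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical phenotype labels: "R" = resistant, "S" = sensitive.
    labels = ["S"] * 70 + ["R"] * 30
    print(average_baseline_error(labels))
```

A baseline error close to the minority-class frequency is expected for such a predictor, which is why it serves as a floor that the learned models in the other columns should beat.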