Table 2 Entire feature space: Average testing set error rate and sparsity (in parentheses) for 10 random partitions of the data

From: Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons

| Dataset | SCM | LinSVM | PolySVM | Baseline |
|---|---|---|---|---|
| C. difficile | | | | |
| Azithromycin | 0.030 (3.3) | 0.050 (32 752 570) | 0.048 (32 752 570) | 0.446 |
| Ceftriaxone | 0.073 (2.6) | 0.079 (25 405 987) | 0.076 (25 405 987) | 0.306 |
| Clarithromycin | 0.011 (3.0) | 0.053 (32 752 570) | 0.053 (32 752 570) | 0.446 |
| Clindamycin | 0.021 (1.4) | 0.039 (30 988 214) | 0.039 (30 988 214) | 0.136 |
| Moxifloxacin | 0.020 (1.0) | 0.054 (32 752 570) | 0.048 (32 752 570) | 0.390 |
| M. tuberculosis | | | | |
| Ethambutol | 0.179 (1.4) | 0.215 (9 465 489) | 0.221 (9 465 489) | 0.351 |
| Isoniazid | 0.021 (1.0) | 0.117 (9 701 935) | 0.119 (9 701 935) | 0.421 |
| Pyrazinamide | 0.318 (3.1) | 0.382 (8 058 479) | 0.382 (8 058 479) | 0.347 |
| Rifampicin | 0.031 (1.4) | 0.200 (9 701 935) | 0.204 (9 701 935) | 0.452 |
| Streptomycin | 0.050 (1.0) | 0.143 (9 282 080) | 0.148 (9 282 080) | 0.435 |
| P. aeruginosa | | | | |
| Amikacin | 0.175 (4.9) | 0.184 (116 441 834) | 0.179 (116 441 834) | 0.216 |
| Doripenem | 0.270 (1.4) | 0.288 (122 438 059) | 0.281 (122 438 059) | 0.359 |
| Levofloxacin | 0.072 (1.2) | 0.221 (122 216 859) | 0.225 (122 216 859) | 0.463 |
| Meropenem | 0.267 (1.6) | 0.329 (123 466 989) | 0.331 (123 466 989) | 0.404 |
| S. pneumoniae | | | | |
| Benzylpenicillin | 0.013 (1.1) | 0.015 (8 968 176) | 0.015 (8 968 176) | 0.073 |
| Erythromycin | 0.037 (2.0) | 0.046 (9 666 898) | 0.047 (9 666 898) | 0.142 |
| Tetracycline | 0.031 (1.1) | 0.039 (8 657 259) | 0.037 (8 657 259) | 0.106 |

Results are shown for the SCM and the two kernel methods, LinSVM and PolySVM. The baseline method predicts the most abundant class in the training set. The smallest error rate in each row (shown in bold in the original table) is that of the SCM.
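
The baseline described in the table note can be illustrated with a minimal sketch: it always predicts whichever class is most frequent in the training set, and its error rate is the fraction of test examples that belong to any other class. The function names and the toy labels below are hypothetical and chosen only for illustration; they are not taken from the paper or its code.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return the most frequent label in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def error_rate(predicted_label, test_labels):
    """Fraction of test examples whose label differs from the constant prediction."""
    return sum(y != predicted_label for y in test_labels) / len(test_labels)


# Toy labels (hypothetical, not data from the paper).
train = ["resistant"] * 6 + ["susceptible"] * 4
test = ["resistant"] * 3 + ["susceptible"] * 2

prediction = majority_class_baseline(train)
baseline_error = error_rate(prediction, test)
```

A baseline of this kind shows how much of a model's accuracy could be obtained from class imbalance alone, which is why the table reports it next to the learned models.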