Table 3 Statistical comparison of performances of classifiers trained on normal and without-one-feature druggability datasets

From: A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data

| Missing feature¹ | Median AUC [min,max]² | N | W | W_c (two-tailed p = 0.05)³ |
|---|---|---|---|---|
| ppi | 0.819 [0.798,0.835] | 10 | 27 | 8 |
| metin | 0.817 [0.803,0.834] | 10 | 26 | 8 |
| metout | 0.817 [0.801,0.832] | 9 | 20 | 6 |
| regin | 0.818 [0.799,0.83] | 9 | 18 | 6 |
| regout | 0.818 [0.801,0.833] | 10 | 26 | 8 |
| c | 0.821 [0.799,0.836] | 10 | 21 | 8 |
| identicalness | 0.819 [0.8,0.836] | 10 | 27 | 8 |
| cent | 0.814 [0.797,0.832] | 10 | 18 | 8 |
| inbet | 0.821 [0.804,0.837] | 10 | 25 | 8 |
| inbetppi | 0.819 [0.803,0.833] | 10 | 25 | 8 |
| inbetmet | 0.82 [0.791,0.833] | 10 | 26 | 8 |
| inbetreg | 0.818 [0.802,0.83] | 9 | 19 | 6 |
| numtissuesexp⁴ | 0.806 [0.795,0.832] | 9 | 11 | 6 |
| avegexptec⁵ | 0.814 [0.799,0.835] | 10 | 23 | 8 |
| Unknown | 0.816 [0.796,0.832] | 9 | 12 | 6 |
| Cytoplasm | 0.814 [0.794,0.834] | 10 | 20 | 8 |
| Endoplasmic reticulum | 0.820 [0.799,0.834] | 10 | 27 | 8 |
| Mitochondrion | 0.820 [0.796,0.831] | 9 | 22 | 6 |
| Nucleus | 0.816 [0.793,0.831] | 10 | 20 | 8 |
| Other localization | 0.821 [0.802,0.837] | 9 | 20 | 6 |
| Cellular component | 0.82 [0.801,0.835] | 10 | 25 | 8 |
| Extracellular space | 0.817 [0.8,0.837] | 10 | 26 | 8 |
| Golgi apparatus | 0.812 [0.8,0.834] | 10 | 24 | 8 |
| Plasma membrane | 0.781 [0.762,0.816] | 10 | 1 | 8* |

Median AUC [min,max] for normal datasets: 0.820 [0.801,0.835]

  1. See “Methods” and Additional file 1 for a description of the features
  2. Median, minimum and maximum over 10 datasets
  3. According to the table of critical values for W in [6]
  4. The number of tissues (out of 32) in which the gene is expressed at a level of at least 5 transcripts per million (tpm), according to Reverter et al. [33]
  5. The average expression, in tpm, across all tissues in which the gene is expressed, according to Reverter et al. [33]
  6. * Difference statistically significant
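
The table does not name the test behind W. The sketch below assumes that W is a Wilcoxon signed-rank statistic computed on paired AUC values (normal versus without-one-feature datasets), that N is the number of non-zero paired differences, and that W is the smaller of the positive and negative rank sums. Under those assumptions a difference is flagged when W is less than or equal to the critical value W_c. The function names `signed_rank_w` and `is_significant` are hypothetical and are not taken from the paper; the three rows used at the end are copied from the table.

```python
from typing import Sequence, Tuple


def signed_rank_w(normal: Sequence[float], ablated: Sequence[float]) -> Tuple[int, float]:
    """Return (N, W) for paired samples (hypothetical helper, not from the paper).

    N is the number of non-zero paired differences. W is the smaller of the
    positive-rank and negative-rank sums (an assumed convention).
    """
    # Round to suppress floating-point noise so that equal differences tie.
    diffs = [round(a - b, 12) for a, b in zip(normal, ablated)]
    diffs = [d for d in diffs if d != 0]  # zero differences are discarded
    n = len(diffs)

    # Rank absolute differences; tied values share the mean of their ranks.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        start = end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return n, min(w_plus, w_minus)


def is_significant(w: float, w_critical: float) -> bool:
    """Assumed decision rule: significant when W does not exceed W_c."""
    return w <= w_critical


# (feature, N, W, W_c) rows copied from the table above.
rows = [
    ("ppi", 10, 27, 8),
    ("numtissuesexp", 9, 11, 6),
    ("Plasma membrane", 10, 1, 8),
]
flagged = [name for name, _n, w, w_c in rows if is_significant(w, w_c)]
print(flagged)  # ['Plasma membrane']
```

Applied to the table, this rule flags only the Plasma membrane row (W = 1 against W_c = 8), which matches the asterisk in the last column.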