Table 2 Statistical comparison of performances of classifiers trained on normal and without-one-feature morbidity datasets

From: A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data

| Missing feature¹ | Median AUC [min,max]² | N | W | W_c (two-tailed p = 0.05)³ |
|---|---|---|---|---|
| ppi | 0.715 [0.705,0.726] | 10 | 26 | 8 |
| metin | 0.714 [0.707,0.727] | 10 | 26 | 8 |
| metout | 0.713 [0.707,0.729] | 10 | 25 | 8 |
| regin | 0.714 [0.703,0.726] | 9 | 18 | 6 |
| regout | 0.716 [0.705,0.729] | 10 | 26 | 10 |
| c | 0.713 [0.701,0.724] | 10 | 13 | 8 |
| identicalness | 0.711 [0.704,0.727] | 10 | 24 | 8 |
| cent | 0.714 [0.707,0.727] | 10 | 25 | 8 |
| inbet | 0.716 [0.708,0.731] | 10 | 25 | 8 |
| inbetppi | 0.714 [0.707,0.727] | 9 | 21 | 6 |
| inbetmet | 0.714 [0.707,0.728] | 9 | 21 | 6 |
| inbetreg | 0.715 [0.706,0.727] | 10 | 25 | 8 |
| numtissuesexp⁴ | 0.709 [0.701,0.719] | 10 | 7 | 8* |
| avegexptec⁵ | 0.715 [0.704,0.727] | 10 | 27 | 8 |
| Unknown | 0.713 [0.701,0.725] | 10 | 18 | 8 |
| Cytoplasm | 0.715 [0.706,0.728] | 10 | 26 | 8 |
| Endoplasmic reticulum | 0.716 [0.705,0.727] | 10 | 26 | 8 |
| Mitochondrion | 0.714 [0.706,0.728] | 10 | 24 | 8 |
| Nucleus | 0.715 [0.704,0.728] | 10 | 24 | 8 |
| Other localization | 0.714 [0.704,0.726] | 10 | 21 | 8 |
| Cellular component | 0.714 [0.705,0.727] | 9 | 21 | 6 |
| Extracellular space | 0.710 [0.700,0.723] | 10 | 14 | 8 |
| Golgi apparatus | 0.715 [0.706,0.728] | 10 | 26 | 8 |
Median AUC [min,max] for normal datasets: 0.716 [0.706,0.729]

  1. See “Methods” and Additional file 1 for a description of features
  2. Of 10 datasets
  3. According to the table of critical values for W in [6]
  4. The number of tissues (out of 32) in which the gene is expressed at a level of at least 5 transcripts per million (tpm), according to Reverter et al. [33]
  5. The average expression in tpm among all the tissues in which the gene is expressed, according to Reverter et al. [33]

  \* Difference statistically significant
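
The W and critical-value columns of Table 2 can be read as a paired, rank-based comparison: for each removed feature, the AUCs of the 10 without-one-feature datasets are paired with those of the normal datasets, and the difference is flagged when W does not exceed the critical value. The sketch below illustrates that reading under stated assumptions. It assumes the conventional Wilcoxon signed-rank statistic (the smaller of the positive and negative rank sums, with zero differences dropped, which would explain N = 9 in some rows); the table does not spell out the exact procedure. The function names `signed_rank_w` and `is_significant` are hypothetical, and the critical value is passed in rather than computed, since it comes from the published table cited in footnote 3.

```python
from typing import Sequence, Tuple


def signed_rank_w(baseline: Sequence[float],
                  ablated: Sequence[float]) -> Tuple[float, int]:
    """Return (W, N) for paired samples.

    Assumption: W is the smaller of the positive and negative rank sums,
    and N counts only the non-zero paired differences.
    """
    # Drop pairs with no difference; they carry no sign information.
    diffs = [b - a for b, a in zip(baseline, ablated) if b != a]
    n = len(diffs)

    # Rank absolute differences in ascending order; ties share the average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), n


def is_significant(w: float, w_critical: float) -> bool:
    """Two-tailed decision rule: significant when W is at most the critical value."""
    return w <= w_critical


# Values taken from Table 2: numtissuesexp (W = 7, critical value 8) is flagged,
# whereas ppi (W = 26, critical value 8) is not.
print(is_significant(7, 8))   # True
print(is_significant(26, 8))  # False
```

Under this rule, numtissuesexp is the only row in which W falls at or below its critical value, which matches the single asterisk in the table.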