
Table 1 Relevance of prediction features ranked according to the information gain with respect to the class

From: KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

| Rank | Gain | Feature | Rank | Gain | Feature |
|------|------|---------|------|------|---------|
| 1 | 0.4914 | Gene Ontology | 14 | 4.79e-3 | Binding (UniProt) |
| 2 | 0.1787 | SIFT | 15 | 4.43e-3 | Np_bind (UniProt) |
| 3 | 0.1197 | Kinase group | 16 | 3.38e-3 | Repeat (UniProt) |
| 4 | 0.1121 | PFAM domain | 17 | 2.47e-3 | Phospho.ELM |
| 5 | 0.0438 | Wild type amino ac. | 18 | 2.37e-3 | Zn finger (UniProt) |
| 6 | 0.0373 | Hydrophobicity | 19 | 1.82e-3 | Modified res. (UniProt) |
| 7 | 0.0368 | Alternative amino ac. | 20 | 1.51e-3 | Metal binding (UniProt) |
| 8 | 0.0353 | Volume change | 21 | 9.4e-4 | Signal peptide (UniProt) |
| 9 | 0.0239 | FireDB residue | 22 | 7.71e-4 | Active site (UniProt) |
| 10 | 8.94e-3 | Any uniprot | 23 | 6.86e-4 | Carbohyd (UniProt) |
| 11 | 7.70e-3 | Formal charge | 24 | 5.02e-4 | Site (UniProt) |
| 12 | 6.80e-3 | Cbeta Branching | 25 | 5.33e-5 | Transmembrane (UniProt) |
| 13 | 6.02e-3 | Disulfid (UniProt) | | | |

   
  1. Ranking calculated with the InfoGainAttributeEval attribute evaluator in Weka. Features specifically related to the protein kinase superfamily rank among the most informative.
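
The information gain used for the ranking is the reduction in class entropy obtained by knowing a feature's value, H(class) - H(class | feature). The following is a minimal Python sketch of that quantity for categorical features only; it is not Weka's implementation, and the function names (`entropy`, `information_gain`, `rank_features`) and the toy data are assumptions made up for illustration, not taken from the KinMutRF dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(class) - H(class | feature) for one categorical feature."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy: entropy of each group, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return (feature name, gain) pairs sorted by decreasing gain."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only (hypothetical feature names).
labels = ["disease", "disease", "neutral", "neutral"]
toy = {
    "uninformative": ["x", "y", "x", "y"],
    "perfect_split": ["a", "a", "b", "b"],
}
ranking = rank_features(toy, labels)
```

In this toy example the feature that separates the two classes perfectly ranks first, and the feature whose values are spread evenly across both classes contributes no gain. Numeric features would need to be discretized before this sketch applies.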