Table 1 Relevance of prediction features ranked according to the information gain with respect to the class

From: KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

Rank  Gain      Feature
1     0.4914    Gene Ontology
2     0.1787    SIFT
3     0.1197    Kinase group
4     0.1121    PFAM domain
5     0.0438    Wild type amino ac.
6     0.0373    Hydrophobicity
7     0.0368    Alternative amino ac.
8     0.0353    Volume change
9     0.0239    FireDB residue
10    8.94e-3   Any UniProt
11    7.70e-3   Formal charge
12    6.80e-3   Cbeta Branching
13    6.02e-3   Disulfid (UniProt)
14    4.79e-3   Binding (UniProt)
15    4.43e-3   Np_bind (UniProt)
16    3.38e-3   Repeat (UniProt)
17    2.47e-3   Phospho.ELM
18    2.37e-3   Zn finger (UniProt)
19    1.82e-3   Modified res. (UniProt)
20    1.51e-3   Metal binding (UniProt)
21    9.4e-4    Signal peptide (UniProt)
22    7.71e-4   Active site (UniProt)
23    6.86e-4   Carbohyd (UniProt)
24    5.02e-4   Site (UniProt)
25    5.33e-5   Transmembrane (UniProt)
Note: Ranking calculated with the InfoGainAttributeEval attribute evaluator in Weka. Features that are specifically related to the protein kinase superfamily rank among the most informative ones.
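
As an illustration of the quantity behind this ranking, the information gain of a feature with respect to the class is the class entropy minus the class entropy conditioned on the feature, H(class) - H(class | feature). The following is a minimal Python sketch of that calculation for discrete features only. It is not the Weka implementation, and the function names, feature names, and toy data are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(class) - H(class | feature) for one discrete feature."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average of the class entropy within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from the paper.
labels = ["disease", "disease", "neutral", "neutral"]
table = {
    "feature_a": ["x", "x", "y", "y"],  # separates the two classes perfectly
    "feature_b": ["p", "q", "p", "q"],  # carries no information about the class
}
ranking = rank_features(table, labels)
```

In this toy case `feature_a` ranks first because knowing its value removes all uncertainty about the class, while `feature_b` leaves the class entropy unchanged. Numeric features would need to be discretized before a calculation of this kind could be applied to them.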