Table 9 Performance of N-signal vs N-signal-free protein binary classification on automatically collected orthologs

From: Plus ça change – evolutionary sequence divergence predicts protein subcellular localization signals

| Dataset / classifier | Mean accuracy (%) | Mean AUC | Mean MCC |
| --- | --- | --- | --- |
| Yeast dataset | | | |
| J48 | 71.47±5.00 | 0.67±0.07 | 0.36±0.12 |
| SVM | 75.35±3.49 | 0.71±0.04 | 0.44±0.08 |
| Majority class fraction | 65.23 | N/A | N/A |
| Human dataset | | | |
| J48 | 69.32±4.10 | 0.72±0.07 | 0.43±0.09 |
| SVM | 72.28±5.95 | 0.72±0.06 | 0.43±0.12 |
| Majority class fraction | 62.41 | N/A | N/A |
| Plant dataset | | | |
| J48 | 79.41±6.03 | 0.75±0.06 | 0.55±0.13 |
| SVM | 83.47±4.01 | 0.79±0.04 | 0.64±0.09 |
| Majority class fraction | 63.60 | N/A | N/A |
  1. Three classification performance measures, obtained using only divergence features, are shown for discriminating N-signal-containing from N-signal-free proteins on automatically collected orthologs. AUC denotes the area under the ROC curve, and MCC denotes the Matthews correlation coefficient. For each measure, the mean and standard deviation over the 5 cross-validation folds are shown.
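
For readers who want to reproduce this kind of summary, the following is a minimal sketch of how accuracy, AUC, MCC, and the majority-class baseline might be computed for one fold and then reported as mean ± standard deviation over folds. It is not the authors' evaluation code. All function names are hypothetical, the labels and scores are toy values rather than data from the paper, and the use of the sample (n − 1) standard deviation is an assumption, since the table does not say which variant was used.

```python
import math
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive outscores a random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def majority_class_fraction(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    positives = sum(y_true)
    return max(positives, len(y_true) - positives) / len(y_true)


def summarize(per_fold_values):
    """Mean and sample standard deviation over the folds (assumed variant)."""
    return statistics.mean(per_fold_values), statistics.stdev(per_fold_values)


if __name__ == "__main__":
    # Toy labels for a single fold: 1 = N-signal, 0 = N-signal-free.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.85]
    print("accuracy:", accuracy(y_true, y_pred))
    print("MCC:", mcc(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
    print("majority class fraction:", majority_class_fraction(y_true))

    # Toy per-fold accuracies, formatted in the table's mean±std style.
    mean_acc, std_acc = summarize([0.70, 0.75, 0.80, 0.75, 0.75])
    print(f"{100 * mean_acc:.2f}±{100 * std_acc:.2f}")
```

The majority-class fraction is the accuracy a trivial classifier would reach, which is why the table lists it next to the accuracy column only: a useful classifier should exceed it, while AUC and MCC have their own chance levels of 0.5 and 0.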