Figure 5 | BMC Genomics

Figure 5

From: Predicting protein function by machine learning on amino acid sequences – a critical evaluation

Feature concordance between species. The feature lists selected for function prediction in each species, using the Wilcoxon filter as described, were analyzed for concordance. The feature selection procedure generates a sorted list of features for each species. The agreement between these lists can be quantified with a rank correlation method, for example Kendall's Coefficient of Concordance. A good correlation (reflected in a small p-value) indicates that the same features rank near the top of the compared lists. The p-values of Kendall's Coefficient of Concordance for each pairwise comparison are shown. The feature lists for the first five species show high correlation, while those of the two mycoplasmal species differ markedly from them. This may explain the difference in performance on these two species. Note that the matrix is not symmetrical, because the redundancy filtering step removes different features depending on which species is used as the reference.
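
As an illustration of the concordance calculation named in the caption, the following is a minimal Python sketch of Kendall's Coefficient of Concordance (W) for sorted feature lists. It is not the authors' code. The function names, the example feature names, and the choice to restrict each comparison to features present in every list and re-rank them are assumptions made for this sketch; ties are assumed not to occur. The sketch returns W together with the statistic m(n - 1)W, which is commonly compared against a chi-square distribution with n - 1 degrees of freedom to obtain a p-value.

```python
from typing import List, Sequence, Tuple


def ranks_on_shared(lists: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Keep only features present in every list and re-rank them 1..n.

    Assumption for this sketch: each list is sorted best-first and contains
    no duplicate features.
    """
    shared = set(lists[0]).intersection(*map(set, lists[1:]))
    features = sorted(shared)
    rank_rows = []
    for lst in lists:
        kept = [f for f in lst if f in shared]            # preserve list order
        position = {f: i + 1 for i, f in enumerate(kept)}  # rank 1 = top feature
        rank_rows.append([position[f] for f in features])
    return features, rank_rows


def kendalls_w(rank_rows: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (W, chi-square statistic) for m rankings of the same n items (no ties)."""
    m = len(rank_rows)
    n = len(rank_rows[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two rankings of at least two shared items")
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)  # spread of the rank sums
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    chi2 = m * (n - 1) * w                           # refer to chi-square, n - 1 df
    return w, chi2


if __name__ == "__main__":
    # Hypothetical feature lists for two species (illustrative names only).
    species_a = ["hydrophobicity", "length", "charge", "pI", "helix_fraction"]
    species_b = ["length", "hydrophobicity", "charge", "sheet_fraction", "pI"]
    features, rows = ranks_on_shared([species_a, species_b])
    w, chi2 = kendalls_w(rows)
    print(features, rows, w, chi2)
```

W ranges from 0 (no agreement) to 1 (identical rankings). Because the set of shared features depends on which lists enter the comparison, a reference-dependent filtering step such as the one described in the caption can make the resulting pairwise matrix asymmetric.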
