Table 3 Average classification accuracies for 13 numerical representations. Averages over the six classifiers are in bold

From: ML-DSP: Machine Learning with Digital Signal Processing for ultrafast, accurate, and scalable genome classification at all taxonomic levels

Columns give the numerical representation; rows give the dataset and classification model.

| Dataset | Classification model | Integer | Integer (Other) | Real | Atomic | EIIP | PP | Paired Num. | NN based doublet | Codon | Just-A | Just-C | Just-G | Just-T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Primates (148 sequences) | Linear Discriminant | 97.3% | 98.0% | 99.3% | 98.6% | 99.3% | 99.3% | 97.3% | 97.3% | 98.0% | 98.0% | 97.3% | 96.6% | 96.6% |
| | Linear SVM | 97.3% | 95.9% | 98.6% | 96.6% | 97.3% | 98.0% | 95.9% | 97.3% | 94.6% | 98.0% | 96.6% | 96.6% | 95.3% |
| | Quadratic SVM | 97.3% | 95.9% | 98.6% | 93.2% | 95.9% | 98.6% | 96.6% | 98.6% | 95.9% | 98.0% | 98.0% | 97.3% | 95.9% |
| | Fine KNN | 98.0% | 98.0% | 100.0% | 98.0% | 96.6% | 100.0% | 99.3% | 99.3% | 98.0% | 100.0% | 98.6% | 100.0% | 98.6% |
| | Subspace Discriminant | 98.0% | 97.3% | 99.3% | 98.0% | 99.3% | 98.6% | 95.3% | 97.3% | 95.9% | 98.0% | 97.3% | 98.0% | 95.3% |
| | Subspace KNN | 98.0% | 97.3% | 98.6% | 96.6% | 95.9% | 98.0% | 100.0% | 98.0% | 98.0% | 99.3% | 97.3% | 98.6% | 98.6% |
| | **Average** | **97.7%** | **97.1%** | **99.1%** | **96.8%** | **97.4%** | **98.8%** | **97.4%** | **98.0%** | **96.7%** | **98.6%** | **97.5%** | **97.9%** | **96.7%** |
| Protists (159 sequences) | Linear Discriminant | 83.6% | 84.9% | 85.5% | 86.2% | 86.2% | 84.3% | 85.5% | 83.0% | 85.5% | 84.3% | 83.6% | 83.0% | 83.6% |
| | Linear SVM | 84.3% | 83.0% | 83.6% | 83.0% | 83.0% | 71.7% | 82.4% | 83.0% | 83.6% | 83.6% | 83.6% | 83.6% | 83.0% |
| | Quadratic SVM | 84.9% | 84.9% | 83.6% | 82.4% | 83.0% | 81.1% | 85.5% | 84.9% | 86.2% | 83.0% | 84.3% | 83.0% | 86.2% |
| | Fine KNN | 86.8% | 86.2% | 81.8% | 84.3% | 88.1% | 78.0% | 89.9% | 88.7% | 91.8% | 86.8% | 88.7% | 93.7% | 92.5% |
| | Subspace Discriminant | 85.5% | 84.9% | 88.1% | 86.8% | 85.5% | 86.8% | 83.6% | 83.0% | 85.5% | 84.9% | 83.6% | 83.0% | 83.6% |
| | Subspace KNN | 88.7% | 87.4% | 91.8% | 85.5% | 88.1% | 91.2% | 89.9% | 88.1% | 93.1% | 86.8% | 88.1% | 92.5% | 93.7% |
| | **Average** | **85.6%** | **85.2%** | **85.7%** | **84.7%** | **85.7%** | **82.2%** | **86.1%** | **85.1%** | **87.6%** | **84.9%** | **85.3%** | **86.5%** | **87.1%** |
| Fungi (226 sequences) | Linear Discriminant | 76.3% | 76.8% | 82.1% | 50.9% | 57.1% | 80.4% | 75.4% | 68.8% | 77.7% | 81.7% | 70.5% | 71.9% | 79.0% |
| | Linear SVM | 66.5% | 58.0% | 76.8% | 49.1% | 46.0% | 73.7% | 73.2% | 66.1% | 71.0% | 75.9% | 64.7% | 66.1% | 75.4% |
| | Quadratic SVM | 58.9% | 59.8% | 82.6% | 33.9% | 37.9% | 79.9% | 71.4% | 67.4% | 63.4% | 71.0% | 67.9% | 71.4% | 64.3% |
| | Fine KNN | 61.6% | 56.7% | 84.4% | 49.6% | 54.9% | 85.7% | 72.3% | 65.2% | 58.0% | 68.8% | 61.6% | 68.8% | 67.9% |
| | Subspace Discriminant | 74.6% | 75.0% | 78.6% | 46.0% | 55.4% | 79.0% | 75.0% | 71.4% | 78.1% | 79.9% | 68.8% | 69.2% | 78.6% |
| | Subspace KNN | 63.4% | 58.9% | 89.3% | 51.8% | 58.0% | 89.3% | 68.3% | 63.8% | 59.8% | 67.9% | 65.6% | 72.8% | 64.3% |
| | **Average** | **66.9%** | **64.2%** | **82.3%** | **46.9%** | **51.6%** | **81.3%** | **72.6%** | **67.1%** | **68.0%** | **74.2%** | **66.5%** | **70.0%** | **71.6%** |
| Plants (174 sequences) | Linear Discriminant | 96.0% | 95.4% | 76.4% | 92.5% | 93.7% | 91.4% | 95.4% | 96.0% | 95.4% | 96.0% | 96.0% | 96.0% | 96.0% |
| | Linear SVM | 96.0% | 96.0% | 85.6% | 96.0% | 96.0% | 87.9% | 94.8% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% |
| | Quadratic SVM | 96.0% | 96.0% | 86.8% | 96.0% | 96.0% | 88.5% | 94.3% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% |
| | Fine KNN | 93.1% | 94.8% | 91.4% | 94.3% | 94.3% | 90.8% | 86.8% | 93.1% | 94.3% | 93.7% | 91.4% | 93.1% | 93.1% |
| | Subspace Discriminant | 96.0% | 95.4% | 87.4% | 94.8% | 95.4% | 87.9% | 94.8% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% | 96.0% |
| | Subspace KNN | 93.7% | 94.3% | 90.2% | 94.3% | 94.3% | 90.2% | 92.5% | 92.5% | 94.8% | 93.7% | 94.3% | 94.8% | 94.3% |
| | **Average** | **95.1%** | **95.3%** | **86.3%** | **94.7%** | **95.0%** | **89.5%** | **93.1%** | **94.9%** | **95.4%** | **95.2%** | **95.0%** | **95.3%** | **95.2%** |
| Amphibians (290 sequences) | Linear Discriminant | 92.1% | 91.4% | 95.5% | 89.0% | 89.3% | 99.0% | 94.5% | 93.4% | 91.4% | 96.2% | 93.4% | 93.8% | 91.7% |
| | Linear SVM | 91.0% | 90.0% | 89.0% | 88.3% | 88.6% | 93.1% | 89.0% | 91.4% | 90.0% | 93.1% | 92.1% | 92.4% | 90.3% |
| | Quadratic SVM | 90.3% | 89.0% | 92.4% | 59.3% | 83.4% | 96.6% | 91.0% | 93.1% | 86.9% | 94.1% | 93.1% | 93.4% | 90.7% |
| | Fine KNN | 90.0% | 86.9% | 96.6% | 83.8% | 83.4% | 98.3% | 87.9% | 92.1% | 89.7% | 93.4% | 91.7% | 94.8% | 89.7% |
| | Subspace Discriminant | 90.7% | 90.3% | 90.0% | 89.3% | 89.3% | 96.6% | 90.3% | 91.7% | 90.3% | 95.2% | 92.8% | 92.1% | 91.0% |
| | Subspace KNN | 88.3% | 86.6% | 94.1% | 85.2% | 84.5% | 98.3% | 89.7% | 92.8% | 87.2% | 94.5% | 90.0% | 94.8% | 90.3% |
| | **Average** | **90.4%** | **89.0%** | **92.9%** | **82.5%** | **86.4%** | **97.0%** | **90.4%** | **92.4%** | **89.3%** | **94.4%** | **92.2%** | **93.6%** | **90.6%** |
| Mammals (830 sequences) | Linear Discriminant | 98.3% | 97.6% | 97.7% | 97.0% | 96.0% | 97.1% | 96.6% | 97.2% | 96.7% | 98.0% | 96.9% | 96.3% | 96.3% |
| | Linear SVM | 90.6% | 89.6% | 88.9% | 84.5% | 85.3% | 91.6% | 86.5% | 91.2% | 88.8% | 90.8% | 90.0% | 88.2% | 88.1% |
| | Quadratic SVM | 92.4% | 89.9% | 91.0% | 32.9% | 41.7% | 93.4% | 88.0% | 93.4% | 89.9% | 90.7% | 92.5% | 89.8% | 90.5% |
| | Fine KNN | 94.1% | 92.3% | 96.0% | 79.9% | 81.0% | 96.6% | 93.9% | 93.7% | 91.7% | 96.3% | 96.3% | 94.8% | 95.5% |
| | Subspace Discriminant | 92.3% | 91.9% | 92.3% | 88.3% | 87.7% | 94.0% | 90.2% | 91.7% | 90.4% | 92.3% | 93.4% | 91.9% | 91.3% |
| | Subspace KNN | 92.8% | 90.8% | 95.5% | 78.2% | 79.2% | 96.4% | 91.2% | 93.3% | 89.2% | 94.8% | 94.3% | 94.9% | 92.2% |
| | **Average** | **93.4%** | **92.0%** | **93.6%** | **76.8%** | **78.5%** | **94.9%** | **91.1%** | **93.4%** | **91.1%** | **93.8%** | **93.9%** | **92.7%** | **92.3%** |
| Insects (898 sequences) | Linear Discriminant | 92.2% | 92.7% | 90.1% | 91.6% | 92.2% | 94.2% | 93.3% | 92.4% | 89.2% | 93.1% | 92.1% | 94.4% | 90.4% |
| | Linear SVM | 86.9% | 82.6% | 85.9% | 66.7% | 69.5% | 85.3% | 86.4% | 90.0% | 80.5% | 89.4% | 87.4% | 88.4% | 86.2% |
| | Quadratic SVM | 85.0% | 81.8% | 86.7% | 24.4% | 21.3% | 87.1% | 85.7% | 89.6% | 82.6% | 89.5% | 88.0% | 89.6% | 85.3% |
| | Fine KNN | 82.0% | 79.3% | 80.0% | 62.5% | 68.0% | 93.2% | 83.3% | 87.9% | 80.8% | 85.6% | 83.6% | 87.9% | 83.0% |
| | Subspace Discriminant | 85.7% | 83.9% | 88.3% | 77.5% | 79.3% | 89.1% | 88.0% | 88.2% | 82.1% | 87.1% | 87.6% | 88.2% | 86.4% |
| | Subspace KNN | 80.4% | 77.3% | 90.5% | 61.0% | 67.6% | 92.0% | 81.4% | 86.9% | 77.4% | 85.4% | 86.0% | 89.3% | 81.4% |
| | **Average** | **85.4%** | **82.9%** | **86.9%** | **64.0%** | **66.3%** | **90.2%** | **86.4%** | **89.2%** | **82.1%** | **88.4%** | **87.5%** | **89.6%** | **85.5%** |
| 3Classes (2170 sequences; Subspace Discriminant & Subspace KNN omitted) | Linear Discriminant | 99.9% | 99.9% | 99.6% | 99.4% | 99.7% | 99.7% | 99.7% | 99.7% | 99.8% | 99.8% | 99.9% | 99.9% | 99.6% |
| | Linear SVM | 94.1% | 90.2% | 99.4% | 89.8% | 89.3% | 99.6% | 99.2% | 98.1% | 94.6% | 99.1% | 97.3% | 99.3% | 97.9% |
| | Quadratic SVM | 97.5% | 92.5% | 99.4% | 66.6% | 78.8% | 99.7% | 99.5% | 98.7% | 97.6% | 99.4% | 98.4% | 99.5% | 98.8% |
| | Fine KNN | 95.9% | 95.2% | 97.6% | 93.3% | 94.4% | 95.9% | 97.6% | 97.7% | 96.4% | 98.9% | 98.0% | 99.2% | 98.4% |
| | **Average** | **96.9%** | **94.5%** | **99.0%** | **87.3%** | **90.6%** | **98.7%** | **99.0%** | **98.6%** | **97.1%** | **99.3%** | **98.4%** | **99.5%** | **98.7%** |
| Vertebrates (4322 sequences; Subspace Discriminant & Subspace KNN omitted) | Linear Discriminant | 99.7% | 99.7% | 99.6% | 99.3% | 99.5% | 99.7% | 99.2% | 99.3% | 99.3% | 99.3% | 99.4% | 99.5% | 99.2% |
| | Linear SVM | 98.3% | 98.2% | 98.5% | 96.3% | 96.8% | 97.9% | 98.0% | 98.4% | 98.2% | 98.2% | 98.5% | 98.8% | 98.4% |
| | Quadratic SVM | 98.1% | 96.6% | 99.0% | 40.6% | 34.0% | 98.7% | 98.4% | 98.2% | 96.7% | 98.5% | 98.7% | 98.8% | 98.6% |
| | Fine KNN | 97.1% | 96.1% | 98.4% | 88.3% | 91.7% | 97.9% | 96.4% | 96.3% | 95.3% | 96.4% | 97.5% | 97.6% | 97.2% |
| | **Average** | **98.3%** | **97.7%** | **98.9%** | **81.1%** | **80.5%** | **98.6%** | **98.0%** | **98.1%** | **97.4%** | **98.1%** | **98.5%** | **98.7%** | **98.4%** |
| Table average | | 90.0% | 88.7% | 91.6% | 79.4% | 81.3% | 92.3% | 90.5% | 90.7% | 89.4% | 91.9% | 90.5% | 91.5% | 90.7% |
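
The averages in the table can be spot-checked with a few lines of Python. This is a minimal sketch, not the authors' code. It assumes each "Average" row is the unweighted mean of the classifier rows above it, and that "Table average" is the unweighted mean of the nine per-dataset averages. Both readings are consistent with the published numbers but are not stated in the table itself. The variable names are illustrative.

```python
from statistics import mean

# Real column, Primates classifier rows (percent, copied from the table):
# Linear Discriminant, Linear SVM, Quadratic SVM, Fine KNN,
# Subspace Discriminant, Subspace KNN.
primates_real = [99.3, 98.6, 98.6, 100.0, 99.3, 98.6]
primates_real_avg = mean(primates_real)

# Integer column, the nine per-dataset "Average" rows, Primates to Vertebrates.
# Assumption: "Table average" is the plain mean of these rounded values.
integer_dataset_avgs = [97.7, 85.6, 66.9, 95.1, 90.4, 93.4, 85.4, 96.9, 98.3]
integer_table_avg = mean(integer_dataset_avgs)

print(f"Primates / Real average: {primates_real_avg:.1f}%")   # table reports 99.1%
print(f"Integer table average:   {integer_table_avg:.1f}%")   # table reports 90.0%
```

Because the second check averages values already rounded to one decimal place, other columns may differ from the published "Table average" by 0.1 percentage points.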