
Table I Comparison of the encoding capability of PC-mer (using only a Linear SVM classifier) with that of FCGR (using six classifiers: Linear Discriminant (LD), Linear SVM (LSVM), Quadratic SVM (QSVM), Fine KNN (FKNN), Subspace Discriminant (SD), and Subspace KNN (SKNN)) as encoding methods for generating input vectors

From: A new profiling approach for DNA sequences based on the nucleotides' physicochemical features for accurate analysis of SARS-CoV-2 genomes

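| Encoding algorithm | k-mer | Classification model | Metrics | Test-1 (%) | Test-2 (%) | Test-3a (%) | Test-3b (%) | Test-4 (%) | Test-5 (%) | Test-6 (%) | Human Coronavirus (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PC-mer | 12 | Linear SVM | Accuracy | 97.25 | 95.93 | 98.52 | 100 | 100 | 99.33 | 100 | 100 |
| PC-mer | 12 | Linear SVM | F1 | 97.23 | 95.9 | 98.49 | 100 | 100 | 99.36 | 100 | 100 |
| PC-mer | 12 | Linear SVM | Precision | 97.38 | 96.16 | 98.85 | 100 | 100 | 99.55 | 100 | 100 |
| PC-mer | 12 | Linear SVM | Recall | 97.25 | 95.93 | 98.52 | 100 | 100 | 99.33 | 100 | 100 |
| FCGR | 7 | LD | Accuracy | 91.7 | 91.2 | 98.1 | 100 | 97.6 | 98.6 | 100 | - |
| FCGR | 7 | LSVM | Accuracy | 90.8 | 89.2 | 94.2 | 93.3 | 98.4 | 97.4 | 100 | - |
| FCGR | 7 | QSVM | Accuracy | 95 | 93.1 | 95.2 | 93.3 | 98.4 | 97.4 | 100 | - |
| FCGR | 7 | FKNN | Accuracy | 93.4 | 90.3 | 95.7 | 95 | 97.6 | 97.4 | 100 | - |
| FCGR | 7 | SD | Accuracy | 87.6 | 89 | 97.6 | 95 | 98.4 | 98.7 | 100 | - |
| FCGR | 7 | SKNN | Accuracy | 93.2 | 90.4 | 96.2 | 95 | 97.2 | 96.1 | 100 | - |
| FCGR | 7 | Average accuracy | Accuracy | 92 | 90.5 | 96.2 | 95.3 | 97.6 | 97.5 | 100 | - |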
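
The "Average accuracy" row appears to be the per-dataset mean of the six FCGR classifiers. As a minimal sketch under that assumption, the snippet below recomputes those means from the FCGR accuracies as listed in the table. The names `DATASETS`, `FCGR_ACCURACY`, and `average_accuracy` are illustrative and do not come from the paper, and the Human Coronavirus column is left out because FCGR reports no value for it. The recomputed means need not match the published row exactly (Test-4 is one example), which may reflect rounding or a different averaging procedure in the source.

```python
from statistics import mean

# Dataset columns for which FCGR accuracies are reported in the table.
DATASETS = ["Test-1", "Test-2", "Test-3a", "Test-3b", "Test-4", "Test-5", "Test-6"]

# FCGR accuracies (%) per classifier, copied from the table rows.
# Variable and function names here are hypothetical, for illustration only.
FCGR_ACCURACY = {
    "LD":   [91.7, 91.2, 98.1, 100.0, 97.6, 98.6, 100.0],
    "LSVM": [90.8, 89.2, 94.2, 93.3, 98.4, 97.4, 100.0],
    "QSVM": [95.0, 93.1, 95.2, 93.3, 98.4, 97.4, 100.0],
    "FKNN": [93.4, 90.3, 95.7, 95.0, 97.6, 97.4, 100.0],
    "SD":   [87.6, 89.0, 97.6, 95.0, 98.4, 98.7, 100.0],
    "SKNN": [93.2, 90.4, 96.2, 95.0, 97.2, 96.1, 100.0],
}


def average_accuracy(table):
    """Return the unweighted mean accuracy per dataset across all classifiers."""
    # zip(*rows) turns per-classifier rows into per-dataset columns.
    return [mean(column) for column in zip(*table.values())]


averages = dict(zip(DATASETS, average_accuracy(FCGR_ACCURACY)))

for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Comparing the printed values with the published "Average accuracy" row is a quick consistency check on the transcribed figures.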