
Table 2 Inferring quality from clustering methods

From: A singular value decomposition approach for improved taxonomic classification of biological sequences

| Algorithm/software | Rank | N | Min cLtlf | Max cLtlf | Mean cLtlf | cLtlf clusters sum (∑cLtlf) | cLtlf standard deviation (σ) | Linnaean clusters quality (∑cLtlf/σ) | Linnaean clusters quality gain (K09/K60)% | cLtlf median | Median clusters quality gain (K09/K60)% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AQBC-javaml | K09 | 8 | 32 | 180 | 71.25 | 570 | 52.27 | 10.90 | 49.58% | 42.50 | 26.87% |
| | K60 | 8 | 0 | 220 | 64.38 | 515 | 70.64 | 7.29 | | 33.50 | |
| EM-weka | K09 | 8 | 40 | 120 | 70.12 | 561 | 31.53 | 17.79 | 48.99% | 57.00 | 1.79% |
| | K60 | 8 | 16 | 160 | 70.25 | 562 | 47.06 | 11.94 | | 56.00 | |
| Kmeans-weka | K09 | 8 | 30 | 180 | 69.38 | 555 | 46.70 | 11.88 | 9.26% | 61.50 | -2.38% |
| | K60 | 8 | 16 | 180 | 69.88 | 559 | 51.39 | 10.88 | | 63.00 | |
| Kmeans-R | K09 | 8 | 40 | 140 | 71.62 | 573 | 34.48 | 16.62 | 9.21% | 62.00 | 6.90% |
| | K60 | 8 | 26 | 140 | 71.75 | 574 | 37.72 | 15.22 | | 58.00 | |
| K-Medoids-R | K09 | 8 | 24 | 160 | 70.12 | 561 | 44.37 | 12.64 | 15.92% | 60.00 | 13.21% |
| | K60 | 8 | 26 | 180 | 68.50 | 548 | 50.24 | 10.91 | | 53.00 | |
| MDBC-weka | K09 | 8 | 30 | 180 | 69.38 | 555 | 46.70 | 11.88 | 9.26% | 61.50 | -2.38% |
| | K60 | 8 | 16 | 180 | 69.88 | 559 | 51.39 | 10.88 | | 63.00 | |
| ASAP-in house | K09 | 8 | 13 | 225 | 70.25 | 562 | 67.68 | 8.30 | 27.51% | 52.00 | 197.14% |
| | K60 | 8 | 13 | 243 | 69.12 | 553 | 84.92 | 6.51 | | 17.50 | |

  1. All evaluated partitioning algorithms showed improved Linnaean clusters quality when run on the optimized distance matrix created with the better-performing kdc parameters tested.
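
The two derived columns can be recomputed from the summary values in the table. The following is a minimal sketch, assuming the quality is the clusters sum divided by the standard deviation (as the column header states) and that "(K09/K60)%" means the relative gain of K09 over K60, that is (K09/K60 − 1) × 100. The second reading is an assumption inferred from the printed values, not a formula given in the table. The names `TABLE2`, `cluster_quality`, `gain_percent` and `gains` are hypothetical, and only a subset of rows is included.

```python
# Summary values copied from Table 2: (sum of cLtlf, standard deviation, median).
# The dictionary layout and all function names are illustrative assumptions.
TABLE2 = {
    "AQBC-javaml":   {"K09": (570, 52.27, 42.50), "K60": (515, 70.64, 33.50)},
    "EM-weka":       {"K09": (561, 31.53, 57.00), "K60": (562, 47.06, 56.00)},
    "Kmeans-weka":   {"K09": (555, 46.70, 61.50), "K60": (559, 51.39, 63.00)},
    "ASAP-in house": {"K09": (562, 67.68, 52.00), "K60": (553, 84.92, 17.50)},
}


def cluster_quality(cltlf_sum, cltlf_sd):
    """Linnaean clusters quality as named in the column header: sum / sigma."""
    return cltlf_sum / cltlf_sd


def gain_percent(k09_value, k60_value):
    """Relative gain of K09 over K60 in percent (assumed reading of '(K09/K60)%')."""
    return (k09_value / k60_value - 1.0) * 100.0


def gains(algorithm):
    """Return (quality gain %, median gain %) for one algorithm, rounded to 2 places."""
    sum09, sd09, median09 = TABLE2[algorithm]["K09"]
    sum60, sd60, median60 = TABLE2[algorithm]["K60"]
    quality_gain = gain_percent(
        cluster_quality(sum09, sd09), cluster_quality(sum60, sd60)
    )
    median_gain = gain_percent(median09, median60)
    return round(quality_gain, 2), round(median_gain, 2)


if __name__ == "__main__":
    for name in TABLE2:
        print(name, gains(name))
```

Under that reading, the AQBC-javaml row yields 49.58% for the quality gain and 26.87% for the median gain, which agrees with the printed values. The quality gain is computed from the unrounded ∑cLtlf/σ ratios; using the two-decimal quality values shown in the table gives a slightly different figure.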