Table 1. KNN accuracy on test data for different sample sizes, test-set sizes, training-set sizes, and numbers of neighbors K

From: Alignment-free genome comparison enables accurate geographic sourcing of white oak DNA

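| Sample size (MBases) | Test size | Training size | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | K = 9 | K = 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1 | 91 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.93 | 0.97 | 0.88 | 0.94 |
| 50 | 17 | 75 | 1.00 | 1.00 | 0.99 | 0.99 | 0.97 | 0.97 | 0.95 | 0.96 | 0.94 | 0.96 |
| 50 | 32 | 60 | 0.99 | 0.99 | 0.97 | 0.97 | 0.94 | 0.95 | 0.93 | 0.95 | 0.93 | 0.95 |
| 50 | 47 | 45 | 0.98 | 0.98 | 0.95 | 0.96 | 0.93 | 0.94 | 0.93 | 0.95 | 0.91 | 0.93 |
| 50 | 62 | 30 | 0.95 | 0.95 | 0.92 | 0.93 | 0.90 | 0.93 | 0.90 | 0.92 | 0.89 | 0.91 |
| 50 | 77 | 15 | 0.89 | 0.89 | 0.85 | 0.87 | 0.81 | 0.82 | 0.79 | 0.77 | 0.73 | 0.67 |
| 100 | 1 | 91 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 0.96 | 1.00 |
| 100 | 17 | 75 | 1.00 | 1.00 | 0.99 | 0.99 | 0.98 | 0.98 | 0.97 | 0.99 | 0.97 | 0.99 |
| 100 | 32 | 60 | 1.00 | 1.00 | 0.98 | 0.99 | 0.97 | 0.98 | 0.96 | 0.97 | 0.95 | 0.96 |
| 100 | 47 | 45 | 0.99 | 0.99 | 0.97 | 0.98 | 0.96 | 0.97 | 0.95 | 0.97 | 0.95 | 0.97 |
| 100 | 62 | 30 | 0.98 | 0.98 | 0.95 | 0.96 | 0.94 | 0.96 | 0.93 | 0.95 | 0.90 | 0.91 |
| 100 | 77 | 15 | 0.93 | 0.93 | 0.90 | 0.91 | 0.86 | 0.84 | 0.81 | 0.78 | 0.70 | 0.67 |
| 300 | 1 | 91 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 300 | 17 | 75 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 |
| 300 | 32 | 60 | 1.00 | 1.00 | 0.99 | 1.00 | 0.99 | 1.00 | 0.98 | 0.99 | 0.98 | 0.99 |
| 300 | 47 | 45 | 1.00 | 1.00 | 0.99 | 0.99 | 0.98 | 0.99 | 0.98 | 0.99 | 0.97 | 0.98 |
| 300 | 62 | 30 | 0.99 | 0.99 | 0.97 | 0.98 | 0.96 | 0.97 | 0.94 | 0.95 | 0.91 | 0.92 |
| 300 | 77 | 15 | 0.96 | 0.96 | 0.92 | 0.93 | 0.86 | 0.86 | 0.81 | 0.79 | 0.74 | 0.71 |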
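
To make the layout of the table concrete, the following is a minimal sketch of how a grid of KNN test accuracies over test/training splits and values of K could be computed. It is not the authors' pipeline. The toy two-class data from `make_toy_data`, the Euclidean distance, the averaging over 20 random splits, and the rule that ties are broken in favor of the nearest neighbor are all assumptions made for illustration; only the split sizes (92 samples in total) and the range K = 1 to 10 are taken from the table.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points.

    Assumption: ties are resolved in favor of the label that appears
    first when walking the neighbors from nearest to farthest.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    best = max(votes.values())
    for _, label in neighbors:
        if votes[label] == best:
            return label


def mean_accuracy(data, test_size, k, repeats=20, seed=0):
    """Average test accuracy over `repeats` random test/training splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:test_size], shuffled[test_size:]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


def make_toy_data(n=92, seed=1):
    """Hypothetical stand-in data: two Gaussian clusters in 2D, labels 0 and 1."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 5.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    # Same test sizes as in the table; training size is the remainder of 92.
    for test_size in (1, 17, 32, 47, 62, 77):
        row = [mean_accuracy(data, test_size, k) for k in range(1, 11)]
        print(test_size, len(data) - test_size, " ".join(f"{a:.2f}" for a in row))
```

Under this particular tie-breaking assumption, K = 2 always returns the same prediction as K = 1, so the two columns coincide in the sketch. Whether the same rule explains the matching K = 1 and K = 2 columns in the table is not stated in the source.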