Figure 4 | BMC Genomics

Figure 4

From: A hybrid qPCR/SNP array approach allows cost efficient assessment of KIR gene copy numbers in large samples

Error rate of k-nearest neighbour prediction from R and θ of rs592645 in random subsets of samples. Each panel shows the LOOCV error rates of KIR3DL1/3DS1 copy number prediction from R and θ of rs592645 in the remaining unlabeled samples when a differently sized subset of the training data is used. The percentage of the complete training data set and the size of the subset are given in the title of each panel. Each point represents the LOOCV error rate averaged over ten multiply imputed qPCR call datasets (using the posterior probabilities from Figure 1). Smoothing lines show the average over 25 independent random subsets of training data. The black dashed line represents the observed error rate in the complete sample. As the size of the training dataset increases, the error rate becomes less sensitive to the choice of the parameter k. Only 295 samples are required to achieve LOOCV error rates <5%, and 590 samples for error rates <2.5%.
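
The subsampling procedure described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names (make_synthetic, knn_predict, loocv_error, subset_error_curve) are hypothetical, the (R, θ) values are synthetic rather than taken from the study, and the multiple imputation of qPCR calls is omitted. It computes the leave-one-out error of a k-nearest-neighbour classifier inside random training subsets of a given fraction and averages the result over repeated draws for each k.

```python
import math
import random
from collections import Counter


def make_synthetic(n_per_class, seed=0):
    """Synthetic (R, theta) points for copy numbers 0, 1, 2 (illustrative only)."""
    rng = random.Random(seed)
    samples = []
    for copy_number, theta_centre in [(0, 0.1), (1, 0.5), (2, 0.9)]:
        for _ in range(n_per_class):
            r = rng.uniform(0.9, 1.1)
            theta = rng.uniform(theta_centre - 0.05, theta_centre + 0.05)
            samples.append(((r, theta), copy_number))
    return samples


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (Euclidean distance).

    Ties are resolved in favour of the label that appears first among the
    neighbours, i.e. the label of the closest tied neighbour.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loocv_error(samples, k):
    """Leave-one-out error: predict each sample from all the others."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(samples)


def subset_error_curve(samples, fraction, ks, n_repeats, seed=0):
    """Mean LOOCV error per k over n_repeats random subsets of the given fraction."""
    rng = random.Random(seed)
    size = max(2, round(fraction * len(samples)))
    curve = {}
    for k in ks:
        rates = [loocv_error(rng.sample(samples, size), k) for _ in range(n_repeats)]
        curve[k] = sum(rates) / len(rates)
    return curve


data = make_synthetic(n_per_class=20)
for fraction in (0.1, 0.5, 1.0):
    print(fraction, subset_error_curve(data, fraction, ks=[1, 3, 5], n_repeats=5))
```

In this sketch the error is evaluated within each subset; evaluating on the samples left out of the subset, as the caption describes, would only require passing the held-out samples as queries to knn_predict.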
