Fig. 5 | BMC Genomics

Fig. 5

From: The landscape of PBMC methylome in canine mammary tumors reveals the epigenetic regulation of immune marker genes and its potential application in predicting tumor malignancy

A machine learning-based diagnostic two-step classifier that discriminates tumor from normal PBMCs and then carcinoma from benign PBMCs. A The concept of a two-step classifier for precisely distinguishing three groups (Normal, Benign, and Carcinoma). B Schematic diagram of the diagnostic methylome-based classifier modeling. To generate the best predictive model, tenfold cross-validation was performed with multiple ML algorithms, and the performance of each model was then evaluated. C The ROC curves of the NT classifiers built with SVM_L, SVM_R, RF, GBM, KNN, and logistic regression. AUC values are shown in the bottom-right area under the curves. D Heatmap of the confusion matrix (left) for tumor detection by the SVM_L-based NT classifier, which has the best AUC value (AUC = 1) and accuracy (Accuracy = 1). The confusion matrix for tenfold cross-validation (right) shows the prediction results for the seven to nine test samples in each fold. E Validation of the predictive performance of multiple NT classifiers. PBMC MBD-seq data from six dogs with CMT were used as the validation set. With the exception of the logistic regression classifier, which misclassified three of the six samples, the SVM_L, SVM_R, RF, GBM, and KNN classifiers predicted tumor. F The ROC curves (left) for the BC classifiers modeled with 2911 DMRs comprising ‘BC_DMR’ and DMRs identified ‘only in NB_DMR’ or ‘only in NC_DMR’. The BC classifiers show lower AUC values than the NT classifiers. The bar graph (right) shows that GBM achieved the highest accuracy. The 127 DMRs extracted by GBM-based feature importance were used to re-model the BC classifier. This iterative process is illustrated in the center of (B). G The ROC curves of the re-modeled BC classifiers using 127 DMRs, which show enhanced performance compared with the previous BC classifiers. H The improved performance was confirmed by both a heatmap of the confusion matrix (left) and the tenfold confusion matrix (right) for the final BC classifier (SVM_L) generated using 127 DMRs.
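
The two-step concept in panel A can be sketched as a simple cascade: a sample goes to the BC (benign vs. carcinoma) classifier only if the NT (normal vs. tumor) classifier first calls it a tumor. The following is a minimal illustration, not the authors' implementation. The function `two_step_predict`, the stand-in models `toy_nt` and `toy_bc`, and the numeric scores are all hypothetical; in the figure the two steps are trained classifiers built on DMR methylation features.

```python
from typing import Callable, Sequence

Features = Sequence[float]
BinaryClassifier = Callable[[Features], bool]


def two_step_predict(
    sample: Features,
    is_tumor: BinaryClassifier,      # step 1: NT classifier (normal vs. tumor)
    is_carcinoma: BinaryClassifier,  # step 2: BC classifier (benign vs. carcinoma)
) -> str:
    """Route a sample through the NT classifier, then the BC classifier."""
    if not is_tumor(sample):
        return "Normal"
    return "Carcinoma" if is_carcinoma(sample) else "Benign"


# Hypothetical stand-in models: plain thresholds on made-up scores.
# Real models would be trained classifiers (e.g. an SVM) on DMR features.
def toy_nt(sample: Features) -> bool:
    return sample[0] > 0.5


def toy_bc(sample: Features) -> bool:
    return sample[1] > 0.5


# Made-up example inputs, one per expected group.
samples = [(0.1, 0.9), (0.8, 0.2), (0.9, 0.7)]
labels = [two_step_predict(s, toy_nt, toy_bc) for s in samples]
print(labels)  # ['Normal', 'Benign', 'Carcinoma']
```

Because the second step is reached only for samples called tumor, a sample the NT step calls normal is never passed to the BC step, whatever its BC score would have been.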
