Table 4 Performance of the models in discriminating cross-regional from within-regional strains in the lineage 4 cohort

From: Association between two-component systems gene mutation and Mycobacterium tuberculosis transmission revealed by whole genome sequencing

Training set: n = 3201 (245 cross-regional strains, 2956 within-regional strains)

Test set: n = 1373 (93 cross-regional strains, 1280 within-regional strains)

| Parameters | Training set: Random Forest | Training set: Gradient Boosted Classification Tree | Test set: Random Forest | Test set: Gradient Boosted Classification Tree |
|---|---|---|---|---|
| Kappa | 0.649 | 0.553 | 0.472 | 0.435 |
| AUC (95% CI) | 0.954 (0.947, 0.961) | 0.941 (0.933, 0.949) | 0.927 (0.913, 0.941) | 0.922 (0.908, 0.936) |
| Sensitivity (95% CI) | 0.981 (0.976, 0.986) | 0.458 (0.441, 0.475) | 0.971 (0.962, 0.980) | 0.363 (0.338, 0.388) |
| Specificity (95% CI) | 0.981 (0.976, 0.986) | 0.990 (0.987, 0.993) | 0.971 (0.962, 0.980) | 0.984 (0.977, 0.991) |
| PPV (95% CI) | 0.732 (0.717, 0.747) | 0.783 (0.769, 0.797) | 0.543 (0.517, 0.569) | 0.649 (0.624, 0.674) |
| NPV (95% CI) | 0.969 (0.963, 0.975) | 0.958 (0.951, 0.965) | 0.962 (0.952, 0.972) | 0.951 (0.940, 0.962) |
| PLR (95% CI) | 23.808 (23.802, 23.814) | 18.728 (18.721, 18.735) | 14.323 (14.315, 14.331) | 13.142 (13.131, 13.153) |
| NIR (95% CI) | 0.042 (-0.017, 0.101) | 0.053 (-0.014, 0.120) | 0.070 (0.006, 0.134) | 0.076 (0, 0.152) |
| Accuracy (95% CI) | 0.954 (0.947, 0.961) | 0.951 (0.944, 0.958) | 0.937 (0.924, 0.950) | 0.938 (0.925, 0.951) |

  1. AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value; PLR, positive likelihood ratio; NLR, negative likelihood ratio; CI, confidence interval
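
Most of the point estimates abbreviated above (sensitivity, specificity, PPV, NPV, PLR, NLR, accuracy, kappa) are functions of a single 2×2 confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. It assumes that cross-regional is treated as the positive class, and the counts are hypothetical values chosen for illustration, not data from the study. The names `Confusion` and `classification_metrics` are also illustrative.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 confusion matrix; 'positive' is assumed to mean cross-regional."""
    tp: int  # positives predicted positive
    fp: int  # negatives predicted positive
    fn: int  # positives predicted negative
    tn: int  # negatives predicted negative


def classification_metrics(c: Confusion) -> dict:
    """Standard point estimates; assumes every denominator is non-zero."""
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    false_positive_rate = c.fp / (c.fp + c.tn)
    accuracy = (c.tp + c.tn) / n

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the row and column marginals.
    p_chance = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.tn + c.fn) * (c.tn + c.fp)
    ) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "plr": sensitivity / false_positive_rate,
        "nlr": (1 - sensitivity) / specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Hypothetical, imbalanced example: 50 positives and 950 negatives.
example = classification_metrics(Confusion(tp=40, fp=20, fn=10, tn=930))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With a rare positive class, as in this hypothetical example, accuracy and NPV stay high even when PPV is modest. Kappa and the likelihood ratios are less affected by the class imbalance, which is why they are usually read alongside accuracy.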