
Table 1 Model Architecture Performance by Feature Set

From: CASowary: CRISPR-Cas13 guide RNA predictor for transcript depletion

 

| Model | P < 0.05 | Z > 2 | Z > 3 | Gini | Gini DT |
|---|---|---|---|---|---|
| 3-Fold | | | | | |
| Random Forest | 0.717 [55.4] | 0.713 [57.55] | 0.714 [51.1] | 0.716 [54.35] | 0.717 [50.55] |
| KNN | 0.715 [2] | 0.715 [2] | 0.711 [2] | 0.717 [3] | 0.717 [3] |
| SVC [linear] | 0.71 | 0.609 | 0.505 | 0.6 | 0.64 |
| SVC [poly] | 0.698 | 0.607 | 0.517 | 0.599 | 0.616 |
| SVC [sigmoid] | 0.62 | 0.551 | 0.47 | 0.54 | 0.553 |
| SVC [rbf] | 0.642 | 0.521 | 0.487 | 0.567 | 0.581 |
| Decision Tree | 0.715 | 0.715 | 0.711 | 0.715 | 0.714 |
| 5-Fold | | | | | |
| Random Forest | 0.654 [41.3] | 0.651 [46.75] | 0.641 [45.75] | 0.652 [43.95] | 0.654 [41.3] |
| KNN | 0.558 [3.39] | 0.496 [9.71] | 0.515 [8.44] | 0.535 [3.67] | 0.558 [3.39] |
| SVC [linear] | 0.615 | 0.553 | 0.504 | 0.589 | 0.634 |
| SVC [poly] | 0.593 | 0.535 | 0.501 | 0.562 | 0.574 |
| SVC [sigmoid] | 0.551 | 0.489 | 0.458 | 0.517 | 0.521 |
| SVC [rbf] | 0.563 | 0.504 | 0.469 | 0.537 | 0.541 |
| Decision Tree | 0.634 | 0.636 | 0.634 | 0.64 | 0.637 |

  1. Model accuracy for each architecture and feature list under 3-fold and 5-fold cross-validation. For KNN and Random Forest, the values in brackets are the average parameter values that gave the highest accuracy.
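
The evaluation scheme described in the note can be illustrated with a minimal sketch, written in plain Python rather than with any particular library. It assumes that a bracketed value is obtained by picking the best-scoring parameter in each fold and averaging those picks across folds; the source does not spell out this procedure, so that reading is an assumption. The function names (`k_fold_indices`, `knn_predict`, `cross_validate_knn`) and the toy data are hypothetical and are not taken from CASowary.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, test) index pairs."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x (squared Euclidean)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_validate_knn(X, y, n_folds, k_values):
    """Return (mean best accuracy, mean best k) across folds.

    Assumption: the best k is chosen per fold on that fold's held-out data,
    which gives an optimistic accuracy estimate.
    """
    best_accs, best_ks = [], []
    for train, test in k_fold_indices(len(X), n_folds):
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        scores = {}
        for k in k_values:
            hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test)
            scores[k] = hits / len(test)
        best_k = max(scores, key=scores.get)  # first k with the top score
        best_accs.append(scores[best_k])
        best_ks.append(best_k)
    return mean(best_accs), mean(best_ks)


# Toy, well-separated two-class data (hypothetical, for illustration only).
X = [[i % 2 + 0.01 * i] for i in range(12)]
y = [i % 2 for i in range(12)]
accuracy, avg_k = cross_validate_knn(X, y, n_folds=3, k_values=[1, 3])
```

Running the same function with `n_folds=3` and `n_folds=5` over several feature lists would produce one row of the table per model, with the averaged parameter playing the role of the bracketed value.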