Table 2 Prediction performance of SVM models built from combinations of three kinds of regulatory features: over-represented hexamer nucleotides (OR), nucleotide composition (NC), and DNA stability (DS). Performance is evaluated by cross-validation using a window from -200 to +100 relative to the transcription start site (TSS, +1).

From: GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group

Training set          Window size    Features    Precision  Sensitivity  Specificity  Accuracy
All (6,452)           -200 ~ +100    OR+NC       77%        71%          79%          75%
                      -200 ~ +100    OR+DS       76%        69%          78%          74%
                      -200 ~ +100    NC+DS       75%        74%          76%          75%
                      -200 ~ +100    OR+NC+DS    79%        76%          79%          78%
With CpG (4,898)      -200 ~ +100    OR+NC       79%        81%          79%          80%
                      -200 ~ +100    OR+DS       77%        80%          76%          78%
                      -200 ~ +100    NC+DS       77%        82%          75%          78%
                      -200 ~ +100    OR+NC+DS    80%        84%          79%          82%
Without CpG (1,554)   -200 ~ +100    OR+NC       68%        70%          67%          68%
                      -200 ~ +100    OR+DS       68%        71%          66%          68%
                      -200 ~ +100    NC+DS       66%        67%          66%          66%
                      -200 ~ +100    OR+NC+DS    69%        69%          71%          70%
  1. The number of training sequences used to construct each SVM model is shown in parentheses in the "Training set" column.
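
The table does not spell out how its four measures are computed. As a minimal sketch, assuming the standard confusion-matrix definitions apply here, they can be derived from counts of true/false positives and negatives. The function name and the counts below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the four measures as fractions, using the standard definitions.

    tp/fp: promoter predictions that are correct/incorrect
    tn/fn: non-promoter predictions that are correct/incorrect
    """
    return {
        "precision": tp / (tp + fp),      # correct share of positive predictions
        "sensitivity": tp / (tp + fn),    # share of true promoters recovered
        "specificity": tn / (tn + fp),    # share of non-promoters rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only (not from the paper).
metrics = classification_metrics(tp=80, fp=20, tn=75, fn=25)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

In a cross-validation setting, these counts would typically be accumulated over the held-out folds before the ratios are taken; whether the authors pooled counts or averaged per-fold scores is not stated in the table.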