Table 2 Prediction performance of SVM models built from combinations of three kinds of regulatory features: over-represented hexamer nucleotides (OR), nucleotide composition (NC), and DNA stability (DS). Performance is evaluated by cross-validation using a window from -200 to +100 relative to the transcription start site (TSS, +1).

From: GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group

| Training set | Window size | Features | Precision | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|
| All (6,452) | -200 ~ +100 | OR+NC | 77% | 71% | 79% | 75% |
| | -200 ~ +100 | OR+DS | 76% | 69% | 78% | 74% |
| | -200 ~ +100 | NC+DS | 75% | 74% | 76% | 75% |
| | -200 ~ +100 | OR+NC+DS | 79% | 76% | 79% | 78% |
| With CpG (4,898) | -200 ~ +100 | OR+NC | 79% | 81% | 79% | 80% |
| | -200 ~ +100 | OR+DS | 77% | 80% | 76% | 78% |
| | -200 ~ +100 | NC+DS | 77% | 82% | 75% | 78% |
| | -200 ~ +100 | OR+NC+DS | 80% | 84% | 79% | 82% |
| Without CpG (1,554) | -200 ~ +100 | OR+NC | 68% | 70% | 67% | 68% |
| | -200 ~ +100 | OR+DS | 68% | 71% | 66% | 68% |
| | -200 ~ +100 | NC+DS | 66% | 67% | 66% | 66% |
| | -200 ~ +100 | OR+NC+DS | 69% | 69% | 71% | 70% |
  1. The number of training sequences used to construct each SVM model is shown in parentheses in the "Training set" column.
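
The table does not spell out how the four metrics are computed. As a minimal sketch, assuming the standard confusion-matrix definitions (which the source does not explicitly confirm), they could be derived as follows. The function name `classification_metrics` and the counts used below are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, sensitivity, specificity, and accuracy as fractions.

    Assumes the standard definitions based on confusion-matrix counts:
    tp/fp = true/false positives, tn/fn = true/false negatives.
    """
    return {
        "precision": tp / (tp + fp),        # fraction of predicted positives that are correct
        "sensitivity": tp / (tp + fn),      # fraction of actual positives that are recovered
        "specificity": tn / (tn + fp),      # fraction of actual negatives that are rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # fraction of all predictions that are correct
    }


# Hypothetical counts for illustration only (not data from the paper).
metrics = classification_metrics(tp=80, fp=20, tn=70, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under these definitions, precision and sensitivity describe performance on the positive class, specificity describes performance on the negative class, and accuracy summarizes both.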