Table 2. Prediction performance of SVM models trained on combinations of three kinds of regulatory features: over-represented hexamer nucleotides (OR), nucleotide composition (NC), and DNA stability (DS). Performance was evaluated by cross-validation using a window from -200 to +100 relative to the TSS (+1).

From: GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group

| Training set | Window size | Features | Precision | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|
| All (6,452) | -200 ~ +100 | OR+NC | 77% | 71% | 79% | 75% |
| | -200 ~ +100 | OR+DS | 76% | 69% | 78% | 74% |
| | -200 ~ +100 | NC+DS | 75% | 74% | 76% | 75% |
| | -200 ~ +100 | OR+NC+DS | 79% | 76% | 79% | 78% |
| With CpG (4,898) | -200 ~ +100 | OR+NC | 79% | 81% | 79% | 80% |
| | -200 ~ +100 | OR+DS | 77% | 80% | 76% | 78% |
| | -200 ~ +100 | NC+DS | 77% | 82% | 75% | 78% |
| | -200 ~ +100 | OR+NC+DS | 80% | 84% | 79% | 82% |
| Without CpG (1,554) | -200 ~ +100 | OR+NC | 68% | 70% | 67% | 68% |
| | -200 ~ +100 | OR+DS | 68% | 71% | 66% | 68% |
| | -200 ~ +100 | NC+DS | 66% | 67% | 66% | 66% |
| | -200 ~ +100 | OR+NC+DS | 69% | 69% | 71% | 70% |

  1. The number of training sequences used to construct each SVM model is shown in parentheses in the "Training set" column.
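
For readers unfamiliar with the four reported measures, the sketch below shows how precision, sensitivity, specificity, and accuracy are conventionally computed from the counts of a binary confusion matrix. It uses the standard textbook definitions; this section does not state the authors' exact formulas, so treat the correspondence as an assumption. The names `ConfusionCounts` and `evaluation_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical container, not from the paper)."""
    tp: int  # positives correctly predicted as positive
    fp: int  # negatives wrongly predicted as positive
    tn: int  # negatives correctly predicted as negative
    fn: int  # positives wrongly predicted as negative


def evaluation_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four measures using their conventional definitions.

    Assumes every denominator is non-zero, i.e. the test data contains
    both classes and the classifier predicts at least one positive.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / total,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; they do not reproduce any table row.
    example = ConfusionCounts(tp=80, fp=30, tn=70, fn=20)
    for name, value in evaluation_metrics(example).items():
        print(f"{name}: {value:.1%}")
```

Because accuracy pools both classes, it depends on the ratio of positive to negative sequences, whereas sensitivity and specificity are each computed within a single class.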