Table 1 Description of features used.

From: Ensemble approach combining multiple methods improves human transcription start site prediction

| Feature | Information |
| --- | --- |
| Profisi | DNA melting temperature, as calculated with Fixman & Freire's method |
| ARTS | Custom SVM kernel using both sequence and structural information |
| N-SCAN | HMM gene predictor; start of 5' UTR defines TSS |
| FirstEF | Decision tree using k-mers, GC and CpG content |
| Eponine | RVM using mixture of Gaussian distributions of position weight matrices |
| ProSOM | Self-organising map trained on base stacking energy |
| EP3 | Base stacking energy |
| Methylation | Experimentally determined CpG methylation profiles |
| Conservation | 17-way vertebrate conservation scores |

  1. All except methylation and conservation are outputs from prediction programs. ARTS scores were split into + and - strands. Methylation scores were split into stem cell and differentiated categories.
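
The splits described in the footnote can be illustrated with a minimal sketch that expands the nine features of Table 1 into individual score columns. The column suffixes (`plus_strand`, `minus_strand`, `stem_cell`, `differentiated`) and the helper `expand_columns` are hypothetical names chosen for illustration, and the sketch assumes each split yields exactly two columns; it is not the authors' code.

```python
# Feature names as listed in Table 1.
BASE_FEATURES = [
    "Profisi", "ARTS", "N-SCAN", "FirstEF", "Eponine",
    "ProSOM", "EP3", "Methylation", "Conservation",
]

# Splits described in the table footnote.
# The suffix labels are hypothetical, chosen only for this sketch.
SPLITS = {
    "ARTS": ["plus_strand", "minus_strand"],
    "Methylation": ["stem_cell", "differentiated"],
}


def expand_columns(features, splits):
    """Replace each split feature by one column per sub-category."""
    columns = []
    for name in features:
        suffixes = splits.get(name)
        if suffixes is None:
            columns.append(name)
        else:
            columns.extend(f"{name}_{suffix}" for suffix in suffixes)
    return columns


COLUMNS = expand_columns(BASE_FEATURES, SPLITS)

# Per the footnote, everything except methylation and conservation
# is the output of a prediction program.
PREDICTOR_OUTPUTS = [
    c for c in COLUMNS
    if not c.startswith(("Methylation", "Conservation"))
]
```

Under these assumptions the nine features expand to eleven columns, eight of which come from prediction programs.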