Table 2 DNA motif rules extracted from the RF classifier trained on the MAX dataset

From: Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest

| Rule | Prediction | Reference |
| --- | --- | --- |
| IRF >= 1 | GM12878 (lymphoblastoid cells) | [46, 47] |
| JDP2 >= 1 | HeLa-S3 (cervical carcinoma cells) | [54] |
| AP1 >= 9 & HESX1 >= 2 & LMO2 >= 2 | HeLa-S3 (cervical carcinoma cells) | AP1 [48, 49] |
| EMX1 <= 11 & ETS <= 12 & HNF4 >= 1 | HepG2 (liver cancer cells) | HNF4 [50], ETS [69] |
| HNF4 >= 1 & IRF4 <= 4 & RUNX2 <= 4 & TAL1 <= 4 | HepG2 (liver cancer cells) | HNF4 [50] |
| ALX3 <= 26 & EVX1 >= 6 & GATA >= 2 & LMO2 >= 2 | K562 (immortalised myelogenous leukaemia cells) | GATA [52, 53], LMO2 [51] |
| GATA >= 4 & HNF4 = 0 & POU4F3 <= 4 | K562 (immortalised myelogenous leukaemia cells) | GATA [52, 53] |
| No rule identified | A549 (adenocarcinomic alveolar basal epithelial cells) | |

 
  1. The numbers in the rules are motif frequencies detected within ±120 bp of the peak centre
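
As a minimal sketch of how threshold rules of this kind could be checked against the motif counts of a single peak region, consider the Python below. It is an illustration only, not the authors' implementation: the function names `parse_rule` and `rule_matches`, the example counts, and the choice to treat a motif absent from the counts as having frequency 0 are all assumptions.

```python
import operator
import re

# Comparison operators that appear in the table's rules.
OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq}

# One condition looks like "GATA >= 4"; stray spaces inside the operator
# (e.g. "> =") are tolerated and removed before lookup.
CONDITION = re.compile(r"^\s*(\w+)\s*(>\s*=|<\s*=|=)\s*(\d+)\s*$")


def parse_rule(rule):
    """Split a rule such as 'A >= 1 & B <= 4' into (motif, op, threshold) tuples."""
    conditions = []
    for part in rule.split("&"):
        match = CONDITION.match(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        motif, op, threshold = match.groups()
        op = re.sub(r"\s+", "", op)
        conditions.append((motif, OPS[op], int(threshold)))
    return conditions


def rule_matches(rule, motif_counts):
    """Return True if every condition in the rule holds for the given counts.

    Assumption: a motif missing from motif_counts has frequency 0.
    """
    return all(
        op(motif_counts.get(motif, 0), threshold)
        for motif, op, threshold in parse_rule(rule)
    )


# Hypothetical motif frequencies for one region around a peak centre.
example_counts = {"GATA": 5, "HNF4": 0, "POU4F3": 2}
print(rule_matches("GATA >= 4 & HNF4 = 0 & POU4F3 <= 4", example_counts))
print(rule_matches("IRF >= 1", example_counts))
```

With the hypothetical counts above, the first call evaluates the K562 rule from the table and the second evaluates the GM12878 rule; a rule holds only when all of its `&`-joined conditions hold.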