Table 2 DNA motif rules extracted from the RF classifier trained on the MAX dataset

From: Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest

| Rule | Prediction | Reference |
| IRF >= 1 | GM12878 (lymphoblastoid cells) | [46, 47] |
| JDP2 >= 1 | HeLa-S3 (cervical carcinoma cells) | [54] |
| AP1 >= 9 & HESX1 >= 2 & LMO2 >= 2 | HeLa-S3 (cervical carcinoma cells) | AP1 [48, 49] |
| EMX1 <= 11 & ETS <= 12 & HNF4 >= 1 | HepG2 (liver cancer cells) | HNF4 [50], ETS [69] |
| HNF4 >= 1 & IRF4 <= 4 & RUNX2 <= 4 & TAL1 <= 4 | HepG2 (liver cancer cells) | HNF4 [50] |
| ALX3 <= 26 & EVX1 >= 6 & GATA >= 2 & LMO2 >= 2 | K562 (immortalised myelogenous leukaemia cells) | GATA [52, 53], LMO2 [51] |
| GATA >= 4 & HNF4 = 0 & POU4F3 <= 4 | K562 (immortalised myelogenous leukaemia cells) | GATA [52, 53] |
| No rule identified | A549 (adenocarcinomic alveolar basal epithelial cells) | |
  1. The numbers in the rules represent the motif frequency detected in the ±120 bp region around the peak centre.
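
Each rule in the table is a conjunction of threshold tests on motif frequencies, so checking a region against the rules can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. The rules are transcribed from the table, and rows without a listed prediction are assumed to share the prediction of the row above. The function names, the `counts` dictionary and the choice to treat a missing motif as frequency 0 are all assumptions made for this example.

```python
import operator

# Comparison operators that appear in the table's rules.
OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq}

# Rules transcribed from the table as (conditions, predicted cell type).
# Each condition is (motif, operator, threshold).
# Assumption: rows without a listed prediction inherit the one above.
RULES = [
    ([("IRF", ">=", 1)], "GM12878"),
    ([("JDP2", ">=", 1)], "HeLa-S3"),
    ([("AP1", ">=", 9), ("HESX1", ">=", 2), ("LMO2", ">=", 2)], "HeLa-S3"),
    ([("EMX1", "<=", 11), ("ETS", "<=", 12), ("HNF4", ">=", 1)], "HepG2"),
    ([("HNF4", ">=", 1), ("IRF4", "<=", 4), ("RUNX2", "<=", 4),
      ("TAL1", "<=", 4)], "HepG2"),
    ([("ALX3", "<=", 26), ("EVX1", ">=", 6), ("GATA", ">=", 2),
      ("LMO2", ">=", 2)], "K562"),
    ([("GATA", ">=", 4), ("HNF4", "=", 0), ("POU4F3", "<=", 4)], "K562"),
]


def rule_matches(conditions, counts):
    """Return True if every condition of one rule holds.

    Assumption: a motif absent from `counts` has frequency 0.
    """
    return all(
        OPS[op](counts.get(motif, 0), threshold)
        for motif, op, threshold in conditions
    )


def matching_predictions(counts):
    """Return the cell types whose rules match, in table order, deduplicated."""
    hits = [cell for conditions, cell in RULES if rule_matches(conditions, counts)]
    return list(dict.fromkeys(hits))


# Hypothetical motif frequencies for one +/- 120 bp region.
example_counts = {"GATA": 5, "POU4F3": 1}
print(matching_predictions(example_counts))
```

With the hypothetical frequencies above, only the last GATA rule is satisfied, so the sketch reports K562. A region can satisfy rules for more than one cell type, which is why the function returns a list rather than a single label. How the original classifier resolves such cases is not stated in the table.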