Table 1 DNA motif rules extracted from the RF classifier trained on the TCF7L2 dataset

From: Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest

| Rule | Prediction | Reference |
| --- | --- | --- |
| NFE2 >= 3 | HCT-116 (colon cancer cells) | [56, 57] |
| GSX1 <= 35 & NFE2 >= 2 | HCT-116 (colon cancer cells) | NFE2 [56, 57] |
| BACH1 >= 2 & FOXO3 <= 2 & HOMEZ <= 5 | HCT-116 (colon cancer cells) | BACH1 [55] |
| BACH1 <= 1 & E2F >= 2 & HOXB13 >= 3 | HEK293 (embryonic kidney cells) | E2F [63], HOXB13 [64] |
| CDX2 >= 5 & GATA >= 2 & MAF = 0 | HEK293 (embryonic kidney cells) | GATA [65, 66] |
| BACH1 <= 1 & HNF4 <= 6 & HOXA13 >= 3 & OTX1 >= 9 | HEK293 (embryonic kidney cells) | HOXB13 [64] |
| JDP2 >= 2 & SOX21 >= 37 | HeLa-S3 (cervical carcinoma cells) | JDP2 [54] |
| HNF4 >= 2 | HepG2 (liver cancer cells) | [50] |
| HOXB13 <= 3 & JDP2 <= 1 & TCF7L2 >= 5 | HepG2 (liver cancer cells) | TCF7L2 [67] |
| HNF4 >= 1 & HOXC10 <= 16 & SOX9 >= 5 | HepG2 (liver cancer cells) | HNF4 [50], SOX9 [68] |
| GRHL1 >= 4 | MCF-7 (mammary gland adenocarcinoma cells) | GRHL1 [58] |
| No rule identified | PANC-1 (pancreatic cancer cells) | |
  1. The numbers in the rules are the motif frequencies detected within ±120 bp of the peak centre
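
Each rule in the table is a conjunction of thresholds on per-motif frequencies, so it can be checked mechanically against the motif counts of a single region. The following is a minimal Python sketch of that check, not the authors' implementation: the function names `parse_rule` and `rule_matches` are hypothetical, the example counts are invented for illustration, and treating a motif that is absent from the counts as frequency 0 is an assumption.

```python
import operator
import re

# Comparison operators that appear in the table's rules.
OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq}

# One condition looks like "BACH1 >= 2": motif name, operator, integer threshold.
# A stray space inside the operator ("> =") is tolerated.
CONDITION = re.compile(r"^\s*(\w+)\s*(>\s*=|<\s*=|=)\s*(\d+)\s*$")


def parse_rule(rule):
    """Split a rule such as 'BACH1 >= 2 & FOXO3 <= 2' into (motif, op, threshold) triples."""
    conditions = []
    for part in rule.split("&"):
        match = CONDITION.match(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        motif, op, threshold = match.groups()
        op = re.sub(r"\s+", "", op)  # normalise "> =" to ">="
        conditions.append((motif, op, int(threshold)))
    return conditions


def rule_matches(rule, counts):
    """Return True if every condition holds.

    Assumption: a motif missing from `counts` has frequency 0.
    """
    return all(
        OPS[op](counts.get(motif, 0), threshold)
        for motif, op, threshold in parse_rule(rule)
    )


# Hypothetical motif counts for one region (illustrative, not data from the paper).
example_counts = {"BACH1": 3, "FOXO3": 1, "HOMEZ": 4}

# Third HCT-116 rule from the table.
hct116_rule = "BACH1 >= 2 & FOXO3 <= 2 & HOMEZ <= 5"
print(rule_matches(hct116_rule, example_counts))
```

With these invented counts, all three conditions of the rule hold, so the region would match the rule listed for HCT-116. A region can satisfy more than one rule, and how such cases are resolved is not specified by the table.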