
Table 1 DNA motif rules extracted from the RF classifier trained on the TCF7L2 dataset

From: Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest

| Rule | Prediction | Reference |
| --- | --- | --- |
| NFE2 ≥ 3 | HCT-116 (colon cancer cells) | [56, 57] |
| GSX1 ≤ 35 & NFE2 ≥ 2 | HCT-116 (colon cancer cells) | NFE2 [56, 57] |
| BACH1 ≥ 2 & FOXO3 ≤ 2 & HOMEZ ≤ 5 | HCT-116 (colon cancer cells) | BACH1 [55] |
| BACH1 ≤ 1 & E2F ≥ 2 & HOXB13 ≥ 3 | HEK293 (embryonic kidney cells) | E2F [63], HOXB13 [64] |
| CDX2 ≥ 5 & GATA ≥ 2 & MAF = 0 | HEK293 (embryonic kidney cells) | GATA [65, 66] |
| BACH1 ≤ 1 & HNF4 ≤ 6 & HOXA13 ≥ 3 & OTX1 ≥ 9 | HEK293 (embryonic kidney cells) | HOXB13 [64] |
| JDP2 ≥ 2 & SOX21 ≥ 37 | HeLa-S3 (cervical carcinoma cells) | JDP2 [54] |
| HNF4 ≥ 2 | HepG2 (liver cancer cells) | [50] |
| HOXB13 ≤ 3 & JDP2 ≤ 1 & TCF7L2 ≥ 5 | HepG2 (liver cancer cells) | TCF7L2 [67] |
| HNF4 ≥ 1 & HOXC10 ≤ 16 & SOX9 ≥ 5 | HepG2 (liver cancer cells) | HNF4 [50], SOX9 [68] |
| GRHL1 ≥ 4 | MCF-7 (mammary gland adenocarcinoma cells) | GRHL1 [58] |
| No rule identified | PANC-1 (pancreatic cancer cells) | |
 
  1. The numbers in the rules are the motif frequencies detected in the ±120 bp regions around the peak centre
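
Each rule in Table 1 is a conjunction of thresholds on motif frequencies, so checking a sequence against the rules can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the data layout, the name `matching_predictions`, and the choice to treat a missing motif as frequency 0 are assumptions, and only a subset of the rules is transcribed. The table does not say how conflicting rules are resolved, so the sketch simply returns every prediction whose rule matches.

```python
import operator

# Comparison operators that appear in the rules of Table 1.
OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq}

# A subset of the rules from Table 1, written as (conditions, predicted cell type).
# Each condition is (motif name, operator, threshold on motif frequency).
RULES = [
    ([("NFE2", ">=", 3)], "HCT-116"),
    ([("BACH1", "<=", 1), ("E2F", ">=", 2), ("HOXB13", ">=", 3)], "HEK293"),
    ([("CDX2", ">=", 5), ("GATA", ">=", 2), ("MAF", "=", 0)], "HEK293"),
    ([("HNF4", ">=", 2)], "HepG2"),
    ([("GRHL1", ">=", 4)], "MCF-7"),
]


def matching_predictions(motif_counts, rules=RULES):
    """Return the prediction of every rule whose conditions all hold.

    motif_counts maps motif name -> frequency in the region around the peak centre.
    Motifs absent from motif_counts are treated as frequency 0 (an assumption).
    """
    hits = []
    for conditions, prediction in rules:
        if all(
            OPS[op](motif_counts.get(motif, 0), threshold)
            for motif, op, threshold in conditions
        ):
            hits.append(prediction)
    return hits


if __name__ == "__main__":
    # Hypothetical motif counts for one peak region.
    print(matching_predictions({"NFE2": 4, "HNF4": 1}))
```

With the hypothetical counts above, only the `NFE2 >= 3` rule is satisfied, so the region would be assigned to HCT-116 under this sketch.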