Table 1. Details of each cell line dataset. The enhancers (or promoters) column indicates the number of all known active enhancers (or promoters) for each cell line, which are used for unsupervised feature learning for enhancer (or promoter) sequences.

From: Prediction of enhancer-promoter interactions via natural language processing

Dataset    enhancers  promoters  true EPIs  false EPIs
K562       82806      8196       1977       1975
IMR90      108996     5253       1254       1250
GM12878    100036     8453       2113       2110
HUVEC      65358      8180       1524       1520
HeLa-S3    103460     7794       1740       1740
NHEK       144302     5254       1291       1280
FANTOM     43011      49620      61542      61542