Table 1. Details of each cell line dataset. The enhancers (or promoters) column gives the number of all known active enhancers (or promoters) for each cell line, which are used for unsupervised feature learning on enhancer (or promoter) sequences.

From: Prediction of enhancer-promoter interactions via natural language processing

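| Dataset | Enhancers | Promoters | True EPIs | False EPIs |
|---------|-----------|-----------|-----------|------------|
| K562    | 82806     | 8196      | 1977      | 1975       |
| IMR90   | 108996    | 5253      | 1254      | 1250       |
| GM12878 | 100036    | 8453      | 2113      | 2110       |
| HUVEC   | 65358     | 8180      | 1524      | 1520       |
| HeLa-S3 | 103460    | 7794      | 1740      | 1740       |
| NHEK    | 144302    | 5254      | 1291      | 1280       |
| FANTOM  | 43011     | 49620     | 61542     | 61542      |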
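
As a quick sanity check on the counts in Table 1, the following minimal sketch computes the total number of labeled enhancer-promoter pairs and the fraction of true EPIs for each dataset. The names `TABLE_1` and `positive_fraction` are illustrative assumptions and do not come from the paper; the numbers are copied from the table as shown.

```python
# Rows copied from Table 1:
# (dataset, enhancers, promoters, true EPIs, false EPIs)
TABLE_1 = [
    ("K562", 82806, 8196, 1977, 1975),
    ("IMR90", 108996, 5253, 1254, 1250),
    ("GM12878", 100036, 8453, 2113, 2110),
    ("HUVEC", 65358, 8180, 1524, 1520),
    ("HeLa-S3", 103460, 7794, 1740, 1740),
    ("NHEK", 144302, 5254, 1291, 1280),
    ("FANTOM", 43011, 49620, 61542, 61542),
]


def positive_fraction(true_epis: int, false_epis: int) -> float:
    """Share of labeled pairs that are true EPIs (hypothetical helper)."""
    return true_epis / (true_epis + false_epis)


if __name__ == "__main__":
    for name, _enhancers, _promoters, pos, neg in TABLE_1:
        frac = positive_fraction(pos, neg)
        print(f"{name:8s} pairs={pos + neg:6d} positive fraction={frac:.4f}")
```

By these counts, the true and false EPI sets are nearly the same size in every dataset, so the positive fraction stays close to 0.5 throughout.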