Fig. 4 | BMC Genomics

Fig. 4

From: Identification and prediction of developmental enhancers in sea urchin embryos

Modeling of CRM reporter activity from ATAC-, Pol II ChIP- and PRO-seq. A Violin/box-plots of the sizes of the ATAC and Pol II ChIP peak calls, the dREG TRE predictions, and the 389 CRMs. The inset plots the size distributions of active and inactive CRMs, which are not significantly different. B and C Ranked CRM expression plots in 12 and 24 h embryos, respectively. The blue line at 1 marks the CRM expression level equal to that of the basal-promoter reporter. The red line by the curve “elbow” marks the 2-fold above control level chosen as the expression threshold. D Violin/box-plots of the PRO-, ATAC-, and Pol II ChIP-seq signals that differ significantly between active and inactive CRMs in 12 and 20 h embryos. E Top, 12 h embryo Receiver Operating Characteristic (ROC) curves and, bottom, Precision-Recall Curves (PRC) of the logistic regression models trained and tested by 5-fold cross-validation repeated 200 times. The Area Under the ROC (AUROC) and the area under the PRC (AUPRC) are indicated for each model. Dotted lines mark random-guess prediction performance: the diagonal for ROC and a horizontal line at the fraction of active CRMs for PRC. The absolute AUPRC is indicated in bold and the difference from random guess in parentheses. F ROCs and PRCs in 20 h embryos. G Top, PRCs evaluating the enhancer activity predictions for the promoter-overlapping CRM data set by models trained with the entire 20 h CRM data set. Bottom, model predictions for the complementary, non-promoter-overlapping data set. H Violin/box-plots of the AUPRC after cross-validation with different predictors, as indicated; All includes the sum and max of the 3 genomic profiles, allowing second-order interactions among predictors; dREG-max signifies the sum of the maximum values at dREG peaks
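
The random-guess baselines in panels E and F can be illustrated with a minimal sketch. It is not the authors' pipeline: the labels and scores below are made-up toy values, the function names are hypothetical, and the AUPRC is approximated by step-wise average precision, which may differ from the integration used for the figure. It shows only why the ROC baseline is 0.5 while the PRC baseline equals the fraction of active CRMs, and how a "difference from random guess" could be computed.

```python
# Toy illustration only: labels and scores are invented, not data from the paper.

def auroc(labels, scores):
    """Probability that a random active CRM outscores a random inactive one.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recovered active CRM
    return total / n_pos


def prc_baseline(labels):
    """Expected precision of a random classifier: the fraction of active CRMs."""
    return sum(labels) / len(labels)


# 1 = active CRM, 0 = inactive CRM (hypothetical); scores = model probabilities.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.6, 0.2, 0.1]

ap = average_precision(labels, scores)
base = prc_baseline(labels)
print(f"AUROC = {auroc(labels, scores):.3f} (random guess: 0.500)")
print(f"AUPRC = {ap:.3f} (difference from random guess: {ap - base:+.3f})")
```

Because the PRC baseline depends on class balance, absolute AUPRC values are comparable between data sets only after accounting for the fraction of active CRMs in each, which is presumably why the caption reports the difference from random guess alongside the absolute value.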
