Fig. 7 | BMC Genomics

Fig. 7

From: Deep learning for DNase I hypersensitive sites identification

The graph shows that the difference in nucleotide ratios between DHSs and non-DHSs decreases as the complexity of the species decreases. The benchmark dataset has quite weak coverage of the sample space because of its small size. At the same time, the larger number of non-DHSs (the benchmark dataset has 280 DHSs and 737 non-DHSs) makes the model more likely to overfit to the non-DHSs. There is also little difference between the proportions of MNC and DNC in the DHSs and non-DHSs of Arabidopsis, which indicates that it is challenging to train a model on nucleotide site features using the benchmark dataset.
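
The nucleotide ratios compared in the caption can be illustrated with a minimal sketch. It assumes that MNC and DNC denote mononucleotide and dinucleotide composition (k-mer frequencies for k = 1 and k = 2), which the caption itself does not spell out. The function name `kmer_composition` and the example sequence are hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k):
    """Return the relative frequency of every A/C/G/T k-mer in seq.

    Assumption: k=1 corresponds to MNC and k=2 to DNC.
    Windows containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    # All possible k-mers over the DNA alphabet, in a fixed order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalise by the number of valid (A/C/G/T only) windows.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


# Hypothetical sequence, for illustration only.
example = "AACCGGTT"
mnc = kmer_composition(example, 1)  # 4 values
dnc = kmer_composition(example, 2)  # 16 values
```

Averaging such vectors separately over the DHS and non-DHS sequences of a species would give per-class ratios of the kind the figure compares. With 280 DHSs against 737 non-DHSs, the two averages rest on quite different sample sizes.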