Detection of differentially methylated regions from bisulfite-seq data by hidden Markov models incorporating genome-wide methylation level distributions
BMC Genomics volume 16, Article number: S3 (2015)
Abstract
Background
Detection of differential methylation between biological samples is an important task in bisulfite-seq data analysis. Several studies have attempted de novo finding of differentially methylated regions (DMRs) using hidden Markov models (HMMs). However, there is room for improvement in the design of HMMs, especially in the emission functions that evaluate the likelihood of differential methylation at each cytosine site.
Results
We describe a new HMM for DMR detection from bisulfite-seq data. Our method utilizes emission functions that combine binomial models for aligned read counts with beta mixtures for incorporating genome-wide methylation level distributions. We also develop unsupervised learning algorithms to adjust parameters of the beta-binomial models depending on differential methylation types (up, down, and not changed). In experiments on both simulated and real datasets, the new HMM improves DMR detection accuracy compared with HMMs in our previous study. Furthermore, our method achieves better accuracy than other methods using Fisher's exact test and methylation level smoothing.
Conclusions
Our method enables accurate DMR detection from bisulfite-seq data. The implementation of our method is named ComMet and is distributed as part of the Bisulfighter package, which is available at http://epigenome.cbrc.jp/bisulfighter.
Background
Cytosine methylation is an epigenetic modification that affects many biological processes including normal development and pathogenesis [1]. Genome-wide profiling of cytosine methylation is enabled by bisulfite-seq, where unmethylated cytosines are converted and sequenced as thymines [2]. In bisulfite-seq data analysis, a fundamental task is alignment of bisulfite-converted reads to a reference genome, and numerous tools have already been developed for it [3–6]. In contrast, methods for downstream tasks after read alignment have been relatively limited [7]. Among them, one of the most important is detection of differential methylation between biological samples [8]. Differential methylation analyses can be divided into two categories: those focusing only on prespecified regions such as known transcription factor binding sites (e.g. [9]), and those for de novo finding of differentially methylated regions (DMRs) as novel candidates of regulatory elements (e.g. [10]). In this paper, we address the latter case, which is more challenging because exact boundaries of DMRs must be determined.
DMR detection has been attempted by two-step procedures: first, differentially methylated cytosines (DMCs) are detected by comparison of alignment results between samples; then, DMCs at neighboring positions are grouped into contiguous DMRs by certain distance criteria. Most studies have focused mainly on the first step, and proposed to detect DMCs using Fisher's exact test [10], Student's t-test with methylation level smoothing [11], and logistic regression test [12]. Additionally, many methods have been developed for detecting DMCs based on a variety of probability models [13–15]. In contrast, there have been far fewer studies on methods for grouping DMCs into DMRs. Although fixed-length distance criteria (e.g. sliding windows) have been conventionally used, such strategies depend on the choice of distance parameters (e.g. window sizes). Unfortunately, it is difficult to adjust distance parameters empirically because DMR lengths range from hundreds of base pairs as in CpG islands, to millions of base pairs as in cancer aberrations [16].
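The sensitivity of the grouping step to its distance parameter can be illustrated with a minimal sketch. The function name `group_dmcs`, the parameter `max_gap`, and the toy positions are hypothetical and chosen only for illustration; this is a generic fixed-distance merge, not the procedure of any specific tool cited above.

```python
def group_dmcs(positions, max_gap):
    """Merge DMC positions into (start, end) regions.

    Two consecutive DMCs fall in the same region when they are at most
    max_gap base pairs apart (a fixed-length distance criterion).
    """
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][1] <= max_gap:
            regions[-1][1] = pos          # extend the current region
        else:
            regions.append([pos, pos])    # start a new region
    return [tuple(r) for r in regions]


# Hypothetical DMC coordinates on one chromosome.
dmcs = [100, 150, 180, 1000, 1040, 5000]
print(group_dmcs(dmcs, max_gap=100))
print(group_dmcs(dmcs, max_gap=1000))
```

With the same DMCs, the two gap settings yield different region boundaries, which is the dependence on distance parameters discussed above.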
To address this problem, we have recently proposed a framework for DMR detection based on hidden Markov models (HMMs) [6]. Unlike the two-step procedures, HMMs can integrate detection and grouping of DMCs as joint probability models using emission and transition functions, respectively. Moreover, HMMs enable us to adjust their parameters by well-established learning algorithms so that they incorporate useful information for DMR detection. In particular, we have observed that DMCs exhibit distance distributions distinct from cytosines whose methylation is not changed. Therefore, we have adjusted parameters of transition functions so that they fit these distance distributions. Thanks to this strategy, our method has improved DMR detection accuracy, especially in determining exact boundaries of DMRs. We note that HMM-based DMR detection has also been employed for methylation data other than bisulfite-seq, such as Infinium BeadChip [17] and MBDCap-seq [18].
While our previous study has shown the effectiveness of transition functions in HMM-based DMR detection, there is still room for improvement in the design of emission functions. As mentioned above, many studies have proposed various probability models for detecting DMCs [13–15]. An important suggestion from these studies is that DMC detection at individual cytosine sites can be improved by considering probability distributions of methylation levels collected from all genomic cytosine sites. This implies that the information of genome-wide methylation level distributions may also be useful for DMR detection. However, the probability models in [13–15] are specifically developed for DMC detection, and thus cannot be directly applied to the emission functions for HMM-based DMR detection.
In this paper, we describe new emission functions for HMM-based DMR detection from bisulfite-seq data. We show that the emission functions in our previous study [6] have an empirical parameter to represent methylation levels used in binomial models for aligned read counts. From this viewpoint, we propose new emission functions that replace the empirical parameter by beta mixtures for incorporating genome-wide methylation level distributions. We also develop unsupervised learning algorithms to adjust parameters of the beta-binomial models depending on differential methylation types (up, down, and not changed). In experiments on both simulated and real datasets, the new emission functions improve DMR detection accuracy compared with the old ones. Furthermore, our HMM-based method achieves better accuracy than other methods using Fisher's exact test and methylation level smoothing.
Methods
In this section, we describe a new method for DMR detection from bisulfite-seq data. The method uses new emission functions with an HMM-based framework called ComMet, which we developed in our previous study [6]. We first review ComMet, and show that emission functions in our previous study have an empirical parameter to represent methylation levels used in binomial models for aligned read counts. Then, we design new emission functions replacing this empirical parameter by beta mixtures for incorporating genome-wide methylation level distributions. We also present unsupervised learning algorithms to adjust parameters of the beta-binomial models depending on differential methylation types (up, down, and not changed).
HMM-based DMR detection from bisulfite-seq data
In our previous study [6], we developed ComMet, an HMM-based framework for DMR detection from bisulfite-seq data (Figure 1). The motivation for employing HMMs came from our observation of real data where DMCs showed distance distributions distinct from CpGs whose methylation was not changed. We incorporated these distributions into transition functions of HMMs. ComMet uses the state transition diagram shown in Figure 1a where transition probabilities among Up, Down, and NoCh states represent distinct distance distributions among DMCs. ComMet adjusts transition probabilities for each dataset to be analyzed using expectation-maximization algorithms. ComMet detects DMRs by dynamic programming algorithms that maximize log-likelihood ratio scores $\log \frac{P(\text{region},\, dir)}{P(\text{region},\, \text{NoCh})}$, where $dir$ (= Up or Down) is the direction of differential methylation. The output of ComMet is a list of DMRs ranked by their log-likelihood ratio scores.
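To make the role of emission and transition functions concrete, the following sketch decodes a state path over Up, Down, and NoCh with the standard Viterbi algorithm. It is a simplified stand-in and not ComMet's algorithm: it omits the gap states and the second unit of Figure 1a, and it returns the single most probable path rather than regions ranked by log-likelihood ratio. All probabilities and helper names (`viterbi`, `runs`, `site`) are hypothetical.

```python
import math

STATES = ("Up", "Down", "NoCh")


def viterbi(log_emit, log_trans, log_init):
    """Most probable state path; log_emit is a list of {state: log-prob} per CpG site."""
    score = {s: log_init[s] + log_emit[0][s] for s in STATES}
    back = []
    for e in log_emit[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + e[s]
            ptr[s] = best
        back.append(ptr)
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):          # trace back from the last site
        state = ptr[state]
        path.append(state)
    return path[::-1]


def runs(path):
    """Collapse a path into (state, first_site, last_site) runs, skipping NoCh."""
    out, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "NoCh":
                out.append((path[start], start, i - 1))
            start = i
    return out


def site(favored):
    """Toy per-site emissions: 0.9 for the favored state, 0.05 for the others."""
    return {s: math.log(0.9 if s == favored else 0.05) for s in STATES}


log_emit = [site(s) for s in ["NoCh", "NoCh", "Up", "Up", "Up", "NoCh", "NoCh"]]
log_trans = {r: {s: math.log(0.9 if r == s else 0.05) for s in STATES} for r in STATES}
log_init = {s: math.log(1 / 3) for s in STATES}
print(runs(viterbi(log_emit, log_trans, log_init)))
```

Because the transition probabilities favor staying in the same state, neighboring sites with consistent evidence are joined into one region without any explicit window size.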
While transition functions of ComMet incorporated distance distributions of DMCs, the design of emission functions was not well established in our previous study. Given alignment results of bisulfite-converted reads, we can observe the counts of reads supporting CpG methylation as the number of C-C matches, and the counts of reads not supporting CpG methylation as the number of C-T mismatches (Figure 1b). Let us denote the count of reads supporting methylation by $m_{si}$, the count of reads not supporting methylation by $u_{si}$, and the total count of aligned reads by $n_{si}$, for each CpG site $i$ and each sample $s = 1, 2$. If a CpG site is differentially methylated, the counts can be considered to be drawn from separate probability distributions reflecting the difference of methylation levels between the two samples. On the other hand, if a CpG site is not differentially methylated, the counts should be the consequence of a common methylation level. Therefore, in our previous study, emission functions for CpG states were designed as follows:
where U, D, and N represent Up, Down, and NoCh states, respectively, Binom() is a binomial distribution, and $\theta_{i}^{\cdot}$ is the occurrence probability of reads supporting CpG methylation at the $i$th CpG site for each differential methylation state. (Note that we use common emission functions between the basic and second units in Figure 1a, and no emission function for gap states.) The problem here is how to model $\theta_{i}^{\cdot}$ depending on differential methylation states. One may consider using $\theta_{1i}^{\mathrm{U}} = \theta_{1i}^{\mathrm{D}} = m_{1i}/n_{1i}$, $\theta_{2i}^{\mathrm{U}} = \theta_{2i}^{\mathrm{D}} = m_{2i}/n_{2i}$, and $\theta_{0i}^{\mathrm{N}} = (m_{1i} + m_{2i})/(n_{1i} + n_{2i})$. However, this cannot discriminate the direction of differential methylation because $\theta_{i}^{\mathrm{U}} = \theta_{i}^{\mathrm{D}}$, and thus is not a suitable choice. In our previous study, we resorted to introducing an empirical parameter pseudo, resulting in
We note that pseudo can be regarded as a pseudocount added to actual read counts, playing a role to represent state-dependent methylation levels. For example, if a CpG site has the differential methylation state of Up, we expect that the methylation level is high in sample 1 and low in sample 2 (Figure 1b). Accordingly, pseudo is added to $m_{1i}$ (supporting CpG methylation in sample 1) and to $u_{2i}$ (not supporting CpG methylation in sample 2).
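The pseudocount-based emission functions described above can be sketched as follows. The Up case follows the text directly (pseudo added to $m_{1i}$ and $u_{2i}$); the Down case is assumed here to be its mirror image, and the NoCh case is assumed to use the pooled level $(m_{1i} + m_{2i})/(n_{1i} + n_{2i})$ without a pseudocount. The function names and the clipping of $\theta$ away from 0 and 1 are choices made for this illustration.

```python
import math


def log_binom(m, n, theta):
    """log Binom(m | n, theta); theta is clipped to avoid log(0) (illustrative choice)."""
    theta = min(max(theta, 1e-9), 1 - 1e-9)
    log_choose = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    return log_choose + m * math.log(theta) + (n - m) * math.log(1 - theta)


def old_log_emissions(m1, n1, m2, n2, pseudo):
    """Log emission values for Up, Down, and NoCh at one CpG site.

    Up:   pseudo added to m1 and to u2 = n2 - m2 (as described in the text).
    Down: assumed mirror image, pseudo added to u1 and to m2.
    NoCh: assumed common level pooled over both samples (requires n1 + n2 > 0).
    """
    theta_up = ((m1 + pseudo) / (n1 + pseudo), m2 / (n2 + pseudo))
    theta_down = (m1 / (n1 + pseudo), (m2 + pseudo) / (n2 + pseudo))
    theta_noch = (m1 + m2) / (n1 + n2)
    return {
        "Up": log_binom(m1, n1, theta_up[0]) + log_binom(m2, n2, theta_up[1]),
        "Down": log_binom(m1, n1, theta_down[0]) + log_binom(m2, n2, theta_down[1]),
        "NoCh": log_binom(m1, n1, theta_noch) + log_binom(m2, n2, theta_noch),
    }


# Hypothetical site: 18 of 20 reads methylated in sample 1, 2 of 20 in sample 2.
scores = old_log_emissions(18, 20, 2, 20, pseudo=1.0)
print(max(scores, key=scores.get), scores)
```

In this form the pseudocount is the only term that separates Up from Down, so its effect shrinks as read counts grow.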
Although the empirical parameter partially solved the problem of designing emission functions, our previous study did not address how to adjust it. The optimal value of pseudo depends on the magnitude of read counts $m$ and $n$ (i.e. sequencing depth). Moreover, it may also depend on underlying biological processes that differ between samples, such as normal development and pathogenesis. In fact, as will be shown in the "Results and discussion" section, ComMet with the above emission functions may result in poor accuracy of DMR detection depending on the value of pseudo.
New emission functions and learning algorithms
To design new emission functions for ComMet, we recall that the empirical parameter in our previous study had a role to represent state-dependent methylation levels. This viewpoint leads to the idea that the empirical parameter can be replaced by utilizing genome-wide methylation level distributions observed from real data. To formulate this intuition, we propose new emission functions in the following form:
Note that the new emission functions use probability distributions of state-dependent methylation levels $p(\theta \mid \cdot)$. This is in contrast to the emission functions in our previous study, which use fixed values $\theta_{i}^{\cdot}$.
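The general form described here can be read as an expectation of the binomial likelihoods over the state-dependent distribution of methylation levels. The following is a sketch under assumed notation: the left-hand side $e_X(i)$ and the integration variables $\theta_1, \theta_2$ are labels chosen for illustration and need not match the numbered equations of the paper.

```latex
e_X(i) = \int_0^1 \!\! \int_0^1
  \mathrm{Binom}(m_{1i} \mid n_{1i}, \theta_1)\,
  \mathrm{Binom}(m_{2i} \mid n_{2i}, \theta_2)\,
  p(\theta_1, \theta_2 \mid X)\, d\theta_1\, d\theta_2,
\qquad X \in \{\mathrm{U}, \mathrm{D}, \mathrm{N}\}
```

Under this reading, choosing $p(\theta_1, \theta_2 \mid X)$ as a point mass at fixed values recovers emission functions based on fixed $\theta_{i}^{\cdot}$.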
Next, we present unsupervised learning algorithms for estimating these distributions for each dataset to be analyzed. Figure 2 shows the overview of the algorithms. As shown in Figure 2a, we exploit the fact that methylation levels $m/n$ collected from all genomic CpG sites form a distribution with two modes of high and low methylation. Such bimodal methylation level distributions are a common feature observed in many real datasets, and have also been reported by other researchers (e.g. [9, 10, 16, 19]). Moreover, recent studies have suggested that detection of DMCs at individual cytosine sites can be improved by considering genome-wide methylation level distributions [13–15]. We propose to utilize this information for HMM-based DMR detection. We model genome-wide methylation level distributions by using beta mixtures as follows:
where H and L represent two modes of high and low methylation, each of which is modeled by a beta distribution, and unif represents a background methylation level modeled by a uniform distribution (Figure 2b). Using these component distributions, we represent probability distributions of state-dependent methylation levels as follows:
where I() is an indicator function that takes 1 or 0 depending on whether the condition is true or false. This corresponds to representing differential methylation states as alterations between two modes. For example, the Up state $p(\theta_{1}, \theta_{2} \mid \mathrm{U})$ is represented as high methylation in sample 1, $p_{\mathrm{H}}(\theta_{1})$, and low methylation in sample 2, $p_{\mathrm{L}}(\theta_{2})$. By substituting equations 4-6 into equations 1-3, the new emission functions are finally written as
where B() is a beta function, and $\alpha_{\mathrm{unif}1} = \beta_{\mathrm{unif}1} = \alpha_{\mathrm{unif}2} = \beta_{\mathrm{unif}2} = 1$ by definition of the uniform distribution.
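The beta function arises because integrating a binomial likelihood against a beta density has the closed form $\binom{n}{m} B(m + \alpha, n - m + \beta) / B(\alpha, \beta)$, the beta-binomial probability. The sketch below evaluates this for the Up state as described in the text (high mode in sample 1, low mode in sample 2) and, by assumed symmetry, for the Down state. The NoCh state is left out because its form depends on the indicator-based definition. Parameter values and the dictionary layout `par` are hypothetical.

```python
import math


def log_beta(a, b):
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_betabinom(m, n, alpha, beta):
    """log of C(n, m) * B(m + alpha, n - m + beta) / B(alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    return log_choose + log_beta(m + alpha, n - m + beta) - log_beta(alpha, beta)


def log_emit_up(m1, n1, m2, n2, par):
    """Up: high-methylation component in sample 1, low-methylation in sample 2."""
    return log_betabinom(m1, n1, *par["H1"]) + log_betabinom(m2, n2, *par["L2"])


def log_emit_down(m1, n1, m2, n2, par):
    """Down: assumed mirror image of Up (low in sample 1, high in sample 2)."""
    return log_betabinom(m1, n1, *par["L1"]) + log_betabinom(m2, n2, *par["H2"])


# Hypothetical (alpha, beta) pairs; beta_H = alpha_L = 1 as in the stated constraints.
par = {"H1": (5.0, 1.0), "L1": (1.0, 5.0), "H2": (5.0, 1.0), "L2": (1.0, 5.0)}
print(log_emit_up(18, 20, 2, 20, par), log_emit_down(18, 20, 2, 20, par))
```

Unlike a fixed pseudocount, the learned shape parameters set how strongly the read counts at a site are pulled toward each mode.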
The estimation of the parameters $w_{\cdot}$, $\alpha_{\cdot}$, and $\beta_{\cdot}$ involves several technical issues. First, we perform maximum likelihood estimation that maximizes the likelihood of read counts $m$ and $n$, rather than methylation levels $m/n$. Read counts preserve the information of sequencing depth (i.e. the magnitude of read counts), which is cancelled in methylation levels. Therefore, this approach incorporates the information of sequencing depth into parameter values, and thereby overcomes the drawback of the previous emission functions where the optimal value of pseudo depends on sequencing depth. The estimation problem is regarded as maximum likelihood estimation for beta-binomial mixtures, and thus can be solved as a simpler case of well-established maximum likelihood estimation for Dirichlet-multinomial mixtures described in [20]. Second, we can reduce the computational cost by only using read counts from a small number of randomly selected CpG sites. As shown in Figure 2, the histograms depicted using all genomic CpG sites are well fitted by the probability distributions estimated from 10000 CpG sites. Thus, we use 10000 CpG sites also for other datasets throughout this study. Third, we need to restrict the ranges of parameter values so that the integral in equation 3 is tractable, and each beta component distribution corresponds to exactly one mode of methylation levels. Accordingly, parameter estimation is performed under the constraints of $\alpha_{\cdot}, \beta_{\cdot} \ge 1$ and $\beta_{\mathrm{H}1} = \beta_{\mathrm{H}2} = \alpha_{\mathrm{L}1} = \alpha_{\mathrm{L}2} = 1$.
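Two ingredients of this estimation can be sketched for a single sample: random subsampling of CpG sites, and the posterior component memberships (responsibilities) of a site under a beta-binomial mixture with the constraints $\beta_{\mathrm{H}} = \alpha_{\mathrm{L}} = 1$ and a uniform component. Responsibilities are the quantity an expectation-maximization style fit would typically compute in each iteration. This is an assumption about one possible implementation, not the procedure of [20]; the update of the shape parameters is not shown, and all names and values are hypothetical.

```python
import math
import random


def log_betabinom(m, n, alpha, beta):
    """log beta-binomial probability of m methylated reads out of n."""
    def log_beta(a, b):
        return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_choose = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    return log_choose + log_beta(m + alpha, n - m + beta) - log_beta(alpha, beta)


def responsibilities(m, n, weights, alpha_h, beta_l):
    """Posterior probability of components H, L, and unif for one CpG site.

    Shapes follow the constraints in the text: H = (alpha_h, 1), L = (1, beta_l),
    unif = (1, 1). Computed in log space for numerical stability.
    """
    shapes = {"H": (alpha_h, 1.0), "L": (1.0, beta_l), "unif": (1.0, 1.0)}
    logs = {k: math.log(weights[k]) + log_betabinom(m, n, *shapes[k]) for k in shapes}
    top = max(logs.values())
    norm = top + math.log(sum(math.exp(v - top) for v in logs.values()))
    return {k: math.exp(v - norm) for k, v in logs.items()}


def subsample_sites(sites, k=10000, seed=0):
    """Randomly select at most k (m, n) pairs; the seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    return list(sites) if len(sites) <= k else rng.sample(list(sites), k)


weights = {"H": 1 / 3, "L": 1 / 3, "unif": 1 / 3}   # hypothetical mixture weights
print(responsibilities(20, 20, weights, alpha_h=5.0, beta_l=5.0))
```

Working with counts rather than ratios means a site with 20 of 20 methylated reads is assigned to the high mode more confidently than a site with 2 of 2.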
Results and discussion
To evaluate DMR detection accuracy, we conducted experiments on both simulated and real datasets. Unfortunately, there is no database of gold standards for benchmarking DMR detection (i.e. true biological DMRs). Therefore, we employ multilateral evaluation using a series of simulated and real datasets. The overall protocols are similar to those used in [6]. In experiments on simulated data, detected DMRs were evaluated for their overlap with simulated true DMRs. In experiments on real data, detected DMRs were evaluated for agreement with gene expression and DNase I hypersensitivity.
We compared DMR detection accuracy between ComMet using the new emission functions and that using the old ones. In addition, we compared the new ComMet with other methods using Fisher's exact test [10] and methylation level smoothing [11]. We used LAST [5] to align bisulfite-converted reads to reference sequences. The alignment results were used as the common input for each DMR detection method.
Experiments on simulated data
We simulated bisulfite-converted reads using DNemulator [5]. The human chromosome X (chrX) was used as a reference. Methylation levels were assigned for all CpG sites in chrX. Single-end reads of 87 bp were generated from random loci in chrX with cytosines converted to thymines according to their methylation levels. Quality values were attached to reads according to SRR094461 in the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra), which is bisulfite-seq data produced by the Illumina platform. These reads were treated as the dataset for sample 1. Next, 100 random regions were defined as DMRs for Up or Down, and methylation levels of all CpG sites in these regions were changed to the maximum or the minimum, respectively. Reads were again generated, and treated as the dataset for sample 2. To test the effects of sequencing depth, we varied the number of generated reads for each dataset from 1 to 50 million (M). We also varied the length of simulated DMRs by preparing four versions of datasets: 50 bp, 500 bp, 5 kbp, and 50 kbp.
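The conversion step of such a simulation can be illustrated with a toy function. This is not DNemulator: it handles only the forward strand, ignores sequencing errors and quality values, and assumes that cytosines outside CpG context are always unmethylated. The function name and the `meth_level` mapping (position of the C to methylation level) are hypothetical.

```python
import random


def bisulfite_convert(seq, meth_level, rng):
    """Return a read in which each C is kept or converted to T.

    A C followed by G (CpG context) stays C with probability meth_level[position];
    any other C is converted to T (assumed unmethylated).
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            is_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            keep = is_cpg and rng.random() < meth_level.get(i, 0.0)
            out.append("C" if keep else "T")
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)
# CpG at position 1 fully methylated, CpG at position 5 fully unmethylated.
print(bisulfite_convert("ACGTCCGA", {1: 1.0, 5: 0.0}, rng))
```

Setting all levels in a region to 1.0 or 0.0 for the second sample corresponds to the simulated Up and Down DMRs described above.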
We evaluated DMR detection accuracy using the rate of correct predictions in the top 100 DMRs detected by each method. A correct prediction was defined as a simulated true DMR reciprocally overlapped with a detected DMR in a certain proportion of their lengths. For example, a correct prediction with 50% reciprocal overlap was counted only if the length of the overlapping region was larger than half the length of the simulated true DMR, and half the length of the detected DMR. Similarly, we also defined correct predictions for 90% and 99% reciprocal overlaps.
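The reciprocal overlap criterion can be written compactly. In this sketch, regions are half-open intervals `(start, end)`, and the accuracy is the fraction of simulated true DMRs matched by at least one of the top-ranked detected DMRs; dividing by the number of true DMRs is an assumption that coincides with the description above when there are 100 true DMRs and 100 top predictions. The function names are hypothetical.

```python
def is_reciprocal_overlap(true_dmr, detected_dmr, fraction):
    """True if the overlap exceeds `fraction` of both region lengths."""
    (ts, te), (ds, de) = true_dmr, detected_dmr      # half-open [start, end)
    overlap = max(0, min(te, de) - max(ts, ds))
    return overlap > fraction * (te - ts) and overlap > fraction * (de - ds)


def top_k_accuracy(true_dmrs, detected_ranked, fraction, k=100):
    """Fraction of true DMRs matched by any of the k best-ranked detected DMRs."""
    top = detected_ranked[:k]
    hits = sum(
        any(is_reciprocal_overlap(t, d, fraction) for d in top) for t in true_dmrs
    )
    return hits / len(true_dmrs)


# Hypothetical coordinates: one detected DMR matches the first true DMR at 50%.
true_dmrs = [(0, 100), (1000, 1100)]
detected = [(40, 140), (5000, 5100)]
print(top_k_accuracy(true_dmrs, detected, fraction=0.5))
```

Raising `fraction` to 0.9 or 0.99 rewards methods that place both boundaries of a DMR accurately.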
Figures 3 and 4, and Figure S1 in Additional file 1 show the experimental results. ComMet using the new emission functions achieved better accuracy than that using the old ones with various values of the pseudo parameter (Figure 3a and Figure S1 in Additional file 1). It should be noted that, while the old emission functions attained comparable accuracy to the new ones when used with the optimal value of pseudo, it is difficult to find such optimal values in a practical situation where accuracy cannot be systematically evaluated. In fact, as explained in the "Methods" section, the optimal value of pseudo critically depends on sequencing depth, while the parameters in the new emission functions were successfully adjusted by our learning algorithms (Figure 3b and Figure S1 in Additional file 1). ComMet with the new emission functions also achieved better accuracy than Fisher's exact test and the smoothing method (Figure 4).
Experiments on real data
We conducted experiments that evaluate agreement between detected DMRs and changes in gene expression. Note that similar experiments have also been employed in previous studies [6, 11]. We collected data from [16], where human breast cancer and normal breast were measured by both RNA-seq and bisulfite-seq. We aligned RNA-seq reads to the human genome using TopHat [21]. Gene expression was measured by fragments per kilobase of transcript per million mapped reads (FPKM) using Cufflinks [22]. Differentially expressed genes (DEGs) were determined by the threshold of fivefold FPKM change. We evaluated agreement between DEGs and detected DMRs following the previous studies [6, 11]. We focused on DEGs whose ±5 kbp regions around transcription start sites (TSSs) contained detected DMRs. The numbers of DEGs were counted for the top 1000 and 3000 DMRs detected by each method. We used these counts as a measure of the agreement. For the baseline of accuracy, we calculated the expected number of DEGs when DMRs were randomly placed in the TSS windows (denoted by random guessing).
In addition, we evaluated agreement between detected DMRs and changes in DNase I hypersensitivity as conducted in [6]. We collected data from [19], where human foreskin fibroblasts and embryonic stem cells were measured by bisulfite-seq. For these cell types, we obtained DNase I hypersensitivity data from the ENCODE project (http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/byDataType/openchrom/jan2011/fdrPeaks/). The data for each cell type contain the set of 150-bp regions that show the local maxima of DNase I hypersensitivity with false discovery rate (FDR) less than 1%. We defined "differentially sensitive sites" (DSSs) as those 150-bp regions present in only one of the two cell types. The agreement between DSSs and detected DMRs was evaluated similarly to the experiment for DEGs. We focused on DSSs whose ±5 kbp regions around the midpoints contained detected DMRs. The numbers of DSSs were counted for the top 1000 and 3000 DMRs detected by each method. We used these counts as a measure of the agreement.
Figures 5 and 6 show the experimental results. The advantage of the new emission functions over the old ones was also validated on real data, achieving better agreement of detected DMRs with DEGs (Figure 5a) and DSSs (Figure 5b). We again emphasize that the performance of the old emission functions critically depends on the choice of pseudo values, while the optimal value is difficult to find empirically. In contrast, as shown in Figure 5c-d, the bimodal distributions of methylation levels were observed in real datasets, and our learning algorithms successfully fitted the beta mixtures. The new ComMet also achieved better accuracy than Fisher's exact test and the smoothing method for real datasets (Figure 6).
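The agreement measure for DEGs can be sketched as two small functions. Several details are assumptions made for illustration: FPKM values are floored at a small constant so that the fold change is defined when one value is zero, the fold threshold is inclusive, and a TSS window is considered to contain a DMR when the two intervals overlap. The same window test could be applied to DSS midpoints. All names and coordinates are hypothetical.

```python
def is_deg(fpkm_a, fpkm_b, fold=5.0, floor=1e-3):
    """True if expression differs by at least `fold` between two samples.

    `floor` is an illustrative guard against division by zero, not a value
    taken from the study.
    """
    a, b = max(fpkm_a, floor), max(fpkm_b, floor)
    return max(a, b) / min(a, b) >= fold


def count_degs_near_dmrs(deg_tss, dmrs, window=5000):
    """Count DEG TSS positions whose +/- window region overlaps any DMR."""
    def near(tss):
        return any(start <= tss + window and end >= tss - window for start, end in dmrs)
    return sum(near(tss) for tss in deg_tss)


# Hypothetical TSS coordinates of two DEGs and two detected DMRs on one chromosome.
deg_tss = [10000, 100000]
dmrs = [(14000, 14500), (50000, 51000)]
print(is_deg(10.0, 50.0), count_degs_near_dmrs(deg_tss, dmrs))
```

In practice the comparison would be made per chromosome and restricted to the top 1000 or 3000 DMRs of each method.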
Conclusions
In this paper, we described new emission functions for HMM-based DMR detection from bisulfite-seq data. We proposed to incorporate the information of genome-wide methylation level distributions into emission functions, replacing the empirical parameter used in our previous study. ComMet with the new emission functions successfully improved DMR detection accuracy compared to the previous version. Recent studies suggest that detection of DMCs at individual cytosine sites can be improved by considering genome-wide methylation level distributions [13–15]. Our results show that such information is useful not only for detecting DMCs, but also for DMR detection. Furthermore, our HMM-based method achieves better accuracy than other methods using Fisher's exact test and methylation level smoothing. The implementation of ComMet is distributed as part of the Bisulfighter package, which is available at http://epigenome.cbrc.jp/bisulfighter.
Abbreviations
DMR: differentially methylated region
DMC: differentially methylated cytosine
HMM: hidden Markov model
DEG: differentially expressed gene
DSS: differentially sensitive site
References
1. Jones PA: Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 2012, 13 (7): 484-492.
2. Laird PW: Principles and challenges of genome-wide DNA methylation analysis. Nat. Rev. Genet. 2010, 11 (3): 191-203.
3. Xi Y, Li W: BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics. 2009, 10: 232.
4. Krueger F, Andrews SR: Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011, 27 (11): 1571-1572.
5. Frith MC, Mori R, Asai K: A mostly traditional approach improves alignment of bisulfite-converted DNA. Nucleic Acids Res. 2012, 40 (13): 100.
6. Saito Y, Tsuji J, Mituyama T: Bisulfighter: accurate detection of methylated cytosines and differentially methylated regions. Nucleic Acids Res. 2014, 42 (6): 45.
7. Bock C: Analysing and interpreting DNA methylation data. Nat. Rev. Genet. 2012, 13 (10): 705-719.
8. Robinson MD, Kahraman A, Law CW, Lindsay H, Nowicka M, Weber LM, Zhou X: Statistical methods for detecting differentially methylated loci and regions. Front Genet. 2014, 5: 324.
9. Takada H, Mituyama T, Wei Z, Yoshihara E, Jacinto S, Downes M, Evans RM: Methylome, transcriptome, and PPARγ cistrome analyses reveal two epigenetic transitions in fat cells. Epigenetics. 2014, 9 (9): 1195-1206.
10. Lister R, Pelizzola M, Kida YS, Hawkins RD, Nery JR, Hon G, Antosiewicz-Bourget J, O'Malley R, Castanon R, Klugman S, Downes M, Yu R, Stewart R, Ren B, Thomson JA, Evans RM, Ecker JR: Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature. 2011, 471 (7336): 68-73.
11. Hansen KD, Langmead B, Irizarry RA: BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biol. 2012, 13 (10): 83.
12. Akalin A, Kormaksson M, Li S, Garrett-Bakelman FE, Figueroa ME, Melnick A, Mason CE: methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 2012, 13 (10): 87.
13. Dolzhenko E, Smith AD: Using beta-binomial regression for high-precision differential methylation analysis in multifactor whole-genome bisulfite sequencing experiments. BMC Bioinformatics. 2014, 15 (1): 215.
14. Feng H, Conneely KN, Wu H: A Bayesian hierarchical model to detect differentially methylated loci from single nucleotide resolution sequencing data. Nucleic Acids Res. 2014, 42 (8): 69.
15. Raineri E, Dabad M, Heath S: A note on exact differences between beta distributions in genomic (Methylation) studies. PLoS ONE. 2014, 9 (5): 97349.
16. Hon GC, Hawkins RD, Caballero OL, Lo C, Lister R, Pelizzola M, Valsesia A, Ye Z, Kuan S, Edsall LE, Camargo AA, Stevenson BJ, Ecker JR, Bafna V, Strausberg RL, Simpson AJ, Ren B: Global DNA hypomethylation coupled to repressive chromatin domain formation and gene silencing in breast cancer. Genome Res. 2012, 22 (2): 246-258.
17. Morris TJ, Butcher LM, Feber A, Teschendorff AE, Chakravarthy AR, Wojdacz TK, Beck S: ChAMP: 450k Chip Analysis Methylation Pipeline. Bioinformatics. 2014, 30 (3): 428-430.
18. Mao Z, Ma C, Huang TH, Chen Y, Huang Y: BIMMER: a novel algorithm for detecting differential DNA methylation regions from MBDCap-seq data. BMC Bioinformatics. 2014, 15 (Suppl 12): 6.
19. Laurent L, Wong E, Li G, Huynh T, Tsirigos A, Ong CT, Low HM, Kin Sung KW, Rigoutsos I, Loring J, Wei CL: Dynamic changes in the human methylome during differentiation. Genome Res. 2010, 20 (3): 320-331.
20. Sjolander K, Karplus K, Brown M, Hughey R, Krogh A, Mian IS, Haussler D: Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput. Appl. Biosci. 1996, 12 (4): 327-345.
21. Trapnell C, Pachter L, Salzberg SL: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25 (9): 1105-1111.
22. Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L: Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 2011, 12 (3): 22.
Acknowledgements
This study was supported by Core Research for Evolutional Science and Technology (CREST) from Japan Science and Technology Agency (JST), New Energy and Industrial Technology Development Organization (NEDO), and Grant-in-Aid for Young Scientists (B) Grant Number 15K16089 from Japan Society for the Promotion of Science (JSPS).
Declarations
The publication charges for this article have been funded by Core Research for Evolutional Science and Technology (CREST) from Japan Science and Technology Agency (JST).
This article has been published as part of BMC Genomics Volume 16 Supplement 12, 2015: Joint 26th Genome Informatics Workshop and 14th International Conference on Bioinformatics: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S12.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
YS developed the method, wrote the program, performed the experiments, and drafted the manuscript. TM conceived of the study, and coordinated the project. All authors have read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Figure S1. Benchmark for DMR detection at varying sequencing depth. For each DMR length, accuracy evaluated with 50% reciprocal overlap is shown. Also shown are estimated values of parameters in new emission functions. (PDF 69 KB)
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Saito, Y., Mituyama, T. Detection of differentially methylated regions from bisulfite-seq data by hidden Markov models incorporating genome-wide methylation level distributions. BMC Genomics 16 (Suppl 12), S3 (2015). https://doi.org/10.1186/1471-2164-16-S12-S3
DOI: https://doi.org/10.1186/1471-2164-16-S12-S3