Different HCS and RTA versions provide different CpG methylation levels
WGBS relies on bisulfite conversion of unmethylated C, but not 5mC, to uracil. Because 5mC normally occupies only a small proportion of Cs, bisulfite-treated DNA shows depletion of C, resulting in a low diversity sequence. PBAT-seq is designed to generate sequence reads complementary to the bisulfite-converted strand, and thus 5mC appears as G in the first read (R1) and as C in the second read (R2) (Additional file 1: Figure S1b) [4]. In the course of our WGBS study of early postnatal mouse spermatogonia [13], we realized that the global CpG methylation level determined by PBAT-seq of the same library significantly changed upon HCS and RTA updates of the HiSeq sequencer. Therefore, we set out to examine the generality of the problem and to explore the causes. Throughout this paper, global CpG methylation levels refer to weighted levels, which take sequencing depth into account [19].
We performed a series of single-end runs using three PBAT libraries (IMR-90 human fibroblasts, mouse epiblast-like cells [EpiLCs], and mouse spermatogonia). Each library was sequenced multiple times (replicates) and each replicate run was performed using a lane of flow cell on a HiSeq 1500 or HiSeq 2500 sequencers (Additional file 1: Figure S1c). We mapped single-end sequence reads trimmed to 96 bases on the human (hg19) or the mouse (mm10) reference genome and obtained 59–99 million uniquely mapped reads per lane (Additional file 1: Table S1). We confirmed that different combinations of HCS and RTA versions provided different global CpG methylation levels (up to approximately 5% difference) for the same libraries (Fig. 1a, b, Additional file 1: Figure S2a). Such differences were observed even when an identical HiSeq sequencer was used (Additional file 1: Figure S1d). The global CpG methylation difference of 5% was not negligible because similar differences were observed in several types of mouse cells during differentiation (Additional file 1: Figure S3).
For the three libraries, HCS v2.0.5 always provided the highest CpG methylation level, and HCS v2.0.12 the lowest (Fig. 1a). Compared with the other HCS versions, HCS v2.0.12 provided less Gs (Fig. 1a), indicating that a decreased G count is the cause of the lower methylation levels. HCS v2.0.10 provided a methylation level approximately 5% higher than that obtained by HCS v2.0.12, even though the same RTA version (v1.17.21.3) was used (Fig. 1b, Additional file 1: Figure S2b). Thus, HCS and not RTA was the major determinant of the observed differences. Regions with higher CpG density tended to show larger differences between HCS v2.0.5 and v2.0.12 (Fig. 1c, Additional file 1: Figure S2c). Furthermore, when we tried to identify partially methylated domains (PMDs) in EpiLCs, the two versions gave very different results (Additional file 1: Figure S2d). The PMDs are observed in several cell types including cancer cells [20] and associated with intermediate levels of methylation, specific histone modifications, nuclear lamina, and gene silencing [21, 22], showing that different HCS versions can impact biological outcomes.
We next prepared a paired-end PBAT library from IMR-90 human fibroblasts and compared the results obtained using the three HCS versions (Additional file 1: Table S1). As mentioned above, 5mC appears as G in R1 and as C in R2 in paired-end PBAT-seq (Additional file 1: Figure S1b). R1 data showed CpG methylation differences with different HCS versions (Fig. 1d), similar to those observed by single-end PBAT-seq (Fig. 1a). Interestingly, R2 data showed smaller differences (<1.0%) (Fig. 1d). Thus, R1 data derived by HCS v2.0.12 produced a significantly lower (approximately 4%) methylation level than the corresponding R2 data (Fig. 1d). Ideally, R1 and R2 data should provide identical methylation levels. Among the HCS versions, v2.0.5 produced the closest R1 and R2 methylation levels (Fig. 1d).
We previously performed paired-end MethylC-seq with human genomic DNA (purchased from Promega) using HCS v2.0.12 and RTA v1.17.21.3 (accession no. DRA002280) [14]. MethylC-seq is designed to read 5mC as C in R1 and as G in R2 [3], which is the opposite of PBAT-seq. The global CpG methylation level determined using R2 data (55.5%) was lower than that determined using R1 data (57.4%) as expected. However, the difference between R1 and R2 methylation levels was relatively small (1.9%) compared with the other paired-end WGBS cases. We then realized that this particular paired-end MethylC-seq library had contained unconverted PhiX DNA at 50% w/w. Thus, the 50% w/w PhiX DNA spike-in may have alleviated the problem, perhaps through increasing the sequence diversity, for this run using HCS v2.0.12. These results suggest that R1 of PBAT-seq and R2 of MethylC-seq provide lower methylation levels than the other reads and that the methylation level is lower when 5mC is read as G or is higher when 5mC is read as C.
The HCS version suitable for WGBS
We then attempted to determine which version of HCS is most suitable for WGBS. To generate substrates with known CpG methylation levels, lambda phage DNA was methylated in vitro to near completion by treatment with SssI methyltransferase. We confirmed the overall resistance of the treated DNA to methylation-sensitive restriction enzyme HpaII (data not shown). Furthermore, we performed bisulfite sequencing at three loci (58 CpG sites), which demonstrated 97.9% CpG methylation (Additional file 1: Figure S4). We prepared a series of lambda DNA mixtures with increasing proportions of the methylated DNA (10, 44, and 88%) and performed paired-end PBAT-seq using the three HCS versions. We mapped the reads onto the lambda phage genome (48,502 base pairs) and obtained 3.5–8.0 million uniquely mapped reads (Additional file 1: Table S2), with the calculated average depths of 3,498–7,962 per strand. R1 data revealed striking differences in CpG methylation between the HCS versions, with HCS v2.0.5 showing the best performance (closest to the predetermined level) and v2.0.12 the poorest (Fig. 2a). In contrast, R2 data showed smaller differences, and all data were close to the predetermined level (Fig. 2a). HCS v2.0.5 provided the least differences between R1 and R2 methylation levels (Fig. 2b), consistent with the findings in human and mouse genomic DNAs (Fig. 1d). HCS v2.0.12 always provided the lowest methylation levels in R1 among the different versions (Figs. 1a, d, and 2a). These results indicate that HCS v2.0.5 is most suitable for WGBS among the three HCS versions and that HCS v2.0.12 provides methylation levels lower than the real values when 5mC is read as G. Since R2 data obtained with different HCS versions were all close to the real values, single-end MethylC-seq, where 5mC is read as C, should not be affected by the versions.
Quality scores assigned to 5mCs
Next, we examined the quality scores assigned to the respective bases of the PBAT-seq reads. In the single-end PBAT-seq data obtained from IMR-90 human fibroblasts, mouse EpiLCs, and mouse spermatogonia, quality scores over 30 (99.9% accuracy) were assigned to over 85% of the bases other than G (Additional file 1: Figure S5). However, the quality scores assigned to G greatly changed depending on the HCS versions. In particular, HCS v2.0.12 assigned low quality scores to G (only 18–36% of G had quality scores over 30) (Fig. 3a). Steep drops (>10) in quality score were observed at Gs in 78.0% of the sequence reads containing at least one G (IMR-90) (Fig. 3b). In contrast, high quality scores were consistently observed at Gs in the unconverted PhiX phage control lane of the same flow cell (Fig. 3b). HCS v2.0.10 assigned better quality scores to Gs than HCS v2.0.12 with the same RTA version (v1.17.21.3) (Fig. 3a). We did not find low score assignments to Gs in our PBAT-seq data generated using earlier HCS versions, including HCS v1.5.15, v1.4.8, and v1.1.37 (data not shown), suggesting that HCS v1 may not have the problem in G calling.
In the paired-end PBAT-seq using HCS v2.0.12 (IMR-90), Gs in R1 showed lower quality scores than Cs in R2 (Fig. 3c). In contrast, in the paired-end MethylC-seq using HCS v2.0.12 (accession no. DRA002280) [14], Gs in R2 had lower quality scores than Cs in R1, even though PhiX DNA was added at 50% w/w to confer sufficient sequence diversity (Fig. 3c). These results showed that HCS v2.0.12 has problems in scoring the fewest G bases in low diversity samples showing depletion of Gs. The low quality scores and fewer G outputs may be linked to each other, and both are likely due to the fact that base G has the lowest fluorescence intensity among the four bases [23].
Effect of cluster density on WGBS
The identification of individual clusters and determination of their coordinates by HCS relies on sequence diversity because discrimination of clusters in close proximity requires different fluorescence signals in the initial cycles. Thus, it has been reported that low diversity sequencing is affected in a high cluster density range [18]. To investigate the relationships between cluster density, HCS version, and CpG methylation level in WGBS, we prepared a series of dilutions of the paired-end PBAT library (IMR-90), loaded them onto flow cell lanes, and created different cluster densities (323–629 K per mm2; recommended range for v3 cluster kits 750–850 K per mm2) (Additional file 1: Table S3). The three HCS versions provided similar CpG methylation levels (<0.5% difference) at different cluster densities (Additional file 1: Table S3), indicating that lower cluster densities have little impact in this case.
However, HCS v2.2.38 assigned lower quality scores to Cs in R2, as the cluster density increases (Fig. 4a). Such a density-dependent decrease in quality score for C was not observed with the other HCS versions (v2.0.5 and v.2.0.12) (Fig. 4b). Also, Gs of R1 of the same sequencing run did not show such drops in quality score (data not shown). Because the PhiX control at a high cluster density (672 K per mm2) showed good quality scores at Cs in R2 (Fig. 4a), we speculate that HCS v2.2.38 provides lower quality scores to the fewest bases in low-diversity R2 data. Furthermore, R2 data generated using HCS v2.2.38 at 629 K per mm2 provided a high global non-CpG (CpA, CpT, and CpC) methylation level (1.86%) for IMR-90 human fibroblasts, which is clearly different from other data (<0.1%) [24], suggesting that R2 data may be less accurate.
Recently, HCS v2.2.58 and RTA v1.18.64 were released from Illumina. To examine the performance of the latest versions, we performed paired-end PBAT-seq on an IMR-90 library constructed from the same DNA as the above studies (Additional file 1: Table S3). These versions provided a global CpG methylation level similar to that obtained by HCS v2.0.5 (62.4% versus 63.0%) in R1. However, they assigned very low quality scores to overall R2 data at a modest cluster density (483 K per mm2) (Fig. 4c) and quality scores over 30 were assigned to only 1.8% of all bases.
WGBS data generated by new Illumina systems
Finally, we analyzed published paired-end WGBS data generated by new Illumina systems, HiSeq 4000 and NextSeq 500. HCS v3.3.x is installed on HiSeq 4000, which uses patterned flow cell technology. We analyzed MethylC-seq data (GSM1707686 and GSM2137773) generated by HiSeq 4000 and found that the difference between R1 and R2 global CpG methylation levels was 2.3% in both data sets. As expected, R1 produced a higher methylation level than corresponding R2 (Additional file 1: Figure S6a). Quality scores over 30 were assigned to approximately 50% of the Gs in R2 (Additional file 1: Figure S6a). Thus, the performance of HiSeq 4000 seemed better than HCS v.2.0.12, but it was not clear whether its performance was better than HCS v2.0.5.
We also analyzed MethylC-seq data (GSM1973803 and GSM1973807) generated by NextSeq 500, which uses a two-color chemistry. Calls for G are made where there is actually no signal on a flow cell. In this run, 30% PhiX DNA was spiked in [25]. We found that the difference between R1 and R2 global CpG methylation levels was relatively small (1.9 and 0.6%) (Additional file 1: Figure S6b). Quality scores over 30 were assigned to 73% of the Gs in R2 (Additional file 1: Figure S6b). Taken together, WGBS outputs by HiSeq 4000 and NextSeq 500 produced better results than HCS v2.0.12. Since we had no information on the software versions and cluster densities, further validation requires replicates from the same WGBS libraries.