Digital PCR provides sensitive and absolute calibration for high throughput sequencing
© White et al; licensee BioMed Central Ltd. 2009
Received: 30 September 2008
Accepted: 19 March 2009
Published: 19 March 2009
The Erratum to this article has been published in BMC Genomics 2009 10:541
Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample (typically micrograms) are needed for library preparation, thereby limiting the scope of samples that can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing.
We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth.
The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.
A new generation of sequencing technologies based on sequencing by synthesis and sequencing by ligation is revolutionizing biology, biotechnology, and medicine [1, 2]. A key advance facilitating higher throughput and lower costs for several of these platforms was the migration from the clone-based sample preparation used in Sanger sequencing to the massively parallel clonal PCR amplification of sample molecules on beads (Roche 454 and ABI SOLiD) or on a surface (Solexa) [3, 4]. The workflow for these new sequencing technologies proceeds as follows: library creation, library quantification, massively parallel clonal PCR amplification of library molecules, and sequencing. During library creation, adaptor sequences are appended to both ends of the DNA molecules in a sample. The presence of these adaptors enables the amplification of random-sequence inserts by parallel PCR amplification of millions of individual DNA molecules. On the Roche/454 and ABI/SOLiD platforms, emulsion PCR is used to amplify a single DNA molecule to millions of copies of the same sequence, all attached to a single polymer bead. On the Illumina/Solexa platform, library molecules are captured by surface-tethered probes complementary to the adaptor sequences and are amplified by bridge PCR to convert a single DNA molecule into a surface-bound cluster with many copies of the same sequence.
Accurate quantification of the number of library molecules is a critical factor affecting next-generation sequencing performance. Underestimation of library concentration results in multiple library molecules associating with the same bead within an emulsion microdroplet or overlapping images of DNA clusters after bridge PCR. The consequences are mixed signals or un-resolvable clusters, which reduce the number of high quality reads. Overestimation of library concentration results in fewer DNA-bearing beads after emulsion PCR or sparse clustering in bridge PCR, in which case the full capacity of the sequencer cannot be realized. Accurate quantification of the sequencing library is essential to achieve high yield and high quality sequencing. Inaccuracy in quantification is addressed by the manufacturers through 'titration' runs of the sequencer, which are used to empirically divine the concentration of productive DNA fragments in the sequencing library. The accuracy of digital PCR and its ability to count only amplifiable molecules obviate the need for expensive and time-consuming titration sequencing runs.
Comparison of current sequencing library quantification methods
SYBR Green I intercalating fluorophore
Hydrolysis probe (TaqMan)
Hydrolysis probe (TaqMan)
(7.2 billion copies)
(91 billion copies)
(3.6 billion copies)
No standard necessary
Required – calibrated by mass
Required – calibrated by mass
Required – calibrated by mass
Required – calibrated by mass
No standard necessary
Ricicova, M. et al (2003)
Jones, Lj et al. (1998)
Simpson (2000); Meyer (2008)
Zhang (2003); This work
Kalinina (1997); This work
We developed a digital PCR-based method for highly accurate absolute quantification of sequencing libraries that consumes subfemtogram amounts of library material. Digital PCR is a technique in which a limiting dilution of the sample is made across a large number of separate PCR reactions such that most of the reactions have no template molecules and give a negative amplification result. By counting the number of positive PCR reactions at the reaction endpoint, one can count the individual template molecules present in the original sample one by one. The term 'digital PCR' was coined in 1999. A major advantage of digital PCR is that the quantification is independent of variations in the amplification efficiency: successful amplifications are counted as one molecule, independent of the actual amount of product. PCR-based techniques have the additional advantage of only counting molecules that can be amplified, i.e. those that are relevant to the massively parallel PCR step in the sequencing workflow. Because digital PCR has single-molecule sensitivity, only a few hundred library molecules are required for accurate quantification. Elimination of the quantification bottleneck reduces the sample input requirement from micrograms to nanograms or less, opening the way for minute and/or precious samples onto the next-generation sequencing platforms without the distorting effects of pre-amplification. Here we demonstrate the utility of digital PCR to directly prepare trace (<1 microgram) DNA samples for bulk sequencing on the 454/Roche and Solexa/Illumina sequencing platforms.
Universal Template TaqMan PCR assay design
UT-digital PCR assay for sequencing library quantification
UT-Digital PCR assay enables direct bulk sequencing of trace samples on 454
Table 2. Trace microbial/human 454 FLX library construction

Input (total molecules by mass)    ssDNA library (total molecules by UT-dPCR)
1.19 × 10¹¹                        2.61 × 10⁷
8.52 × 10¹⁰                        6.15 × 10⁶
3.41 × 10¹⁰                        4.20 × 10⁶
1.70 × 10¹⁰                        4.08 × 10⁶
3.41 × 10⁹                         4.08 × 10⁵
1.70 × 10⁹                         2.31 × 10⁵
1.27 × 10¹¹                        2.67 × 10⁷
9.10 × 10¹⁰                        2.11 × 10⁷
3.64 × 10¹⁰                        2.15 × 10⁶
1.82 × 10¹⁰                        8.49 × 10⁵
3.64 × 10⁹                         8.67 × 10⁴
1.82 × 10⁹                         1.31 × 10⁵
1.18 × 10¹¹                        3.24 × 10⁶
1.01 × 10¹¹                        1.87 × 10⁶
8.41 × 10¹⁰                        1.49 × 10⁶
2.26 × 10¹²                        3.63 × 10⁸
9.26 × 10¹⁰                        9.06 × 10⁶

(Columns without surviving values: Mean library fragment size (bp); dPCR replicate CV; Library prep recovery %.)
To assess the reproducibility of the UT-digital PCR assay, we analyzed the coefficient of variation (CV) among replicate UT-digital PCR quantifications of the trace 454 libraries (Table 2). The mean CV was 9.0% with a standard error of the mean (SEM) of 1.2%, indicating that the UT-dPCR assay is precise to within about 10%. To compare the reproducibility of UT-dPCR and UT-qPCR quantification directly, we carried out a dedicated study. A variety of 454 libraries were assayed in replicate (six to eight replicates per library per method) by both UT-digital PCR and UT-quantitative PCR [see Additional file 2]. The real-time PCR measurements were carried out using an ideally prepared standard curve. The mean CV for the UT-digital PCR assay was 11.8 ± 1.5%, consistent with the results obtained by UT-digital PCR for the trace libraries and significantly lower than the CV measured for UT-quantitative PCR, 21.2 ± 2.6% (p < 0.05, t-test, Figure 3B). Because the digital assay relies on neither internal nor external standards, the CV figure of 11.8% represented in Figure 3B closely approximates the real-world accuracy of the digital assay, which is sufficient to prepare bulk emulsion PCR or bridge PCR reactions without prior titration.
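The precision figures quoted above (a per-library CV, then a mean CV with its SEM across libraries) can be computed from replicate counts in a few lines. The following is a minimal sketch rather than the authors' analysis: the function names and the replicate values are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV of replicate quantifications: sample standard deviation / mean."""
    return stdev(replicates) / mean(replicates)


def mean_and_sem(values):
    """Mean and standard error of the mean, e.g. over per-library CVs."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical replicate panel counts for three libraries (not data from the study).
libraries = {
    "lib_A": [212, 240, 198, 225, 231, 207],
    "lib_B": [310, 295, 342, 288, 301, 330],
    "lib_C": [158, 171, 149, 180, 166, 152],
}

cvs = [coefficient_of_variation(counts) for counts in libraries.values()]
avg_cv, sem_cv = mean_and_sem(cvs)
print(f"mean CV = {avg_cv:.1%}, SEM = {sem_cv:.1%}")
```

Reporting the mean CV together with its SEM, as the text does, shows both the typical replicate scatter and how consistent that scatter is from library to library.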
454 FLX trace library sequence results (table columns: Proportion of library sequenced; Raw bases (Mbp); Number of reads; Average read length (bp); % mapping to template/assembling*)
2,400,000 sequencing library molecules (or 0.71 pg amplifiable DNA) from an Acetonema longum shotgun library (prepared according to the standard library preparation method from 723 ng of genomic DNA) were sufficient for digital PCR, emulsion PCR and sequencing on the 454 FLX. From these molecules, 74% of the beads loaded gave useful 454 sequence data (4.13% 'mixed' reads and 4.28% 'dot' reads) to yield 67 Mbp in 278,181 reads on one large PTP region (one-half of the 454 FLX sequencing run). Together with 38 Mbp of shotgun data from another run, 105.6 Mbp of very high quality A. longum shotgun data were obtained without any titration, 104.3 Mbp of which assembled de novo with Newbler to give better than 20-fold coverage of the ~5 Mbp Acetonema longum genome with an N50 contig size in excess of 50,000 bp. These results indicate that significant quantities of DNA pyrosequencing data can be obtained from subnanogram DNA samples without titration runs.
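The read-length and coverage figures in this paragraph follow from simple ratios. As a quick check, the sketch below recomputes them from the numbers quoted in the text; the function names are illustrative, and because the 67 Mbp and ~5 Mbp values are rounded in the source, the results are approximate.

```python
def average_read_length(total_bases, n_reads):
    """Mean read length in bases."""
    return total_bases / n_reads


def fold_coverage(assembled_bases, genome_size_bases):
    """Average depth of coverage: assembled bases / genome size."""
    return assembled_bases / genome_size_bases


# Values quoted in the text: 67 Mbp in 278,181 reads;
# 104.3 Mbp assembled against a ~5 Mbp genome.
read_len = average_read_length(67e6, 278_181)  # about 241 bases
coverage = fold_coverage(104.3e6, 5e6)         # about 20.9-fold
print(f"{read_len:.0f} bases per read, {coverage:.1f}-fold coverage")
```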
UT-digital PCR assay enables trace Solexa library quantification and sequencing
Solexa trace library generation & sequence results

Sample                    DNA Library (total molecules by UT-dPCR)*/μl
Plasma DNA Sample 1       1.07 × 10¹¹
Plasma DNA Sample 2       7.88 × 10¹⁰
Plasma DNA Sample 3       7.17 × 10¹⁰
Plasma DNA Sample 4       6.03 × 10¹⁰
Plasma DNA Sample 5       7.17 × 10¹⁰
Plasma DNA Sample 6       7.23 × 10¹⁰
Whole Blood DNA Sample    6.30 × 10¹⁰

(Columns without surviving values: Average number of clusters generated per tile; Total number of reads; % mapping to Human reference (hg 18).)
Environmental and clinical sampling for diagnostic, forensic, and metagenomic applications often yields mere nanograms of genetic material, an amount presently considered insufficient to support next-generation library preparation. Common practice is to amplify the material using PCR or whole genome amplification, methods that, intentionally or not, bias the overall representation of the sample. There exists a clear need for a straightforward and reliable method to bring nanogram and subnanogram samples onto the next-generation sequencing platforms.
Quantifying the sequencing libraries by mass, as recommended in the sequencing protocols, presents three major stumbling blocks that render the quantification inaccurate to the degree where the sequencing results are compromised. First, mass-based quantification requires an accurate estimate of the length of the molecules to determine the molar concentration of DNA fragments. Second, degraded and damaged molecules that cannot be amplified in the massively parallel amplification step are counted. And third, methods of measuring DNA mass lack sensitivity, and are inaccurate at or below low-nanogram quantities.
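The first stumbling block, the dependence on fragment length, is easy to see in the mass-to-molecule conversion itself. The sketch below is illustrative only: the function name is hypothetical, and the average molar masses (about 660 g/mol per base pair for dsDNA, about 330 g/mol per nucleotide for ssDNA) are common textbook approximations, not values taken from this article.

```python
AVOGADRO = 6.022e23             # molecules per mole
DSDNA_G_PER_MOL_PER_BP = 660.0  # textbook approximation (assumption)
SSDNA_G_PER_MOL_PER_NT = 330.0  # textbook approximation (assumption)


def molecules_from_mass(mass_ng, mean_length,
                        g_per_mol_per_unit=DSDNA_G_PER_MOL_PER_BP):
    """Estimate the number of DNA molecules from mass and mean fragment length."""
    if mean_length <= 0:
        raise ValueError("mean_length must be positive")
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (mean_length * g_per_mol_per_unit)


# The same 1 ng of dsDNA under two different length estimates.
for length_bp in (250, 500):
    print(length_bp, f"{molecules_from_mass(1.0, length_bp):.2e}")
```

Because the estimate is inversely proportional to the assumed mean length, a twofold error in the length estimate becomes a twofold error in the molecule count, before damaged or non-amplifiable molecules are even considered.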
Quantitative real-time PCR, and especially digital PCR, are ideal candidate techniques for this application because of their exquisite sensitivity. Some detection chemistries for real-time PCR, such as TaqMan, have the property of counting molecules rather than measuring DNA mass, although in the real-time modality, the measurements are relative and the methods by which standards are established often tie the real-time PCR results back to mass.
Recently, Meyer et al. developed a SYBR Green real-time PCR assay that allows the user to estimate the number of amplifiable molecules in sequencing libraries. This was the first report of PCR-based quantification of sequencing libraries, and extended the sensitivity of library quantification significantly, although to an unknown extent, since the source material used to make the Neandertal (presumably the lowest input quantity) libraries was not quantified. However, the SYBR Green assay presents several disadvantages: SYBR Green I dye is an intercalating fluorochrome that gives signal in proportion to DNA mass, not molecule number; SYBR Green assays rely on external standards that limit the absolute accuracy and are not universal to all sample types; finally, intercalating fluorochromes give signal from nonspecific PCR reaction products. After this manuscript was submitted, a report from the Sanger Center describing the use of real-time TaqMan PCR to quantify sequencing libraries appeared. While this eliminates some of the problems related to SYBR Green, it was not applied to trace libraries and suffers from the same drawbacks as all real-time assays.
In a real-time assay, the standard must have the same amplification efficiency and molecular weight distribution as the unknown library sample. This means the user must have on hand a bulk sequencing library very similar to the trace library being made and that the molecular weight distributions of both the standard and the new library be known – often an impractical requirement for low-concentration shotgun libraries. Furthermore, this standard library must be of extremely high quality if mass-based quantification is to be used to calibrate the assay for amplifiable molecules. If not, the concentration of all the unknown samples will be overestimated, and the yield of enriched beads or clusters will be poor. For this reason, Roche and Illumina recommend carrying out a four-point titration run on their sequencers to empirically determine the quantity of DNA to be used before carrying out a bulk sequencing run with a new library. In addition, Illumina recommends that the user check the library quality with traditional Sanger sequencing before its application to high-throughput sequencing.
Lastly, sequence-nonspecific detection chemistries like SYBR Green give signal from all dsDNA products generated, including primer dimers and nonspecific amplification products, which can be a severe issue in complex samples. In particular, side products can compete with specific amplification from low numbers (<1000) of template molecules, limiting the accuracy of SYBR Green quantification for dilute samples. Although the presence of these side products can often be discerned by analysis of the product melting curve, opportunities to optimize the primers are limited due to the short length of the adaptor sequences and the specific nucleotide sequences required for compatibility with proprietary sequencing reagents. Sensitivity to side products gives SYBR Green a tendency toward overestimation of the sample quantity.
The characteristics of the quantification methods discussed are summarized in Table 1. The digital PCR method eliminates the issues associated with mass-based quantification and real-time PCR, as well as the requirement for titration, significantly reducing the cost of preparing a library for bulk sequencing. For example, the marginal cost of titrating a 454 library on the sequencer according to the manufacturer's protocol is $1500 – $2000, while the cost to quantify a sequencing library on the digital PCR chips is $30 – $90, depending on the number of panels dedicated to each library (typically 1 – 3 panels per library). In addition, PCR-based quantification saves time and leaves the expensive sequencing instrument free to carry out bulk sequencing runs.
Our results demonstrate that significant quantities of high quality sequencing data can be obtained from nanogram quantities of genetic material with the aid of digital PCR quantification. Digital PCR quantifies the amount of DNA by counting the number of positive amplification reactions from individual DNA molecules independent of amplification efficiency, and requires no standard, calibration, or information about the molecular weight distribution of the template molecules. The extraordinary sensitivity of real-time and digital PCR eliminates quantification as a material-limiting step in the sequencing workflow, bringing greater focus to library preparation procedures as the next most limiting step in sequencing trace samples. It is natural to expect that library preparation protocols developed with the capacity to handle up to five micrograms of input are far from optimal with respect to minimizing loss from nanogram or picogram samples. A procedure optimized for trace samples with reduced reaction volumes and media quantities, possibly formatted in a microfluidic chip, has the potential to dramatically improve the recovery of library molecules, allowing preparation of sequencing libraries from quantities of sample comparable to that actually required for the sequencing run, e.g. close to or less than one picogram.
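The scale of the loss during library preparation can be illustrated with one pair of values from the trace 454 library table. This is a hedged sketch: it assumes that recovery is defined as the UT-dPCR library count divided by the mass-based input count, and that the first two values of that table (1.19 × 10¹¹ and 2.61 × 10⁷) belong to the same sample. The function name is hypothetical.

```python
def library_recovery_percent(input_molecules, library_molecules):
    """Percentage of input molecules recovered as amplifiable library molecules."""
    if input_molecules <= 0:
        raise ValueError("input_molecules must be positive")
    return 100.0 * library_molecules / input_molecules


# Assumed pairing of input (by mass) and library (by UT-dPCR) counts.
recovery = library_recovery_percent(1.19e11, 2.61e7)
print(f"recovery = {recovery:.3f} %")  # roughly 0.02 %
```

Under these assumptions only a few molecules in every ten thousand survive library preparation, which is why the preparation protocol, rather than quantification, becomes the limiting step for trace samples.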
Digital PCR quantification is sufficiently accurate in counting amplifiable library molecules to justify elimination of titration techniques as well as the associated time and cost. The method is also hundreds of millions of times more sensitive than traditional means of library quantification, and allows the sequencing of libraries prepared from tens to hundreds of picograms of starting material, rather than the micrograms of DNA required by the manufacturers' protocols. The reduced sample requirement enables the application of next-generation sequencing technologies to minute and precious samples without the need for pre-amplification.
DNA was extracted from mid-log phase E. coli K12 and Acetonema longum cultures using Qiagen's DNeasy Tissue & Blood kit and further purified using Qiagen's QIAquick PCR purification kit following the manufacturer's protocol. E. coli amplicons were generated from 16S rRNA PCR following standard protocols to generate a uniform 466 bp fragment. DNA for sample pX and the Solexa libraries was extracted from human plasma or whole blood using Qiagen's DNA Blood Mini Kit or Macherey-Nagel's NucleoSpin Plasma Kit according to the manufacturers' protocols. Samples K27-1, K27-2, and IgG consisted of purified mouse DNA from chromatin immunoprecipitation experiments. The initial E. coli DNA template and Acetonema longum sample used for 454 FLX sequencing were quantified by Nanodrop, Agilent Bioanalyzer DNA chip, and a 16S rRNA qPCR assay. The E. coli template was further diluted to the 0.5 – 35 ng range for samples TS 1 – 6 prior to library construction. The E. coli template for samples TS 7 – 12 was PCR-amplified from 3 ng of initial template. The PCR product was quantified by Nanodrop, Bioanalyzer DNA chip, and 16S rRNA qPCR, then aliquotted to the final amounts (0.5 – 35 ng) before library construction. For samples K27-1, K27-2, and IgG, the initial template DNA was quantified by Nanodrop and Agilent Bioanalyzer. Sample pX was quantified by Nanodrop, Agilent Bioanalyzer, and digital PCR with human-specific primers.
Sequencing library preparation
454 shotgun libraries were generated according to the manufacturer's protocol with a few adjustments: trace E. coli amplicons and human sample pX were not nebulized; 0.01% Tween-20 was added to the elution buffer for each mini-elute column purification step; and libraries were eluted using 1× TE containing 0.05% Tween-20 at a volume of 30 μl. Single-stranded libraries were aliquotted for storage. Solexa libraries were generated following the standard genomic DNA protocol with the following adjustments: no nebulization was performed on plasma DNA samples since they were already naturally fragmented (average ~170 bp); the whole blood genomic DNA sample was sonicated to produce fragments between 100 and 400 bp; all ligated products were used for 18-cycle PCR enrichment; no gel extraction was performed; and no Sanger sequencing was used to confirm fragments of correct sequence.
Standard creation for UT-quantitative PCR on the Stratagene Mx3005
After sequencing library preparation, UT-quantitative PCR was used to range the concentration for UT-digital PCR. For use with UT-quantitative PCR, a standard library was created, quantified by UT-digital PCR and serially diluted for use as a UT-quantitative PCR standard. In order to ensure uniform amplification among various libraries, several standard samples were prepared such that in each UT-quantitative PCR the fragment length distribution and average GC content of the standard approximated those of the samples being quantified.
UT-quantitative PCR quantification on Stratagene's Mx3005
Thermocycling parameters for UT-quantitative PCR and UT-digital PCR
Standard Adapters 454
UT-dPCR & UT-qPCR
95 °C, 3 min
95 °C, 3 min
95 °C, 3 min
95 °C, 10 min
94 °C, 30 s
95 °C, 3 s
95 °C, 15 s
95 °C, 15 s
60 °C, 30 s
65 °C, 30 s
65 °C, 30 s
60 °C, 1 min
72 °C, 45 s
UT-digital PCR quantification on Fluidigm's BioMark System
454 libraries: UT-quantitative PCR was first performed on aliquotted libraries in order to estimate the dilution factor for UT-digital PCR. The libraries were diluted to roughly 100–360 molecules per μl. PCR reaction mix containing the diluted template was loaded onto Fluidigm's 12.765 Digital Array microfluidic chip. The microfluidic chip has 12 panels and each panel contains 765 chambers. The concentration of diluted template that yielded 150–360 amplified molecules per panel was chosen for technical replication. Six replicate panels on the digital chip were assayed in order to obtain absolute quantification of the initial concentration of library.
Solexa libraries: quantitative real-time PCR using human specific primers was first performed to estimate the dilution factor required for carrying out UT-digital PCR. The final dilution yielded 150–360 amplified molecules per panel.
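The counting step described above can be sketched as follows. This is not the BioMark analysis software: the Poisson correction for chambers that receive more than one molecule is a standard digital PCR estimate that the text does not explicitly describe, and the function names, the `panel_volume_ul` parameter, and the example counts are all hypothetical. Only the figure of 765 chambers per panel comes from the text.

```python
import math

CHAMBERS_PER_PANEL = 765  # from the 12.765 Digital Array described above


def molecules_per_panel(positive_chambers, chambers=CHAMBERS_PER_PANEL):
    """Poisson-corrected estimate of template molecules loaded into one panel."""
    if not 0 <= positive_chambers < chambers:
        raise ValueError("positive_chambers must be in [0, chambers)")
    fraction_negative = 1 - positive_chambers / chambers
    # Expected molecules per chamber is -ln(fraction of negative chambers).
    return -chambers * math.log(fraction_negative)


def stock_concentration(positive_counts, panel_volume_ul, dilution_factor):
    """Mean molecules per µl of undiluted library across replicate panels."""
    estimates = [molecules_per_panel(k) for k in positive_counts]
    mean_per_panel = sum(estimates) / len(estimates)
    return mean_per_panel / panel_volume_ul * dilution_factor


# Hypothetical example: six replicate panels, an assumed panel volume, 1e6-fold dilution.
counts = [231, 248, 225, 260, 239, 244]
print(f"{stock_concentration(counts, panel_volume_ul=4.6, dilution_factor=1e6):.3e}")
```

With a few hundred positive chambers out of 765, the corrected estimate is noticeably larger than the raw positive count, since some chambers hold two or more molecules. Keeping the load in a range where most chambers remain negative keeps that correction modest.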
Primer/probe list for UT-quantitative PCR and UT-digital PCR
Primers for Standard 454 libraries:
Primers for 454 MID/Paired end libraries:
Primers for Solexa libraries:
Universal probe sequence:
Emulsion PCR/Bridge PCR & Sequencing
454 sequencing: Sequencing was performed according to manufacturer's protocol. No titration or Sanger sequencing was performed. The DNA to bead ratios of 0.085 – 0.300 (based on UT-digital PCR quantification) were used. These ratios resulted in acceptable enrichment sequencing results, including an incidence of 'mixed' reads clustering below 20%. 'Mixed' reads in 454 sequencing are defined as four consecutive positive nucleotide flows for a given read. Solexa sequencing: Sequencing libraries were first diluted to 10 nM according to the concentration determined by digital PCR. The average dilution factor was 10 – 20. Diluted libraries were denatured with 2 N NaOH and then diluted to a final concentration of 4 pM. The templates were loaded onto flow cells. Cluster generation was performed according to the manufacturer's instructions. Sequencing was carried out on the Genome Analyzer II. No titration or Sanger sequencing was performed.
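The loading calculations implied by this paragraph reduce to unit conversions. The sketch below is illustrative: the function names and the bead count are hypothetical, while the example concentration (1.07 × 10¹¹ molecules per µl) is the Plasma DNA Sample 1 value from the Solexa library table, and the 10 nM target and the 0.085 DNA-to-bead ratio are taken from the text.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_per_ul_to_nM(molecules_per_ul):
    """Convert a digital PCR count (molecules/µl) to a molar concentration in nM."""
    mol_per_litre = molecules_per_ul * 1e6 / AVOGADRO
    return mol_per_litre * 1e9


def dilution_factor(stock_nM, target_nM):
    """Fold dilution needed to bring the stock down to the target concentration."""
    return stock_nM / target_nM


def molecules_for_beads(n_beads, dna_to_bead_ratio):
    """Library molecules to add to an emulsion PCR for a given DNA-to-bead ratio."""
    return n_beads * dna_to_bead_ratio


stock_nM = molecules_per_ul_to_nM(1.07e11)       # about 178 nM
to_10_nM = dilution_factor(stock_nM, 10.0)       # about 18-fold
to_4_pM = dilution_factor(10.0, 0.004)           # 10 nM down to 4 pM
needed = molecules_for_beads(2_000_000, 0.085)   # hypothetical bead count
print(f"{stock_nM:.0f} nM, dilute {to_10_nM:.1f}-fold, then {to_4_pM:.0f}-fold")
```

The roughly 18-fold first dilution computed for this sample falls within the 10 – 20 range of dilution factors reported above.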
CV: Coefficient of Variation
FLX: genome sequencer 'flex' from Roche/454
LOD: Limit of Detection
LOQ: Limit of Quantification
MDA: Multiple Displacement Amplification
PCR: Polymerase Chain Reaction
SEM: Standard Error of the Mean
SGTC: Stanford Genome Technology Center
WGA: Whole Genome Amplification.
We thank Angela Wu, Jared Leadbetter, Liz Otteson, Baback Gharizadeh, Farbod Babrzadeh, and Roxana Jalili for sharing samples and/or data. We thank Matthias Meyer for helpful discussions. We also thank Joseph Derisi, Clement Chu, and Nick Ingolia for their help in carrying out sequencing experiments on the Solexa Genome Analyzer. This work was supported by Pioneer funding from the NIH to SRQ.
- Holt RA, Jones SJM: The new paradigm of flow cell sequencing. Genome Research. 2008, 18: 839-846. 10.1101/gr.073262.107.
- Gupta PK: Single-molecule DNA sequencing technologies for future genomics research. Trends in Biotechnology. 2008, 26: 602-611. 10.1016/j.tibtech.2008.07.003.
- Bing DH, Boles C, Rehman FN, Audeh M, Belmarsh M, Kelley B, Adams CP: Bridge amplification: a solid phase PCR system for the amplification and detection of allelic differences in single copy genes. Genetic Identity Conference Proceedings, Seventh International Symposium on Human Identification. 1996, [http://www.promega.com/geneticidproc/ussymp7proc/0726.html]
- Margulies M, et al: Genome sequencing in microfabricated high-density picoliter reactors. Nature. 2005, 437: 376-380.
- Mackelprang R, Rubin EM: Paleontology: New Tricks with Old Bones. Science. 2008, 321: 211-212. 10.1126/science.1161890.
- Pinard R, de Winter A, Sarkis GJ, Gerstein MB, Tartaro KR, Plant RN, Egholm M, Rothberg JM, Leamon JH: Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing. BMC Genomics. 2006, 7: 216-237. 10.1186/1471-2164-7-216.
- Kalinina O, Lebedeva I, Brown J, Silver J: Nanoliter scale PCR with TaqMan detection. Nucleic Acids Research. 1997, 25: 1999-2004. 10.1093/nar/25.10.1999.
- Vogelstein B, Kinzler KW: Digital PCR. Proc Natl Acad Sci USA. 1999, 96: 9236-9241. 10.1073/pnas.96.16.9236.
- Heid CA, Stevens J, Livak KJ, Williams PM: Real-time quantitative PCR. Genome Research. 1996, 6: 986-994. 10.1101/gr.6.10.986.
- Zhang Y, Zhang D, Wenquan L, Chen J, Peng Y, Cao W: A novel real-time quantitative PCR method using attached universal template probe. Nucleic Acids Research. 2003, 31: e123. 10.1093/nar/gng123.
- Meyer M, Briggs AW, Maricic T, Höber B, Höffner B, Krause J, Weihmann A, Pääbo S, Hofreiter M: From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing. Nucleic Acids Research. 2008, 36: e5. 10.1093/nar/gkm1095.
- Quail MA, Kozarewa I, Smith F, Scally A, Stephens PJ, Durbin R, Swerdlow H, Turner DJ: A large genome center's improvements to the Illumina sequencing system. Nature Methods. 2008, 5: 1005-1010. 10.1038/nmeth.1270.
- Simpson D, Feeney S, Boyle C, Stitt AW: Retinal VEGF mRNA measured by SYBR Green I fluorescence: A versatile approach to quantitative PCR. Molecular Vision. 2000, 6: 178-183.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.