High-throughput genotyping of single nucleotide polymorphisms with rolling circle amplification

Background Single nucleotide polymorphisms (SNPs) are the foundation of powerful complex trait and pharmacogenomic analyses. The availability of large SNP databases, however, has emphasized a need for inexpensive SNP genotyping methods of commensurate simplicity, robustness, and scalability. We describe a solution-based, microtiter plate method for SNP genotyping of human genomic DNA. The method is based upon allele discrimination by ligation of open circle probes followed by rolling circle amplification of the signal using fluorescent primers. Only the probe with a 3' base complementary to the SNP is circularized by ligation. Results SNP scoring by ligation was optimized to a 100,000-fold discrimination against probe mismatched to the SNP. The assay was used to genotype 10 SNPs from a set of 192 genomic DNA samples in a high-throughput format. Assaying directly from genomic DNA eliminates the need to preamplify the target, as is done for many other genotyping methods. The sensitivity of the assay was demonstrated by genotyping from 1 ng of genomic DNA. We demonstrate that the assay can detect a single molecule of the circularized probe. Conclusions Compatibility with homogeneous formats and the ability to assay small amounts of genomic DNA meet the exacting requirements of automated, high-throughput SNP scoring.


Background
Sequencing studies of human transcriptomes and genomes have identified hundreds of thousands of single nucleotide polymorphisms (SNPs), the most common type of human genetic variation [1]. Human SNPs are being assembled into extremely high-density genetic maps that are anticipated to considerably expand the remit of genetic analyses [2]. Broad availability of SNPs for candidate genes will enhance pharmacogenomic and complex trait association studies. Furthermore, extremely dense SNP maps have the potential to make possible genome-wide association studies for complex traits that bypass shortcomings of current genetic linkage analyses [3,4].
The emerging era of multiplexed SNP-based genetic analysis has underscored a need for simple and accurate genotyping methods that can accommodate thousands of loci with economy of cost and consumption of sample DNA. In general, current methods require pre-amplification of genomic DNA, typically by Polymerase Chain Reaction (PCR), followed by SNP genotyping with an allele discrimination method, such as DNA cleavage, ligation, single base extension or hybridization [5]. Current methods are limited either by expense, inaccuracy, consumption of sample DNA, or lack of scalability.
Recently, a new method for SNP detection from genomic DNA based on DNA ligase-mediated single nucleotide discrimination and signal amplification by Rolling Circle Amplification (RCA) has been described [6][7][8][9][10]. An oligonucleotide Open Circle Probe (OCP) anneals to the target SNP such that the 5' and 3' ends of the OCP can be ligated together, forming a circle topologically linked to the target. A base-pair match between the 3' end of the OCP and the SNP allows DNA ligase to circularize the OCP. A mismatch between the OCP and the SNP prevents ligation and circularization. In this manner, single base selectivity is achieved not only by the specific hybridization of the OCP ends to target sequences adjacent to the SNP, but also by the highly discriminative nick closure activity of the thermostable DNA ligase toward a perfectly matched substrate. Upon OCP circularization, an isothermal exponential RCA (ERCA) reaction involving an exonuclease (-) DNA polymerase with strand-displacement activity and two primers rapidly amplifies the signal by as much as 10^12-fold, allowing for direct SNP genotyping from small quantities of DNA target [6,11,12].
We describe here a simple, scalable assay for SNP genotyping directly from human genomic DNA that uses a 96-well plate format and fluorescent primers called Amplifluors™ [13][14][15][12,16]. Ten different SNPs have been characterized, each for DNA samples from 192 different individuals, using this reporter assay system. We also report that the ERCA allows genotyping with as few as ∼ 300 copies of the target sequence.

Experimental strategy
Two OCPs were designed for each SNP screened, one complementary to each of the two possible bases of a biallelic SNP. If the OCP is complementary to the SNP, it can be circularized by DNA ligase (Fig. 1). The OCPs were ∼ 90 nucleotides in length, and were designed such that regions at both the 5' and 3' ends annealed to contiguous segments of the SNP-containing target sequence. The OCP backbone sequence connecting the target-complementary 5' and 3' regions is unique for each of the two allele-specific OCPs. The 5' target-specific region of the OCP is 20-30 bp in length, while the 3' target-specific region is 12-20 bp [12]. The Tm of the 5' target-specific region is ∼ 5°C above the ligation temperature, while that of the 3' target-specific region is ∼ 15°C below the ligation temperature. Allele specificity of circularization is dependent upon two factors: 1) favorable hybridization of the matching OCP because of the correct base pair formed at the 3' end and also because of base stacking between the 3' and 5' annealed ends, and 2) specificity of the ligase for correctly base-paired ends.

Figure 1
Schematic of RCA for SNP identification. An allele-specific OCP anneals to the target sequence so that the 5' and 3' arms are adjacent to each other. The 3' terminal nucleotide of the OCP is at the SNP position. If it is complementary to the target, the OCP ends are covalently linked by a DNA ligase to form circles. The circles are then amplified in an ERCA reaction containing two primers, one of which is an Amplifluor containing a fluorophore and a non-fluorescent quencher as indicated. Incorporation of the primer into DNA product opens the primer hairpin, giving a fluorescent signal.

Upon ligation, the DNA circle creates an effective template for an exponential, or hyperbranching, RCA reaction [6,11]. The region of the OCP between the target-specific ends, which we have designated the OCP backbone, provides a binding site for a complementary primer (P1). In addition to P1, a reverse primer (P2) is also present to achieve amplification with exponential kinetics (Fig. 1). P1 is an "Amplifluor™" primer, with a 5' hairpin and loop structure with a fluorophore and quencher at the base of the hairpin arms [13,12,16]. The two allele-specific OCPs have different backbone sequences complementary either to a FAM- or TET-labeled P1, allowing discrimination of the polymorphic base. Both P1s have an internal DABCYL quencher. Amplifluor primers are designed such that, at the ambient temperature of the assay, the fluorophore and the quencher are in close proximity, preventing fluorescence. Incorporation of Amplifluor into a double-stranded ERCA product opens the hairpin, separating the fluorophore and quencher and resulting in increased fluorescence. Two different allele-specific P2s are also used in the assay. They are partly target- and partly backbone-specific, corresponding to the 3' target-specific region of the OCP, and 7-12 nucleotides of the adjacent backbone sequence. The backbone-specific 5' region of the allele-specific P2s not only increases the Tm of the primer to that desired for the assay, but also confers OCP specificity. P2s have two abasic residues at the 5' end to reduce primer-dimer formation (unpublished data).
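The arm design rules stated in this section (a 5' arm of 20-30 with a Tm about 5°C above the ligation temperature, and a 3' arm of 12-20 with a Tm about 15°C below it) can be written as a simple check. The following is a minimal sketch rather than the procedure used in the study: the class and function names, the 60°C default, and the ±3°C tolerance are illustrative assumptions, and the Tm values must be supplied by the caller from whatever Tm method is preferred.

```python
from dataclasses import dataclass


@dataclass
class OcpArms:
    """Target-specific arms of one open circle probe (hypothetical container)."""
    five_prime_len: int     # length of the 5' arm, as counted in the text (20-30)
    five_prime_tm: float    # Tm of the 5' arm in deg C, supplied by the caller
    three_prime_len: int    # length of the 3' arm, as counted in the text (12-20)
    three_prime_tm: float   # Tm of the 3' arm in deg C, supplied by the caller


def check_ocp_design(arms, ligation_temp_c=60.0, tm_tolerance_c=3.0):
    """Return a list of design problems; an empty list means the rules are met.

    tm_tolerance_c is an assumed tolerance; the text only says "about".
    """
    problems = []
    if not 20 <= arms.five_prime_len <= 30:
        problems.append("5' arm length outside 20-30")
    if not 12 <= arms.three_prime_len <= 20:
        problems.append("3' arm length outside 12-20")
    # 5' arm should melt about 5 deg C above the ligation temperature.
    if abs(arms.five_prime_tm - (ligation_temp_c + 5.0)) > tm_tolerance_c:
        problems.append("5' arm Tm not about 5 C above ligation temperature")
    # 3' arm should melt about 15 deg C below the ligation temperature.
    if abs(arms.three_prime_tm - (ligation_temp_c - 15.0)) > tm_tolerance_c:
        problems.append("3' arm Tm not about 15 C below ligation temperature")
    return problems


good = check_ocp_design(OcpArms(25, 65.0, 15, 45.0))
bad = check_ocp_design(OcpArms(25, 65.0, 25, 58.0))
print(good)
print(bad)
```

A 3' arm that is too long or too stable is flagged on both counts, which matches the reasoning that the 3' arm should remain in equilibrium with its target at the ligation temperature.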

Analysis of allele discrimination during the OCP ligation step of the assay
The first step in the genotyping assay involves SNP interrogation by ligation of the OCP using the thermostable ligase, Ampligase™, which is reported to have excellent discrimination between a matched and mismatched 3' terminal nucleotide [17]. In order to directly investigate the efficiency and specificity of the ligation step, allele discrimination of OCP ligation was analyzed using 5' 32P-labeled OCP. The oligonucleotide target contained the sequence for the M1101K locus of the CFTR gene, a disease-causing mutation. An electrophoretic mobility shift for circularized OCP allowed quantification of ligation kinetics. Use of 40 nM OCP and 100 nM oligonucleotide target allowed sufficient signal for detection without the ERCA step that is required for genomic DNA target. 10-15% of the OCP matching the SNP was ligated in 15 sec (Fig. 2a). For mismatched OCP, less than 1% was ligated after 6 hours (Fig. 2b). The ability of Ampligase to discriminate against a mismatch was measured by comparison of initial ligation rates:

$$\text{Allele discrimination} = \frac{\text{percent match ligation}}{\text{percent mismatch ligation}} \times \frac{\text{time}_{\text{mismatch}}}{\text{time}_{\text{match}}}$$

Ligation of matched OCP occurred at a 100,000-fold greater rate than for mismatched OCP. Greater temperature and shorter 3' OCP arm length both tended to give increased allele discrimination (Fig. 3). The Tm of the 5' OCP arm is greater than the reaction temperature and so the OCP tightly anneals to the target. However, the shorter 3' arm, having a Tm below the reaction temperature, is in equilibrium between the annealed and melted state. A component of the allele discrimination is apparently derived from differential hybridization of the matched and mismatched 3' arms. Presumably, increasing reaction temperature has the same effect as shortening the 3' OCP arm.
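The allele discrimination ratio defined above is straightforward to evaluate. The sketch below plugs in only the bounds quoted in the text (at least 10% match ligation in 15 sec, and less than 1% mismatch ligation after 6 hours); because the mismatch value is an upper bound, the result is a lower bound on the discrimination and not the measured figure. The function name is an illustrative choice.

```python
def allele_discrimination(pct_match, t_match_s, pct_mismatch, t_mismatch_s):
    """Ratio of initial ligation rates, matched versus mismatched OCP.

    Each rate is approximated as percent ligated divided by elapsed time,
    so the ratio is (pct_match / pct_mismatch) * (t_mismatch / t_match).
    """
    return (pct_match / pct_mismatch) * (t_mismatch_s / t_match_s)


# Bounds taken from the text: >=10% matched in 15 s, <1% mismatched in 6 h.
lower_bound = allele_discrimination(
    pct_match=10.0, t_match_s=15.0,
    pct_mismatch=1.0, t_mismatch_s=6 * 3600.0,
)
print(f"discrimination is at least {lower_bound:,.0f}-fold")  # 14,400-fold
```

Since far less than 1% of the mismatched OCP may actually have been ligated, a lower bound of this size is compatible with the reported 100,000-fold value derived from the initial rates.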

Analysis of the ERCA signal amplification step of the assay
Figure 2
Kinetics of the ligation reaction. All reactions contained an oligonucleotide target with the wild type T at the SNP position. Reactions labeled "Match" had the corresponding 3' A on the OCP (a). Reactions labeled "Mismatch" had T on the OCP 3' end (b). Reactions contained 20 units Ampligase/reaction (closed circles), 40 units Ampligase/reaction (open circles), or 80 units Ampligase/reaction (triangles). OCP was labeled on the 5' end with γ-32P-dATP and T4 polynucleotide kinase. Ligation was quantified as percent conversion to circular form on a 6% DNA sequencing gel.

In reactions where the OCP matches the SNP, an estimated 100-30,000 OCPs are circularized by ligation depending on the amount of input DNA target. Detection of this small number of circles depends on ERCA in the second stage of the assay. ERCA is capable of 10^12-fold signal amplification [6]. In order to investigate the efficiency of the ERCA step, OCP was circularized by ligation on a synthetic oligonucleotide target, and gel-purified to remove any linear DNA. A known number of the purified circles was added to reactions, allowing direct analysis of ERCA signal amplification. The circles were formed using OCP-1822T for the SNP G1822A using an oligonucleotide target (Table 1). A 10-fold serial dilution of the circles ranging from 10^6 to approximately one circle was used in the ERCA reactions. An Amplifluor primer was used to measure the kinetics of the reaction in a real-time instrument. The results confirmed that the assay is capable of detecting < 10 molecules of circular template (Fig. 4a). The time at which fluorescent signal is first detected (CT) is proportional to the number of starting templates over the six orders of magnitude tested (Fig. 4b). This demonstrates the quantitative potential of ERCA and that it is capable of detecting and amplifying the small number of circles formed by ligation on genomic DNA.
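At the low end of the dilution series, the number of circles that actually lands in a given reaction varies from well to well. As a rough illustration, and assuming simple Poisson sampling (an assumption made here, not a statement from the study), the chance that a reaction contains at least one circle can be estimated from the mean number of circles per reaction. The function name is hypothetical.

```python
import math


def prob_at_least_one(mean_copies):
    """Probability that a reaction receives one or more circles,
    assuming the count per reaction is Poisson with the given mean."""
    return 1.0 - math.exp(-mean_copies)


for mean in (1, 3, 10):
    print(f"mean {mean:>2} circles/reaction: P(>=1) = {prob_at_least_one(mean):.3f}")
```

Under this assumption, a dilution averaging one circle per reaction leaves roughly 37% of reactions with no template at all, which is worth bearing in mind when interpreting replicates near the single-molecule limit.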

SNP genotyping on human genomic DNA
The G1822A SNP on chromosome 13q32 was used for genotyping with OCPs using the SNP assay (B. Grimwade, unpublished). Human genomic DNA was digested with the restriction endonuclease Alu I and assayed in two different tubes, each containing one of the two allele-specific OCPs. Reactions contained 0.5 pM OCP and 100 ng of the genomic DNA, corresponding to a gene copy number of approximately 30,000. OCP ligation was performed with Ampligase for 20 min at 60°C, approximately 15°C above the Tm for the OCP 3' arm, in order to maximize specificity of ligation. Since the 5' arm of the OCP has a Tm above that of the ligation temperature, the 5' arm hybridizes to its target in a stable manner, and SNP specificity is achieved via the 3' arm.
The isothermal ERCA reactions were performed at 60°C using Bst DNA polymerase. The reactions contained the appropriate OCP-specific P1 Amplifluor primer and the corresponding allele-specific P2 (1 µM each). When the genomic DNA was homozygous for the G allele (Fig. 5a, top panel), a fluorescent signal was detected only for the OCP with a matching C at the 3' end and the corresponding primer pair P1inFAM and P2inC. Similarly, when the reaction contained genomic DNA homozygous for the A allele (Fig. 5a, center panel), a fluorescence signal was detected only with the OCP containing a matching T at the 3' end and the corresponding primer pair P1ocTET and P2ocT. Only background fluorescence was detected for the mismatched OCPs. As expected, when the target DNA was heterozygous, a fluorescence signal was observed with both OCP/primer combinations (Fig. 5a, lower panel). When the reaction end-point products were analyzed on an agarose gel, an ERCA product ladder was observed only for reactions where the OCP correctly matched the SNP base (Fig. 5b). ERCA reactions produce a characteristic ladder of discrete bands because the product consists of double-stranded, tandemly linked multiples of the circle sequence [6,11]. As expected, the ladder band spacing was consistent with the length of the OCP.
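The scoring logic described above for the two-tube assay can be summarized in a few lines. This is a minimal sketch under stated assumptions: the signals are taken to be background-subtracted end-point fluorescence values, and the single `threshold` parameter and the function name are hypothetical; the study does not specify a calling rule.

```python
def call_genotype(fam_signal, tet_signal, threshold):
    """Call the G1822A genotype from the two-tube assay.

    fam_signal: fluorescence from the tube with the 3'-C OCP and FAM-labeled P1
                (reports the G allele).
    tet_signal: fluorescence from the tube with the 3'-T OCP and TET-labeled P1
                (reports the A allele).
    threshold:  assumed cutoff separating signal from background.
    """
    g_present = fam_signal >= threshold
    a_present = tet_signal >= threshold
    if g_present and a_present:
        return "G/A"      # heterozygous: both OCP/primer sets amplify
    if g_present:
        return "G/G"
    if a_present:
        return "A/A"
    return "no call"      # neither tube rose above background


print(call_genotype(fam_signal=900.0, tet_signal=40.0, threshold=200.0))
print(call_genotype(fam_signal=850.0, tet_signal=780.0, threshold=200.0))
```

In practice the cutoff would need to be set per SNP from control samples, since background depends on OCP concentration.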

Validation of the assay for ten SNPs for 192 individuals
The accuracy of the SNP genotyping assay was demonstrated on two sets of human genomic DNA samples, each containing DNA from 96 different individuals. Ten different SNPs representing five of the six possible types of single nucleotide substitutions were assayed (Table 2). Results were compared to the known genotypes determined by restriction fragment length polymorphism or DNA sequencing of both strands by the single base extension method. The SNP assay gave correct genotypes an average of 93% of the time (Table 2). When mis-scored samples were repeated in triplicate, the genotyping accuracy was above 99% (data not shown). A scatter plot of the data for SNP G1822A shows the clear distinction between each of the three possible genotypes (Fig. 5c). The protocol uses a microtiter plate format, providing a high-throughput method of genotyping.

Single-tube assay for SNP genotyping using a low copy number of targets
The above experiments involved a two-tube assay with 100 ng of genomic DNA and one OCP/P2 combination per tube. Therefore, a total of 200 ng genomic DNA was used per DNA sample to be investigated. One of the objectives for high-throughput screening of SNPs is the ability to accurately detect a SNP from a small amount of genomic DNA. Previously, it was shown that 20 ng of genomic DNA, equivalent to ∼ 6000 copies of the gene, was sufficient to detect a C/T polymorphism using the serial invasive signal amplification reaction (SISAR) [18]. In order to investigate the sensitivity of ERCA for detecting SNPs, 1 ng of genomic DNA, equivalent to ∼ 300 copies of the gene, was used as target for circle formation with OCPs for the SNP G1822A.
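The copy numbers quoted for 100 ng, 20 ng and 1 ng of genomic DNA follow from the mass of one haploid human genome. The sketch below assumes about 3.3 pg per haploid genome, a commonly used approximation that is not stated in the text; the constant and function names are illustrative.

```python
# Assumed mass of one haploid human genome in picograms (approximation).
HAPLOID_GENOME_PG = 3.3


def haploid_copies(mass_ng):
    """Approximate number of haploid genome copies in mass_ng of human DNA."""
    return mass_ng * 1000.0 / HAPLOID_GENOME_PG


for mass_ng in (100, 20, 1):
    print(f"{mass_ng:>3} ng  ->  about {haploid_copies(mass_ng):,.0f} copies")
```

With this assumption the three inputs come out close to 30,000, 6000 and 300 copies, in line with the figures given in the text.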
The method was also simplified so that instead of one tube each for the two allele-specific OCPs, a single tube for both OCPs was used. Both OCP systems could be combined in the same tube because the seven 5' bases of P2 are complementary to the unique backbone region of the OCP. The P2 for one allele could only effectively prime one OCP and the P2 for the other allele could only prime the other OCP at the ERCA reaction temperature. The P1 primer was the same for the two OCPs, and detection and specificity were achieved by two allele-specific Amplifluor P2s carrying either a FAM or a TET fluorophore. The annealing/ligation reaction therefore included both of the allele-specific OCPs (OCP-1822C and OCP-1822T, Table 1), and the ERCA reaction was performed in the presence of both of the allele-specific P2 Amplifluor primers (P2-1822C-TET and P2-1822T-FAM). This assay detected and differentiated all three possible genotypes for the G1822A SNP (Fig. 6). The background signal observed was slightly higher than that obtained with 100 ng genomic DNA in a two-tube system, with a signal/noise ratio of 5:1. However, correct SNP genotyping was observed with 48 different human genomic DNA samples (data not shown).

Discussion
We have demonstrated a solution-based, efficient, homogeneous and robust assay for genotyping SNPs directly from human genomic DNA utilizing ligation of open circle probes and rolling circle amplification. The design of the assay is fairly straightforward and the assay can be carried out in three hours (30 min for the denaturation/annealing/ligation reaction and 2.5 h for the amplification reaction) in a 96-well format. Specificity of the OCP to its target sequence is achieved by the complementarity of the two ends of the OCP to the target and the requirement of these ends to be adjacent for ligation. Single nucleotide discrimination is achieved at the ligation step by the use of the thermostable DNA ligase, Ampligase, which has a high affinity for a perfectly matched substrate at the 3' end of a DNA molecule [17]. We were able to enhance allele discrimination at the ligation step by designing the OCP such that the 5' complementary region is firmly hybridized to the target sequence whereas the 3' region is in equilibrium with its target at the ligation temperature [12]. This would result in increased specificity since the correctly matched OCP will have a greater chance to act as substrate for the ligase. It has been previously reported that 3'-T/G and 3'-G/T mismatches are not good substrates for single nucleotide discrimination [17]. However, we found that the G1822A SNP, which would result in a 3'-T/G mismatch, was efficient for allele discrimination. The ability of this assay to genotype any SNP regardless of the base pair involved is an important advantage over assays based on primer extension such as PCR.
Allele discrimination achieved at the ligation step results in small circular DNA molecules topologically linked to the target DNA strand. These DNA circles are amplified in the powerful homogeneous ERCA reaction capable of 10^12-fold signal amplification [6], similar to that achieved with PCR technology, the current gold standard in genetic analysis and quantitation. However, PCR involves exponential target amplification, thereby increasing the risk of amplicon cross-contamination. Even though this shortcoming can be overcome, it increases the cost and complexity of the assay, making it less attractive for high-throughput analysis. Since ERCA is a signal (circle) amplification method, it does not have the problems associated with PCR. In addition, ERCA is an isothermal reaction and the reaction endpoint can be used as the assay readout. Even though the present study was conducted on a real-time ABI 7700 Sequence Detector instrument, the strategy can be easily adapted for a simple fluorescence plate reader coupled to an adjustable heating block. These properties make it an ideal system for high-throughput analysis.
The assay was tested for 10 SNPs on two sets of 96 different DNA samples (Table 2). Results were compared to the known genotypes that we determined by RFLP or single nucleotide sequencing reactions. The assay had an average genotyping accuracy of 93% when samples were screened initially. When the mis-scored samples were repeated in triplicate, the genotyping accuracy rose to over 99%. The majority of the mis-scoring involved a homozygous sample being called heterozygous, i.e., an amplification signal was observed with both sets of OCP/primer combinations. A DNA sample homozygous for one allele was never genotyped as homozygous for the other allele. This implied that there is a low frequency of DNA synthesis artifacts resulting in a fluorescence signal.
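The comparison against known genotypes, including the distinction between a homozygote called heterozygous and a homozygote called as the opposite homozygote, can be tallied as follows. This is an illustrative sketch: the function names and category labels are hypothetical, and the four sample pairs are made-up placeholders, not data from the study.

```python
from collections import Counter


def classify_call(known, called):
    """Classify one assay call against the known genotype.

    Genotypes are two-letter strings such as "GG", "GA" or "AA";
    allele order is ignored.
    """
    known = "".join(sorted(known))
    called = "".join(sorted(called))
    if known == called:
        return "correct"
    known_is_hom = len(set(known)) == 1
    called_is_hom = len(set(called)) == 1
    if known_is_hom and not called_is_hom:
        return "hom_called_het"            # the common error type in the text
    if known_is_hom and called_is_hom:
        return "hom_called_opposite_hom"   # reported as never observed
    return "other"


def summarize(pairs):
    """Return (accuracy, counts per category) for (known, called) pairs."""
    counts = Counter(classify_call(known, called) for known, called in pairs)
    return counts["correct"] / len(pairs), counts


# Made-up example pairs for illustration only.
toy_pairs = [("GG", "GG"), ("GG", "GA"), ("AA", "AA"), ("GA", "AG")]
accuracy, counts = summarize(toy_pairs)
print(accuracy, dict(counts))
```

Separating the error categories in this way makes it easy to confirm that miscalls are confined to extra signal in the non-matching channel, the pattern attributed here to DNA synthesis artifacts.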
Indeed when these reactions were analyzed on an agarose gel, the size of the characteristic ERCA DNA ladder was different from that obtained with amplification of the OCP (data not shown). We have used abasic residues at the 5' ends of P2 primers in order to reduce these artifacts. Other nucleotide modifications that have been reported to reduce primer-dimer formation will be tested to improve the accuracy of SNP genotyping. In addition, reagents that have been shown to reduce primer-dimer formation in PCR will be tested in the ERCA reaction.
Background signal is sometimes related to the amount of OCP used in the assay. For each SNP, the optimal OCP concentration needs to be determined before screening. Excess OCP concentrations result in an increase in nonspecific fluorescence signal, therefore lowering the accuracy rate of genotyping. Potentially, un-ligated OCP can act as template or primer, giving rise to a low level of nonspecific DNA synthesis and subsequent fluorescent signal. We are currently developing a modified OCP design to overcome the need for concentration optimization. In this design, the 3' region of the OCP forms a stable hairpin-loop structure and opens up only to anneal to its target sequence (O. Alsmadi, unpublished). As with molecular beacons, this may also improve target specificity [19,20]. Any unused OCP is self-primed and extended to form an inert double-stranded molecule, thus eliminating OCP as a source for non-specific amplification. Initial experiments with this design have been encouraging.

Figure 6
Genotyping of SNP G1822A on 1 ng of genomic DNA in a single-tube reaction. Ligation reactions were performed with 1 ng of Alu I digested genomic DNA in single tubes containing both of the allele-specific OCPs. ERCA reactions included both of the allele-specific P2 Amplifluor primers. Real-time amplification signal was observed only when the OCP/P2 combination matched the SNP nucleotide on the genomic DNA. Therefore, FAM and TET P2s gave a signal with DNA homozygous for the A and G alleles respectively. DNA heterozygous for the SNP gave an amplification signal with both P2 Amplifluor primers.

Conclusions
We have described a solution-based SNP genotyping assay that is simple, sensitive and robust and can be easily formatted to high-throughput analysis of single nucleotide polymorphisms and mutation detection directly from human genomic DNA. The SNP genotyping assay uses a simple, homogeneous, fluorescence readout and can be carried out on inexpensive instruments already available in many academic and industrial laboratories.
Amplifluor is a trademark of Intergen Company.
Ampligase is a trademark of Epicentre Technologies Corp.

Materials and methods
Oligonucleotides
Table 1 shows the sequences of the oligonucleotides used in the study. Gel-purified and phosphorylated OCP oligonucleotides were obtained from Integrated DNA Technologies, Inc. (Coralville, IA). Abasic and Amplifluor primers were synthesized in-house and purified by HPLC. The two Amplifluors are each labeled with a differently colored fluorophore, fluorescein (FAM) or tetrachloro-6-carboxyfluorescein (TET), and contain an internal nonfluorescent quencher, DABCYL.

Instrumentation
DNA denaturation, annealing and ligation reactions were carried out in an Eppendorf Master Cycler (Eppendorf Scientific, Germany). ERCA reactions were performed in the Real-Time ABI 7700 Sequence Detector (Perkin Elmer).

Genomic DNA
Genomic DNA samples were obtained from (a) National Institute of Mental Health (NIMH) Center for Genetic Studies, Rutgers University Cell and DNA Repository, Piscataway, NJ and (b) Coriell Cell Repository (HD 100 CAU). The DNA samples were digested with the restriction endonuclease Alu I before being used as template in the ligation reaction.

DNA annealing and ligation
The reactions were set up in 96-well MicroAmp Optical plates (Perkin Elmer) in a 10 µl reaction volume containing 1 U Ampligase (Epicentre Technologies), 20 mM Tris-HCl (pH 8.3), 25 mM KCl, 10 mM MgCl2, 0.5 mM NAD, and 0.01% Triton® X-100. Standard reactions contained 0.5 pM OCP and 100 ng of Alu I digested genomic DNA. DNA was denatured by heating the reactions at 95°C for 3 min followed by annealing and ligation at 60°C for 20 min.
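For orientation, the number of OCP molecules in a standard reaction can be worked out from the stated concentration and volume and compared with the approximate target copy number given earlier for 100 ng of genomic DNA. The sketch below is a back-of-envelope calculation; the function name is illustrative and the 30,000-copy figure is the approximation quoted in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules(conc_molar, volume_l):
    """Number of molecules in a solution of given molar concentration and volume."""
    return conc_molar * volume_l * AVOGADRO


ocp_molecules = molecules(conc_molar=0.5e-12, volume_l=10e-6)  # 0.5 pM in 10 µl
target_copies = 30_000  # approximate copies in 100 ng genomic DNA, from the text

print(f"OCP molecules per reaction: {ocp_molecules:.2e}")
print(f"OCP excess over target: about {ocp_molecules / target_copies:.0f}-fold")
```

The result is roughly three million OCP molecules per reaction, about a hundredfold excess over target, which is consistent with the remark in the Discussion that un-ligated OCP is present and can contribute to background if its concentration is too high.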