Strong position-dependent effects of sequence mismatches on signal ratios measured using long oligonucleotide microarrays
- Catriona Rennie†1, 2,
- Harry A Noyes†1 (corresponding author),
- Stephen J Kemp1,
- Helen Hulme3,
- Andy Brass2, 3 and
- David C Hoyle4
© Rennie et al; licensee BioMed Central Ltd. 2008
Received: 29 March 2008
Accepted: 03 July 2008
Published: 03 July 2008
Microarrays are an important and widely used tool. Applications include capturing genomic DNA for high-throughput sequencing in addition to the traditional monitoring of gene expression and identifying DNA copy number variations. Sequence mismatches between probe and target strands are known to affect the stability of the probe-target duplex, and hence the strength of the observed signals from microarrays.
We describe a large-scale investigation of microarray hybridisations to murine probes with known sequence mismatches, demonstrating that the effect of mismatches is strongly position-dependent and for small numbers of sequence mismatches is correlated with the maximum length of perfectly matched probe-target duplex. Length of perfect match explained 43% of the variance in log2 signal ratios between probes with one and two mismatches. The correlation with maximum length of perfect match does not conform to expectations based on considering the effect of mismatches purely in terms of reducing the binding energy. However, it can be explained qualitatively by considering the entropic contribution to duplex stability from configurations of differing perfect match length.
The results of this study have implications in terms of array design and analysis. They highlight the significant effect that short sequence mismatches can have upon microarray hybridisation intensities even for long oligonucleotide probes.
All microarray data presented in this study are available from the GEO database under accession number [GEO: GSE9669].
Microarrays are widely used for monitoring gene expression levels, using cDNA as a target, and for monitoring DNA copy number variations using genomic DNA as a target (Comparative Genomic Hybridisation, CGH) [2–6]. Sequence mismatches will affect the efficiency of hybridisation to probes spotted on microarrays and therefore the accuracy with which microarray-based assays report gene expression levels or genomic copy numbers. Understanding and quantifying the errors introduced by sequence mismatches is important, not only for handling microarray data, but also for data from any high-throughput assay that makes use of duplex formation, for example sequence capture for regional sequencing [7–10]. The effects of sequence mismatches on hybridisations using short oligonucleotides (32 mers or shorter, such as the 25 mers used in Affymetrix GeneChips) have been reported previously [2, 11–13], but we are not aware of any comprehensive experimental studies on the effects of sequence mismatches on duplex formation with longer oligonucleotides (50 mers to 100 mers). This was the objective of the current study. We integrated mouse CGH microarray data with the 8 million SNP that have been published for 15 inbred strains in order to identify the effect of mismatches at each position in the probe on log2 signal ratio. This made it possible to characterise sequence mismatches that affect microarray hybridisation on a genome-wide scale.
Overview of a microarray hybridisation
Nucleic acid probes are tethered to a solid support, such as a glass slide, with multiple copies of each probe sequence attached within the same spot on the slide. Nucleic acid strands are extracted from the sample, fragmented and labelled with a fluorescent dye. The labelled strands are called targets. The targets are incubated with the array for 16–48 hours to allow hybridisation to occur [2–6]. The targets are excited with a laser and the resulting fluorescent signals from each of the spots are measured. Where more target strands have hybridised to the probes for a particular sequence, there will be a stronger fluorescent signal from the relevant spot. Hence, the signal intensity from the spot can be used to estimate the amount of the sequence in the sample [2–6].
Often, as in this study, competitive hybridisations are used. In this case, two targets, labelled with different dyes, are hybridised to the same array. The ratio of fluorescent signal intensities from the two dyes is measured, and used to estimate the ratios of the amounts of the target in each sample with the assumption that the targets have equal binding affinities to the probe [3, 4].
However, microarrays are commonly used in situations where there may be mismatches between the probe and target sequences, such as variation between individuals, strain differences or interspecies differences. These sequence differences can reduce the hybridisation efficiency between probe and target strands, thus reducing the measured fluorescent signal intensity.
Short oligonucleotide probes, cDNA probes and long oligonucleotide probes
Short oligonucleotide probes, such as Affymetrix 25 mer probes, are known to be very sensitive to mismatches [2, 11–13]. This is partly due to the probe length and partly due to the analysis methods used [2, 11–13]. Indeed, the observed sensitivity to mismatch position of Affymetrix 25 mer probes has been exploited for SNP detection in applications such as SNPscanner. Long cDNA probes, often hundreds of bases long, are less sensitive to mismatches. The usual explanation for this is that a single base mismatch is unlikely to have a substantial effect on the probe-target duplex melting temperature.
It might seem reasonable to assume that the effect of mismatches on long oligonucleotides, intermediate in length, would be intermediate between these two extremes. However, relatively few studies have examined hybridisation of mismatched targets to long oligonucleotide microarray probes. Kane and co-workers examined cross-hybridisation of non-target DNA to 50 mer oligonucleotide expression arrays and found detectable hybridisation signals from non-target transcripts with similar sequence to the true targets or with a continuous stretch of sequence complementary to the probe. However, the precise effect of individual mismatches on signal intensities from long oligonucleotides was not investigated. Letowski and co-workers investigated the influence of various factors on the performance of microarray probes of varying type and length, including 50 mer oligonucleotide probes that incorporated known mismatches. They reported that mismatches affected probe specificity, with mismatches at the ends of the probe having the least effect. Mismatches distributed along the length of the probe sequence caused more destabilisation of probe-target duplexes than mismatches clustered together.
In a review of genomic microarrays, Mantripragada and co-workers predicted that arrays using long oligonucleotides between 50 mers and 100 mers would largely replace BAC- and PCR-based microarrays for CGH. Hughes and co-workers compared the performance of a range of inkjet-printed oligonucleotides and reported that 60 mers represented a practical compromise between maximum sensitivity and specificity. Given the growing popularity of long oligonucleotide probes for both gene expression arrays and CGH arrays, it is increasingly important to understand the effects of mismatches in reducing the signal intensity from these probes.
Studying the effect of sequence mismatches using mouse CGH data
Mouse is an ideal species for investigating these effects, due to the availability of inbred strains and public datasets describing genomic sequence variation between these strains. Long oligonucleotide CGH arrays provide a useful platform for examining the effect of mismatches since target abundances are largely fixed, target sequences will not be modified by alternative splicing and 60 mer oligonucleotides have been demonstrated to provide a good compromise between sensitivity and specificity [20, 21].
We carried out competitive two sample hybridisations with dye-flip replicates for each of three inbred mouse strains (129P3/J, A/J and BALB/cJ) against a C57BL/6J reference on the Agilent whole mouse genome 244K CGH array and a custom 56K Agilent mouse CGH array, both using 60 mer oligonucleotide probes. We then compared NCBI mouse genome build 36 position information for the 8 million SNP in the public Perlegen dataset with the Agilent probe sequences to identify SNP that would cause a mismatch between a probe and one or more of the test strain targets. Since the probes were designed against the C57BL/6 genome sequence, any SNP would give rise to a mismatch between the probe and the test targets but not between the probe and the C57BL/6 control target. If mismatches have an effect, the control target should therefore give the higher signal.
Our initial observations indicate a strong effect of mismatches on log2 signal ratio, dependent on the number of mismatches and on their position relative to the probe sequence. More specifically, we identified a strong correlation between log2 signal ratio and the maximum continuous length of complementary duplex when comparing probes overlapping 1 and 2 SNP.
Throughout, positive log2 signal ratios imply less efficient hybridisation for the test sample than for the C57BL/6 reference, and conversely negative log2 signal ratios imply more efficient hybridisation for the test sample. If the samples had equivalent levels of hybridisation to a probe, the log2 signal ratio would be close to 0.
Number and percentage of probes overlapping 1, 2 or 3 SNP loci for each test strain (table). Columns: 1 SNP, 2 SNP and 3 SNP, each given as a count and as a percentage of probes in the set. Rows, listed separately for the 244 k whole genome probe set and the 56 k custom probe set: A/J v C57BL/6J, BALB/cJ v C57BL/6J, 129P3/J v C57BL/6J and Any v C57BL/6J.
Mismatches are associated with reduced signal intensity from long oligonucleotide probes
There was a significant association between log2 signal ratio and number of known mismatches (ANOVA p < 0.001) and a significant correlation between number of known mismatches and log2 signal ratio (r² = 0.94, p < 0.05).
The increase in mean log2 signal ratio with increasing number of sequence mismatches provides useful confirmation that sequence mismatches have an observable effect on signal intensities from long oligonucleotides as well as those from short oligonucleotides. It is clear that a higher number of mismatches is associated with a stronger effect on signal intensity. This observation might be expected, and could be consistent with a model of hybrid formation where the effect of mismatches simply results in loss of enthalpy from the hydrogen bonds that would have been formed during base-pairing.
Mismatches near the middle of probes are associated with a greater reduction in signal intensity than those near the end of probes
The significant dependence of log2 signal ratio upon position of the single base mismatch was unexpected for long oligonucleotide probes. A correlation between log2 signal ratio and mismatch position would not be expected if the only effect of a mismatch was on enthalpy. The loss of enthalpy, from breaking of the 2 or 3 hydrogen bonds of a complementary base pair, is the same for all mismatch positions. However, the range and diversity of configurations that the probe and target strands can adopt also forms a contribution to the thermodynamic stability of the probe-target hybrid. Therefore the number of potential configurations, i.e. the entropy, must also be considered when attempting to understand the effect of sequence mismatches on log2 signal ratio.
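The argument above can be written out with the standard relation between free energy, enthalpy and entropy. This is an illustrative sketch rather than a derivation taken from the study: it assumes hybridisation is at equilibrium and that signal is proportional to the binding constant K, and the subscripts PM (perfect match) and MM (mismatched) are labels introduced here.

```latex
\Delta G = \Delta H - T\,\Delta S, \qquad K = \exp\!\left(-\frac{\Delta G}{RT}\right)

% Log ratio of binding constants for a perfectly matched (PM)
% and a mismatched (MM) probe-target duplex
\log_2 \frac{K_{\mathrm{PM}}}{K_{\mathrm{MM}}}
  = \frac{\Delta G_{\mathrm{MM}} - \Delta G_{\mathrm{PM}}}{RT \ln 2}
  = \frac{\Delta\Delta H - T\,\Delta\Delta S}{RT \ln 2},
\qquad
\Delta\Delta H = \Delta H_{\mathrm{MM}} - \Delta H_{\mathrm{PM}}, \quad
\Delta\Delta S = \Delta S_{\mathrm{MM}} - \Delta S_{\mathrm{PM}}
```

Under these assumptions, if the enthalpy term is the same for every mismatch position, any dependence of the log2 ratio on position has to enter through the entropy term.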
Log2 signal ratios are strongly correlated with the maximum length of perfect match
These observations provide further evidence that the effect of mismatches is more complex than a simple loss of enthalpy for each mismatched base. It appears that a large factor in the effect of mismatches on 60 mer probe-target hybridisation is a reduction in the maximum length of perfectly matched duplex.
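The maximum length of perfect match can be computed directly from the mismatch positions within a probe. The following is a minimal sketch; the function name and the use of 0-based positions are assumptions for illustration, not the scripts used in the study.

```python
def max_perfect_match(probe_length, mismatch_positions):
    """Return the longest run of consecutive matched bases in a probe.

    mismatch_positions holds 0-based offsets of mismatched bases
    (an illustrative convention, not taken from the paper).
    """
    longest = 0
    run_start = 0  # first base of the current matched run
    for pos in sorted(set(mismatch_positions)):
        if not 0 <= pos < probe_length:
            raise ValueError("mismatch position outside probe: %d" % pos)
        longest = max(longest, pos - run_start)
        run_start = pos + 1
    # matched run between the last mismatch and the end of the probe
    return max(longest, probe_length - run_start)
```

For a 60 mer, a single central mismatch at offset 29 leaves a longest perfect match of 30 bases, whereas two mismatches at offsets 0 and 59 leave 58 bases. This illustrates why the number of mismatches alone does not determine the length of perfect match.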
Additional factors affecting the log2 signal ratio
GC content produces another effect related to the identity of bases in the sequence. Probes with a high GC content are known to have a higher melting temperature due to the presence of more hydrogen bonds. GC content may also affect probe specificity and the temperature sensitivity of probe-target hybrids. However, probes are designed to have a similar melting temperature. In addition, since the observations of the position-dependent effect of mismatches were averaged over all probes, they should not be affected by GC content. To confirm that probe GC content did not have a significant effect on log2 signal ratio, we checked the correlation between proportion of GC bases and log2 signal ratio for all three test strains. As anticipated, the correlation coefficients were all close to zero (-4.84 × 10⁻⁴ for A/J, -2.88 × 10⁻³ for BALB/cJ and 3.2 × 10⁻² for 129P3/J).
Remaining possible sources of variations in the log2 signal ratios include random noise in the measured signal intensities, small deletions in the test strains and previously unidentified SNP (the Perlegen dataset is estimated to contain about 43% of SNP present in the strains genotyped). Any of these causes might explain the probes in the dataset that produced a high positive log2 signal ratio but that did not overlap with any SNP locus in the Perlegen dataset. In order to determine whether unpublished SNP might contribute to non-zero log2 signal ratios, the data were scanned to discover whether there was any significant excess of probes with log2 signal ratios > 1 in regions of the genome known to contain SNP in the relevant test strains. The mouse genome was divided into 50 kb blocks, then we obtained the number of SNP in each block for each of the test strains. There was a significant excess of probes with log2 signal ratios > 1 in the 50 kb intervals that had at least one SNP (χ², 1 d.f.; p < 10⁻¹⁸). There was also an excess of probes that had a log2 signal ratio > 1 and a SNP within 500 bp when compared with the same number of probes chosen at random. A/J and BALB/cJ probes with no mismatches in the Perlegen dataset but with a log2 signal ratio > 1 had relative risks of 1.7 and 2.3, respectively, of having a SNP within 500 bp compared with probes chosen at random (χ², 1 d.f.; p < 10⁻⁸). For 129P3/J the relative risk was 1.2 (χ², 1 d.f.; p = 0.023). The presence of SNP that are informative between strains suggests that each strain carries a different form of the whole haplotype block that covers the region. The different forms of the haplotype may contain multiple SNP or small genomic indels that are not recorded in the Perlegen dataset and that might contribute to altered log2 signal ratios. This raises the possibility that probes with high log2 signal ratios might be used to identify candidate regions for re-sequencing to identify SNP and small CNV.
We also identified some probes with low negative log2 signal ratios, although only around 1/5 as many as those with high positive ratios. Possible explanations for negative log2 signal ratios include the presence of duplications in the test strains, deletions in the C57BL/6 reference DNA, SNP in the test strains that create additional probe binding sites and random noise in the measured signal intensities.
We have demonstrated that sequence mismatches are associated with higher log2 signal ratios from long oligonucleotide microarray probes. This effect is position dependent, with mismatches near the centre of a probe having a stronger effect on log2 signal ratio than mismatches near one end of a probe. There is a strong correlation between log2 signal ratios from probe-target pairs containing 1 mismatch and log2 signal ratios from pairs containing 2 mismatches when the pairs contain the same maximum length of perfectly matched duplex (r² = 0.43).
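A check of this kind can be sketched as a Pearson correlation between the GC fraction of each probe and its log2 signal ratio. The function names are illustrative assumptions, and the sketch does not reproduce the coefficients reported above.

```python
from math import sqrt


def gc_fraction(sequence):
    """Proportion of G and C bases in a probe sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

With a list of probe sequences and a matching list of log2 signal ratios, `pearson_r([gc_fraction(s) for s in sequences], ratios)` gives the coefficient for one test strain.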
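The two statistics used in this comparison can be sketched for a 2 × 2 table. This is a hedged illustration: the cell layout and function names are assumptions, and the text does not state whether a continuity correction was applied, so none is used here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction).

    Assumed table layout (hypothetical, for illustration):
                        SNP nearby   no SNP nearby
      high-ratio probes      a             b
      random probes          c             d
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def relative_risk(a, b, c, d):
    """Risk of a nearby SNP in the first row relative to the second row."""
    return (a / (a + b)) / (c / (c + d))
```

The statistic is then compared with the chi-square distribution with one degree of freedom to obtain a p-value.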
Whilst there is extensive evidence for an effect of mismatches on results from microarray hybridisations, much of this applies to results from short oligonucleotide arrays. Some studies have discussed an effect of mismatch position when using short oligonucleotide probes. Terminal mismatches in very short duplexes have long been known to have less effect than internal mismatches. Mismatches near the centre of the probe have a stronger destabilising effect than mismatches close to either end, both for hybridisations in solution and for microarray hybridisations [13, 23, 28, 30]. However, while this difference in destabilisation has been observed frequently, and used in applications such as SNP detection [15, 28], the difference has not been examined in detail or explained in terms of thermodynamic properties.
Comparatively fewer studies have reported mismatch effects on results obtained using long oligonucleotide probes [17, 18, 20]. Hughes and co-workers described the importance of a base in terms of microarray hybridisation efficiency as roughly proportional to the distance of the base from the array surface, possibly due to steric effects. Letowski and co-workers identified a smaller destabilising effect for mismatches clustered at either end of a probe than for mismatches clustered near the probe centre, and likewise a smaller effect for mismatches clustered in any position than for mismatches spread out along the probe sequence. However, they did not attempt to explain this finding; the dependence of duplex stability on the maximum length of perfect match in the probe-target hybrid might provide such an explanation.
The role of maximum perfect match length
We found that, at least for small numbers of sequence mismatches, the mismatch positions themselves are less important than the maximum length of perfect match that results from the mismatches. For one mismatch the length of perfect match also appears to exert a greater influence on log2 signal ratios than the type of polymorphism, accounting for nearly five times as much of the variation in log2 signal ratio.
There is some support for the suggestion that maximum length of perfect match has a role in determining hybridisation efficiency. Kane and co-workers examined cross-hybridisation of non-target DNA to 50 mer oligonucleotide expression arrays. Detectable signals were found from non-target transcripts that contained a continuous region of 15 bases or longer perfectly matched to the probe sequence and longer continuous complementary regions were found to produce a stronger signal. Sasaki and co-workers identified a similar effect on hybridisation of full-length cDNA targets to tiling arrays of Affymetrix 25 mer genomic probes. However, none of these studies investigated the effects of individual single base mismatches, and although the effect has been observed, there has not been a systematic investigation or an explanation of why this effect occurs.
Theoretical model development
We have begun development of a theoretical model of microarray hybrid formation, based upon the Poland-Scheraga model, that explicitly takes into account partial duplex configurations (as outlined above). The current version obtains good agreement with the qualitative aspects of the experimental results presented here. Generally, this highlights the need to build upon existing models of hybrid formation and take into account the specific conditions unique to microarrays. Several research groups have found hybridisation behaviour on microarrays to differ from that in solution, with attachment to a surface having a marked effect [13, 24, 29, 30]. For short oligonucleotide microarrays hybridised at relatively low temperatures, however, there are strong correlations between microarray intensities and the free energies for the same probe-target duplexes in solution [26, 30, 35], and between the cost of mismatches for microarray probes and the cost calculated in solution. It is worth noting that entropic contributions to free energy changes on arrays differ from those in solution, due in part to the additional complexities involved in hybridisation of targets to microarray probes, such as the probes being attached at one end to a surface and probes being closely spaced on the array [36–39].
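The idea of counting partial duplex configurations can be illustrated with a deliberately simple toy calculation. This is not the Poland-Scheraga model and not the model under development: it only sums a Boltzmann-like weight over every contiguous, mismatch-free sub-duplex. The per-base-pair `weight`, the `min_length` cut-off and both function names are assumptions introduced for illustration.

```python
def perfect_stretches(probe_length, mismatch_positions):
    """Lengths of the mismatch-free stretches of a probe (0-based positions)."""
    stretches = []
    start = 0
    for pos in sorted(set(mismatch_positions)):
        if pos > start:
            stretches.append(pos - start)
        start = pos + 1
    if probe_length > start:
        stretches.append(probe_length - start)
    return stretches


def toy_partition_function(probe_length, mismatch_positions,
                           weight=1.2, min_length=10):
    """Sum weight**n over all contiguous perfect sub-duplexes of n base pairs.

    A stretch of length L contains (L - n + 1) sub-duplexes of length n.
    Only sub-duplexes with n >= min_length are counted (toy assumption).
    """
    total = 0.0
    for stretch in perfect_stretches(probe_length, mismatch_positions):
        for n in range(min_length, stretch + 1):
            total += (stretch - n + 1) * weight ** n
    return total
```

In this toy, a central mismatch removes far more weighted configurations than a terminal one, and the total is dominated by the longest perfect stretch, which mirrors the qualitative pattern described in the results.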
Even without development of new models, these results have implications in terms of microarray design and interpretation of microarray results. Most current approaches to microarray design are based on data from hybridisations in solution [23, 30], which may not accurately reflect the hybridisation conditions for microarrays. As the potential applications for microarrays extend, there is an increasing need to understand the effects of sequence mismatches. Several studies have demonstrated that oligonucleotide arrays can be used for genomic DNA capture for high-throughput sequencing of specific genomic regions [7–10]. For example, it is possible that this approach could be used in attempts to re-sequence all human coding regions in hundreds or even thousands of individuals, providing a resource for investigating links between disease susceptibility and genetic variations [7, 10]. The evidence presented in this study suggests that probes for DNA capture should be designed to avoid SNP loci. If that is not possible, then positioning SNP to maximize the length of continuous perfect match to targets is likely to reduce the risk of selectively capturing only some of the intended target strands.
Our results also raise the possibility of using microarray CGH results to identify putative small CNVs and SNP for confirmation by high-throughput sequencing or other methods. CNVs are an important type of genetic variation. Approximately 4% of the human genome has undergone recent duplications [40–42]. CNV have also been identified between different mouse strains, and even between different colonies of the same inbred mouse strain [27, 43]. Studies of murine CNV have alluded to thousands of single-probe aberrations, which were attributed to the presence of SNP [27, 44]. Microarray CGH analysis software usually requires 3 contiguous probes passing a log2 signal ratio threshold in order to call an aberration. However, results from human ultra-high-density tiling arrays find many small CNV < 1000 bp. Egan and co-workers investigated 65 single probe aberrations in a comparison of C57BL/6 mice from different colonies. Of these, 20 were successfully confirmed as small CNV and a further 11 were found to contain SNP, but all of these would be missed by a heuristic that required 3 contiguous probes to have a non-zero signal ratio. In this study, we showed a large excess of probes with log2 signal ratio greater than 1 or less than -1 in more than one strain over what would be expected by chance, suggesting some potential for these single probe aberrations to indicate putative SNP or small CNV.
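The contrast between a three-probe calling heuristic and single-probe aberrations can be sketched as follows. The function name, the threshold of 1 and the restriction to positive ratios are illustrative assumptions; this is not the algorithm of any particular CGH analysis package.

```python
def split_runs(log_ratios, threshold=1.0, min_run=3):
    """Split above-threshold probes into called and missed runs.

    log_ratios must be ordered by genomic position. Returns two lists of
    (first_index, last_index) pairs: runs of at least min_run consecutive
    probes (called) and shorter runs (missed by the heuristic).
    """
    called, missed = [], []
    start = None
    # A sentinel below any threshold closes a run that reaches the last probe.
    for i, value in enumerate(list(log_ratios) + [float("-inf")]):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            run = (start, i - 1)
            if i - start >= min_run:
                called.append(run)
            else:
                missed.append(run)
            start = None
    return called, missed
```

Runs in the second list are the single-probe or two-probe aberrations that a three-probe rule would discard, and are the candidates for putative SNP or small CNV discussed above.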
Sequence mismatches have an observable effect in reducing the signal intensity reported by long oligonucleotide probes on CGH microarrays. This effect depends on the position of the mismatch relative to the probe, being stronger for mismatches near the centre of the probe than for those at the ends. We also found that the length of perfect match can have a stronger effect on log2 signal ratios than the type of polymorphism. These observations have implications in terms of array design and analysis, relevant to the use of microarrays in genomic DNA capture for high-throughput sequencing.
Microarray CGH data
We obtained genomic DNA from Jackson Laboratories (Bar Harbor, Maine, USA) for mouse strains C57BL/6J (Jackson stock number 000664), BALB/cJ (000651), 129P3/J (000690) and A/J (000646).
We carried out two-sample hybridisations, using C57BL/6J as a reference, using the Agilent Mouse Genome CGH Microarray 244A platform and a custom 56K Agilent mouse microarray platform. Both platforms use inkjet-printed 60 mer oligonucleotide probes.
We performed one hybridisation plus one dye-flip replicate for each of the three test strains (129P3/J, A/J and BALB/cJ) on each of the two array platforms. We hybridised 12 μg gDNA in 520 μL of 750 μM NaCl at 65°C for 48 hours, followed by two 5 minute washes at 37°C, according to manufacturer's instructions.
We used the Agilent feature extraction software to carry out a linear dye adjustment using a calibration sample of probes, equivalent to a centering normalization protocol, according to the standard procedures described in the Agilent feature extraction software v9.5 reference guide. The inclusion of dye-flips within our experimental design effectively implements a paired slide normalisation to produce centralised log2 signal ratios of test strain to C57BL/6J and eliminates intensity-dependent bias within the log2 signal ratios. We then used Z-scoring to identify aberrant regions, following the standard Agilent procedures described in the CGH Analytics 3.4.40 user guide.
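A minimal sketch of how a dye-flip pair can be combined into a single log2 signal ratio. The function and argument names are assumptions, and the reference-over-test orientation is chosen here so that a positive value means weaker hybridisation of the test sample; the Agilent feature extraction steps themselves are not reproduced.

```python
from math import log2


def dye_flip_log2_ratio(ref_1, test_1, ref_2, test_2):
    """Mean log2(reference / test) over an array and its dye-flip replicate.

    ref_1, test_1: reference and test intensities on the first array.
    ref_2, test_2: the same samples on the array with the dyes swapped.
    A multiplicative bias tied to one dye appears in the numerator on one
    array and in the denominator on the other, so it cancels in the mean.
    """
    return 0.5 * (log2(ref_1 / test_1) + log2(ref_2 / test_2))
```

For example, with a true two-fold difference and a 1.5-fold bias in one dye channel, intensities of (3000, 1000) on the first array and (2000, 1500) on the flipped array give a combined log2 ratio of 1.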
Combining probe sequence and SNP data to identify mismatches in probes
We retrieved SNP data from the Perlegen dataset (genotypes for 8 million polymorphic loci from resequencing 15 inbred mouse strains, commissioned by the National Institute of Environmental Health Sciences).
We obtained probe sequences and positions on the NCBI34 mouse assembly from Agilent and mapped probes onto a local copy of the NCBI36 mouse assembly using BLASTn. Probe information is available from GEO along with other array data under accession [GEO:GSE9669]. We discarded probe sequences that lacked a high-scoring match (e-value < e-17) of the same length as the probe. We conducted BLASTn searches against the whole genome build for a sample of 85805 probes. Only 1 probe had perfect BLASTn matches on more than one chromosome. For the remaining probes, we only performed BLASTn searches against the chromosome listed in the Agilent annotation. 699 out of over 235000 probes (0.28%) had perfect matches with more than one region of the same chromosome.
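The filtering step can be sketched for BLAST tabular output, assuming the standard 12 tab-separated columns (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). Reading the threshold as 1e-17 is an assumption, as are the function name and the returned structure.

```python
def full_length_hits(blast_lines, probe_length=60, max_evalue=1e-17):
    """Keep hits spanning the whole probe with an e-value below max_evalue.

    Returns {probe_id: [(subject, start, end, is_perfect), ...]}.
    """
    hits = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        align_len = int(fields[3])
        mismatches, gap_opens = int(fields[4]), int(fields[5])
        s_start, s_end = int(fields[8]), int(fields[9])
        evalue = float(fields[10])
        if align_len == probe_length and evalue < max_evalue:
            is_perfect = mismatches == 0 and gap_opens == 0
            hits.setdefault(query, []).append(
                (subject, s_start, s_end, is_perfect))
    return hits


def multi_mapped(hits):
    """Probe identifiers with more than one perfect full-length match."""
    return sorted(q for q, hs in hits.items()
                  if sum(1 for h in hs if h[3]) > 1)
```

Probes absent from the returned dictionary would be discarded, and `multi_mapped` lists probes with perfect matches at more than one location.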
We extracted positions of SNP within the NCBI36 mouse assembly from the Perlegen annotation and used them to identify the positions of mismatches within each probe. A table of probes that contained SNP together with SNP position, substitution type, length of perfect match and log2 signal ratio is included in supplementary data for the mouse whole genome array [see Additional file 1] and supplementary data for the custom array [see Additional file 2].
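Locating SNP within a probe amounts to an interval lookup on a sorted list of SNP coordinates. The sketch below assumes a forward-strand probe, inclusive genomic coordinates on a single chromosome and a list already restricted to SNP that differ between the test strain and C57BL/6J; the function name is hypothetical and this is not the Perl code used in the study.

```python
from bisect import bisect_left, bisect_right


def mismatch_offsets(probe_start, probe_length, snp_positions):
    """Return 0-based offsets within the probe of SNP that overlap it.

    probe_start is the genomic coordinate of the first probe base and
    snp_positions is a sorted list of SNP coordinates on that chromosome.
    """
    probe_end = probe_start + probe_length - 1
    first = bisect_left(snp_positions, probe_start)
    last = bisect_right(snp_positions, probe_end)
    return [pos - probe_start for pos in snp_positions[first:last]]
```

The length of the returned list gives the number of known mismatches for the probe, and the offsets give their positions along it.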
Data handling and analysis
We developed a MySQL database to store the positions of probes and of mismatches between C57BL/6J and each test strain (129P3/J, A/J and BALB/cJ) to facilitate analysis. We wrote Perl scripts to make comparisons and calculations using this data, such as counts of the number of probes over various thresholds and statistical tests. The Perl scripts and the database tables are available from the authors upon request.
CGH: Comparative Genomic Hybridisation
CNV: Copy Number Variation
SNP: Single Nucleotide Polymorphism.
We thank Leanne Wardlesworth of the University of Manchester Core Services unit and Tara Hill of Agilent for excellent technical assistance. HAN, CR, SJK, HH and AB were funded by the Wellcome Trust (GR066764MA to SJK). CR was funded by a BBSRC PhD studentship. The Wellcome Trust and BBSRC were not involved in the design or execution of this study.
- Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucl Acids Res. 2007, 35: D760-765. 10.1093/nar/gkl887.PubMedView ArticleGoogle Scholar
- Lipshutz RJ, Fodor SPA, Gingeras TR, Lockhart DJ: High density synthetic oligonucleotide arrays. Nat Genet. 1999, 21: 20-24. 10.1038/4447.PubMedView ArticleGoogle Scholar
- Duggan DJ, Bittner M, Chen Y, Meltzer P, Trent JM: Expression profiling using cDNA microarrays. Nat Genet. 1999, 21: 10-14. 10.1038/4434.PubMedView ArticleGoogle Scholar
- Bowtell DDL: Options available - from start to finish - for obtaining expression data by microarray. Nat Genet. 1999, 21: 25-32. 10.1038/4455.PubMedView ArticleGoogle Scholar
- Southern E, Mir K, Shchepinov M: Molecular interactions on microarrays. Nat Genet. 1999, 21: 5-9. 10.1038/4429.PubMedView ArticleGoogle Scholar
- Hacia JG: Resequencing and mutational analysis using oligonucleotide microarrays. Nat Genet. 1999, 21: 42-47. 10.1038/4469.PubMedView ArticleGoogle Scholar
- Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ, Weinstock GM, Gibbs RA: Direct selection of human genomic loci by microarray hybridization. Nat Meth. 2007, 4: 903-905. 10.1038/nmeth1111.View ArticleGoogle Scholar
- Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ, Hannon GJ, McCombie WR: Genome-wide in situ exon capture for selective resequencing. Nat Genet. 2007, 39: 1522-1527. 10.1038/ng.2007.42.PubMedView ArticleGoogle Scholar
- Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME: Microarray-based genomic selection for high-throughput resequencing. Nat Meth. 2007, 4: 907-909. 10.1038/nmeth1109.View ArticleGoogle Scholar
- Stratton M: Genome resequencing and genetic variation. Nat Biotech. 2008, 26: 65-66. 10.1038/nbt0108-65.View ArticleGoogle Scholar
- Naef F, Lim DA, Patil N, Magnasco M: DNA hybridization to mismatched templates: A chip study. Physical Review E. 2002, 65: 40902-10.1103/PhysRevE.65.040902.View ArticleGoogle Scholar
- Naef F, Magnasco MO: Solving the riddle of the bright mismatches: Labeling and effective binding in oligonucleotide arrays. Physical Review E. 2003, 68: 11906-10.1103/PhysRevE.68.011906.View ArticleGoogle Scholar
- Zhang L, Miles MF, Aldape KD: A model of molecular interactions on short oligonucleotide microarrays. Nat Biotech. 2003, 21: 818-821. 10.1038/nbt836.View ArticleGoogle Scholar
- Frazer KA, Eskin E, Kang HM, Bogue MA, Hinds DA, Beilharz EJ, Gupta RV, Montgomery J, Morenzoni MM, Nilsen GB, Pethiyagoda CL, Stuve LL, Johnson FM, Daly MJ, Wade CM, Cox DR: A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature. 2007, 448: 1050-1053. 10.1038/nature06067.PubMedView ArticleGoogle Scholar
- Gresham D, Ruderfer DM, Pratt SC, Schacherer J, Dunham MJ, Botstein D, Kruglyak L: Genome-Wide Detection of Polymorphisms at Nucleotide Resolution with a Single DNA Microarray. Science. 2006, 311: 1932-1936. 10.1126/science.1123726.PubMedView ArticleGoogle Scholar
- Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, Madore SJ: Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucl Acids Res. 2000, 28: 4552-4557. 10.1093/nar/28.22.4552.PubMedView ArticleGoogle Scholar
- Letowski J, Brousseau R, Masson L: Designing better probes: effect of probe size, mismatch position and number on hybridization in DNA oligonucleotide microarrays. Journal of Microbiological Methods. 2004, 57: 269-278. 10.1016/j.mimet.2004.02.002.PubMedView ArticleGoogle Scholar
- Mantripragada KK, Buckley PG, Diaz de Stahl T, Dumanski JP: Genomic microarrays in the spotlight. Trends in Genetics. 2004, 20: 87-94. 10.1016/j.tig.2003.12.008.PubMedView ArticleGoogle Scholar
- Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, Kobayashi S, Davis C, Dai H, He YD, Stephaniants SB, Cavet G, Walker WL, West A, Coffey E, Shoemaker DD, Stoughton R, Blanchard AP, Friend SH, Linsley PS: Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotech. 2001, 19: 342-347. 10.1038/86730.View ArticleGoogle Scholar
- Kreil DP, Russell RR, Russell S: Microarray oligonucleotide probes. DNA Microarrays, Part A: Array Platforms and Wet-Bench Protocols. Edited by: Alan Kimmel BO. 2006, , Academic Press, 73-98. Volume 410View ArticleGoogle Scholar
- Dai H, Meyer M, Stepaniants S, Ziman M, Stoughton R: Use of hybridization kinetics for differentiating specific from non-specific binding to oligonucleotide microarrays. Nucl Acids Res. 2002, 30: e86-10.1093/nar/gnf085.PubMedView ArticleGoogle Scholar
- Li Y, Zon G, Wilson WD: NMR and molecular modeling evidence for a G.A mismatch base pair in a purine-rich DNA duplex. Proceedings of the National Academy of Sciences. 1991, 88: 26-30. 10.1073/pnas.88.1.26.
- Wick LM, Rouillard JM, Whittam TS, Gulari E, Tiedje JM, Hashsham SA: On-chip non-equilibrium dissociation curves and dissociation rate constants as methods to assess specificity of oligonucleotide probes. Nucl Acids Res. 2006, 34: e26. 10.1093/nar/gnj024.
- Zhang L, Wu C, Carta R, Zhao H: Free energy of DNA duplex formation on short oligonucleotide microarrays. Nucl Acids Res. 2007, 35: e18. 10.1093/nar/gkl1064.
- Horne MT, Fish DJ, Benight AS: Statistical thermodynamics and kinetics of DNA multiplex hybridization reactions. Biophys J. 2006, 91: 4133-4153. 10.1529/biophysj.106.090662.
- Carlon E, Heim T: Thermodynamics of RNA/DNA hybridization in high-density oligonucleotide microarrays. Physica A. 2006, 362: 433-449. 10.1016/j.physa.2005.09.067.
- Egan CM, Sridhar S, Wigler M, Hall IM: Recurrent DNA copy number variation in the laboratory mouse. Nat Genet. 2007, 39: 1384-1389. 10.1038/ng.2007.19.
- Fish DJ, Horne MT, Searles RP, Brewood GP, Benight AS: Multiplex SNP discrimination. Biophys J. 2007, 92: L89-91. 10.1529/biophysj.107.105320.
- Fotin AV, Drobyshev AL, Proudnikov DY, Perov AN, Mirzabekov AD: Parallel thermodynamic analysis of duplexes on oligodeoxyribonucleotide microchips. Nucl Acids Res. 1998, 26: 1515-1521. 10.1093/nar/26.6.1515.
- Fish DJ, Horne MT, Brewood GP, Goodarzi JP, Alemayehu S, Bhandiwad A, Searles RP, Benight AS: DNA multiplex hybridization on microarrays and thermodynamic stability in solution: a direct comparison. Nucl Acids Res. 2007, 35: 7197-7208. 10.1093/nar/gkm865.
- Sasaki D, Kondo S, Maeda N, Gingeras TR, Hasegawa Y, Hayashizaki Y: Characteristics of oligonucleotide tiling arrays measured by hybridizing full-length cDNA clones: Causes of signal variation and false positive signals. Genomics. 2007, 89: 541-551. 10.1016/j.ygeno.2006.12.013.
- Oligonucleotide array-based CGH for genomic DNA analysis v.5.0. [http://www.chem.agilent.com/scripts/literaturePDF.asp?iWHID=52010]
- Poland D, Scheraga HA: Phase transitions in one dimension and the helix-coil transition in polyamino acids. The Journal of Chemical Physics. 1966, 45: 1456-1463. 10.1063/1.1727785.
- Everaers R, Kumar S, Simm C: Unified description of poly- and oligonucleotide DNA melting: Nearest-neighbor, Poland-Sheraga, and lattice models. Physical Review E. 2007, 75: 041918. 10.1103/PhysRevE.75.041918.
- Held GA, Grinstein G, Tu Y: Modeling of DNA microarray data by using physical properties of hybridization. Proceedings of the National Academy of Sciences. 2003, 100: 7575-7580. 10.1073/pnas.0832500100.
- Gadgil C, Yeckel A, Derby JJ, Hu WS: A diffusion-reaction model for DNA microarray assays. Journal of Biotechnology. 2004, 114: 31-45. 10.1016/j.jbiotec.2004.05.008.
- Halperin A, Buhot A, Zhulina EB: Brush effects on DNA chips: Thermodynamics, kinetics and design guidelines. Biophysical Journal. 2005, 89: 796-811. 10.1529/biophysj.105.063479.
- Livshits MA, Mirzabekov AD: Theoretical analysis of the kinetics of DNA hybridization with gel-immobilized oligonucleotides. Biophys J. 1996, 71: 2795-2801.
- Peterson AW, Wolf LK, Georgiadis RM: Hybridization of mismatched or partially matched DNA at surfaces. J Am Chem Soc. 2002, 124: 14601-14607. 10.1021/ja0279996.
- Zhang L, Lu HHS, Chung W, Yang J, Li WH: Patterns of segmental duplication in the human genome. Mol Biol Evol. 2005, 22: 135-141. 10.1093/molbev/msh262.
- Goidts V, Cooper D, Armengol L, Schempp W, Conroy J, Estivill X, Nowak N, Hameister H, Kehrer-Sawatzki H: Complex patterns of copy number variation at sites of segmental duplications: an important category of structural variation in the human genome. Human Genetics. 2006, 120: 270-284. 10.1007/s00439-006-0217-y.
- Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, Cho EK, Dallaire S, Freeman JL, Gonzalez JR, Gratacos M, Huang J, Kalaitzopoulos D, Komura D, MacDonald JR, Marshall CR, Mei R, Montgomery L, Nishimura K, Okamura K, Shen F, Somerville MJ, Tchinda J, Valsesia A, Woodwark C, Yang F, Zhang J, Zerjal T, Zhang J, Armengol L, Conrad DF, Estivill X, Tyler-Smith C, Carter NP, Aburatani H, Lee C, Jones KW, Scherer SW, Hurles ME: Global variation in copy number in the human genome. Nature. 2006, 444: 444-454. 10.1038/nature05329.
- Graubert TA, Cahan P, Edwin D, Selzer RR, Richmond TA, Eis PS, Shannon WD, Li X, McLeod HL, Cheverud JM, Ley TJ: A high-resolution map of segmental DNA copy number variation in the mouse genome. PLoS Genetics. 2007, 3: e3. 10.1371/journal.pgen.0030003.
- Lakshmi B, Hall IM, Egan C, Alexander J, Leotta A, Healy J, Zender L, Spector MS, Xue W, Lowe SW, Wigler M, Lucito R: Mouse genomic representational oligonucleotide microarray analysis: Detection of copy number variations in normal and tumor specimens. Proceedings of the National Academy of Sciences. 2006, 103: 11234-11239. 10.1073/pnas.0602984103.
- Hinds DA, Kloek AP, Jen M, Chen X, Frazer KA: Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat Genet. 2006, 38: 82-85. 10.1038/ng1695.
- Wolber PK, Collins PJ, Lucas AB, De Witte A, Shannon KW: The Agilent in situ-synthesized microarray platform. DNA Microarrays, Part A: Array Platforms and Wet-Bench Protocols. Edited by: Kimmel A, Oliver B. 2006, Academic Press, 410: 28-57.
- Quackenbush J: Microarray data normalization and transformation. Nat Genet. 2002, 32: 496-501. 10.1038/ng1032.
- Agilent feature extraction software v9.5 reference guide. [http://www.chem.agilent.com/scripts/LiteraturePDF.asp?iWHID=50416]
- Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucl Acids Res. 2002, 30: e15. 10.1093/nar/30.4.e15.
- CGH analytics v3.4 user guide. [http://www.chem.agilent.com/scripts/LiteraturePDF.asp?iWHID=47787]