Optimization of candidate-gene SNP-genotyping by flexible oligonucleotide microarrays; analyzing variations in immune regulator genes of hay-fever samples

Background Genetic variants in immune regulator genes have been associated with numerous diseases, including allergies and cancer. Increasing evidence suggests a substantially elevated disease risk in individuals who carry a combination of disease-relevant single nucleotide polymorphisms (SNPs). For the genotyping of immune regulator genes, such as cytokines, chemokines and transcription factors, an oligonucleotide microarray for the analysis of 99 relevant SNPs was established. Since the microarray design was based on a platform that permits flexible in situ oligonucleotide synthesis, a set of optimally performing probes could be defined by a selection approach that combined computational and experimental aspects. Results While the in silico process eliminated 9% of the initial probe set, which had been picked purely on the basis of potential association with disease, the subsequent experimental validation excluded more than twice as many. The performance of the optimized microarray was demonstrated in a pilot study. The genotypes of 19 hay-fever patients (aged 40–44) with high IgE levels against inhalant antigens were compared to the results obtained with 19 age- and sex-matched controls. For several variants, allele-frequency differences of more than 10% were identified. Conclusion Given the ability to improve a chip design empirically, candidate-SNP typing represents a viable approach for molecular epidemiological studies.

ting DNA. Commercially available microarrays, on the other hand, either contain a fixed and usually broadly applicable content or are expensive to purchase with customized features. The fixed-content arrays are useful for taking advantage of the high resolution genetic map of the human genome that is based on single nucleotide polymorphisms [1,1], which define DNA blocks (haplotypes) [1]. Since SNPs are the most common type of genetic variation between individuals, it makes sense to utilize them for the localization of disease genes by identifying haplotypes that are associated with phenotypic traits, especially in the case of multifactorial diseases [1][2][3][4][5]. As a consequence of such a study, however, further analysis is required for improving the resolution of the mapping process or trying to identify the polymorphisms that are actually responsible for the phenotypic variation. Alternatively to the process described above, one can immediately focus on the analysis of particular polymorphisms in candidate genes, if circumstantial evidence indicates their possible relevance to the occurrence of a disease. In either approach the production of a customized microarray is required. Also, experience has demonstrated the need for a careful design of the experimental set-up in order to avoid unacceptable error [6].
Irrespective of the algorithm used for the sequence selection of the probe set, the final functional test of the suitability of an oligonucleotide array for genotyping is an empirical analysis of the hybridization performance of the oligonucleotide probes. Consequently, it is likely that the initial chip design will be changed by replacing ill-performing oligonucleotides with alternative sequences. For this process, the ability to easily change the chip layout is essential. Light-induced in situ synthesis controlled by a micro-mirror device [7,8] combines high synthesis yields of more than 99.5% per condensation [9] (and therefore good oligonucleotide quality) with the power of producing oligomer arrays of high density, reproducible characteristics and flexible layout.
In this study, we present the process of establishing an oligonucleotide microarray based on an on-site in situ synthesis technology for typing DNA samples in immune regulator genes, including cytokines, chemokines and transcription factors. Genetic variants in immune regulator genes have been associated with numerous diseases, including allergies and cancer, apparently with an elevated disease risk in individuals who carry a combination of disease-relevant SNPs. For the array design, we exploited the flexibility of the GeniomOne device [8]. It employs a digital projector to synthesize oligonucleotide array features within channels of a three-dimensional micro-fluidic reaction carrier. The system allows the synthesis of a probe set of up to 64,000 oligonucleotides on a single chip, which can subsequently be hybridized with up to eight samples. For this analysis, a microarray that assays 99 relevant SNPs was established by an iterative cycle of probe design and experimental evaluation. Subsequently, the performance of this microarray was investigated in a pilot study. Hay-fever patients aged 40-44 who exhibited high IgE levels against inhalant antigens and an age- and sex-matched control group were analyzed.

Results
From a case-control study on hay-fever [10], 19 cases with the most extreme plasma IgE levels against inhalant antigens and 19 age- and sex-matched non-atopic controls were selected for the project. Originally, 141 SNPs in cytokine genes and other immune regulatory factors were selected from published studies and SNP databases [11][12][13]. Where possible, SNPs with known or potential functional relevance and with allele frequency information were selected. Also, sequence complexity was meant to be similar among the probes, since it is well established that the rate of reassociation depends on sequence complexity [14]. In addition, the initial compilation was based on theoretical calculations of interactions between all oligonucleotide probes and PCR fragments. The program "SNP Cross-Checker" by Febit GmbH was used to check the cross-reactivity between oligoprobes and template sequences; this reduced the number of PCR products by 13, to 128. The maximum permitted homology between a 23-mer oligoprobe and the template sequences was set to 85%. This threshold reflects the assumption that if 20 of the 23 nucleotides of a probe base-pair with a template, the resulting complex is stable enough to produce false-positive signals in the genotyping analysis.
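The homology threshold described above can be illustrated with a minimal sketch. It is not the algorithm of "SNP Cross-Checker", which is not described here; the function names are hypothetical, and ungapped same-strand identity is used as a simple stand-in for the base pairing between a probe and the complementary template strand.

```python
import math

PROBE_LENGTH = 23          # probe length used in the text
HOMOLOGY_THRESHOLD = 0.85  # maximum permitted homology from the text


def min_matches_for_threshold(probe_length=PROBE_LENGTH,
                              threshold=HOMOLOGY_THRESHOLD):
    """Smallest number of matching positions that reaches the threshold."""
    return math.ceil(probe_length * threshold)


def max_window_identity(probe, template):
    """Highest count of identical positions over all ungapped placements
    of the probe along the template (assumed, simplified similarity measure)."""
    n = len(probe)
    best = 0
    for start in range(len(template) - n + 1):
        window = template[start:start + n]
        matches = sum(p == t for p, t in zip(probe, window))
        best = max(best, matches)
    return best


def is_cross_reactive(probe, template, threshold=HOMOLOGY_THRESHOLD):
    """Flag a probe/template pair whose best identity reaches the threshold."""
    needed = min_matches_for_threshold(len(probe), threshold)
    return max_window_identity(probe, template) >= needed
```

With these assumptions, a threshold of 85% on a 23-mer corresponds to 20 matching nucleotides, which agrees with the figure quoted in the text; a threshold of 80% would correspond to 19.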
In theory, probe properties can be assessed on the basis of sequence similarity and hybridization properties. Experimentally, a "bad" probe shows low specificity, sensitivity and uniformity under the given reaction conditions (temperature, base composition, salt concentration, hybridization time). Specificity and stability of DNA duplex formation strongly depend on sequence and base composition [15,16]. Also, the target sequence on either side of the SNP position plays an important role, since secondary structures may strongly affect the hybridization behavior of a sample [17]. Therefore, it is frequently insufficient to predict hybridization performance merely on the basis of theoretical calculations. Consequently, we analyzed and optimized the experimental parameters of SNP position in the oligonucleotide and overall probe length, as well as hybridization temperature and duration. For each SNP, all four possible sequence variants were applied to the chip. One probe is designed to be perfectly complementary to a short stretch of the reference sequence (perfect match, PM); the other three are identical to it except at the interrogation position in the middle of the sequence, where one of the other three bases is substituted (mismatches, MM). The PM/MM scheme additionally allows both the background level and cross-hybridization signals to be subtracted directly, providing the redundancy required for reliable microarray analysis. Ideally, there is a 30-fold difference in signal intensity between PM and MM oligonucleotides. In hybridization, the signal intensity of an oligonucleotide depends directly on the GC content of its sequence. With a high G/C content, the MM oligonucleotide can yield a signal high enough to interfere with the discrimination between PM and MM signals.
In such cases the entire set of 24 oligonucleotides, specially designed for the detection of one SNP from the sense and antisense strands, underperforms and has to be left out of the array design. In addition, we tested positional effects by moving the polymorphic nucleotide from the center to positions +2 and -2 as well as +1 and -1 of the oligonucleotide probes (Fig. 1). This shift resulted in differences in signal intensities but did not add to the overall amount of information that could be gathered from an experiment. Consequently, we decided to use only probes that contained the respective SNP in a central position, but placed three copies of the same oligonucleotide sequence at different locations on the microarray.
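The layout of the 24 probes per SNP can be sketched as follows, assuming four base variants at each of three SNP positions (n = -2, 0, +2, as in Fig. 1) on both strands. The function names and the strand orientation convention are illustrative assumptions, not the design software used in this study.

```python
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_quartet(amplicon, snp_index, length=23, shift=0):
    """Four probe variants covering the SNP, one per base.

    shift = 0 places the SNP in the middle of the oligomer; a positive
    shift moves it to the right within the oligomer.
    """
    half = length // 2
    start = snp_index - half - shift
    end = start + length
    if start < 0 or end > len(amplicon):
        raise ValueError("probe window does not fit inside the amplicon")
    window = amplicon[start:end]
    pos = snp_index - start  # SNP position inside the probe
    return {b: window[:pos] + b + window[pos + 1:] for b in BASES}


def probe_set(amplicon, snp_index, length=23, shifts=(-2, 0, 2)):
    """All probes for one SNP: 4 bases x 3 positions x 2 strands = 24."""
    strands = (
        ("sense", amplicon, snp_index),
        ("antisense", reverse_complement(amplicon),
         len(amplicon) - 1 - snp_index),
    )
    probes = []
    for strand, seq, idx in strands:
        for shift in shifts:
            for base, oligo in probe_quartet(seq, idx, length, shift).items():
                probes.append((strand, shift, base, oligo))
    return probes
```

Within each quartet, the variant that matches the sample sequence acts as the perfect match and the remaining three as mismatches.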
Figure 1 Design of a 23-mer oligonucleotide for SNP detection. (a) The relevant PCR product of 166 bp. (b) The set of oligonucleotides (12 for the sense and 12 for the antisense strand); at n = 0 the allele is located in the middle of the oligomer, at n = -2 and n = +2 the SNP is shifted by 2 nucleotides to the left and right, respectively.
Furthermore, different hybridisation temperatures (40°C, 45°C, 50°C and 55°C) and hybridisation times from one to four hours were compared. In this experiment, the hybridisation time had little influence on the number of correct and false signals. However, raising the hybridisation temperature to 50°C or 55°C reduced cross-hybridisation by at least 5% and lowered the overall amount of positive signal to 60% and 40%, respectively. Reducing the stringency by lowering the hybridisation temperature maximized the overall number and intensity of signals, but this was accompanied by a 30% increase in unspecific hybridisation signals.
We also varied probe length, synthesizing oligonucleotides of 19, 21, 23, 25 and 27 nucleotides on the same chip. While longer sequences usually produce higher signal intensities, shorter oligonucleotides permit better discrimination of single-base differences due to the more pronounced destabilizing effect of a mismatch. As expected, the signal intensities of both the fully matched (I1) and the mismatch probes (I2) increased with length, while discrimination (I1/I2) improved the shorter the oligonucleotides were (Fig. 2). The measured signal intensity (I1) clearly increases with the number of nucleotides in the oligonucleotide probe: I1 (27-mer) > I1 (19-mer) (Fig. 3). The same effect is observed for the MM (I2) oligoprobes. Though the discrimination between PM/MM
Tests at different hybridization temperatures (40-55°C) produced the overall best results for the majority of SNPs with 23-mer probes and 3 to 4 hours of hybridization at 45°C. Finally, the selected set of oligoprobes and the hybridization conditions were additionally tested with 4 genomic DNA samples of control individuals. These control experiments had 5-fold redundancy. The concordance of the analyzed genotypes was compared individually.
Figure 2 The dependence of signal intensity on oligonucleotide length. Hybridization was done at 45°C. I1/I2 denotes the ratio of the signal at the full-match oligonucleotide to the signals at the mismatched oligonucleotides. 27, 25, 23, 21 and 19 indicate the length of the oligomers. The SNP was located either at the center of the oligonucleotides (0) or shifted by two bases in either direction (+1, -1).
Differences in the performance of oligonucleotides
For selecting the best-performing oligoprobes in the initial optimization experiments, one test DNA of good quality was used. All hybridization reactions from the chip-design step were repeated 3 times. During the optimization process, we identified several oligonucleotide probes that did not perform irrespective of the chosen hybridization conditions (e.g., Fig. 2). Apparently, the previously described selection on the basis of cross-reactivity could be even more stringent, i.e., less base pairing should be allowed. Subsequent experimental tests revealed additional oligoprobe sets that dropped out of the final chip design for the same reason. On the basis of these experimental results, the threshold for software-based oligoprobe selection could be set to 80% instead of 85%, allowing less base pairing and thus fewer false signals due to nonspecific oligonucleotide hybridization. In total, 29 out of the 128 SNPs (22%) could not be analyzed adequately. The respective oligomer probes, which had been defined as good by the in silico selection process, were empirically found to be ill-performing in real hybridizations. Either the absolute signal intensity was too low to permit a statistically solid analysis or the discriminative effect was insufficient. The ratio between PM and MM oligo signal intensities is supposed to be at least 1/3 (Fig. 3). The high number of failing oligonucleotides illustrates the need for careful experimental validation of in silico designed microarrays.
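How signal intensity and PM/MM discrimination translate into a genotype call can be sketched as below. This is an illustrative rule, not the calling procedure used in this study: the function name is hypothetical, and the cut-offs `min_ratio` and `min_signal` are assumed parameters that would have to be set from the experimental data.

```python
def call_genotype(intensities, alleles, min_ratio=3.0, min_signal=0.0):
    """Call a genotype from the four probe signals of one SNP.

    intensities: dict mapping each base (A, C, G, T) to its probe signal.
    alleles:     the two bases expected at this SNP, e.g. ("A", "G").
    min_ratio:   assumed minimum ratio of an allele signal to the strongest
                 non-allele (mismatch) signal; hypothetical default.
    min_signal:  assumed absolute signal floor; hypothetical default.

    Returns a two-letter genotype, or None if no allele passes.
    """
    first, second = alleles
    # The two probes that match neither allele estimate background
    # and cross-hybridization.
    background = max(v for b, v in intensities.items() if b not in alleles)

    def passes(signal):
        return signal > min_signal and signal >= min_ratio * background

    first_ok = passes(intensities[first])
    second_ok = passes(intensities[second])
    if first_ok and second_ok:
        return first + second      # heterozygote
    if first_ok:
        return first + first       # homozygote for the first allele
    if second_ok:
        return second + second     # homozygote for the second allele
    return None                    # signal too low or discrimination too poor
```

A probe set for which most samples return `None` under such a rule would correspond to the cases described above, in which either the absolute intensity or the discriminative effect was insufficient.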
Using the optimized microarray, we performed genotyping analyses at 99 SNPs in 68 genes that have a putative functional significance for the occurrence of hay-fever. From a case-control study on hay-fever [10], 19 cases with the most extreme plasma IgE levels against inhalant antigens and 19 age- and sex-matched non-atopic controls were selected. The participants gave informed consent in writing, and the local ethics committee approved the study. PCR amplifications of the relevant DNA regions were performed either individually or in pools of 5 or 10 samples. While all pentaplex reactions yielded every expected product, two decaplex amplifications failed to produce 1 out of the expected 10 amplicons (Fig. 4). The 99 products were pooled prior to labelling and hybridized concomitantly (Fig. 5). For each sample, the analysis was repeated up to four times. The observed allele frequencies are presented in Table 1. To assess the accuracy of the genotyping, ten PCR products of heterozygote calls obtained from the microarray analyses were subjected to gel-based DNA sequencing for confirmation. In all cases, the results were in full agreement.
Hybridization experiments for all 38 individuals studied were repeated twice.
16% of the SNPs presented only one allele in the 38 studied samples. For 14 samples (7 cases and 7 controls) the call rate for all variants was above 90%. In one case it was below 80% due to the low quality of this particular DNA sample. For 17 SNPs the amplification step essentially failed due to the low quality of the clinical genomic DNA samples. After exclusion of these 17 SNPs (indicated with an asterisk in Tab. 1), which performed poorly in hybridizations, the average concordance was 93%. Among the variants with high-quality data, five SNPs in the genes IL2 (rs2069772), TCL1B (rs1064017), IL11 (rs2298885), IL5RA (rs2290610) and TNFRSF1A (rs4149570) had p-values smaller than 0.05 for the association of carrying the mutant allele with the high IgE phenotype. The homozygous genotype A for the IL5 receptor alpha (IL5RA Ile129Val) was associated with a 6.8-fold risk (95% confidence interval, 1.6-29.1) of a high IgE phenotype.
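A risk estimate with a 95% confidence interval, as reported above, can be obtained from a 2 × 2 table of genotype carriers among cases and controls. The sketch below assumes the common logit (Woolf) interval; the statistical method actually used is not stated here, and the counts in the usage note are placeholders rather than study data.

```python
import math


def odds_ratio_ci(carrier_cases, noncarrier_cases,
                  carrier_controls, noncarrier_controls, z=1.96):
    """Odds ratio with a logit (Woolf) confidence interval.

    z = 1.96 gives an approximate 95% interval. All four cell counts
    must be positive; no continuity correction is applied.
    """
    cells = (carrier_cases, noncarrier_cases,
             carrier_controls, noncarrier_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls)
    # Standard error of log(OR)
    se = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

For example, `odds_ratio_ci(10, 5, 4, 15)` (placeholder counts) returns an odds ratio of 7.5 together with its lower and upper bounds. With groups as small as 19 cases and 19 controls the interval is necessarily wide, which is consistent with the range reported above.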

Discussion
As a pilot project, an oligonucleotide microarray was produced using the GeniomOne device to facilitate the screening of single nucleotide polymorphisms in several genes that are associated with hay-fever. Based on an in silico design, the selected set of oligonucleotides was optimized by a subsequent experimental analysis. While the in silico process eliminated 9% of the initial 141 SNPs, which had been picked purely on the basis of a potential association with the occurrence of hay-fever, the subsequent experimental validation eliminated another 20% of these oligomers, more than twice as many. This result illustrates the importance of experimental validation of microarray designs. Even in analyses that are based on a continuous detection of the hybridization and dissociation process (dynamic allele-specific hybridization) [18] the selection is critical, although an analysis of the association and dissociation curves of the duplexes permits a more discriminative and accurate SNP detection.
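The percentages quoted above follow directly from the SNP counts given in the Results (141 initial SNPs, 13 removed in silico, 29 removed experimentally). The short calculation below only restates that arithmetic; the variable names are illustrative.

```python
initial_snps = 141            # SNPs in the original selection
removed_in_silico = 13        # removed by the cross-reactivity check
removed_experimentally = 29   # removed after hybridization tests

after_in_silico = initial_snps - removed_in_silico
final_snps = after_in_silico - removed_experimentally

# Shares relative to the initial selection of 141 SNPs
pct_in_silico = 100 * removed_in_silico / initial_snps
pct_experimental = 100 * removed_experimentally / initial_snps

# Share relative to the 128 SNPs that entered experimental validation
pct_of_tested = 100 * removed_experimentally / after_in_silico

print(after_in_silico, final_snps)          # 128 99
print(round(pct_in_silico, 1))              # 9.2
print(round(pct_experimental, 1))           # 20.6
print(round(pct_of_tested, 1))              # 22.7
print(removed_experimentally / removed_in_silico > 2)  # True
```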
The reasons for the failing probes could be manifold [19]. Although only short fragments were hybridized, secondary structures formed either within one sample molecule or between different targets could cause inefficient binding to the array-bound probe molecules. Also, it is well known that dangling ends of the target molecules may have a profound effect on the hybridization [20]. Documentation of the effectiveness of the genotyping ability of particular sets of oligonucleotide probes is essential for a study of high accuracy. Use of flexible in situ synthesized oligonucleotide microarrays to such ends appears to be an efficient and attractive method for fast and cost-efficient pre-screening of candidate SNPs for an eventual high-throughput genotyping.
Figure 5 Image of a simultaneous hybridization of 99 PCR products to an in situ synthesized oligonucleotide microarray. Usually, the features were scrambled across the array. For illustrative purposes, they were placed next to each other in this particular experiment.
GeniomOne allows the synthesis of 8 × 8,000 probes per array overnight, which can be tested in hybridization experiments immediately afterwards. In this way, many combinations can be tested in parallel without additional cost, which allows an optimal set of oligoprobes to be selected for the subsequent experiments. This is a major advantage of the GeniomOne technology.
In the analysis of the 38 DNA samples of hay-fever cases and controls, we were able to identify at least five polymorphisms in immune regulator genes that contribute to the extreme IgE phenotype and deserve further testing. For 22% of the selected SNPs, only one genotype was seen in the 38 individuals. For several variants, allele-frequency differences between cases and controls exceeded 10%. These include non-synonymous variants in the IL5 receptor alpha (IL5RA Ile129Val) and TCL1B (Gly93Arg), promoter polymorphisms in IL2 (-330 T/G) and TNFRSF1A (-609 G/T), and a polymorphism in the 3' UTR of IL11. IL5RA is a crucial factor in IL5 signalling and a contributor to the genetics of atopy in mice [21]. The extreme-phenotype design of the study performed here may be an efficient alternative for the identification of disease-relevant sequence variants.
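The comparison of allele frequencies between cases and controls can be sketched as follows. The function name is hypothetical and the genotype counts in the usage note are placeholders, not values from Table 1.

```python
def variant_allele_frequency(hom_ref, het, hom_var):
    """Frequency of the variant allele from genotype counts.

    Each individual carries two alleles: heterozygotes contribute one
    variant allele, variant homozygotes contribute two.
    """
    individuals = hom_ref + het + hom_var
    if individuals == 0:
        raise ValueError("no genotyped individuals")
    return (het + 2 * hom_var) / (2 * individuals)


def frequency_difference(case_counts, control_counts):
    """Absolute difference in variant allele frequency, cases vs controls."""
    return abs(variant_allele_frequency(*case_counts)
               - variant_allele_frequency(*control_counts))
```

For instance, with placeholder counts of (10, 7, 2) among 19 cases and (15, 4, 0) among 19 controls, `frequency_difference` returns roughly 0.18, which would exceed the 10% difference used as a criterion above.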

Conclusion
Based on a platform that permits flexible in situ oligonucleotide synthesis, a set of optimally performing probes could be defined by a selection approach that combined computational and experimental aspects. The final design achieved by this process permitted an effective discrimination of both homo- and heterozygote polymorphisms in hay-fever patients. Allele-frequency differences of more than 10% could be identified.

Microarray synthesis
All analysis steps, (i) in situ synthesis of the oligonucleotide microarray, (ii) hybridization of the labeled PCR product mixture and (iii) detection of the signal intensities, were performed with the GeniomOne device of febit.
Figure 4 Gel-electrophoretic separation of the products of multiplex-PCR. Two decaplex amplifications are shown in comparison to the respective individual reactions. In both cases presented here, one product was not amplified in the multiplex reaction, while the reaction worked fine in the individual amplification.
Controlled by a mask-free, light-controlled process, oligonucleotide probes were synthesized in situ in 3' to 5' direction [9]. For each selected SNP, 24 oligonucleotide probes were synthesized, 12 for either DNA strand (Fig. 1), all designed to exhibit similar hybridization characteristics. The arrays used in this study consisted of 7,448 distinct oligonucleotides (594 perfect match probes and 6,534 mismatch probes, plus 320 copies of a control oligonucleotide). A complete list can be obtained from the authors.
Multiplex-PCR was performed in a total volume of 25 µl solution containing 80 ng human genomic DNA, 1.2 µmol/l of each primer, 1 mmol/l deoxynucleotide triphosphates (dNTPs), 5 mmol/l MgCl2 and 2 units of Thermoprime Plus DNA polymerase (AB Gene). All primer pairs had been checked in silico for possible primer dimers using the program "Primer Premier 5" (Premier Biosoft International, Palo Alto, USA).

PCR-amplification
DNA amplification for the individuals studied in the present work was done as described, in single PCR cycling reactions. All PCR products were checked by electrophoresis on 2% agarose gels.