Microarray-based ultra-high resolution discovery of genomic deletion mutations
© Belfield et al.; licensee BioMed Central Ltd. 2014
Received: 1 November 2013
Accepted: 28 February 2014
Published: 22 March 2014
Oligonucleotide microarray-based comparative genomic hybridization (CGH) offers an attractive possible route for the rapid and cost-effective genome-wide discovery of deletion mutations. CGH typically involves comparison of the hybridization intensities of genomic DNA samples with microarray chip representations of entire genomes, and has widespread potential application in experimental research and medical diagnostics. However, the power to detect small deletions is low.
Here we use a graduated series of Arabidopsis thaliana genomic deletion mutations (of sizes ranging from 4 bp to ~5 kb) to optimize CGH-based genomic deletion detection. We show that the power to detect smaller deletions (4, 28 and 104 bp) depends upon oligonucleotide density (essentially the number of genome-representative oligonucleotides on the microarray chip), and determine the oligonucleotide spacings necessary to guarantee detection of deletions of specified size.
Our findings will enhance a wide range of research and clinical applications, and in particular will aid in the discovery of genomic deletions in the absence of a priori knowledge of their existence.
Keywords: Mutation, Deletion, Microarray, Genome, Comparative genomic hybridization, Probe density
Oligonucleotide microarrays were first developed ~20 years ago. Present-day microarrays are vastly superior to their predecessors in quality and probe density, and can represent an entire species genome [2–4]. Microarray technology, together with advances in genomics and bioinformatics methodologies, has revolutionized the ways that we interrogate and study genomes. These approaches have great power because they allow simultaneous survey and profiling of thousands of genes, and enable whole genomes to be assayed at particular moments in time and under specific conditions.
The scientific applications of microarray technology range from gene expression profiling, comparative genomic hybridization (CGH), and chromatin immunoprecipitation analysis, to single nucleotide polymorphism (SNP) detection. Recently, next generation sequencing (NGS) has also been used as a major discovery tool in these applications, particularly where well-annotated whole genome datasets are available. However, for species with large, complex (e.g., transposon-rich) or polyploid genomes, or for species with genomes that are not well annotated, NGS is relatively poorly suited to the detection of SNPs, insertions/deletions (INDELs) and other variants because of the short sequencing reads and the depth of sequencing coverage needed [7–9]. Moreover, the computational overheads associated with the analysis of NGS data are a significant barrier to their use in some laboratories. By contrast, microarrays are well-established research tools that require only well-established analysis methods. These analyses can be performed easily on personal computers, and allow rapid and routine analysis of multiple samples. Hence, whilst the use of microarrays and NGS in genomic analysis will likely continue to be complementary, we here describe work specifically aimed at optimizing genomic deletion mutation detection via microarray-based approaches.
The first discovery of a plant genomic deletion mutation using microarray-based CGH utilized the Arabidopsis thaliana ATH1 genome array, a single-chip array featuring 22,500 probes representing approximately 24,000 A. thaliana gene sequences. CGH using this array led to the discovery of a phenotype-causal 523 bp fast-neutron (FN) irradiation-induced deletion mutation. This deletion was within the second exon and intron of the A. thaliana AtHKT1 gene (At4G10310) in a previously identified ion accumulation mutant. Up to that point the identification of mutations that caused phenotypic changes in plants required laborious map-based cloning.
Mutagens such as FNs frequently induce small genomic DNA deletions of 1–6 bp and permit the identification of phenotype-causal mutations in a less biased manner than mutagens such as EMS, which almost exclusively cause G:C > A:T point mutations [12, 14–17]. The use of tiling microarrays to discover these causal mutations potentially reduces the time, cost and labour needed, which is important when large numbers of samples are analysed [4, 11]. However, mutation discovery remains a difficult process, especially with respect to relatively small deletions.
In this study we describe the use of a graduated series of previously characterized A. thaliana genomic deletion mutations to discover the parameters required for rapid and robust detection of genomic deletion mutations, and in particular of relatively small deletion mutations. This series presents a range of deletion mutations, from 4 bp to ~5 kb, thus spanning roughly three orders of magnitude in deletion size. We use standard and customized versions of the Roche NimbleGen 2.1 million feature high-resolution microarrays to detect these deletions via CGH, and are hence able to determine the oligonucleotide densities necessary to robustly detect genomic deletions as small as 4 bp in size. We show that this technology has the potential to uncover deletions that were not detectable with previous array-based designs, and in this respect to equal NGS-based technologies. Our findings may improve microarray design for a wide variety of applications.
Arabidopsis oligonucleotide microarray design
The arrays used in this study are unlike traditional tiling array formats that feature non-overlapping or partially overlapping probes with a maximum of one probe overlapping another. For example, in this study, the staggered probes designed every 2 bp can cover a single nucleotide position in the Arabidopsis genome up to 37 times, versus a maximum of twice using conventional tiling arrays. Probes containing sequences represented more than once in the genome were omitted from the final array design, and all probes in the ‘ultra-high density’ probe sets were unique within the TAIR8 annotated reference genome. The genomic distribution profiles of the ‘ultra-high density’ probes, like those of standard tiling arrays including the NimbleGen CGH arrays, are essentially linear and unbiased. This is unlike most transcript arrays, whose probes are biased towards the 3′ end of transcripts and sometimes overlap. That bias arises because sequence data are typically derived from EST sequences with a 3′ bias, and the 3′ ends of genes are generally more variable and provide greater specificity.
Microarray-based discovery of deletion mutations
Use of DNA (rather than mRNA) for microarray-based deletion cloning strategies has a number of benefits, as the mRNA abundance of many genes is too low to be efficiently and reliably detected. In addition, the relative levels of mRNA could be altered due to the secondary effects of the deletion mutation. DNA samples from mutant plant lines and control lines, labelled with Cy3 or Cy5, were co-hybridized to the CGH arrays. Genome-wide mutant DNA hybridization intensities were normalized with control genomic DNA samples hybridized to the same CGH array. Homotypic hybridizations (self-self: the same nucleic acid samples labelled with two fluorophores and hybridized to a single microarray) should result in a slope of Cy3 versus Cy5 intensity equal to one when normalized (Additional file 1: Figure S3). However, a number of putative deletions and/or copy number variants (CNVs) are always observed in such experiments, as judged by a lack of hybridization intensity, or poor hybridization intensity, of either the sample or the control. Reduced hybridization signals are likely due to experiment-to-experiment variability and variation in the hybridization efficiency between individual probes [21, 22].
Further analysis indicated the existence of seven other putative deletions displaying a probe ‘deletion profile’ threshold of 2 × S.D. in the ga1-3 mutant genome (shown with a *, Figure 2; also see Additional file 1: Figure S4A). To determine if these were genuine deletions, we designed oligonucleotides to a gene within each putatively deleted region and performed diagnostic PCR analyses. PCRs showed that three of the seven putative deletions (on chromosomes 4 and 5) were false positives, as the genes were amplified from both the Ler control and ga1-3 mutant DNA samples (see Additional file 1: Figure S4B). We could not confirm the existence of the four deletions on chromosomes 1–3, as PCR products could not be amplified from either the control or mutant lines. Interestingly, we also observed a potential CNV (duplication) of ~4.5 kb on chromosome 4 (shown with a +; see Figure 2, ga1-3 profile), although it is alternatively possible that this signal represents a deletion event in the Ler control.
The most striking result of the FN1148 hybridization profile (versus control) was that there appeared to be several large (~100 kb to 10 Mb) regions of chromosomes 2, 3 and 5 that were deleted, but no such deletions were observable on chromosomes 1 and 4 (Figure 2, FN1148 profile, and Additional file 1: Figure S4C). As for the deletions identified in the ga1-3 mutant versus control above, oligonucleotides were designed to six genes within these putative deleted regions and diagnostic PCR analyses were performed to confirm authenticity. We did identify a deletion of ~4.5 kb in the FN1148 mutant on chromosome 2 (a PCR product of ~6 kb was amplified from the control plant DNA while the FN1148 mutant PCR product was ~1.5 kb; see Additional file 1: Figure S4D). This deletion spanned the entire 4 kb sequence of a transposable element (At2G31080). An alternative explanation for this deletion profile could be either the natural loss or movement of this mobile element in the FN1148 mutant genome. PCR analyses showed that the five other genes that appeared to be located in large deleted regions on chromosomes 2, 3 and 5 were actually present in both the control and FN1148 mutant lines (see Additional file 1: Figure S4D). These false positives could be associated with a poor signal-to-noise ratio due to non-specific probe hybridizations, or with suboptimal technical aspects of the hybridization procedure such as sample labelling or preparation. There is also a possibility that there were DNA sequence-specific differences between the A. thaliana FN1148 Col-0 mutant (obtained from J. Schroeder’s lab, UC San Diego, USA) and the control A. thaliana Col-0 line that had been propagated in our own laboratory since the 1990s.
At a genome-wide scale, deletion profiles similar to those of the larger deletions (523 bp and ~5 kb) were observed for other Arabidopsis mutants with smaller deletions (4 bp, 28 bp and 104 bp). These deletion profiles were composed of nineteen staggered probe sets (2, 6, 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47 and 49 bp). Although the number of probes required to detect these smaller deletions was higher than that for the larger deletions, the normalized deletion signals visualized at the PHYB, HY1 and MAX2 loci (in the E124, E99 and E207 mutant lines, respectively) were similar in magnitude to the 523 bp hkt1 deletion (FN1148 mutant line) signal log2 ratio of ~ −2, but smaller than that observed for the ga1-3 deletion (signal log2 ratio of ~ −6) (Figure 2). This was an interesting observation since the PHYB, HY1 and MAX2 deletions varied in size by one to two orders of magnitude (the MAX2 deletion mutation was 4 bp in size and the PHYB mutation was 104 bp), and as a consequence the number of probes representing these deletions was different. This suggests that there is not a simple relationship between deletion size and the ability to detect it. A total of 77 probes partially covered the PHYB deletion, while the >25 times smaller deleted region of MAX2 was represented 20 times in comparison. The number of complete full-length probes covering these deletions also differed: 23 covered the PHYB deletion but none represented the MAX2 deletion, as the probe lengths were 50–75 mer and the size of the deletion just 4 bp.
Interestingly, we also detected an additional deletion mutation in the E99 mutant (versus control) on chromosome 1 using the standard NimbleGen probe set of 49 bp staggered probes (shown with a ∆; Figure 2). This deletion had a normalized signal log2 ratio of ~ −6, similar to that of the ga1-3 deletion mutation. Oligonucleotides were designed to the flanking regions of this putative deletion and a PCR product was amplified (data not shown) from E99 genomic DNA, confirming a deletion of 7,176 bp that encompassed three genes: At1G18075, At1G18080, and At1G18100, which encode a microRNA, a DNA repair exonuclease and a protein of unknown function, respectively. In addition, the presence of this ~7 kb genomic deletion and a number of other mutations, including the 28 bp deletion in the HY1 gene, was confirmed by whole genome sequencing.
In-depth analysis of probe densities required to detect relatively small genomic deletion mutations
To investigate the oligonucleotide probe resolutions required to detect genomic deletions smaller than those in ga1-3 (~5 kb) and FN1148 (~0.5 kb) mutant lines, we compared the normalized hybridization profiles of individual probe datasets obtained for the E124 (104 bp phyB deletion), E99 (28 bp hy1 deletion) and E207 (4 bp max2 deletion) mutants. To do this we used nineteen probe sets staggered every 2, 6, 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47 and 49 bp (Additional file 1: Figure S1) over the affected genes, and determined the probe resolutions required to efficiently detect the different sizes of deletion.
Table 1. The design and experimental performance of nineteen staggered probe sets used to detect a 104 bp genomic DNA deletion in an A. thaliana mutant
Column headings: Probe set staggering (bp) | Number of probes | % of probes with a log2 normalized ratio of ≤ −0.4 | Detected with a log2 normalized ratio of ≤ −0.4
Further analysis was performed for all nineteen of the ‘Designed’ and ‘Detected’ ‘ultra-high density’ probe sets (2 bp to 49 bp) representing the phyB 104 bp deletion (in the E124 mutant line); the results are listed in Table 1 and shown graphically in Figure 5A. Using a threshold of 3 consecutive probes that also had normalized log2 ratio intensities of ≤ −0.4 (i.e. a ‘deletion profile’), we found that probes staggered between 45 bp and the NimbleGen ‘standard’ probe density of 49 bp were sufficient to identify this size of deletion mutation (Figure 5A). We conclude that deletions of ~100 bp and larger can be confidently detected using tiling microarrays designed with isothermal probes staggered every 49 bp, using a search criterion of ≥ 3 consecutive probes that have a deletion profile. Recently, a similar criterion of at least three consecutive probes with a normalized log2 ratio threshold of ≥ 0.4 has also been successful in identifying mutations (CNVs) in human patients with drug-resistant epilepsy.
To detect smaller deletions of 28 bp (present in the E99 mutant line) and 4 bp (present in the E207 mutant line), higher probe densities than those used to detect the 104 bp deletion above were needed. Using the same threshold of 3 consecutive probes with a deletion profile, probes would need to be staggered between 22 and 32 bp to detect the hy1 28 bp deletion (Figure 5B), and between 10 and 12 bp to detect the 4 bp max2 deletion (Figure 5C). However, if a less stringent threshold of just 2 consecutive probes was used, the 28 bp deletion was detectable using probes staggered every 35 to 47 bp, and likewise the 4 bp deletion would be detectable using probes staggered every 15 to 25 bp.
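The consecutive-probe criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the study: the function name `find_deletion_runs` and the example ratios are hypothetical, and the input is assumed to be a list of normalized log2 ratios already ordered by genomic position.

```python
def find_deletion_runs(log2_ratios, threshold=-0.4, min_run=3):
    """Return (first_index, last_index) for each run of at least `min_run`
    consecutive probes whose log2 ratio is <= `threshold`.

    Probes are assumed to be sorted by genomic position.
    """
    runs = []
    run_start = None
    for i, ratio in enumerate(log2_ratios):
        if ratio <= threshold:
            if run_start is None:
                run_start = i  # a new candidate run begins here
        else:
            if run_start is not None and i - run_start >= min_run:
                runs.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last probe
    if run_start is not None and len(log2_ratios) - run_start >= min_run:
        runs.append((run_start, len(log2_ratios) - 1))
    return runs


# Hypothetical ratios: one run of three probes and one run of two probes
example_ratios = [0.1, -0.1, -0.6, -1.2, -0.9, 0.0, -0.5, -0.45, 0.2]
stringent_calls = find_deletion_runs(example_ratios)            # 3-probe criterion
relaxed_calls = find_deletion_runs(example_ratios, min_run=2)   # 2-probe criterion
print(stringent_calls, relaxed_calls)
```

With the stringent setting only the three-probe run is reported; relaxing `min_run` to 2 also reports the shorter run, which mirrors the trade-off between sensitivity and the risk of false positives discussed in the text.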
Previous studies have shown that one or two probes are sufficient to detect genomic deletions of ~100 bp and ~500 bp, deletions that are considerably larger than the smallest deletion (4 bp) analyzed in our study. Our analyses of the ‘standard’ probes on the NimbleGen arrays showed that they were highly sensitive and highly reliable in detecting deletions of various sizes. For example, the 104 bp phyB deletion (E124 mutant) was represented on our custom microarrays by three probes at the ‘standard’ staggering of 49 bp (Table 1, Figure 5A and Additional file 1: Figure S5), while the smaller 28 bp and 4 bp deletions in the hy1 (E99) and max2 (E207) mutants were each represented just once at the same ‘standard’ staggering. All five of these ‘standard’ probes, representing three deletions of different sizes, nevertheless gave normalized DNA hybridization log2 ratio values of less than −0.4 (Figures 5B and C, and Additional file 1: Figures S6 and S7). This shows the excellent sensitivity of this type of microarray, but the detection of a genomic deletion with just a single probe risks interference from background noise, with the persistent possibility of false positives being called.
Overall, the ‘ultra-high density’ probe sets used in our study enabled us to determine the number of probes required, and the precise staggering (between 2 and 49 bp), that allow reliable detection of deletions of just a few bp in size. These results will aid researchers designing microarrays to detect deletions of different sizes without any a priori knowledge of the deletion. For example, using a stringent criterion of 3 consecutive probes, deletions of ~100 bp require microarrays with probes staggered ≤ 45 bp, while smaller deletions of ~30 bp need probes staggered ≤ 32 bp and deletions of just a few bp require probes staggered ≤ 12 bp (Figures 5A–C).
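As a rough illustration of how these guidelines might be applied at the design stage, the sketch below encodes the three staggering limits quoted above as a lookup table. The function name `suggested_max_stagger` is hypothetical, and the rule of falling back to the nearest smaller tested deletion size is a conservative assumption of this example, since the study tested only deletions of 4, 28 and 104 bp.

```python
# Maximum probe staggering (bp) reported for the 3-consecutive-probe criterion,
# keyed by the tested deletion size (bp).
MAX_STAGGER_3_PROBES = {104: 45, 28: 32, 4: 12}


def suggested_max_stagger(min_deletion_bp, table=MAX_STAGGER_3_PROBES):
    """Return the staggering limit for the largest tested deletion size that
    does not exceed `min_deletion_bp`, or None if the target is smaller than
    every tested size. The fallback rule is an assumption, not a study result.
    """
    tested_sizes = [size for size in table if size <= min_deletion_bp]
    if not tested_sizes:
        return None
    return table[max(tested_sizes)]


for target in (104, 50, 4, 2):
    print(target, suggested_max_stagger(target))
```

A target of 50 bp falls between two tested sizes, so the example returns the limit for the 28 bp deletion rather than interpolating.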
NimbleGen microarray performance analysis
Microarrays are a powerful tool in the genetic analyses of both prokaryotes and eukaryotes. In this study we determined the probe resolution required on microarrays to reliably detect genome deletions. Using irradiated A. thaliana plant lines that were isolated from forward genetic screens, we were able to identify mutant phenotype-causal deletions that ranged in size from just a few bp to larger kb variants. Although forward genetics has been used for almost 100 years, a major difficulty of this technique is the molecular identification of the causal mutations responsible for an observed phenotype. This is especially true for smaller deletions of just a few nucleotides that can remain undetected.
In order to study the appropriate density of probes on a microarray to detect small genomic deletions, we utilized a set of A. thaliana mutant plant lines that we had previously characterized. These mutants, obtained from an irradiation mutagenesis screen, had genomic deletions in three photomorphogenesis genes MAX2, HY1, and PHYB. The deletions were 4 bp to 104 bp in size and made excellent test cases to optimize CGH-microarray based genomic deletion detection. We designed a NimbleGen deletion profiling microarray to detect these A. thaliana mutations. This targeted array contained ultra-high density probes covering the entire genomic regions of the three genes affected and included exon, intron, promoter and UTR sequences. The probes were added to the microarray in nineteen physically staggered designs to cover each region from 2 bp to 49 bp. Following normalization of the hybridization signals of the mutants versus control samples, we could perform in-depth analysis of the probe densities required to confidently detect relatively small genomic deletions.
We found that to detect genomic deletions of ~100 bp, a microarray with probes staggered every 45 bp or less was required (Figure 5A). We also found that smaller deletions of 28 bp and 4 bp required higher probe densities, with probes staggered every 22–32 bp and every 10–12 bp, respectively (Figure 5B and C). It is important to note that the long oligonucleotide probes (50–75 mer) on the NimbleGen microarrays we used in this study provided excellent detection sensitivity and reliability (see Table 1, Figures 5A–C and Figure 6). Microarrays from alternative manufacturers with shorter probes (e.g. 25 or 30 mer) might not have detected similar deletions, because shorter probes have been shown to be less sensitive than longer probes [26, 27].
Although most microarray applications are research-use-only, this technology is increasingly being used in clinical genomic applications. For example, microarrays have been used to identify DNA deletions associated with serious human conditions such as cancer, muscular dystrophy, and osteoporosis. In addition, personalized healthcare is a rapidly developing scientific area and microarrays may be utilised in the identification and early detection of treatable diseases. The deletions reported in these previous clinical studies vary in size from hundreds of bases to hundreds of kilobases. While these studies demonstrate that arrays are an excellent tool for identifying DNA deletions, the density of oligonucleotides required to detect smaller disease-causing deletions remains unclear.
With the advent of NGS it is now possible for genomes to be sequenced at high depth. However, at the present time no single platform, neither NGS nor microarrays, can identify all genomic sequence variants. NGS can be very expensive, its computational costs can be high, and it can be prohibitively time-consuming for large numbers of samples. Indeed, whole-genome sequencing is not necessary for many research studies that focus on specific target regions, such as promoters, exons and regulatory elements. This is also true for genomes with highly repetitive DNA sequences, such as the human and wheat genomes, which are composed of 50% and 80% repeats, respectively [34, 35]. For example, the 50 Mb of the extended human exome could easily be represented on a single array. Microarrays are a less labour-intensive alternative to NGS and are likely to remain popular in research laboratories, especially those without extensive bioinformatic support. Microarrays can also provide rapid and parallel analysis of large numbers of samples. Oxford Gene Technology, for example, processed 20,000 samples in 20 weeks on behalf of The Wellcome Trust Case Control Consortium researching CNVs in common human diseases.
The small deletions analysed in this report are typical of those found in plants following irradiation mutagenesis and in certain human disorders: frameshift-causing deletions of 4 bp have been reported to be responsible for severe brain malformations, for a familial condition that can cause the lung to collapse, and for a syndrome that can lead to increased risk of cancer of the colon, stomach, and pancreas. Our findings show that the probe density on a microarray is critical in identifying genomic deletions and is fundamental to the success of experiments. Our results will help researchers working on both prokaryotes and eukaryotes to design microarrays with the optimal probe densities to detect both large and small deletions. These findings are applicable to any organism with a well-annotated sequenced reference genome.
Plant material and growth conditions
All experiments used either the Landsberg erecta (Ler) or the Columbia (Col-0) laboratory strains of A. thaliana as genetic background. ga1-3 seeds were originally isolated from an FN-mutagenized Ler plant [40, 41] that had a genomic deletion encompassing part of the GA1 gibberellin biosynthesis gene (At4G02780). Seeds of the Col-0 AtHKT1 mutant (FN1148) were kindly donated by Julian Schroeder, UC San Diego, USA. Plants were grown on soil with a 16 h light/8 h dark photoperiod at 22–24°C (irradiance 120 μmol photons m−2 s−1).
Three Arabidopsis elongated hypocotyl mutants, E99, E124 and E207, were obtained from visual screens of seedlings grown from a fast neutron irradiated DELLA-deficient (gai-t6, rga-t2, rgl1-1, rgl2-1 and rgl3-4) seed collection. The plants harbour genomic deletions within the HY1 (At2G26670), PHYB (At2G18790) and MAX2 (At2G42620) genes, respectively. The genomic deletion in each FN elongated hypocotyl mutant was confirmed by direct PCR amplification and Sanger sequencing using the oligonucleotides listed in Additional file 1: Table S3. Table S1 (see Additional file 1) gives an overview of the five Arabidopsis mutants used in this study, the sizes of the fast neutron-induced deletions, the coordinates of the deletions, and the gene affected.
DNA extraction and microarray experiment
Genomic DNA was extracted from plant leaf material using a DNeasy Mini Kit (Qiagen). The yield and quality of the samples were checked by running the samples on a 1% agarose gel (data not shown), and 6 μg was sent to Roche NimbleGen’s custom microarray services facility. Microarray hybridizations and washings were performed using the standard NimbleGen protocol for CGH analyses (http://www.nimblegen.com/products/cgh/index.html).
Control DNA samples were extracted from A. thaliana Col-0 and used to normalize the FN1148 mutant hybridization signal; wild type A. thaliana Ler DNA was used to normalize the ga1-3 mutant hybridization signal; and the DNA of the DELLA-deficient progenitor line (mostly A. thaliana Ler background with a ~3 Mb chromosome 5 Col-0 segment) was used as the control to normalize the E99, E124 and E207 mutant hybridization signals.
We used Roche NimbleGen’s two-colour Arabidopsis CGH 3 × 720 K whole genome custom tiling arrays that feature empirically tested probes of 50–75 mer that provide improved data quality (i.e. signal-to-noise) compared with computationally selected probes (http://www.nimblegen.com/products/cgh). Previous studies have shown that microarrays with longer oligonucleotides (60–80 mer) provide significantly better detection sensitivity than those with shorter oligonucleotides (e.g. 25 or 30 mer) [26, 27, 45].
Using the A. thaliana TAIR8 annotated reference genome sequence (http://www.arabidopsis.org), partially overlapping probes of 50–75 bp were designed to represent the complete genome of this model plant at a staggering of 49 bp, where staggering is the distance between the 5′ ends of consecutive oligonucleotides. In addition to these ‘standard’ array probes staggered every 49 bp, extra probes were added with the purpose of representing the five genes AtGA1, AtHKT1, AtPHYB, AtHY1 and AtMAX2 at ‘ultra-high density’. To achieve this purpose, around 15,000 isothermal ‘ultra-high density’ probes (50–75 bp; see Additional file 1: Table S2) were designed to cover the genomic regions of each of the five genes from 500 bp 5′ of the gene’s start codon, continuing through the genic region (including both exonic and intronic sequences) and ending 500 bp 3′ of the stop codon.
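The effect of staggering on how many probes touch a small deletion can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: probes have a single fixed length of 60 bp (the real probes were isothermal and varied from 50 to 75 bp), the region coordinates and deletion position are invented, and the function names `tile_probe_starts` and `probes_overlapping` are hypothetical.

```python
def tile_probe_starts(region_start, region_end, stagger, probe_len):
    """Return the 5' start coordinates of probes tiled every `stagger` bp,
    keeping only probes that fit entirely inside the region (inclusive)."""
    last_start = region_end - probe_len + 1
    return list(range(region_start, last_start + 1, stagger))


def probes_overlapping(starts, probe_len, del_start, del_len):
    """Count probes that share at least one base with the deleted interval."""
    del_end = del_start + del_len - 1
    return sum(
        1
        for start in starts
        if start <= del_end and start + probe_len - 1 >= del_start
    )


# Hypothetical 5 kb region with a 4 bp deletion starting at position 2500
PROBE_LEN = 60  # assumed fixed length, for illustration only
standard_starts = tile_probe_starts(1, 5000, 49, PROBE_LEN)
dense_starts = tile_probe_starts(1, 5000, 2, PROBE_LEN)

standard_hits = probes_overlapping(standard_starts, PROBE_LEN, 2500, 4)
dense_hits = probes_overlapping(dense_starts, PROBE_LEN, 2500, 4)
print(standard_hits, dense_hits)
```

In this toy layout only a couple of 49 bp staggered probes touch the deletion, whereas the 2 bp staggering places dozens of probes over it, which is the reason the denser designs can satisfy a consecutive-probe criterion for very small deletions.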
Probes were designed to represent the larger deletions in the FN1148 (~500 bp AtHKT1 genic mutation) and ga1-3 (~5 kb AtGA1 genic mutation) Arabidopsis mutants with oligonucleotides staggered every 6 bp, while the extra probes representing the smaller deletions in the E124 (104 bp AtPHYB genic mutation), E99 (28 bp AtHY1 genic mutation) and E207 (4 bp AtMAX2 genic mutation) Arabidopsis mutants were staggered every 2, 6, 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47 and 49 bp (Additional file 1: Figure S1; Additional file 2: Tables S4–S22). To reduce the background noise of the ‘ultra-high density’ probes, probes that mapped perfectly to more than one genomic location were excluded. In addition, sequences that were unique to the Arabidopsis TAIR8 annotated reference genome were assigned randomized locations across the NimbleGen array.
Microarray hybridizations and data analysis
Test and reference control genomic DNAs were independently labelled with fluorescent dyes (mutant DNA was labelled with cyanine (Cy) 3 and control DNA labelled with Cy5), co-hybridized to the NimbleGen A. thaliana 2.1 M Whole-Genome CGH arrays, and scanned using a 5 μm scanner. Log2-ratios of the probe signal intensities (Cy3/Cy5) were calculated and plotted versus genomic position using Roche NimbleGen NimbleScan software. To determine the efficiency and to reduce costs associated with CGH-based deletion discovery, we used a single array per deletion experiment without replicate hybridizations.
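For readers less familiar with log2 ratios, the following minimal Python sketch shows the calculation on invented intensity values; it does not reproduce the NimbleScan normalization, and the function name `log2_ratio` and the numbers are assumptions of this example.

```python
import math


def log2_ratio(cy3_mutant, cy5_control):
    """Log2 of the mutant (Cy3) over control (Cy5) signal for one probe."""
    return math.log2(cy3_mutant / cy5_control)


# Hypothetical (Cy3, Cy5) intensity pairs for three probes
signals = [(1000.0, 1000.0), (250.0, 1000.0), (15.625, 1000.0)]
ratios = [log2_ratio(cy3, cy5) for cy3, cy5 in signals]
print(ratios)
```

Equal signals give a ratio of 0, a fourfold lower mutant signal gives −2, and a 64-fold lower mutant signal gives −6, which puts the ~ −2 and ~ −6 deletion signals reported in this study on a linear scale.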
The criteria used to identify deletions and CNVs by aCGH vary considerably between studies. Such mutations are commonly distinguished from low-level gains/losses by applying a direct threshold to the array data. However, the threshold value often differs greatly, ranging from a log2 ratio of +/−0.4 for some studies [46, 47] to as high as +/−1.0 for others [48, 49]. The criteria we used to detect deletions were based on the aCGH patterns obtained from our mutant versus wild type hybridizations. In our E124, E99 and E207 versus control hybridizations, probes that represented loci with an equal copy number had mean log2 normalized intensity ratios of 0.0 with an S.D. of 0.2. Based on this variation, on empirical analyses (examining the frequency distribution of log ratios for probes along known deleted and duplicated regions), and on previous studies [50, 51], we chose a probe ‘deletion profile’ threshold of 2 × S.D., i.e. +/−0.4. Indeed, of the total number of probes on the microarrays, only 2.73–4.97% from each mutant versus control hybridization gave log2 ratios above or below +/−0.4, suggesting that the probes on the microarray had a low level of noise. For comparison, we would expect about 5% of probes to exceed this threshold by chance if the log2 ratios are normally distributed.
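The expected chance fraction quoted above follows directly from the normal distribution, as the short Python check below shows. It assumes, as the text does, that log2 ratios of equal-copy-number probes are normally distributed with mean 0 and S.D. 0.2; the variable names are illustrative.

```python
import math

sd = 0.2                 # S.D. of log2 ratios for equal-copy-number probes
threshold = 2 * sd       # the 2 x S.D. 'deletion profile' threshold

# Two-sided tail probability P(|Z| > 2) for a standard normal variable,
# using the identity P(|Z| > k) = erfc(k / sqrt(2))
tail_fraction = math.erfc(2 / math.sqrt(2))

print(f"threshold = +/-{threshold:.1f}")
print(f"expected fraction beyond threshold = {100 * tail_fraction:.2f}%")
```

The two-sided tail beyond 2 S.D. is about 4.6%, so the observed 2.73–4.97% of probes beyond +/−0.4 is in line with what chance alone would produce.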
Availability of supporting data
The data sets supporting the results of this article are available in the NCBI GEO repository (accession number GSE55327 study at: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55327).
CGH: Comparative genomic hybridization
CNV: Copy number variation
NGS: Next generation sequencing
SNP: Single nucleotide polymorphism.
This work was supported by a Biotechnology and Biological Sciences Research Council grant (BB/F020759/1).
- Augenlicht LH, Kobrin D: Cloning and screening of sequences expressed in a mouse colon-tumor. Cancer Res. 1982, 42 (3): 1088-1093.PubMedGoogle Scholar
- Gregory BD, Yazaki J, Ecker JR: Utilizing tiling microarrays for whole-genome analysis in plants. Plant J. 2008, 53 (4): 636-644.PubMedView ArticleGoogle Scholar
- Stoughton RB: Applications of DNA microarrays in biology. Annu Rev Biochem. 2005, 74: 53-82. 10.1146/annurev.biochem.74.082803.133212.PubMedView ArticleGoogle Scholar
- Nagano AJ, Fukazawa M, Hayashi M, Ikeuchi M, Tsukaya H, Nishimura M, Hara-Nishimura I: AtMap1: a DNA microarray for genomic deletion mapping in Arabidopsis thaliana. Plant J. 2008, 56 (6): 1058-1065. 10.1111/j.1365-313X.2008.03656.x.PubMedView ArticleGoogle Scholar
- Metzker ML: Applications of next-generation sequencing technologies - the next generation. Nat Rev Genet. 2010, 11 (1): 31-46. 10.1038/nrg2626.PubMedView ArticleGoogle Scholar
- Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, et al: The B73 maize genome: complexity, diversity, and dynamics. Science. 2009, 326 (5956): 1112-1115. 10.1126/science.1178534.PubMedView ArticleGoogle Scholar
- Winfield MO, Wilkinson PA, Allen AM, Barker GLA, Coghill JA, Burridge A, Hall A, Brenchley RC, D’Amore R, Hall N, Brenchley RC, D'Amore R, Hall N, Bevan MW, Richmond T, Gerhardt DJ, Jeddeloh JA, Edwards KJ: Targeted re-sequencing of the allohexaploid wheat exome. Plant Biotechnol J. 2012, 10 (6): 733-742. 10.1111/j.1467-7652.2012.00713.x.PubMedView ArticleGoogle Scholar
- Fu Y, Springer NM, Gerhardt DJ, Ying K, Yeh CT, Wu W, Swanson-Wagner R, D’Ascenzo M, Millard T, Freeberg L, Aoyama N, Kitzman J, Burgess D, Richmond T, Albert TJ, Barbazuk WB, Jeddeloh JA, Schnable PS: Repeat subtraction-mediated sequence capture from a complex genome. Plant J. 2010, 62 (5): 898-909. 10.1111/j.1365-313X.2010.04196.x.PubMedView ArticleGoogle Scholar
- Reumers J, De Rijk P, Zhao H, Liekens A, Smeets D, Cleary J, Van Loo P, Van Den Bossche M, Catthoor K, Sabbe B, Despierre E, Vergote I, Hilbush B, Lambrechts D, Del-Favero J: Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing. Nat Biotechnol. 2012, 30 (1): 61-68.View ArticleGoogle Scholar
- Gan XC, Stegle O, Behr J, Steffen JG, Drewe P, Hildebrand KL, Lyngsoe R, Schultheiss SJ, Osborne EJ, Sreedharan VT, et al: Multiple reference genomes and transcriptomes for Arabidopsis thaliana. Nature. 2011, 477 (7365): 419-423. 10.1038/nature10414.
- Gong JM, Waner DA, Horie T, Li SL, Horie R, Abid KB, Schroeder JI: Microarray-based rapid cloning of an ion accumulation deletion mutant in Arabidopsis thaliana. Proc Natl Acad Sci U S A. 2004, 101 (43): 15404-15409. 10.1073/pnas.0404780101.
- Alonso JM, Ecker JR: Moving forward in reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev Genet. 2006, 7 (7): 524-536. 10.1038/nrg1893.
- Belfield EJ, Gan X, Mithani A, Brown C, Jiang C, Franklin K, Alvey E, Wibowo A, Jung M, Bailey K, Kalwani S, Ragoussis J, Mott R, Harberd NP: Genome-wide analysis of mutations in mutant lineages selected following fast-neutron irradiation mutagenesis of Arabidopsis thaliana. Genome Res. 2012, 22 (7): 1306-1315. 10.1101/gr.131474.111.
- Rogers C, Wen JQ, Chen RJ, Oldroyd G: Deletion-based reverse genetics in Medicago truncatula. Plant Physiol. 2009, 151 (3): 1077-1086. 10.1104/pp.109.142919.
- Gilchrist E, Haughn G: Reverse genetics techniques: engineering loss and gain of gene function in plants. Brief Funct Genomics. 2010, 9 (2): 103-110. 10.1093/bfgp/elp059.
- Bolon Y-T, Haun WJ, Xu WW, Grant D, Stacey MG, Nelson RT, Gerhardt DJ, Jeddeloh JA, Stacey G, Muehlbauer GJ, Orf JH, Naeve SL, Stupar RM, Vance CP: Phenotypic and genomic analyses of a fast neutron mutant population resource in soybean. Plant Physiol. 2011, 156: 240-253. 10.1104/pp.110.170811.
- Greene EA, Codomo CA, Taylor NE, Henikoff JG, Till BJ, Reynolds SH, Enns LC, Burtner C, Johnson JE, Odden AR, Comai L, Henikoff S: Spectrum of chemically induced mutations from a large-scale reverse-genetic screen in Arabidopsis. Genetics. 2003, 164 (2): 731-740.
- Mulle JG, Patel VC, Warren ST, Hegde MR, Cutler DJ, Zwick ME: Empirical evaluation of oligonucleotide probe selection for DNA microarrays. PLoS One. 2010, 5 (3): e9921. 10.1371/journal.pone.0009921.
- Laubinger S, Zeller G, Henz SR, Sachsenberg T, Widmer CK, Naouar N, Vuylsteke M, Scholkopf B, Ratsch G, Weigel D: At-TAX: a whole genome tiling array resource for developmental expression analysis and transcript identification in Arabidopsis thaliana. Genome Biol. 2008, 9 (7): 1-16.
- Love CG, Graham NS, Ó Lochlainn S, Bowen HC, May ST, White PJ, Broadley MR, Hammond JP, King GJ: A Brassica exon array for whole-transcript gene expression profiling. PLoS One. 2010, 5 (9): e12812. 10.1371/journal.pone.0012812.
- Cox WG, Beaudet MP, Agnew JY, Ruth JL: Possible sources of dye-related signal correlation bias in two-color DNA microarray assays. Anal Biochem. 2004, 331 (2): 243-254. 10.1016/j.ab.2004.05.010.
- Yu JD, Othman MI, Farjo R, Zareparsi S, MacNee SP, Yoshida S, Swaroop A: Evaluation and optimization of procedures for target labeling and hybridization of cDNA microarrays. Mol Vis. 2002, 8 (17): 130-137.
- Sun TP, Kamiya Y: The Arabidopsis Ga1 locus encodes the cyclase ent-kaurene synthetase A of gibberellin biosynthesis. Plant Cell. 1994, 6 (10): 1509-1518.
- Galizia EC, Srikantha M, Palmer R, Waters JJ, Lench N, Ogilvie CM, Kasperaviciute D, Nashef L, Sisodiya SM: Array comparative genomic hybridization: Results from an adult population with drug-resistant epilepsy and co-morbidities. Eur J Med Genet. 2012, 55 (5): 342-348. 10.1016/j.ejmg.2011.12.011.
- Bruce M, Hess A, Bai JF, Mauleon R, Diaz MG, Sugiyama N, Bordeos A, Wang GL, Leung H, Leach JE: Detection of genomic deletions in rice using oligonucleotide microarrays. BMC Genomics. 2009, 10: 1-11.
- Relogio A, Schwager C, Richter A, Ansorge W, Valcarcel J: Optimization of oligonucleotide-based DNA microarrays. Nucleic Acids Res. 2002, 30 (11): e51. 10.1093/nar/30.11.e51.
- Chou CC, Chen CH, Lee TT, Peck K: Optimization of probe length and the number of probes per gene for optimal microarray analysis of gene expression. Nucleic Acids Res. 2004, 32 (12): e99. 10.1093/nar/gnh099.
- May M: The clinical aspirations of microarrays. Science. 2013, 339 (6121): 858-860. 10.1126/science.339.6121.858.
- Zhao XJ, Weir BA, LaFramboise T, Lin M, Beroukhim R, Garraway L, Beheshti J, Lee JC, Naoki K, Richards WG, Sugarbaker D, Chen F, Rubin MA, Jänne PA, Girard L, Minna J, Christiani D, Li C, Sellers WR, Meyerson M: Homozygous deletions and chromosome amplifications in human lung carcinomas revealed by single nucleotide polymorphism array analysis. Cancer Res. 2005, 65 (13): 5561-5570. 10.1158/0008-5472.CAN-04-4603.
- Hegde MR, Chin ELH, Mulle JG, Okou DT, Warren ST, Zwick ME: Microarray-based mutation detection in the dystrophin gene. Hum Mutat. 2008, 29 (9): 1091-1099. 10.1002/humu.20831.
- Narumi S, Numakura C, Shiihara T, Seiwa C, Nozaki Y, Yamagata T, Momoi MY, Watanabe Y, Yoshino M, Matsuishi T, Nishi E, Kawame H, Akahane T, Nishimura G, Emi M, Hasegawa T: Various types of LRP5 mutations in four patients with osteoporosis-pseudoglioma syndrome: identification of a 7.2-kb microdeletion using oligonucleotide tiling microarray. Am J Med Genet A. 2010, 152A (1): 133-140. 10.1002/ajmg.a.33177.
- Conrad DF, Pinto D, Redon R, Feuk L, Gokcumen O, Zhang Y, Aerts J, Andrews TD, Barnes C, Campbell P, Fitzgerald T, Hu M, Ihm CH, Kristiansson K, Macarthur DG, Macdonald JR, Onyiah I, Pang AW, Robson S, Stirrups K, Valsesia A, Walter K, Wei J, Wellcome Trust Case Control Consortium, Tyler-Smith C, Carter NP, Lee C, Scherer SW, Hurles ME: Origins and functional impact of copy number variation in the human genome. Nature. 2010, 464 (7289): 704-712. 10.1038/nature08516.
- Li M, Schroeder R, Ko A, Stoneking M: Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs. Nucleic Acids Res. 2012, 40 (18): e137. 10.1093/nar/gks499.
- Treangen TJ, Salzberg SL: Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet. 2012, 13 (2): 36-46.
- Brenchley R, Spannagl M, Pfeifer M, Barker GLA, D’Amore R, Allen AM, McKenzie N, Kramer M, Kerhornou A, Bolser D, Kay S, Waite D, Trick M, Bancroft I, Gu Y, Huo N, Luo MC, Sehgal S, Gill B, Kianian S, Anderson O, Kersey P, Dvorak J, McCombie WR, Hall A, Mayer KF, Edwards KJ, Bevan MW, Hall N: Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature. 2012, 491 (7426): 705-710. 10.1038/nature11650.
- Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, Shaffer T, Wong M, Bhattacharjee A, Eichler EE, Bamshad M, Nickerson DA, Shendure J: Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009, 461 (7261): 272-U153. 10.1038/nature08250.
- Bilguvar K, Ozturk AK, Louvi A, Kwan KY, Choi M, Tatli B, Yalnizoglu D, Tuysuz B, Caglayan AO, Gokben S, Kaymakçalan H, Barak T, Bakircioğlu M, Yasuno K, Ho W, Sanders S, Zhu Y, Yilmaz S, Dinçer A, Johnson MH, Bronen RA, Koçer N, Per H, Mane S, Pamir MN, Yalçinkaya C, Kumandaş S, Topçu M, Ozmen M, Sestan N, et al: Whole-exome sequencing identifies recessive WDR62 mutations in severe brain malformations. Nature. 2010, 467 (7312): 207-210. 10.1038/nature09327.
- Painter JN, Tapanainen H, Somer M, Tukiainen P, Aittomaki K: A 4-bp deletion in the Birt-Hogg-Dube gene (FLCN) causes dominantly inherited spontaneous pneumothorax. Am J Hum Genet. 2005, 76 (3): 522-527. 10.1086/428455.
- Friedl W, Kruse R, Uhlhaas S, Stolte M, Schartmann B, Keller KM, Jungck M, Stern M, Loff S, Back W, Propping P, Jenne DE: Frequent 4-bp deletion in exon 9 of the SMAD4/MADH4 gene in familial juvenile polyposis patients. Genes Chromosomes Cancer. 1999, 25 (4): 403-406. 10.1002/(SICI)1098-2264(199908)25:4<403::AID-GCC15>3.0.CO;2-P.
- Koornneef M, Vanderveen JH: Induction and analysis of gibberellin sensitive mutants in Arabidopsis thaliana (L.) Heynh. Theor Appl Genet. 1980, 58 (6): 257-263. 10.1007/BF00265176.
- Koornneef M, Elgersma A, Hanhart CJ, Vanloenenmartinet EP, Vanrijn L, Zeevaart JAD: A gibberellin insensitive mutant of Arabidopsis thaliana. Physiol Plant. 1985, 65 (1): 33-39. 10.1111/j.1399-3054.1985.tb02355.x.
- Achard P, Cheng H, De Grauwe L, Decat J, Schoutteten H, Moritz T, Van Der Straeten D, Peng J, Harberd NP: Integration of plant responses to environmentally activated phytohormonal signals. Science. 2006, 311 (5757): 91-94. 10.1126/science.1118642.
- Sharrock RA, Quail PH: Novel phytochrome sequences in Arabidopsis thaliana - structure, evolution, and differential expression of a plant regulatory photoreceptor family. Genes Dev. 1989, 3 (11): 1745-1757. 10.1101/gad.3.11.1745.
- Stirnberg P, van de Sande K, Leyser HMO: MAX1 and MAX2 control shoot lateral branching in Arabidopsis. Development. 2002, 129 (5): 1131-1141.
- Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, Kobayashi S, Davis C, Dai H, He YD, Stephaniants SB, Cavet G, Walker WL, West A, Coffey E, Shoemaker DD, Stoughton R, Blanchard AP, Friend SH, Linsley PS: Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol. 2001, 19 (4): 342-347. 10.1038/86730.
- Honda S, Satomura S, Hayashi S, Imoto I, Nakagawa E, Goto Y, Inazawa J, Japanese Mental Retardation Consortium: Concomitant microduplications of MECP2 and ATRX in male patients with severe mental retardation. J Hum Genet. 2012, 57 (1): 73-77. 10.1038/jhg.2011.131.
- Tonon G, Wong KK, Maulik G, Brennan C, Feng B, Zhang Y, Khatry DB, Protopopov A, You MJ, Aguirre AJ, Martin ES, Yang Z, Ji H, Chin L, Depinho RA: High-resolution genomic profiles of human lung cancer. Proc Natl Acad Sci U S A. 2005, 102 (27): 9625-9630. 10.1073/pnas.0504126102.
- Garnis C, Lockwood WW, Vucic E, Ge Y, Girard L, Minna JD, Gazdar AF, Lam S, MacAulay C, Lam WL: High resolution analysis of non-small cell lung cancer cell lines by whole genome tiling path array CGH. Int J Cancer. 2006, 118 (6): 1556-1564. 10.1002/ijc.21491.
- Santuari L, Pradervand S, Amiguet-Vercher AM, Thomas J, Dorcey E, Harshman K, Xenarios I, Juenger TE, Hardtke CS: Substantial deletion overlap among divergent Arabidopsis genomes revealed by intersection of short reads and tiling arrays. Genome Biol. 2010, 11 (1): R4. 10.1186/gb-2010-11-1-r4.
- van Duin M, van Marion R, Watson JEV, Paris PL, Lapuk A, Brown N, Oseroff VV, Albertson DG, Pinkel D, de Jong P, Nacheva EP, Dinjens W, van Dekken H, Collins C: Construction and application of a full-coverage high-resolution, human chromosome 8q genomic microarray for comparative genomic hybridization. Cytometry A. 2005, 63A (1): 10-19. 10.1002/cyto.a.20102.
- Hanemaaijer NM, Sikkema-Raddatz B, van der Vries G, Dijkhuizen T, Hordijk R, van Essen AJ, Veenstra-Knol HE, Kerstjens-Frederikse WS, Herkert JC, Gerkes EH, Leegte LK, Kok K, Sinke RJ, van Ravenswaaij-Arts CM: Practical guidelines for interpreting copy number gains detected by high-resolution array in routine diagnostics. Eur J Hum Genet. 2012, 20 (2): 161-165. 10.1038/ejhg.2011.174.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.