- Methodology article
- Open Access
A comparative analysis of DNA barcode microarray feature size
BMC Genomics volume 10, Article number: 471 (2009)
Microarrays are an invaluable tool in many modern genomic studies. It is generally perceived that decreasing the size of microarray features leads to arrays with higher resolution (due to greater feature density), but this increase in resolution can compromise sensitivity.
We demonstrate that barcode microarrays with smaller features are equally capable of detecting variation in DNA barcode intensity when compared to larger feature sizes within a specific microarray platform. The barcodes used in this study are the well-characterized set derived from the Yeast KnockOut (YKO) collection used for screens of pooled yeast (Saccharomyces cerevisiae) deletion mutants. We treated these pools with the glycosylation inhibitor tunicamycin as a test compound. Three generations of barcode microarrays at 30, 8 and 5 μm feature sizes independently identified the primary target of tunicamycin to be ALG7.
We show that the data obtained with 5 μm feature size is of comparable quality to the 30 μm size and propose that further shrinking of features could yield barcode microarrays with equal or greater resolving power and, more importantly, higher density.
Genome-wide studies often measure changes in the abundance of all gene products over a period of time or under varying conditions. Microarrays have made these studies possible by enabling researchers to monitor all known genes of an organism simultaneously to detect patterns of gene activity, alternative splicing variants, the presence of single nucleotide polymorphisms, the presence of copy number variants and DNA binding sites of diverse proteins, among others. One application of microarrays that our laboratory has focused on is the parallel identification of individual molecular barcoded gene deletion mutants grown competitively in pools [6, 7]. Through the efforts of the Yeast Deletion Consortium, a Yeast KnockOut (YKO) collection was constructed consisting of approximately 6,000 heterozygous gene deletions (>96% of all annotated open reading frames), of which over 1,100 are known to be essential for growth. The remaining ~5,000 genes are nonessential and were also created as homozygous deletions and as MATa and MATα deletion collections. These collections were made by systematic replacement of each gene from start to stop codon by mitotic recombination with a molecular barcoded resistance cassette. Each cassette contains both an upstream barcode (uptag) and a downstream barcode (downtag) that differ in their 20-mer sequence. Drug sensitivity assays, combined with DNA barcode microarrays, were able to reveal genomic profiles both for the drug's targets, through HaploInsufficiency Profiling (HIP), and for pathways that buffer the drug target pathway, through HOmozygous deletion Profiling (HOP) [8, 9].
Microarrays are made up of thousands to millions of microscopic "features", clusters of identical oligonucleotide probes, which are used to detect hybridized gene products. The microarrays used for HIPHOP assays have gone through several iterations of development, beginning with a feature size of 103 μm on the TAG1 array, which consisted of 20 bp (base pair) probes [6, 8]. The S. cerevisiae cassette was originally designed for detection on this TAG1 microarray. Current Affymetrix microarrays use up to 25 bp probes to detect complementary DNA sequences, and this length is more appropriate for newer barcoded collections as it improves hybridization specificity and increases the number of resolvable potential barcodes. The features on these chips were subsequently miniaturized to 30 μm and provided full deletion pool coverage on the TAG3 array (P/N 510318). The current TAG4 chips (P/N 511331) with 8 μm feature sizes were designed for improved performance and affordability. This scheme omitted uninformative probes present on previous tag arrays and added five replicates to report non-uniform hybridization and allow adjustment of intensities accordingly. No smaller yeast deletion pool barcode microarray exists due to manufacturing size constraints; however, these barcode probes are also present on the 5 μm yeast whole genome tiling array (S288c genome tiling microarray; P/N 520055), representing 0.25% of the total 6.5 million probes on this array. The area of a feature scales quadratically with its edge length, such that the tiling array features at 5 μm on a side correspond to 25 μm², and TAG3 features at 30 μm on a side correspond to 900 μm², or 36 times the area of the tiling features. It is important to note that all arrays have the same oligonucleotide probe density of approximately 4,000 probes/μm² (personal communication with Affymetrix technical support).
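The area arithmetic above can be checked with a short Python sketch. It assumes square features and takes the approximate probe density quoted in the text at face value; the function and constant names are illustrative only and do not come from any Affymetrix software.

```python
# Feature edge lengths (micrometres) for the three array generations in the text.
FEATURE_EDGE_UM = {"TAG3": 30.0, "TAG4": 8.0, "tiling": 5.0}

# Approximate oligonucleotide density quoted in the text (probes per square micrometre).
OLIGO_DENSITY_PER_UM2 = 4000.0


def feature_area_um2(edge_um: float) -> float:
    """Area of a square feature, in square micrometres (assumes square features)."""
    return edge_um ** 2


def oligos_per_feature(edge_um: float, density: float = OLIGO_DENSITY_PER_UM2) -> float:
    """Rough number of oligonucleotide molecules in one feature at the given density."""
    return feature_area_um2(edge_um) * density


if __name__ == "__main__":
    for name, edge in FEATURE_EDGE_UM.items():
        area = feature_area_um2(edge)
        print(f"{name}: {area:.0f} um^2, roughly {oligos_per_feature(edge):,.0f} oligos per feature")
    ratio = feature_area_um2(FEATURE_EDGE_UM["TAG3"]) / feature_area_um2(FEATURE_EDGE_UM["tiling"])
    print(f"TAG3 / tiling area ratio: {ratio:.0f}")
```

Because the density is the same on all arrays, the number of oligonucleotides available per feature falls in proportion to feature area, which is why a loss of sensitivity at smaller feature sizes might be expected.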
Yeast deletion pools were thawed from frozen stocks and heterozygote essential gene deletion mutants were grown for 20 generations, while homozygous deletion mutants were grown for 5 generations as described. After growth, heterozygous essential deletion mutants were mixed with correspondingly treated homozygous non-essential deletion mutants. Genomic DNA was isolated and molecular barcodes amplified by PCR. Amplicons were then hybridized to microarrays overnight, washed, stained and scanned the following day. For further details regarding sample preparation and data analysis, consult Pierce et al. and Hoon et al.
We performed a HIPHOP screen (pooled heterozygous essential strains and homozygous deletion non-essential strains) with tunicamycin treatment (IC10-20 = 0.35 μM). Tunicamycin is a known glycosylation inhibitor, targeting the yeast essential gene ALG7 [15–17], which encodes UDP-N-acetyl-glucosamine-1-P transferase, a vital protein in the dolichol pathway of protein asparagine-linked glycosylation [18, 19]. Upon treatment with tunicamycin, unfolded proteins remain in the ER (endoplasmic reticulum). A sample treated with 2% DMSO was used as a control. Yeast pools were grown in liquid culture in 48-well plates in a shaking spectrophotometer interfaced to liquid handling robots. After the cells had grown for the desired number of generations, corresponding to a specific optical density (OD), they were robotically harvested. Genomic DNA was isolated from each pool, and the DNA barcodes were amplified by PCR using common primers. These barcodes were subsequently hybridized to three generations of barcode microarrays: the aforementioned TAG3, TAG4 and S. cerevisiae whole genome tiling arrays. Each chip was prepared using the optimal hybridization and wash/stain protocols recommended for that array type. Deletion strain abundance was resolved by averaging scanned downtag and uptag intensities for each strain and comparing intensities between the tunicamycin-treated pool and the DMSO-treated pool (see Additional File 1).
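The final comparison step can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study (see Additional File 1 and the cited protocols for that). It assumes the ratio is taken as control over treatment, so that strains depleted by the drug score positive, and it assumes a small intensity floor to avoid taking the logarithm of zero. The function names and example intensities are hypothetical.

```python
import math
from typing import Tuple

Tags = Tuple[float, float]  # (uptag intensity, downtag intensity) for one strain


def strain_signal(tags: Tags) -> float:
    """Average the uptag and downtag intensities for one strain."""
    uptag, downtag = tags
    return (uptag + downtag) / 2.0


def log2_ratio(control: Tags, treated: Tags, floor: float = 1.0) -> float:
    """log2(control / treated) of the averaged tag signals.

    The control-over-treatment orientation and the floor value are
    assumptions made for this sketch, not details taken from the text.
    """
    c = max(strain_signal(control), floor)
    t = max(strain_signal(treated), floor)
    return math.log2(c / t)


if __name__ == "__main__":
    # Hypothetical intensities: the strain drops fourfold under treatment.
    dmso = (800.0, 1200.0)
    tunicamycin = (200.0, 300.0)
    print(log2_ratio(dmso, tunicamycin))
```

Under this convention a strain whose averaged signal falls fourfold under treatment scores 2, and a strain unaffected by the drug scores close to 0.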
Results and Discussion
All three microarray generations, the TAG3, TAG4 and S. cerevisiae whole genome tiling arrays, identified ALG7 as the primary target of tunicamycin, as expected (Figure 1). The tiling array also identified several other genes as additional potential targets. This list of targets includes ADO1, FYV8, GET2, HAC1 and IRE1, all of which have been shown to be sensitive to tunicamycin when knocked out, as well as BCK1, a gene which has previously been shown to be resistant to tunicamycin when overexpressed [19, 21–24]. In particular, ADO1 is a prime example of a gene deletion strain exhibiting increased sensitivity on the tiling array, since it is detected at a log2 ratio of 2.59 in the tiling array data, but at 0.50 and 0.66 in the TAG3 and TAG4 data, respectively. In addition to known sensitive strains, our screen identified COP1 and RER2, which are involved in ER to Golgi vesicle-mediated transport (see Table 1 for summary of sensitive strains) [25, 26]. As with most sensitive strains, these genes were detected at slightly higher levels on the tiling array than on the other array generations. The tiling array appears to have slightly higher variance in its log2 ratios than the other arrays (standard deviation of 0.58 in tiling, compared to 0.37 and 0.43 in TAG4 and TAG3 arrays, respectively). We attribute this to its increased sensitivity to hybridized barcode abundance, since strains that appear sensitive on the tiling array sometimes fall into the background signal of the other arrays, as with ADO1. It is reassuring to observe both the primary target of tunicamycin and genes annotated as sensitive to tunicamycin in our results. We also identified genes associated with the endoplasmic reticulum and involved in the unfolded protein response, which is consistent with tunicamycin promoting protein misfolding.
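To make the relationship between the spread of log2 ratios and the calling of sensitive strains concrete, the following Python sketch computes the standard deviation of a profile and flags strains that lie far above the median. The three-standard-deviation cut-off is an arbitrary assumption chosen for illustration, not the criterion used for Figure 1 or Table 1, and the profile values are invented.

```python
import statistics
from typing import Dict, List, Tuple


def flag_sensitive(
    log2_ratios: Dict[str, float], n_sd: float = 3.0
) -> Tuple[List[Tuple[str, float]], float]:
    """Return strains more than n_sd standard deviations above the median,
    sorted from most to least sensitive, together with the profile's SD."""
    values = list(log2_ratios.values())
    center = statistics.median(values)
    spread = statistics.stdev(values)  # sample standard deviation
    hits = [(s, r) for s, r in log2_ratios.items() if r - center > n_sd * spread]
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits, spread


# Hypothetical profile: twenty background strains near zero and one strong hit.
profile = {f"strain_{i:02d}": (0.1 if i % 2 else -0.1) for i in range(20)}
profile["ALG7"] = 5.0

hits, spread = flag_sensitive(profile)

if __name__ == "__main__":
    print(f"SD of log2 ratios: {spread:.2f}")
    print("flagged:", hits)
```

A profile with a larger standard deviation raises the bar for every strain, so a platform with wider spread only helps if true targets move further from the background than the spread grows.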
Because the tiling array has millions of probes, only a few thousand of which are barcode probes, we hypothesized that non-specific hybridization of barcode DNA to the genome tiling probes could contribute to noise in target identification. This was a concern because the tiling probes were not designed for use alongside the barcode probes, which could lead to unanticipated cross-hybridization of barcode samples to tiling probe features. To determine if non-specific binding was a factor in our experiments, we co-hybridized barcode DNA with unlabeled digested genomic DNA (gDNA). The digested gDNA (20-150 bp) competitively hybridized to tiling probes of the array to which barcodes may have had a non-specific affinity. We asked whether adding gDNA, by making the millions of tiling probes unavailable for barcode hybridization, could increase specific binding of barcodes to barcode probes and so yield a HIPHOP profile with greater dynamic range and more distinct targets. This approach is analogous to the addition of salmon or herring sperm DNA to a Southern blot to prevent non-specific hybridization [27, 28]. However, in practice, we found that the addition of gDNA did not improve resolution of the target ALG7 when compared to a microarray without competitive gDNA co-hybridization (Additional File 2).
Our initial experiments used protocols for each microarray that were optimized for that particular technology. For example, each array type has particular hybridization, washing and staining protocols. To minimize the effect of these subtle variations and to accurately compare intensity data across array generations, we hybridized a reference sample (treated with 2% DMSO) to TAG3, TAG4 and tiling microarrays and applied TAG4 wash protocols to each array type. The hybridization conditions were fixed so that any changes we observed could be attributed to feature size rather than to protocol variation. We scanned the microarrays following this protocol, then applied the tiling array antibody stain wash step to all three chips and scanned them again. In this manner, each array was treated identically. In general, we observed that median downtag intensity was higher than median uptag intensity (Figure 2), an observation that was also reported by Pierce et al. [11, 14]. In addition, the median intensities differed across generations, with TAG3 intensity lower than TAG4 intensity, which was in turn lower than tiling intensity.
We found that TAG4 and tiling array intensities were very highly correlated (Tables 2 and 3; example in Figure 3). This correlation increased slightly once the arrays had been antibody stained during the tiling wash protocol. In contrast, TAG3 intensities did not correlate as well with either TAG4 or tiling, and this decreased significantly after antibody staining. However, this low correlation is unlikely to affect identification of drug targets on TAG3 arrays, as these strains are often the most distinguishable from the background, as shown previously (Figure 1).
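A cross-platform comparison of this kind can be sketched in Python as follows. The text does not state which correlation coefficient or intensity transform underlies Tables 2 and 3, so the choice of a Pearson correlation on log2-transformed intensities is an assumption made for this sketch, and the example values are invented.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def log2_intensity_correlation(array_a: Sequence[float], array_b: Sequence[float]) -> float:
    """Correlate per-barcode intensities from two arrays on a log2 scale.

    Both inputs must list the same barcodes in the same order and
    contain strictly positive intensities.
    """
    return pearson([math.log2(v) for v in array_a], [math.log2(v) for v in array_b])


if __name__ == "__main__":
    # Hypothetical intensities for four barcodes on two platforms.
    platform_a = [100.0, 200.0, 400.0, 800.0]
    platform_b = [50.0, 100.0, 200.0, 400.0]
    print(log2_intensity_correlation(platform_a, platform_b))
```

Note that a high correlation is compatible with one platform reading uniformly brighter than another, as in the hypothetical example, because correlation is insensitive to a constant scale factor.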
The relatively recent design of the TAG4 microarray includes five replicates of each barcode probe. However, we noticed that intensity values do not vary greatly between these replicates, and we therefore suggest that a minimum of three replicates be included to allow for appropriate trimmed mean calculations and masking of unusable barcode probes. This finding confirms an earlier assertion by Pierce et al. that the minimum number of replicates required to achieve high correlation is three, and that the increase in correlation from the fourth and fifth replicates is marginal. Although the TAG3 and tiling results contain only single data points for each barcode and are able to determine ALG7 as the primary target of tunicamycin (Figure 1), replicate data points are advised to accommodate hybridization, washing and staining inconsistencies.
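The role of replicates in trimmed mean calculation and masking can be illustrated with a small Python sketch. It assumes that masked (unusable) replicates are represented as None, that one value is trimmed from each end, and that a barcode is reported only when at least three replicates remain usable. These choices are illustrative and are not taken from the TAG4 analysis software.

```python
from typing import Optional, Sequence


def trimmed_mean(
    replicates: Sequence[Optional[float]], trim: int = 1, min_usable: int = 3
) -> Optional[float]:
    """Trimmed mean of replicate intensities, ignoring masked (None) values.

    Returns None when fewer than min_usable replicates survive masking.
    Drops the `trim` lowest and `trim` highest values when enough remain.
    """
    usable = sorted(v for v in replicates if v is not None)
    if len(usable) < min_usable:
        return None
    if len(usable) > 2 * trim:
        usable = usable[trim:len(usable) - trim]
    return sum(usable) / len(usable)


if __name__ == "__main__":
    # Hypothetical replicate intensities; 5000.0 mimics a hybridization artifact.
    print(trimmed_mean([100.0, 110.0, 90.0, 5000.0, 105.0]))
    # Two replicates masked: three remain, so the middle value is reported.
    print(trimmed_mean([100.0, None, 90.0, 120.0, None]))
    # Three replicates masked: too few remain to report a value.
    print(trimmed_mean([100.0, None, None, 120.0, None]))
```

With only three usable replicates and one value trimmed from each end, the trimmed mean reduces to the median, which shows why three is the smallest number of replicates at which trimming still leaves a value to report.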
Here we present a systematic comparison of the behavior of 12,000 20 bp barcode probes at three feature sizes. Counter to our expectation, we found that the smallest features, representing less than 1/30 the area of the largest features, perform best in terms of signal intensity and in their ability to identify drug targets in complex pooled assays. We show that microarrays with reduced feature size are equally able to assess DNA barcode abundance when compared to barcode microarrays with larger features. Increased sensitivity was also observed on arrays with smaller features, which identified a previously described target of tunicamycin with greater confidence than the microarrays with larger features.
A widely held opinion is that next generation DNA sequencing technologies will replace microarrays in gene product detection. However, microarrays can still increase genome coverage by decreasing feature sizes to as small as 1 μm because current microarray scanners can detect probe intensities at sub-μm resolution. In theory, such reductions in feature size could yield microarrays with approximately 162 million probes/chip (a 25-fold increase over the 6.5 million obtained using 5 μm features). Such probe densities would rival next generation sequencing technologies in terms of genome coverage.
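A back-of-the-envelope version of this projection can be written in Python. It assumes that the usable array area stays fixed and that the number of features scales with the inverse of feature area. Both are simplifications introduced for this sketch, and the function name is illustrative.

```python
def projected_feature_count(
    current_count: float, current_edge_um: float, new_edge_um: float
) -> float:
    """Scale a feature count to a new edge length at constant array area.

    Assumes square features tiled over the same area, so the count scales
    with (current_edge / new_edge) squared.
    """
    if current_edge_um <= 0 or new_edge_um <= 0:
        raise ValueError("edge lengths must be positive")
    return current_count * (current_edge_um / new_edge_um) ** 2


if __name__ == "__main__":
    # 6.5 million features at 5 um, projected to 1 um features.
    projected = projected_feature_count(6.5e6, 5.0, 1.0)
    print(f"{projected / 1e6:.1f} million features")
```

Under these assumptions, moving from 5 μm to 1 μm features multiplies the feature count by 25.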
Affymetrix microarray library files for the TAG3, TAG4 and tiling arrays are available at http://chemogenomics.med.utoronto.ca.
The supplementary figure displays the tiling array profiles when the DMSO and tunicamycin treatment chips are hybridized with the barcodes alone or with the addition of gDNA.
Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, et al: Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996, 14 (13): 1675-1680. 10.1038/nbt1296-1675.
Shoemaker DD, Schadt EE, Armour CD, He YD, Garrett-Engele P, McDonagh PD, Loerch PM, Leonardson A, Lum PY, Cavet G, et al: Experimental annotation of the human genome using microarray technology. Nature. 2001, 409 (6822): 922-927. 10.1038/35057141.
Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, et al: Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science. 1998, 280 (5366): 1077-1082. 10.1126/science.280.5366.1077.
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, et al: Global variation in copy number in the human genome. Nature. 2006, 444 (7118): 444-454. 10.1038/nature05329.
Lieb JD, Liu X, Botstein D, Brown PO: Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nat Genet. 2001, 28 (4): 327-334. 10.1038/ng569.
Shoemaker DD, Lashkari DA, Morris D, Mittmann M, Davis RW: Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat Genet. 1996, 14 (4): 450-456. 10.1038/ng1296-450.
Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, et al: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418 (6896): 387-391. 10.1038/nature00935.
Giaever G, Shoemaker DD, Jones TW, Liang H, Winzeler EA, Astromoff A, Davis RW: Genomic profiling of drug sensitivities via induced haploinsufficiency. Nat Genet. 1999, 21 (3): 278-283. 10.1038/6791.
Giaever G, Flaherty P, Kumm J, Proctor M, Nislow C, Jaramillo DF, Chu AM, Jordan MI, Arkin AP, Davis RW: Chemogenomic profiling: identifying the functional interactions of small molecules in yeast. Proc Natl Acad Sci USA. 2004, 101 (3): 793-798. 10.1073/pnas.0307490100.
Xu Q, Schlabach MR, Hannon GJ, Elledge SJ: Design of 240,000 orthogonal 25mer DNA barcode probes. Proc Natl Acad Sci USA. 2009, 106 (7): 2289-2294. 10.1073/pnas.0812506106.
Pierce SE, Fung EL, Jaramillo DF, Chu AM, Davis RW, Nislow C, Giaever G: A unique and universal molecular barcode array. Nat Methods. 2006, 3 (8): 601-603. 10.1038/nmeth905.
Juneau K, Palm C, Miranda M, Davis RW: High-density yeast-tiling array reveals previously undiscovered introns and extensive regulation of meiotic splicing. Proc Natl Acad Sci USA. 2007, 104 (5): 1522-1527. 10.1073/pnas.0610354104.
Hoon S, Smith AM, Wallace IM, Suresh S, Miranda M, Fung E, Proctor M, Shokat KM, Zhang C, Davis RW, et al: An integrated platform of genomic assays reveals small-molecule bioactivities. Nat Chem Biol. 2008, 4 (8): 498-506. 10.1038/nchembio.100.
Pierce SE, Davis RW, Nislow C, Giaever G: Genome-wide analysis of barcoded Saccharomyces cerevisiae gene-deletion mutants in pooled cultures. Nat Protoc. 2007, 2 (11): 2958-2974. 10.1038/nprot.2007.427.
Barnes G, Hansen WJ, Holcomb CL, Rine J: Asparagine-linked glycosylation in Saccharomyces cerevisiae: genetic analysis of an early step. Mol Cell Biol. 1984, 4 (11): 2381-2388.
Kukuruzinska MA, Lennon K: Diminished activity of the first N-glycosylation enzyme, dolichol-P-dependent N-acetylglucosamine-1-P transferase (GPT), gives rise to mutant phenotypes in yeast. Biochim Biophys Acta. 1995, 1247 (1): 51-59.
Kukuruzinska MA, Robbins PW: Protein glycosylation in yeast: transcript heterogeneity of the ALG7 gene. Proc Natl Acad Sci USA. 1987, 84 (8): 2145-2149. 10.1073/pnas.84.8.2145.
Rine J, Hansen W, Hardeman E, Davis RW: Targeted selection of recombinant clones through gene dosage effects. Proc Natl Acad Sci USA. 1983, 80 (22): 6750-6754. 10.1073/pnas.80.22.6750.
Cherry JM, Ball C, Weng S, Juvik G, Schmidt R, Adler C, Dunn B, Dwight S, Riles L, Mortimer RK, et al: Genetic and physical maps of Saccharomyces cerevisiae. Nature. 1997, 387 (6632 Suppl): 67-73.
Parodi AJ: Protein glucosylation and its role in protein folding. Annu Rev Biochem. 2000, 69: 69-93. 10.1146/annurev.biochem.69.1.69.
Chen Y, Feldman DE, Deng C, Brown JA, De Giacomo AF, Gaw AF, Shi G, Le QT, Brown JM, Koong AC: Identification of mitogen-activated protein kinase signaling pathways that confer resistance to endoplasmic reticulum stress in Saccharomyces cerevisiae. Mol Cancer Res. 2005, 3 (12): 669-677. 10.1158/1541-7786.MCR-05-0181.
Krause SA, Xu H, Gray JV: The synthetic genetic network around PKC1 identifies novel modulators and components of protein kinase C signaling in Saccharomyces cerevisiae. Eukaryot Cell. 2008, 7 (11): 1880-1887. 10.1128/EC.00222-08.
Tan SX, Teo M, Lam YT, Dawes IW, Perrone GG: Cu, Zn superoxide dismutase and NADP(H) homeostasis are required for tolerance of endoplasmic reticulum stress in Saccharomyces cerevisiae. Mol Biol Cell. 2009, 20 (5): 1493-1508. 10.1091/mbc.E08-07-0697.
Schuldiner M, Metz J, Schmid V, Denic V, Rakwalska M, Schmitt HD, Schwappach B, Weissman JS: The GET complex mediates insertion of tail-anchored proteins into the ER membrane. Cell. 2008, 134 (4): 634-645. 10.1016/j.cell.2008.06.025.
Sutterlin C, Doering TL, Schimmoller F, Schroder S, Riezman H: Specific requirements for the ER to Golgi transport of GPI-anchored proteins in yeast. J Cell Sci. 1997, 110 (Pt 21): 2703-2714.
Belgareh-Touze N, Corral-Debrinski M, Launhardt H, Galan JM, Munder T, Le Panse S, Haguenauer-Tsapis R: Yeast functional analysis: identification of two essential genes involved in ER to Golgi trafficking. Traffic. 2003, 4 (9): 607-617. 10.1034/j.1600-0854.2003.00116.x.
Sambrook J, Russell DW: Molecular cloning: a laboratory manual. 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, 3rd edition.
Wahl GM, Stern M, Stark GR: Efficient transfer of large DNA fragments from agarose gels to diazobenzyloxymethyl-paper and rapid hybridization by using dextran sulfate. Proc Natl Acad Sci USA. 1979, 76 (8): 3683-3687. 10.1073/pnas.76.8.3683.
Lister R, Gregory BD, Ecker JR: Next is now: new technologies for sequencing of genomes, transcriptomes, and beyond. Curr Opin Plant Biol. 2009, 12 (2): 107-118. 10.1016/j.pbi.2008.11.004.
Ptacek J, Devgan G, Michaud G, Zhu H, Zhu X, Fasolo J, Guo H, Jona G, Breitkreutz A, Sopko R, et al: Global analysis of protein phosphorylation in yeast. Nature. 2005, 438 (7068): 679-684. 10.1038/nature04187.
Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ: Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 2007, 6 (3): 439-450.
McClellan AJ, Xia Y, Deutschbauer AM, Davis RW, Gerstein M, Frydman J: Diverse cellular functions of the Hsp90 molecular chaperone uncovered using systems approaches. Cell. 2007, 131 (1): 121-135. 10.1016/j.cell.2007.07.036.
Schuldiner M, Collins SR, Thompson NJ, Denic V, Bhamidipati A, Punna T, Ihmels J, Andrews B, Boone C, Greenblatt JF, et al: Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell. 2005, 123 (3): 507-519. 10.1016/j.cell.2005.08.031.
Briand JF, Navarro F, Rematier P, Boschiero C, Labarre S, Werner M, Shpakovski GV, Thuriaux P: Partners of Rpb8p, a small subunit shared by yeast RNA polymerases I, II and III. Mol Cell Biol. 2001, 21 (17): 6056-6065. 10.1128/MCB.21.17.6056-6065.2001.
Collins SR, Miller KM, Maas NL, Roguev A, Fillingham J, Chu CS, Schuldiner M, Gebbia M, Recht J, Shales M, et al: Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature. 2007, 446 (7137): 806-810. 10.1038/nature05649.
Bowers K, Lottridge J, Helliwell SB, Goldthwaite LM, Luzio JP, Stevens TH: Protein-protein interactions of ESCRT complexes in the yeast Saccharomyces cerevisiae. Traffic. 2004, 5 (3): 194-210. 10.1111/j.1600-0854.2004.00169.x.
We would like to thank G. D. Bader, C. Boone and J. Moffat for helpful suggestions, K. Tsui for the gDNA sample and Affymetrix technical support for extensive assistance. This work was supported by the Canadian Institutes of Health Research [MOP-81340 to G.G., MOP-84305 to C.N.]; and the National Human Genome Research Institute [HG00317-05].
The authors declare that they have no competing interests.
RA, AMS, GG, CN conceived of the project and designed experiments. RA and AMS performed the experiments. RA, AMS, LEH, GG, CN analyzed the data. RA, AMS, GG, CN wrote the paper.
Ron Ammar and Andrew M Smith contributed equally to this work.