Cell lines
Ten human cell lines, derived from liver, testis, mammary gland, cervix, brain, skin, liposarcoma, macrophage, T-lymphoblast and B-lymphocyte, were selected for UHRR [20]. Eleven mouse cell lines, representing liver, kidney, testis, mammary gland, embryo, alveolar macrophages, skin, muscle, macrophage, T-lymphocyte and B-lymphocyte, were chosen for UMRR [21]. Fourteen rat cell lines, derived from liver, kidney, brain, testis, mammary gland, embryo, lung, skin, fibroblast, muscle, macrophage, basophil, T-lymphocyte and B-lymphoblast, were used for URRR [22]. The human and mouse cells were grown to 60–80% confluence in RPMI-1640 medium, while rat cell lines were grown to the same extent in DMEM. Both media were supplemented with 2 mM L-glutamine, 10% fetal bovine serum and penicillin/streptomycin. At this point, the old medium was replaced with fresh medium; after 24 hours the cells were harvested by trypsinization and washed with 1X PBS, and the cell pellets were frozen in liquid nitrogen and stored at -80°C.
RNA isolation
Total RNA was isolated using a modified StrataPrep™ Total RNA isolation kit [23] (Stratagene, La Jolla, CA). One to three × 10⁸ cells were lysed in 15 ml of lysis buffer containing guanidine isothiocyanate and filtered in a spin cup to remove particles and reduce contaminating DNA. An equal volume of 70% ethanol was added to the cell lysate followed by vortexing. The mixture was transferred to a second spin cup containing an RNA binding filter and centrifuged for 5–10 min at 5000 × g followed by washing with 15 ml of low-salt buffer. DNase I (500 U of DNase I in 500 μl of DNase digestion buffer) was added directly to the spin cup filter and incubated at 37°C for 15 min. The filter was washed with 10 ml of high-salt wash buffer, 15 ml of low-salt wash buffer and finally with 10 ml of low-salt wash buffer. Total RNA was eluted from the filter using 1 ml of elution buffer added directly to the spin cup filter. The cup was incubated for 2 min at room temperature and centrifuged for 3 min at 3000 × g. The elution step was repeated twice more, for a total elution volume of 3 ml. The quantity and quality of isolated RNA were determined by spectrophotometry. RNA integrity was determined by two methods: formaldehyde-agarose gel electrophoresis and Agilent Bioanalyzer analysis (Agilent Technologies, Palo Alto, CA). URR was prepared by pooling equal mass quantities of total RNA from each cell line, dividing the pool into 200 μg aliquots, followed by ethanol precipitation and storage at -80°C.
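The equal-mass pooling described above amounts to simple arithmetic, sketched here in Python. The stock concentrations, the per-line mass of 100 μg and the function name are hypothetical values chosen for illustration; they are not taken from the protocol.

```python
def pooling_volumes(stock_conc_ug_per_ul, mass_per_line_ug):
    """Volume (μl) to draw from each RNA stock so every cell line adds the same mass."""
    return {name: mass_per_line_ug / conc
            for name, conc in stock_conc_ug_per_ul.items()}


# Hypothetical stock concentrations in μg/μl (illustrative only).
stocks = {"liver": 2.0, "testis": 1.0, "brain": 0.5}

volumes = pooling_volumes(stocks, mass_per_line_ug=100.0)

# Total pooled mass and the number of complete 200 μg aliquots it yields.
pool_mass_ug = 100.0 * len(stocks)
n_aliquots = int(pool_mass_ug // 200)

print(volumes, n_aliquots)
```

More dilute stocks contribute proportionally larger volumes, so each cell line is represented by the same mass of total RNA regardless of its yield.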
cDNA synthesis and labeling
Labeled cDNAs were synthesized with the FairPlay cDNA labeling kit (Stratagene, La Jolla, CA). 20 μg of total RNA (individual cell line RNA or URR) in 12 μl of DEPC-treated water was combined with 1 μl of 500 ng/μl oligo-d(T)12–18. The mixture was incubated at 70°C for 10 min and cooled on ice. For each reaction, 2 μl of 10X StrataScript reaction buffer, 1 μl of unlabeled 20X dNTP mix containing amino-allyl dUTP, 1.5 μl of 0.1 M dithiothreitol and 0.5 μl of RNase Block (40 U/μl) were prepared and mixed with the RNA sample and 1 μl of 50 U/μl StrataScript RT. After incubation at 48°C for 30 min, an additional 1 μl of StrataScript RT was added and incubation was continued for a further 30 min. RNA was degraded by adding 10 μl of 1 M NaOH, followed by a 10-min incubation at 70°C, and the mixture was neutralized with 10 μl of 1 M HCl. Unincorporated nucleotides were removed by precipitation of the cDNA with 4 μl of 3 M sodium acetate, 1 μl of 20 mg/ml glycogen and 100 μl of 95% ethanol at -20°C for 1 hr. After centrifugation, the pellet was washed with 70% ethanol and resuspended in 5 μl of the 2X coupling buffer provided in the kit. Cy3 or Cy5 dye (Amersham), resuspended in 45 μl of DMSO, was added and the reaction was incubated for 30 min at room temperature in the dark. Dye-coupled cDNA was purified with a DNA-binding spin cup, as described in the FairPlay cDNA labeling kit protocol, and the final volume was adjusted to 5 μl.
Microarrays
Human 7,600-spot and 10,000-spot, mouse 8,700-spot and rat 6,500-spot cDNA microarrays were printed at the National Cancer Institute (NCI; NIH, Gaithersburg, MD). Human 12,000-spot (human 1), mouse 8,500-spot and 14,500-spot rat cDNA microarrays were purchased from Agilent Technologies (Palo Alto, CA). Human 41,000 and 43,000-spot cDNA microarrays were printed at the Stanford Functional Genomics Core Facility (Stanford University; http://www.microarray.org). Mouse 7,500-spot oligonucleotide microarrays were printed at the University of North Carolina (UNC; Chapel Hill, NC) using the Compugen – Sigma murine oligo set.
Microarray pre-hybridization (blocking)
Microarrays were pre-hybridized at 42°C for at least 1 hr in 20–30 μl of pre-hybridization buffer (5X SSC, 0.1% SDS and 1% BSA) under coverslips. The slides were then washed by rapidly dipping them in distilled water for 2 min and then in isopropanol for 2 min, and were air dried.
Microarray hybridization and data processing
5 μl each of Cy3-labeled and Cy5-labeled cDNA targets were combined with 2 μl of 10 μg/μl human Cot 1 DNA (mouse Cot 1 DNA was used for mouse and rat microarrays; Gibco-BRL), 2 μl of 8 μg/μl poly d(A)40–60 and 2 μl of 4 μg/μl yeast tRNA (Gibco-BRL). Labeled cDNA target (16 μl) was denatured at 100°C for 1 min and cooled on ice. 16 μl of 2X hybridization buffer (50% formamide, 10X SSC and 0.2% SDS) was added and 30 μl of the mixture was applied to a single microarray under a glass coverslip. Microarrays were incubated at 42°C for 16 hr in sealed chambers with humidity maintained by a small reservoir of 3X SSC. Arrays were washed in 2X SSC, 0.1% SDS for 4 min, 1X SSC, 0.1% SDS for 4 min, 0.2X SSC for 4 min, 0.05X SSC for 1 min and air dried. Hybridization signal was visualized and collected using an Axon microarray scanner.
Data analysis
Data from each array was collected with GenePix 3.0 (Axon Instruments). Each spot was defined by manual positioning of a grid over the array image. Aberrant and empty spots were manually flagged and excluded from further analysis. The average pixel intensity within each circle was determined and local background was computed for each spot. Net signal was determined by subtracting local background from the average intensity.
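The net-signal calculation above can be sketched in a few lines of Python. The dictionary field names and the example values are illustrative assumptions and do not reproduce the GenePix output format.

```python
def net_signal(spot):
    """Mean pixel intensity minus local background; None for flagged spots."""
    if spot["flagged"]:
        return None  # aberrant or empty spots are excluded from analysis
    return spot["mean_intensity"] - spot["local_background"]


# Hypothetical spots (field names and values are assumptions).
spots = [
    {"id": "A1", "mean_intensity": 850.0, "local_background": 120.0, "flagged": False},
    {"id": "A2", "mean_intensity": 95.0, "local_background": 110.0, "flagged": False},
    {"id": "A3", "mean_intensity": 3000.0, "local_background": 100.0, "flagged": True},
]

net = {s["id"]: net_signal(s) for s in spots}
print(net)
```

A negative net signal, as for spot A2, simply indicates a spot dimmer than its surroundings; how such spots are handled downstream depends on the filtering step.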
Microarray coverage of UHRR on 43,000-spot microarrays using control spots
Data files generated by GenePix 3.0 were exported into the Stanford Microarray Database (SMD). After background subtraction, normalization and filtering, the raw intensity values were used for analysis. Fluorescence intensities of spots representing human genes and negative control spots were compared to estimate the number of genes represented in the UHRR. Human 43,000-spot microarrays (Stanford University) have 384 yeast gene spots used as negative controls. Signals produced at these control spots when hybridized to human cDNA were considered non-specific. (BLAST analysis of the 384 yeast ORF nucleotide sequences against the UniGene human database showed no matches between the yeast spots and human cDNA with expect values lower than 1E-14.) A signal intensity threshold from 25 to 96 fluorescence units, depending on microarray print, was defined such that 1% of the control spots showed greater signal intensity (the four most intense yeast control spots). 62 ± 3% of the human gene spots had signal intensity greater than this threshold. A second threshold of 18–65 fluorescence units was defined such that 5% of the control spots showed greater signal intensity (the nineteen most intense yeast control spots). 71 ± 3% of the human gene spots had signal intensity greater than this threshold.
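One way to derive such a threshold from the negative controls is sketched below in Python. The choice of the next-ranked control intensity as the threshold, the function names and the synthetic intensities are assumptions made for illustration; the exact procedure used on the real arrays is not specified beyond the description above.

```python
def control_threshold(control_intensities, fraction):
    """Threshold exceeded by roughly `fraction` of the negative-control spots.

    Assumption: the threshold is the intensity of the control spot ranked
    just below the top `fraction` of controls (ties are not handled).
    """
    ranked = sorted(control_intensities, reverse=True)
    n_above = round(fraction * len(ranked))  # 384 controls: 1% -> 4, 5% -> 19
    return ranked[n_above]


def fraction_above(intensities, threshold):
    """Fraction of spots with intensity strictly greater than the threshold."""
    return sum(1 for x in intensities if x > threshold) / len(intensities)


# Synthetic data: 384 control intensities (1..384) and four gene spots.
controls = list(range(1, 385))
genes = [100, 200, 400, 500]

t1 = control_threshold(controls, 0.01)
t5 = control_threshold(controls, 0.05)
print(t1, t5, fraction_above(genes, t1))
```

With 384 controls, the 1% and 5% cut-offs leave four and nineteen control spots above the threshold, matching the counts quoted in the text.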
Microarray coverage of URR on different microarrays using the average background intensity
For microarrays lacking the control spots described above, average background intensity values in each channel were used as the threshold. Data files generated by GenePix 3.0 were exported into GeneTraffic (Iobion Bioinformatics, Toronto, Canada). After background subtraction, normalization, and filtering of spot intensities, those spots with intensity above 1X and 2X (1 and 2-fold higher than average of background intensity, respectively) were considered positive.
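The background-based call can be illustrated with a minimal Python sketch. It assumes one channel at a time, with a list of spot intensities and a list of background readings; the names and numbers are hypothetical.

```python
from statistics import mean


def positive_spots(spot_intensities, background_values, fold):
    """Flag spots whose intensity exceeds `fold` times the average background."""
    cutoff = fold * mean(background_values)
    return [x > cutoff for x in spot_intensities]


# Hypothetical single-channel values (illustrative only).
background = [40.0, 60.0]          # average background = 50
intensities = [30.0, 75.0, 120.0]

calls_1x = positive_spots(intensities, background, fold=1)
calls_2x = positive_spots(intensities, background, fold=2)
print(calls_1x, calls_2x)
```

Raising the multiplier from 1X to 2X makes the call more conservative, so fewer spots are counted as positive.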
Evaluation of gene number contributed by individual cell line RNA to the reference pool
Highly expressed genes in individual cell lines were identified by comparing microarrays hybridized with total RNA prepared from each cell line and URR pools. This evaluation was performed for 10 human, 11 mouse and 14 rat cell lines. Data were analyzed using GeneTraffic. Signal intensities between two fluorescent images were normalized using Locally Weighted Scatter Plot Smoother (LOWESS) sub-grid normalization [24]. All the spots with signal intensity less than the local background for Cy5 and Cy3 channels were flagged. In the second step, spots with signal intensity greater than 1000 (highly expressed genes) in the Cy5 channel and with Cy5/Cy3 ratios greater than 2 were selected, and the number of spots meeting these criteria in each cell line was evaluated.
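The two-step selection can be sketched in Python as follows. The sketch assumes that a spot is flagged when its signal falls below local background in either channel, and the field names and example values are hypothetical; normalization is taken as already done.

```python
def highly_expressed(spots, min_cy5=1000, min_ratio=2):
    """Return IDs of unflagged spots with Cy5 > min_cy5 and Cy5/Cy3 > min_ratio."""
    selected = []
    for s in spots:
        # Step 1 (assumption): flag spots below local background in either channel.
        if s["cy5"] < s["cy5_bg"] or s["cy3"] < s["cy3_bg"]:
            continue
        # Step 2: intensity and ratio criteria from the text.
        if s["cy5"] > min_cy5 and s["cy3"] > 0 and s["cy5"] / s["cy3"] > min_ratio:
            selected.append(s["id"])
    return selected


# Hypothetical normalized intensities (illustrative only).
spots = [
    {"id": "g1", "cy5": 5000, "cy3": 1000, "cy5_bg": 100, "cy3_bg": 100},  # selected
    {"id": "g2", "cy5": 1500, "cy3": 1000, "cy5_bg": 100, "cy3_bg": 100},  # ratio too low
    {"id": "g3", "cy5": 800, "cy3": 200, "cy5_bg": 100, "cy3_bg": 100},    # Cy5 too low
    {"id": "g4", "cy5": 4000, "cy3": 50, "cy5_bg": 100, "cy3_bg": 100},    # flagged
]

selected = highly_expressed(spots)
print(len(selected), selected)
```

The count of selected spots per cell line is the quantity evaluated in the text.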
Batch-to-batch comparison
Two batches of UHRR were compared by co-hybridizing Cy3-labeled cDNA reverse-transcribed from batch 1 and Cy5-labeled cDNA from batch 2 to the same microarray. The experiment was repeated with the dyes switched. The Pearson correlation coefficient for background-subtracted mean intensities for Cy5 and Cy3 channels was calculated. Genes whose expression level differed significantly between two batches of URR were identified using the following statistical analysis (only data with intensity values exceeding 2X background were used). First, individual values of the log2 intensity ratios in 4 hybridizations from each of two URR industrial batches were compared using a 2-tail non-parametric Mann-Whitney test. The critical z-value was calculated as 4.448 using the Bonferroni correction for 5,926 tests (e.g., see http://home.clara.net/sisa/bonhlp.htm). Second, Significance Analysis of Microarrays (SAM) software was used [25]. False discovery rate (FDR; percentage of genes selected by chance) was determined by recursive permutations. Data files generated by GenePix 3.0 were entered into GeneTraffic and signal intensities between two fluorescent images were normalized using the LOWESS sub-grid normalization method. All spots with intensity less than local background in Cy5 or Cy3 channels were flagged and excluded from further analysis. A table including all non-flagged spots was generated and exported to SAM. We used the following parameters for data analysis: two class, unpaired data (log, base 2), number of permutations 100 and number of neighbors 10.
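The order of magnitude of the critical z-value can be checked with a short Python sketch using only the standard library. The family-wise significance level of 0.05 is an assumption, since it is not stated in the text, and the calculation uses the plain Bonferroni formula rather than the online calculator cited above.

```python
from statistics import NormalDist


def bonferroni_critical_z(alpha, n_tests):
    """Two-tailed critical z for a per-test level of alpha / n_tests."""
    per_test_alpha = alpha / n_tests
    # Split the per-test level between the two tails of the standard normal.
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)


# alpha = 0.05 is assumed; 5,926 is the number of tests given in the text.
z_crit = bonferroni_critical_z(alpha=0.05, n_tests=5926)
print(round(z_crit, 2))
```

Under these assumptions the result is close to 4.45, in line with the reported value of 4.448; any small difference presumably reflects the exact correction formula or rounding used by the calculator.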