Comparing gene discovery from Affymetrix GeneChip microarrays and Clontech PCR-select cDNA subtraction: a case study
© Cao et al; licensee BioMed Central Ltd. 2004
Received: 17 February 2004
Accepted: 27 April 2004
Published: 27 April 2004
Several high-throughput technologies have been employed to identify differentially regulated genes that may be molecular targets for drug discovery. Here we compared the sets of differentially regulated genes discovered using two experimental approaches: a suppression subtractive hybridization (SSH) cDNA library methodology and Affymetrix GeneChip® technology. In this "case study" we explored the transcriptional pattern changes during the in vitro differentiation of human monocytes to myeloid dendritic cells (DC), and evaluated the potential for novel gene discovery using the SSH methodology.
The same RNA samples isolated from peripheral blood monocyte precursors and immature DC (iDC) were used for GeneChip microarray probing and SSH cDNA library construction. 10,000 clones from each of the two-way SSH libraries (iDC-monocytes and monocytes-iDC) were picked for sequencing. About 2000 transcripts were identified for each library from 8000 successful sequences. Only 70% to 75% of these transcripts were represented on the U95 series GeneChip microarrays, implying that 25% to 30% of these transcripts might not have been identified in a study based only on GeneChip microarrays. In addition, about 10% of these transcripts appeared to be "novel", although these have not yet been closely examined. Among the transcripts that are also represented on the chips, about a third were concordantly discovered as differentially regulated between iDC and monocytes by GeneChip microarray transcript profiling. The remaining two thirds were either not inferred as differentially regulated from GeneChip microarray data, or were called differentially regulated but in the opposite direction. This underscores the importance both of generating reciprocal pairs of SSH libraries, and of real-time RT-PCR confirmation of the results.
This study suggests that SSH could be used as an alternative and complementary transcript profiling tool to GeneChip microarrays, especially in identifying novel genes and transcripts of low abundance.
Gene expression profiling has become an invaluable tool in functional genomics. Since the mid-1990s, DNA microarrays [1–3], cDNA subtraction [4–7] and Serial Analysis of Gene Expression (SAGE) have emerged as the leading transcript profiling technologies in the global analysis of biological systems. One of these high-throughput technologies, the high-density oligonucleotide GeneChip® microarray manufactured by Affymetrix [1, 3, 9], makes it possible to measure the relative abundance of thousands of mRNAs in a cell simultaneously. However, DNA microarray technology is limited by its insensitivity to transcripts of low abundance. A similar low sensitivity was also seen with SAGE. More recently, Clontech developed a PCR-select cDNA subtraction method (called suppression subtractive hybridization, or SSH) which, owing to a normalization step that equalizes the abundance of cDNAs within the target population, makes it possible to detect some low abundance transcripts [4, 7]. Although custom DNA microarrays have been used in combination with cDNA subtraction to identify differentially expressed genes [11–15], no direct comparison of the sensitivity and bias of the SSH and GeneChip technologies has been reported so far.
To evaluate the SSH and GeneChip technologies side by side, we explored the similarities and differences in the regulated genes discovered using each. We compared the regulated genes identified through SSH with the genes found to be differentially regulated using the GeneChip microarrays in a human dendritic cell (DC) differentiation paradigm. We regard this as a "case study" of the potential for novel gene discovery using SSH methodology that would not be accessible using Affymetrix profiling alone. The same RNA samples isolated from immature DC (iDC) and from monocytes were used for GeneChip microarray probing and SSH library construction. Overall, about two thirds of the transcripts identified using SSH methodology were not identified using GeneChip microarrays alone. These results suggest that SSH could be used as an alternative and complementary transcript profiling tool to Affymetrix GeneChip microarrays, especially in identifying novel genes or transcripts of low abundance.
Genes not represented on Affymetrix GeneChip microarrays can be identified through SSH
Certain genes present in the SSH libraries and represented on GeneChip microarrays were not detected through GeneChip microarray analysis
Discrepancy between genes identified through SSH and genes identified through GeneChip microarray analysis
Real-time RT-PCR analysis of selected genes identified through SSH
To find out how genes with conflicting SSH and GeneChip microarray data are actually expressed, we used the more sensitive real-time RT-PCR (TaqMan® analysis) to quantitate the RNA levels of selected genes in the iDC and monocyte samples used for both the SSH and GeneChip microarray analyses. As shown in Table 2, among the 4 genes that appeared only in the H56 library (iDC minus monocyte) but were suggested by GeneChip microarray profiling to be upregulated in monocytes, three had higher levels of expression in iDC, while one had a higher level of expression in monocytes. These data suggest that there are false positives in both the SSH data and the GeneChip® profiling data. By using SSH in addition to GeneChip microarray profiling, we can identify some differentially expressed genes for which the GeneChip microarray profiling results are false. However, more sensitive quantitative RNA measures, such as real-time RT-PCR analysis, are needed for more reliable verification of these differentially expressed genes. Since all RNA quantitation methods, including real-time RT-PCR, have their limitations, further validation of the differential gene expression pattern might need to be carried out. For example, Northern blots may not be as sensitive as real-time RT-PCR, but the size of the bands on the blot can indicate the specificity of the signals. If antibodies are available for the gene products under study, Western blots, flow cytometry and other protein analysis tools may also be used to verify the differential gene expression pattern.
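The text does not state how TaqMan fold changes were derived from the raw data. As a minimal sketch only, assuming the widely used comparative Ct (2^-ΔΔCt) calculation with a reference gene and roughly 100% amplification efficiency, a fold change between iDC and monocytes could be computed as follows. The function name and the Ct values are hypothetical and are not taken from the study.

```python
def fold_change_ddct(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Relative expression of a target gene in sample A versus sample B.

    Assumes the comparative Ct method (2^-ddCt): each Ct is first
    normalized to a reference gene measured in the same sample, and the
    amount of product is assumed to double every cycle.
    """
    delta_a = ct_target_a - ct_ref_a   # normalized Ct in sample A (e.g. iDC)
    delta_b = ct_target_b - ct_ref_b   # normalized Ct in sample B (e.g. monocytes)
    return 2.0 ** (-(delta_a - delta_b))


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in iDC than in monocytes, with identical reference-gene Ct values.
idc_vs_monocyte = fold_change_ddct(24.0, 18.0, 27.0, 18.0)
print(idc_vs_monocyte)  # 8.0, i.e. 8-fold higher in iDC
```

A value above 1 indicates higher expression in sample A; a value below 1 indicates higher expression in sample B. If amplification efficiency departs from 100%, a standard-curve approach would be more appropriate than this shortcut.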
In this study, we evaluated the similarities and differences in genes discovered using SSH and GeneChip microarrays by comparing the genes found to be differentially expressed during DC differentiation from monocytes using these two technical approaches. Our results showed that among the genes identified in the SSH libraries, more than half of those genes would not have been identified as differentially expressed by using GeneChip microarrays alone. Some of these genes were either novel or not represented on the GeneChip microarrays. However, a significant number of genes were missed by GeneChip microarray analysis despite the presence of probe sets for these genes on the microarrays; whether this number could be lower if the new and improved U133 series GeneChip microarrays were used remains untested.
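The comparison described above sorts each SSH transcript into one of a few outcomes relative to the GeneChip data. The following sketch illustrates that bookkeeping under stated assumptions: the category labels, the record layout and the example records are hypothetical, and transcripts found in both libraries are not handled.

```python
from collections import Counter

# Direction of enrichment expected from each SSH library:
# H56 is iDC minus monocyte, H57 is monocyte minus iDC.
EXPECTED_DIRECTION = {"H56": "iDC", "H57": "monocyte"}


def classify_transcript(ssh_library, on_chip, chip_direction):
    """Compare one SSH transcript with its GeneChip result.

    ssh_library    -- "H56" or "H57"
    on_chip        -- True if a probe set for the transcript exists
    chip_direction -- "iDC", "monocyte", or None when the chip analysis
                      did not call the transcript differentially regulated
    """
    if not on_chip:
        return "not on chip"
    if chip_direction is None:
        return "not called by chip"
    if chip_direction == EXPECTED_DIRECTION[ssh_library]:
        return "concordant"
    return "opposite direction"


# Hypothetical records: (library, represented on chip, chip call)
records = [
    ("H56", True, "iDC"),
    ("H56", True, "monocyte"),
    ("H57", True, None),
    ("H57", False, None),
]
tally = Counter(classify_transcript(*r) for r in records)
print(tally)
```

Only the "concordant" category corresponds to transcripts that both methods identify in the same direction; the other three categories together make up the transcripts that GeneChip profiling alone would have missed or mis-called.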
DNA microarrays are powerful tools that enable the global analysis of a variety of complex biological systems. The expression levels of thousands of genes can be monitored simultaneously using this high-throughput, cost-effective technology. However, the technology is limited by its low sensitivity for transcripts of low abundance, i.e. genes expressed at low levels or in a small fraction of the cells studied. Even some transcripts of high abundance can be missed by DNA microarrays because of poor hybridization between the probes and the labeled cRNA targets. One factor that could affect the hybridization step is the sequence targeted by the GeneChip probes. Since the GeneChip probes are 3'-biased to match the target generation characteristics of the sample amplification method, the sensitivity of some probes could be compromised either by their positioning toward the 5' region or by poor in vitro transcription efficiency caused by the complexity of their sequences. The complexity of these targeted sequences may also affect the hybridization efficiency between the labeled cRNA targets and the GeneChip probes. In contrast, the normalization step in the SSH protocol equalizes the abundance of cDNAs within the target population, and the subtraction step excludes sequences common to the target and driver populations. A comprehensive analysis of at least 5000 to 10000 clones isolated from the SSH cDNA libraries may therefore enable the detection of some transcripts of low abundance that would not be revealed by other transcript profiling protocols. Genes not represented on the DNA microarrays, including some genes with novel identities, may also be identified by sequencing the SSH cDNA libraries. However, the construction and sequencing of subtractive cDNA libraries is time consuming and labor intensive, which limits the number of samples that can be surveyed by this technology in each study.
In practice, we suggest DNA microarrays as the preferred approach for transcript profiling of a large number of samples. This is especially true when the RNA is derived from homogeneous cell populations. However, in a number of cases, such as clinical tissues, the relevant cell type may be difficult to purify or present in low abundance. In these cases, normalized subtractive cDNA libraries are preferable. Our results indicate that even though DNA microarrays and SSH may each be preferred in distinct situations, neither technique can adequately identify all regulated genes. Thus, even when homogeneous cell populations were examined, as in this study, more than half of the genes discovered through sequencing the SSH libraries would not have been identified by using GeneChip® technology alone. In conclusion, using normalized cDNA subtraction as an alternative and complementary transcript profiling tool to DNA microarrays will help identify novel genes and low abundance transcripts, thereby achieving a more comprehensive global view of the transcriptome in the biological system studied.
iDC generation and RNA preparation
CD14+ monocytes were isolated from the peripheral blood samples of healthy donors by negative selection using magnetic cell-sorting (Miltenyi, Auburn, CA) and differentiated into immature dendritic cells (iDC) in RPMI/10% FBS containing 1000 U/ml GM-CSF and 1000 U/ml IL-4 (Peprotech, Rocky Hill, NJ) [16–18]. Total RNA of monocytes and iDC was isolated using the RNeasy Mini Kit (Qiagen, Valencia, CA).
Affymetrix GeneChip® Microarray studies
The cRNA labeling and hybridizations were performed according to protocols from Affymetrix Inc. (Santa Clara, CA). Briefly, the mRNA in 5 μg of total cellular RNA was converted to double-stranded cDNA using Superscript (Gibco-Invitrogen) with a T7-(dT)24 primer containing the T7 RNA polymerase promoter. The cDNA was transcribed in vitro to biotinylated complementary RNA (cRNA) by incorporating biotin-CTP and biotin-UTP using the Enzo BioArray High Yield RNA labeling kit (Enzo Diagnostics, New York, NY). Biotinylated cRNA from each sample was fragmented to approximately 40–100 bases, and 10 μg of the fragmented cRNA was hybridized to the Affymetrix human U95 probe array series (A, B, C, D, and E) for 16 h at 45°C with constant rotation at 60 rpm. Following washes, the hybridized chips were sequentially stained with streptavidin-phycoerythrin (Molecular Probes, Eugene, OR), biotinylated goat anti-streptavidin (Vector Laboratories, Burlingame, CA) and a second round of streptavidin-phycoerythrin for signal amplification. After a series of washes, chips were scanned with an argon-ion laser confocal microscope (Hewlett-Packard, Palo Alto, CA) for fluorescence signal detection. All washes and staining procedures were performed on an Affymetrix Fluidics station. The raw expression data derived from Affymetrix Microarray Suite 4.0.1 software gave each transcript an absolute expression level (signal intensity) and a "present" or "absent" call based on the signal/noise ratio. The data were analyzed on two levels. At the detection level, a call of "present" indicates that a positive signal was detected for a probe, while a call of "absent" indicates that no positive signal was detected. The gene expression ratio between samples for each donor was inferred using the PFOLD algorithm, which employs a Bayesian estimation scheme to estimate both the fold-change of gene expression and the significance of the change (P-value).
At the comparison level, a gene is defined as up-regulated in the iDC versus monocyte comparison if the signal log ratio between the iDC and monocyte samples is larger than 1 (equivalent to a 2-fold increase) and the transcript is called "present" in the target sample. RNA samples from 3 individuals were analyzed.
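The up-regulation rule above can be written as a small decision function. This is an illustrative sketch rather than the analysis code used in the study: the record type and field names are hypothetical, the signal log ratio is taken to be base 2 (as implied by a ratio of 1 equalling a 2-fold change), and the mirror-image rule for genes higher in monocytes is an assumption, since the text only defines up-regulation.

```python
from dataclasses import dataclass


@dataclass
class ProbeSetResult:
    """Hypothetical summary of one probe set in an iDC vs monocyte comparison."""
    probe_id: str
    signal_log_ratio: float   # log2(iDC signal / monocyte signal)
    idc_present: bool         # detection call in the iDC sample
    monocyte_present: bool    # detection call in the monocyte sample


def call_regulation(result, threshold=1.0):
    """Apply the 2-fold rule: log ratio beyond the threshold and a
    "present" call in the sample with the higher signal."""
    if result.signal_log_ratio > threshold and result.idc_present:
        return "up in iDC"
    # Assumed symmetric rule for the opposite direction.
    if result.signal_log_ratio < -threshold and result.monocyte_present:
        return "up in monocytes"
    return "not called"


example = ProbeSetResult("probe_A", 1.5, True, False)
print(call_regulation(example))  # up in iDC
```

Note that the comparison is strict, so a probe set with a signal log ratio of exactly 1 is not called, and a large ratio is ignored when the higher-signal sample has an "absent" call.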
Construction and sequencing of suppression subtractive hybridization (SSH) cDNA libraries
SSH libraries were generated using the reagents and protocols provided by Clontech (Clontech, Palo Alto, CA). In one SSH library (H56), the RNA from iDC was used as "tester" and the RNA from monocytes was used as "driver". In the other SSH library (H57), the RNA from iDC was used as "driver" and the RNA from monocytes was used as "tester". In both cases, the starting RNA material was a pool of the RNA samples from the 3 individuals used in the microarray experiment. RT-PCR analysis of the SSH products showed that the level of the housekeeping gene GAPDH decreased more than 1000-fold in both H56 and H57 cDNA when compared with unsubtracted cDNA (data not shown), suggesting that the subtraction procedure was very effective. 10,000 clones from each SSH library were sequenced with M13 primers using the ABI BigDye Terminator v2.0 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) and ABI 3700 DNA Analyzers (Applied Biosystems), according to the manufacturers' protocols and manuals. The SSH cDNA was also used to prepare cRNA for GeneChip microarrays. In vitro transcription was carried out from the T7 promoter in the PCR primers used for SSH. The cRNA generated was used for GeneChip microarray hybridization as described above.
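To put the GAPDH result in terms of PCR cycles: if product is assumed to double every cycle, a fold reduction in template translates into a number of extra cycles needed to reach the same signal. The sketch below shows only that arithmetic. The function name and the efficiency parameter are hypothetical, and the text does not describe how the fold decrease was actually estimated.

```python
import math


def extra_cycles_for_fold_reduction(fold, efficiency=1.0):
    """Extra PCR cycles needed to reach the same product level after the
    template has been reduced `fold` times.

    efficiency=1.0 assumes perfect doubling per cycle; lower values
    model less efficient amplification (product multiplies by
    1 + efficiency each cycle).
    """
    return math.log(fold) / math.log(1.0 + efficiency)


cycles = extra_cycles_for_fold_reduction(1000)
print(round(cycles, 2))  # 9.97
```

Under this assumption, a decrease of more than 1000-fold corresponds to GAPDH appearing roughly 10 or more cycles later in the subtracted cDNA than in the unsubtracted cDNA.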
Annotation of the sequence results
Sequences generated through the deep sequencing were clustered into contigs before being submitted to BLAST searches of various online databases to establish the identity of the clones. These included the National Center for Biotechnology Information (NCBI) nr (nonredundant GenBank, EMBL, DDBJ, and PDB) and EST (nonredundant GenBank, EMBL, and DDBJ EST divisions) databases, the Incyte LifeSeq® database (Incyte, Palo Alto, CA) and the Celera database (Celera, Rockville, MD). The sequences extended in silico were used to search for corresponding qualifiers on the U95 series of GeneChip microarrays (Figure 1).
Real-time RT-PCR analysis
Probes and primers used in TaqMan® analysis
2) Forward primer
3) Reverse primer
Toll-like receptor 4 (TLR4)
2) GCATAAAGTA TGGTAGAGGTGAAAACAT
3) GAAGATGGTGATGGGATT TC
TaqMan® analysis of genes identified through SSH
GeneChip (fold change)*
Copies in H56/H57
TaqMan (fold change)*
Toll-like receptor 4 (TLR4)
The authors are grateful to Drs. Robert A Lewis and Michael Tocci for their support and insights to this project, and to Drs. Chang S. Hahn and Timothy Connolly for their helpful discussion and suggestions.
1. Lipshutz RJ, Morris D, Chee M, Hubbell E, Kozal MJ, Shah N, Shen N, Yang R, Fodor SP: Using oligonucleotide probe arrays to access genetic diversity. Biotechniques. 1995, 19: 442-447.
2. Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995, 270: 467-470.
3. Chee M, Yang R, Hubbell E, Berno A, Huang XC, Stern D, Winkler J, Lockhart DJ, Morris MS, Fodor SP: Accessing genetic information with high-density DNA arrays. Science. 1996, 274: 610-614. 10.1126/science.274.5287.610.
4. Diatchenko L, Lau YF, Campbell AP, Chenchik A, Moqadam F, Huang B, Lukyanov S, Lukyanov K, Gurskaya N, Sverdlov ED, Siebert PD: Suppression subtractive hybridization: a method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proc Natl Acad Sci U S A. 1996, 93: 6025-6030. 10.1073/pnas.93.12.6025.
5. Lisitsyn N, Wigler M: Cloning the differences between two complex genomes. Science. 1993, 259: 946-951.
6. Lisitsyn NA: Representational difference analysis: finding the differences between genomes. Trends Genet. 1995, 11: 303-307. 10.1016/S0168-9525(00)89087-3.
7. Diatchenko L, Lukyanov S, Lau YF, Siebert PD: Suppression subtractive hybridization: a versatile method for identifying differentially expressed genes. Methods Enzymol. 1999, 303: 349-380.
8. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science. 1995, 270: 484-487.
9. Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP: Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci U S A. 1994, 91: 5022-5026.
10. Evans SJ, Datson NA, Kabbaj M, Thompson RC, Vreugdenhil E, De Kloet ER, Watson SJ, Akil H: Evaluation of Affymetrix Gene Chip sensitivity in rat hippocampal tissue using SAGE analysis. Serial Analysis of Gene Expression. Eur J Neurosci. 2002, 16: 409-413. 10.1046/j.1460-9568.2002.02097.x.
11. Maekawa T, Bernier F, Sato M, Nomura S, Singh M, Inoue Y, Tokunaga T, Imai H, Yokoyama M, Reimold A, Glimcher LH, Ishii S: Mouse ATF-2 null mutants display features of a severe type of meconium aspiration syndrome. J Biol Chem. 1999, 274: 17813-17819. 10.1074/jbc.274.25.17813.
12. Villaret DB, Wang T, Dillon D, Xu J, Sivam D, Cheever MA, Reed SG: Identification of genes overexpressed in head and neck squamous cell carcinoma using a combination of complementary DNA subtraction and microarray analysis. Laryngoscope. 2000, 110: 374-381. 10.1097/00005537-200003000-00008.
13. Wang T, Hopkins D, Schmidt C, Silva S, Houghton R, Takita H, Repasky E, Reed SG: Identification of genes differentially over-expressed in lung squamous cell carcinoma using combination of cDNA subtraction and microarray analysis. Oncogene. 2000, 19: 1519-1528. 10.1038/sj.onc.1203457.
14. Xu J, Stolk JA, Zhang X, Silva SJ, Houghton RL, Matsumura M, Vedvick TS, Leslie KB, Badaro R, Reed SG: Identification of differentially expressed genes in human prostate cancer using subtraction and microarray. Cancer Res. 2000, 60: 1677-1682.
15. Beck MT, Holle L, Chen WY: Combination of PCR subtraction and cDNA microarray for differential gene expression profiling. Biotechniques. 2001, 31: 782-4, 786.
16. Romani N, Gruner S, Brang D, Kampgen E, Lenz A, Trockenbacher B, Konwalinka G, Fritsch PO, Steinman RM, Schuler G: Proliferating dendritic cell progenitors in human blood. J Exp Med. 1994, 180: 83-93. 10.1084/jem.180.1.83.
17. Chapuis F, Rosenzwajg M, Yagello M, Ekman M, Biberfeld P, Gluckman JC: Differentiation of human dendritic cells from monocytes in vitro. Eur J Immunol. 1997, 27: 431-441.
18. Palucka KA, Taquet N, Sanchez-Chapuis F, Gluckman JC: Dendritic cells as the terminal stage of monocyte differentiation. J Immunol. 1998, 160: 4587-4595.
19. Theilhaber J, Bushnell S, Jackson A, Fuchs R: Bayesian estimation of fold-changes in the analysis of gene expression: the PFOLD algorithm. J Comput Biol. 2001, 8: 585-614. 10.1089/106652701753307502.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.