Endogenous reference genes are commonly used to normalize expression levels of other genes with the assumption that the expression of the former is constant in different tissues and in different physiopathological conditions. Whether this assumption is correct it is, however, still matter of debate. In this study, we searched for stably expressed genes in 384 cDNA array hybridization experiments encompassing different tissues and cell lines.
Several genes were identified whose expression was highly stable across all samples studied. The usefulness of 8 genes among them was tested by normalizing the relative gene expression against test genes whose expression pattern was known. The range of accuracy of individual endogenous reference genes was wide whereas consistent information could be obtained when information pooled from different endogenous reference genes was used.
This study suggests that even when the most stably expressed genes in array experiments are used as endogenous reference, significant variation in test gene expression estimates may occur and the best normalization is achieved when data from several endogenous reference genes are pooled together to minimize minimal but significant variation among samples. We are presently optimizing strategies for the preparation of endogenous reference gene mixtures that could yield information comparable to that of data pooled from individual endogenous reference gene normalizations.
Endogenous reference alse referred to as house keeping genes defines in biology the theoretical assumption that certain genes are ubiquitously expressed in nucleated cells possibly because their stable expression is essential for cell survival and welfare in all physio-pathological circumstances. In practical terms, endogenous reference genes provide a useful constant reference to normalize the expression of test genes in different tissues and in different conditions. This is obviously important when estimates of gene expression are provided in relative terms rather than absolute units of measurement. Thus, endogenous reference genes are used as common denominator in biological fractions where the expression of a test gene is described as the relative ratio over an arbitrarily selected internal control presumed to be stably expressed in all circumstances relevant to the experiment [1–3]. Most frequently, glyceraldehydes-3-phosphate dehydrogenase (GAPDH) [4, 5], albumin (for hepatocytes) , β-, γ-actins [7, 8], cyclophilin [9, 10], α-, β-tubulins [7, 11], hypoxantine phosphoribosyltransferase (HRPT) [12, 13], L32 [14, 15] and 18S, 28S ribosomal RNA (rRNA) [16–18] have been used as endogenous reference genes. Depending upon the experimental design, endogenous reference genes have been used individually or in combination for Northern blot analysis, reverse transcription polymerase chain reaction (RT-PCR) and quantitative real-time PCR (qRT-PCR) analysis [19, 20]. With the development cDNA microarray technology endogenous reference genes have been used for array data normalization. However, accumulation of extensive data bases suggests that the expression of frequently used endogenous reference genes can vary substantially according to materials and conditions studied [1, 2, 6, 14, 17, 18, 20–27].
Powerful insights in patterns of gene expression could be attained recently through cDNA or oligonucleotide-based global transcript analysis tools that apply a constant reference system to determine ratios of gene expression across large data sets [28, 29]. The constant reference is provided for each gene in question by consistently co-hybridizing individual test samples with a differentially labeled reference sample maintained identical throughout all the hybridization experiments. Gene expression data are then expressed as the ratio of expression between test and reference samples for each gene. By keeping the reference sample identical the resulting ratio represents a precise estimate of the relative expression of each gene across the various conditions tested bypassing the need to normalize with endogenous reference genes. This holds true if the hybridization kinetics between test and reference sample are accurately reproducible. We will refer to this concept as "reference concordance" and in the results we will discuss how reference concordance was used to validate the reproducibility of cDNA array data from which putative candidate endogenous reference genes were identified.
In the present study, we tested a set of 419 consecutive experiments performed on a 17,000 gene cDNA array platform to which RNA from neoplastic or normal tissues were consistently co-hybridized with a differentially-labeled reference RNA derived from peripheral blood mononuclear cells (PBMC) pooled from six normal donors. The following steps were pursued: 1) Reproducibility assessment of the data set through determination of reference concordance. This was achieved by repeating 14 reference experiments using the melanoma cell line A375 as test sample (Cy5) co-hybridized with pooled PBMC as reference (Cy3). To test for inter-array and printing variation, slide number one and every other 25 slides in sequential order of printing (100 slides per printing set) were used for the repetitive A375 / pooled PBMC hybridizations. In addition, to assess labeling bias, reciprocal labeling was alternatively applied as previously described . In this fashion a pool of genes expressed with high level of reference concordance was selected. 2) Identification of putative endogenous reference genes was performed on 384 array experiments of unequivocal quality by selecting genes that had demonstrated high reference concordance (>90% of the genes in the arrays) and ranking them from the lowest to the highest variance of Log2 test / reference ratios across all array experiments. 3) Validation of the candidate endogenous reference genes as predictor of relative gene expression in large data series. For this purpose, we tested the relative estimates of expression of the melanocytic lineage-restricted melanoma differentiation antigen gp100/Pmel17 (gp100)  in melanocytic and non-melanocytic tissues. Estimates of expression of gp100 were compared after normalization with different endogenous reference genes. For this analysis, we used a new tool that spots cDNA libraries from different tissues on an array platform to allow high-throughput evaluation of individual gene expression across broad tissue collections. We termed this tool: "transcriptome array". 4) Validation of endogenous reference gene-based normalization of gene expression by qRT-PCR and protein expression.
The GenePix Pro 4.0 software was used for array image analysis and calibration. All statistical analyses were performed with the SPLUS package http://www.insightful.com. Thirteen arrays with missing data in more than 30% of the spots and 8 arrays with irregular distribution according to M-A plots (M and A representing respectively log-ratios and average intensities)  were excluded. Spots in which > 50% of the pixels reached saturation in either channel, flagged spots and spots with intensity ≤ 200 in one channel and ≤ 500 in the other channel were filtered out. If the intensity in one channel was < 200 but that of the other channel was > 500, we arbitrarily assigned an intensity of 200 to the channel with the lowest intensity. The log2Cy5/Cy3 ratios were normalized by approximating median values to zero. Spots with ratios > 100 or < 1/100 were truncated at 100 and 1/100 respectively. Data normalization was done by median centering Log2ratio of all genes for each array. For data normalization we used certain arbitrary cutoff criteria to remove spots with weak signal. Weak signal approximating background fluorescence is not reliable and the corresponding log ratios are often distorted resulting in disproportionately high or low values that would bias the statistical results. Because the interpretation of spots with low signal is difficult to make we have adopted the policy of excluding them to focus the analysis on genes whose expression pattern is more reliably tracked by the array tool. In regard to spots with a ratio of >100 or <-1/100, most of them are due to extremely weak signal in one channel which generates a disproportionately extreme value. Although there is no published "gold standard" for the selection of such cutoffs, in practice the parameters that we used for this study are most commonly accepted by investigators and reasonable because allow retention of most of the data in the array excluding only the most extreme and least reliable information.
Validation of the data set through analysis of "reference concordance"
To select from the data base the most reliable genes, we first assessed reproducibility by testing the level of concordance in 14 reference experiments. Reference concordance relies on the expectation that results obtained by repeatedly hybridizing the same test and reference material should perfectly collimate and the degree of deviation from such prediction estimates experimental variance. Concordance can be easily measured by periodically re-hybridizing the same test sample with the constant reference sample. We analyzed a matrix of 7 forward and 7 reciprocally-labeled replicate array experiments hybridized periodically every other 25 cDNA array slides. Reciprocal labeling was applied to measure labeling bias . Data generated by reciprocal experiments were mathematically reversed into the same labeling direction for data analysis. Genes that were discordant due to labeling bias were identified by student's t test as those with ratios significantly different between the 7 forward and 7 reciprocal experiments after reversal of the reciprocal values (P < 0.05) and the median ratio difference between the 7 forward and 7 reciprocal experiments was larger than 1.5 fold. The genes whose variances of the log ratio across all experiments were among the top 1 percentile of variance of all genes were identified as discordant due to hybridization bias. In addition, genes with more than 50% missing values were excluded. Overall 1,343 out of 16,738 were judged potentially discordant and were excluded from further analysis. The remaining 15,395 concordant genes were used for subsequent analyses.
Endogenous reference gene identification
We studied 384 of 419 consecutive array experiments remaining after the exclusion of the 14 reference concordance experiments and 21 experiments judged of poor quality due to missing data or irregular distribution by M-A plot analysis. Selection of candidate endogenous reference genes followed these steps.
First, the seven experiments used for the analysis of reference concordance in which labeling was done identically to the rest of the experimental samples (test labeled with Cy5 and reference with Cy3) were used to calculate experimental variance (Table 1). Median of variances across the seven replicate arrays was calculated for all genes with average intensity in both channels > 2,000. This parameter provides an estimate of the variance due to experimental variation (background variance) since theoretically no differences in gene expression should be noted by using the same material. Based on the assessment of the seven repeat experiments, we defined differential expression as > 2 standard deviations (SD) from the mean of the 384 arrays which is equivalent to 1.46-fold change (± 0.549 in log2 ratio). Consequently, genes with fold changes < 1.46 from zero across all 384 arrays were categorized as candidate endogenous reference genes and were ranked according to ascending values of SD of Log ratio (Table 1).
Among these genes, we further selected candidate endogenous reference genes according to the following criteria: in at least 95% of the 384 arrays the Log2 ratio had to be within 2 SD from zero and the average intensities of both channels across all samples need to be higher than 2,000. The intensity parameter was added to ensure that the selected endogenous reference genes were expressed at a relatively high level in most tissues to mitigate excessive fluctuations in Log2 ratios occurring when a low value is applied as denominator. This is important when a reference gene is applied as a denominator in the equation used to normalize the ratio of other tests genes; in such cases robust gene expression decreases the range of ratios resulting from the analysis decreasing, therefore, the experimental variance. Sixty-nine genes fit these criteria. The range of mean Log2 ratio was from 0 to 0.15 and the SD of mean Log2 ratio was from 0.27 to 0.42. Analysis of 7 replicate array based on the same group of candidate endogenous reference genes demonstrated a similar distribution of mean Log2 ratio and SD of mean Log2 ratios (with exception of TFAP2B) suggesting that this variance could be attributed to predominantly experimental noise. Among these genes we further selected for validation 11 that had high mean fluorescence intensity (underlined in Table 1). However, probe preparation and other technical considerations limited to analysis to only 8 of these genes which included: NEED8, HIRIP5, GPLD1, RAB31, SNRPD3, FZD6, COL8A1 and AFAP.
"Leave-one-out cross-validations" were used to estimate the error [33, 34]. In each validation experiment one array was left out. The remaining 383 arrays were used to identify endogenous reference genes using the same parameters used for the original analysis (test/reference ratios < 2 SD in at least 95% of the experiments and average intensity > 2,000 in both channels). The median Log2 ratio of the endogenous reference gene in the left out array was then compared with the "real" normalization factor consisting of the median Log2 ratio for all genes in that array supposed to approximate a balanced expression of test and reference genes for that array. This procedure was repeated for each array 384 times. Only in 3 arrays out of 384 (2.3%) were found errors in endogenous reference gene selection (errors were considered median Log2 ratio of endogenous reference genes in the left out array > 2 standard deviations apart from the "real" normalization factor in the same array (which is zero for normalized arrays).
The expression of each candidate endogenous reference gene identified in this study was then compared with available information about the expression of the same genes in 12 normal human tissues reported by the Affymetrix HG-U95A-E probe sets http://nciarray.nci.nih.gov/cards/index.html. We found a good correlation between the expression patterns observed by us and that reported by the Affymetrix GeenChip array (data not shown). In addition, a cluster of endogenous reference genes suggested by Applied Biosystems for standardization during qRT-PCR was analyzed by evaluating the SD of their Log2 ratios across the 384 arrays. As data indicated in Table 2, although some genes (bold) demonstrated mean SD Log2 ratios comparable to that of the endogenous reference genes identified in this study, the corresponding percentage of the arrays in which Log2 ratios were within 2 SD from zero ranged between 0.68–0.85 which is significantly lower than the frequency in which the same parameter fell for the 69 candidate house keeping genes identified here (>95%). In addition, we looked at the variation in our arrays of genes classically used as endogenous reference genes including: ribosomal protein L32, HRPT, β-actin and tubulin-α3. In all cases the mean SD of Log2Ratio was relatively larger than for the endogenous reference genes identified in Table 1 with the following respective values: 0.88 for Ribosomal protein L32, 1.13 for HRPT, 1.32 for β-actin and 1.42 for laminin-α3. Interestingly, however, the mean intensities were relatively higher for all of these genes compared with the ones identified in this study (7,369; 12,778, 43,794 and 42,241 for the four genes respectively) suggesting that these genes are expressed probably at higher concentrations in tissue and, therefore, although relatively variable in expression in different tissues, they have been a useful marker of RNA abundance when only rough estimates are required like, for instance, for Northern Blotting.
Test gene expression estimates according to house-keeping gene selection: the transcriptome array
To validate the usefulness of candidate endogenous reference genes in large sample populations we developed a new tool that we are planning to use in the future for validation of gene expression across extensive data sets. This tool displays cDNA libraries originated from different tissues or cell lines individually spotted on a solid surface. The principle of this technology is similar to RNA dot blot which uses RNA isolated from samples and transferred to membrane making, therefore, the transcriptome array a high-density dot blot. The labeled DNA probe of interest is hybridized to the immobilized complementary strain of mRNA. A reference gene hybridization will carried out simultaneously to estimate the relative expression of the gene of interest compared with the reference gene. The new tool we describe here, termed transcriptome array, utilizes cDNA generated from source mRNA for target immobilization to improve the stability of the immobilized targets and differentially fluorescence-labeled test and reference probes (RNA or double strained DNA) then can co-hybridize on to the same spots. Using a validated RNA amplification technology , large quantity of pure amplified RNA with proportional representation of source mRNA species could be generated from which cDNA could be obtained through a reverse transcription reaction. Because of the minimum amount of cDNA used for fabricating each transcriptome array (<5 nano gram cDNA/spot) and the size of spots (100 um), the expression of a large number of genes can be theoretically analyzed on thousands of different samples simultaneously. Since the amount of cDNA spotted may vary according to the quality of the starting material and the efficiency of RNA preparation for each sample, absolute estimation of fluorescence from the hybridized probe is not informative of the expression of the given gene in each sample. Therefore, a reference system is necessary so that comparative expression of the test gene can be presented proportional to that of a consistently expressed gene across all samples. Therefore, interpretation of data derived from the transcriptome array relies on endogenous reference gene normalization.
To test the usefulness of various endogenous reference gene we resorted to the well characterized expression of the melanocytic lineage-specific gene gp100/Pmel17  that is expressed exclusively though heterogeneously in cells of melanocytic lineage [35, 36]. The assumption of this experiment was that in spite of its heterogeneity of expression in samples of melanocytic origin, overall the expression of gp100 should be higher in meloanocytic compared with non melanocytic samples with a high degree of significance. Rho-C has been associated with the metastatization process in melanoma but its specificity for melanocytic lineage is unknown . The differentiation control element DICE is found in the 3'-UTR of numerous eukaryotic mRNAs and there is no solid association between its pattern of expression and specific physio-pathological or developmental conditions . Therefore, we compared the expression profile of these three genes in 829 cDNA libraries that included 106 melanoma cell lines, 127 melanoma metastases, 2 benign nevi (total of 235 melanocytic samples) and 593 miscellaneous samples containing a predominance of primary esophageal, renal and colon cancers paired with normal tissues from the same organ and a large number of circulating mononuclear cells. In addition, samples from most other normal and cancerous tissues were included although in smaller number (complete list of samples available upon request).
As endogenous reference genes we chose β-Actin and new candidate genes identified by this study (see previous section). The test genes (gp100, Rho-C and DICE) were separately hybridized to the transcriptome array. Each gene labeled with Cy3 was co-hybridized with individual endogenous reference genes labeled with Cy5. We then compared Log2 Cy5 / Cy3 ratios between melanocytic and non-melanocytic tissues. gp100 was, as expected, expressed more in melanocytic lesions with high degree of significance no matter what gp100 / endogenous reference gene combination was used. The range of significance, however, varied extensively depending upon the gp100 / endogenous reference gene combination. This difference was considered representative of the ability of different endogenous reference genes to normalize for tissue-specific gene expression patterns (Figure 1). In details, all endogenous reference genes could segregate melanocytic from non melanocytic lesions with a high degree of significance (Figure 2; unpaired two sample t test p2-values ranged from 6 × 10-8 to 3 × 10-36). There was, however, a big range in the discriminatory capacity among endogenous reference genes with NEDD8, RAB3 and FZD6 providing the highest predictive value (t test p2-value = 3 × 10-36, 1 × 10-32 and 1 × 10-21 respectively) and β-actin being one of the least reliable (t test p2-value = 1 × 10-8).
The large variation in the results obtained using different endogenous reference genes to normalize gp100 expression could have been due to a higher stability of the expression of some genes across all samples or to a differential expression of the endogenous reference genes themselves in melanocytic lesion. In the latter case, the better results obtained could be coincidental and not useful in other experimental situations. We, therefore, tested whether pooling results obtained with all endogenous reference genes independently of each predictive value in this controlled experimental situation could yield results as informative about gp100 pattern of expression as those obtainable with individual endogenous reference genes, particularly those that provided the best prediction. The advantage of this strategy is that it does not depend on previous knowledge of gene expression patterns for the selection of individual endogenous reference genes not applicable in conditions, in which the suitability of a gene for a given experimental situation is, contrary to gp100, is unknown. Thus, we compared the predictive value of the average of gp100 / endogenous reference gene Log2 ratios obtained with all endogenous reference genes and those obtained using the genes that provided the best five or the best three results (Figure 2). The use of pooled information from various endogenous reference genes appeared to stabilize the non-melanocytic Log2 ratios gp100 / endogenous reference. In fact, the SD of theLog2 ratios gp100 / endogenous reference among non-melanocytic samples derived by pooling endogenous reference gene results were significantly lower than the SD obtainable with any individual endogenous reference gene (F-test). The basis for this test was that non-melanocytic lesions uniformly should not express gp100 and, therefore, the Log2 ratios for any gp100 / endogenous reference gene combination should be constant resulting in very low SD. SD for non-melanocytic lesions were much lower using pooled endogenous reference genes for the normalization (Figure 2) and this was true at extreme levels of significance (F-test value for SD when all endogenous reference genes were used for normalization compared with individual endogenous reference were 0, 1.5 × 10-9, 0, 5.6 × 10-7, 9.0 × 10-5, 0, 0, 0 and 0 for NEDD8, RAB3, FZD6, AFAP, COL8A, SNRPD, HIRIP5, β-actin and GPLD1 respectively). Similar results were obtained when results from the best 5 and best 3 endogenous reference genes were pooled together. In addition, even if pooling data together did not provide the highest level of discrimination compared to the best results obtained with selected individual endogenous reference genes, the capacity of pooled data sets to discriminate gp100 expression between melanocytic and non melanocytic samples was still very high (2 × 10-22; 4 × 10-29 and 1 × 10-34 respectively when all, the best 5 or the best 3 endogenous reference gene results were pooled together). Interestingly, different endogenous reference genes provided not only different levels of discrimination but also provided different estimates of gp100 expression in melanocytic lesions with gp100 / endogenous reference gene Log2 ratio above (NEED8, RAB3, FZD6, AFAP, SNRPD3, HIRIP5, GPLD1) or below (COL8A1, β-actin) 0 (Bottom graph, Figure 2). This, of course, has no impact on the ability to characterize patterns of expression of different genes in different tissues but rather suggests that the latter group of endogenous reference genes is expressed at higher copy number, therefore, biasing the data toward its own fluorescence channel. Simple adjustment in probe concentration could easily solve this problem.
We then analyzed the expression of Rho-C and DICE whose pattern of expression in the transcriptome array cannot be predicted based on available information save for the notion that of the expression of Rho-C is related to the metastatic process of melanoma . The pattern of Rho-C was characterized with all the endogenous reference genes and representative information is presented in Figure 3. Most endogenous reference genes yielded a pattern suggestive of a preferential expression of Rho-C in melanoma metastases. In addition, averages of Rho-C / endogenous reference gene Log2ratios demonstrated clearly a specific expression of this gene in melanoma metastases. This pattern was not necessarily specific for melanoma as the other cancerous tissues spotted in the transcriptome array were obtained from primary lesions. Thus, it is possible that the preferential expression in melanoma metastases was due to the metastatic process rather than the melanocytic lineage. Interestingly, the melanoma cell line Mel-A375 did not constitutively express Rho-C as previously observed by Clark et al. . In addition, none of the melanoma cell lines mostly derived from melanoma metastases demonstrated expression of Rho-C. This information suggests that Rho-C may be involved in the natural metastatic process in vivo but is not constitutively expressed in vitro. Finally, DICE did not demonstrate any specificity of expression and appeared variably expressed in all specimens independently of the endogenous reference gene used for normalization.
Validation of data by protein analysis and quantitative real-time PCR (qRT-PCR)
We tested whether gp100 / endogenous reference gene Log2 ratios correlated with gp100 protein expression level as measured by intra-cellular FACS analysis (Figure 4). The gp100 / endogenous reference gene Log2 ratios were derived from the transcriptome arrays while protein expression was based on previous characterizations of the same cell lines [35, 39]. Overall, a good correlation (Pearson's correlation) was noted for gp100 values normalized with most endogenous reference genes. However, with the exception of NEDD8, the best correlation was obtained when pooled information was used from the best 5 and 3 candidate endogenous reference genes as defined before. Interestingly, RAB3 that scored very high as a predictor of gp100 transcriptional expression in melanocytic lesions and β-actin provided the worst correlation with protein expression. This data suggest that, although different endogenous reference genes may yield better predictive value than others in large data sets, it is likely that their reliability varies in different conditions and possibly the best results can be obtained by pooling information from several of them.
We also tested by quantitative real-time PCR (qRT-PCR) the discriminatory capacity of endogenous reference genes in samples with known patterns of expression of gp100. We selected 6 previously characterized [35, 36] melanoma cell lines four of which were known to express (MEL-553B, MEL-1317, MEL-526, MEL-1102) and two known not to express (MEL-836, MEL-1195) gp100 at the transcriptional and protein level (Figure 5) [36, 39]. In addition, 4 fine needle aspirate (FNA) samples from melanoma metastases expressing gp100  and a series of tumors (2 renal cell, 2 esophageal one gastric carcinomas and HELA cells line) known not to express gp100 were tested. Absolute expression of gp100 was measured and although the lesions supposed to be gp100 expressing appeared more positive than those supposed to be gp100 negative there was no absolute demarcation among the two groups (Figure 5a). A better differentiation between positive and negative lesions could be obtained when the various endogenous reference genes were used for the analysis or the combined values from all of them (Figure 5b).
Although the concept of endogenous reference genes is appealing it may be unrealistic to expect that a gene could be equally expressed in all eukaryotic cells in all physiopathological conditions. Thus, the endogenous reference gene concept can only approximate such ideal biological behavior. In fact, numerous studies noted that the expression of the ostensible endogenous reference genes varies according to distinct physio-pathological situations and can be affected by experimental manipulation [2, 20, 24, 26, 27, 41]. For instance, the levels of expression of GAPDH, cyclophilin and β-actin fluctuate in different tissues, disease stage and are affected by cell behavior like proliferation [6, 15, 18, 20, 42]. In addition, endogenous reference genes expression varies markedly in cancer compared to normal tissue [17, 26, 41] suggesting that oncogenesis may influence their expression .
In this study, therefore, we looked for relative stable endogenous reference gene challenging a 17,000 cDNA clone data base against approximately 400 specimens that included a broad variety of normal and neoplastic tissues as well as cell lines. Indeed, few previously unnamed genes approximated the ideal endogenous reference gene by being expressed consistently in more than 95% of the specimens analyzed within a variance that was barely and insignificantly different from what could be expected from intrinsic experimental fluctuation (Table 1). In fact, the threshold that we selected based on the variance observed by periodically repeating identical experiments was consistent with criteria suggested by others  who claimed that genes with fold changes greater than 1.5 can be classified as differentially expressed at ~95% confidence interval. This is consistent to our selection criterion that contained endogenous reference genes within a 1.46 fold maximal variation. Although these genes have not been proposed before as candidate endogenous reference genes, analysis of available data bases suggested that they are relatively stably expressed in various normal tissues (i.e. Affymetrix HG-U95A-E probe sets). Selection of these candidate endogenous reference genes provided useful information when the expression of a lineage specific gene such as gp100 was compared between relevant and irrelevant samples. In addition, gp100 expression normalized according to some of the new endogenous reference genes more closely aligned to lineage specificity than normalization performed using β-actin.
This data, however, confirm that there is no such thing as a perfect endogenous reference gene. Although some endogenous reference genes demonstrated high power of discrimination their ranking was dictated by previous knowledge of the expected experimental results (gp100 being melanocytic lineage specific). However, gene expression analysis is done to determine such specificity and in most cases no previous knowledge of tissue specificity is truly available. One may argue that, in spite of the extreme variability in normalization power demonstrated by various endogenous reference genes, still significant differences could extracted between the melanocytic and non-melanocytic samples no matter what endogenous reference genes were used. This statement is correct in this experimental situation where hundreds of samples could be compared with the transcriptome array. However, in cases when only few samples are analyzed the difference in discriminatory capacity becomes critical. For instance, when randomly selected gp100 / endogenous reference gene Log2 ratios from 10 melanoma metastases were compared with those from 10 kidney specimens only NEDD8, RAb3, FZD6, AFAP, COL8, SNRPD and pooled endogenous reference gene-based normalizations yielded significant differences between the two groups while non significant values were obtained for normalizations based on HIRIP5, β-actin and GPLD1.
The lowest limit of detection in the transcriptome array of specific probes in term of copy numbers is currently under investigation. Decreasing concentrations of β-actin alone printed the array spots as internal controls suggest that the detection range is from 30 ng / spot to 3 pg / spot when standard fluorescent intensity range parameters are applied. The higher limit of detection before saturation can be easily adjusted to avoid under estimates of highly expressed genes, therefore, increasing the dynamic range of detection. This is achieved by adjusting the fluorescent intensity right below the saturation for both channel and subsequently calculating the abundance of reference gene expression using serially diluted internal control spots as standard curve. With few exceptions, the expression of most genes can be estimated with relative accuracy using this semi-quantitative approach.
Since in most cases endogenous reference genes are selected based on previous knowledge of their usefulness for a particular experimental situations, it is likely that the best consistency could be achieved in such circumstances when averages of several endogenous reference gene normalizations are used. This may represent a reasonable solution particularly if a cocktail of endogenous reference genes identically labeled could be used as reference within a single experiment. We are presently testing various combinations of lineage specific genes and pools of endogenous reference genes for hybridization to the transcriptome array and for qRT-PCR. Such attempts, however, have been quite disappointing so far as results obtained using mixture of endogenous reference genes do not yield information comparable to that achievable when results obtained with the same endogenous reference genes used individually are pooled together. This is possibly due to imbalanced density of various endogenous reference genes in different samples with predominant effects of some over others. Therefore, extensive work in the future will be aimed at equilibrating the interactions among endogenous reference genes in various experimental conditions to test whether mixtures of them could be used in multiplex for normalization of large expression analysis studies.
Our observations, together with previous work by others indicate that much caution should be taken when using endogenous reference to normalize the expression of test genes. It is likely that previous analyses based on single endogenous reference genes might have been strongly biased by the individual selections and the interpretation of the results should looked at with caution. In particular, this study suggests that even when the most stably expressed genes in array experiments are used as endogenous reference genes, significant variation in test gene expression estimates may occur and the best normalization is achieved when data from several endogenous reference genes are pooled together to minimize minimal but significant variation among samples. We are presently optimizing strategies for the preparation of endogenous reference gene mixtures that could yield information comparable to that of data pooled from individual endogenous reference gene normalizations.
The list of samples used for cDNA microarray hybridization or for preparation of the transcriptome array is available upon request. Most melanoma and renal carcinoma cell lines were generated at the Surgery Branch, National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD and maintained in RPMI (Biosource International, Camarillo, CA) supplemented with 10% fetal calf serum (Biosource). Cell lines were harvested by Trypsin /Versene (Biosource) digestion. PBMC were collected by leukapheresis from six unrelated normal donors in the Department of Transfusion Medicine, Clinical Center, NIH and purified by Ficoll gradient separation. Surgically removed tumor tissues were collected from the Tissue Network (Philadelphia, PA), Surgery Branch and CPSB specimen bank, NCI, (Bethesda, MD). Fine needle aspiration (FNA) biopsies were obtained from patients with metastatic melanoma referred to the Surgery Branch, NCI for immunotherapy. Total RNA was isolated using Qiagen RNeasy kit and its quality and quantity was estimated using Agilent Bioanalyzer 2000 (Agilent Technologies, Palo Alto, CA).
Target preparation for cDNA microarray hybridization
Total RNA extracted from test and reference samples was transcribed in vitro into anti-sense RNA (aRNA) and reverse-transcribed into fluorescence labeled cDNA for hybridization to 17,000 gene cDNA-based microarrays [30, 44]. Pooled PBMC were used to prepare reference aRNA to be co-hybridized in all experiments with test aRNA. cDNA targets were labeled with Cy3 (green) for reference material and Cy5 (red) for test material with the exception of reciprocal experiments.
cDNA (UniGene cluster) microarrays were printed at the Immunogenetics Section, DTM, CC, NIH with a configuration of 32 × 24 × 23 containing 17,500 elements. Clones used for printing were selected from the Research Genetics RG_HsKG_031901 8 k clone set and the 9k RG_Hs_seq_ver_070700 40 k clone set. The 17,500 spots included 12,072 uniquely named genes, 875 duplicate genes and about 4,000 expression sequence tags. For a complete list of genes included in the Hs-CCDTM-17.5k-1px printing please visit our web site at http://nciarray.nci.nih.gov/gal_files/index.shtml.
A collection of aRNA-based libraries was prepared from 960 frozen tissue samples or cell lines and individual aRNAs were reverse transcribed into cDNA in the presence of 1 μl of dN6 primer (8 μg/μl), 6 μl of first strand buffer, 3 μl of 10 mM dNTP, 3 ul of 50 mM dTT, 1.5 μl of Rnasin and 10 μl of aRNA (6–12 ug) with addition of 3 μl of Superscript II (BRL) in 5.5 μl volume after heating to 65°C for 5 minutes. Reactions were carried out in 96-well plates at 42°C for 90 minutes followed with addition of 1 μl Rnase H and heating for 20 min at 37°C. cDNA were further purified by utilizing CentriCept gelfitration plates to remove non incorporated primer and dNTP. Purified cDNA were transferred to 384-well plates and dried by speedvac. Samples were re-suspended in 13 μl of 3 × SSC followed by shaking at 3,000 rpm for 15 min. Reconstituted cDNA libraries from individual samples and spiked reference gene at different concentration were printed on to poly-L-lysine coated glass slides at the concentration of 0.5–1 μg/μl using the OmniGrid (GeneMachine. San Carlos, CA) printer and Telechem printing pins (TeleChem International, Inc. Sunnyvale, CA). Each 100 μm diameter spot was duplicated at a 250 μm distance. After complete exsiccation, slides were post-processed as described at http://nciarray.nci.nih.gov/reference/Post_Processing.html.
Specific PCR primers for the amplification of each gene probe were designed using Primer 3 program http://molbio.info.nih.gov/molbio/analysis.htm. In order to generate single strand fluorescence labeled cDNA for hybridization, modified specific 5' primers with an extension of the T7 promoter region were used for PCR amplification. Double strand PCR fragments (300–800 bp) were used as template for in vitro transcription to generate single stranded mRNA. 3 μg of amplified RNA were then reverse transcribed into cDNA in present of 4 μl of first strand buffer, 1 μl dN6 primer (8μg/μl; Boehringer Mannheim), 2 μl 10 X low T-dNTP (5 mM A, C and GTP, 2 mM dTTP), 2 μl Cy-dUTP (1 mM, Cy3 or Cy5), 2 μl 0.1 M DTT, 1 μl RNasin, 3 μg amplified mRNA in 8 μl DEPC H2O. After heating to 65°C for 5 minutes and cooling to 42°C, 2 μl of SSII was added and the labeling reaction was carried out at 42°C for 90 min. Probe purification and hybridization were performed as previously described [30, 44].
Primer and probe sets from each candidate genes were designed using the Primer Express 2.0 program (Applied Biosystems, Foster City, CA) and synthesized by BioSource. Taqman probes were labeled at the 5' end with the reporter dye molecule 6-carboxy-fluorescein (FAM; emission λmax = 518 nm) and at the 3' end with the quencher dye molecule 6-carboxytetramethyl-rhodamine (TAMRA; emission λmax = 582 nm). Total RNA was used for gene expression validation. Measurement of gene expression was performed using the ABI 7900HT sequence detection system (Perkin Elmer, Foster City, CA) as previously described [40, 45]. Messenger RNA from test samples were reverse transcribed into cDNA in the presence of oligo dT (12–18 mer). The qRT-PCR procedure was performed by alternating 2 minute cycles at 50°C and 10 minute at 95°C. Forty cycles involving denaturation at 95°C for 15 seconds and annealing/extension at 60°C for 1 minute in 50 μl volume with 1 × TaqMan Master Mix (PerkinElmer). Standard curves were generated for each gene with high and accurate PCR amplification efficiency as determined by the slope of the standard curves. Linear regression analysis of all standard curves documented in all cases an R value ≥ 0.99. The sequence of primer sets and probes used for each gene are shown in Table 3 with statistics for each standard curve. Standard curve extrapolation of copy number was performed for the gene of interest as well as an endogenous reference gene for each sample. To correct for concentration of starting material, normalization of samples was performed by dividing the copies of the gene of interest by copies of the reference gene as previously described .
Thellin O, Zorzi W, Lakaye B, De Borman B, Coumans B, Hennen G, Grisar T, Igout A, Heinen E: Housekeeping genes as internal standards: use and limits. J Biotechnol. 1999, 75: 291-295. 10.1016/S0168-1656(99)00163-7.
Petersen BH, Rapaport R, Henry DP, Huseman C, Moore WV: Effect of treatment with biosynthetic human growth hormone (GH) on peripheral blood lymphocyte populations and function in growth hormone-deficient children. J Clin Endocrinol Metab. 1990, 70: 1756-1760.
Goldsworthy SM, Goldsworthy TL, Sprankle CS, Butterworth BE: Variation in expression of genes used for normalization of Northern blots after induction of cell proliferation. Cell Prolif. 1993, 26: 511-518.
Choi JK, Holtzer S, Chacko SA, Lin ZX, Hoffman RK, Holtzer H: Phorbol esters selectively and reversibly inhibit a subset of myofibrillar genes responsible for the ongoing differentiation program of chick skeletal myotubes. Mol Cell Biol. 1991, 11: 4473-4482.
Wei Q, Guan Y, Cheng L, Radinsky R, Bar-Eli M, Tsan R, Li L, Legerski RJ: Expression of five selected human mismatch repair genes simultaneously detected in normal and cancer cell lines by a nonradioactive multiplex reverse transcription-polymerase chain reaction. Pathobiology. 1997, 65: 293-300.
Serels S, Day NS, Wen YP, Giraldi A, Lee SW, Melman A, Christ GJ: Molecular studies of human connexin 43 (Cx43) expression in isolated corporal tissue strips and cultured corporal smooth muscle cells. Int J Impot Res. 1998, 10: 135-143. 10.1038/sj.ijir.3900336.
Bhatia P, Taylor WR, Greenberg AH, Wright JA: Comparison of glyceraldehyde-3-phosphate dehydrogenase and 28S-ribosomal RNA gene expression as RNA loading controls for northern blot analysis of cell lines of varying malignant potential. Anal Biochem. 1994, 216: 223-226. 10.1006/abio.1994.1028.
Goidin D, Mamessier A, Staquet MJ, Schmitt D, Berthier-Vergnes O: Ribosomal 18S RNA prevails over glyceraldehyde-3-phosphate dehydrogenase and beta-actin genes as internal standard for quantitative comparison of mRNA levels in invasive and noninvasive human melanoma cell subpopulations. Anal Biochem. 2001, 295: 17-21. 10.1006/abio.2001.5171.
Blanquicett C, Johnson MR, Heslin M, Diasio RB: Housekeeping gene variability in normal and carcinomatous colorectal and liver tissues: applications in pharmacogenomic gene expression studies. Anal Biochem. 2002, 303: 209-214. 10.1006/abio.2001.5570.
Lee PD, Sladek R, Greenwood CM, Hudson TJ: Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Res. 2002, 12: 292-297. 10.1101/gr.217802.
Kawakami Y, Eliyahu S, Delgado CH, Robbins PF, Sakaguchi K, Appella E, Yannelli JR, Adema GJ, Miki T, Rosenberg SA: Identification of a human melanoma antigen recognized by tumor-infiltrating lymphocytes associated with in vivo tumor rejection. Proc Natl Acad Sci U S A. 1994, 91: 6458-6462.
Panelli JNMC, Hackett JA, Bettinotti M, Mixon A, Wunderlich J, Parker L, Restifo NP, Ferrone S, Marincola FM: Natural variation of the expression of HLA and endogenous antigen modulates CTL recognition in an in vitro melanoma model. Int J Cancer. 1999, 80: 781-790. 10.1002/(SICI)1097-0215(19990301)80:5<781::AID-IJC24>3.0.CO;2-A.
Riker A, Kammula US, Panelli MC, Wang E, Ohnmacht GA, Steinberg SM, Rosenberg SA, Marincola FM: Threshold levels of gene expression of the melanoma antigen gp100 correlate with tumor cell recognition by cytotoxic T-lymphocytes. Int J Cancer. 2000, 86: 818-826. 10.1002/(SICI)1097-0215(20000615)86:6<818::AID-IJC10>3.3.CO;2-N.
Reimann I, Huth A, Thiele H, Thiele BJ: Suppression of 15-lipoxygenase synthesis by hnRNP E1 is dependent on repetitive nature of LOX mRNA 3'-UTR control element DICE. J Mol Biol. 2002, 315: 965-974. 10.1006/jmbi.2001.5315.
Hijazi JNYM, Abati A, Fetsch P, Bettinotti M, Steinberg SM, Rosenberg SA, Marincola FM: Heterogeneous expression of melanoma-associated antigens (MAA) and HLA-A2 in metastatic melanoma in vivo. Int J Cancer. 1998, 75: 517-524. 10.1002/(SICI)1097-0215(19980209)75:4<517::AID-IJC5>3.3.CO;2-M.
Ohnmacht GA, Wang E, Mocellin S, Abati A, Filie A, Fetsch PA, Riker A, Kammula US, Rosenberg SA, Marincola FM: Short term kinetics of tumor antigen expression in response to vaccination. J Immunol. 2001, 167: 1809-1820.
Solanas M, Moral R, Escrich E: Unsuitability of using ribosomal RNA as loading control for Northern blot analyses related to the imbalance between messenger and ribosomal RNA content in rat mammary tumors. Anal Biochem. 2001, 288: 99-102. 10.1006/abio.2000.4889.
Yang IV, Chen E, Hasseman JP, Liang W, Frank BC, Wang S, Sharov V, Saeed AI, White J, Li J, Lee NH, Yeatman TJ, Quackenbush J: Within the fold: assessing differential expression measures and reproducibility in microarray assays. Genome Biol. 2002, 3: research0062-
Wang E, Marincola FM: Amplification of small quantities of mRNA for transcript analysis. In DNA arrays – A Molecular Cloning Manual. Edited by: Bowtell D, Sambrook J. 2002, Cold Springs Harbor, NY: Cold Spring Harbor Laboratory Press, 204-213.
Kammula US, Lee K-H, Riker A, Wang E, Ohnmacht GA, Rosenberg SA, Marincola FM: Functional analysis of antigen-specific T lymphocytes by serial measurement of gene expression in peripheral blood mononuclear cells and tumor specimens. J Immunol. 1999, 163: 6867-6879.
Marincola FM, Hijazi YM, Fetsch P, Salgaller ML, Rivoltini L, Cormier JN, Simonis TB, Duray PH, Herlyn M, Kawakami Y, Rosenberg SA: Analysis of expression of the melanoma associated antigens MART-1 and gp100 in metastatic melanoma cell lines and in in situ lesions. J Immunother. 1996, 19: 192-205.
PJ validated the system combining high-throughput data with gene-directed analysis by quantitative real-time PCR. YZ performed statistical analysis. YN, MCP, DN, VM and KS participated in the preparation and utilization of array experiments used for the identification of stably expressed genes. NH, HS and PRT provided a large library of samples necessary for the development of the transcriptome array. FMM supervise the project as principal investigator.
EW provided leadership for the development of cDNA arrays, transcriptome array and for the overall conduct of the experiments.
Ping Jin, Yingdong Zhao contributed equally to this work.
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.