Analyses and interpretation of whole-genome gene expression from formalin-fixed paraffin-embedded tissue: an illustration with breast cancer tissues

Background We evaluated (a) the feasibility of whole genome cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay on formalin-fixed paraffin-embedded (FFPE) tissue and (b) whether similar conclusions can be drawn by examining FFPE samples as proxies for fresh frozen (FF) tissues. We used a whole genome DASL assay (addressing 18,391 genes) on a total of 72 samples from paired breast tumor and surrounding healthy tissues from both FF and FFPE samples. Results Gene detection was very good with comparable success between the FFPE and FF samples. Reproducibility was also high (r2 = 0.98); however, concordance between the two types of samples was low. Only one-third of the differentially expressed genes in tumor tissues (compared to corresponding normal) from FF samples could be detected in FFPE samples and conversely only one-fourth of the differentially expressed genes from FFPE samples could be detected in FF samples. GO-enrichment analysis, gene set enrichment analysis (GSEA) and GO-ANOVA analyses also suggested small overlap between the lead functional groups that were differentially expressed in tumor detectable by examining FFPE and FF samples. In other words, FFPE samples may not be ideal for picking individual target gene(s), but may be used to identify some of the lead functional group(s) of genes that are differentially expressed in tumor. The differentially expressed genes in breast cancer found in our study were biologically meaningful. The "cell cycle" & "cell division" related genes were up-regulated and genes related to "regulation of epithelial cell proliferation" were down-regulated. Conclusions Gene expression experiments using the DASL assay can efficiently handle fragmentation issues in the FFPE tissues. However, formalin fixation seems to change RNA and consequently significantly alters gene expression in a number of genes which may not be uniform between tumor and normal tissues. Therefore, considerable caution needs to be taken when interpreting gene expression data from FFPE tissues, especially in relation to specific genes.


Background
High-throughput microarray technology is a powerful tool for genome-wide genotyping and gene expression analysis. Microarray-based gene expression assessment is a very useful method for prediction of diseases, tumor classification and drug responses. Although good quality RNA can be extracted from fresh frozen (FF) tissues, tissues preserved in RNAlater reagent and primary cell culture, the limited availability of these sources is a problem with regards to the utility of gene expression measurements. As Formalin-Fixed Paraffin-Embedded (FFPE) sample collection and storage are routine practices in pathological laboratories worldwide, there is great interest in the use of RNA extracted from those archived samples.
Integrity of nucleic acid is a very important issue for microarray analysis. It is well known that the FFPE tissue RNA is often degraded and at the same time, it is chemically modified [1,2]. Previous studies using different microarray platforms showed that only approximately 3% or less of the RNA isolated from the FFPE tissue are useful for cDNA synthesis, which is an important step for gene expression analysis on a microarray platform [3]. Illumina Inc. introduced a gene expression profiling method, DASL (cDNA-mediated Annealing, Selection, extension and Ligation) specially designed for analysis of fragmented RNA samples [4][5][6]. In the present study, we used the Whole Genome DASL Assay on paired breast tumor and surrounding healthy tissue from both FFPE tissue and FF tissue to examine: (a) the feasibility of genome-wide gene expression analyses using DASL on FFPE tissue and (b) whether similar conclusions can be drawn by examining widely available FFPE tissue as a proxies for FF tissues.

Tissue Samples
The study was approved by the Institutional Review Board of The University of Chicago. We obtained four different sets of tissue samples (paired tumor and adjacent normal breast tissue from fresh frozen as well as corresponding FFPE blocks) from the same patients through our Human Tissue Resource Center (HTRC) at The University of Chicago http://pathcore.bsd.uchicago. edu/BSB/BSB_overview.shtml. HTRC collected these breast tissues from mastectomy/lumpectomy specimens that were not needed for pathological diagnosis by the Department of Pathology at The University of Chicago Medical Center. There were a total of 21 such patients for whom all 4 types of tissues were available. For the present study, tissue samples from the first 18 patients were used. These tissue samples were archived for a 3-6 year period.

RNA extraction and quality control
For the extraction of total RNA from both FFPE and FF breast tissue samples, we used the High Pure RNA Paraffin Kit (Roche Applied Science, cat# 03 270 289 001). Xylene was used for deparaffinization of the FFPE samples. Proteinase K digestion was carried out by overnight incubation at 55°C. DNAse treatment was carried out for all of the samples and samples were eluted with 40 μl of the elution buffer provided with the extraction kit. For quantification, each sample was (a) tested in a NanoDrop 1000 for concentration and 260/280 ratio and also (b) run in an Agilent Bioanalyzer using the Eukaryotic total RNA Nano Series II kit. Prequalification of all the RNA samples derive from FFPE and FF tissue was done by qPCR using SYBR Green detection. cDNA was synthesized as recommended by Illumina, and then a 90 bp amplicon was amplified from the highly expressed RPL13a ribosomal protein gene (GenBank Accession # NM_012423.2). The amplification reaction was carried out for all FFPE and FF samples in duplicates in ABI Prism 7900 HT Sequence Detection Systems (Applied Biosystems). The ΔCt was calculated as Ct (test sample) -Ct (control sample). As per Illumina protocol for DASL, a ΔCt of up to 12 was acceptable as prequalification for the assay. Only 4 out of 72 breast tissue samples had ΔCt just above 12 and none > 12.5. Only 1 of these 4 samples showed a poor performance on the assay (see result section).

Whole genome DASL assay
We used 300 ng of total RNA (5 μL at 60 ng/μL concentration) as starting material for each sample run on the microarray. Each chip accommodates 8 samples. Four types of sample for each patient were processed in one chip. The four types of sample were FF tumor tissue (T_FF), FF adjacent normal tissue (N_FF), FFPE tumor tissue (T_FFPE) and (4) FFPE adjacent normal tissue (N_FFPE). Thus, a single chip contained 8 samples from 2 patients and three such chips (24 samples) were processed in one batch. We also ran four technical replicates of the T_FFPE RNA samples to analyze the reproducibility of the DASL assay. According to the manufacturer's protocol, RNA was first converted to cDNA through reverse transcription with biotinylated primers. The cDNA was then annealed to the assay oligonucleotide and bound to streptavidin conjugated paramagnetic beads. After the oligohybridization, non-hybridized and miss-hybridized sequences were washed away and the hybridized sequences were extended and ligated to start PCR amplification with fluorescent tagged primers. The fluorescently labeled amplified PCR products were hybridized overnight onto the bead chip to be scanned on Bead Array reader.

Statistical analyses and power Statistical analysis
To compare the continuous variables (e.g. number of detected genes/samples or average signal intensity across different groups), we used one-way analysis of variance (ANOVA). For the gene expression data, we used quantile normalization in BeadStudio software before exporting the data for PARTEK Genomic Suite [7] for further statistical analyses. Principle components analysis (PCA) and sample histograms were checked as a part of quality control analyses of the data. Mixed-model multi-way ANOVA (which allows more than one ANOVA factor to be entered in each model) was used to compare the individual gene expression data across different groups. In general, "disease status" (tumor/adjacent normal) and "storage method" (FFPE/FF) were used as categorical variables with fixed effect since the levels tumor/normal or FFPE/FF represent all conditions of interest; whereas "case ID#" (as proxy of inter-person variation) and "batch#" were treated as categorical variable with random effect, since the person or batch are only a random sample of all the levels of that factor. Method of moments estimation was used to obtain estimates of variance components for mixed models [8]. As per study design we processed all four types of samples from one individual in a single chip (i.e., in same batch), therefore batch# and case ID# were linked, and thus they were not entered in the same model. In combined analysis when we included FFPE and FF samples together, the log 2 -transformed signal intensity was used as the response variable, and "tissue type " (or "disease" i.e. tumor or normal), "sample storage" (i.e. FFPE or FF) and case ID# were entered as ANOVA factors. In order to examine interaction between "tissue type" and "sample storage", we also entered the cross-product term "tissue type" × "sample storage" in the 3-way ANOVA model. When we separately analyzed the FF and FFPE samples to identify differentially expressed genes in breast tumor compared to adjacent normal tissue, disease status or "tissue type", "batch#" and "block archival age" were entered as ANOVA factors in the 3-way ANOVA model. One example of a model [9] that uses method of moments for combined analysis is as follows: Where Y ijkl represents the gene expression intensity of gene "Y" in l-th sample with i-th tissue with j-th storage for the k-th case ID; μ is the common effect for the whole experiment; and ε ijkl represents the random error, which is assumed to be normally and independently distributed with mean 0 and standard deviation δ for all measurements. Case ID is a random effect in this model.
Similarly, when we analyzed the data separately for FF and FFPE samples to detect individual genes differentially expressed in tumor, we used the following 3-way ANOVA model by using method of moments: Where Y ijkl represents the gene expression intensity of gene "Y" in l-th sample with i-th tissue for the j-th Batch no with k-th Block_Age. As in the model for combined analyses above, μ is the common effect for the whole experiment and ε ijkl represents the random error, which is assumed to be normally and independently distributed with mean 0 and standard deviation δ for all measurements. Batch no is a random effect in this model.
In GO Enrichment analysis, we tested if the genes found to be differentially expressed fall into a Gene Ontology category more often than expected by chance. We used Chi-square test to compare "number of significant genes from a given category/total number of significant genes" vs. "number of genes on chip in that category/total number of genes on the microarray chip". Negative log of the p-value for this test was used as the enrichment score. Therefore, a GO group with a high enrichment score represents the lead functional group. The enrichment scores were analyzed in a hierarchical visualization and in tabular form.
In addition to looking at differential expression at individual gene level, we also examined the differential expression of gene sets using the Gene Set Enrichment Analysis (GSEA) [10]. Given an a priori defined set of genes S (sharing the same GO category), the goal of GSEA was to determine whether the members of S were randomly distributed throughout the ranked list or primarily found at the top or bottom. Considering the fact that GSEA can look at single variable (unadjusted expression), we also used GO-ANOVA that offers adjustments for other factors like "person-to-person" variation, "tissue type" variation etc.
GO-ANOVA is a mixed model ANOVA to test the expression of a set of genes (sharing the same GO category) instead of an individual gene in different groups [7]. The analysis is performed at the gene level, but the result is expressed at the level of the GO-category by averaging the member genes' results. The equation for the model was: Model:Y T P G S(T*P) = μ + + + + + ε Where Y represents expression of a GO-category, μ is the common effect or average expression of the GOcategory, T is tissue-to-tissue (tumor/healthy) effect, P is patient-to-patient effect, G is gene-to-gene effect (differential expression of genes within the GO-category independent of tissue types), S(T*P) is sample-to-sample effect (this is a random effect, and nested in tissue and patient) and ε represents the random error.
Cross-validation: For the one-level cross validation, the data was first divided into 10 random partitions. At each iteration, 10% samples were held out for testing while the remaining 90% samples were used to fit the parameters of the model. We also used a 10 × 10 two-level nested crossvalidation [11]. In the outer cross-validation, with 10% of samples held out as test cases, the remaining 90% were used in a 10-fold cross-validation to determine the optimal predictor variables and other classifier parameters. The model that performed the best on the inner crossvalidation was applied to the held-out test samples in the outer cross-validation. Thus an inner cross-validation was performed in order to select predictor variables and optimal model parameters, and an outer cross-validation was used to produce overall accuracy estimates for the classifier. In the first step, we considered all the differentially expressed genes for inclusion in the model and then in the next step(s), for selecting the top 50 or 100 genes, expression of which could be used to differentiate the FFPE samples from the FF samples, we used 3-way ANOVA ["storage" (FFPE or FF) adjusted for "tissue type" (tumor or normal) and "person-to-person" variation]. We tested three classification methods -(a) K-Nearest Neighbor (KNN) with Euclidean distance measure and 1-neighbor, (b) nearest centroid with equal prior probability and (c) linear discriminent analysis with equal prior probability.

Statistical power
In genome-wide gene expression experiments requiring multiple testing, it is more powerful and more reasonable to control false discovery rate (FDR) or positive FDR (pFDR) [12][13][14] instead of type I error. We followed that strategy for sample size calculation. When controlling FDR, the traditional approach of estimating sample size by controlling type I error is no longer applicable [15]. In their paper, Liu P and Hwang JTG have compared calculations of sample size using four different approaches -all of which had good agreement and show that, for standardized effect size Δ/σ = 1 [e.g., for fold change of 1.4 with σ = 0.5; Δ/σ = (log 2 1.4)/0.5) = 1], if identification of 95% of the truly altered genes are desired (our set target), then the estimated sample size for each group would be between 33 and 34. For standardized effect size Δ/σ = 2 [in other words, for fold change of 2 with σ = 0.5; as Δ/σ = (log 2 2)/0.5) = 2], if identification of 95% of the truly altered genes is desired (our set target), then the estimated sample size for each group would be 11. Therefore, our study was sufficiently powered to detect a 1.4-fold change difference in the combined analysis and a 2-fold change in the subgroup analysis where FF and FFPE samples were analyzed separately.

Results
Performance of the assay, evaluated by the number of detected genes per sample, was impressive. Among the 18,391 genes on the DASL assay, on average, about 11290 genes (61.4%) were detected in each sample at p < 0.05 level. A gene was said to be detected at p < 0.05 level if the mean signal intensity from multiple probes for that gene was significantly higher (at the level of p < 0.05) than the negative control on the same chip. Table 1 shows that the number of genes detected per sample at p < 0.05 level per sample was comparable in all the four tissue types (N_FF, T_FF, N_FFPE and T_FFPE), suggesting uniform performance of DASL across FF and FFPE samples stratified by tissue type.
The average signal intensity for all the genes and for the housekeeping genes are also shown in Table 1.

Reproducibility
We ran 4 samples in duplicate (technical replicates) to check the reproducibility. A representative figure showing the correlation between log 2 signal intensities is presented in Figure 1A which shows r 2 = 0.98, suggesting acceptable reproducibility. The central line represents the regression line and the two lines on either side of the central line represent the boundary for 2-fold change.
Correlations of log 2 signal intensity between a pair of RNA samples extracted from N_FF and N_FFPE tissue (r 2 = 0.86) and between T_FF & T_FFPE tissue (r 2 = 0.75) from same subject (case ID#50527) are shown in Figure 1B and Figure 1C respectively. Both the graphs show that in normal as well as in tumor tissue, a number of genes were found to be more than 2-fold up-or down-regulated in FFPE samples compared to corresponding FF samples. Similarly, correlations between signal intensity from paired RNA samples extracted from N_FF and corresponding T_FF tissue (r 2 = 0.85) and paired RNA samples extracted from N_FFPE and T_FFPE tissue (r 2 = 0.72) from the same subject (case ID#50527) are shown in Figure 1D and Figure 1E respectively. As expected, in both the FF and FFPE samples, there were a number of genes that were more than 2-fold up-or down-regulated in the tumor tissue compared to the corresponding normal tissue.

Sources of variation in the gene expression
As part of the quality control of the data, we utilized principle component analysis (PCA) which indicated only one sample as an outlier. This sample had only 4606 genes detected at p < 0.05 level and also had a ΔCt = 12.36. We excluded this sample from further analysis. PCA on the remaining 71 breast tissue samples is shown in Figure 2A, where the samples are color-coded by of sample storage type. The figure suggests a clustering effect of sample storage type. When we color-coded the samples by tissue type (normal or tumor, i.e. "disease"), we also saw a clustering effect of tissue type, but less prominent than for the type of sample storage.
In the next step, to further investigate the source of variation in the expression, we used multivariate ANOVA. We did not use any filter for selecting the genes to be included in the analysis. In other words, ANOVA was performed on log 2 -transformed intensity value for all the 18,391 genes irrespective of their level of detection. The average F-ratio (F-statistics for the test variable/F-statistics for error) for all the genes was considered as representative of significance of signal-tonoise ratio. Figure 2B shows the significance of different sources of variation in the entire data in a model where tissue type (tumor/normal), sample storage (FFPE/FF) and person-to-person variation (case ID#) were entered as explanatory variables at a time for gene expression. The figure shows that "sample storage" (F ratio 16.02) was the most significant source of variation, followed by the "disease status" (F-ratio 7.14) and person-to-person variation (F-ratio 2.22).

Differential gene expression in FFPE samples compared to FF
In the total set of 71 breast samples, we looked at the genome-wide (n = 18,391) differential gene expression in FFPE samples compared to FF samples after adjustment for tissue type (tumor/normal), and inter-person variation. There were a total of 9186 differentially expressed genes (49.94% of 18391) at FDR 0.05. Concerning fold change, a total of 1864 genes had > 2-fold change in either direction (1098 up-and 766 downregulated). Combining these two criteria, there were a total of 1863 genes (10.12% of total 18,391 genes) that were differentially expressed by at least 2-fold at FDR 0.05 level (766 up-and 1097-down-regulated in FFPE). For the statistically most significantly (p = 2.17 × 10 -28 ) differentially expressed gene, GIGYF1, 83.6% of the variation was due to "storage". Taking all of these 1,863 differentially expressed genes in FFPE samples into account, we found that overall 30.32% of the variation in individual gene expression was contributed by "storage" (FFPE or FF), 4.94% by tissue type (tumor or normal), 25.43% by "person-to-person" variation and the rest 39.31% of the variation could not be explained by the ANOVA model. It may be noted that among these 1,863 genes, a total of 138 genes (7.4%) were also differentially expressed (at least two-fold change) in tumor tissue compared to healthy tissue at FDR 0.05 in our data set. In the Breast Cancer Gene Database (BCGD) [16], there was a total of 62 genes of which 51 were studied in our gene-chip. We also looked if the expression of any of those 51 breast cancer related genes in BCGD list was also found to be affected by FFPE in our setting. In fact five of them (APC, CDKN2A, IGF1R, TGFA and TSG101) were in our list. We also separately tested the normal and tumor tissue data to see the effect of FFPE preservation. In the analysis of normal tissue only, a total of 2820 genes in FFPE samples were differentially expressed (1712 down-and 1108 up-regulated) at least by 2-fold at FDR 0.05 level, compared to corresponding FF samples. Cluster analysis based on these genes could effectively separate the FFPE samples from the FF samples (see Figure 3A). In the analysis of tumor tissue only, a total of 1159 genes in FFPE samples were differentially expressed (555 down-and 604 up-regulated) at least by 2-fold at FDR 0.05 level, compared to corresponding FF samples. Cluster analysis based on these genes also could effectively separate the tumor FFPE samples from the tumor FF samples (figure not shown). The Venn diagram in Figure 3B shows the overlap among these gene lists obtained from (a) combined analysis, (b) normal tissue alone and (c) tumor tissue alone to look for differentially expressed genes in FFPE tissue compared to corresponding FF tissue.

Cross-validation
A 10 × 10 two-level nested cross-validation analysis using all the 1,863 differentially expressed genes in FFPE in present study suggested that the expression of these genes could differentiate FFPE samples from FF samples with overall accuracy of 95.7%. The same 10 × 10 twolevel nested cross-validation analysis suggested that the expression of the top 100 genes could be used in a  Figure 1D & 1E show correlations between corresponding tumor (y-axis) and normal (x-axis) tissue from same patient using FF and FFPE samples respectively.
model that could differentiate FFPE samples from FF samples with overall accuracy of 100%. We also tested the top 50 genes, and again the overall accuracy was 100%. This further supported our ANOVA finding that a number of genes are differentially expressed in FFPE samples compared to their corresponding FF sample.
Differential expression profile at "gene set" level in FFPE samples compared to corresponding FF In addition to individual gene level analysis, we also performed GSEA to find out "gene sets" that are up-or down-regulated in FFPE. The top five "gene sets" (each representing a GO-category) that were down-regulated in FFPE include: "nucleotide excision repair, DNA gap filling" (normalized enrichment score-NES 2.11), "negative regulation of epithelial cell proliferation" (NES 1.98), "chaperone binding", "DNA polymerase activity" and "DNAdirected DNA polymerase activity". The up-regulated GO-category includes: "positive regulation of gene-specific transcription" (NES -1.95), "regulation of Rho protein signal transduction", "Rho guanyl-nucleotide exchange factor activity", "dopamine receptor signaling pathway", and "thymic T cell selection". GO-ANOVA also confirmed these findings. Figure 4 shows one of the examples: the genes in GO-category "negative regulation of epithelial cell proliferation" (p = 6.65 × 10 -14 ) were overall down-regulated in FFPE.
Considering the effect of FFPE on the gene expression, we then looked at the differential gene expression in tumor tissue compared to adjacent normal tissue in two ways -(a) looking at the combined data (FF and FFPE) with adjustment for "sample storage" and "person to person variation" and (b) looking at the FF and FFPE data separately with paired comparisons to detect differential expression in tumor and to see if similar conclusion(s) could be made.

Differential gene expression in breast tumor tissue compared to corresponding adjacent normal breast tissue
Among the 18 individuals included in the analyses of breast tissues, one had fibroadenoma, 3 had no abnormality detected and in one the histopathology was unknown. Therefore, to get a clearer picture of genes differentially expressed in breast cancer, we included 52 samples from 13 patients with known histopathological diagnosis of breast cancer (4 samples from each patient). Of these Statistical significance of the different "sources of variation" in the total gene expression estimated by 3-way ANOVA models. F ratio for each factor (source) represents the F-statistics for that factor/F-statistics for error (noise). The mean of the F Ratio for all the genes for a particular factor are shown on Y-axis and the different factors (sources of variation) are shown in x-axis. 52 samples, one FF tumor sample was excluded due to poor performance on the chip (an outlier on PCA, as mentioned above). The analysis was restricted to the 51 arrays. In addition to PCA, we also used unsupervised hierchical clustering based on all 18,391 genes which showed clustering by sample storage (FF/FFPE) as well as by histopathology [figure not shown].
In this total set of 51 samples, we looked at the genome-wide (n = 18391) differential gene expression in breast tumor tissue compared to adjacent normal tissue after adjustment for "sample storage" (FF/FFPE), and "person to person variation" (case ID#). There were a total of 1319 genes (7.17% of total 18,391 genes; 604 were up-regulated and the rest 715 were down regulated in tumor) that were differentially expressed at least by 2-fold at FDR 0.05 level.
In the next step, we analyzed the data separately for FF and FFPE samples to evaluate the significance of "batch effect" and "sample block age" as sources of variation in the overall gene expression data. The F-ratio for these factors in these two types of sample storage methods are shown in Figure 5A and Figure 5B, respectively. The data shows that both the "batch effect" and "block age" contributed more significantly as the source of variation in overall expression data in the FFPE samples. However, by study design we assayed the paired samples on the same chip to minimize batch-effect. In paired analysis of the FF samples (tumor & adjacent normal), a total of 1275 genes were differentially expressed in breast tumor tissue by at least 2-fold at FDR 0.05 level (503 up-and 772 down-regulated in tumor). On the other hand, using the same criteria in FFPE samples, a total of 966 genes were differentially expressed in breast tumor (346 up-and 620 downregulated in tumor). Figure 5C shows the overlap of differentially expressed genes in the FF and FFPE samples and combined analysis. Only 315 genes were picked in both FF and FFPE samples, and all these 315 were also picked-up in the combined analysis. In other words, only one-third of the How do these gene lists compare to breast cancer or multiple cancer signatures from prior studies Ein-dor et al [17] correctly described earlier that several microarray studies yielded gene sets whose expression profile successfully predicted survival, but the overlap between these gene sets was very low. We compared our three gene lists (FF, FFPE and from combined analysis) with some of the published data related to prognosis/survival [16,[18][19][20][21] in breast and/or other cancer and two commercially available gene sets for real-time PCR assays from SA BioScience http://www.sabiosciences.com/ ArrayList.php. It was not surprising to see different degree of overlap [see Table 2]. However it was interesting to see the predictive accuracy of those different gene lists from different studies or sources in our data set. For example, although only 4 out of 51 genes in BCGD list [16] was common with our 1319 genes, expression pattern of those 51 genes could correctly identify 39 out of 51 samples included in our study (predictive accuracy 76.53%). On the other hand, 10 out of 21 genes of the Martin KJ et al [18] list (almost 50%) overlapped with our list from combined analysis; however prediction based on expression of those 21 genes in our data set was just slightly higher at 80.23%.

GO Enrichment Analyses of the lists of differentially expressed genes in breast cancer tissue compared to normal breast tissue in FF and FFPE samples
The GO-Enrichment analysis tests if the genes found to be differentially expressed (at individual gene level) in a given list fall into a Gene Ontology category more often than expected by chance. Enrichment score was calculated as the negative log of the p-value of the test statistics. Hence, higher enrichment scores represent lead functional GO-groups. The functional groupings are sometimes easier to understand than pathways. The scores are used to rank the functional groupings. In GO, the genes are categorized into different GO-terms in three ways: (a) by "biological process", (b) by "molecular function" and (c) by "cellular components". First, we used GO-Enrichment analysis for the two lists of upregulated genes in breast tumor derived from analysis of FF and FFPE samples. Figure 6(i) and Figure 6(ii) show hierarchical clustering of the enrichment scores for the "cellular process" category under the broad term "Biological process" for FF and FFPE samples respectively. It may be noted that in both the lists derived from FF and FFPE samples, the GO-term "cell division" and "cell cycle" were the top-ranking lead groups. This is very much biologically relevant with respect to any carcinogenesis in general. In fact, enrichment scores of "cell division" and "cell cycle" related genes constituted 56.87% of the total score for the "cellular process" for the up-regulated genes found in FF samples and 63.2% of the up-regulated genes found in FFPE sample. Similarly, we looked at the GO-Enrichment analysis for the down-regulated gene lists in tumor derived from FF and FFPE samples (see Figure 7(i) and Figure 7(ii) respectively for "biological process" categorization). The topranking functional lead groups show similarity between the gene lists derived from FF and FFPE samples. Similarly, under the broad category "Cellular components", the GO-term "extracellular region" comes up in both FF as well as FFPE samples (results not shown), emphasizing the significance of alteration in the extracellular matrix in breast carcinogenesis.
Therefore our findings suggest that the analysis of FFPE samples does not identify the exact same genes that would have been identified by analyzing FF samples, but at least, the list shows some similarity in terms of enrichment of GO-terms representing the lead functional groups of genes. In other words, FFPE samples may not be ideal for picking individual target gene(s), but may be used to identify the lead functional group(s) of genes that are differentially expressed in tumor.
For the up-regulated genes in breast tumor, that we found by analyzing the FF samples (and also in FFPE samples), if we looked at the enrichment of GO-terms under all the three major groups together -"biological process", "molecular function" and "cellular components", the top most ranking GO-term was "nucleosome" under "cellular components". All the up-regulated genes under this GO-term are from the same family of genes that are related to HIST1 and HIST2 proteins.
For the down-regulated genes in breast tumor, that we found by analyzing the FF samples, if we look at the enrichment of GO-terms under all the three major groups -"biological process", "molecular function" and "cellular components", the top ranking GO-term was "regulation of epithelial cell proliferation" under the "biological process". This finding is also very much relevant in the context of breast cancer pathogenesis. These down-regulated genes include FGF2, FGF9, APC, IGF1, CDKN2B, NOTCH1, LAMC1, LAMB1, NKX3-1, TBX18, and GAS1. We performed GO-Enrichment analysis of the 315 genes common in all the three ANOVA analyses -(a) FF samples, (b) FFPE samples and in (c) combined analysis. Subsequent hierarchical clustering suggested that a number of those genes (including PPARG, FGF2, APOB, CRHBP, CETP, and RXRG) were significantly associated with breast cancer that appeared repeatedly in different GO-terms. For example, PPARG was significant in the GO-terms "anatomical structure development", "organ development", "developmental maturation", "lipid metabolism", "regulation of biological quality", "transcription factor activity", "molecular transducer activity", "drug   binding", and "lipid binding". The dot plot in Figure 8 shows the expression of PPARG and FGF2 in paired samples (breast tumor and adjacent normal tissue; corresponding samples from same patient are connected by line) preserved as FF as well as FFPE sections. PPARG is shown to be down-regulated in breast cancer in all the patients and FGF2 is shown to be down-regulated in breast cancer in all but one patient in both FF and FFPE sections.
Interaction between "tissue type" and "sample storage" In mixed model ANOVA, we also entered the interaction term "tissue type (tumor/normal) × sample storage (FFPE/ FF)" to identify the genes that show a different degree, or even direction, in differential expression in tumor compared to normal tissue for FF or FFPE samples. There were a total of 772 genes that showed significant interaction at FDR 0.05; of which 178 also had differential expression in tumor at FDR 0.01 with at least 1.4-fold change (see Table 3). The present study suggests that differential expression of these genes in tumor should be cautiously interpreted if FFPE samples are studied. Figure  9(i) shows differential expression of a gene (LPL) without any significant interaction. Expression of the LPL gene was down-regulated in tumor tissue compared to adjacent normal tissue in an identical fashion in both FF as well as FFPE samples. On the other hand, Figure 9(ii-iv) shows the examples of the genes which show significant interaction, i.e. degree of differential expression in tumor (compared to normal), is significantly influenced by the storage. For example, if we look at the expression of ZNF800 gene [ Figure 9(iv)] in FF tissue, we would notice significant down-regulation in tumor, but if we look at FFPE samples, there was no differential expression in tumor.      Figure 9 Examples of interaction between "tissue type" and "sample storage". Error bar plots for least square means for differential gene expression data for (i) LPL (ii) APRT, (iii) C12 orf 139 and (iv) ZNF 800 genes. LPL shows the example with no interaction between "tissue type" and "sample storage", where as the other three genes show the examples of significant interaction -i.e., the differential expression in tumor tissue is significantly affected by the storage type. In each plot, breast tumor tissue is shown as blue square and adjacent normal breast tissue as red triangle; data from FF samples are presented on the left side and that of FFPE samples on the right side. The error bars represent standard error of mean. Expression of known "clinically relevant" genes in our data From a therapeutic point of view, determination of HER2, estrogen receptor (ESR) and progesterone receptor (PGR) gene expressions is very helpful in decisionmaking for therapeutic regimens [22]. We therefore looked at these genes in particular in our data set (data not shown). Expression of the ESR1 gene (ESR1) was significantly affected by FFPE storage. Expression of the ESR2 and PGR genes were very minimally affected by storage, and that of HER2 was not affected at all. We found HER2 to be up-regulated by 1.5 fold (p = 0.0054) and PGR to be down-regulated by 1.9 fold (p = 0.006) in tumor compared to adjacent normal tissue.
Do we see same "gene sets" to be differentially expressed in breast tumor tissue compared to corresponding adjacent normal breast tissue in FF and FFPE samples?
After looking into the differential expression at individual gene level, we also looked for differential expression of different "gene sets" in breast cancer tissue by using the Gene Set Enrichment Analysis (GSEA) as well as GO-ANOVA. GSEA of the FF samples showed that a total of 100 "gene sets" (each representing different GO description terms) were differentially expressed (p-value = < 0.05) in breast cancer tissue compared to adjacent healthy tissue. On the other hand, GSEA of the FFPE samples showed a total of 440 "gene sets" were differentially expressed at p-value = < 0.05 in breast cancer FFPE tissue compared to adjacent healthy FFPE tissue. Only 38 "gene sets" (i.e. 38% "gene sets" from of the FF analysis and only 8.64% from FFPE analysis) were common in both the analyses. Therefore, GSEA also suggested the need for caution for interpretation of gene expression data from FFPE samples. GO-ANOVA of the FF samples showed that a total of 883 "gene sets" (each representing different GO description terms) were differentially expressed at FDR 0.05 in breast cancer tissue compared to adjacent healthy tissue. On the other hand, GO-ANOVA of the FFPE samples showed that 1,034 "gene sets" were differentially expressed at FDR = < 0.05 in breast cancer FFPE tissue compared to adjacent healthy FFPE tissue. A total of 641 "gene sets" (i.e. 72.59% gene sets from FF and 61.99% from FFPE analysis) were common in both the analyses. This result also suggested the need for caution for interpretation of gene expression data from FFPE samples.
The overlaps of these results from GSEA and GO-ANOVA in FF and FFPE samples are presented in Figure 10. In general, GO-ANOVA could detect larger number of differentially expressed gene sets (GOdescription terms), however it may be noted that most of the gene sets detected by GSEA were also detected by GO-ANOVA -81.0% in FF and 88.4% in FFPE samples. There was a total of 26 "gene sets" that was common to all four analyses. Those 26 gene sets are shown in Table 4. It may be noted that genes related to "transforming growth factor beta receptor signaling pathway" were down-regulated (positive NES in this case) and genes related to "nucleosome" were up-regulated (negative NES in this case) in breast cancer. It may be noted that GO-Enrichment analysis of the differentially expressed gene lists also showed highest enrichment score for genes related to "nucleosome".

Discussion
Gene expression profiling of human cancer has proved valuable in cancer research leading not only to the identification of targets but also contributing to our understanding of the mechanisms of the process [23,24]. The application of microarrays is limited by the availability of fresh frozen tissue or the tissue preserved in RNAlater reagent. As FFPE samples are available in almost all the pathological laboratories and are often available in conjunction with clinical and follow-up data, they would be considered as the most valuable sources for microarray analysis [25], provided similar information can be obtained as would be expected from analyzing the FF samples. Because of fragmentation [3,26,27] and some other chemical modifications of RNA in FFPE samples, currently gene expression studies are largely limited to immuno-histochemical (IHC) staining and RT-PCR, which allow only a few genes to be amplified at a time [3,28,29]. In this paper we mainly focus on the use of FFPE samples in genome-wide gene expression experiments. We have tried to analyze the data for differential expression from different angles -at individual gene level, at "gene set" level and also used different statistical methods. In follow-up paper we would focus more on gene selection and relevance to breast cancer biology.
Like other investigators [5,28,30,31], we also observed high reproducibility across technical replicates regardless of the sample type. However, the concordance between the paired FF and FFPE samples was weaker in our study, which is also consistent with other studies [23,32]. The tissue archival age, or the "FFPE block age" is another factor for consideration. Cronin et al. compared frozen section breast tissue with FFPE samples of "various block ages" by RT PCR for 92 genes and found a 90% signal loss in FFPE samples [26]. Our data on the genome-wide level also suggested the significance of FFPE "block age" on gene expression data. Srinivasan et al. [2], and Karsten et al [33] and Masuda et. al [1] have reviewed in detail the effect of fixative and tissue processing on the content and integrity of nucleic acids. There are four types of reactions of formaldehyde with nucleic acids: (1) Addition reaction or methylolation -the N-H groups of primarily adenine and thiamine are converted to N-CH2-OH groups (methylol groups). The poly(A) tail on RNA is thus heavily methylolated leading to poor reverse transcription. This methylolation is possibly reversible through hydrolysis. (2) Cross-linking -methylolated bases can react with N-H groups (on proteins and nucleic acids) to form -N-CH 2 _N-cross-links, which are not easily hydrolyzed. (3) Formation of apyrimidinic/apurinic sites -the N-glycosidic bond between A, C, G, T or U and the sugar backbone is broken, leaving a blank space in the sequence. This is not base-specific, but is not reversible. (4) Fragmentation -Formaldehyde catalyzes the hydrolysis of phosphodiester bonds, fragmenting strands of nucleic acids, which is also not reversible. Therefore, it was not a surprise to find differential gene expression in FFPE samples compared to FF samples in the present study.
Bibikova et al. used a smaller panel of the DASL assay with 16 pairs of FF and FFPE samples from healthy and tumor breast tissue and healthy and colon cancer tissue [6]. They found that FFPE samples had 50% less gene expression compared to matched FF samples, which may be due to RNA degradation related to fixation and storage [6]. The present study is one of the first few adequately powered, whole-genome DASL assays interrogating more than 18,000 genes that has compared the results of paired tumor and normal tissue from FF and FFPE samples.
Our findings suggest that the analysis of FFPE samples does not identify the exact same genes that would have been identified by analyzing FF samples, but at least, the list shows some similarity in terms of enrichment of GO-terms representing the lead functional groups of genes. In other words, FFPE samples may not be ideal for picking individual target gene(s), but may be used to identify the lead functional group(s) of genes that are differentially expressed in tumor. Findings of the differentially expressed genes in breast cancer were biologically meaningful. On the one hand the "cell cycle" & "cell division" related genes were up-regulated and on the other hand, genes related to "regulation of epithelial cell proliferation" were down-regulated. Genes involved in metalloexopeptidase activity, transforming growth factor beta signaling pathway, BMP signaling pathway, were found to be down-regulated in breast cancer.
As mentioned in the results section, some of the genes (including PPARG and FGF2) were found repeatedly in the lists of different GO-terms. Peroxisome proliferatoractivated receptor-γ (PPARG), is expressed in a large number of human cancers, including breast, colon, stomach, prostate, pancreas, bladder, placenta, lung, chondrosarcoma and in leukemia [34,35]. Recently Jiang et al. showed that PPARG expression in immunohistochemistry was positively correlated to estrogen receptor status, inversely associated with histological grade and tumor size, and in survival analysis patients with higher PPARG expression had significantly better prognosis [36]. In the same direction, the present study showed evidence of down-regulation of PPARG in breast cancer tissue. GWAS using germ line DNA showed a significant association of SNP in the FGFR2 gene with breast cancer [37]. In the same line, in the present study, we also found FGF2 to be down-regulated in breast cancer. Another biologically relevant gene that we found to be differentially expressed in breast cancer was corticotropin releasing hormone binding protein (CRHBP). Corticotropin-releasing hormone is a potent stimulator of synthesis and secretion of preopiomelanocortin-derived peptides. Although CRH concentrations in the human peripheral circulation are normally low, they increase throughout pregnancy and fall rapidly after parturition. Maternal plasma CRH probably originates from the placenta. Human plasma contains a CRH-binding protein that inactivates CRH and may prevent inappropriate pituitary-adrenal stimulation in pregnancy.
An apparent weakness of the present study is the lack of gene expression data from breast tissue preserved in RNA stabilization buffer, which would have served as the gold standard against which FFPE samples could have been compared. However, 72% of the genes that we found to be differentially expressed in FFPE breast tissue compared to corresponding FF in present study were also differentially expressed in FFPE skin tissue compared to the "gold standard"-RNAlater preserved corresponding skin tissue sample (unpublished data from our group). One of the strengths of the present study is that it was adequately powered to detect differential expression arising from tissue storage (FFPE vs. FF) or disease status (tumor vs. adjacent normal tissue).

Conclusion
In agreement with other studies using the DASL platform, our present study also suggests the usefulness of DASL chemistry to study gene expression in fragmented RNA samples. DASL can efficiently handle the fragmentation issue of RNA in FFPE samples. However, formalin fixation used in FFPE induces significant gene expression change in a number of genes, and these changes may differ in degree or even in direction between tumor and normal tissue. Therefore, FFPE samples should not be directly compared with FF samples and considerable caution must be taken when interpreting gene expression data from FFPE samples. Despite these constraints, we found a number of biologically meaningful, differentially expressed genes related to HIST1, HIST2 proteins, and some other such as PPARG, FGF2, APOB, CRHBP, CETP, and RXRG in breast cancer tissue compared to corresponding adjacent normal breast tissue. The validity of these specific observations, however, needs to be confirmed in future larger studies.