Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen
- Sherene Loi†1, 2,
- Benjamin Haibe-Kains†1, 3,
- Christine Desmedt1,
- Pratyaksha Wirapati4,
- Françoise Lallemand1,
- Andrew M Tutt5,
- Cheryl Gillet5,
- Paul Ellis5,
- Kenneth Ryder5,
- James F Reid6, 8,
- Maria G Daidone8,
- Marco A Pierotti6, 8,
- Els MJJ Berns7,
- Maurice PHM Jansen7,
- John A Foekens7,
- Mauro Delorenzi4,
- Gianluca Bontempi3,
- Martine J Piccart1 and
- Christos Sotiriou1
© Loi et al; licensee BioMed Central Ltd. 2008
Received: 28 January 2008
Accepted: 22 May 2008
Published: 22 May 2008
Estrogen receptor positive (ER+) breast cancers (BC) are heterogeneous with regard to their clinical behavior and response to therapies. The ER is currently the best predictor of response to the anti-estrogen agent tamoxifen, yet up to 30–40% of ER+BC will relapse despite tamoxifen treatment. New prognostic biomarkers and further biological understanding of tamoxifen resistance are required. We used gene expression profiling to develop an outcome-based predictor using a training set of 255 ER+ BC samples from women treated with adjuvant tamoxifen monotherapy. We used clusters of highly correlated genes to develop our predictor to facilitate both signature stability and biological interpretation. Independent validation was performed using 362 tamoxifen-treated ER+ BC samples obtained from multiple institutions and treated with tamoxifen only in the adjuvant and metastatic settings.
We developed a gene classifier consisting of 181 genes belonging to 13 biological clusters. In the independent set of adjuvantly-treated samples, it was able to define two distinct prognostic groups (HR 2.01 95%CI: 1.29–3.13; p = 0.002). Six of the 13 gene clusters represented pathways involved in cell cycle and proliferation. In 112 metastatic breast cancer patients treated with tamoxifen, one of the classifier components suggesting a cellular inflammatory mechanism was significantly predictive of response.
We have developed a gene classifier that can predict clinical outcome in tamoxifen-treated ER+ BC patients. Whilst our study emphasizes the important role of proliferation genes in prognosis, our approach proposes other genes and pathways that may elucidate further mechanisms that influence clinical outcome and prediction of response to tamoxifen.
Breast cancers are biologically heterogeneous with regard to their clinical behavior and response to therapies. However, treatment decision-making for women diagnosed with breast cancer still relies on classical histopathological appearance and immunohistochemical markers that give little insight into tumor biology and potential response to treatment. Few routinely used biomarkers can predict response to commonly prescribed therapies. The presence of estrogen receptors is the best indicator of response to anti-estrogen agents such as tamoxifen. However, 30–40% of women with estrogen receptor-positive breast cancer (ER+ BC) will develop distant metastases and die despite tamoxifen treatment. The underlying biological mechanisms of resistance to tamoxifen are incompletely understood.
Gene expression profiling of tumors appears to be a promising new strategy for predicting clinical outcome in breast cancer patients. Recent studies have proposed that the heterogeneity of clinical response can be correlated with different molecular "portraits" [1, 2]. Gene signatures have been developed that can distinguish subgroups of patients with different prognoses or response to chemotoxic and antiestrogen agents. However, issues have emerged since these initial studies relating to design and validation of gene classifiers, particularly the small numbers of patient samples used to derive the classifier and the little overlap in these gene signatures. Furthermore, it has been shown that membership in a prognostic gene list is not necessarily indicative of a gene's importance in cancer pathology. Extracting biological meaning from whole genome molecular profiling remains a significant challenge.
We have recently shown that, in ER+ BC, proliferative status is the most important predictor of prognosis: highly proliferative tumors have a worse clinical outcome, either with or without systemic treatment. However, proliferation is a downstream consequence, and understanding the upstream activators is essential for advancing biological knowledge and developing targeted approaches that may be tested in the clinical setting, potentially in combination with anti-estrogen agents. In this study, we hypothesized that developing a gene classifier using clusters of correlated genes as single variables may both allow prediction of clinical outcome in tamoxifen-treated patients and facilitate new biological understanding of resistance mechanisms, as these clusters could represent biological networks or pathways. Furthermore, we assessed the performance of our classifier on several independent data sets of tamoxifen-treated samples, in both the adjuvant and advanced settings. These were obtained from a number of institutions, and samples had been hybridized on varying microarray platforms.
Tamoxifen-treated dataset used in development of the classifier
The dataset used for training the classifier consisted of 255 early-stage (stage I, II) BC samples from patients diagnosed between 1980 and 1995, all of whom had received tamoxifen only as their adjuvant treatment (hereafter referred to as the "tamoxifen-treated dataset"). The demographics can be found in [Additional file 1], and data processing methods are described in Loi et al., as a large proportion of this dataset has been previously used in another research study. The raw data for the tamoxifen-treated dataset are available at the GEO database (accession number GSE6532). This dataset contained samples from the John Radcliffe Hospital (OXFT), Oxford, United Kingdom, Guy's Hospital (GUYT), London, United Kingdom and Uppsala University Hospital (KIT), Uppsala, Sweden. All samples had been hybridized using Affymetrix U133 Genechips™ (HG-U133A, B for OXFT and KIT, and PLUS2 for GUYT). All samples were required to be estrogen receptor (ER) and/or progesterone receptor (PR) positive by ligand-binding assay, and all patients had been prescribed tamoxifen monotherapy for 5 years post diagnosis as adjuvant therapy. The cut-off value for classification of patients as positive or negative for ER and PR was 10 fmol per mg protein. The primary endpoint used for generating the classifier was the first distant metastatic event (distant metastasis free survival, DMFS), as survival can be confounded by local recurrence and treatments given at relapse. Each hospital's institutional ethics board approved the use of the tissue material for the purposes of this research study.
We used a clustering method to identify clusters of highly correlated genes prior to feature selection and model building, as we hypothesized that this would reduce the number of variables, increase signature stability, allow platform independence and preserve biological interpretation. Preliminary clustering was performed on a separate dataset consisting of 137 samples from untreated women with early stage breast cancer (data available from GEO database, accession number GSE6532). These samples were not used in the signature development to avoid any possible overfitting when performing the cluster identification. Control probe sets and those absent in at least 95% of the samples were removed. The data set was then filtered based on overall variance, with the top 20% of probe sets selected for further clustering. Hierarchical clustering with Pearson correlation similarity metric and complete linkage was used. The generated dendrogram was then cut at a height of 0.5. Clusters were discarded if they contained fewer than 5 known genes (as per Unigene). After this procedure, a total of 110 clusters were obtained for signature development [Additional file 2a]. Of note, these clusters could be reproduced in the tamoxifen-treated population (data not shown). The cluster centroid, i.e. the average expression level of all the probes per cluster, was then obtained for each cluster in the tamoxifen-treated dataset. Each cluster was subsequently treated as a single variable called a "probe cluster" (pclust).
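The clustering steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the distance is taken to be 1 minus the Pearson correlation (the text does not say how similarity was converted to dendrogram height), cluster size is counted in probe sets rather than Unigene genes, and the function names `probe_clusters` and `cluster_centroids` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist


def probe_clusters(expr, var_quantile=0.8, cut_height=0.5, min_size=5):
    """Cluster probe sets (rows of expr) by correlation.

    expr: array of shape (n_probes, n_samples).
    Returns a dict mapping cluster label -> row indices of member probes.
    """
    # Keep the most variable probe sets (top 20% when var_quantile=0.8).
    variances = expr.var(axis=1)
    keep = np.where(variances >= np.quantile(variances, var_quantile))[0]
    # Assumption: distance = 1 - Pearson correlation between probe profiles.
    dist = pdist(expr[keep], metric="correlation")
    tree = linkage(dist, method="complete")
    # Cut the dendrogram at the chosen height.
    labels = fcluster(tree, t=cut_height, criterion="distance")
    clusters = {}
    for lab in np.unique(labels):
        members = keep[labels == lab]
        # Assumption: size filter applied to probe sets, not Unigene genes.
        if len(members) >= min_size:
            clusters[int(lab)] = members
    return clusters


def cluster_centroids(expr, clusters):
    """Average expression of all member probes per cluster (one 'pclust' each)."""
    return {cid: expr[idx].mean(axis=0) for cid, idx in clusters.items()}
```

In this sketch the clusters would be identified on one expression matrix (the untreated samples) and `cluster_centroids` would then be called on another matrix with the same probe ordering (the tamoxifen-treated samples).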
Although the preliminary clustering significantly reduced the dimensionality of the data, the number of features remained too large to efficiently build the classifier. The most relevant pclusts were selected using a ranking based on the likelihood ratio statistic of a univariate Cox model. The Cox model specifies the hazard of a patient i as
$$\lambda_i(t) = \lambda_0(t)\,\exp(\beta\,\mathrm{pclust}_j), \qquad (1)$$
where $\lambda_0(t)$ is the baseline hazard, assumed to be equal for all patients (proportional hazards), and $\mathrm{pclust}_j$ is the j th pclust of patient i. The likelihood ratio statistic is twice the difference in the log partial likelihood between the model with the estimate of β and the null model (β equal to 0). The only parameter of this feature selection is the signature size, i.e. the number k of pclusts that will be used to build the classification model.
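As a rough illustration of this ranking step, the sketch below fits a univariate Cox model by maximizing the partial log-likelihood and computes the likelihood ratio statistic for each pclust. It is a minimal sketch, not the authors' implementation: ties are handled with the Breslow form, β is searched within an arbitrary interval of [-10, 10], and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def cox_partial_loglik(beta, x, time, event):
    """Cox partial log-likelihood for one covariate (Breslow handling of ties)."""
    eta = beta * x
    ll = 0.0
    for i in np.where(event)[0]:
        at_risk = time >= time[i]  # patients still under observation at time[i]
        ll += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return ll


def likelihood_ratio(x, time, event):
    """Twice the gain in partial log-likelihood over the null model (beta = 0)."""
    fit = minimize_scalar(
        lambda b: -cox_partial_loglik(b, x, time, event),
        bounds=(-10.0, 10.0),  # assumption: search interval for beta
        method="bounded",
    )
    fitted = cox_partial_loglik(fit.x, x, time, event)
    null = cox_partial_loglik(0.0, x, time, event)
    return 2.0 * (fitted - null)


def rank_pclusts(pclusts, time, event, k):
    """Column indices of the k pclusts with the largest likelihood ratio statistic."""
    lr = np.array([likelihood_ratio(col, time, event) for col in pclusts.T])
    return np.argsort(lr)[::-1][:k]
```

Here `pclusts` is a patients-by-clusters matrix of centroids, `time` the follow-up time and `event` a 0/1 indicator of distant metastasis.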
Signature size and stability
The Stab statistic is equal to 1 if the same signature is always selected over the multiple 10-fold cross-validation (M10FOLDCV) for a given signature size, and to 0 if there is no overlap. It must be noted that the Stab statistic converges to 1 as the signature size converges to the total number of variables. Therefore, k was chosen as a trade-off between signature size and stability, i.e. a signature size exhibiting the maximal possible stability while remaining smaller than the total number of variables.
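To make the idea of signature stability concrete, the sketch below computes a simple overlap-based measure across resamplings. It is a stand-in chosen for illustration and is not claimed to be the Stab statistic used in this study: it averages, over all pairs of resamplings, the fraction of features shared by the two selected signatures. The function name `overlap_stability` is hypothetical.

```python
from itertools import combinations


def overlap_stability(signatures):
    """Mean pairwise overlap between signatures of equal size k.

    signatures: list of feature lists, one per cross-validation resampling.
    Returns 1.0 if every resampling selects the same features and 0.0 if
    no two resamplings share a feature.
    """
    k = len(signatures[0])
    pairs = list(combinations(signatures, 2))
    shared = [len(set(a) & set(b)) / k for a, b in pairs]
    return sum(shared) / len(pairs)
```

Like the statistic described above, this measure necessarily reaches 1 when k equals the total number of variables, which is why stability has to be weighed against signature size.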
As multivariate survival models using microarray data are prone to overfitting, we built the model by combining the univariate Cox models computed during feature selection. Each univariate model is defined as $\beta_j\,\mathrm{pclust}_j$, also referred to as a risk score in the literature. We used the sum rule, as this method outperforms more complex combination schemes. We set all the weights to 1 and computed the combined risk score as the sum of the k selected univariate models, $\sum_{j=1}^{k} \beta_j\,\mathrm{pclust}_j$.
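The sum rule with unit weights amounts to a single matrix-vector product. The sketch below assumes a patients-by-clusters matrix of selected pclust centroids and a vector of the corresponding univariate Cox coefficients; the function name is hypothetical.

```python
import numpy as np


def combined_risk_score(pclusts, betas):
    """Sum of the univariate risk scores beta_j * pclust_j for each patient.

    pclusts: array of shape (n_patients, k) with the k selected centroids.
    betas:   array of shape (k,) with the univariate Cox coefficients.
    """
    pclusts = np.asarray(pclusts, dtype=float)
    betas = np.asarray(betas, dtype=float)
    return pclusts @ betas  # all combination weights equal to 1
```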
To avoid over-optimistic estimation of prediction accuracy, leave-one-out cross-validation (LOOCV) and M10FOLDCV procedures were used. As LOOCV does not depend on the order of patients in the dataset, these results will be discussed in detail.
Independent validation data sets
Four independent validation sets were used to assess the performance of the classifier. These demographics are shown in [Additional file 3].
Guy's hospital dataset (GUYT2)
This external validation set was kindly provided by Guy's Hospital, London, United Kingdom, and consisted of 77 patients diagnosed with early stage breast cancer and treated with adjuvant tamoxifen monotherapy. Samples were hybridized using Affymetrix U133PLUS2 Genechips™ according to standard Affymetrix protocols. Gene expression values from the CEL files were normalized using the standard quantile normalization method in RMA and are available from the GEO database, accession number GSE9195.
Dataset of Ma et al. (Ma)
This dataset consisted of 60 patients who were diagnosed at the Massachusetts General Hospital, Boston, United States of America, and treated with adjuvant tamoxifen monotherapy. The samples were hybridized on the Agilent microarray platform and have been previously described. The raw data were obtained from the GEO database (accession number GSE1378).
Dataset of Reid et al. (Reid)
This external validation set was kindly provided by the Department of Experimental Oncology, Istituto Nazionale per lo Studio e la Cura dei Tumori, Milan, Italy, and consisted of 113 patients who had received adjuvant tamoxifen monotherapy. Samples were hybridized on their local cDNA microarray platform. Part of this dataset has previously been published. We were unaware of the clinical data at all times and the survival analyses were performed in Milan.
Dataset of Jansen et al. (Jansen)
This dataset consisted of 112 patients who were diagnosed at the Erasmus MC, Rotterdam, Netherlands and treated with tamoxifen in the metastatic setting as first line hormonal therapy. Tumour response, as previously described, included both complete and partial responses and progressive disease. Samples were hybridized on 18 K human cDNA arrays manufactured at the Central Microarray Facility at the Netherlands Cancer Institute, Rotterdam, Netherlands. The raw data were kindly supplied by our Rotterdam colleagues.
Mapping across microarray platforms was done using the "Cleanex" database to retrieve corresponding gene symbols and Affymetrix probe sets.
Although the risk score can be used as a continuous variable, we divided the dataset into two prognostic groups to generate a high or low risk status, as this allowed us to estimate hazard ratios and produce Kaplan-Meier curves. For the purposes of this study, a binary classification was generated using a 70:30 cutoff; that is, 70% of samples would be considered low risk (hence, the majority of these patients would still be suitable for tamoxifen) and 30% at high risk of relapse on adjuvant tamoxifen monotherapy. This cutoff was an arbitrary figure chosen by the authors to balance the cost of tamoxifen vs. other more expensive endocrine agents against relapse risk. The results shown in this manuscript were from analyses using the 70:30 cutoff for the tamoxifen-treated and the external validation GUYT2 datasets, though similar results were obtained with a 50:50 cutoff. However, the samples in the Ma and Reid datasets were chosen to be balanced for recurrences within 5 years and non-recurrences after 6 years (case control studies, a non-consecutive series). Therefore, a 50:50 cutoff was used for these datasets to account for the balanced number of events. Performance of the classifier computed by LOOCV was assessed using Kaplan-Meier survival curves and log rank p-values. The overall performance of the classifier in the three adjuvant data sets was estimated using classical meta-analysis methods.
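The dichotomization can be illustrated with a quantile cutoff on the continuous risk score. This sketch assumes the cutoff is taken as a quantile of the scores in the dataset being classified, which the text does not state explicitly; the function name `risk_groups` is hypothetical.

```python
import numpy as np


def risk_groups(scores, low_fraction=0.7):
    """Label patients 'low' or 'high' risk from a continuous risk score.

    low_fraction=0.7 gives the 70:30 split; low_fraction=0.5 gives 50:50.
    Returns the labels and the cutoff value used.
    """
    scores = np.asarray(scores, dtype=float)
    # Assumption: cutoff is a quantile of the scores passed in.
    cutoff = np.quantile(scores, low_fraction)
    labels = np.where(scores > cutoff, "high", "low")
    return labels, cutoff
```

For a consecutive series one would call `risk_groups(scores)`; for the balanced case control series, `risk_groups(scores, low_fraction=0.5)`.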
Hazard ratios (HRs) for the risk groups defined by the classifier were calculated using Cox regression stratified by clinical center to account for possible heterogeneity in patient selection or other potential confounders among the various centers. For each independent validation data set, the HR (with its 95% confidence interval [CI]) was displayed on a forest plot, and the HRs were tested for heterogeneity using a chi-square test. HRs were then combined using the inverse variance-weighted method with a fixed effect model to compute an overall HR.
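A minimal sketch of the inverse variance-weighted fixed effect combination and the chi-square heterogeneity test (Cochran's Q) follows. It assumes that each study's standard error is recovered from its reported 95% CI on the log scale; the function name `fixed_effect_meta` is hypothetical.

```python
import math
from scipy.stats import chi2

Z975 = 1.959963984540054  # 97.5th percentile of the standard normal


def fixed_effect_meta(hrs, ci_lows, ci_highs):
    """Pool hazard ratios with inverse-variance weights (fixed effect model)."""
    log_hr = [math.log(h) for h in hrs]
    # Assumption: standard error recovered from the width of the 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z975)
          for lo, hi in zip(ci_lows, ci_highs)]
    w = [1.0 / s ** 2 for s in se]
    pooled = sum(wi * b for wi, b in zip(w, log_hr)) / sum(w)
    pooled_se = math.sqrt(1.0 / sum(w))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(wi * (b - pooled) ** 2 for wi, b in zip(w, log_hr))
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - Z975 * pooled_se),
               math.exp(pooled + Z975 * pooled_se)),
        "q": q,
        "p_heterogeneity": chi2.sf(q, df=len(hrs) - 1),
    }
```

A large heterogeneity p-value suggests the study-level HRs are compatible with a single common effect, which is the premise of the fixed effect model.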
To establish if the model predicted response to treatment, a univariate logistic regression model was used with the risk score as explanatory variable. Significance was determined by the Wald test and a false discovery rate (FDR) < 0.05.
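The text does not name the procedure used to control the false discovery rate. Assuming the widely used Benjamini-Hochberg step-up adjustment, the per-probe Wald p-values could be converted to FDR-adjusted values as in this sketch (the function name is hypothetical).

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out
```

Probes or clusters whose adjusted value falls below the chosen threshold (0.05 here) would then be called significant.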
Statistical analysis was performed using SPSS statistical software package version 13.0 and the R software package version 2.3.
Correlation with the grade gene expression index (GGI)
The Spearman correlation between the risk scores produced by the predictor and those produced by the GGI, previously described in Sotiriou et al., 2006, was calculated to assess the contribution of proliferation-related genes to the prognostic ability of the current predictor.
Network and pathway analysis
Analysis of gene interactions for each cluster of the final classifier was performed using Ingenuity Pathways Analysis (IPA) tools version 3.0. Affymetrix probe sets of each cluster were used as input to generate biological networks based on a curated list of molecular interactions in IPA. IPA then calculated a significance value for enrichment of the functional classes and canonical pathways generated for each of these networks. Only significant functions and pathways are shown.
Classifier performance on the training set
Performance of the classifier. Performance of the 13-cluster classifier algorithm, re-training and validation on the separate institutional populations using both leave-one-out and multiple 10-fold cross-validations.
Columns: training set (total/events); validation set (total/events); hazard ratio (95% CI); log rank p value; distant metastasis free survival (DMFS) for the low risk group only at 3, 5 and 10 years.
Rows: leave-one-out cross-validation (255/67); multiple 10-fold cross-validation (255/67).
Cox regression analysis. Univariate and multivariate Cox regression analysis for time to distant metastases in 255 patients.
Columns: hazard ratio (95% CI), univariate; hazard ratio (95% CI), multivariate.
Rows: histological grade (1 vs. 2 vs. 3); tumor size (≤ 20 mm vs. ≥ 20 mm); nodal status (positive vs. negative); ER high vs. low expression; PgR high vs. low expression; HER2 high vs. low expression; 13-cluster gene classifier.
Independent validation on external datasets: a meta-analysis
Mapping. Number of probe sets able to be mapped across datasets during independent validations.
Columns: total probe sets present in classifier; total n (%).
Prediction of response in metastatic breast cancer patients treated with tamoxifen
To delineate whether our classifier was predicting response to tamoxifen and/or the intrinsic aggressiveness of a breast tumor (prognostic), we applied our classifier to a data set of women who had received tamoxifen in the advanced setting, where response to the treatment was clearly defined. Twenty-nine of the 79 probes that could be mapped were significantly associated with clinical response (complete or partial vs. progressive disease, false discovery rate (FDR) < 0.05) and 4 cluster groups seemed to have some predictive ability (FDR < 0.15; pclusts 148, 120, 375, 201). However, overall, we found that our classifier had no discrimination ability in this group of patients. Interestingly, one cluster centroid, cluster 375, was significantly associated with response (FDR = 0.008), suggesting that this cluster of 3 genes [see Additional file 2b] could predict response to tamoxifen treatment. These results could imply that our classifier is mainly prognostic, though as only 30.5% of probe sets could be mapped from the cDNA platform, technical limitations could have significantly contributed to these results.
Correlation with the grade gene expression index (GGI)
The GGI is an algorithm which can quantify the expression of proliferation genes in a breast tumor. Given that many current prognostic predictors derive a significant proportion of their discriminatory ability from proliferation-related genes, we were interested in assessing this in our current predictor. Despite the different discovery methods, the risk scores produced by our classifier and the GGI were highly correlated (GUYT2 0.91; Reid 0.86; Ma 0.69; Jansen 0.55; all p-values < 0.05), suggesting that a significant proportion of its predictive power can be attributed to cell cycle-related genes.
Functional analysis. Functional analysis of the 13 clusters from the gene signature (for full gene list [see Additional file 2b]).
Columns: top network overall; top high level function; top canonical pathway; number of focus genes able to be mapped.
Entries listed: cancer, inflammatory disease, cell cycle; lipid metabolism, molecular transport; cAMP mediated signaling; cancer, immune response; 1 carbon pool by folate; gene expression, protein synthesis; cell cycle, cellular movement; cellular movement, inflammatory disease; DNA recombination and repair; cell morphology, cellular development; DNA recombination and repair; cell to cell signaling and interaction; cell death, cellular development; cellular function and maintenance.
Developing gene signatures that are stable, are effective at distinguishing prognostic groups and provide important biological information from whole genome microarray data remains a significant challenge. In an attempt to address these issues, we propose a method which has similarities to a technique proposed by Bair and colleagues [21, 22], in combination with an estimation of signature stability and, to our knowledge, the largest dataset of homogeneously treated ER+ patients. Whilst Bair et al. used the clinical data to define a subset of survival-related genes prior to clustering, we performed an initial unsupervised clustering procedure to form clusters that could act as biological networks, which were then used as single variables to build the classifier. We hypothesized that this would limit the effect the training set has over the final selection of genes for inclusion in the classifier and allow a larger gene list for biological hypothesis generation. The inclusion of an assessment of "stability" facilitates determination of the most robust variables and hence, presumably, important biological information.
With this method, we were able to develop and validate a gene classifier that could predict which patients with ER+ BC were at high risk of relapse despite tamoxifen treatment. Importantly, we were able to validate the classifier on independent samples, utilizing raw data from different microarray platforms and a meta-analytical approach. Demonstration of prognostic ability is important if we are to assemble gene lists from microarray data for biological hypothesis generation and potential laboratory experimental validation, which was one of the most important aims of this study. Validating gene classifiers on samples independent of those from which they were developed is a major challenge for microarray studies, especially those with clinical implications, and combining multiple datasets can be difficult due to different patient populations, sample preparation and microarray platforms. Our study uses one of the largest training and validation sets reported in the literature on tamoxifen (only) treated patients.
Whilst in the future we may have a microarray-based diagnostic test incorporating all 181 genes in the 13 clusters, at present the routine use of this technology is not logistically feasible. However, the advantage of our approach is that each cluster consists of a group of highly correlated genes that effectively act as one covariate. Thus, a diagnostic test of just 13 genes (one per cluster) could be developed for clinical use if desired, even though for biological study the researcher would be more interested in all the genes per cluster. To demonstrate this, we took a series of 13 individual probe sets (one per cluster) and correlated their performance with the full classifier on the training set of 255 patients. The median correlation was 0.94 (range: 0.88–0.97). The top 26 ranked 13-gene classifiers (with a correlation ranging from 0.95–0.97) and their corresponding probe sets are listed in [Additional file 7]. These "simple" tests will require further independent assessment but could be validated using immunohistochemistry or quantitative RT-PCR and are an attractive option for potential clinical implementation.
Due to the pressing clinical need, several other investigators have also developed gene predictors that can predict outcome in ER+ BC treated with adjuvant tamoxifen monotherapy [11, 13, 23, 24]. These studies have used a variety of bioinformatics approaches to develop these gene signatures. These range from a candidate gene approach and selection of genes using a biological approach to, as in our study, a discovery-based approach using supervised analyses correlated with clinical outcome. Likewise, different patient populations were used in the development process. Ours is the only study to use a large, consecutive series of patients as a training set, as opposed to samples obtained from a clinical trial or a case control population. Only one of these reported gene classifiers has undergone noteworthy clinical validation; unfortunately, however, these genes provide no new potential therapeutic targets or insights into the underlying biology. Of note, we have previously published that proliferation-related genes are the common biological thread linking many of these currently published classifiers [5, 6]. Our current classifier also has a significant number of cell cycle genes and is highly correlated with the GGI, but one of the aims of this study was to identify other potential biological mechanisms upstream of proliferation. All the clusters in the final classifier were those most commonly chosen during the cross-validation process, suggesting the presence of other strong biological signals. Further experimental validation in in-vitro and in-vivo models will be required to test these hypotheses and their relevance to the clinical question. Interestingly, cluster 375 was significantly predictive in the dataset of metastatic breast cancer patients treated with tamoxifen as first line treatment for relapsed disease. However, we were not able to validate the full gene classifier in this setting.
The best approach to distinguishing prognosis from therapy prediction using gene expression profiling remains unclear. It is possible that developing a predictor of true response to therapy may only be possible using samples from a randomized trial in the metastatic setting, where response can be clearly defined and transcriptional profiles can be compared with an untreated control group.
Using a discovery-based whole genome approach, we have developed and validated a gene classifier that can distinguish patients at high risk of distant metastasis despite adjuvant tamoxifen monotherapy. In the future, these poor prognosis patients could be selected for prescription of other treatment modalities, such as chemotherapy and/or biological agents. In this study we propose an approach which has the advantage of facilitating both signature stability and biological interpretation. These are critical issues in the challenging task of building gene predictors for breast cancer patients as we endeavor to delineate meaningful biological and clinically useful information from the microarray-produced data.
Sherene Loi is supported by the American Society of Clinical Oncology (ASCO) Young Investigators' grant and the National Breast Cancer Foundation of Australia. Christos Sotiriou, Christine Desmedt and Benjamin Haibe-Kains are supported by the Belgian National Foundation for Cancer Research. Christos Sotiriou is supported by the E. Lauder Breast Cancer Foundation, and the MEDIC Foundation.
- Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001, 98 (19): 10869-10874. 10.1073/pnas.191367098.
- Sotiriou C, Neo SY, McShane LM, Korn EL, Long PM, Jazaeri A, Martiat P, Fox SB, Harris AL, Liu ET: Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A. 2003, 100 (18): 10393-10398. 10.1073/pnas.1732912100.
- Michiels S, Koscielny S, Hill C: Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet. 2005, 365 (9458): 488-492. 10.1016/S0140-6736(05)17866-0.
- Ein-Dor L, Kela I, Getz G, Givol D, Domany E: Outcome signature genes in breast cancer: is there a unique set?. Bioinformatics. 2005, 21 (2): 171-178. 10.1093/bioinformatics/bth469.
- Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, Desmedt C, Larsimont D, Cardoso F, Peterse H, Nuyten D, Buyse M, Van de Vijver MJ, Bergh J, Piccart M, Delorenzi M: Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006, 98 (4): 262-272.
- Loi S, Haibe-Kains B, Desmedt C, Lallemand F, Tutt AM, Gillet C, Ellis P, Harris A, Bergh J, Foekens JA, Klijn JG, Larsimont D, Buyse M, Bontempi G, Delorenzi M, Piccart MJ, Sotiriou C: Definition of clinically distinct molecular subtypes in estrogen receptor-positive breast carcinomas through genomic grade. J Clin Oncol. 2007, 25 (10): 1239-1246. 10.1200/JCO.2006.07.1522.
- Kittler J, Hatef M, Duin R, Matas J: On Combining Classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998, 10 (3): 226-238. 10.1109/34.667881.
- Haibe-Kains B, Desmedt C, Loi S, Delorenzi M, Sotiriou C, Bontempi G: Computational Intelligence in Clinical Oncology - a case study. Studies in Computational Intelligence. Edited by: Smolinski TG, Milanova MM, Hassanien AE. 2008, Springer-Verlag, Applications of computational intelligence in bioinformatics and biomedicine: current trends and open problems.
- Davis CA, Gerick F, Hintermair V, Friedel CC, Fundel K, Kuffner R, Zimmer R: Reliable gene signatures for microarray classification: assessment of stability and performance. Bioinformatics. 2006, 22 (19): 2356-2363. 10.1093/bioinformatics/btl400.
- Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003, 19 (2): 185-193. 10.1093/bioinformatics/19.2.185.
- Ma XJ, Wang Z, Ryan PD, Isakoff SJ, Barmettler A, Fuller A, Muir B, Mohapatra G, Salunga R, Tuggle JT, Tran Y, Tran D, Tassin A, Amon P, Wang W, Enright E, Stecker K, Estepa-Sabal E, Smith B, Younger J, Balis U, Michaelson J, Bhan A, Habin K, Baer TM, Brugge J, Haber DA, Erlander MG, Sgroi DC: A two-gene expression ratio predicts clinical outcome in breast cancer patients treated with tamoxifen. Cancer Cell. 2004, 5 (6): 607-616. 10.1016/j.ccr.2004.05.015.
- Reid JF, Lusa L, De Cecco L, Coradini D, Veneroni S, Daidone MG, Gariboldi M, Pierotti MA: Limits of predictive models using microarray data for breast cancer clinical treatment outcome. J Natl Cancer Inst. 2005, 97 (12): 927-930.
- Jansen MP, Foekens JA, van Staveren IL, Dirkzwager-Kiel MM, Ritstier K, Look MP, Meijer-van Gelder ME, Sieuwerts AM, Portengen H, Dorssers LC, Klijn JG, Berns EM: Molecular classification of tamoxifen-resistant breast carcinomas by gene expression profiling. J Clin Oncol. 2005, 23 (4): 732-740. 10.1200/JCO.2005.05.145.
- Praz V, Jagannathan V, Bucher P: CleanEx: a database of heterogeneous gene expression data based on a consistent gene nomenclature. Nucleic Acids Res. 2004, 32 (Database issue): D542-7. 10.1093/nar/gkh107.
- Cochran WG: Problems arising in the analysis of a series of similar experiments. Journal of the Royal Statistical Society. 1937, 4: 102-118.
- Team RCD: The R Project for Statistical Computing. [http://www.r-project.org]
- Systems I: Ingenuity Pathway Analysis. [http://www.ingenuity.com]
- Liu LT, Peng JP, Chang HC, Hung WC: RECK is a target of Epstein-Barr virus latent membrane protein 1. Oncogene. 2003, 22 (51): 8263-8270. 10.1038/sj.onc.1207157.
- Prasad A, Fernandis AZ, Rao Y, Ganju RK: Slit protein-mediated inhibition of CXCR4-induced chemotactic and chemoinvasive signaling pathways in breast cancer cells. J Biol Chem. 2004, 279 (10): 9115-9124. 10.1074/jbc.M308083200.
- Turner S, J AS, Cameron D: Tamoxifen treatment failure in cancer and the nonlinear dynamics of TGFbeta. J Theor Biol. 2004, 229 (1): 101-111. 10.1016/j.jtbi.2004.03.008.
- Bair E, Tibshirani R: Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol. 2004, 2 (4): E108. 10.1371/journal.pbio.0020108.
- Park MY, Hastie T, Tibshirani R: Averaged gene expressions for regression. Biostatistics. 2007, 8 (2): 212-227. 10.1093/biostatistics/kxl002.
- Oh DS, Troester MA, Usary J, Hu Z, He X, Fan C, Wu J, Carey LA, Perou CM: Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol. 2006, 24 (11): 1656-1664. 10.1200/JCO.2005.03.2755.
- Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL, Bryant J, Wolmark N: A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004, 351 (27): 2817-2826. 10.1056/NEJMoa041588.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.