A qualitative transcriptional signature to reclassify histological grade of ER-positive breast cancer patients

Background Histological grade (HG) is commonly adopted as a prognostic factor for ER-positive breast cancer patients. However, HG evaluation methods, such as the pathological Nottingham grading system, are highly subjective with only 50–85% inter-observer agreements. Specifically, the subjectivity in the pathological assignment of the intermediate grade (HG2) breast cancers, comprising of about half of breast cancer cases, results in uncertain disease outcomes prediction. Here, we developed a qualitative transcriptional signature, based on within-sample relative expression orderings (REOs) of gene pairs, to define HG1 and HG3 and reclassify pathologically-determined HG2 (denoted as pHG2) breast cancer patients. Results From the gene pairs with significantly stable REOs in pathologically-determined HG1 (denoted as pHG1) samples and reversely stable REOs in pathologically-determined HG3 (denoted as pHG3) samples, concordantly identified from seven datasets, we extracted a signature which could determine the HG state of samples through evaluating whether the within-sample REOs match with the patterns of the pHG1 REOs or pHG3 REOs. A sample was classified into the HG3 group if at least a half of the REOs of the 10 gene pairs signature within this sample voted for HG3; otherwise, HG1. Using four datasets including samples of early stage (I–II) ER-positive breast cancer patients who accepted surgery only, we validated that this signature was able to reclassify pHG2 patients into HG1 and HG3 groups with significantly different survival time. For the original pHG1 and pHG3 patients, the signature could also more accurately and objectively stratify them into distinct prognostic groups. And the up-regulated and down down-regulated genes in HG1 compared with HG3 involved in cell proliferation and extracellular signal transduction pathways respectively. By comparing with existing signatures, 10-GPS was with prognostic significance and was more aligned with survival of patients especially for pHG2 samples. Conclusions The transcriptional qualitative signature can provide an objective assessment of HG states of ER-positive breast cancer patients, especially for reclassifying patients with pHG2, to assist decision making on clinical therapy.


Introduction
Breast cancer has the highest incidence and mortality among females [1]. The microscopic morphological assessment of the degree of tumor cell differentiation, represented as tumor histological grades (HGs), has powerful prognostic prediction capability in breast cancer [2][3][4][5] and has been incorporated into the eighth edition of American Joint Commission of Cancer staging system [6]. According to the Nottingham grading system, after assessing tubule formation (tubularity), nuclear pleomorphism (nuclearity) and mitotic count, each patient can be assigned to histologic grade 1 (HG1, welldifferentiated, slow-growing tumor), histologic grade 2 (HG2, moderately differentiated, slightly faster growing tumor) or histologic grade 3 (HG3, poorly differentiated, highly proliferative tumor) [5]. The higher grade is associated with lower survival rate [3,4,7]: the 5-year survival rates of untreated HG1, HG2 and HG3 patients are 95%, 75% and 50%, respectively [5,8,9]. Considering the excellent prognoses, HG1 patients are amenable for a mild and less harmful anti-cancer therapy. On the contrary, HG3 patients require a more powerful anti-cancer therapy. Genomics analysis indicates that HG1 and HG3 breast carcinomas develop independently along different genetic pathways [10,11], while HG2 patients (comprising ~50% of breast cancer cases) contain a blend of histological features, some of which are common to both HG1 and HG3 tumors, and exhibit a mixed gene expression profiles of HG1 and HG3 [12,13]. Thus, HG2 breast carcinomas should not be classified as individual HG, but represent clinical and molecular hybrids between HG1 and HG3 diseases [14,15]. The heterogeneity of the HG2 breast cancers resulted in uncertain disease outcome prediction and there is no standard treatment protocol for 4 clinical decision making [7,16].
However, the pathological Nottingham grading system, the most employed HG evaluation method, is dependent on adequately prepared hematoxylin-eosin-stained tumor tissue sections to be assessed by an appropriately trained pathologist, which is highly subjective with only 50%-85% inter-observer agreements [17][18][19][20]. And the consensus was even lower for HG2 samples [15,21,22]. Therefore, many studies have tried to identify transcriptional signatures to reclassify pathologically-determined HG states especially HG2 (pHG2) status of patients in order to improve the therapeutic planning for breast cancer patients [13,16,[23][24][25]. However, most of the previously proposed signatures for classifying samples were based on summarized expression measurements of the signature genes, which lack robustness for clinical applications due to widespread batch effects and quality uncertainties of clinical samples [26][27][28][29]. The data normalization of samples collected in advance also hinders the feasibility of these signatures in routine clinical practice [16]. In contrast, the qualitative transcriptional signatures based on the within- Approximately 70% breast cancer patient express estrogen receptor (ER) according to American Cancer Society [1], adjuvant endocrine therapy is the routine regimen, and only for ER-positive patients with high HG, combined chemotherapy is suggested. It has been reported that there is a significant transcriptional difference between ER-positive and ERnegative cohorts [38]. ER-positive status is associated with a heterogeneous mixture of histologic grades, whereas ER-negative status is generally associated with HG3 [39].
In this study, we aimed to develop a qualitative transcriptional signature to identify HG states objectively in ER-positive breast cancers. Using gene expression profiles of 932 ERpositive early stage breast cancer patients, we developed a qualitative signature to allocate each patient into the pathologically-determined HG1 (denoted as pHG1) or HG3 (denoted as pHG3) group. Using four independent validation datasets including a total 524 samples of ER-positive breast cancer patients who accepted surgery only, the signature could find out a certain percentage of pHG1 patients as HG3 patients with worse prognoses and some pHG3 patients identified as HG1 patients with better prognoses.
Especially, we adopted an objective approach to validate the signature through evaluating whether the pHG2 patients reclassified as HG1 had better prognoses than the pHG2 patients reclassified as HG3.

Development of the REO-basedgrade signature
The flowchart of this study was described in Fig. 1. For each of the seven training datasets (Table 1), with FDR < 0.1, we firstly identified gene pairs with significantly stable REOs in the pHG1 and pHG3 groups, respectively, and then identified gene pairs with reversal REOs between the two groups. Then, 437 gene pairs were commonly identified from the seven datasets and they consistently showed the same reversal REO patterns between the pHG1 and pHG3 groups in the seven datasets. Next, we performed a forward-stepwise selection procedure to search a set of gene pairs that achieved the highest F-score 6 according to the classification rule as follows: a sample was classified into the HG3 group if at least a half of the REOs of the set of gene pairs within the sample voted for HG3; otherwise, into the HG1 group. Finally, we obtained 10 gene pairs, denoted as 10-GPS (Table 2), to distinguish different histological grades with the highest F-score (0.8884). In the training data, the apparent specificity for HG1 samples was 90.77% and the apparent sensitivity for HG3 samples was 86.99%. The performance of the transcriptional grade signature in each training dataset can be found in Additional file 1: Table S1. Notably, the apparently imperfect performance should be reasonable because HG evaluation based on the pathological Nottingham grading system is highly subjective with only 50%-85% interobserver agreements [17][18][19][20]. We speculated that the 10-GPS could provide a more objective and clinically relevant measure of tumor grade with prognostic significance.
We validated the above speculation based on the knowledge that HG3 patients were with lower survival rate than HG1 patients [3,7]. Here, we collected another four independent datasets (Table 1) including samples with RFS or OS data of early stage ER-positive breast cancer patients who accepted surgery only. When the 10-GPS was applied to these datasets, the averaged apparent sensitivity for all HG3 samples was 83.1% and the average apparent specificity for all HG1 samples was 78.4%. In a merged dataset from the three validation datasets with the RFS information, the 12 pHG3 patients reclassified as HG1 by the signature had significantly higher 10-years RFS rate than that of the 46 HG3 patients confirmed by the signature (p=0.0143; HR=8.17, 95% CI: 1.10-60.6; C-index = 0.61, Fig. 2a). And, we also compared 10-years RFS rates between the 13 pHG1 patients reclassified as HG3 by the 10-GPS and the 93 HG1 patients confirmed by the 10-GPS.
In each of the three validation datasets with the RFS information, we also compared the survival between the pHG1 and pHG3 patients diagnosed by the pathological Nottingham grading system and the survival between the HG1 and HG3 patients reclassified by the 10-GPS from the pHG1 and pHG3 patients. Significant difference of RFS between the pHG1 and pHG3 patients was observed only in GSE6532 and GSE4922 dataset (Additional file 2:

Application of the signature to reclassification of HG2 samples
Then, we used the 10-GPS to reclassify the pHG2 samples of the above four validation datasets with RFS or OS information (Table 1) into the HG1 and HG3 groups, respectively, and evaluated their prognostic differences.

Transcriptional characteristics of the low-HG and high-HG samples recognized by the 10-GPS
In the TCGA-BRCA dataset, we used the limma algorithm and found 6,194 differentially overlapped DEGs was 99.92%, which was unlikely to happen by chance (binomial test, p < 1.10E-16). Moreover, after reclassifying HG status of samples for each of the four validation, we identified differential expressed genes identified from pHG1 and pHG3 samples, from reclassified HG1 and HG3 samples from pHG1 and pHG3 samples, from reclassified HG1 and HG3 samples from pHG2 samples, and reclassified HG1 and HG3 samples from overall samples respectively (Fig. 6). After identified enriched pathway lists by differential expressed genes for all the four datasets, the most common pathways were associated with proliferation and extracellular signal transduction (Additional file 5: Fig.   S3) The clearer transcriptional differences between the two reclassified groups indicated that the 10-GPS could more accurately and objectively stratify samples into distinct histological grade groups.

Comparison of 10-GPS prognostic significance with GGI, Oncotype DX and PAM50
For each of the four validation datasets, we obtained relapse risk by using existing signatures such as Gene expression Grade Index (GGI) [41] PAM50 [42]and Oncotype DX [43]. Chi-square test were conducted, revealing that there were significant difference in relapse risks generated by almost all the three existing signatures between HG1 and HG3 cohorts (Fig.7, Additional file 1: Table S5). This indicating that histological grade status reclassified by 10-GPS were with prognostic significance. Meanwhile, no prognostic differences between low-risk group identified by GGI and HG1 group identified by 10-GPS were observed (Fig. 8 a-d). It was similar for high-risk group identified by GGI and HG3 group identified by 10-GPS were observed ( Fig. 8 e-h). Significant prognostic difference between HG1 samples who were identified as low-risk by GGI and HG1 samples who were identified as high-risk by GGI was observed only in GSE4922, which might result from unbalanced samples (Additional file 6 Fig. S4 a-d). No significant prognostic difference between HG3 samples who were identified as low-risk by GGI and HG3 samples who were identified as high-risk by GGI was observed(Additional file 6 Fig. S4 e-h). It showed that the prognostic significance of 10-GPS was more aligned with the survival of patients.
Moreover, HG1 samples who were identified as high-risk and HG3 samples who were identified as low-risk were mainly from pHG2 cohort. The consistency between the result and prior knowledge that pHG2 are with low inter-observer agreements was reasonable. It implied that 10-GPS was feasible for reclassifying histological grade status, especially for pHG2 samples. All these comparisons indicated that 10-GPS could effectively reclassify into distinct histological grade groups with significantly prognostic difference.

Discussion
In this study, we developed a histological grade signature consisting of 10 gene pairs Fortunately, based on the working assumption that the majority labels of the pHG1 and pHG3 samples were right, thus we employed a supervised learning method to develop the signature. Imperfect F-score of 0.8884 just suggested that the 10-GPS did not over-fit the training dataset. There's no surprise that the apparent sensitivity for all HG3 samples was 83.10% and the apparent specificity for all HG1 samples was 78.40% in the validation datasets. In this study, we adopted a more objective approach to validate the signature through evaluating whether the reclassified HG1 patients could have better prognosis than that of HG3 patients. In four independent validation datasets, the reclassified HG1 and 11 HG3 groups recognized by the 10-GPS from the original HG2 patients or from the original HG1 and HG3 patients had significantly different survival.
We expected that the 10-GPS can replace or serve as auxiliary reference of the pathological Nottingham grading system to stratify ER-positive patients into two distinct groups in clinical practices. When applying the 10-GPS to all 132 samples of the GSE7390 dataset, the grade signature classified 66 patients into the HG1 group and 66 patients into the HG3 group. The RFS rate of the former group was significantly higher than that of the Some signatures had been developed to re-classify the HG status of BC samples, for example, during the development of GGI, samples with ≥100ng and a RIN≥7 were considered as qualified, and the quantitative threshold has been adopted during the reclassification process which might be affected by constituent ratios of samples [41], which results lacking of reproducibility for datasets generated by different labs or platforms and A limitation of this study is that we were unable to directly evaluate the signature in RNAsequencing, such as those archived in the TCGA database, where no patients accepted surgery only. Here, we only indirectly validate the 10-GPS in RNA-seq data of the TCGA-BRCA dataset through analysis of DEGs between the HG1 and HG3 groups. Another limitation of our study is no evidence could prove that application of the signature to patients who have undergo chemotherapy or radiation therapy is feasible (Additional file 8 Fig. S6). In the future, we will evaluate the performance of the signature developed in this study for expression data produced by RNA-sequencing or PCR platforms and evaluate applying the signature to patients who have undergo chemotherapy or radiation therapy.

Conclusions
Pathological histological grade evaluation methods are with high subjectivity, especially for the evaluation of HG2 breast cancer specimens. The transcriptional qualitative signature is objective for the evaluation and is robust for application of microscale samples, samples with lower RIN and samples with low tumor-purity, and can assist making on clinical therapy especially for patients with pHG2.

Data collection and Pre-processing
We collected gene expression profiles of 932 ER-positive breast cancer samples with pHG1 or pHG3 diagnosed by the pathological Nottingham grading system. To evaluate whether the reclassified two groups have significantly different survival time, we also collected independent expression data of 524 early stage (I-II) ER-positive breast cancer patients who accepted surgery only. All the breast cancer datasets used in this study were summarized in Table 1. The overall pathological histologic grades of TCGA samples were obtained from the study of Zheng Ping et al [44].
For Affymetrix array data, raw intensity files (.cel), downloaded from the Gene Expression Omnibus database, were processed with the Robust Multichip Average algorithm (RMA) algorithm for background adjustment without quantile normalization [45]. For Illumina beadchip data, the normalized expression data under accession number 13 EGAD00010000210 and EGAD00010000211 [40] were downloaded from the European Genome-Phenome Archive (http://www.ebi.ac.uk/ega/). When processing the data of the two platforms, each probe set ID was mapped to Gene ID according the corresponding annotation files, and then probe sets that mapped to multiple Gene IDs or did not map to any Gene ID were removed. The expression measurements of all probe sets corresponding to the same Gene ID were averaged to obtain a single measurement (on the log2 scale).

Development of the transcriptional signature for histological grade
Firstly, we identified the significantly stable REOs in pHG1 groups of each training dataset. For a given gene pair (G i , G j ), if the REO pattern (G i >G j or G i <G j ) was kept in more samples than expected by random chance, we defined the REO pattern of this gene pair, G i >G j in pHG1 group (or equally G i <G j in pHG3 group) as a stable REO characterizing pHG1 samples. The significance of the REO pattern is determined by a binomial test [47] as follows,

See Formula 1 in Supplemental Files.
where s is the number of samples in which gene i has a higher (or lower) expression level than gene j in a total of n samples, p 0 is the probability of observing a certain REO pattern 14 (G i > G j or G i < G j ) in a sample by chance (p 0 = 0.5). The Benjamini-Hochberg multiple testing correction was used to estimate the false discovery rate (FDR) [48]. Then, we identified the gene pairs with stable REOs in pHG3 group but reversal REO patterns between the pHG1 and pHG3 groups in each training dataset.
After selecting gene pairs with concordant reversal REOs among the seven training datasets, a forward-stepwise selection algorithm was performed to search for optimal subset of these gene pairs that resulted in the highest F-score. The F-score, harmonic mean of sensitivity and specificity, was calculated as follows,

See Formula 2 in Supplemental Files.
where sensitivity was defined as the proportion of correctly identified HG3 samples among all pHG3 samples, and specificity was defined as the proportion of correctly identified HG1 samples among all pHG1 samples.

Survival analysis
Recurrence-free survival (RFS) and overall survival (OS) served as the prognosis endpoint.
Kaplan-Meier survival plots and log-rank tests [49] were used to evaluate the differences in RFS and OS of distinct groups. The Cox proportional-hazards model was also performed to calculate the hazard ratios (HRs) and their 95% confidence intervals (CIs) [50]. To evaluate the predictive performance of a signature we also adopted the concordance index (C-index), which is a measure of overall concordance between predicted risk scores and observed survival [51,52].

Differential expression and functional enrichment analysis
After using limma package in R, the expression values of all tumor samples of TCGA-BRCA dataset were log-transformed by voom and the batch effects such as plate was corrected by removeBatchEffect. Then differential expressed genes were identified. The Fisher [53] was used to determine the significance of biological pathways enriched with a set of 15 interested genes by hypergeometric distribution test.

Declarations
Additional file 1: Table S1. The performance of the transcriptional grade signature in each training dataset, shown with apparent specificity, sensitivity and F-score.

Table S2. The enriched biological pathways by DEGs between the LHG and HHG group, performed by accumulated hypergeometric analysis.
Additional file 2: Fig. S1. Kaplan-Meier estimates of survival. (a-c) Relapse-free survival curves for pHG1 and pHG3 patients in dataset GSE7390, GSE6532 and GSE4922.      Composition of samples for HG1 and HG3 groups in four validation datasets by