Involvement of potential pathways in malignant transformation from Oral Leukoplakia to Oral Squamous Cell Carcinoma revealed by proteomic analysis

Background: Oral squamous cell carcinoma (OSCC) is one of the most common forms of cancer associated with the presence of precancerous oral leukoplakia. Given the poor prognosis associated with oral leukoplakia and the difficulty of distinguishing it from cancerous lesions, there is an urgent need to elucidate the molecular determinants and critical signaling pathways underlying the malignant transformation of precancerous to cancerous tissue, and thus to identify novel diagnostic and therapeutic targets.
Results: We used two-dimensional electrophoresis (2-DE) followed by ESI-Q-TOF-LC-MS/MS to identify proteins differentially expressed in six pairs of oral leukoplakia tissues with dysplasia and oral squamous cancer tissues, each pair collected from a single patient. Eighty-five differentially and consistently expressed proteins (> two-fold change, P < 0.05) were identified, including 52 up-regulated and 33 down-regulated. Gene ontology methods were employed to identify the biological processes that were over-represented in this carcinogenic stage. Biological networks were also constructed to reveal the potential links between these protein candidates. Among them, the three homologs of proteasome activator PA28 (α, β and γ) were shown to have up-regulated mRNA levels in OSCC cells relative to oral keratinocytes.
Conclusion: Differentially expressed proteins, at varying levels, may be involved in the malignant transformation of oral leukoplakia. Their expression levels, biological processes, and interaction networks were analyzed using a bioinformatics approach. This study shows that the three homologs of PA28 may play an important role in malignant transformation, and it is an example of a systems biology study in which functional proteomics was used to help elucidate mechanistic aspects and the potential involvement of proteins. Our results provide new insights into the pathogenesis of oral cancer. These differentially expressed proteins may be useful candidate markers of OSCC.


Background
Oral, head, and neck squamous cell carcinoma is one of the most common forms of cancer associated with the presence of precancerous lesions. Oral squamous cell carcinoma (OSCC) is now believed to follow this pattern in its development and thus to be preceded by precancerous lesions, among which oral leukoplakia (OLK) is the most common type. The World Health Organization (WHO) first defined oral leukoplakia as a white plaque that could not be characterized clinically or pathologically as any other disease of the oral mucosa. The malignant potential of oral leukoplakia is evidenced by its progression from metaplasia without dysplasia to low grade dysplasia, high grade dysplasia, and ultimately invasive carcinoma [1]. The risk of developing malignancy is 8-10 times higher in people who have oral leukoplakia than in people who do not [2]. The risk also increases with the grade of dysplasia [3]. There is an urgent need to elucidate the molecular determinants and key signaling pathways underlying the malignant transformation from precancerous to cancerous tissue, and thus to identify novel diagnostic and therapeutic targets.
Proteomics is an established molecular profiling technology that may significantly accelerate human cancer research. Recently, considerable progress has been made in oral cancer proteomics, generating some potential applications in this emerging field. This technology platform has been used to discover highly sensitive and specific protein markers for oral cancer diagnosis and prognosis by comparing the protein profiles of cancer cells [4,5], tissues [6], plasma [7], and saliva [8,9] with appropriate controls. However, there are few reports that discriminate the protein expression profiles of tumors from those of precancerous lesions with different stages of dysplasia.
The development of bioinformatics tools has allowed the compilation of searchable genomic and proteomic databases accessible via the Internet. Among these tools, Gene Ontology (GO) and pathway analysis are considered powerful approaches in systems biology for elucidating the complexity of expression profiles in cellular processes. A GO term describes the role of a given gene in a biological process, its molecular function, and its cellular component. Each gene is assigned GO terms at different levels, ranging from high-level, broadly descriptive terms to very low-level, highly specific terms [10]. Thus, profiling the expression data based on GO provides another dimension for understanding the key regulatory processes in oral cancer. Pathway analysis reveals the interactions between proteins, thus quickly generating new insights into the potentially complex molecular mechanisms underlying disease-related processes [11].
In this study, we evaluated protein expression differences to identify potential biomarkers of disease progression from oral leukoplakia to OSCC and to gain further insight into the potential mechanisms underlying this transformation. Six pairs of protein lysates were obtained from six patients. The tissues were analyzed by two-dimensional gel electrophoresis, followed by ESI-Q-TOF tandem mass spectrometry. GO analysis was applied to identify biological processes over-represented in the carcinogenesis. Biological networks were also constructed to reveal the potential links between the protein candidates. Using this approach, new therapeutic targets or protein markers may be identified to improve patient survival.

2-DE profiling of OSCC and the oral leukoplakia tissues
A total of six pairs of OSCC tumor tissues and oral leukoplakia control tissues were obtained from six patients. Figure 1 shows representative clinical photographs and HE-stained histological sections. 2-DE with immobilized pH gradients was performed to study the expression patterns of proteins extracted from both tissues, and each sample was analyzed twice to ensure reproducibility. Figure 2 shows representative 2-DE patterns obtained from the paired tissues. After automatic spot detection, background subtraction, and volume normalization, 859 ± 68 protein spots were detected in OSCC tissues and 844 ± 56 in control tissues. Of these spots, 730 (85%) in OSCC tissues and 708 (84%) in control tissues were reproducibly detected in all twelve runs, and only the reproducibly detected spots were subjected to statistical analysis (p < 0.05). As a result, 85 protein spots showed more than two-fold changes in at least 4 of the twelve repeats (marked in Figure 2), while 68 proteins (80%) showed the change in more than 8 of 12 pairs. Eleven proteins were selected as examples (boxed in Figure 2) to show the consistent expression changes in enlarged form (see additional file 1).
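As a quick check on the reproducibility figures above, the reported percentages can be recomputed from the spot counts. The short Python sketch below is only an illustration: the helper name `percent` is ours, and using the mean spot counts (859 and 844) as denominators is an assumption about how the percentages were derived.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts quoted in the text; the denominators are the mean spot counts (assumed).
counts = {
    "OSCC spots detected in all twelve runs": (730, 859),
    "control spots detected in all twelve runs": (708, 844),
    "proteins changed in more than 8 of 12 pairs": (68, 85),
}

for label, (part, whole) in counts.items():
    print(f"{label}: {percent(part, whole)}%")
```

Under that assumption the recomputed values agree with the 85%, 84% and 80% quoted in the text.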

Identification of putative OSCC biomarkers
As shown in Table 1, a total of 85 differentially expressed proteins, including 52 up-regulated and 33 down-regulated proteins, were identified. Proteasome activator complex PA28 α and β were chosen for further validation. Both exhibited high expression levels in cancer tissues (4-6 fold increase) compared with the precancerous OLK tissues. The mass spectra for both are shown in Figure 3.

Finding functional enrichment in transformation from oral leukoplakia to OSCC through gene ontology
Gene ontology methods were employed to structure the biological processes that were over-represented in the carcinogenic stage from oral leukoplakia to infiltrative oral cancer. A web-based tool (GOTree) was used to score the biological processes of the proteins encoded by the genes, and those with a level higher than 4 were highlighted. The observed number of genes in each GO category was compared with the number expected from a Homo sapiens reference data set. Analysis of the tissue expression profile with GOTree showed that 18 novel proteins, each marked with an asterisk in Table 1, are expressed in the head and neck; the corresponding bar chart is shown in Figure 4. The x-axis represents the different tissues, and the y-axis indicates the proteins that have been reported in the corresponding tissue. Figure 5 shows the tree-like structure with the respective molecular functions, cellular components and biological processes. Taking the biological processes as an example, six processes were involved: response to stimulus (including heat shock 70 kDa, 60 kDa and 27 kDa proteins, tubulin, heat shock 70 kDa protein 8, annexin A5, proliferating cell nuclear antigen, S100A7, A8 and A9, annexin A1 and A8, and interleukin 4); physiological response (including serpin peptidase inhibitor, Rho GDP dissociation inhibitor beta, fibrinogen, and PA28 α and β); negative regulation of biological process (including capping protein muscle, annexin A4, SET translocation, glutathione S-transferase, S100A11, annexin A1, non-metastatic cells 2 protein, and prohibitin); locomotion (including annexin A1, SERPINB5, Rho GDP dissociation inhibitor (GDI) alpha, tropomyosin 1, and heat shock 27 kDa protein 1); cell death (including heat shock 27 and 70 kDa proteins, annexin A4, annexin A5, galectin 7, and prohibitin); and coagulation (including annexin A4, A5 and A8, and fibrinogen).
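The comparison of observed with expected gene counts in a GO category can be illustrated with a small calculation. The Python sketch below assumes a hypergeometric model, which is a common choice for this kind of over-representation test; the text does not state which statistic the tool applied, and the function names and all numbers in the example are hypothetical rather than taken from this study.

```python
from math import comb

def expected_count(n_study, n_category, n_reference):
    """Genes expected in a category if the study set were a random draw."""
    return n_study * n_category / n_reference

def enrichment_pvalue(k_observed, n_study, n_category, n_reference):
    """P(X >= k_observed) under the hypergeometric model (assumed statistic).

    n_reference: genes in the reference set
    n_category:  reference genes annotated to the GO category
    n_study:     genes in the study set
    """
    upper = min(n_study, n_category)
    favourable = sum(
        comb(n_category, i) * comb(n_reference - n_category, n_study - i)
        for i in range(k_observed, upper + 1)
    )
    return favourable / comb(n_reference, n_study)

# Hypothetical numbers for illustration only.
n_reference, n_category, n_study, k_observed = 20000, 400, 85, 12
print("expected:", expected_count(n_study, n_category, n_reference))
print("p-value :", enrichment_pvalue(k_observed, n_study, n_category, n_reference))
```

A category is called over-represented when the observed count clearly exceeds the expected count and the p-value falls below the chosen threshold.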

Link of the Proteins to Biological Pathways
To establish an overview of the interactions among differentially regulated proteins, and to prioritize proteins and pathways for further evaluation, we used the Pathway Studio software to explore the associations between differentially regulated proteins, based on the available knowledge about eukaryotic molecular interactions documented in the ResNet database. Notably, the central protein nodes of the biological network, generated by the Pathway Architect assembly of GO designations for the most prominently over-represented genes, were shared between oral leukoplakia tissues and tumor tissues. They included the S100 family (A7, A8, A9, A10, and A11), the HSP family (HSPB1 and HSPA8), the ANX family (A1, A3, A4, and A5), the tumor metastasis suppressor NME2, Rho GDP dissociation inhibitors alpha and beta (ARHGDIA and ARHGDIB), pyruvate kinase (PKM2), transgelin (TAGLN), glutathione S-transferase (GSTP1), SERPINA1, PCNA, and others (Figure 6A). These proteins and their interactive pathways merit further study for their roles in oral carcinogenesis.
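One simple way to see what a central node means in such a network is to count the distinct interaction partners of each protein. The Python sketch below is an illustration only: it does not reproduce the algorithm used by the pathway software, the function names are ours, and the edge list uses placeholder labels rather than interactions from the ResNet database.

```python
from collections import Counter

def node_degrees(edges):
    """Count distinct interaction partners for each node of an undirected network."""
    # Drop self-loops and treat (a, b) and (b, a) as the same interaction.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for pair in unique:
        for node in pair:
            degrees[node] += 1
    return degrees

def central_nodes(edges, top=3):
    """Return the `top` nodes with the most partners (ties broken alphabetically)."""
    ranked = sorted(node_degrees(edges).items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:top]]

# Hypothetical edge list with placeholder labels; NOT real protein interactions.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P2", "P3"), ("P3", "P1"), ("P5", "P4")]
print(central_nodes(edges, top=2))
```

Degree is only the simplest centrality measure; other measures may rank the same nodes differently.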

Biological Pathways related to PA28
Pathway analysis was also used to reveal the protein interactions and potential pathways of PA28. The results showed that some proteins that interact with PA28 either directly or indirectly were not identified by the proteomics approach owing to their low abundance (Figure 6B). It is generally accepted that PA28αβ contributes to class I presentation in immune tissues. Our results also reinforce the connection between PA28αβ and cellular immunity, as proteins involved in the MHC-I antigen presenting pathway, such as PA28, proteasome, HSP70 and HSP90, were found to be up-regulated (Table 1, Figure 6C).

Overexpression of PA28 in OSCC Cancer Cells and Cancer Tissues
Evaluation of PA28 expression in seven OSCC cell lines relative to the human immortalized oral keratinocytes (HOK16E6E7) and of PA28 expression in OSCC tumor tissues versus oral leukoplakia tissues was performed by real time RT-PCR and Western blotting. Consistent with observations from 2-DE, expression of PA28 was markedly increased at both the mRNA and protein levels in OSCC cells and tumor tissues compared with normal keratinocytes and oral leukoplakia tissues (Figure 7).

Discussion
Only a few studies have performed proteomic analysis of the malignant transformation of precancerous lesions with different stages of dysplasia into invasive cancer. We performed a comparative proteomic analysis to profile differentially expressed proteins in the transformation process. Using GO analysis, we further analyzed the biological processes and pathway networks of these proteins, which can generate new insight into the systems biology of carcinogenesis.
Oral squamous carcinoma, like esophageal adenocarcinoma, has been associated with the presence of precancerous lesions with different stages of dysplasia, thus providing a good model to elucidate every stage of carcinogenesis in more detail. In our study, each pair of precancerous and cancer tissues was from the same patient, which provides an opportunity to eliminate or at least reduce heterogeneity. Using the proteomic approach, we identified 85 differentially expressed gene products and found that some proteins are related to apoptosis, response to stimulus, metabolic regulation, and other processes. We can thus conclude that these proteins may play an important role in the malignant transformation process.
Figure 2. Representative two-dimensional maps marked altered proteins. Representative two-dimensional maps of a pair of OSCC cancer tissue and precancerous oral leukoplakia tissue from the same patient. The proteins were separated on a pH 3-10 nonlinear IPG strip, followed by a 12% SDS-polyacrylamide gel, as stated under Methods. The gel was Coomassie-blue stained and the spots were analyzed by ESI-Q-TOF-LC-MS/MS. Arrows indicate identified protein spots significantly and consistently altered between carcinoma tissue and control tissue. Eleven boxed proteins were selected as examples showing the consistent expression changes in enlarged form in additional file 1.
Table 1 footnotes. Data are represented as mean ± SD. h: frequency of the up-regulation (or down-regulation) of the 85 proteins identified in the six sample pairs. *: new proteins obtained by analysis of expression pattern through the web-based tool (GOTree).
Our most significant finding was that several members of the same protein families and homologs were identified in this transformation process, such as the peroxiredoxins (peroxiredoxin-3 and -4), the annexin family (A1, A3, A4, A5, A8), Rho GDP-dissociation inhibitors 1 and 2, the heat shock protein family (70 kDa protein 1, 71 kDa protein, and heat shock protein beta-1), the PA28 homologs (PA28 α and β), and the S100 protein family (A7, A8, A9, A10, A11, and A16). Among these, the annexins and the S100 proteins are two super-families of closely related calcium- and membrane-binding proteins whose relationship with carcinogenesis has been widely studied. They have a diverse range of cellular functions, including vesicle trafficking, cell division, apoptosis, calcium signaling and growth regulation.
Many studies have revealed the annexins to be among the genes whose expression is differentially altered in neoplasia. Some annexins showed increased expression in specific types of tumors, while others displayed loss of expression. In our study, the expression levels of annexin A1, A3, A4 and A5 were all decreased, while annexin A8 showed increased expression. Annexin A1 has been extensively studied in vitro and in vivo. The loss of annexin A1 expression in our study confirmed previous findings in head and neck squamous carcinomas [12]. Expression of annexin A3 has been studied in only a limited number of tumor types, with only one report regarding its expression in head and neck cancer [13]. For annexin A4, a few studies using a combination of proteomics tools reported increased expression in clear cell renal cancer and colorectal cancer [14,15]. A change in annexin A5, which is considered one of the signals on the surface of apoptotic cells and has been used as a probe for apoptosis, was also observed in our study [16]. Annexin A8 has been shown by a combination of gene expression microarrays and immunohistochemistry to be consistently over-expressed in acute promyelocytic leukaemia, breast cancer, and pancreatic cancer. The expression of annexin A4, A5, and A8 in head and neck cancer is reported in this study for the first time, and is consistent with the results of other studies concerning their expression in other cancers [17,18].

Figure 3. Results of PA28 β and α as the representative proteins identified using ESI-Q-TOF-MS/MS.
The S100 proteins are a multi-gene family of calcium-binding proteins comprising 20 known human members. There has been growing interest in the S100 protein family and its relationship with different cancers. While the precise role of S100 proteins in the development and promotion of cancer remains unclear, it is evident that the S100 proteins have a variety of intracellular and extracellular roles, and that disruption of any one of these functions may contribute to carcinogenesis. There is evidence that these proteins play a major role in tumor metastasis by interacting with a number of different proteins, including matrix metalloproteinases, cytoskeletal proteins, p53, Jab1, Cox-2 and BRCA1. In this study, we identified a series of members, including S100A7, A8, A9, A10, A11, and A16, with differential expression in OSCC tissues. S100A7 (psoriasin) was first characterized as being highly expressed in psoriatic keratinocytes [19]. There is accumulating evidence that S100A7 is up-regulated in bladder cancer, skin tumors and some invasive carcinomas, where its expression is associated with a poorer prognosis and reduced survival [20][21][22]. In contrast, other reports in OSCC showed that its expression was associated with a better prognosis, based on the finding that S100A7 is highly expressed in pre-invasive, well-differentiated and early-stage OSCC, whereas little or no expression was found in poorly differentiated, later-stage invasive tumors [23]. Other reports showed that S100A7 inhibits both OSCC cell proliferation in vitro and tumor growth/invasion in vivo [24]. These results were echoed by our study, in which S100A7 was found to be down-regulated in the transformation process from precancerous dysplasia to invasive cancer. Therefore, unlike in other tumors, our data suggest that S100A7 acts as a tumor suppressor in OSCC. Its detailed function should be further elucidated.
S100A8 and S100A9, which form a heterodimer complex, are up-regulated in many cancers, including gastric cancer, prostate cancer, colorectal cancer, and breast cancer, and have been implicated in the metastatic process [25][26][27]. In OSCC, one study reported more than a 10-fold over-expression of S100A8 in HPV18+ OSCC [28]. The function of S100A11 has been somewhat controversial. In bladder carcinoma and renal carcinoma, its expression is related to tumor suppression, and decreased expression of S100A11 has been associated with an increase in histopathological grade, poorer prognosis and decreased survival [29]. However, in prostate cancer and breast cancer it is thought to be a tumor promoter. Its increased expression in prostate cancers has been shown to be associated with advanced pathological stage [30]. There is only one report about the gene expression of S100A11 related to its diverse functions [31]. S100A16 protein, a new and unique member of the EF-hand Ca(2+)-binding proteins, was found to accumulate within nucleoli and to be translocated to the cytoplasm in response to Ca(2+) stimulation [32]. Here we report for the first time the expression of S100A16 protein in carcinogenesis from precancerous dysplasia to OSCC. It is possible that each S100 protein plays multiple roles in tumorigenesis and metastasis. This highlights the need for an improved understanding of the S100 family before S100 protein-targeted therapies can be designed.
Figure 4. Expression Profiling Diagram. The diagram was constructed with the use of the Ingenuity Pathway Analysis software as described in Materials and Methods and in Results. 18 novel proteins were shown to be expressed in head and neck for the first time.
Proteasomes are large complexes that carry out crucial roles in many cellular pathways by degrading proteins in the cytosol and nucleus of eukaryotic cells [33]. Proteasomes are activated by protein complexes that bind to the end rings of subunits. PA28 (also known as 11S or REG) has been shown to bind specifically to 20S proteasomes and to activate them against model peptide substrates [34]. The biological roles of PA28 are less well understood. There are three PA28 homologs, called α, β and γ. Although the PA28α and β subunits are expressed in many organs, they are particularly abundant in immune tissues and are virtually absent from the brain. By the late 1990s, PA28αβ was found to contribute to class I presentation, based on the high levels of PA28αβ in immune tissues, the IFN-γ induction of PA28αβ and many components of the class I pathway, and the direct production of some class I epitopes by PA28αβ-proteasome complexes [35,36]. Our results also reinforce the connection between PA28αβ and cellular immunity by showing that key proteins in the MHC I antigen presenting pathway, such as PA28, proteasome, HSP70 and HSP90, were up-regulated in OSCC tissues. An earlier study showed that PA28γ expression correlated with cell proliferation. Recently, some researchers have gained more insight into the role of PA28γ in apoptosis [37]. These findings were paralleled by studies suggesting that PA28γ functions in cell cycle progression and has an immune role [38]. Two-hybrid screens have also identified several proteins that interact with PA28γ. Interestingly, all these findings suggest that PA28γ is an anti-apoptotic factor. Less is known about how PA28γ may suppress apoptosis in oral carcinogenesis. In our validation study, all three homologs were included. The results confirmed the up-regulation of PA28 in carcinogenesis by comparison between several OSCC cell lines and oral keratinocytes.
In further studies, we will evaluate the immunostaining patterns of the PA28 αβ complex and PA28 γ in tissue samples at different stages, from normal and precancerous tissue to infiltrative OSCC. Moreover, the relationships of immunostaining with survival rate and recurrence will be analyzed.
Figure 5. Directed acyclic graph (DAG) view of the enriched GO categories in the transformation process from precancerous oral leukoplakia to OSCC.
Figure 6. The key proteins in the MHC antigen presenting pathway, among which PA28, proteasome, HSP70 and HSP90 were detected as up-regulated in our study. Each node represents either a protein entity or a control mechanism of the interaction. Connecting lines between the protein symbols indicate interactions; different types of interactions are denoted by symbols on the lines. Green square indicates regulation; purple square, binding; blue square, expression; orange circle, protein modification; red diamond, metabolism; green circle, promoter binding; yellow triangle, transport; "+" in gray circle, positive effect; and "-" in gray circle, negative effect.
Figure 7. Validation of PA28 in OSCC tissues and cells relative to control oral leukoplakia tissue and normal keratinocytes.

Conclusion
In summary, we have applied proteomic technologies to analyze the malignant transformation from precancerous oral leukoplakia to oral cancer in six patients, and we have identified 85 proteins with altered expression levels in OSCC, of which 52 were up-regulated. Previous characterizations of their functions, their possible interactions with other proteins and, in particular, the pathways involved were also evaluated. As a key factor in tumor metastasis, PA28 was chosen first for validation: the expression of the proteasome activator PA28 homologs and their interaction networks were examined in relation to oral malignancy. We have started further research on other potential biomarkers such as the peroxiredoxins, the annexin family and the S100 family. This is an example of a systems biology study, in which functional proteomics was used to help elucidate mechanistic aspects and the potential involvement of proteins of interest in biological pathways.

Tissue collection and sample preparation
Six pairs of tumors and oral leukoplakia tissues with dysplasia were obtained from six patients at West China Stomatological Hospital, Sichuan University. The specimens were examined histologically by hematoxylin and eosin (HE) staining, and the clinicopathologic stage was determined according to the TNM classification system of the International Union Against Cancer [39]. Patients who had received previous chemotherapy or radiation treatment were excluded. All tissue specimens were snap-frozen in liquid nitrogen for proteomic analysis. Hematoxylin-stained 5 μm frozen sections were reviewed by a board-certified pathologist (Y. Chen) for tumor cellularity (oral squamous carcinoma) or oral leukoplakia mucosa (moderate-to-high grade dysplasia). Informed consent was obtained from all patients or their relatives for the experimental use of their tissues. Medical records were reviewed and data were coded to protect patient confidentiality. The project was approved by the Scientific and Ethics Committee of Sichuan University.

Two-Dimensional Electrophoresis
Two-dimensional gel electrophoresis was performed essentially as previously described [40]. The protein concentration of the supernatants was determined using a Bio-Rad protein kit. All the paired samples were quantitatively analyzed as a group. Samples of 1 mg protein were applied on immobilized pH 3-10 nonlinear gradient strips in sample cups at their basic and acidic ends. Focusing started at 200 V, and the voltage was gradually increased to 8000 V at 4 V/min and kept constant for a further 3 h (approximately 150,000 Vh in total). The second-dimension separation was performed in 12% SDS-polyacrylamide gels. The gels (180 × 200 × 1.5 mm³) were run at 40 mA/gel. After protein fixation in 50% methanol containing 5% phosphoric acid for 2 h, the gels were stained with Coomassie Brilliant Blue R-250 (Merck, Germany) for 12 h and the protein spots were visualized. Each experiment was performed twice to ensure the accuracy of the analyses. The images were scanned using a Bio-Rad high quality white light GS-800 scanner (400-750 nm). The differentially expressed proteins were identified using the PD-Quest 2DE analysis software (Bio-Rad, USA). The quantity of each spot in a gel was normalized as a percentage of the total quantity of all spots in that gel and evaluated in terms of optical density (O.D.). Student's t-test was applied to compare the relative spot volumes between the two groups. Significant spots that changed consistently and showed at least a 2.0-fold difference (p < 0.05) were selected for tandem mass spectrometry (MS/MS) analysis.
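The spot selection rule described above (normalization to total spot quantity, a two-group t-test, and a fold-change threshold of 2.0) can be sketched in a few lines of Python. This is a minimal illustration and not the PD-Quest implementation: the function names are ours, the equal-variance two-sample form of Student's t-test is assumed, significance is judged against a tabulated two-sided critical value (2.228 for p = 0.05 with 10 degrees of freedom, i.e. six values per group) instead of an exact p-value, and the spot values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def normalize(spot_quantities):
    """Express each spot as a percentage of the total spot quantity in its gel."""
    total = sum(spot_quantities.values())
    return {spot: 100.0 * q / total for spot, q in spot_quantities.items()}

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance (assumed form)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

def is_candidate(tumor, control, t_critical, min_fold=2.0):
    """True if the spot changes at least min_fold in either direction and |t| exceeds t_critical."""
    fold = mean(tumor) / mean(control)
    changed = fold >= min_fold or fold <= 1 / min_fold
    return changed and abs(student_t(tumor, control)) > t_critical

T_CRIT_DF10 = 2.228  # two-sided, p = 0.05, 10 degrees of freedom (assumption)

# Hypothetical normalized volumes (%) of one spot in six tumor and six control gels.
tumor = [0.42, 0.39, 0.45, 0.40, 0.44, 0.41]
control = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13]
print(is_candidate(tumor, control, T_CRIT_DF10))
```

Because the tissues were collected in pairs from the same patients, a paired test would be a reasonable alternative; the unpaired form is used here only because the text describes a comparison between two groups.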

Protein identification by nano-HPLC-ESI-Q-TOF-MS/MS
The protein spots were excised manually and digested using sequencing grade trypsin (V511A, Promega). The protein samples were reduced, alkylated, and then digested with trypsin using standard protocols as previously described. The digests were analyzed using a nano-HPLC system coupled to a Q-TOF Premier mass spectrometer (Micromass, Manchester, UK) equipped with an electrospray ionization source. Spectra were accumulated until a satisfactory signal/noise ratio had been obtained. Only peaks with a charge of two or more, in the mass range from 400 to 1600 m/z, were considered for MS/MS. Ions exhibiting a detection intensity exceeding 10 counts/second were selected for acquisition of product ion spectra by collision-induced dissociation (CID). Trypsin autolysis products and keratin-derived precursor ions were automatically excluded. Three MS/MS ions were selected for each survey scan. All data used to extract peak information, which was used to create the MS/MS peak list, were generated from one combined spectrum. The tandem mass spectrometry (MS/MS) data, acquired as peak list (pkl) files by the ProteinLynx 2.2.5 software (Waters), included the mass values, intensities and charges of the precursor ions (parent ions with +2 or +3 charge in this study). The pkl files were analyzed using