Volume 14 Supplement 3
Revealing selection in cancer using the predicted functional impact of cancer mutations. Application to nomination of cancer drivers
© Reva; licensee BioMed Central Ltd. 2013
Published: 28 May 2013
Every malignant tumor has a unique spectrum of genomic alterations including numerous protein mutations. There are also hundreds of personal germline variants to be taken into account. The combinatorial diversity of potential cancer-driving events limits the applicability of statistical methods to determine tumor-specific "driver" alterations among an overwhelming majority of "passengers". An alternative approach to determining driver mutations is to assess the functional impact of mutations in a given tumor and predict drivers based on a numerical value of the mutation impact in a particular context of genomic alterations.
Recently, we introduced a functional impact score, which assesses the mutation impact by the value of entropic disordering of the evolutionary conservation patterns in proteins. The functional impact score separates disease-associated variants from benign polymorphisms with an accuracy of ~80%. Can the score be used to identify functionally important non-recurrent cancer-driver mutations? Assuming that cancer drivers are positively selected in tumor evolution, we investigated how the functional impact score correlates with key features of natural selection in cancer, such as the non-uniformity of the distribution of mutations, the frequency of affected tumor suppressors and oncogenes, and the frequency of concurrent alterations in regions of heterozygous deletions and copy gain; as a control, we used presumably non-selected silent mutations. Using mutations of six cancers studied in TCGA projects, we found that predicted high-scoring functional mutations, as well as truncating mutations, tend to be evolutionarily selected as compared to low-scoring and silent mutations. This result justifies predicting driver mutations from a shorter list of predicted high-scoring functional mutations, rather than from the "long tail" of all mutations.
Numerous somatic mutations are detected in thousands of genes in all cancers [1–13]. Mutations vary in their impact on a gene's function [14, 15] and in their contribution to cancer [16–18]. Every tumor has its own spectrum of ~10 to 10,000 protein-altering mutations. The challenge is to identify the mutations that provide a selective advantage to tumors ("drivers"). Knowing the driver mutations of an individual tumor, one can develop personalized approaches to treating cancer.
Driver mutations are commonly determined from distributions of mutations in a large group of tumor samples [1, 20–24]. It is assumed that many of the tumors are under similar selection pressure and that mutations fixed more frequently than expected from a given background mutation rate (e.g. recurrent mutations observed in many tumors and across many cancers) give a selective advantage to cancer. It is also assumed (although rarely articulated) that the number of cancer-causing combinations of driver mutations is limited, so that a large enough set of sequenced cancer genomes will represent all combinations of driver mutations in numbers sufficient for statistical conclusions.
However, massive sequencing of cancer genomes [1–13] has revealed an enormous diversity of genomic aberrations as well as the high diversity of background mutation rates within many types of common cancers [8, 9]. The huge diversity of genomic alterations and mutation rates obviously limits the predictive power of statistical approaches. Typically, genomic alterations in the top cancer genes found by statistics do not affect all tumors [1–7, 10–13]. Thus, statistical approaches leave two important questions without answers: First, are there more genes contributing to carcinogenesis in a given type of cancer? Second, what are the concrete driver mutations in a given tumor?
An alternative, personalized approach is to determine cancer drivers based on in-depth analysis of the impact a mutation may have on protein molecular function in the tumor-specific context of genomic alterations. Currently, the implementation of this approach as a primary method for determining drivers is limited by incompleteness of the present knowledge of gene function and gene-regulation networks, and insufficiency of the existing molecular modeling approaches. Typically, the assessment of the functional impact of mutations is used in the subsequent analysis of already found driver mutations [12, 13, 26–28]. However, more accurate predictions of driver mutations can be achieved by integration of the statistical and the functional approaches. Hence, new approaches have been recently reported [13, 29], which integrate functional predictions and mutation distribution statistics. However, the methodology of integration of statistical and functional information is not yet well established. In particular, the statistical model of  is not applicable for determining drivers in individual tumors; it is also unclear what is the actual power of the "functional mutation burden"  to predict driver mutations.
Recently, we introduced the functional impact score (FIS), which assesses the functional impact of a mutation by a value of entropic disordering of the evolutionary conservation patterns in protein families and subfamilies . The FIS function (implemented as a web-based service mutationassessor.org) was validated by assessing the accuracy of separation of known disease-associated variants from benign polymorphisms and by separation of known recurrent cancer mutations (drivers) from single mutations (passengers) [25, 31]. The original FIS function of the mutation assessor was also independently tested and integrated with other mutation scores in the CONDEL  and Oncodrive-FM  methods; the FIS function was recently implemented and rigorously tested in the "transFIC" approach to differentiate driver and passenger mutations .
However, the fact that the FIS of the mutation assessor (or of other approaches) differentiates preselected drivers from passengers does not automatically mean that it will not produce too many false positives when applied to the total sets of somatic mutations found in tumors. Therefore, before using the FIS to nominate driver mutations in a large set of somatic mutations, it is necessary to answer an important practical question: how does the value of the predicted functional impact correlate with the contribution of a given mutation to carcinogenesis? Assuming that cancer drivers are positively selected in tumor evolution, we propose and test a hypothesis: "high-scoring functional mutations tend to be selected in tumor evolution". Testing this hypothesis is interesting because the FIS represents the evolutionary conservation of residues; the value of the score can be interpreted simply as a measure of conservation. Testing this hypothesis is also practical because the impact score of the mutation assessor is used routinely to assess mutation impact in large-scale sequencing projects [3–6, 11, 12] and in newly developed combined approaches [29, 32, 33].
This hypothesis has several testable implications. If it is true, then the fraction of cancer genes (e.g. tumor suppressors and oncogenes) should increase among genes affected by functional mutations. Another general signature of selection, non-uniformity of distribution of mutations across genes, should also increase among functional mutations. Functional mutations should more frequently affect genes, which are likely under selection pressure, i.e. genes affected by truncating mutations or by copy number alterations.
Therefore, we tested the hypothesis by comparing distributions of silent, truncating and missense mutations categorized by the predicted functional impact . We investigated how the predicted functional impact correlates with the frequency of affected tumor suppressors and oncogenes, non-uniformity of distribution of mutations and frequency of concurrent genomic alterations. These tests are general and can be used in studying selection and nominating driver mutations using any scoring function.
All tests, conducted on ~120K missense mutations from six types of cancer studied by TCGA, showed that high-scoring functional mutations tend to be evolutionarily selected. These results justify nominating driver mutations based on the predicted functional impact score of the mutation assessor.
Results and discussion
Cancer-driver mutations are defined as those that give a selective advantage to cancer cells. Therefore, cancer-driver mutations are specifically selected in tumor evolution. Recurrent cancer mutations are easy to identify as evolutionarily selected. The distributions of the FIS for recurrent mutations and disease-associated variants are practically indistinguishable. Can the FIS be used to bring both recurrent and non-recurrent cancer-driver mutations to the top? To this end, one needs to show that non-recurrent high-scoring mutations are generally under stronger selection pressure than low-scoring or silent mutations. Below we present computational tests that reveal a stronger selection pressure for predicted high-scoring functional mutations.
Figures 1A and 1B present distributions of truncating (TM), silent (SM) and predicted functional missense mutations (FM) affecting tumor suppressors and oncogenes in colon cancer [4]. The lists of tumor suppressors (TS) and oncogenes (OG) are taken from the annotated lists of cancer genes (Additional File 1, Table S1; Additional File 2) [30, 34, 35].
Although the cancer gene list is incomplete, not specific to a given cancer and contains erroneous annotations, the distributions of truncating, silent and predicted functional mutations clearly demonstrate natural selection. First, one should note a striking difference between truncating mutations and silent mutations affecting tumor suppressors (Figure 1A) and oncogenes (Figure 1B). Truncating mutations affect tumor suppressors approximately three times more often than silent mutations, while they affect oncogenes with the same frequency as silent mutations. The difference in frequencies is caused by natural selection. Truncating mutations result in loss of function of certain tumor suppressors, which gives an advantage to the affected cancer cell. Therefore, truncating mutations in tumor suppressors are fixed in tumor evolution. However, truncating mutations in oncogenes are not generally advantageous to cancer cells and, hence, are not fixed in tumor evolution. The distributions of predicted functional missense mutations also show a clear tendency of high-scoring mutations to be evolutionarily selected in tumor suppressors and oncogenes as compared to low-scoring and silent mutations (Figure 1A-B). As the functional impact score (FIS) increases, the fraction of mutations affecting tumor suppressors and oncogenes increases and reaches its maximum at FIS ~3.0. At higher values of FIS, the total number of mutations becomes very low, which may affect the statistics.
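The comparison behind Figures 1A and 1B can be illustrated with a minimal Python sketch that bins missense mutations by FIS and reports the fraction of each bin that falls in a chosen gene set. The function name, the bin edges and the toy scores are assumptions made for illustration; they are not the data or code used in this study.

```python
def fraction_in_gene_set(mutations, gene_set, bin_edges):
    """Fraction of mutations falling in `gene_set`, per FIS bin.

    mutations: list of (gene_symbol, fis) pairs.
    bin_edges: ascending FIS boundaries; bins are half-open [low, high).
    Returns a list of (low, high, n_mutations, fraction) tuples.
    """
    rows = []
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        genes_in_bin = [gene for gene, fis in mutations if low <= fis < high]
        hits = sum(1 for gene in genes_in_bin if gene in gene_set)
        # Leave the fraction undefined for empty bins rather than report 0.
        fraction = hits / len(genes_in_bin) if genes_in_bin else None
        rows.append((low, high, len(genes_in_bin), fraction))
    return rows


# Toy input: the scores and the placeholder genes G1..G3 are invented.
toy_mutations = [("TP53", 3.1), ("APC", 2.7), ("G1", 0.4),
                 ("G2", 0.9), ("G3", 2.6), ("TP53", 0.5)]
toy_suppressors = {"TP53", "APC"}
toy_rows = fraction_in_gene_set(toy_mutations, toy_suppressors, [0.0, 2.0, 4.0])
for low, high, n, fraction in toy_rows:
    print(f"FIS [{low}, {high}): n={n}, fraction in set={fraction}")
```

Running the same function on silent and truncating mutations (with a single catch-all bin) would give the reference fractions against which the FIS bins are compared.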
Figure 1C presents distributions of silent, truncating and predicted functional mutations affecting all annotated cancer genes (Additional File 1, Table S1; Additional File 2) in several cancer types (TCGA). While the fraction of silent mutations affecting cancer genes stays about the same across all studied cancers, the fractions of truncating and predicted functional mutations vary significantly between cancers. Most remarkably, the fraction of affected cancer genes increases with the value of the functional impact score for all cancers, i.e. predicted functional mutations tend to be selected in cancer genes in different types of cancer.
However, the observed shift of the FIS distribution of mutations in cancer genes towards higher values can also be explained by the better evolutionary conservation of cancer genes. (Let us assume that cancer genes are conserved significantly better than non-cancer genes. Then uniformly (or randomly) distributed mutations in cancer genes will automatically get higher FIS values, and the fraction of cancer genes will be disproportionately high among high-scoring mutations. Under this assumption, the observed enrichment of high-scoring mutations in cancer genes (Figure 1) would simply reflect the better conservation of cancer genes rather than selection of specific mutations in cancer genes.)
Selection of mutations in tumor evolution results in non-uniformity of mutation distributions. The non-uniformity of mutation distributions is especially high in cancer genes, many of which are affected by recurrent mutations. Therefore, to assess the applicability of the FIS to predicting driver mutations, one needs to answer a key question: what is the correlation between the value of the FIS and the non-uniformity of the mutation distribution? This question is based on the following hypothesis: driver mutations are selected in special (and therefore better conserved) positions of cancer genes and score higher than passenger mutations. Then, the higher the score, the more likely the mutation is a driver, and the distribution of high-scoring mutations should reflect the main feature of selection - more mutations in fewer genes. The alternative hypothesis is that driver and passenger mutations in cancer genes score essentially equally. Then the FIS is not relevant for differentiating drivers from passengers. Thus, the question of which factor plays the major role in the increase of the fraction of high-scoring mutations in cancer genes - the better conservation of cancer genes in the evolution of species or the specific selection of driver mutations in tumor evolution - is actually superseded by other questions: does the non-uniformity of the mutation distribution increase with the value of the FIS, and does it increase for high-scoring mutations in cancer genes (many of which are under selection pressure) versus non-cancer genes (many of which are not under selection)?
As expected, the distributions of predicted functional mutations and truncating mutations are substantially non-uniform (μ~5-40), which distinguishes them drastically from the more uniform distributions of silent mutations (μ~1.4-1.9). The non-uniformity of the distributions increases with the value of the functional impact, showing an increase of selection pressure for predicted functional mutations (Figure 2A).
We also compared the non-uniformities of the distributions of predicted functional mutations affecting different groups of genes. The non-uniformities were computed for predicted functional mutations affecting all genes, annotated cancer genes, annotated tumor suppressors and oncogenes, and genes that have no cancer annotations. Figure 2B presents typical dependencies obtained for glioblastoma. Similar results are obtained for all studied cancers. The non-uniformity of distributions in cancer genes increases with the predicted functional impact, while the non-uniformity of the mutation distribution in non-cancer genes does not increase and even tends to decrease (the higher μ, the greater the non-uniformity). The non-uniformity μ reaches its maximal value at FIS ~3.0 and starts to decrease at higher FIS. This simply reflects the drastic decrease in the number of mutations and the number of affected genes at higher FIS.
In computing the non-uniformity of distributions, we did not take gene length into account. Differences in gene lengths can affect the computed values of non-uniformity, especially when the numbers of genes and mutations are small and the differences in gene lengths are large. Although, in the general case, the non-uniformity of a mutation distribution may depend on the spectrum of nucleotide substitutions and on cancer type, the main effect of gene length differences can be assessed by assuming that mutations are distributed proportionally to gene lengths. Thus, we determined the effective number of genes that would carry the majority of uniformly distributed mutations. The coding lengths of human genes were taken from the MAPBACK database. We found that the effective number of the longest genes, which cover the whole genome, is ~9,400, which gives a non-uniformity coefficient of ~2. Thus, the non-uniformity of an unbiased mutation distribution caused by differences in gene lengths is very close to the non-uniformity coefficients computed for the observed distributions of silent mutations across different cancers (~1.4-2).
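One way to reproduce this kind of length-only baseline is sketched below in Python. It assumes that each gene receives mutations in proportion to its coding length and that the effective number of genes is the inverse Simpson index of those proportions. Both the function name and the toy lengths are hypothetical, and the exact procedure used to obtain the ~9,400 figure may differ.

```python
def length_baseline(coding_lengths):
    """Effective gene number and non-uniformity expected from gene length alone.

    coding_lengths: list of coding lengths, one per gene.
    Assumes the expected share of mutations in a gene is length / total length.
    """
    total = float(sum(coding_lengths))
    if total <= 0:
        raise ValueError("coding lengths must sum to a positive value")
    # Simpson index of the length-proportional distribution.
    simpson = sum((length / total) ** 2 for length in coding_lengths)
    n_eff = 1.0 / simpson
    mu = len(coding_lengths) / n_eff
    return n_eff, mu


# Toy example: one long gene and three short ones (invented lengths).
toy_n_eff, toy_mu = length_baseline([3000, 1000, 1000, 1000])
print(f"effective genes = {toy_n_eff:.2f}, baseline mu = {toy_mu:.2f}")
```

With equal lengths the function returns an effective number equal to the gene count and a baseline of 1, so any value above 1 measures how much length differences alone skew the distribution.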
However, taking gene lengths into account is not necessary when comparing the characteristics of distributions of whole mutation classes (truncating, silent, missense) affecting the same large groups of genes (thousands of mutations and genes). The hallmark of selection can be seen in the significant increase of μ from 3.5 to 7.8 for predicted functional mutations affecting tumor suppressors at FIS~3.0; correspondingly, no selection is observed for mutations affecting non-cancer genes or for silent mutations. In fact, one can compare the non-uniformity of mutation distributions for different groups of genes: if the numbers of mutated genes in the gene groups are large enough (~100 or more), the effects of different gene lengths on the non-uniformity of distributions become insignificant because of averaging over large numbers of mutations affecting genes of different lengths. Accordingly, the non-uniformity coefficients μ are generally small (close to one) for silent mutations and large for truncating and predicted functional mutations selected in tumor suppressors, oncogenes and all cancer genes.
We report more details comparing the non-uniformity of mutation distributions in cancer genes and in non-cancer genes for high-scoring missense mutations, for all missense mutations, for the combination of high-scoring mutations and truncating mutations, and for truncating mutations taken alone (Additional File 1, Table S2). The main results of these tests can be summarized as follows: (i) the non-uniformity of distributions of high-scoring functional missense mutations in cancer genes is always higher than the non-uniformity of all missense mutations, both in cancer genes and in non-cancer genes; (ii) the non-uniformity of mutation distributions increases for the combination of missense mutations and truncating mutations; (iii) the non-uniformity of mutation distributions is highest for the combination of high-scoring missense mutations and truncating mutations in cancer genes. These results resolve the question of a bias of the FIS caused by potentially better conservation of cancer genes. Regardless of the potential shift of the FIS, the increase of the non-uniformity of distributions of high-scoring mutations in cancer genes indicates selection of these mutations in cancer genes.
Thus, the comparison of distributions of missense and predicted functional mutations in combination with truncating mutations, both in cancer genes and in non-cancer genes (Figures 1 and 2; Additional File 1, Table S2), demonstrates natural selection of predicted high-scoring functional mutations and truncating mutations in cancer genes. Based on this result, one can make recommendations for determining tumor-specific (personalized) drivers: nominate as likely drivers the high-scoring mutations in known cancer genes; nominate as possible drivers the high-scoring mutations in the remaining non-cancer genes.
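The two-tier recommendation can be expressed as a short Python sketch. The cutoff of 2.5 follows the FIS>2.5 threshold used in this paper for highly functional mutations, but treating it as the boundary for "high-scoring" here is an assumption, as are the function name and the toy scores.

```python
def nominate_drivers(mutations, cancer_genes, fis_cutoff=2.5):
    """Split high-scoring mutations into 'likely' and 'possible' drivers.

    mutations: list of (gene_symbol, fis) pairs for one tumor.
    cancer_genes: set of annotated cancer gene symbols.
    """
    likely, possible = [], []
    for gene, fis in mutations:
        if fis <= fis_cutoff:
            continue  # low-scoring mutations are not nominated
        if gene in cancer_genes:
            likely.append((gene, fis))
        else:
            possible.append((gene, fis))
    # Highest-scoring nominations first within each tier.
    likely.sort(key=lambda item: item[1], reverse=True)
    possible.sort(key=lambda item: item[1], reverse=True)
    return likely, possible


# Toy tumor: scores and the placeholder genes G1, G2 are invented.
toy_tumor = [("KRAS", 3.4), ("G1", 2.9), ("TP53", 1.1), ("G2", 0.3)]
likely, possible = nominate_drivers(toy_tumor, {"KRAS", "TP53"})
print("likely:", likely)
print("possible:", possible)
```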
Based on the distributions of Figure 3, one can estimate the total number of common driver genes for a given cancer. We propose to rank (cancer) genes by the total number of highly functional mutations (FIS>2.5 and truncating mutations) and to nominate the set of "effective genes" as the set of common drivers. This is motivated by the idea that highly functional mutations are selected during tumor evolution in a limited number of conserved positions in certain (cancer) genes. These genes are enriched in highly functional mutations and can be revealed by the increased non-uniformity of the distributions of highly functional mutations.
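A minimal Python sketch of this ranking is given below. It counts, per gene, the missense mutations with FIS>2.5 plus truncating mutations, and then keeps the top-ranked genes up to the effective number of genes. Taking the effective number as the rounded inverse Simpson index of those counts is an assumption of this sketch, and the function name and toy records are hypothetical.

```python
from collections import Counter


def effective_gene_set(mutations, fis_cutoff=2.5):
    """Rank genes by highly functional mutations and keep the 'effective' ones.

    mutations: list of (gene_symbol, fis_or_None, is_truncating) records;
    truncating mutations carry no FIS and are always counted.
    """
    counts = Counter(
        gene for gene, fis, is_truncating in mutations
        if is_truncating or (fis is not None and fis > fis_cutoff)
    )
    total = sum(counts.values())
    if total == 0:
        return []
    simpson = sum((n / total) ** 2 for n in counts.values())
    # Assumed rule: effective number of genes = rounded inverse Simpson index.
    n_eff = max(1, round(1.0 / simpson))
    return [gene for gene, _ in counts.most_common(n_eff)]


# Toy cohort with placeholder genes A..D (invented counts and scores).
toy_cohort = ([("A", 3.0, False)] * 4 + [("A", None, True)]
              + [("B", 2.8, False)] * 4
              + [("C", 2.6, False)]
              + [("D", 0.5, False)])
top_genes = effective_gene_set(toy_cohort)
print(top_genes)
```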
It is difficult to make accurate comparisons between cancers, because the overall diversity of the observed mutation spectrum depends on the number of samples and the stage of cancer, but one can notice that the number of effective genes representing the mutation spectrum for ovarian, colon and brain cancer is smaller than the numbers of effective genes for kidney, breast and especially lung cancer. Generally, the smaller the number of effective genes, the stronger the selection. Even so, the effective number of genes that are likely under selection pressure is estimated as ~200 for ovarian cancer and ~350-400 for brain and colon cancers. The large numbers of genes affected by predicted evolutionarily selected mutations highlight the diversity of cancer drivers and suggest that a typical tumor has many drivers rather than a few. The large numbers of effective genes have to be compared with the total number of mutated genes; the resulting reduction in the number of potential driver genes is ~10-30-fold. (For more accurate and comprehensive nomination of driver genes, it is necessary to take into account the statistics of gene copy number alterations and gene expression, which is beyond the scope of this study.)
Table 1. Percentage of silent (SM) and truncating mutations (TM) affecting genes with different copy number alterations. (Table body not recovered; column group: gene copy number alterations.)
Table 2. Percentage of silent, truncating and functional mutations affecting genes with one copy loss. (Table body not recovered; column groups: all missense mutations; missense mutations selected by FIS.)
The data of Figure 5 confirm this expectation. In all studied cancers, the fraction of tumor suppressor genes affected by both truncating mutations and predicted functional mutations increases at higher values of FIS (Figure 5A). This tendency is general and is observed for all genes, but the strongest concurrency between predicted functional mutations and truncating mutations is observed for tumor suppressors. The difference in concurrency of predicted functional mutations and truncating mutations between gene groups is well displayed in the mutations of lung cancer (Figure 5B). For the total counts of missense mutations, all gene groups have approximately the same percentage of genes affected by truncating mutations. However, among the genes affected by predicted functional mutations, the annotated cancer genes and tumor suppressors are more frequently affected by truncating mutations than the group of "non-cancer genes"; in contrast, the annotated oncogenes affected by predicted functional mutations are less frequently affected by truncating mutations. These differences demonstrate natural selection of functional mutations in those groups of genes.
Another group of genes that are likely under selection pressure are genes affected by copy number alterations. Table 1 presents statistics of silent and truncating mutations affecting genes with discretized copy number alterations. Silent mutations, with no impact on a gene's function (and no selection), are distributed fairly uniformly across genes affected by copy number alterations. Truncating mutations affect protein function; driven by selection, they are distributed significantly differently from silent mutations: over-represented in regions of heterozygous deletions in all studied cancers and under-represented in regions of copy gain (although only in two of the six studied cancers).
The percentages of truncating mutations affecting genes with copy loss can be used as a reference for comparing the distribution of predicted functional mutations (Table 2).
As expected, predicted functional mutations tend to be selected in genes with copy loss more frequently than silent or low-scoring mutations. Predicted high-scoring functional missense mutations tend to be selected in genes with one copy loss practically as frequently as truncating mutations.
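The kind of tabulation summarized in Tables 1 and 2 can be sketched in Python as follows. The sketch assumes that copy number calls are discretized per sample and gene and that a one-copy loss is encoded as -1. That encoding, the function name and the toy records are assumptions for illustration only.

```python
from collections import defaultdict


def percent_in_copy_loss(mutations, copy_number, loss_value=-1):
    """Percentage of mutations, per class, that hit a gene with one copy lost.

    mutations: list of (sample_id, gene_symbol, mutation_class) records.
    copy_number: dict mapping (sample_id, gene_symbol) to a discretized state.
    """
    totals = defaultdict(int)
    in_loss = defaultdict(int)
    for sample, gene, mutation_class in mutations:
        state = copy_number.get((sample, gene))
        if state is None:
            continue  # skip mutations without a copy number call
        totals[mutation_class] += 1
        if state == loss_value:
            in_loss[mutation_class] += 1
    return {cls: 100.0 * in_loss[cls] / totals[cls] for cls in totals}


# Toy data with placeholder samples s1, s2 and genes G1..G3 (all invented).
toy_mutations = [("s1", "G1", "silent"), ("s1", "G2", "silent"),
                 ("s2", "G1", "silent"), ("s2", "G3", "silent"),
                 ("s1", "G3", "truncating"), ("s2", "G2", "truncating")]
toy_copy_number = {("s1", "G1"): 0, ("s1", "G2"): -1, ("s2", "G1"): 0,
                   ("s2", "G3"): 1, ("s1", "G3"): -1, ("s2", "G2"): -1}
toy_percent = percent_in_copy_loss(toy_mutations, toy_copy_number)
print(toy_percent)
```

Missense mutations could be passed in with a class label per FIS bin, so that each bin is compared against the silent and truncating references.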
The functional impact score
The details of the derivation of the functional impact score of the MutationAssessor are given in . Here we simply review the assumptions used in the derivation. The estimate of the functional impact of a mutation in a given protein sequence is derived from a multiple alignment of homologous sequences under two assumptions: 1) a multiple alignment of protein family sequences is treated as a statistical ensemble at equilibrium; 2) a distribution of residues in any aligned position of a protein alignment is treated independently of other positions in the alignment. In other words, it is assumed that all possible mutations were tried in evolution in each sequence position so that the observed distributions of residues in aligned positions of homologous sequences reflect all possible constraints imposed on these residues. Thus, critically important residues are conserved in the setting of diverse sequence homologs, while evolutionarily unfavorable residues are not observed or observed less frequently than neutral or important residues. In addition to protein family conservation, we use conservation within protein subfamilies, which are derived from clustering multiple sequence alignments . The clustering algorithm groups the sequences of a protein family alignment into distinct subfamilies, so as to minimize the sequence diversity within subfamilies and to maximize the overall difference between subfamilies at a select number of "specificity" positions . Evolutionary constraints are inferred from the patterns of residue conservation in the computed protein subfamilies.
Here α and β are residue types (α, β = 1,...,21, indexing the 20 residue types and alignment gaps); $n_i(\alpha)$ and $n_i(\beta)$ are, respectively, the numbers of residues of types α and β in alignment column i; the index p refers to the particular subfamily to which the mutated sequence is assigned as the result of clustering; and $n_i^{p}(\alpha)$ and $n_i^{p}(\beta)$ are, respectively, the numbers of residues of types α and β in sequence position i of subfamily p.
The two terms of Eq.1 are complementary measures of evolutionary conservation; therefore, a combination of these scores provides more information about the potential functional impact of a mutation.
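For orientation, a form of the score consistent with this description can be written as below. It is a reconstruction under stated assumptions, not a quotation of Eq. 1: the count symbols $n_i(\cdot)$ and $n_i^{p}(\cdot)$, the +1 added to the count of the mutant residue, and the equal weighting of the two terms should all be checked against the original derivation.

```latex
\mathrm{FIS}_i(\alpha \to \beta) \;=\; \frac{1}{2}\left[
  -\ln\frac{n_i(\beta) + 1}{n_i(\alpha)}
  \;-\; \ln\frac{n_i^{p}(\beta) + 1}{n_i^{p}(\alpha)}
\right]
```

Here the first term measures conservation across the whole family alignment and the second measures conservation within subfamily p. As a hypothetical worked case, if the wild-type residue occupies 90 sequences of a column and 30 sequences of its subfamily, and the mutant residue is never observed, the two terms are ln 90 ≈ 4.50 and ln 30 ≈ 3.40, giving a score of about 3.95 under these assumptions.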
The statistical measure of the non-uniformity of distributions
Any selection process results in non-uniformity of distributions. Therefore we compared silent, truncating and missense mutations by the non-uniformity of distributions of these mutations across genes. We compared separately the non-uniformity of distributions within different groups of genes such as tumor suppressors (~850), oncogenes (~150), annotated cancer genes (~3,700), and remaining non-cancer genes. We tested a hypothesis that the non-uniformity of distributions increases with the value of the functional impact.
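The non-uniformity of a distribution of mutations across genes is measured by the coefficient

$$\mu = \frac{N}{N_{\mathrm{eff}}}, \qquad N_{\mathrm{eff}} = \frac{1}{D}, \qquad D = \sum_{i=1}^{N} p_i^{2},$$

where $N$ is the number of genes considered, $p_i$ is the fraction of all mutations that fall in gene $i$, $N_{\mathrm{eff}}$ is the effective number of genes, and $D$ is the Simpson diversity index.
When all genes are mutated in roughly equal proportion, $D \approx 1/N$; the effective number of genes is then close to the actual number of genes, $N_{\mathrm{eff}} \approx N$, and the non-uniformity $\mu \approx 1$.
However, when mutations of only one or a few genes represent the overwhelming majority of all mutations, the distribution of mutations across genes is extremely non-uniform and the diversity index $D \approx 1$. Then the effective number of genes $N_{\mathrm{eff}} \approx 1$, and $\mu \approx N$ becomes a large number when the total number of genes in a dataset is large.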
Thus, the non-uniformity coefficient μ can be used as a measure of selection of mutations in cancer; μ is close to one, when there is no selection or selection is weak and μ is larger, when mutations undergo selection pressure.
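As a minimal sketch of this measure, the Python function below computes the Simpson index from per-gene mutation counts and derives an effective number of genes and a non-uniformity coefficient. It assumes the conventions D = Σ p_i², N_eff = 1/D and μ = N/N_eff, which match the two limiting cases described above, and it takes N to be the number of genes with at least one mutation. The function name and the toy inputs are illustrative only.

```python
from collections import Counter


def non_uniformity(gene_per_mutation):
    """Return (simpson_index, effective_genes, mu) for a list of mutated genes.

    gene_per_mutation: list holding one gene symbol per observed mutation.
    """
    counts = Counter(gene_per_mutation)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mutations given")
    # Simpson index: chance that two mutations drawn with replacement
    # fall in the same gene.
    simpson = sum((n / total) ** 2 for n in counts.values())
    effective_genes = 1.0 / simpson
    # Assumed convention: N is the number of genes with >= 1 mutation.
    mu = len(counts) / effective_genes
    return simpson, effective_genes, mu


# Toy distributions over four placeholder genes.
uniform_case = ["A", "B", "C", "D"]
skewed_case = ["A"] * 97 + ["B", "C", "D"]
_, _, mu_uniform = non_uniformity(uniform_case)
_, _, mu_skewed = non_uniformity(skewed_case)
print(f"uniform mu = {mu_uniform:.2f}, skewed mu = {mu_skewed:.2f}")
```

The uniform toy case gives μ equal to 1, while concentrating most mutations in one gene pushes μ toward the number of genes, in line with the interpretation of μ as a measure of selection.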
Cancer gene lists
The cancer gene list used in this study is a combination of three lists: the web-based resource CancerGenes, which combines gene lists annotated by experts with information from key public databases; the cancer genes of the Sanger Institute; and a list of frequently mutated genes with predicted functional mutations derived from the COSMIC database [25, 31]. Additional File 1 (Table S1) provides summary statistics for the lists, and Additional File 2 presents the actual genes with their basic cancer annotations.
The main task in the analysis of somatic mutations in cancer is determining the driver mutations that provide a selective advantage to cancer cells. The recurrence of driver mutations is a signature of selection. Recurrent driver mutations can be differentiated from benign passengers by the predicted functional impact [30]. In this work, we showed that the predicted functional impact can be generally applied to identify drivers, by revealing trends of evolutionary selection of predicted functional mutations in systematic tests conducted on ~120K missense mutations of six different cancers. We found an important correlation between the value of the predicted functional impact and selection: higher predicted functional impact correlates with stronger selection trends. Hence, we conclude that the functional impact score can be used for the prediction of driver mutations and genes.
The functional impact score used in this work represents the evolutionary conservation of residues in protein sequences. Greater values of the score correspond to higher evolutionary conservation. Thus, the conducted tests showed that mutations affecting evolutionarily conserved residues tend to be selected in tumor evolution. In other words, rapidly unfolding tumor evolution selects mutations affecting protein residues conserved over millions of years of natural history. This means that the main reservoirs of functional diversity in proteins are the residues that are selected and conserved in molecular evolution.
In this study, we showed that predicted functional mutations (potential drivers) are selected in annotated cancer genes. This underscores the practical usefulness of cancer gene lists. With more cancer genome sequencing, a general list of cancer genes as well as specific cancer gene lists are likely to be very useful in the practice of personalized cancer treatment.
We interpreted as a trend of selection the fact that predicted functional mutations are concurrently selected in genes affected by truncating mutations and by copy number losses. This fact emphasizes the diversity of genomic alterations in cancer. Thus, accurate prediction of cancer driver mutations can be done only in the context of all genomic alterations, possibly by utilizing an integrated profile of functional genomic alterations in which predicted functional missense mutations are taken into account together with truncating mutations and gene copy number alterations.
The author is grateful to Alexei Finkelstein, Chris Sander and Niki Schultz for constructive discussions, to Will Lee and Robert Fieldhouse for careful reading and useful remarks. This work was supported by NIH grant R01 CA132744-02.
The publication costs for this article were funded by NIH grant R01 CA132744-02.
This article has been published as part of BMC Genomics Volume 14 Supplement 3, 2013: SNP-SIG 2012: Identification and annotation of SNPs in the context of structure, function, and disease. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/14/S3
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.