WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation
© Capriotti et al.; licensee BioMed Central Ltd. 2013
Published: 28 May 2013
SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVMs). For a given protein, its input comprises the sequence and/or the three-dimensional structure (when available), a set of target variations, and the protein's functional Gene Ontology (GO) terms. For each protein variation, the server outputs the probability that the variation is associated with human disease.
The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively.
WS-SNPs&GO is a valuable tool that integrates, in a single framework, information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go.
In the last few years the cost of high-throughput sequencing experiments has rapidly decreased; however, the analysis and interpretation of sequencing data are still challenging issues in Molecular Biology and Bioinformatics. During the last 10 years of experiments, over 3,000 billion nucleotides from human genomes have been released. This information, in conjunction with new data from the HapMap Consortium and the Human Variome Project, has made it possible to identify tens of millions of Single Nucleotide Polymorphisms (SNPs) as the main cause of variability between individuals. Currently dbSNP, which is the most comprehensive database of genetic variations, collects ~51.8 million SNPs. Depending on the region where they occur, SNPs can affect gene expression and function through different mechanisms. Although recently published data from the ENCODE Consortium made it possible to assign a biochemical function to 80% of non-coding regions, the non-synonymous variations in the coding regions still represent the largest component of genetic variants annotated as "pathogenic" in the dbSNP database. For this reason, the annotation of non-synonymous SNPs (nsSNPs) is important to understand the relationships between variations and diseases.
Curators of the dbSNP and SwissVar databases are collecting information to annotate the impact of SNPs on human health. The process requires expensive and time-consuming functional experiments and clinical trials. Thus, during the last few years, several methods have been developed to predict the impact of nsSNPs on protein stability [10–16] and protein function [17, 18], and to detect their pathologic effect [19–29]. A more comprehensive list of these tools is available in a recent review. In particular, the class of algorithms capable of discriminating between disease-related and neutral SNPs can be used extensively for personal genome interpretation and personalized medicine. These methods are mainly based on statistical and/or machine learning approaches that take as input information from protein sequence [18, 19, 23, 24, 26, 28, 29], structure [20, 25] and function [19, 21, 27].
In general, all the predictors rely on evolutionary information that is extracted using different procedures. For example, the SIFT algorithm exploits the information contained in sequence alignments to calculate the probability that a mutation of a residue at a given sequence position is tolerated: if a position in an alignment contains only hydrophobic residues, then SIFT assumes that the position can only contain hydrophobic residues. PhD-SNP is a machine learning approach that takes as input the frequencies of wild-type and mutant residues from a sequence profile calculated with the BLAST algorithm. PolyPhen evaluates a substitution score by calculating the Position Specific Independent Count (PSIC) matrix. More sophisticated methods such as MutPred and SNPs&GO also include outputs of other predictors and/or a functional score calculated using the Gene Ontology. In this paper we present the web server implementation of the SNPs&GO algorithms (based on sequence or structure inputs). The server is freely available to the whole scientific community.
Dataset and benchmarking
Composition of the datasets
Disease related variations
Measures of performance
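In this work the efficiency of our predictors has been scored using the following statistical indexes.
The overall accuracy is:
$$Q_2 = \frac{p}{T} \qquad (1)$$
where p is the total number of correctly predicted variations and T is the total number of variations.
The Matthews correlation coefficient C is defined as:
$$C(s) = \frac{p(s)\,n(s) - u(s)\,o(s)}{W} \qquad (2)$$
where W is the normalization factor:
$$W = \sqrt{[p(s)+u(s)]\,[p(s)+o(s)]\,[n(s)+u(s)]\,[n(s)+o(s)]} \qquad (3)$$
for each class s (D and N, disease-related and neutral variations respectively); p(s) and n(s) are the total number of correct predictions and correctly rejected assignments, respectively. u(s) and o(s) are the numbers of false negatives and false positives for the s class.
The coverage S (sensitivity) for each class s is evaluated as:
$$S(s) = \frac{p(s)}{p(s)+u(s)} \qquad (4)$$
where p(s) and u(s) are the same as in Equation 3.
The probability of correct predictions P (positive predictive value) is computed as:
$$P(s) = \frac{p(s)}{p(s)+o(s)} \qquad (5)$$
where p(s) and o(s) are the same as in Equation 3; P(s) ranges from 0 to 1.
The reliability index (RI) of a prediction is computed as:
$$RI = 20 \times |O(D) - 0.5| \qquad (6)$$
where O(D) is the probability associated with the disease-related class (D). O(D) is the output of the method (ranging from 0 to 1) returned when the LIBSVM tool is executed using the probability estimation option.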
Finally, the area under the ROC curve (AUC) is calculated by plotting the True Positive Rate (TPR = S(D)) as a function of the False Positive Rate (FPR = 1-S(N)) at different prediction thresholds.
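As a minimal sketch of how such indexes are typically computed, the Python below derives the overall accuracy, the Matthews correlation coefficient, the sensitivity and the positive predictive value from the four per-class counts p, n, u and o, and estimates the AUC by trapezoidal integration of the ROC points. It uses the standard textbook forms of these indexes; the function names and the tie-handling in the ROC curve are illustrative assumptions, not the server's actual code.

```python
import math


def class_metrics(p, n, u, o):
    """Per-class indexes for one class s.

    p: correct predictions, n: correctly rejected assignments,
    u: false negatives, o: false positives.
    """
    w = math.sqrt((p + u) * (p + o) * (n + u) * (n + o))
    return {
        "C": (p * n - u * o) / w if w else 0.0,   # Matthews correlation
        "S": p / (p + u) if (p + u) else 0.0,     # sensitivity (coverage)
        "P": p / (p + o) if (p + o) else 0.0,     # positive predictive value
    }


def overall_accuracy(p, n, u, o):
    """Fraction of all variations that are correctly predicted."""
    return (p + n) / (p + n + u + o)


def roc_auc(scores, labels):
    """AUC from disease probabilities (scores) and true labels (1 = disease)."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    points = [(0.0, 0.0)]  # (FPR, TPR) pairs, one per distinct threshold
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all predictions sharing the same score together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    # Trapezoidal rule over consecutive ROC points.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

For the disease class D, the counts of the neutral class N follow by symmetry (p(N) = n(D), u(N) = o(D), and so on), so the same function can be called once per class.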
The SNPs&GO algorithms predict the impact of protein variations using functional information codified by Gene Ontology (GO) terms of the three main roots: Molecular Function, Biological Process and Cellular Component. Here we introduce a web server implementation of the previous methods, relying on either protein sequence or protein structure information, namely SNPs&GO and SNPs&GO3d, respectively. With respect to the previous version, the new SNPs&GO has slightly different input features representing the PANTHER output and the functional information. Compared with the standard sequence-based algorithm, in the recently developed SNPs&GO3d the sequence features are replaced with structure-based features, including the structural environment and the solvent exposure of the mutated residue.
The SNPs&GO algorithm (Figure 1 panel A) takes as input only protein sequence information. For each given sequence, the algorithm automatically generates the input profile by calculating the pairwise alignments with the BLAST algorithm. The sequence profile is calculated by performing one run of BLAST against the UniRef90 dataset (ftp://ftp.ebi.ac.uk/pub/databases/uniprot/uniref/) to select homologous sequences with an E-value lower than 10^-9. Besides the features derived from the sequence profile, the SVM input vector also includes the sequence environment of the variation and a log-odds score calculated considering all the Gene Ontology terms associated with the mutated protein and their parents in the GO graph. The SNPs&GO3d algorithm (Figure 1 panel B) takes as input structural information and generates an SVM input vector in which the sequence environment used in SNPs&GO is replaced by the structural environment and the Relative Solvent Accessible Area (RSA) of the wild-type residue. To summarize, the sequence-based algorithm calculates for each mutation a 51-element feature vector including: i) the mutation (20 values, all set to 0 except for the positions corresponding to the mutated and wild-type residues, which are set to 1 and -1, respectively); ii) the sequence environment (20 values corresponding to the frequencies of the different residues in a 19-residue window); iii) the sequence profile (5 values corresponding to the elements of the profile related to the mutated and wild-type residues, the number of aligned sequences observed at the mutated position and in the whole alignment, and the conservation index of the mutated position); iv) four features from the output of the PANTHER algorithm, encoding the probability of deleterious variation, the frequencies of the wild-type and mutant residues, and the number of independent counts; v) the functional annotation score (2 values: the GO log-odds score and the number of GO terms used).
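The layout of the 51-element vector can be illustrated with a short Python sketch. Only the two blocks that depend on the sequence alone (the mutation encoding and the 19-residue sequence environment) are computed here; the profile, PANTHER and GO blocks are passed in as precomputed values. The residue ordering, the inclusion of the central residue in the window, the truncation of the window at the sequence termini and all function names are assumptions made for illustration, not details confirmed by the text.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 residue types
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def mutation_vector(wild_type, mutant):
    """20 values: -1 at the wild-type residue, +1 at the mutant, 0 elsewhere."""
    vec = [0.0] * 20
    vec[INDEX[wild_type]] = -1.0
    vec[INDEX[mutant]] = 1.0
    return vec


def sequence_environment(sequence, position, window=19):
    """Residue frequencies in a window centred on the 1-based position.

    Assumption: the window is simply truncated near the sequence termini.
    """
    half = window // 2
    centre = position - 1
    segment = sequence[max(0, centre - half):centre + half + 1]
    counts = [0.0] * 20
    for aa in segment:
        if aa in INDEX:  # skip non-standard residue symbols
            counts[INDEX[aa]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def sequence_feature_vector(sequence, position, mutant, profile5, panther4, go2):
    """Concatenate the five blocks: 20 + 20 + 5 + 4 + 2 = 51 values."""
    wild_type = sequence[position - 1]
    vec = (mutation_vector(wild_type, mutant)
           + sequence_environment(sequence, position)
           + list(profile5) + list(panther4) + list(go2))
    assert len(vec) == 51
    return vec
```

A call such as `sequence_feature_vector(seq, 10, "W", profile5, panther4, go2)` would then describe the substitution of residue 10 of `seq` with tryptophan.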
When the structure of a given protein is available, it is possible to run SNPs&GO3d. In this case, the server calculates a 52-element feature vector in which the 20-element vector encoding the sequence environment is replaced by a 20-element vector encoding the structural environment. The structural environment is computed considering the residue composition within a sphere of 6 Å radius around the Cα (alpha carbon) of the wild-type residue. One further element is added to encode the RSA as derived from the DSSP program. The remaining 31 input features are computed as described above for the sequence-based algorithm. With respect to a previous implementation (SNPs&GO ), the input vector of the sequence-based method differs in that the bit indicating the presence or absence of the GO terms (see ) is now replaced by an integer value counting the number of GO terms used to compute the GO score (introduced already in ). When the PANTHER algorithm is not able to return an output, an arbitrary input vector is used, assigning a probability of 0.5 for deleterious variations and 0 to the three remaining PANTHER features. In keeping with this choice, in the latest version of SNPs&GO we removed from the previous 5-element feature vector the element indicating the presence of PANTHER output.
Depending on the information available to the user, SNPs&GO and/or SNPs&GO3d can be activated. The server is endowed with two alternative input pages that are linked to the WS-SNPs&GO home page.
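The structural environment and the PANTHER fallback described above can be sketched as follows. The text does not state which atoms of the neighbouring residues are tested against the 6 Å sphere, whether the central residue is counted, or whether the composition is normalised; the sketch assumes Cα-to-Cα distances, excludes the central residue and returns raw counts. All names are hypothetical.

```python
import math

RESIDUES = ["ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
            "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"]
RES_INDEX = {name: i for i, name in enumerate(RESIDUES)}


def structural_environment(ca_atoms, target, radius=6.0):
    """Count residue types whose C-alpha lies within `radius` of the target.

    ca_atoms: list of (residue_name, (x, y, z)) tuples, one per residue.
    target:   index of the wild-type residue in that list.
    """
    centre = ca_atoms[target][1]
    counts = [0] * 20
    for i, (name, xyz) in enumerate(ca_atoms):
        if i == target or name not in RES_INDEX:
            continue  # assumption: the central residue itself is not counted
        if math.dist(centre, xyz) <= radius:
            counts[RES_INDEX[name]] += 1
    return counts


def panther_features(panther_output=None):
    """Four PANTHER features, or the fallback (0.5, 0, 0, 0) when no output exists."""
    if panther_output is None:
        return [0.5, 0.0, 0.0, 0.0]
    return list(panther_output)


def structure_feature_vector(mutation20, environment20, rsa, profile5, panther4, go2):
    """Concatenate the blocks: 20 + 20 + 1 + 5 + 4 + 2 = 52 values."""
    vec = (list(mutation20) + list(environment20) + [rsa]
           + list(profile5) + list(panther4) + list(go2))
    assert len(vec) == 52
    return vec
```

In practice the coordinates and the RSA would come from a parsed PDB file and from DSSP, respectively; here they are plain arguments so that the sketch stays self-contained.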
SNPs&GO input. The standard SNPs&GO server needs as input the protein sequence, its variations and the functional annotations (see Figure 1 panel A). The sequence can be provided in three different ways: i) by pasting the protein sequence in FASTA or raw format into the appropriate text box; ii) by uploading a file from the local machine; iii) by typing the SwissProt code. When the SwissProt code of the protein is provided, the server automatically assigns the associated GO terms of all three sub-ontologies (Biological Process, Cellular Component and Molecular Function) as defined in the Gene Ontology. Alternatively, the protein functional annotation can be provided using the appropriate input box. In this case the server automatically runs the GO-TermFinder program to retrieve all the GO-term ancestors. When functional information is not provided, the method assigns zeros to the two-element vector encoding the protein function.
SNPs&GO3d input. The SNPs&GO3d interface (see Figure 1 panel B) is slightly different because in this case the server requires structural information. The input consists of: i) the PDB code (or a PDB file) of the mutated protein and the relevant chain; ii) the list of mutations; iii) the protein GO terms. In this case too, when the SwissProt code of the mutated protein is provided, the server automatically assigns all the annotation terms. More details about the input features are described in a previous work.
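The expansion of user-supplied GO terms to their ancestors, and the resulting two-element functional vector, can be illustrated with a toy graph. The sketch below is not GO-TermFinder: it walks a hypothetical child-to-parents dictionary, and it assumes that the GO score is the sum of per-term log-odds values taken from a precomputed table, which the text does not spell out. The term identifiers are placeholders.

```python
def with_ancestors(terms, parents):
    """Return the given terms plus every ancestor reachable in the graph.

    parents: dict mapping a term to the list of its direct parents.
    """
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.add(term)
        stack.extend(parents.get(term, ()))
    return seen


def go_features(terms, parents, log_odds):
    """Two values: the GO score and the number of GO terms used.

    Assumption: the score is the sum of the log-odds of every expanded
    term that has an entry in `log_odds`.
    """
    if not terms:
        return [0.0, 0]  # no functional information: zeros
    used = [t for t in with_ancestors(terms, parents) if t in log_odds]
    return [sum(log_odds[t] for t in used), len(used)]


# Toy example with placeholder identifiers (not real GO terms).
toy_parents = {"term3": ["term2"], "term2": ["term1"], "term4": ["term1"]}
toy_log_odds = {"term1": 0.1, "term2": 0.5, "term3": -0.2}
score, n_terms = go_features(["term3"], toy_parents, toy_log_odds)
```

Here `term3` is expanded to `term3`, `term2` and `term1`, so three terms contribute to the score.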
The server has been designed to return the prediction output on the fly, either through a link to a web page that is refreshed approximately every 20 seconds or by e-mail. The outputs of SNPs&GO and SNPs&GO3d are similar. The HTML output page provides links to the sequence or structure given as input, to the results of the BLAST search visualized with MView, to the file with all the GO terms associated with the mutated protein, and to the output in text format. In the second part of the output, the protein sequence is displayed and a table including all the mutations and their predictions is reported. In detail, the table is composed of 5 columns: the mutated residue, the prediction (either Disease or Neutral), the reliability index (RI), the probability associated with the disease-related class, and information about the prediction method. If the probability of the disease-related class is larger than 0.5, the variation is predicted as disease-related. In addition, clicking on a variation in the output table highlights the mutated residue in the protein sequence displayed above. When available, the server also reports the output of the PANTHER algorithm, which is included in the input features of SNPs&GO (see WS-SNPs&GO, Description section). When the protein function is not available, the "All methods" option runs PhD-SNP and S3D-PROF (the 3D structure version of PhD-SNP). Both programs are based on sequence or structure profiles and the mutation environment. For SNPs&GO3d the server returns outputs similar to those of SNPs&GO. The output also includes the Relative Solvent Accessible Area (RSA) of the mutated residue calculated using the DSSP program. For structure-based predictions the server uses the Jmol applet (http://sourceforge.net/projects/jsmol/) to visualize the protein structure; clicking on a variation shows the mutated residue (in red) and its structural environment (in green).
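The decision rule in the output table can be written in a few lines of Python. The 0.5 threshold comes from the description above; the reliability index is computed here as 20 × |O(D) − 0.5| rounded to the nearest integer, which is an assumed form that gives a 0 to 10 scale and should be checked against the server output. The function name is hypothetical.

```python
def classify(prob_disease):
    """Map the disease-class probability O(D) to a label and a reliability index."""
    # Strictly larger than 0.5 means disease-related, as stated in the text.
    label = "Disease" if prob_disease > 0.5 else "Neutral"
    # Assumed RI definition: distance from the 0.5 threshold, scaled to 0-10.
    reliability = round(20 * abs(prob_disease - 0.5))
    return label, reliability
```

Under this assumption, probabilities close to 0.5 give a low RI, while probabilities near 0 or 1 give an RI close to 10.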
When the "All methods" option is activated the SNPs&GO3d algorithm also returns the standard sequence-based SNPs&GO prediction.
Performance of the different methods on the SAP-SEQ dataset
Performance of the different methods on the SAP-3D dataset
Performance of the different methods on the SAP-NEW dataset
It should be noted that the accuracies of the sequence-based methods (SNPs&GO and PhD-SNP) on the SAP-NEW dataset are lower than those obtained on the SAP-SEQ dataset. This difference may be due to the fluctuations that strongly affect small datasets. Indeed, poorer SNPs&GO and PhD-SNP predictions are mainly observed in the subset of SAP-NEW neutral polymorphisms, which is composed of only 529 mutations, corresponding to ~3% of those in SAP-SEQ. The comparison of the results obtained by S3D-PROF and SNPs&GO3d on the SAP-3D and SAP-NEW datasets shows that structural information partially recovers the loss of accuracy due to less discriminative sequence-based features. This observation reinforces the idea that protein structure is an important piece of information for improving the detection of disease-related variants.
Conclusions and discussion
Recently it has been observed that the correlation between disease-associated variation types and perturbation of protein stability is moderate. The advantage of the WS-SNPs&GO server is that the impact of SAPs is predicted directly from variations in the protein sequence and/or structure, relying on function. When the GO-score computation does not require the reconstruction of ancestry paths in the GO graph [20], SNPs&GO returns its prediction in a time comparable with one run of the BLAST algorithm on the UniRef90 database. Our server is a good alternative to well-established tools like SIFT and PolyPhen since it returns high-quality predictions, as shown in previous works [19, 20, 40] and confirmed in the 2011 edition of the Critical Assessment of Genome Interpretation experiments (http://genomeinterpretation.org/). In particular, SNPs&GO was scored among the best methods in the prediction of deleterious mutations in RAD50. To our knowledge, WS-SNPs&GO is the only function-based server for the prediction of deleterious variants tested on a large number of single point variations related to all types of disease. Finally, it is also worth noting that the previous version of the sequence-based method SNPs&GO has been scored among the best methods for the prediction of deleterious protein variations by independent assessors. Here we present an updated version together with SNPs&GO3d, which exploits the structural information of proteins when available and, in that case, returns more accurate results. We propose our WS-SNPs&GO server as a useful source of annotation of protein variations in high-throughput transcript and exome sequencing experiments.
Additional file 1
SAP-SEQ dataset: http://snps.biofold.org/snps-and-go/pages/data/200910_SAP-SEQ.txt
Additional file 2
SAP-3D dataset: http://snps.biofold.org/snps-and-go/pages/data/200910_SAP-3D.txt
Additional file 3
SAP-NEW dataset: http://snps.biofold.org/snps-and-go/pages/data/201112_SAP-NEW.txt
This work has been supported by the European Community through the Marie Curie International Outgoing Fellowship program [PIOF-GA-2009-237225] to EC and COST BMBS Action TD1101; PRIN 2009 [009WXT45Y] and PON project [PON01_02249] from the Italian Ministry of Research, to RCasadio. EC is currently supported by start-up funds from the Department of Pathology at the University of Alabama, Birmingham. RBA is supported by NIH LM05652, LM GM102365, the NSF CNS-0619926.
The publication costs for this article were funded by the above grants supporting EC and RCasadio.
This article has been published as part of BMC Genomics Volume 14 Supplement 3, 2013: SNP-SIG 2012: Identification and annotation of SNPs in the context of structure, function, and disease. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/14/S3
- Fernald GH, Capriotti E, Daneshjou R, Karczewski KJ, Altman RB: Bioinformatics challenges for personalized medicine. Bioinformatics. 2011, 27 (13): 1741-1748. 10.1093/bioinformatics/btr295.
- International Human Genome Sequencing Consortium: Finishing the euchromatic sequence of the human genome. Nature. 2004, 431 (7011): 931-945. 10.1038/nature03001.
- International HapMap Consortium: The International HapMap Project. Nature. 2003, 426 (6968): 789-796. 10.1038/nature02168.
- Cotton RG, Auerbach AD, Axton M, Barash CI, Berkovic SF, Brookes AJ, Burn J, Cutting G, den Dunnen JT, Flicek P et al: GENETICS. The Human Variome Project. Science. 2008, 322 (5903): 861-862. 10.1126/science.1167363.
- 1000 Genomes Project Consortium: A map of human genome variation from population-scale sequencing. Nature. 2010, 467 (7319): 1061-1073. 10.1038/nature09534.
- Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K: dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001, 29 (1): 308-311. 10.1093/nar/29.1.308.
- Cline MS, Karchin R: Using bioinformatics to predict the functional impact of SNVs. Bioinformatics. 2011, 27 (4): 441-448. 10.1093/bioinformatics/btq695.
- Encode Project Consortium: An integrated encyclopedia of DNA elements in the human genome. Nature. 2012, 489 (7414): 57-74. 10.1038/nature11247.
- Mottaz A, David FP, Veuthey AL, Yip YL: Easy retrieval of single amino-acid polymorphisms and phenotype information using SwissVar. Bioinformatics. 2010, 26 (6): 851-852. 10.1093/bioinformatics/btq028.
- Capriotti E, Fariselli P, Casadio R: I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005, 33 (Web Server): W306-310. 10.1093/nar/gki375.
- Capriotti E, Fariselli P, Rossi I, Casadio R: A three-state prediction of single point mutations on protein stability changes. BMC Bioinformatics. 2008, 9 (Suppl 2): S6-10.1186/1471-2105-9-S2-S6.
- Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, Rooman M: Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics. 2009, 25 (19): 2537-2543. 10.1093/bioinformatics/btp445.
- Guerois R, Nielsen JE, Serrano L: Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 2002, 320 (2): 369-387. 10.1016/S0022-2836(02)00442-4.
- Masso M, Vaisman II: Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis. Bioinformatics. 2008, 24 (18): 2002-2009. 10.1093/bioinformatics/btn353.
- Parthiban V, Gromiha MM, Schomburg D: CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 2006, 34 (Web Server): W239-242. 10.1093/nar/gkl190.
- Zhou H, Zhou Y: Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 2002, 11 (11): 2714-2726.
- Bromberg Y, Rost B: SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007, 35 (11): 3823-3835. 10.1093/nar/gkm238.
- Ng PC, Henikoff S: SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003, 31 (13): 3812-3814. 10.1093/nar/gkg509.
- Calabrese R, Capriotti E, Fariselli P, Martelli PL, Casadio R: Functional annotations improve the predictive score of human disease-related mutations in proteins. Hum Mutat. 2009, 30 (8): 1237-1244. 10.1002/humu.21047.
- Capriotti E, Altman RB: Improving the prediction of disease-related variants using protein three-dimensional structure. BMC Bioinformatics. 2011, Suppl 4: S3-
- Capriotti E, Altman RB: A new disease-specific machine learning approach for the prediction of cancer-causing missense variants. Genomics. 2011, 98 (4): 310-317. 10.1016/j.ygeno.2011.06.010.
- Capriotti E, Arbiza L, Casadio R, Dopazo J, Dopazo H, Marti-Renom MA: Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans. Hum Mutat. 2008, 29 (1): 198-204. 10.1002/humu.20628.
- Capriotti E, Calabrese R, Casadio R: Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006, 22 (22): 2729-2734. 10.1093/bioinformatics/btl423.
- Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN, Mooney SD, Radivojac P: Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics. 2009, 25 (21): 2744-2750. 10.1093/bioinformatics/btp528.
- Ramensky V, Bork P, Sunyaev S: Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 2002, 30 (17): 3894-3900. 10.1093/nar/gkf493.
- Thomas PD, Kejariwal A: Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: evolutionary evidence for differences in molecular effects. Proc Natl Acad Sci USA. 2004, 101 (43): 15398-15403. 10.1073/pnas.0404380101.
- Kaminker JS, Zhang Y, Watanabe C, Zhang Z: CanPredict: a computational tool for predicting cancer-associated missense mutations. Nucleic Acids Res. 2007, 35 (Web Server): W595-598. 10.1093/nar/gkm405.
- Thusberg J, Vihinen M: Pathogenic or not? And if so, then how? Studying the effects of missense mutations using bioinformatics methods. Hum Mutat. 2009, 30 (5): 703-714. 10.1002/humu.20938.
- Carter H, Chen S, Isik L, Tyekucheva S, Velculescu VE, Kinzler KW, Vogelstein B, Karchin R: Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res. 2009, 69 (16): 6660-6667. 10.1158/0008-5472.CAN-09-1133.
- Capriotti E, Nehrt NL, Kann MG, Bromberg Y: Bioinformatics for personal genome interpretation. Brief Bioinform. 2012, 13 (4): 495-512. 10.1093/bib/bbr070.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT et al: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556.
- Rose PW, Beran B, Bi C, Bluhm WF, Dimitropoulos D, Goodsell DS, Prlic A, Quesada M, Quinn GB, Westbrook JD et al: The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res. 2011, 39 (Database): D392-401. 10.1093/nar/gkq1021.
- Chang CC, Lin CJ: LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011, 2 (3): 1-27.
- Pei J, Grishin NV: AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics. 2001, 17 (8): 700-712. 10.1093/bioinformatics/17.8.700.
- Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983, 22 (12): 2577-2637. 10.1002/bip.360221211.
- Boyle EI, Weng S, Gollub J, Jin H, Botstein D, Cherry JM, Sherlock G: GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004, 20 (18): 3710-3715. 10.1093/bioinformatics/bth456.
- Brown NP, Leroy C, Sander C: MView: a web-compatible database search or multiple alignment viewer. Bioinformatics. 1998, 14 (4): 380-381. 10.1093/bioinformatics/14.4.380.
- Casadio R, Vassura M, Tiwari S, Fariselli P, Martelli PL: Correlating disease-related mutations to their effect on protein stability: a large-scale analysis of the human proteome. Hum Mutat. 2011, 32 (10): 1161-1170. 10.1002/humu.21555.
- Thusberg J, Olatubosun A, Vihinen M: Performance of mutation pathogenicity prediction methods on missense variants. Hum Mutat. 2011, 32 (4): 358-368. 10.1002/humu.21445.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.