Short Linear Motifs recognized by SH2, SH3 and Ser/Thr Kinase domains are conserved in disordered protein regions
© Ren et al; licensee BioMed Central Ltd. 2008
- Published: 16 September 2008
Protein interactions are essential for most cellular functions. Interactions mediated by domains that appear in a large number of proteins are of particular interest, since they are expected to affect a diversity of cellular processes such as signal transduction and immune response. Many well-represented domains recognize and bind to primary sequences of fewer than 10 amino acids, called Short Linear Motifs (SLiMs).
In this study, we systematically studied the evolutionary conservation of SLiMs recognized by SH2, SH3 and Ser/Thr Kinase domains in both ordered and disordered protein regions. Disordered protein regions are protein sequences that lack a fixed three-dimensional structure under putatively native conditions. We find that, in all these domains examined, SLiMs are more conserved in disordered regions. This trend is more evident in those protein functional groups that are frequently reported to interact with specific domains.
The correlation between SLiM conservation and disorder prediction demonstrates that functional SLiMs recognized by each domain occur more often in disordered than in structured regions of proteins.
- Conservation Score
- Relative Conservation
- Intrinsic Disorder
- Disorder Prediction
- PXXP Motif
Selective protein-protein interactions are important for many cellular functions and are often mediated by short regions, but such regions are difficult to identify because of their short lengths and degenerate sequences. A significant advance came when peptide-library methods were developed to identify sequences recognized by SH2 domains, globular domains that play important roles in cellular signal transduction. These peptide-library methods did not depend on prior knowledge of interaction sites in vivo. Similar peptide-library experiments have been performed to map motifs recognized by other domains. Motifs discovered through peptide-library screening showed remarkable agreement with reported domain interaction sites [1, 2]. Such sites later became the basis for Scansite [3, 4], a bioinformatics tool developed to predict target sites recognized by specific protein domains.
Attempts have been made to find such binding regions using purely computational approaches. Eukaryotic linear motifs (ELMs) are identified by their over-representation among protein sequences that bind to a common partner . Short linear motifs (SLiMs) are also identified as specific sequence patterns that are over-represented in proteins that bind to a common partner, but the algorithms used to discover SLiMs employ filters to remove homologous proteins whereas the ELM-discovery algorithms do not. Thus, ELMs and SLiMs are both identified as sequence patterns in multiple proteins that bind to a common target, with the SLiM-containing set likely to be entirely non-homologous but with no such restriction on the ELM-containing set.
Traditionally, proteins have been believed to function through some form of three-dimensional (3D) structure, as represented by the "lock and key" or the "induced fit" theory. A growing number of examples show that some biological functions of proteins require the protein structure to be more flexible. Disordered protein regions are those sequences in proteins that do not have rigid three-dimensional structures. In plots of disorder prediction versus residue number, several sharp dips flanked by regions strongly predicted to be disordered, seen in several different proteins, were associated with sites that bind to the respective protein partners. This observation was independently made somewhat later. Further analysis of such complexes was carried out [8, 9], predictors were developed [10, 11], and these binding regions were first named molecular recognition elements and then molecular recognition features (MoRFs).
MoRFs differ from ELMs and SLiMs in not depending on a specific sequence motif, but rather upon a pattern in a disorder prediction output. Yet, interestingly, recent analysis suggests that linear motifs (LMs) (thus not differentiating between ELMs and SLiMs) show high overlap with MoRFs . Taken all together, these observations suggest that regions of intrinsic disorder often play a role in protein-protein interactions [13–18]. In addition, there are documented cases where the binding of these disordered regions is coupled to their folding [7, 19, 20].
Table 1. Molecular functional groups frequently reported to interact with domains.
Cell surface receptor
Guanine nucleotide exchange factor
GTPase activating protein
Cell cycle control protein
RNA binding protein
Transcription regulatory protein
The Src homology 2 (SH2) domain is a prototypical functional module of ~100 amino acids that contains a central anti-parallel β-sheet surrounded by two α-helices. SH2 domains represent the largest class of known phosphotyrosine (pTyr)-recognition domains. These domains bind specific pTyr-containing motifs, which in complexes typically adopt an extended β-strand that lies at right angles to the SH2 β-sheet. The SLiM-SH2 interactions typically couple activated protein tyrosine kinases (PTKs) to a number of intracellular pathways regulating various aspects of cellular communication. Overall, the SH2 domain is an important functional module found in a great variety of proteins regulating functionally diverse processes. Recently, these SH2-containing proteins were classified into 11 functional categories. Illustrative examples of functions modulated by the SH2-containing proteins include signal regulation, tyrosine phosphorylation, control of phospholipid metabolism, small GTPase regulation, gene expression, chromatin remodeling, ubiquitylation, and cytoskeletal organization. Furthermore, some of the SH2-containing proteins serve as adaptors and scaffolds.
Src-homology 3 (SH3) domains generally bind to Pro-rich peptides that form a left-handed polyPro type II helix. SH3 domains are small protein modules of ~60 amino acid residues that typically contain five or six β-strands arranged as two tightly packed anti-parallel β-sheets . The linker regions may contain short helices. Two SH3 variable loops, the RT and n-Src loops, flank a SLiM-binding site that consists of a hydrophobic patch that contains a cluster of conserved aromatic residues . Two classes of SH3 domains have been defined, Class 1 and Class 2, which recognize RKXXPXXP and PXXPXR motifs, respectively . An interesting feature of SH3 domains is the palindromic nature of their ligands; i.e. these domains can bind the SLiMs in either orientation . SH3 domains are found in a great variety of intracellular or membrane-associated proteins, e.g., in a number of proteins with enzymatic activity, in adaptor proteins that lack catalytic sequences and in cytoskeletal proteins, such as fodrin and yeast actin-binding protein ABP-1. SH3 domains mediate assembly of specific protein complexes via binding to proline-rich peptides in their respective binding partner. They are involved in cell-cell communication and signal transduction from the cell surface to the nucleus . Interestingly, SH2 and SH3 domains are frequently found together in the same protein. However, certain proteins contain a single SH2 or SH3 domain, while others contain several copies of either domain [25, 27]. Some SH2 domains (e.g., Crk SH2 domain) contain specific SH3 domain-binding sites , thus linking together SH2- and SH3-mediated regulatory networks.
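As a rough illustration of how sequence patterns such as PXXP, RKXXPXXP and PXXPXR can be located in a protein sequence, the following minimal Python sketch converts a motif string into a regular expression and reports every occurrence. It assumes that "X" stands for any residue and that all other letters are read literally, position by position; the function names and the test sequences are hypothetical and are not taken from the study.

```python
import re
from typing import List, Tuple


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Build a regex from a motif string, treating 'X' as any residue (assumption)."""
    body = "".join("." if ch == "X" else re.escape(ch) for ch in motif)
    # The lookahead makes overlapping occurrences visible to finditer.
    return re.compile(f"(?=({body}))")


def find_motif(sequence: str, motif: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each occurrence of the motif."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to exercise the function.
    toy = "AAPLLPARAA"
    for name in ("PXXP", "PXXPXR"):
        print(name, find_motif(toy, name))
```

A lookahead is used so that overlapping motif instances, which are common in proline-rich stretches, are not silently dropped.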
Protein phosphorylation is one of the most ubiquitous post-translational modifications of proteins and is the most common mechanism of protein function regulation known to date. In eukaryotes, phosphorylation is carried out by protein kinases, which represent about 2% of the proteins encoded by eukaryotic genomes [30–33]. In the human genome, kinases are the third most common class of protein. Protein kinases are key signalling enzymes that participate in the regulation of multiple cellular responses and have evolved two properties that are essential for their function: sensitive means of regulation and high specificity for substrates. Ser/Thr kinases transfer the terminal phosphate from ATP to a specific Ser or Thr residue on protein substrates. Illustrative examples of the most crucial Ser/Thr kinases include the mitogen-activated protein kinase (MAPK), glycogen synthase kinase 3 (GSK3), cAMP-dependent protein kinase (PKA), phosphorylase kinase, cyclin-dependent kinase (CDK), protein kinase B (PKB) and phosphoinositide-dependent protein kinase-1 (PDK1) families. Early studies on model Ser/Thr protein kinases revealed that the principal substrate specificity determinants for these kinases were "recognition motifs", located in short segments of the primary sequence around the phosphorylation sites [35, 36].
Table 2. Invariant amino acid residues in SLiMs recognized by SH2, SH3 and Ser/Thr Kinase domains.
SH3 Type 1
SH3 Type 2
Protein classification and sequence data
Protein sequence data were obtained from the SwissProt database, downloaded from ftp://ftp.ncbi.nih.gov in November 2005. Reported protein-protein interactions, protein molecular function classifications, biological processes and sub-cellular localizations were taken from the Hprd dataset, a non-redundant, manually curated protein database downloaded in November 2005 from http://www.hprd.org. Phosphorylated sites were obtained from the Phospho.ELM database, kindly provided by Francesca Diella in December 2005.
For our protein functional classification analysis, we selected all (7248) human proteins that satisfy the following criteria: (i) the protein has a sequence annotated by SwissProt; (ii) the protein has a molecular function annotated by the Human Protein Reference Database (Hprd); (iii) the function of the protein is within 34 protein functional groups in Hprd, each of which is found 50 or more times in Hprd.
Selection of homologous proteins
Using the 7248 human protein sequences selected as described above, we ran a BLAST search against 12 other higher eukaryotic species (Canis familiaris, Bos taurus, Mus musculus, Rattus norvegicus, Gallus gallus, Xenopus tropicalis, Tetraodon nigroviridis, Danio rerio, Strongylocentrotus purpuratus, Drosophila melanogaster, Apis mellifera, and Caenorhabditis elegans) to obtain sequences homologous to the human proteins. Species were selected according to their evolutionary positions (four mammals, four non-mammalian vertebrates and four invertebrates) and sequence availability in the RefSeq database. Sequence data for all non-human species were from the RefSeq database downloaded from ftp://ftp.ncbi.nih.gov in June 2006, except for Tetraodon nigroviridis, which was from the NCBI Entrez non-redundant protein sequence database downloaded from ftp://ftp.ncbi.nih.gov in June 2006. We applied two cutoffs to avoid inclusion of insignificant hits: a score cutoff of 50 bits and an overlap cutoff of 50%, as applied in Inparanoid. If more than one homologous sequence was obtained from a single species, the one with the lowest E-value was selected for this study. Inparanoid and COG (Clusters of Orthologous Groups) treat all species as equal entries. In contrast, because most of the biochemical data we used, including protein interaction data and protein classification data, were from human, we compared sequences from all other species to those of human. We therefore took the best hit from each non-human species as homologous to the human query protein, without requiring mutually best matches between human and non-human species or among the non-human species themselves. Sequence alignments were manually checked and modified when necessary.
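The hit-selection rule described above (bit-score cutoff, overlap cutoff, then lowest E-value per species) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Hit` record and its field names are hypothetical, and the overlap is assumed here to be the aligned length divided by the query length, which may differ from the exact definition used by Inparanoid or in the study.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Hit:
    """One BLAST hit for a human query (hypothetical record layout)."""
    species: str
    subject_id: str
    bit_score: float
    evalue: float
    aligned_length: int
    query_length: int


def best_hits(hits: Iterable[Hit],
              min_bits: float = 50.0,
              min_overlap: float = 0.5) -> Dict[str, Hit]:
    """Keep, per species, the lowest-E-value hit that passes both cutoffs."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        # Assumption: overlap = aligned length relative to the query length.
        overlap = hit.aligned_length / hit.query_length
        if hit.bit_score < min_bits or overlap < min_overlap:
            continue
        current = best.get(hit.species)
        if current is None or hit.evalue < current.evalue:
            best[hit.species] = hit
    return best
```

Note that this is a one-directional best-hit rule anchored on the human query, matching the description above; it does not check for reciprocal best matches.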
Predictions of intrinsic disorder from protein sequence were carried out using VL3 [42, 43], a well-characterized disorder predictor that is publicly accessible at our web site http://www.ist.temple.edu/disprot. This predictor was trained on experimentally (X-ray and NMR) confirmed disordered protein regions, while the ordered training set included completely ordered protein regions extracted from the non-redundant set of proteins from PDB Select 25. The accuracy of this predictor, benchmarked on the 42 CASP5 targets, reached 78%. This result was the best on all measures, for both no-density segments and B-factors, and was significantly better than those of the predictors from other groups that participated in CASP5.
Calculation of the conservation score of SLiM
SLiMs that have amino acid residues critically invariant for each domain (as shown in Table 2) were obtained for evolutionary analysis (Thr-SLiMs were not included in the analysis for Ser/Thr kinase domains, since we only have peptide-library-mapped motifs for Ser-SLiMs). For a particular protein sequence, assume that the sequence identity between a reference species (human in this study) and species i is p(i) (the number of identical sites divided by the total number of sites aligned), and that the SLiM under study is n amino acids in length (when the SLiM lies at a terminus of a protein and is only partially available, the available length is used). If the SLiM is under the same evolutionary selection as the full-length protein, then the probability that the SLiM is conserved between the two species is given by:
$$P_1(i) = p(i)^{n}$$
The probability that the SLiM is unconserved is given by:
$$P_2(i) = 1 - P_1(i) = 1 - p(i)^{n}$$
Here we define Relative Conservation (CR) between human and the ith species as:
a. if the SLiM is conserved:
$$\mathrm{CR}(i) = 1/P_1(i) = 1/p(i)^{n}$$
b. if the SLiM is unconserved:
$$\mathrm{CR}(i) = P_2(i) = 1 - p(i)^{n}$$
A CR score greater than 1 indicates the SLiM is CR times more conserved than the average level of the protein. A score smaller than 1 indicates 1/CR times greater variability between species.
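The relative conservation score defined above can be computed with a few lines of Python. The sketch below is a minimal illustration, not the authors' implementation: the function names are hypothetical, the caller is assumed to decide whether the SLiM counts as conserved, and "sites aligned" is assumed to mean alignment columns in which neither sequence has a gap.

```python
def sequence_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical sites divided by aligned sites for two aligned sequences.

    Assumption: columns with a gap in either sequence are not counted as aligned.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no aligned sites")
    return sum(a == b for a, b in pairs) / len(pairs)


def relative_conservation(identity: float, motif_length: int, conserved: bool) -> float:
    """CR(i): 1 / p(i)^n if the SLiM is conserved, 1 - p(i)^n otherwise."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    if motif_length < 1:
        raise ValueError("motif_length must be at least 1")
    p1 = identity ** motif_length  # probability of conservation by chance
    return 1.0 / p1 if conserved else 1.0 - p1
```

For instance, with p(i) = 0.5 and n = 3, the chance of conservation is 0.125, so a conserved SLiM scores 1/0.125 = 8 and an unconserved one scores 1 - 0.125 = 0.875.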
This relative conservation approach was originally developed to study domain-recognized motifs within protein sequences in different functional groups (Ren & Chen et al., submitted). The method may not be suitable for SLiMs longer than 10 amino acids, since it assumes that most residues in the SLiM could influence the interaction. This may not be the case in longer sequences, where only a small subset of the residues is critical to binding. Although not all residues in a SLiM shorter than 10 amino acids are essential for interaction, their relative conservation is usually strong enough to be detected.
Please see Additional file 1 for information on additional materials and methods.
Traditional methods measure sequence conservation without considering the conservation background of the protein. Here, we took background conservation into consideration by measuring the relative conservation score. Our central hypothesis was that SLiMs should be subject to two kinds of evolutionary selection. The first is background selection, which is imposed upon the entire length of the protein sequence, due to the integral function of the protein. The second is SLiM-specific selection superimposed on the background, due to the special function mediated by the SLiM.
Analysis of SH2 domain recognized SLiMs in 11 most studied Receptor Tyrosine Kinases (RTKs)
In this section and the sections that follow, we use "SLiM conservation" to indicate relative conservation unless specified otherwise.
Short Linear Motifs recognized by SH2, SH3 and Ser/Thr kinases domains are conserved in disordered regions
Although the conservation of the SLiMs is more evident in disordered than in ordered protein regions for all three domains examined, there are still some differences among the three domains. Tyr-SLiMs recognized by SH2 domains are conserved in disordered but not in ordered protein regions. Ser-SLiMs recognized by Ser/Thr kinases are conserved in both ordered and disordered protein regions but are more conserved in disordered regions (since we only had motifs with a central serine residue, only Ser-SLiMs and not Thr-SLiMs were analysed). PXXP-containing SLiMs recognized by SH3 domains are conserved in disordered but not in ordered protein regions. Interestingly, the sequences near the PXXP motifs recognized by SH3 have high conservation scores. One possible explanation is that the proline residue is strongly disorder-promoting [46, 47], and so a structured sequence containing a PXXP motif would be expected to be an unstable element in the rigid structure. To compensate for the loss of structural stability brought about by the PXXP motif, the neighbouring residues would become more important for maintaining stability, which may explain their evolutionary conservation.
Protein disorder is believed to play an important role in protein-protein interactions. In this study, we show that the SH2, SH3 and Ser/Thr Kinase domain-recognizable short linear motifs in disordered regions of proteins are more conserved than those in ordered protein regions. This difference is most significant in those molecular functional classes that are frequently reported to interact with their respective domains, but weak in functional groups that are rarely reported to interact with their respective domains.
From an evolutionary perspective, ordered or structural regions are generally more conserved than disordered regions . In this study, calculating the relative conservation of sequences enabled the detection of a conservation signal of a SLiM compared to the conservation background of the protein in which the SLiM resides.
The enrichment of relatively conserved SLiMs in disordered protein regions is highly related to their function. Location of SLiMs in intrinsically disordered regions provides several important functional benefits for interactions with domains. First, SLiMs in disordered regions are more accessible to domains since they are necessarily fully exposed. Second, SLiM-domain interactions are usually very weak because of the small recognition surface involved. Localization within intrinsically disordered proteins allows the SLiM to adapt to the recognition surface and thus improves the stability of the interaction. Third, being located within disordered regions enables overlapping SLiMs to change their conformations to bind to different partners and thus increases signalling complexity. For example, the SH2 domain binds to Tyr-SLiMs previously phosphorylated by Tyr-kinases, so the same region has overlapping motifs, one for the kinase and one for the SH2 domain. The structure of this region changes when it binds to the different partners, and this structural change is facilitated by the flexibility of intrinsic disorder.
Phosphorylation is an important post-translational modification that merits closer attention. Phosphorylation occurs in ~30–50% of the proteins in eukaryotes. Sites of phosphorylation usually occur in disordered regions. Several of the SLiMs analyzed in this paper are phosphorylated, and we have established that the domain-recognized SLiMs are preferentially located in disordered protein regions. Therefore, our results and this previous work support each other.
Furthermore, several computational methods have been developed for identifying protein phosphorylation sites according to their surrounding peptide sequences. Some of these methods (including NetPhos, NetPhosK, PredPhospho, GPS, PPSP, ScanSite and Phospho.ELM) depend on datasets of both phosphorylated and non-phosphorylated peptide sequences for training and therefore rely on specific sequence motifs, whereas DisPhos uses disorder but does not use sequence motifs.
If phosphorylation does indeed occur in disordered regions, then phosphorylation predictors based on sequence motifs would give a false positive whenever the motif is in a structured region. That is, if a sequence motif is in a structured region of a protein, the site would be hard to phosphorylate since it does not have the flexibility to fit onto the active site of the kinase (note: binding to the active site requires extended structure and accessible backbone hydrogen bonds, which are hallmarks of disordered proteins).
On the other hand, it would be expected that DisPhos would give a false positive when the Ser/Thr or Tyr in a disordered region is not within a kinase recognition motif. These observations suggest that combining a motif-based prediction method with a disorder-based prediction method should give a large increase in phosphorylation prediction accuracy because each method would reduce the false positives from the other method.
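The idea of letting each kind of predictor filter the other's false positives can be illustrated with a very small Python sketch. It is not the algorithm of DisPhos, PhoScan, or any other published tool: the function name is hypothetical, the per-residue disorder scores are assumed to come from some external predictor and to lie between 0 and 1, and the 0.5 threshold is an arbitrary assumption.

```python
from typing import List, Sequence


def combined_sites(motif_sites: Sequence[int],
                   disorder_scores: Sequence[float],
                   threshold: float = 0.5) -> List[int]:
    """Keep motif-predicted sites whose residue is also predicted disordered.

    motif_sites: 0-based positions flagged by a motif-based method.
    disorder_scores: one score per residue (assumed range 0..1).
    threshold: assumed cutoff above which a residue is called disordered.
    """
    kept = []
    for pos in sorted(set(motif_sites)):
        if not 0 <= pos < len(disorder_scores):
            raise IndexError(f"site {pos} is outside the sequence")
        if disorder_scores[pos] >= threshold:
            kept.append(pos)
    return kept
```

Under these assumptions, a motif hit in a confidently ordered region is discarded, and a disordered Ser/Thr/Tyr that matches no motif is never proposed in the first place.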
This hypothesis was recently confirmed by an elegant study in which a new method named PhoScan was developed to predict phosphorylation sites for specific protein kinases without using non-phosphorylated training data. The authors combined the common (or disorder-based) and the kinase-specific feature sets and added new features that were identified from the training data of known phosphorylation sites. Among these newly added features was the flexibility (disorder) tendency of the local regions surrounding phosphorylation sites, evaluated using the approach of Iakoucheva et al. PhoScan was shown to achieve a specificity of > 90% and a sensitivity of ~90% at the kinase-family level. This represents a very large improvement (about 20%) compared to the previous methods, which likely occurs because the motif-based approach reduces the false positives of the disorder-based approach and vice versa.
Although the SLiM conservation signal is more evident in disordered than in ordered protein regions for all three domains examined, some SLiMs in ordered regions can also interact with domains under physiological conditions. For example, serine residues in the structured activation loop of several kinases can be phosphorylated and change the kinase activities. However, these loops undergo large-scale conformational shifts following phosphorylation, and so it is likely that the loops become disordered during the phosphorylation event. This observation suggests that each example in which a motif is apparently in a structured region should be checked for the possibility of transient disorder during binding. Use of transient disorder for signalling presents a number of opportunities for regulation and control. This study covers only a limited number of the domains that can interact with SLiMs in the genome. In the future it should be possible to examine SLiMs recognized by other domains using available sequence motifs.
This study provides evolutionary evidence for the importance of intrinsic disorder in the context of functional protein interactions. Specifically, SLiMs within disordered protein regions are more conserved than equivalent sites within ordered regions. A study of manually extracted SH2 interaction sites in the 11 most studied receptor tyrosine kinases provided experimental evidence that Tyr-SLiMs within disordered regions are more likely to be involved in interaction. Although there is currently no direct evidence that this is the general rule in vivo for SLiMs recognized by the domains studied here or by other domains, we hope our current observations will contribute to discussion of the role of intrinsically disordered protein regions.
This work was supported in part by the grants R01 LM007688-01A1 (to A.K.D and V.N.U.) and GM071714-01A2 (to A.K.D and V.N.U.) from the National Institutes of Health, the Programs of the Russian Academy of Sciences for the "Molecular and cellular biology" and "Fundamental science for medicine" (to V. N. U.) and under a grant with the Pennsylvania Department of Health (to Z.O.). We gratefully acknowledge the support of the IUPUI Signature Centers Initiative.
This article has been published as part of BMC Genomics Volume 9 Supplement 2, 2008: IEEE 7th International Conference on Bioinformatics and Bioengineering at Harvard Medical School. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/9?issue=S2
- Songyang Z, Shoelson SE, Chaudhuri M, Gish G, Pawson T, Haser WG, King F, Roberts T, Ratnofsky S, Lechleider RJ: SH2 domains recognize specific phosphopeptide sequences. Cell. 1993, 72 (5): 767-778. 10.1016/0092-8674(93)90404-E.
- Songyang Z, Shoelson SE, McGlade J, Olivier P, Pawson T, Bustelo XR, Barbacid M, Sabe H, Hanafusa H, Yi T: Specific motifs recognized by the SH2 domains of Csk, 3BP2, fps/fes, GRB-2, HCP, SHC, Syk, and Vav. Mol Cell Biol. 1994, 14 (4): 2777-2785.
- Yaffe MB, Leparc GG, Lai J, Obata T, Volinia S, Cantley LC: A motif-based profile scanning approach for genome-wide prediction of signaling pathways. Nat Biotechnol. 2001, 19 (4): 348-353. 10.1038/86737.
- Obenauer JC, Cantley LC, Yaffe MB: Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 2003, 31 (13): 3635-3641. 10.1093/nar/gkg584.
- Puntervoll P, Linding R, Gemund C, Chabanis-Davidson S, Mattingsdal M, Cameron S, Martin DM, Ausiello G, Brannetti B, Costantini A: ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res. 2003, 31 (13): 3625-3630. 10.1093/nar/gkg545.
- Garner E, Romero P, Dunker AK, Brown C, Obradovic Z: Predicting Binding Regions within Disordered Proteins. Genome Inform Ser Workshop Genome Inform. 1999, 10: 41-50.
- Callaghan AJ, Aurikko JP, Ilag LL, Gunter Grossmann J, Chandran V, Kuhnel K, Poljak L, Carpousis AJ, Robinson CV, Symmons MF: Studies of the RNA degradosome-organizing domain of the Escherichia coli ribonuclease RNase E. J Mol Biol. 2004, 340 (5): 965-979. 10.1016/j.jmb.2004.05.046.
- Mohan A, Oldfield CJ, Radivojac P, Vacic V, Cortese MS, Dunker AK, Uversky VN: Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006, 362 (5): 1043-1059. 10.1016/j.jmb.2006.07.087.
- Vacic V, Oldfield CJ, Mohan A, Radivojac P, Cortese MS, Uversky VN, Dunker AK: Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res. 2007, 6 (6): 2351-2366. 10.1021/pr0701411.
- Cheng Y, Oldfield CJ, Meng J, Romero P, Uversky VN, Dunker AK: Mining alpha-helix-forming molecular recognition features with cross species sequence alignments. Biochemistry. 2007, 46 (47): 13468-13477. 10.1021/bi7012273.
- Oldfield CJ, Cheng Y, Cortese MS, Romero P, Uversky VN, Dunker AK: Coupled folding and binding with alpha-helix-forming molecular recognition elements. Biochemistry. 2005, 44 (37): 12454-12470. 10.1021/bi050736e.
- Fuxreiter M, Tompa P, Simon I: Local structural disorder imparts plasticity on linear motifs. Bioinformatics. 2007, 23 (8): 950-956. 10.1093/bioinformatics/btm035.
- Wright PE, Dyson HJ: Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol. 1999, 293 (2): 321-331. 10.1006/jmbi.1999.3110.
- Dunker AK, Obradovic Z: The protein trinity – linking function and disorder. Nat Biotechnol. 2001, 19 (9): 805-806. 10.1038/nbt0901-805.
- Uversky VN: Natively unfolded proteins: a point where biology waits for physics. Protein Sci. 2002, 11 (4): 739-756. 10.1110/ps.4210102.
- Dunker AK, Cortese MS, Romero P, Iakoucheva LM, Uversky VN: Flexible nets. The roles of intrinsic disorder in protein interaction networks. Febs J. 2005, 272 (20): 5129-5148. 10.1111/j.1742-4658.2005.04948.x.
- Uversky VN, Oldfield CJ, Dunker AK: Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. J Mol Recognit. 2005, 18 (5): 343-384. 10.1002/jmr.747.
- Oldfield CJ, Meng J, Yang JY, Yang MQ, Uversky VN, Dunker AK: Flexible nets: Disorder and induced fit in the associations of p53 and 14-3-3 with their partners. BMC Genomics. 2008, 9 (S1): S1. 10.1186/1471-2164-9-S1-S1.
- Radhakrishnan I, Perez-Alvarado GC, Parker D, Dyson HJ, Montminy MR, Wright PE: Solution structure of the KIX domain of CBP bound to the transactivation domain of CREB: a model for activator:coactivator interactions. Cell. 1997, 91 (6): 741-752. 10.1016/S0092-8674(00)80463-8.
- Longhi S, Receveur-Brechot V, Karlin D, Johansson K, Darbon H, Bhella D, Yeo R, Finet S, Canard B: The C-terminal domain of the measles virus nucleoprotein is intrinsically disordered and folds upon binding to the C-terminal moiety of the phosphoprotein. J Biol Chem. 2003, 278 (20): 18638-18648. 10.1074/jbc.M300518200.
- Waksman G, Kominos D, Robertson SC, Pant N, Baltimore D, Birge RB, Cowburn D, Hanafusa H, Mayer BJ, Overduin M: Crystal structure of the phosphotyrosine recognition domain SH2 of v-src complexed with tyrosine-phosphorylated peptides. Nature. 1992, 358 (6388): 646-653. 10.1038/358646a0.
- Pawson T, Gish GD, Nash P: SH2 domains, interaction modules and cellular wiring. Trends Cell Biol. 2001, 11 (12): 504-511. 10.1016/S0962-8924(01)02154-7.
- Liu BA, Jablonowski K, Raina M, Arce M, Pawson T, Nash PD: The human and mouse complement of SH2 domain proteins-establishing the boundaries of phosphotyrosine signaling. Mol Cell. 2006, 22 (6): 851-868. 10.1016/j.molcel.2006.06.001.
- Pawson T, Nash P: Protein-protein interactions define specificity in signal transduction. Genes Dev. 2000, 14 (9): 1027-1047.
- Schlessinger J: SH2/SH3 signaling proteins. Curr Opin Genet Dev. 1994, 4 (1): 25-30. 10.1016/0959-437X(94)90087-6.
- Nguyen JT, Turck CW, Cohen FE, Zuckermann RN, Lim WA: Exploiting the basis of proline recognition by SH3 and WW domains: design of N-substituted inhibitors. Science. 1998, 282 (5396): 2088-2092. 10.1126/science.282.5396.2088.
- Cohen GB, Ren R, Baltimore D: Modular binding domains in signal transduction proteins. Cell. 1995, 80 (2): 237-248. 10.1016/0092-8674(95)90406-9.
- Pawson T: Protein modules and signalling networks. Nature. 1995, 373 (6515): 573-580. 10.1038/373573a0.
- Anafi M, Rosen MK, Gish GD, Kay LE, Pawson T: A potential SH3 domain-binding site in the Crk SH2 domain. J Biol Chem. 1996, 271 (35): 21365-21374. 10.1074/jbc.271.35.21365.PubMedView ArticleGoogle Scholar
- Hunter T, Plowman GD: The protein kinases of budding yeast: six score and more. Trends Biochem Sci. 1997, 22 (1): 18-22. 10.1016/S0968-0004(96)10068-2.PubMedView ArticleGoogle Scholar
- Plowman GD, Sudarsanam S, Bingham J, Whyte D, Hunter T: The protein kinases of Caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc Natl Acad Sci USA. 1999, 96 (24): 13603-13610. 10.1073/pnas.96.24.13603.PubMedPubMed CentralView ArticleGoogle Scholar
- Morrison DK, Murakami MS, Cleghon V: Protein kinases and phosphatases in the Drosophila genome. J Cell Biol. 2000, 150 (2): F57-62. 10.1083/jcb.150.2.F57.PubMedView ArticleGoogle Scholar
- Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S: The protein kinase complement of the human genome. Science. 2002, 298 (5600): 1912-1934. 10.1126/science.1075762.PubMedView ArticleGoogle Scholar
- Biondi RM, Nebreda AR: Signalling specificity of Ser/Thr protein kinases through docking-site-mediated interactions. Biochem J. 2003, 372 (Pt 1): 1-13. 10.1042/BJ20021641.PubMedPubMed CentralView ArticleGoogle Scholar
- Kemp BE, Bylund DB, Huang TS, Krebs EG: Substrate specificity of the cyclic AMP-dependent protein kinase. Proc Natl Acad Sci USA. 1975, 72 (9): 3448-3452. 10.1073/pnas.72.9.3448.PubMedPubMed CentralView ArticleGoogle Scholar
- Zetterqvist O, Ragnarsson U, Humble E, Berglund L, Engstrom L: The minimum substrate of cyclic AMP-stimulated protein kinase, as studied by synthetic peptides representing the phosphorylatable site of pyruvate kinase (type L) of rat liver. Biochem Biophys Res Commun. 1976, 70 (3): 696-703. 10.1016/0006-291X(76)90648-3.PubMedView ArticleGoogle Scholar
- Peri S, Navarro JD, Kristiansen TZ, Amanchy R, Surendranath V, Muthusamy B, Gandhi TK, Chandrika KN, Deshpande N, Suresh S: Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res. 2004, 32 (Database issue): D497-501. 10.1093/nar/gkh070.
- Diella F, Cameron S, Gemund C, Linding R, Via A, Kuster B, Sicheritz-Ponten T, Blom N, Gibson TJ: Phospho.ELM: a database of experimentally verified phosphorylation sites in eukaryotic proteins. BMC Bioinformatics. 2004, 5: 79. 10.1186/1471-2105-5-79.
- Pruitt KD, Tatusova T, Maglott DR: NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005, 33 (Database issue): D501-504.
- O'Brien KP, Remm M, Sonnhammer EL: Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 2005, 33 (Database issue): D476-480.
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN: The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003, 4: 41. 10.1186/1471-2105-4-41.
- Obradovic Z, Peng K, Vucetic S, Radivojac P, Brown CJ, Dunker AK: Predicting intrinsic disorder from amino acid sequence. Proteins. 2003, 53 (Suppl 6): 566-572. 10.1002/prot.10532.
- Peng K, Vucetic S, Radivojac P, Brown CJ, Dunker AK, Obradovic Z: Optimizing long intrinsic disorder predictors with protein evolutionary information. J Bioinform Comput Biol. 2005, 3 (1): 35-60. 10.1142/S0219720005000886.
- Melamud E, Moult J: Evaluation of disorder predictions in CASP5. Proteins. 2003, 53 (Suppl 6): 561-565. 10.1002/prot.10533.
- Kashiwada M, Giallourakis CC, Pan PY, Rothman PB: Immunoreceptor tyrosine-based inhibitory motif of the IL-4 receptor associates with SH2-containing phosphatases and regulates IL-4-induced proliferation. J Immunol. 2001, 167 (11): 6382-6387.
- Williams RM, Obradovic Z, Mathura V, Braun W, Garner EC, Young J, Takayama S, Brown CJ, Dunker AK: The protein non-folding problem: amino acid determinants of intrinsic order and disorder. Pac Symp Biocomput. 2001, 89-100.
- Campen AWR, Brown CJ, Uversky VN, Dunker AK: TOP-IDP-Scale: A new amino acid scale measuring propensity for intrinsic disorder. Protein and Peptide Letters. 2008.
- Brown CJ, Takayama S, Campen AM, Vise P, Marshall TW, Oldfield CJ, Williams CJ, Dunker AK: Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol. 2002, 55 (1): 104-110. 10.1007/s00239-001-2309-6.
- Pinna LA, Ruzzene M: How do protein kinases recognize their substrates? Biochim Biophys Acta. 1996, 1314 (3): 191-225. 10.1016/S0167-4889(96)00083-3.
- Vucetic S, Obradovic Z, Vacic V, Radivojac P, Peng K, Iakoucheva LM, Cortese MS, Lawson JD, Brown CJ, Sikes JG: DisProt: a database of protein disorder. Bioinformatics. 2005, 21 (1): 137-140. 10.1093/bioinformatics/bth476.
- Blom N, Gammeltoft S, Brunak S: Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999, 294 (5): 1351-1362. 10.1006/jmbi.1999.3310.
- Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S: Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004, 4 (6): 1633-1649. 10.1002/pmic.200300771.
- Kim JH, Lee J, Oh B, Kimm K, Koh I: Prediction of phosphorylation sites using SVMs. Bioinformatics. 2004, 20 (17): 3179-3184. 10.1093/bioinformatics/bth382.
- Zhou FF, Xue Y, Chen GL, Yao X: GPS: a novel group-based phosphorylation predicting and scoring method. Biochem Biophys Res Commun. 2004, 325 (4): 1443-1448. 10.1016/j.bbrc.2004.11.001.
- Xue Y, Li A, Wang L, Feng H, Yao X: PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics. 2006, 7: 163. 10.1186/1471-2105-7-163.
- Li T, Li F, Zhang X: Prediction of kinase-specific phosphorylation sites with sequence features by a log-odds ratio approach. Proteins. 2008, 70 (2): 404-414. 10.1002/prot.21563.
- Dunker AK, Uversky VN: Signal transduction via unstructured protein conduits. Nat Chem Biol. 2008, 4 (4): 229-230. 10.1038/nchembio0408-229.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.