Identification of arginine- and lysine-methylation in the proteome of Saccharomyces cerevisiae and its functional implications
BMC Genomics volume 11, Article number: 92 (2010)
The methylation of eukaryotic proteins has been proposed to be widespread, but this has not been conclusively shown to date. In this study, we examined 36,854 previously generated peptide mass spectra from 2,607 Saccharomyces cerevisiae proteins for the presence of arginine and lysine methylation. This was done using the FindMod tool and 5 filters that took advantage of the high number of replicate analyses per protein and the presence of overlapping peptides.
A total of 83 high-confidence lysine and arginine methylation sites were found in 66 proteins. Motif analysis revealed that many methylated sites were associated with MK, RGG/RXG/RGX or WXXXR motifs. Functionally, methylated proteins were significantly enriched for protein translation, ribosomal biogenesis and assembly and organellar organisation and were predominantly found in the cytoplasm and ribosome. Intriguingly, methylated proteins were seen to have a significantly longer half-life than proteins for which no methylation was found. Some 43% of methylated lysine sites were predicted to be amenable to ubiquitination, suggesting methyl-lysine might block the action of ubiquitin ligase.
This study suggests protein methylation to be quite widespread, albeit associated with specific functions. Large-scale tandem mass spectrometry analyses will help to further confirm the modifications reported here.
The methylation of proteins is of increasing biological interest. It is predominantly found on lysine and arginine residues, but has also been found on histidine, glutamic acid and on the carboxyl groups of proteins (reviewed in Grillo and Colombatto 2005). Methylation of lysine involves the addition of one to three methyl groups on the amino acid's ε-amine group, to form mono-, di- or tri-methyllysine. Its function is best understood in histones. Methylation on the tails of histone proteins, in conjunction with acetylation and phosphorylation, controls their interaction with other proteins and affects chromatin compaction and the up- or down-regulation of gene expression. In S. cerevisiae, lysine methylation is found in histone H3 and histone H4. Tri-methylation at H3K4 and H3K36 is positively correlated with gene activity, while methylation at H3K79 is involved in gene silencing [5, 6]. Histone H3K79 methylation is evolutionarily conserved and is involved in several pathways, including Sir protein-mediated heterochromatic gene silencing, meiotic checkpoint control, and the G1 and S phase DNA damage checkpoint functions of Rad9p [9, 10]. While studies of lysine methylation have mainly focused on histone proteins, several non-histone proteins are also known to be lysine-methylated. They are mainly ribosomal proteins or proteins involved in protein translation, and include Rpl12p [12, 13], Rpl23p [12, 14], Rpl42p, and eEF1Ap.
The methylation of arginine involves the addition of one or two methyl groups to the amino acid's guanidino group, forming mono- or di-methylarginine. It is predominantly known to be associated with RNA regulation and processing. In S. cerevisiae, Hmt1p is a type 1 arginine methyltransferase that catalyses the formation of mono- and asymmetric di-methylarginine. This enzyme is known to methylate a number of proteins that contain an RGG-motif; these include Npl3p, Hrp1p, Nab2p, Gar1p, Nop1p, Nsr1p, Yra1p, Sbp1p, and Hrb1p. These proteins have been implicated in poly(A)+ mRNA binding, processing and export, ribosome biogenesis [18–20] and gene silencing. Moreover, methylation is required for the nuclear export of the RNA binding proteins Npl3p, Hrp1p, and Nab2p [22, 23]. The repeated RGG-motif is known as an RNA-binding motif, and this also supports the role of arginine methylation in the regulation of mRNA binding. The methylation of nuclear shuttling proteins is suggested to weaken their binding with cargo proteins and disrupt their export from the nucleus. Arginine methylation is also known to facilitate or block protein-protein interactions. Arginine methylation of SmB protein facilitates the binding of tudor domains in SMN, SPF30, and TDRD3 proteins. In contrast, arginine methylation of Sam68 blocks the interaction of a nearby proline-rich motif with an SH3 domain, but not with a WW domain. More examples of methylarginine-regulated interactions are reviewed in McBride and Silver (2001) and Bedford and Clarke (2009).
There have been several studies to identify arginine- or lysine-methylated proteins on a proteome-wide scale. In the first of these studies, arginine-methylated protein complexes were purified from HeLa cell extracts using anti-methylarginine antibodies specific for RG-rich sequences. This resulted in the identification of over 200 arginine-methylated proteins, involved in pre-mRNA processing, protein translation, and DNA transcription. However, the actual methylation sites on these proteins remain unknown. The second study utilised stable isotope labelling by amino acids in cell culture (SILAC), in which [13CD3]methionine was converted to [13CD3]S-adenosyl methionine, the substrate for arginine and lysine methylation. Advantages of this method included increased confidence of identification, a capacity to distinguish between trimethylation and acetylation, which are near-isobaric, and the ability to quantify the relative changes in methylation status of a protein between two samples. In combination with anti-methyllysine and anti-methylarginine antibody immunoprecipitation techniques, Ong et al. (2004) were able to identify methylation on histones from HeLa cell extracts, such as on histone H3K27. Around 30 other proteins were also found to be methylated at RG-rich motifs, and most of these proteins are RNA binding or associated with mRNA processing pathways. The third study used anti-methyllysine antibodies to search for organ-specific lysine methylation in Mus musculus. Proteomic analysis of brain tissue extract by 2-D PAGE, western blotting, and MALDI-ToF peptide mass fingerprinting identified the following lysine-methylated proteins: neurofilament triplet-I protein, Hsc70 protein, creatine kinase, α-tubulin, α-actin, β-actin, and γ-actin. Furthermore, α-actin and creatine kinase were found to be methylated in muscle tissue.
The use of tandem mass spectrometry to discover new protein post-translational modifications is common . However, peptide mass fingerprinting can also be used to search for new PTM sites . The FindMod program  caters for this approach. It requires peptide mass spectra from a mostly pure protein, for example a spot from 2-D gel, and examines experimental peptide masses for differences in mass with theoretical peptides for that protein that correspond to post-translational modification. Peptides that are potentially modified are checked to see if they contain amino acids that can carry the modification. Where very high accuracy peptide mass measurements can be made, for example with new instruments like the prOTOF2000, high confidence predictions are possible. Parent-ion masses from tandem mass spectrometry data can also be used in FindMod, where it may serve as an initial screen for PTMs before employing more sophisticated and computationally expensive methods [36, 37].
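The mass-difference screen described above can be illustrated with a minimal Python sketch. It is not FindMod's implementation: the function name, the restriction to K and R, and the default tolerance of 0.04 Da are assumptions made for illustration. The only fixed quantity is the monoisotopic mass shift of one methyl group (addition of CH2, 14.01565 Da).

```python
# Monoisotopic mass added per methyl group (net addition of CH2), in Da.
METHYL = 14.01565

# Mass shifts for the two modifications considered in this study.
MASS_SHIFTS = {
    "monomethyl": METHYL,
    "dimethyl": 2 * METHYL,
}


def candidate_modifications(observed_mass, theoretical_mass, sequence, tolerance=0.04):
    """Return the modification names whose mass shift explains the observed mass.

    A candidate is reported only if the peptide contains a residue (K or R)
    that could carry the methyl group. Hypothetical helper, not FindMod code.
    """
    if not any(residue in sequence for residue in "KR"):
        return []
    delta = observed_mass - theoretical_mass
    return [
        name
        for name, shift in MASS_SHIFTS.items()
        if abs(delta - shift) <= tolerance
    ]


# Example with made-up masses: a peptide observed ~14.02 Da above its
# theoretical unmodified mass is a monomethylation candidate.
print(candidate_modifications(1014.02, 1000.00, "AAKGG"))
```

A real search would also have to consider every other modification whose mass shift falls within tolerance, which is what makes some matches ambiguous.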
Here we describe a strategy for the discovery of methylation on a global scale, using peptide mass fingerprinting data, and implement this to search for methylated lysine and arginine residues in the yeast proteome. A proteome-scale set of MALDI-ToF mass spectra  was analysed for putative methylated peptides. The application of 5 filters yielded high-confidence methylation sites that were then further investigated to understand where they are found in protein sequences and their likely function.
Large-scale methylation discovery in yeast peptide mass spectra
FindMod was used to analyse peptide mass spectra for 2,607 yeast proteins out of a total of ~6,500 (representing 40% of the total proteome) for the presence of mono- and di-methylation. A tailor-made mass tolerance was calculated for each spectrum to reduce spurious peptide matches; the average of this for all spectra was ± 0.04 Da. Of all the 24,105 FindMod queries, there were 17,471 matches to potentially methylated peptides (Figure 1). Five filtering strategies, used sequentially, were then applied to this set to find methylation sites of very high confidence. The first filter removed peptide masses that matched unmodified peptide sequences, as these masses are likely to arise from unmodified peptides. Conversely, peptide masses that did not match unmodified peptide sequences are likely to be modified, and these were analysed with the second filter. The second filter removed any peptides that contained D or E residues, as artifactual methylation may result from partial methyl esterification of D or E residues. The third filter was designed to take advantage of redundancy within each FindMod output, by removing one-off or spurious mass spectra. It searched for modifications that were found in two or more overlapping peptides (Figure 2), and took advantage of the reduced efficiency of tryptic cleavage at methylated residues, where overlapping peptides with missed cleavages were likely to be found. The fourth filter reduced FindMod false positives by considering whether modifications found by FindMod were unambiguous or ambiguous. An unambiguous modification had only one FindMod match against one query peptide mass, whereas an ambiguous modification had more than one match against a query mass (Additional File 1). For the peptide to be included in the final set of methylated peptides, at least one peptide in the overlapping peptides had to be an unambiguous peptide match.
The use of these 4 filters resulted in 169 high confidence methylated peptides, from 17,471 initial low confidence matches (Figure 1).
While overlapping peptides helped localise methylation sites to one or more peptides, they did not necessarily localise the methylation to one amino acid. To address this, we used a fifth filter. When two or more modified peptides that passed filters 1-4 were also found to overlap and share the same modification site, the modification was classified as high confidence and kept. Note that any results for lysine trimethylation were discarded from the study since it is near-isobaric to lysine acetylation. From this filtering process, we found 40 lysine-methylated proteins with 45 lysine methylation sites: 25 with mono- and 20 with di-methylation. Similarly, we found 31 arginine-methylated proteins with 38 arginine methylation sites: 20 with mono- and 18 with di-methylation. There were 5 proteins that contained both arginine and lysine methylation. The list of high confidence methylated proteins and methylation sites is shown in Table 1; additional information on these high confidence methylated peptides and methylation sites is shown in Additional file 2 and Additional file 3, respectively.
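The reason for discarding trimethylation can be made concrete with a short arithmetic sketch in Python. The monoisotopic shifts are standard values (CH2 for a methyl group, C2H2O for an acetyl group); the helper name `resolvable` and the use of the average 0.04 Da tolerance as the cut-off are illustrative assumptions.

```python
METHYL_SHIFT = 14.015650            # net addition of CH2, monoisotopic, in Da
ACETYL_SHIFT = 42.010565            # net addition of C2H2O, monoisotopic, in Da
TRIMETHYL_SHIFT = 3 * METHYL_SHIFT  # three methyl groups on one lysine


def resolvable(shift_a, shift_b, tolerance):
    """True if two mass shifts differ by more than the matching tolerance.

    If the gap is within tolerance, a mass lying exactly on one shift
    also matches the other, so the two modifications cannot be told apart.
    """
    return abs(shift_a - shift_b) > tolerance


gap = TRIMETHYL_SHIFT - ACETYL_SHIFT
print(f"trimethyl - acetyl = {gap:.4f} Da")
print("resolvable at 0.04 Da:", resolvable(TRIMETHYL_SHIFT, ACETYL_SHIFT, 0.04))
```

The gap is about 0.036 Da, which is below the average mass tolerance of 0.04 Da used here, so a peptide mass fingerprint alone cannot separate the two modifications.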
Confirmation of FindMod protein methylation
To establish the accuracy of our methylation discovery approach, we theoretically digested all known methylated proteins in Swiss-Prot and analysed the resulting peptides with our FindMod approach. We supplemented this with a larger set of theoretically methylated proteins. The average true positive rate for FindMod at 0.04 Da was 89%. For methylation sites in Swiss-Prot, FindMod had a true positive rate of 100% for monomethyl-K, 98% for dimethyl-K, and 76% for dimethyl-R (Table 2a). The true positive rate for monomethyl-R could not be accurately estimated since the number of test cases was insufficient for accurate evaluation. Similarly, the true positive rate for the artificial methylation set was 78% for monomethyl-K, 89% for dimethyl-K, and 90% for both monomethyl-R and dimethyl-R (Table 2b). Additional results for the evaluation of the true positive rate of FindMod are shown in Additional file 4.
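A theoretical digest of the kind used for this evaluation can be sketched as follows. This is a simplified illustration rather than the procedure actually used: it cleaves after every K or R (ignoring the usual exception before proline), allows one missed cleavage, and computes monoisotopic [M+H]+ masses from standard residue masses. All function names are hypothetical.

```python
import re

# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, max_missed=1):
    """Cleave C-terminal to K or R, allowing up to max_missed missed cleavages."""
    fragments = re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)
    peptides = []
    for start in range(len(fragments)):
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end <= len(fragments):
                peptides.append("".join(fragments[start:end]))
    return peptides


def mh_mass(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def in_mass_window(peptide, low=500.0, high=3000.0):
    """Keep only peptides whose mass lies in the range considered in the study."""
    return low <= mh_mass(peptide) <= high


# Made-up sequence, for illustration only.
print(tryptic_peptides("AAKGGRCC"))
```

Adding 14.01565 Da or 28.0313 Da to the mass of a peptide that contains K or R then gives the simulated mono- or di-methylated mass to submit to the search.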
To further assess the accuracy of the FindMod approach, methylation sites discovered by FindMod were cross-referenced with known methylation sites in the literature and databases. Whilst only a small number of proteins are documented as methylated in the literature, we confirmed 3 proteins (Ssb1p, Ssb2p, Tub2p) as methylated (Table 3). If we included methyl-lysine sites in peptides containing D and E, we also confirmed the methylation of Tef1p and Rpl23p. This included 3 lysine methylation sites (K30, K79, and K390) from Tef1p, and 1 lysine dimethylation site (K110) from Rpl23p  (Additional file 5). Furthermore, we found 15 methylated ribosomal proteins in S. cerevisiae, consistent with the presence of methylation sites in ribosomal proteins of eukaryotes, such as S. cerevisiae [12–15, 40], S. pombe , A. thaliana , and human [43–46].
Discovery rate of methylated peptides, unmodified peptides, and lysine and arginine-methylated residues
The discovery rate of a peptide is the frequency of protein identifications in which a particular peptide is observed. Methylated peptides with low discovery rates are likely to be sub-stoichiometric and partially methylated. It was predicted that there should be many more unmodified peptides than methylated peptides, and that methylated peptides would have a lower discovery rate since they are likely to be sub-stoichiometric. The discovery rate of high confidence methylated peptides was found to be significantly lower than that of unmodified peptides (p < 0.0001). The median discovery rate for unmodified peptides was 0.50, and the median value for arginine and lysine methylated peptides was 0.03. To check that the lower discovery rate of methylated residues was not due to differences in peptide ionisation efficiency, we examined if there was a correlation between the discovery rates of methylated and unmodified residues. In the set of results, there were 69 methylated residues for which the corresponding unmodified residues were also seen. The discovery rate of methylated residues was significantly but weakly correlated with the discovery rate of matching unmodified residues (Kendall's τ = 0.22, p < 0.01), consistent with expectation. A list of methylated proteins and the methylation sites discovered by FindMod is shown in Table 1. The discovery rates of all high confidence methylated peptides and methylation sites are shown in Additional file 2 and Additional file 3, respectively.
Biological function, sub-cellular localization, abundance and half-life of methylated proteins
Methylated proteins are known to be involved in several pathways, such as translation  and RNA processing . To investigate the function of the methylated proteins from yeast, gene ontology (GO) annotations for all yeast methylated proteins from FindMod analysis and Swiss-Prot were compared to non-methylated yeast proteins (Table 4). It was found that a number of biological processes were enriched with very high statistical significance, specifically translation, ribosome biogenesis and assembly, RNA metabolic process, and organelle organization and biogenesis. The molecular function of structural activity, RNA binding, and translation regulator activity were also significantly enriched. As may be expected from the above, methylated proteins were significantly enriched in the cellular components of the ribosome and cytoplasm.
Protein abundance data from Ghaemmaghami et al. (2003)  was used to compare the abundance of methylated proteins to non-methylated proteins. This was used to determine if lower abundance proteins, more likely to be involved in signal transduction and regulation , are methylated. Methylated proteins were found to have a higher median abundance of 11500, as compared to non-methylated proteins, which had a median abundance of 2220 (p < 0.0001, Figure 3a). Despite this, several methylated proteins of low abundance were seen including 5 proteins of less than 1000 molecules per cell. These included Snf2p (217 copies/cell), Snu114p (300 copies/cell), Mrpl20p (358 copies/cell), and Rpl3p (450 copies/cell). Examples of proteins with high abundance are Rp1Bp (265,000 copies/cell) and Tdh3p (169,000 copies/cell).
The methylation of lysine residues has been suggested to block their ubiquitination, leading to a longer protein half-life . To investigate this possibility, protein half-life data from Belle et al. (2006)  was used to compare the half-life of lysine-methylated proteins to non-methylated proteins. Interestingly, we found methylated proteins had a longer median half-life of 66 minutes, as compared to 43 minutes for non-methylated proteins (p = 0.012, Figure 3b). A striking difference between the methylated and non-methylated proteins was the absence of a group of proteins with very short half-life (see arrow in Figure 3b). Despite this, our approach also identified 18 methylated proteins with half-life less than 60 minutes. Examples of methylated proteins with shorter half-lives are Rrp5p (15 minutes), Ski3p (32 minutes), and Snu114p (52 minutes). Examples of proteins with long half-life are Utp22p (13,266 minutes) and Atp2p (6,627 minutes), although we note that these numbers may be erroneous estimations in the Belle et al. study (2006) . Although the abundance and half-lives of methylated proteins could be analysed more precisely by comparing methylated proteins to other proteins from the same GO slim biological process, this approach was limited by the relatively small number of methylated proteins (66 proteins) in the dataset. Methylated proteins mapped to 33 gene ontology biological process categories, with an average of 2 proteins per category, which was unsuitable for appropriate statistical analyses.
Interplay of methylation and other post-translational modifications
To see if lysine methylation might block ubiquitination, the Ubipred software  was used to predict if known methylated lysine sites are also subject to ubiquitination. The Ubipred software has an accuracy of 84.4% and is thus sufficiently reliable for this test. It was found that 43% of high-confidence lysine methylation sites were also predicted to be ubiquitination sites. This result lends support to the hypothesis that methylation might block ubiquitination, potentially prolonging the half-life of lysine-methylated proteins.
It has recently been reported that the methylation of arginine can regulate the phosphorylation (or dephosphorylation) of some proteins [55–61]. To investigate whether there is evidence of interplay between arginine methylation and phosphorylation in S. cerevisiae, we examined the proportion of arginine-methylated proteins that are known to be phosphorylated in databases and in the literature. It was found that 94% (30/32) of arginine-methylated yeast proteins are known to be phosphorylated. This is a considerable increase over the 38% (2,548/6,709) of all S. cerevisiae proteins known to be phosphorylated and suggests a possible interplay of arginine methylation and phosphorylation [55–61].
Arginine and lysine methylation motifs
To determine if methylation sites are enriched in specific sequence-motifs, all yeast methylation sites from FindMod analysis and the Swiss-Prot database were analysed to find enriched sequence-motifs. Methionine was found to be at position -1 from lysine methylation in 5 FindMod sites and two additional methylation sites previously documented in S. cerevisiae (Table 5). This enrichment of methionine was of very high statistical significance (p = 1.18 × 10⁻⁶) as compared to that expected in any random sequence of yeast proteins. By contrast, residues found to be significantly enriched adjacent to arginine methylation included W at position -4 (p = 1.50 × 10⁻⁷), and G at position -3 (p = 6.08 × 10⁻⁶). While it was previously known that arginine methylation is found in RGG motifs, Wooderchak et al. (2008) showed that arginine methylation is also found in RXG and RGX motifs. No known S. cerevisiae methylation sites documented in Swiss-Prot contained the RGG, RGX, or RXG motifs. However, FindMod found 7 methylation sites with the RXG or RGX motifs. Two methylation sites, Tdh3p dimethyl-R11 and Rpl4Ap dimethyl-R84, matched the GXXRXG motif, which conforms to the known RXG motif and the additional GXXR motif found in this study. Three methylation sites had the novel WXXXR motif.
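The motifs named above are defined by residues at fixed offsets from the methylated site, so checking a site against them is straightforward. The following Python sketch is illustrative only: the function name is hypothetical, the test sequences are made up, and it does not reproduce the statistical enrichment analysis.

```python
def motifs_at_site(sequence, position):
    """Return the motifs from the text that match a methylated K or R.

    `position` is 1-based, as in site names such as R11 or K30.
    """
    i = position - 1
    residue = sequence[i]

    def at(offset):
        # Residue at the given offset from the site, or "" if out of range.
        j = i + offset
        return sequence[j] if 0 <= j < len(sequence) else ""

    found = []
    if residue == "K" and at(-1) == "M":
        found.append("MK")
    if residue == "R":
        if at(1) == "G" and at(2) == "G":
            found.append("RGG")
        elif at(1) == "G" and at(2):
            found.append("RGX")
        elif at(2) == "G":
            found.append("RXG")
        if at(-3) == "G":
            found.append("GXXR")
        if at(-4) == "W":
            found.append("WXXXR")
    return found


# Made-up sequences, for illustration only.
print(motifs_at_site("AMKA", 3))
print(motifs_at_site("GAARAG", 4))
print(motifs_at_site("WAAARGG", 5))
```

A site can match more than one motif, as in the GXXRXG case described for Tdh3p R11 and Rpl4Ap R84.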
Large-scale discovery of lysine and arginine methylation sites
In this study, 45 lysine methylation sites and 38 arginine methylation sites were identified in 66 proteins in the S. cerevisiae proteome. These include 4 proteins previously known to be methylated in yeast or in other organisms and 15 proteins that are functionally related to others known to be methylated. Our findings support earlier studies [31–33] that suggested methylation to be quite widespread. Whilst many of our methylation sites are novel and have not been confirmed by MS-MS, the filters and replicate analyses we used in association with the FindMod tool provided a robust means by which protein methylation could be detected. The false positive rate was estimated to be 11% at 0.04 Da mass error. Notwithstanding this, it should be noted that whilst we did study 2,607 proteins from yeast, this is only ~40% of the total yeast proteome. Therefore, we expect that up to 60% of methylated proteins would have been missed. Further methylation sites may have been missed due to difficulties in mass spectrometric detection; an example is methylarginine, which is often found in arginine- and glycine-rich regions that produce tryptic peptides that are too small for routine MALDI-TOF analysis.
Discovery rates may reflect the sub-stoichiometric nature of methylation
Previous research has highlighted that methylated peptides are difficult to discover, and this is made more difficult because methylation is sub-stoichiometric. For example, sub-stoichiometric levels of methylation were observed in the human heterogeneous nuclear ribonucleoprotein K (hnRNP K), in which < 33% of hnRNP K was asymmetrically dimethylated at R303, and < 10% was monomethylated at R287. Our results from FindMod analysis support these observations since the proportion of methylated peptides seen for any protein was very low. The sub-stoichiometric nature of methylation events was also supported by a weak but significant correlation between the discovery rates of modified and unmodified paired peptides. However, there may be explanations, other than biological, for the lower discovery rate of modified peptides. These include inefficient trypsin cleavage C-terminal to methylated lysine and arginine residues, and differences in MALDI-ToF ionisation of the methylated peptides, as seen with different proteotypic peptides.
Methylated proteins are involved in specific biological functions and processes, are higher in abundance and have longer half-life
Methylated proteins were found to be enriched for specific biological processes, molecular functions and sub-cellular localizations. Firstly, methylated proteins were enriched in translation, ribosome biogenesis and assembly. This is consistent with previous studies in which methylated proteins have been linked to translation in Escherichia coli, S. cerevisiae, and Schizosaccharomyces pombe. Ribosomal proteins are also known to show lysine or arginine methylation, for example the ribosomal proteins L10a, L12, and L26a of Arabidopsis. Secondly, the methylated proteins described here were found to be involved in RNA metabolic processes and in RNA binding. This is consistent with the function of several proteins known to be methylated at RG-rich motifs. The methylation of arginine in RG-rich motifs is conserved in human, and their RNA binding activity is also conserved. One such example is the fragile X mental retardation protein (FMRP). Thirdly, our methylated proteins were enriched in the ribosome and the cytoplasm. This is consistent with the sites of translation and association with RNA inside the cell [22, 23]. Whilst the lack of enrichment of methylated proteins in the nucleus and nucleolus was not expected, this may have arisen due to our reduced set of proteins for analysis (40% of the yeast proteome). In addition, nuclear proteins such as histones and Npl3p are known to have peptides with multiple modification sites, but these were not searched for in this study. Methylated proteins found in this study were significantly higher in abundance than proteins currently known to be non-methylated. This is partly explained by ribosomal proteins and proteins involved in translation, some of which we found to be methylated, being of very high abundance [50, 64]. Methylated proteins were also found to have a longer median half-life. This may be due to their role in translation, where ribosomal proteins are generally stable.
Interplay of methylation and other post-translational modifications
The methylation of lysine is known to block the action of ubiquitin ligase , preventing proteins from degradation via the ubiquitin/proteasome system [52, 66]. Our observation of a distinct group of low half-life proteins in S. cerevisiae, none of which were methylated, suggests that lysine methylation might be on many proteins and prevent their ubiquitination. The limited number of ubiquitination sites currently known on yeast proteins [67, 68] makes it currently difficult to check if lysine methylation, as found in this study, is found on residues that can also be poly-ubiquitinated. However, our prediction of putative ubiquitination sites  showed that 43% of the lysine methylation sites in 40 proteins may be ubiquitinated.
Several studies suggested that there is interplay between arginine methylation and phosphorylation of some proteins [55–61]. Arginine methylation may antagonise phosphorylation [56, 57], act as a switch to enable the binding of phosphatase to encourage dephosphorylation , or encourage phosphorylation . On the other hand, phosphorylation can either interfere with arginine methylation [58, 61], or promote the recruitment of arginine methyltransferase . We found that the majority of arginine-methylated proteins in our study (30 out of 32 or 94%) are known from the literature to be phosphorylated, suggesting an interplay between arginine methylation and phosphorylation in these proteins. However these arginine methylation and phosphorylation sites were not necessarily directly adjacent in the protein sequence.
Arginine and lysine methylation motifs
Motif analysis showed that many methylation sites described here conform to previously known motifs. For example, 7 arginine methylation sites discovered by FindMod conformed to the known RXG and RGX motifs. Arginine methylation sites were also enriched in GXXR motifs, which correlated with the enrichment of glycine residues near arginine methylation sites. In addition, two experimentally verified methylated sites in Pfk2p and Rpl23Ap annotated in Swiss-Prot, along with 5 FindMod sites, suggest the existence of an MK lysine methylation motif. The discovery of the novel enriched methylation motif WXXXR supports the possibility that there are more methylation sites to be found in S. cerevisiae. These findings also raise an interesting question concerning which motifs are methylated by specific methyltransferases. The methyltransferases responsible for most methylation sites are also unknown (e.g. Tef1p K30, Pfk2p K180), and the function of several methyltransferase proteins in S. cerevisiae remains poorly characterized. Therefore, more experiments are required to elucidate the function of methylation in S. cerevisiae.
This study is a step towards the definition of the methyl proteome of S. cerevisiae. It will be useful to guide future experiments on its predominance and role in the cell. For example, experiments are needed to elucidate the function of methylation and how each site is regulated, which with the exception of histone methylation is largely unknown. Secondly, experiments to investigate whether methylation sites overlap with poly-ubiquitination sites, and therefore prevent protein degradation via the ubiquitin/proteasome pathway could be undertaken. Thirdly, it will be important to understand whether the functions of methylated proteins are co-regulated by ubiquitination, phosphorylation or other post-translational modifications. Finally, the ultimate goal in studying methylation should be to build networks of methylated proteins, their interaction partners and modifying enzymes to elucidate their dynamics as a system, similar to previous work on protein phosphorylation [70–72].
MALDI-ToF mass spectra for S. cerevisiae
This study employed MALDI-ToF peptide mass fingerprinting spectra from the large-scale characterization of protein complexes in S. cerevisiae. There were 36,854 peptide mass spectra containing 1.2 million empirical masses, with an average mass error of 0.02 Da. These were from 2,607 proteins out of ~6,500 proteins (40%) in the yeast proteome, with each protein represented by at least 3 spectra and by 11 spectra on average. Peptide masses corresponding to unmodified peptides or tryptic peptides of porcine trypsin were removed, as were peptides less than 500 Da.
Tailor-made mass tolerance for each empirical spectrum
An error threshold was calculated for each of the 36,854 spectra; this was possible as the identity of all proteins was known. For each spectrum, the mass differences between the empirical and theoretical mass of all known unmodified peptides were calculated. The average and median mass tolerance was 0.04 Da. To ensure high accuracy of methylation discovery, only spectra with a mass error (Additional file 6) that was lower than 0.1 Da were used for the identification of methylation sites.
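A per-spectrum tolerance of this kind can be sketched in Python as follows. The text does not state how the individual mass differences were combined into one threshold, so taking the largest absolute error among the matched unmodified peptides is an assumption made here for illustration, as are the function names.

```python
def spectrum_tolerance(matched_pairs):
    """Estimate a mass tolerance (Da) for one spectrum.

    matched_pairs holds (empirical_mass, theoretical_mass) tuples for peaks
    assigned to known unmodified peptides of the identified protein.
    Assumption: the tolerance is the largest absolute mass error observed.
    """
    errors = [abs(empirical - theoretical) for empirical, theoretical in matched_pairs]
    if not errors:
        return None  # no unmodified peptides matched, so no estimate is possible
    return max(errors)


def usable_for_site_discovery(tolerance, cutoff=0.1):
    """Apply the 0.1 Da cut-off used for identifying methylation sites."""
    return tolerance is not None and tolerance < cutoff


# Made-up masses, for illustration only.
pairs = [(1000.02, 1000.00), (1500.00, 1500.03)]
tolerance = spectrum_tolerance(pairs)
print(round(tolerance, 3), usable_for_site_discovery(tolerance))
```

Calibrating each spectrum against its own unmodified peptides lets well-calibrated spectra be searched with a tight window while poorly calibrated ones are excluded.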
FindMod analysis of yeast proteins
Each peptide mass spectrum was analysed with FindMod. A bulk submission web interface to FindMod was developed (http://ca.expasy.org/tools/findmod/findmod_batch.html). Each FindMod query used the UniProt accession number for the protein identified through peptide mass fingerprinting (from Gavin et al., 2006), the experimental peptide masses for this protein and the tailor-made mass tolerance in Da. Other FindMod parameters were the use of monoisotopic masses, a maximum of 1 missed cleavage by trypsin, no amino acid substitutions, peptides as [M+H]+ ions, and the possible presence of oxidised methionine or tryptophan. The peptide masses were matched to theoretical peptides generated from the precursor sequence. The program searched for 71 types of post-translational modifications in all experimental peptide masses, including mono-, di-, and tri-methylation (http://www.expasy.ch/tools/findmod/findmod_masses.html). Matches to 6 types of modifications were removed from the analyses, as they are not found in S. cerevisiae or may lead to many false positives due to their low mass; for more details see Additional file 6. The Swiss-Prot database version 51.6 and TrEMBL version 34.6 were used for the FindMod matches.
Filters to remove low quality methylation sites
For the methylated peptides to be included in the analysis, they needed to pass the following 5 filters. The peptides 1) could not match an unmodified peptide, 2) had to contain no Asp or Glu residues, and 3) had to have at most one missed tryptic cleavage. In addition, 4) the peptide had to have two or more overlapping peptides, and at least one peptide in the overlapping peptides had to be an unambiguous peptide match. 5) When two or more modified peptides that passed filters 1-4 were also found to overlap and share the same modification site, the modification was classified as high confidence and kept. The use of overlapping peptides to improve the reliability of methylation site assignment is facilitated by methylation sites found at the C-terminus of peptides. Trypsin cleavage at methylated arginine and lysine has been observed in many LC-MS/MS experiments [32, 74–77], and is less efficient than at non-methylated residues. A list of tryptic peptides with C-terminal methylated amino acids, identified by LC-MS/MS, is shown in Additional file 7.
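The sequential filters can be sketched in Python as a small pipeline. This is a simplified illustration under stated assumptions, not the code used in the study: the `Match` record and its field names are hypothetical, and the overlap requirement is reduced to "two or more distinct peptides assigned the same modification at the same protein position, at least one of them unambiguous".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    peptide: str                   # peptide sequence
    site: int                      # 1-based protein position of the putative methylated K/R
    modification: str              # e.g. "monomethyl" or "dimethyl"
    mass_matches_unmodified: bool  # observed mass also fits an unmodified peptide
    unambiguous: bool              # only one match was reported for this query mass
    missed_cleavages: int = 0


def high_confidence_sites(matches):
    """Return sorted (site, modification) pairs that survive all filters."""
    kept = [
        m for m in matches
        if not m.mass_matches_unmodified      # filter 1: not an unmodified peptide
        and not set(m.peptide) & {"D", "E"}   # filter 2: no Asp or Glu residues
        and m.missed_cleavages <= 1           # filter 3: at most one missed cleavage
    ]
    groups = defaultdict(list)
    for m in kept:
        groups[(m.site, m.modification)].append(m)
    return sorted(
        key for key, group in groups.items()
        if len({m.peptide for m in group}) >= 2  # overlapping peptides share the site
        and any(m.unambiguous for m in group)    # at least one unambiguous match
    )


# Made-up matches, for illustration only.
example = [
    Match("AAKGGR", 3, "monomethyl", False, True, 1),
    Match("AAK", 3, "monomethyl", False, False, 0),
    Match("DEK", 10, "dimethyl", False, True),
    Match("LLK", 20, "dimethyl", False, True),
]
print(high_confidence_sites(example))
```

In the example, only the site supported by two overlapping peptides is retained; the peptide containing D and E and the site seen in a single peptide are both dropped.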
Calculation of discovery rate
The discovery rate for an unmodified peptide was calculated as the fraction of protein identifications in which the unmodified peptide was observed. In the case of duplicated genes, the counts of protein identifications were summed because peptide mass fingerprinting cannot distinguish between proteins that do not differ in primary sequence. The discovery rate for a particular unmodified residue in the protein was calculated as the sum of the discovery rates of all the unmodified peptides that contain the residue. Discovery rates were calculated in the same way for methylated peptides and methylated residues. Partially methylated peptides are likely to have a low discovery rate. While only mass spectra with a maximum mass tolerance of 0.1 Da were used for finding the methylation sites, to limit the false positive rate, all available mass spectra with a mass tolerance of up to 1.5 Da were used for the calculation of discovery rate, because a larger sample of mass spectra was needed for this calculation.
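The two discovery rates can be written down directly from these definitions. The sketch below uses hypothetical names and assumes each protein identification is represented as the set of peptides observed in that experiment; for duplicated genes, the identification lists of the paralogues would simply be concatenated before the calculation.

```python
def peptide_discovery_rate(peptide, identifications):
    """Fraction of protein identifications in which the peptide was observed.

    identifications: list of sets, one set of observed peptides per
    identification of the protein.
    """
    if not identifications:
        return 0.0
    observed = sum(1 for peptides in identifications if peptide in peptides)
    return observed / len(identifications)


def residue_discovery_rate(position, peptide_spans, identifications):
    """Sum of the discovery rates of all peptides that contain the residue.

    peptide_spans: dict mapping peptide -> (start, end), 1-based inclusive.
    """
    return sum(
        peptide_discovery_rate(peptide, identifications)
        for peptide, (start, end) in peptide_spans.items()
        if start <= position <= end
    )


# Four identifications of the same protein, with the peptides seen in each.
identifications = [{"AAK", "AAKGGR"}, {"AAK"}, {"GGR"}, set()]
spans = {"AAK": (1, 3), "GGR": (4, 6), "AAKGGR": (1, 6)}
print(residue_discovery_rate(2, spans, identifications))
```

Because the residue-level value is a sum over overlapping peptides, it is not bounded by 1 and is better read as a score than as a probability.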
Evaluation of the true positive and false positive rate
Swiss-Prot entries with known lysine and arginine methylation sites were obtained from Swiss-Shop (http://au.expasy.org/swiss-shop/) for Swiss-Prot release 57.2, by searching the MOD_RES field using the keywords 'methyllysine' and 'methylarginine'. S. cerevisiae protein sequences were downloaded from Swiss-Prot using the query 'organism:4932'. The annotations of known methylation sites were obtained from the MOD_RES field of the Swiss-Prot entry, and the type of methylation was determined from the standard RESID nomenclature. The proteins were processed into mature forms where appropriate; these contain no signal peptides, propeptides or intein regions, and consist only of protein chains annotated by the 'CHAIN' field of the Swiss-Prot entry. For each M or W residue in a peptide, the mass of methionine or tryptophan oxidation was added to the total mass of the peptide. Only methylated peptides with a maximum of one missed cleavage and with masses between 500 and 3,000 Da were used. Since lysine trimethylation is near-isobaric to lysine acetylation, trimethylation was not included in the analysis.
Two in silico test sets, the known methylation set and the artificial methylation set, were used to evaluate the true positive rate of FindMod for the discovery of mono- and di-methylation on arginine and lysine residues. The known methylation test set contained known lysine and arginine methylation sites from Swiss-Prot. The set of sequences from which methylation sites were taken was non-redundant at the 90% identity level, generated using UniRef90. This test set included 883 known mono- and di-methylation sites. The artificial mono- and di-methylation sites on lysine and arginine residues were generated by simulated methylation of theoretical unmodified peptides. The artificial test set was larger than the known methylation test set, to allow more accurate estimation of the true positive rate. Approximately 6% of lysine residues from S. cerevisiae protein sequences were randomly sampled to generate artificially methylated peptides for monomethyl-K. The sampling procedure was repeated for dimethyl-K, monomethyl-R, and dimethyl-R. This second test set, the artificial methylation set, contained 36,594 artificial mono- and di-methylation sites.
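Two points from this section lend themselves to a short worked example: why trimethylation was set aside, and how sites might be sampled for the artificial set. The sketch below uses standard monoisotopic atomic masses; the function names, the fixed random seed and the rounding of the 6% fraction are illustrative assumptions, not details taken from the original analysis.

```python
import random

# Monoisotopic atomic masses in Da (standard values).
C, H, O = 12.0, 1.00782503, 15.99491462

TRIMETHYL = 3 * C + 6 * H     # net addition of three methyl groups (C3H6)
ACETYL = 2 * C + 2 * H + O    # net addition of one acetyl group (C2H2O)
DELTA = TRIMETHYL - ACETYL    # about 0.036 Da


def both_can_match(tolerance):
    """True if one observed mass can lie within tolerance of both candidates."""
    return DELTA <= 2 * tolerance


def sample_sites(sequence, residue, fraction, rng):
    """Randomly pick roughly `fraction` of the positions holding `residue`."""
    positions = [i for i, aa in enumerate(sequence) if aa == residue]
    k = round(fraction * len(positions))
    return sorted(rng.sample(positions, k))


rng = random.Random(0)  # fixed seed so the sampling is repeatable
print(f"trimethyl - acetyl = {DELTA:.5f} Da")
print("ambiguous at 0.04 Da:", both_can_match(0.04))
print("sampled K positions:", sample_sites("AK" * 100, "K", 0.06, rng))
```

The mass difference of about 0.036 Da is smaller than the 0.04 Da median tolerance quoted for the empirical data, so a peptide mass fingerprint at that tolerance cannot separate the two modifications.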
The true positive rate of FindMod, with the 5 filters described above, was evaluated using known methylation sites and artificial methylation sites. Removal of peptides containing D or E residues was not required since no artifactual methylation on D or E residues was introduced into the in silico test sets. The true positive rate was evaluated at a mass tolerance of 0.04 Da, since this was the median mass tolerance for all empirical peptide masses. For each test set, a true positive FindMod match required the residue, the sequence position and the type of methylation to be correctly matched. The true positive rate of FindMod was calculated as the number of true positive matches divided by the sum of the number of false positive matches and the number of true positive matches, expressed as a percentage.
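The calculation defined above is small enough to state as code. In this sketch, with hypothetical names, each site is a tuple of protein, position, residue and methylation type, so that a match counts as a true positive only when all of these agree. Note that the quantity as defined in the text, TP / (TP + FP), corresponds to what is often called precision.

```python
def true_positive_rate(predicted, known):
    """Percentage of predicted sites that exactly match a known site.

    predicted, known: sets of (protein, position, residue, methylation_type).
    """
    true_positives = len(predicted & known)
    false_positives = len(predicted - known)
    total = true_positives + false_positives
    if total == 0:
        return 0.0
    return 100.0 * true_positives / total


known = {("P1", 10, "K", "mono"), ("P1", 20, "R", "di")}
predicted = {
    ("P1", 10, "K", "mono"),  # correct residue, position and type
    ("P1", 20, "R", "mono"),  # right site, wrong methylation type
    ("P2", 5, "K", "di"),     # site not in the test set
}
print(true_positive_rate(predicted, known))
```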
Arginine and lysine methylation motif analysis
Ten amino acid residues N-terminal and C-terminal to each methylation site were included in the motif analysis. The number of times each amino acid occurs at each of these positions was counted. For any methylation site less than 10 residues from the N- or C-terminus of the protein, positions beyond the limit of the sequence were disregarded. To measure whether an amino acid was significantly enriched at each position, a p-value was calculated using the prop.test function in the R statistical package. A one-sided statistical test was used, with an alternative hypothesis that there was an enrichment of amino acid frequency over the average frequency. Bonferroni's correction was used to correct the p-value calculated by prop.test to reduce false positives.
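The counting and testing steps can be sketched as follows. The analysis itself used prop.test in R; the Python function below is intended to mirror the one-sample, continuity-corrected form of that test with a one-sided "greater" alternative, which is an assumption about how the function was called. The background frequency passed as p0 and the number of tests used for the Bonferroni correction are likewise left to the caller.

```python
import math
from collections import Counter


def window_counts(sequences_and_sites, flank=10):
    """Count amino acids at each offset around the methylation sites.

    sequences_and_sites: iterable of (sequence, site) with a 0-based site.
    Offsets that fall outside the sequence are disregarded.
    """
    counts = {off: Counter() for off in range(-flank, flank + 1) if off != 0}
    for sequence, site in sequences_and_sites:
        for off in counts:
            i = site + off
            if 0 <= i < len(sequence):
                counts[off][sequence[i]] += 1
    return counts


def one_sided_prop_test(x, n, p0):
    """P-value for enrichment: observed proportion x/n greater than p0.

    Normal approximation with continuity correction.
    """
    if n == 0 or not 0.0 < p0 < 1.0:
        return 1.0
    diff = abs(x - n * p0)
    yates = min(0.5, diff)
    z = math.copysign((diff - yates) / math.sqrt(n * p0 * (1.0 - p0)), x - n * p0)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the normal


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


counts = window_counts([("AAAKGGG", 3), ("KGA", 0)], flank=2)
n_at_plus1 = sum(counts[1].values())
p = one_sided_prop_test(counts[1]["G"], n_at_plus1, p0=0.05)
print(bonferroni(p, n_tests=20 * 4))  # 20 amino acids x 4 positions here
```

With ten flanking positions on each side and 20 amino acids, a full analysis of this form would involve 400 tests per residue type, so uncorrected p-values would be expected to produce a number of spurious motif positions.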
Functional analysis and statistical tests
Functional data co-analysed with modifications were protein abundance, protein half-life, Gene Ontology (GO) slim (from the Saccharomyces Genome Database, ftp://ftp.yeastgenome.org/yeast/) and protein complexes. Nonparametric tests were used for all statistical analyses. Protein abundance data, in copies per cell, were from Ghaemmaghami et al. (2003). Protein half-life data, in minutes, were from Belle et al. (2006). To investigate whether lysine methylation might block ubiquitination, the UbiPred software was used to predict whether known methylated lysine sites are also subject to ubiquitination. To investigate whether arginine-methylated proteins were co-regulated by phosphorylation, the Swiss-Prot database release 57.2 was examined to see if methylated proteins also had experimentally determined threonine, serine or tyrosine phosphorylation sites. Mann-Whitney tests, a non-parametric substitute for Student's t-test, were used to compare two samples. Kendall's correlation coefficient, a non-parametric substitute for Pearson's correlation coefficient, was used to measure the significance of the correlation between two samples. GO slim term enrichment was assessed using Fisher's exact test with Bonferroni correction. All statistical analyses were performed using the R statistical package version 2.2.1.
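As an illustration of the GO slim enrichment step, the sketch below computes a one-sided Fisher's exact test from the hypergeometric distribution and applies a Bonferroni correction. The analysis itself was run in R; the function names here are hypothetical, and the choice of a one-sided (enrichment) alternative is an assumption, since the text does not state which alternative was used.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k).

    N: proteins in the background set
    K: background proteins annotated with the GO slim term
    n: methylated proteins
    k: methylated proteins annotated with the term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_value, n_terms):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_terms)


# Toy numbers only: 10 proteins, 5 annotated, 5 methylated, 3 in both groups.
p = fisher_enrichment_p(k=3, n=5, K=5, N=10)
print(p, bonferroni(p, n_terms=40))
```

The Mann-Whitney and Kendall tests mentioned above are not reproduced here; both are available in R as wilcox.test and cor.test with method = "kendall".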
LC-MS/MS: liquid chromatography tandem mass spectrometry
MALDI-TOF: matrix assisted laser desorption ionisation - time of flight
Grillo MA, Colombatto S: S-adenosylmethionine and protein methylation. Amino Acids. 2005, 28 (4): 357-362. 10.1007/s00726-005-0197-6.
Strahl BD, Allis CD: The language of covalent histone modifications. Nature. 2000, 403 (6765): 41-45. 10.1038/47412.
Garcia BA, Hake SB, Diaz RL, Kauer M, Morris SA, Recht J, Shabanowitz J, Mishra N, Strahl BD, Allis CD, Hunt DF: Organismal differences in post-translational modifications in histones H3 and H4. The Journal of biological chemistry. 2007, 282 (10): 7641-7655. 10.1074/jbc.M607900200.
Pokholok DK, Harbison CT, Levine S, Cole M, Hannett NM, Lee TI, Bell GW, Walker K, Rolfe PA, Herbolsheimer E, Zeitlinger J, Lewitter F, Gifford DK, Young RA: Genome-wide map of nucleosome acetylation and methylation in yeast. Cell. 2005, 122 (4): 517-527. 10.1016/j.cell.2005.06.026.
Briggs SD, Xiao T, Sun ZW, Caldwell JA, Shabanowitz J, Hunt DF, Allis CD, Strahl BD: Gene silencing: trans-histone regulatory pathway in chromatin. Nature. 2002, 418 (6897): 498. 10.1038/nature00970.
Frederiks F, Tzouros M, Oudgenoeg G, van Welsem T, Fornerod M, Krijgsveld J, van Leeuwen F: Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nature structural & molecular biology. 2008, 15 (6): 550-557. 10.1038/nsmb.1432.
van Leeuwen F, Gafken PR, Gottschling DE: Dot1p modulates silencing in yeast by methylation of the nucleosome core. Cell. 2002, 109 (6): 745-756. 10.1016/S0092-8674(02)00759-6.
San-Segundo PA, Roeder GS: Role for the silencing protein Dot1 in meiotic checkpoint control. Mol Biol Cell. 2000, 11 (10): 3601-3615.
Giannattasio M, Lazzaro F, Plevani P, Muzi-Falconi M: The DNA damage checkpoint response requires histone H2B ubiquitination by Rad6-Bre1 and H3 methylation by Dot1. The Journal of biological chemistry. 2005, 280 (11): 9879-9886. 10.1074/jbc.M414453200.
Wysocki R, Javaheri A, Allard S, Sha F, Cote J, Kron SJ: Role of Dot1-dependent histone H3 methylation in G1 and S phase DNA damage checkpoint functions of Rad9. Molecular and cellular biology. 2005, 25 (19): 8430-8443. 10.1128/MCB.25.19.8430-8443.2005.
Polevoda B, Sherman F: Methylation of proteins involved in translation. Mol Microbiol. 2007, 65 (3): 590-606. 10.1111/j.1365-2958.2007.05831.x.
Lhoest J, Lobet Y, Costers E, Colson C: Methylated proteins and amino acids in the ribosomes of Saccharomyces cerevisiae. Eur J Biochem. 1984, 141 (3): 585-590. 10.1111/j.1432-1033.1984.tb08233.x.
Porras-Yakushi TR, Whitelegge JP, Clarke S: A novel SET domain methyltransferase in yeast: Rkm2-dependent trimethylation of ribosomal protein L12ab at lysine 10. The Journal of biological chemistry. 2006, 281 (47): 35835-35845. 10.1074/jbc.M606578200.
Porras-Yakushi TR, Whitelegge JP, Clarke S: Yeast ribosomal/cytochrome c SET domain methyltransferase subfamily: identification of Rpl23ab methylation sites and recognition motifs. The Journal of biological chemistry. 2007, 282 (17): 12368-12376. 10.1074/jbc.M611896200.
Itoh T, Wittmann-Liebold B: The primary structure of protein 44 from the large subunit of yeast ribosomes. FEBS Lett. 1978, 96 (2): 399-402. 10.1016/0014-5793(78)80447-5.
Cavallius J, Zoll W, Chakraburtty K, Merrick WC: Characterization of yeast EF-1 alpha: non-conservation of post-translational modifications. Biochimica et biophysica acta. 1993, 1163 (1): 75-80.
Xu C, Henry PA, Setya A, Henry MF: In vivo analysis of nucleolar proteins modified by the yeast arginine methyltransferase Hmt1/Rmt1p. RNA. 2003, 9 (6): 746-759. 10.1261/rna.5020803.
Russell ID, Tollervey D: NOP3 is an essential yeast protein which is required for pre-rRNA processing. J Cell Biol. 1992, 119 (4): 737-747. 10.1083/jcb.119.4.737.
Kondo K, Inouye M: Yeast NSR1 protein that has structural similarity to mammalian nucleolin is involved in pre-rRNA processing. The Journal of biological chemistry. 1992, 267 (23): 16252-16258.
Lee WC, Zabetakis D, Melese T: NSR1 is required for pre-rRNA processing and for the proper maintenance of steady-state levels of ribosomal subunits. Molecular and cellular biology. 1992, 12 (9): 3865-3871.
Loo S, Laurenson P, Foss M, Dillin A, Rine J: Roles of ABF1, NPL3, and YCL54 in silencing in Saccharomyces cerevisiae. Genetics. 1995, 141 (3): 889-902.
Green DM, Marfatia KA, Crafton EB, Zhang X, Cheng X, Corbett AH: Nab2p is required for poly(A) RNA export in Saccharomyces cerevisiae and is regulated by arginine methylation via Hmt1p. The Journal of biological chemistry. 2002, 277 (10): 7752-7760. 10.1074/jbc.M110053200.
Shen EC, Henry MF, Weiss VH, Valentini SR, Silver PA, Lee MS: Arginine methylation facilitates the nuclear export of hnRNP proteins. Genes & development. 1998, 12 (5): 679-691. 10.1101/gad.12.5.679.
Kiledjian M, Dreyfuss G: Primary structure and binding activity of the hnRNP U protein: binding RNA through RGG box. The EMBO journal. 1992, 11 (7): 2655-2664.
Dolzhanskaya N, Merz G, Aletta JM, Denman RB: Methylation regulates the intracellular protein-protein and protein-RNA interactions of FMRP. J Cell Sci. 2006, 119 (Pt 9): 1933-1946. 10.1242/jcs.02882.
McBride AE, Cook JT, Stemmler EA, Rutledge KL, McGrath KA, Rubens JA: Arginine methylation of yeast mRNA-binding protein Npl3 directly affects its function, nuclear export, and intranuclear protein interactions. The Journal of biological chemistry. 2005, 280 (35): 30888-30898. 10.1074/jbc.M505831200.
Cote J, Richard S: Tudor domains bind symmetrical dimethylated arginines. The Journal of biological chemistry. 2005, 280 (31): 28476-28483. 10.1074/jbc.M414328200.
Bedford MT, Frankel A, Yaffe MB, Clarke S, Leder P, Richard S: Arginine methylation inhibits the binding of proline-rich ligands to Src homology 3, but not WW, domains. The Journal of biological chemistry. 2000, 275 (21): 16030-16036. 10.1074/jbc.M909368199.
McBride AE, Silver PA: State of the arg: protein methylation at arginine comes of age. Cell. 2001, 106 (1): 5-8. 10.1016/S0092-8674(01)00423-8.
Bedford MT, Clarke SG: Protein arginine methylation in mammals: who, what, and why. Molecular cell. 2009, 33 (1): 1-13. 10.1016/j.molcel.2008.12.013.
Boisvert FM, Cote J, Boulanger MC, Richard S: A proteomic analysis of arginine-methylated protein complexes. Mol Cell Proteomics. 2003, 2 (12): 1319-1330. 10.1074/mcp.M300088-MCP200.
Ong SE, Mittler G, Mann M: Identifying and quantifying in vivo methylation sites by heavy methyl SILAC. Nat Methods. 2004, 1 (2): 119-126. 10.1038/nmeth715.
Iwabata H, Yoshida M, Komatsu Y: Proteomic analysis of organ-specific post-translational lysine-acetylation and -methylation in mice by use of anti-acetyllysine and -methyllysine mouse monoclonal antibodies. Proteomics. 2005, 5 (18): 4653-4664. 10.1002/pmic.200500042.
Mann M, Jensen ON: Proteomic analysis of post-translational modifications. Nat Biotechnol. 2003, 21 (3): 255-261. 10.1038/nbt0303-255.
Wilkins MR, Gasteiger E, Gooley AA, Herbert BR, Molloy MP, Binz PA, Ou K, Sanchez JC, Bairoch A, Williams KL, Hochstrasser DF: High-throughput mass spectrometric discovery of protein post-translational modifications. J Mol Biol. 1999, 289 (3): 645-657. 10.1006/jmbi.1999.2794.
Bandeira N, Tsur D, Frank A, Pevzner PA: Protein identification by spectral networks analysis. Proceedings of the National Academy of Sciences of the United States of America. 2007, 104 (15): 6140-6145. 10.1073/pnas.0701130104.
Tsur D, Tanner S, Zandi E, Bafna V, Pevzner PA: Identification of post-translational modifications by blind search of mass spectra. Nat Biotechnol. 2005, 23 (12): 1562-1567. 10.1038/nbt1168.
Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532.
Jung SY, Li Y, Wang Y, Chen Y, Zhao Y, Qin J: Complications in the assignment of 14 and 28 Da mass shift detected by mass spectrometry as in vivo methylation from endogenous proteins. Anal Chem. 2008, 80 (5): 1721-1729. 10.1021/ac7021025.
Lee SW, Berger SJ, Martinovic S, Pasa-Tolic L, Anderson GA, Shen Y, Zhao R, Smith RD: Direct mass spectrometric analysis of intact proteins of the yeast large ribosomal subunit using capillary LC/FTICR. Proceedings of the National Academy of Sciences of the United States of America. 2002, 99 (9): 5942-5947. 10.1073/pnas.082119899.
Sadaie M, Shinmyozu K, Nakayama J: A conserved SET domain methyltransferase, Set11, modifies ribosomal protein Rpl12 in fission yeast. The Journal of biological chemistry. 2008, 283 (11): 7185-7195. 10.1074/jbc.M709429200.
Carroll AJ, Heazlewood JL, Ito J, Millar AH: Analysis of the Arabidopsis cytosolic ribosome proteome provides detailed insights into its components and their post-translational modification. Mol Cell Proteomics. 2008, 7 (2): 347-369.
Goldenberg CJ, Eliceiri GL: Methylation of ribosomal proteins in HeLa cells. Biochimica et biophysica acta. 1977, 479 (2): 220-234.
Shin HS, Jang CY, Kim HD, Kim TS, Kim S, Kim J: Arginine methylation of ribosomal protein S3 affects ribosome assembly. Biochemical and biophysical research communications. 2009, 385 (2): 273-278. 10.1016/j.bbrc.2009.05.055.
Scolnik PA, Eliceiri GL: Methylation sites in HeLa cell ribosomal proteins. Eur J Biochem. 1979, 101 (1): 93-101. 10.1111/j.1432-1033.1979.tb04220.x.
Swiercz R, Person MD, Bedford MT: Ribosomal protein S2 is a substrate for mammalian PRMT3 (protein arginine methyltransferase 3). The Biochemical journal. 2005, 386 (Pt 1): 85-91.
Wang C, Lin JM, Lazarides E: Methylations of 70,000-Da heat shock proteins in 3T3 cells: alterations by arsenite treatment, by different stages of growth and by virus transformation. Arch Biochem Biophys. 1992, 297 (1): 169-175. 10.1016/0003-9861(92)90656-H.
Wang C, Lazarides E: Arsenite-induced changes in methylation of the 70,000 dalton heat shock proteins in chicken embryo fibroblasts. Biochemical and biophysical research communications. 1984, 119 (2): 735-743. 10.1016/S0006-291X(84)80312-5.
Yu MC, Bachand F, McBride AE, Komili S, Casolari JM, Silver PA: Arginine methyltransferase affects interactions and recruitment of mRNA processing and export factors. Genes & development. 2004, 18 (16): 2024-2035. 10.1101/gad.1223204.
Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O'Shea EK, Weissman JS: Global analysis of protein expression in yeast. Nature. 2003, 425 (6959): 737-741. 10.1038/nature02046.
Bedford MT, Richard S: Arginine methylation an emerging regulator of protein function. Molecular cell. 2005, 18 (3): 263-272. 10.1016/j.molcel.2005.04.003.
Desiere F, Deutsch EW, Nesvizhskii AI, Mallick P, King NL, Eng JK, Aderem A, Boyle R, Brunner E, Donohoe S, Fausto N, Hafen E, Hood L, Katze MG, Kennedy KA, Kregenow F, Lee H, Lin B, Martin D, Ranish JA, Rawlings DJ, Samelson LE, Shiio Y, Watts JD, Wollscheid B, Wright ME, Yan W, Yang L, Yi EC, Zhang H: Integration with the human genome of peptide sequences obtained by high-throughput mass spectrometry. Genome biology. 2005, 6 (1): R9. 10.1186/gb-2004-6-1-r9.
Belle A, Tanay A, Bitincka L, Shamir R, O'Shea EK: Quantification of protein half-lives in the budding yeast proteome. Proc Natl Acad Sci USA. 2006, 103 (35): 13004-13009. 10.1073/pnas.0605420103.
Tung CW, Ho SY: Computational identification of ubiquitylation sites from protein sequences. BMC Bioinformatics. 2008, 9: 310. 10.1186/1471-2105-9-310.
Gupta P, Ho PC, Huq MD, Khan AA, Tsai NP, Wei LN: PKCepsilon stimulated arginine methylation of RIP140 for its nuclear-cytoplasmic export in adipocyte differentiation. PLoS One. 2008, 3 (7): e2658. 10.1371/journal.pone.0002658.
Ostareck-Lederer A, Ostareck DH, Rucknagel KP, Schierhorn A, Moritz B, Huttelmaier S, Flach N, Handoko L, Wahle E: Asymmetric arginine dimethylation of heterogeneous nuclear ribonucleoprotein K by protein-arginine methyltransferase 1 inhibits its interaction with c-Src. The Journal of biological chemistry. 2006, 281 (16): 11115-11125. 10.1074/jbc.M513053200.
Yamagata K, Daitoku H, Takahashi Y, Namiki K, Hisatake K, Kako K, Mukai H, Kasuya Y, Fukamizu A: Arginine methylation of FOXO transcription factors inhibits their phosphorylation by Akt. Molecular cell. 2008, 32 (2): 221-231. 10.1016/j.molcel.2008.09.013.
Yun CY, Fu XD: Conserved SR protein kinase functions in nuclear import and its action is counteracted by arginine methylation in Saccharomyces cerevisiae. J Cell Biol. 2000, 150 (4): 707-718. 10.1083/jcb.150.4.707.
Chen W, Daines MO, Hershey GK: Methylation of STAT6 modulates STAT6 phosphorylation, nuclear translocation, and DNA-binding activity. J Immunol. 2004, 172 (11): 6744-6750.
Zhu W, Mustelin T, David M: Arginine methylation of STAT1 regulates its dephosphorylation by T cell protein tyrosine phosphatase. The Journal of biological chemistry. 2002, 277 (39): 35787-35790. 10.1074/jbc.C200346200.
Hsu Ia W, Hsu M, Li C, Chuang TW, Lin RI, Tarn WY: Phosphorylation of Y14 modulates its interaction with proteins involved in mRNA metabolism and influences its methylation. The Journal of biological chemistry. 2005, 280 (41): 34507-34512. 10.1074/jbc.M507658200.
Wooderchak WL, Zang T, Zhou ZS, Acuna M, Tahara SM, Hevel JM: Substrate profiling of PRMT1 reveals amino acid sequences that extend beyond the "RGG" paradigm. Biochemistry. 2008, 47 (36): 9456-9466. 10.1021/bi800984s.
Mallick P, Schirle M, Chen SS, Flory MR, Lee H, Martin D, Ranish J, Raught B, Schmitt R, Werner T, Kuster B, Aebersold R: Computational prediction of proteotypic peptides for quantitative proteomics. Nat Biotechnol. 2007, 25 (1): 125-131. 10.1038/nbt1275.
Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, Weissman JS: Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006, 441 (7095): 840-846. 10.1038/nature04785.
Michalek MT, Grant EP, Rock KL: Chemical denaturation and modification of ovalbumin alters its dependence on ubiquitin conjugation for class I antigen presentation. J Immunol. 1996, 157 (2): 617-624.
Chuikov S, Kurash JK, Wilson JR, Xiao B, Justin N, Ivanov GS, McKinney K, Tempst P, Prives C, Gamblin SJ, Barlev NA, Reinberg D: Regulation of p53 activity through lysine methylation. Nature. 2004, 432 (7015): 353-360. 10.1038/nature03117.
Lu JY, Lin YY, Qian J, Tao SC, Zhu J, Pickart C, Zhu H: Functional dissection of a HECT ubiquitin E3 ligase. Mol Cell Proteomics. 2008, 7 (1): 35-45.
Gupta R, Kus B, Fladd C, Wasmuth J, Tonikian R, Sidhu S, Krogan NJ, Parkinson J, Rotin D: Ubiquitination screen using protein microarrays for comprehensive identification of Rsp5 substrates in yeast. Molecular systems biology. 2007, 3: 116. 10.1038/msb4100159.
Daily KM, Radivojac P, Dunker AK: Intrinsic disorder and protein modifications: building an SVM predictor for methylation. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2005: November 2005; San Diego, California, USA. 2005, 475-481.
Ptacek J, Devgan G, Michaud G, Zhu H, Zhu X, Fasolo J, Guo H, Jona G, Breitkreutz A, Sopko R, McCartney RR, Schmidt MC, Rachidi N, Lee SJ, Mah AS, Meng L, Stark MJ, Stern DF, De Virgilio C, Tyers M, Andrews B, Gerstein M, Schweitzer B, Predki PF, Snyder M: Global analysis of protein phosphorylation in yeast. Nature. 2005, 438 (7068): 679-684. 10.1038/nature04187.
Linding R, Jensen LJ, Ostheimer GJ, van Vugt MA, Jorgensen C, Miron IM, Diella F, Colwill K, Taylor L, Elder K, Metalnikov P, Nguyen V, Pasculescu A, Jin J, Park JG, Samson LD, Woodgett JR, Russell RB, Bork P, Yaffe MB, Pawson T: Systematic discovery of in vivo phosphorylation networks. Cell. 2007, 129 (7): 1415-1426. 10.1016/j.cell.2007.05.052.
Fiedler D, Braberg H, Mehta M, Chechik G, Cagney G, Mukherjee P, Silva AC, Shales M, Collins SR, van Wageningen S, Kemmeren P, Holstege FC, Weissman JS, Keogh MC, Koller D, Shokat KM, Krogan NJ: Functional organization of the S. cerevisiae phosphorylation network. Cell. 2009, 136 (5): 952-963. 10.1016/j.cell.2008.12.039.
The UniProt Consortium: The Universal Protein Resource (UniProt). Nucleic acids research. 2009, 37 (Database issue): D169-174. 10.1093/nar/gkn664.
Couttas TA, Raftery MJ, Bernardini G, Wilkins MR: Immonium ion scanning for the discovery of post-translational modifications and its application to histones. J Proteome Res. 2008, 7 (7): 2632-2641. 10.1021/pr700644t.
Beck HC, Nielsen EC, Matthiesen R, Jensen LH, Sehested M, Finn P, Grauslund M, Hansen AM, Jensen ON: Quantitative proteomic analysis of post-translational modifications of human histones. Mol Cell Proteomics. 2006, 5 (7): 1314-1325. 10.1074/mcp.M600007-MCP200.
Dave KA, Hamilton BR, Wallis TP, Furness SGB, Whitelaw ML, Gorman JJ: Identification of N, Nepsilon-dimethyl-lysine in the murine dioxin receptor using MALDI-TOF/TOF- and ESI-LTQ-Orbitrap-FT-MS. Int J Mass Spec. 2007, 268 (2-3): 168-180. 10.1016/j.ijms.2007.06.001.
Wisniewski JR, Zougman A, Kruger S, Mann M: Mass spectrometric mapping of linker histone H1 variants reveals multiple acetylations, methylations, and phosphorylation as well as differences between cell culture and tissue. Mol Cell Proteomics. 2007, 6 (1): 72-87.
Garavelli JS: The RESID Database of Protein Modifications as a resource and annotation tool. Proteomics. 2004, 4 (6): 1527-1533. 10.1002/pmic.200300777.
Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH: UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics (Oxford, England). 2007, 23 (10): 1282-1288. 10.1093/bioinformatics/btm098.
Hong EL, Balakrishnan R, Dong Q, Christie KR, Park J, Binkley G, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hitz BC, Krieger CJ, Livstone MS, Miyasato SR, Nash RS, Oughtred R, Skrzypek MS, Weng S, Wong ED, Zhu KK, Dolinski K, Botstein D, Cherry JM: Gene Ontology annotations at SGD: new data sources and annotation methods. Nucleic acids research. 2008, 36 (Database issue): D577-581.
Hart GT, Lee I, Marcotte ER: A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality. BMC Bioinformatics. 2007, 8: 236. 10.1186/1471-2105-8-236.
Rivals I, Personnaz L, Taing L, Potier MC: Enrichment or depletion of a GO category within a class of genes: which test?. Bioinformatics (Oxford, England). 2007, 23 (4): 401-407. 10.1093/bioinformatics/btl633.
R Development Core Team: R: A language and environment for statistical computing. 2005, Vienna: R Foundation for statistical computing
CNIP was the recipient of Australian Postgraduate Awards. EG was supported by the Swiss Federal Government through the Federal Office of Education and Science. This research was supported in part by a University of New South Wales Faculty Research Grant, by the University of New South Wales Goldstar Scheme and the NSW State Government Science Leveraging Fund. The authors thank Timothy A. Couttas, Daniel Yagoub, Simone S. Li, and Adam Lee for their helpful discussions on this manuscript. We thank A.C. Gavin for facilitating access to the Cellzome peptide mass data.
CNIP designed the method for searching arginine and lysine methylation sites using FindMod, performed the statistical and bioinformatics analyses, and wrote the manuscript. EG implemented the FindMod bulk submission program. MRW supervised the project and critically reviewed the manuscript. All authors read and approved the manuscript.
Electronic supplementary material
Additional file 1: Examples of ambiguous and unambiguous peptide matches. This file contains examples of ambiguous and unambiguous peptide matches. (DOC 40 KB)
Additional file 2: List of lysine- and arginine-methylated peptides. This file contains the list of all high confidence arginine- and lysine-methylated peptides, and their corresponding discovery rates. (XLS 66 KB)
Additional file 3: List of lysine and arginine methylation sites. This file contains the list of all high confidence methylated residues and their corresponding discovery rates. (XLS 42 KB)
Additional file 4: Additional benchmarking results. Evaluation of the true positive rates of FindMod across a range of mass tolerances from 0.01 to 0.10 Da, using the known non-redundant methylation test set and the artificial methylation test set. (DOC 108 KB)
Additional file 5: List of methylated peptides of Tef1p and Rpl23p discovered by FindMod. This file contains the list of methylated peptides of Tef1p and Rpl23p found by FindMod. These peptides may contain E and D residues, as E and D residues were not filtered for the analysis. (DOC 38 KB)
Additional file 6: Supplementary methods. This file describes how the tailor-made error tolerances were calculated, and also provides a list of low-quality post-translational modifications that were excluded from FindMod's analysis. (DOC 38 KB)
Additional file 7: List of methylation at C-terminus of peptides. This file is a list of peptides with C-terminal methylation, collected from the literature. (XLS 28 KB)
Pang, C.N.I., Gasteiger, E. & Wilkins, M.R. Identification of arginine- and lysine-methylation in the proteome of Saccharomyces cerevisiae and its functional implications. BMC Genomics 11, 92 (2010). https://doi.org/10.1186/1471-2164-11-92
- True Positive Rate
- Methylation Site
- Arginine Methylation
- Lysine Methylation
- Unmodified Peptide