Comprehensive identification of novel proteins and N-glycosylation sites in royal jelly

Background Royal jelly (RJ) is a proteinaceous secretion produced from the hypopharyngeal and mandibular glands of nurse bees. It plays vital roles in honeybee biology and in the improvement of human health. However, some proteins remain unknown in RJ, and mapping N-glycosylation modification sites on RJ proteins demands further investigation. We used two different liquid chromatography-tandem mass spectrometry techniques, complementary N-glycopeptide enrichment strategies, and bioinformatic approaches to gain a better understanding of novel and glycosylated proteins in RJ. Results A total of 25 N-glycosylated proteins, carrying 53 N-glycosylation sites, were identified in RJ proteins, of which 42 N-linked glycosylation sites were mapped as novel on RJ proteins. Most of the glycosylated proteins were related to metabolic activities and health improvement. The 13 newly identified proteins were also mainly associated with metabolic processes and health improvement activities. Conclusion Our in-depth, large-scale mapping of novel glycosylation sites represents a crucial step toward systematically revealing the functionality of N-glycosylated RJ proteins, and is potentially useful for producing a protein with desirable pharmacokinetic and biological activity using a genetic engineering approach. The newly-identified proteins significantly extend the proteome coverage of RJ. These findings contribute vital and new knowledge to our understanding of the innate biochemical nature of RJ at both the proteome and glycoproteome levels.


Background
Royal jelly (RJ) is a proteinaceous secretion derived from the hypopharyngeal and mandibular glands of young worker bees [1,2]. It is the sole food fed to the queen throughout her lifetime, and is also fed to all young larvae for the first three days after hatching [2]. RJ possesses various biological attributes beneficial for human health, such as antioxidant activities [3], antibacterial effects [4], enhancement of immune activity [5], and antitumor effects [6]. Protein accounts for >50% of RJ by dry weight [2]. It has been reported that the nine major royal jelly proteins (MRJPs, MRJP1-9) [7,8] account for 80-90% of the total protein in RJ [9]. Other proteins, such as alpha-glucosidase, glucose oxidase, and alpha-amylase, have also been detected in RJ [1,10-12]. Although several studies have indicated that the proteins in RJ undergo glycosylation [12][13][14][15][16], the glycan types and site assignments of these glycoproteins remain unknown. With the development of new technologies in protein separation and identification, dozens of novel proteins have recently been identified in RJ by our group and by others [1,11,16,17]. Advances in the resolution and sensitivity (double high) of liquid chromatography-tandem mass spectrometry (LC-MS/MS) have made it a powerful platform. These advances make it possible to profile the proteome of RJ more deeply, while allowing for system-level mapping of glycosylation sites of RJ proteins.
Asparagine-linked (N-linked) protein glycosylation is the most abundant of all posttranslational modifications in eukaryotes, with nearly 70% of all eukaryotic proteins predicted to be N-glycoproteins [18]. N-linked glycosylation is an enzymatically catalyzed process that occurs in the endoplasmic reticulum (ER). It involves the assembly of glycans on a lipid carrier in the ER membrane, followed by a transfer to specific asparagine residues of target polypeptides [19]. The attachment of N-glycans to a peptide backbone has been reported to assist in protein folding, stability, solubility, oligomerization, quality control, sorting, and transport [20,21]. Glycoproteins mediate many important biological processes by their involvement in cell adhesion, cell differentiation, cell growth, and immunity [22,23].
To identify N-glycosylated peptides among the more abundant non-glycosylated peptides in complex biological samples, specific enrichment methods, such as lectin affinity [24] or hydrazide chemistry [25], are required before double high LC-MS/MS analysis. N-glycosylation occurs at a consensus sequence motif of N-X-S/T [20,21] (N = asparagine, X = any amino acid except proline, S/T = serine or threonine). Enzymatic deglycosylation (commonly with peptide N-glycosidase F, PNGase F) converts the glycosylated asparagine in N-X-S/T to aspartic acid, which increases the peptide mass by 0.98 Da. This mass shift is used to locate the N-glycosylation sites on a protein [26]. For more exact mapping of N-glycosylation sites, deglycosylation is usually carried out in ¹⁸O-water (H₂¹⁸O), which produces a mass shift of 2.99 Da in the MS spectra, thus adding confidence to the site assignment [27].
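The motif rule and the two mass shifts described above can be illustrated with a minimal Python sketch. The function name `find_sequons` and the example sequence are hypothetical and serve only as illustration; they are not taken from the RJ data or from the software used in this study.

```python
import re

# Mass shifts quoted in the text (Da), carried by the formerly glycosylated Asn
DEAMIDATION_SHIFT = 0.98       # PNGase F digestion in ordinary water
DEAMIDATION_18O_SHIFT = 2.99   # PNGase F digestion in 18O-water

# N-X-S/T with X != proline. The lookahead keeps the match one residue wide,
# so overlapping motifs (e.g. N-N-T-S) are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of asparagines that sit in an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Made-up sequence: N3 (N-G-T), N12 (N-N-T) and N13 (N-T-S) qualify,
# whereas N8 is followed by proline and is excluded.
example = "MKNGTAANPSLNNTS"
candidate_sites = find_sequons(example)
```

A motif match only marks a candidate site. In the workflow described here, a site is assigned when the corresponding asparagine also carries the expected mass shift in the MS/MS spectrum.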
It is well known that mapping residue-specific glycosylation sites is the first step towards a detailed and functional understanding of proteins [20]. However, information on N-glycosylation site assignment in RJ proteins is still very limited, thus demanding a powerful glycoproteomics approach for large-scale, comprehensive mapping of N-glycosylated sites in RJ proteins. Until now, RJ proteins have been documented to contain a series of glycoproteins [12,14,15], and have been shown to be potentially glycosylated by gel staining [28]. Only MRJP 2 is reported to carry two N-glycosylated sites, attached to a high-mannose structure and complex-type antennary structures [16].
In an effort to identify hidden proteins and to map the N-linked glycosylation sites in RJ, two different double high LC-MS/MS systems, Q-Exactive coupled to Easy-nLC 1000 (orbitrap-based MS) and Triple TOF 5600 coupled with an Eksigent nLC (triple TOF-based MS), as well as complementary glycopeptide enrichment protocols (hydrazide and lectin), were employed. Overall, 25 N-glycosylated proteins carrying 53 N-glycosylation sites were confidently identified, of which 42 N-linked glycosylation sites were mapped in RJ proteins for the first time, and 13 novel proteins were identified in RJ.

Identified novel royal jelly proteins
To expand the number of known proteins in the RJ proteome, RJ proteins were extracted, digested in solution, and analyzed with double high LC-MS/MS (orbitrap-based MS). A total of 42 nonredundant proteins were confidently identified, of which 13 proteins were novel (Table 1 and Additional file 1: Table S1).
The 42 identified proteins in RJ were classified on the basis of their biological process and molecular function and annotated by gene ontology. In the YELLOW/MRJP family, a new protein, yellow-e3 precursor, was identified. Of the 12 proteins related to metabolic processes, five novel proteins were identified: lysosomal pro-X carboxypeptidase, lysosomal aspartic protease, membrane metallo-endopeptidase 1, matrix metalloproteinase 14, and pancreatic triacylglycerol lipase. Among the 14 proteins associated with health improvement, six were reported here for the first time: venom dipeptidyl peptidase 4 precursor, venom serine protease 34, hymenoptaecin precursor, venom protease, hypothetical protein LOC408570, lysozyme isoform 1. One of the four proteins involved in development processes was novel, protein CREG 1 (Table 1 and Additional file 1: Table S1). Interestingly, the majority of the newly-identified proteins were related both to metabolic processes (accounting for 38.5% of all novel proteins) and health promotion activities (46.2% of all novel proteins).

Mapping N-glycosylated sites
To attain a comprehensive map of N-linked glycosylation sites in RJ, RJ proteins were extracted and enriched by two different enrichment methods (hydrazide and lectin), after which the N-glycosylated peptides were analyzed by two different double high LC-MS/MS systems (orbitrap-based MS and triple TOF-based MS). The introduction of ¹⁸O-water in the process of PNGase F digestion added confidence to the identification of N-glycopeptides. An example spectrum of an N-glycopeptide is shown in Figure 1 (for all other spectra see Additional file 2: Figure S1). Overall, 25 N-glycoproteins carrying 53 unique N-linked glycosylation sites were identified, representing 60% of the total identified proteins in RJ. Among the 53 identified N-linked glycosylation sites, 42 were confidently mapped in RJ proteins for the first time (Table 2).
In the YELLOW/MRJP family, seven proteins were identified as N-glycoproteins, glycosylated on 12 unique peptides, each carrying a single N-glycosylated site (Table 2). Of the proteins involved in metabolic processes, seven were N-glycosylated on 16 unique N-glycopeptides: all but one contained a single N-glycosylation site, and one unique N-glycopeptide carried two sites (Table 2). Of the proteins related to health improvement, seven were found to be N-glycosylated on 18 unique peptides, and each peptide had a single N-glycosylated site (Table 2). Of the two proteins implicated in the regulation of morphological development, IDGF 4 was N-glycosylated on one unique peptide with a single site, and the N-glycosylated protein takeout had one unique peptide carrying two sites (Table 2). Finally, two identified N-glycoproteins with unknown functions each had one unique peptide harboring a single N-glycosylated site (Table 2).
Among those 53 unique N-glycosylated sites, 21 were identified by lectin enrichment alone, eight were uniquely identified by hydrazide enrichment, and 18 were identified by both enrichment methods using orbitrap-based MS (Figure 2A). Similarly, eight N-glycopeptides were specifically identified by the lectin enrichment protocol, two were specifically identified by hydrazide chemistry, and six were identified by both enrichment methods using triple TOF-based MS (Figure 2B). With the lectin enrichment method, 29 N-glycopeptides were uniquely identified by orbitrap-based MS, four were uniquely identified by triple TOF-based MS, and 10 were identified by both MS systems (Figure 2C). Likewise, with hydrazide enrichment, 18 N-glycopeptides were identified by orbitrap-based MS alone, and eight were identified by both types of LC-MS/MS instruments (Figure 2D).
As shown in Figure 3 and Table 2, the 53 N-glycosylated sites were subdivided according to whether the site and the glycoprotein were known or novel. Specifically, only two known sites in known glycoproteins were repeatedly identified in the current study. Six potential sites in known glycoproteins and three potential sites in novel glycoproteins were also identified; these potential sites, predicted in the UniProt database (updated April 2013), were thus experimentally confirmed in this study. Thirty-three novel sites were identified in known glycoproteins, and nine novel sites in novel glycoproteins.
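As a quick consistency check, the five categories reported above can be tallied against the headline totals. The sketch below is illustrative only; the dictionary layout and the helper `total_by_status` are assumptions, while the counts are those stated in the text.

```python
# Site counts from the text, keyed by (site status, glycoprotein status)
site_counts = {
    ("known", "known glycoprotein"): 2,
    ("predicted", "known glycoprotein"): 6,
    ("predicted", "novel glycoprotein"): 3,
    ("novel", "known glycoprotein"): 33,
    ("novel", "novel glycoprotein"): 9,
}

def total_by_status(counts, status):
    """Sum the sites whose site status matches `status`."""
    return sum(n for (site_status, _), n in counts.items() if site_status == status)

total_sites = sum(site_counts.values())                      # all mapped sites
novel_sites = total_by_status(site_counts, "novel")          # mapped for the first time
predicted_sites = total_by_status(site_counts, "predicted")  # UniProt-predicted, now confirmed
```

The categories sum to the 53 mapped sites, of which 33 + 9 = 42 are novel and 6 + 3 = 9 are previously predicted sites.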
Site occupancy analyses showed that approximately 48% of N-glycosylated proteins carried a single N-linked glycosylated site, 20% contained two sites, 16% contained three sites, and the rest carried four or more N-glycosylated sites (Figure 4).
To gain a better understanding of the sequence motif of the N-linked glycosylation site in RJ, the surrounding sequences (five amino acids toward each terminus) of the N-glycosylated sites were compared. As shown in Figure 5, about two-thirds of the sites had the N-X-T motif and the others had the N-X-S motif downstream (positive values) of the N-linked modification site. In other words, the N-linked sequence motif was X-X-N-X-S/T-X in N-glycoproteins of RJ (N = asparagine, X = any amino acid except proline, S/T = serine or threonine).
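The window extraction behind such a motif comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `site_window` and `motif_type`, the padding character, and the example sequence are all hypothetical, and the actual motif logo in this study was generated with WebLogo.

```python
def site_window(sequence, site, flank=5, pad="-"):
    """Return the residues from site - flank to site + flank (1-based site).

    Windows that run past either terminus are padded so that the modified
    asparagine always sits at index `flank`.
    """
    idx = site - 1
    if sequence[idx] != "N":
        raise ValueError(f"residue {site} is {sequence[idx]!r}, not asparagine")
    left = sequence[max(0, idx - flank):idx].rjust(flank, pad)
    right = sequence[idx + 1:idx + 1 + flank].ljust(flank, pad)
    return left + "N" + right

def motif_type(window, flank=5):
    """Classify a window as N-X-T or N-X-S from the residue at position +2."""
    return "N-X-" + window[flank + 2]

# Made-up sequence with sites at positions 3 and 13
example = "MKNGTAANPSLNNTS"
windows = [site_window(example, pos) for pos in (3, 13)]
types = [motif_type(w) for w in windows]
```

Counting the resulting labels over all mapped sites would give the N-X-T to N-X-S proportion summarized in the text.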

Discussion
To gain a new understanding of innate biochemical properties of RJ at the proteome and glycoproteome levels, RJ was analyzed for the identification of novel proteins hidden in RJ and mapped for N-glycosylation sites using the double high LC-MS/MS system (orbitrap and triple TOF) and complementary methods of glycoprotein/glycopeptides enrichment (hydrazide chemistry and lectin). Overall, 13 novel proteins and 42 novel N-glycosylated sites in 25 N-glycosylated proteins were identified.
Figure 3. Distribution of N-glycosylated sites in royal jelly proteins. "2" is the identified two known sites in known glycoprotein. "6" is potential sites predicted in known glycoprotein, and "3" is potential glycosylation sites identified in novel glycoprotein. "33" is the novel sites identified in known glycoprotein, and "9" is the novel sites identified in novel glycoprotein.

Identification of novel RJ proteins
The exploration of novel proteins in RJ is a long-term pursuit for apicultural biologists and biochemical experts. The rapid improvement of MS with high resolution, high mass accuracy, and high sequencing speed now allows for in-depth identification of proteins in biological samples in a comprehensive and unbiased manner and with high confidence. Compared with previous reports and bioinformatics analysis [1,11,17,28,29], 13 novel proteins were identified in this study. To establish confidence that the newly identified proteins were real secretory proteins and not contaminating cellular proteins that may have leaked during the secretory process of the RJ glands, we used two bioinformatics software programs to confirm their secretory origin. Proteins predicted as extracellular by PSORT are putative secretory proteins [30]. To confirm this, SignalP was used to verify the presence of N-terminal secretory signal peptides [31]. This analysis suggested that all of the 13 novel proteins are secretory proteins and thus real protein components of RJ. They are mainly involved in metabolic processes and health promotion activities. This finding is of particular importance for understanding how RJ accomplishes its roles in honeybee biology and in the promotion of human health.
The YELLOW/MRJP family is the most important RJ protein family and plays key roles both in honeybee biology and in the promotion of human health [9]. The remarkable fecundity of the queen (one queen lays 1,500-2,000 eggs a day, more than her body weight [2]) and the exponential speed of larval growth (weight increase by 1,600 times in the first six days of growth [32]) are achieved by a diet of highly nutritious RJ. MRJPs share a common evolutionary origin with the yellow protein family [33,34]. In particular, yellow-e3 and mrjp genes share the most introns/exons in the same relative positions [33]. The gene expression of yellow-e3 in the honeybee head and hypopharyngeal glands almost completely coincides with a developmental pattern typical of mrjp genes, supporting the view that yellow-e3 is the most recent common ancestor of the MRJP families [33,34]. Therefore, the newly identified yellow-e3 precursor in RJ is likely to act in a similar manner to the MRJPs, performing multifunctional roles in supplying nutrition and modulating caste determination of the honeybee [34,35]. Notably, in previous RJ studies, only MRJP1-5 have been repeatedly identified by a single proteomics protocol [1,12,17,28]. MRJP6-9 are identified only when special technology is used [8,11]. For example, identification of MRJP8 requires a special digestion method for the proteins [28]. In this study, we not only identified MRJP1-9 in a single study, but we also identified yellow-e3 precursor as a new member of the YELLOW/MRJP family. This indicates that our protocol is highly efficient in the identification of RJ proteins.
RJ provides efficient energetic fuels for the fast development of larvae and the egg-laying queen through the metabolism of sugars, lipids, and proteins [2]. The identification of a high number of proteins related to the metabolism of sugars, lipids, and proteins suggests that the honeybee has an evolutionary strategy of using RJ to fulfill the enormous energy requirement of the fast-developing larvae and the egg-laying queen through these metabolic pathways. Notably, five of the 13 novel proteins identified were associated with this category, indicating their biological importance as a source of metabolic fuel for ensuring the normal growth of honeybee larvae. Triacylglycerol lipase breaks down dietary fat, mainly triacylglycerol, to monoacylglycerol and free fatty acids to supply the energy requirements of living organisms [36]. In addition, the enzymes lysosomal pro-X carboxypeptidase, lysosomal aspartic protease, membrane metallo-endopeptidase, and matrix metalloproteinase 14 also participate in the metabolism of protein to produce energy [37,38].
RJ has been well documented to enhance immunity for honeybees and to promote health for humans [2]. Among the 14 RJ proteins related to health promotion activities, six were identified as novel. Dipeptidyl peptidase IV is known to functionally suppress peritoneal dissemination and the progression of ovarian carcinoma, inhibit the malignant phenotype of prostate cancer cells, and promote the human immune system [39,40]. Venom serine protease 34 is part of a defense mechanism against intruding microorganisms and parasites in insects [41][42][43]. Hymenoptaecin can inhibit the viability of gram-positive and gram-negative bacteria, and provides wide-spectrum antibacterial protection for honeybees and humans [44,45]. Venom protease has fibrinogenolytic activity and is a strong antithrombotic agent in snakes [46]. Lysozyme isoform 1 is an important member related to the innate immunity of insects, efficiently protecting larvae from diseases and pests [47]. The hypothetical protein LOC408570 (93% homology with peroxidasin protein of Harpegnathos saltator) [48] has functions in phagocytosis and in defense against radioiodinations and oxidation [49].
The newly identified protein cellular repressor of E1A-stimulated genes (protein CREG) might contribute to the promotion of differentiation of honeybee larvae by the enhancement of cell differentiation [50], as MRJP 1 does [51].

Mapping N-glycosylated sites
By using two complementary enrichment protocols (hydrazide chemistry and lectin resin) and two double high LC-MS/MS systems (orbitrap-based and triple TOF-based), we have achieved an in-depth identification of 25 N-glycoproteins carrying 53 N-glycosylation sites in RJ. Among these, 42 N-linked glycosylation sites are reported in RJ proteins for the first time. To the best of our knowledge, this is the most comprehensive assignment of the N-glycosylated sites of RJ.
Capturing the maximum number of glycopeptides is of great importance for mapping glycosylated sites [52,53], and is achievable using complementary enrichment of glycopeptides with techniques such as hydrazide chemistry and lectin-based protocols. Hydrazide chemistry can efficiently capture glycoproteins once oxidized by sodium periodate, and is thus extremely useful for the identification of glycopeptides [54]. "Filter aided sample preparation" (FASP) is an N-glycopeptide enrichment protocol that uses a combination of different lectins to efficiently capture glycopeptides [55]. By adopting two different methods based on lectin and hydrazide enrichment, glycosylation sites were comprehensively assigned in RJ, namely 46 by lectin resin and 16 by hydrazide chemistry. Meanwhile, orbitrap-based MS seems to be more robust than Triple TOF-based MS in the identification of glycosylated sites in RJ, and the combined use of two different double high LC-MS/MS systems yielded a greater number of N-glycosylated sites in RJ. Together, of the 53 N-glycosylation sites assigned in RJ proteins, 42 were mapped as novel. Nine potential N-glycosylation sites predicted by the UniProt database (updated April 2013) were also verified. In addition, the only two known N-glycosylation sites [16] were repeatedly identified.
It is now known that blocking glycosylation can result in improper or incomplete folding of many polypeptides. These improperly folded polypeptides would not pass ER quality control [56] and would be retained in the ER and eventually degraded [57]. Given that MRJPs account for 80-90% of RJ proteins [9], glycosylation may help MRJPs reach their native conformation to accomplish their biological roles for both honeybees and humans [9]. Glycosylation is also reported to increase the solubility of proteins [58,59]. Therefore, glycosylation of the YELLOW/MRJPs may promote their solubility in RJ and thereby enhance the efficiency of their nutritive assimilation [60,61]. Since glycosylated proteins have roles in immunity [62], the weak immunity of young honeybee larvae (the first 48 h) may be promoted by feeding glycosylated MRJPs to ensure normal development [63]. This is in line with the report that glycosylated MRJP 2 can effectively inhibit Paenibacillus larvae infection [16].
Glycosylation site occupancy modulates enzymatic activities by the attachment of glycans to peptide backbones [64]. Interestingly, the majority of glycosylated proteins identified here are enzymes associated with the metabolic pathways of carbohydrates and proteins. For instance, three enzymes, lysosomal alpha-mannosidase, alpha-glucosidase, and glucosylceramidase, are involved in carbohydrate metabolism [65][66][67]. Four other enzymes, plasma glutamate carboxypeptidase, lysosomal pro-X carboxypeptidase, lysosomal aspartic protease, and membrane metallo-endopeptidase, are implicated in the metabolism of proteins [37,38]. The high number of glycosylated proteins related to metabolic processes suggests that enough energy for queen spawning and larval growth is produced through the metabolism of carbohydrates and proteins, which may be achieved by modulating enzymatic efficiency [64].
N-glycosylation of proteins has been reported to improve the health of living organisms through antibacterial activity [68], antioxidant activity [69], and antihypertensive activity [70]. For instance, glucose oxidase acts as a natural preservative and a bactericide by reducing oxygen to hydrogen peroxide [71]. Venom dipeptidyl peptidase 4 precursor could enhance immune response activity by stimulating the T-cells of mammals [39,40]. Antithrombin-III, Apolipophorin-III protein precursor, and toll-like receptor 13 all play key roles in promoting the innate immunity of honeybee larvae [11,[72][73][74][75][76][77]]. MRJP 1 has potential antitumor effects by stimulating macrophages to release TNF-α [61]. In addition, glycosylated proteins affect cell proliferation and regulate circadian rhythm [78]. Chitinase, as a growth factor, stimulates proliferation and polarization in Drosophila [79]. Protein takeout helps regulate circadian rhythms and feeding behavior in Drosophila [80]. Overall, the glycosylation of these RJ proteins suggests that they may be involved in the above biological roles, benefitting both honeybees and humans.
An oligosaccharide unit attached to the polypeptide at the site of occupancy has been reported to improve the solubility, folding, and half-life of the glycoprotein [81]. Most glycosylated RJ glycoproteins (~50%) carried a single N-glycosylation site, ~20% carried two or three sites, and only a few carried four or five sites. In addition, the conserved amino acid sequence motif identified in N-glycosylated RJ peptides may have structural and functional importance for RJ proteins, which merits future study [82,83]. Although the glycan linkages associated with the glycosylation sites demand further investigation, this new catalog of knowledge may prove helpful in elucidating the biological implications of glycosylation for the RJ proteins by synthesizing glycans at the identified sites. This is possible because N-glycosylation is a conserved post-translational modification in a diversity of proteins in eukaryotic organisms [18], and the N-linked glycosylation system of Campylobacter can be functionally transferred into Escherichia coli [84]. This provides promising glycoengineering possibilities for producing modified RJ proteins with desirable pharmacokinetics and biological activity.

Conclusions
A total of 13 novel proteins and 42 novel N-linked glycosylation sites in 25 N-glycosylated RJ proteins have been identified here. Of the glycosylated proteins, most were related to metabolic activities and carried multiple N-linked glycosylation sites. This is important for young larvae and the fertile egg-laying queen, since their high metabolic fuel demands may be met through the regulation of enzymatic activities related to metabolic processes. The glycosylated proteins related to the improvement of human health suggest that N-glycosylation plays a key role in helping RJ proteins accomplish their biological functions. The large-scale assignment of N-glycosylated sites represents a crucial first step toward systematically revealing the functionality of N-glycosylated RJ proteins. In addition, the identification of novel proteins mainly associated with metabolic processes and the promotion of human health significantly extends the proteome coverage of RJ.

Sample preparation
RJ was collected as a pooled sample from 250 queen cell cups from each of five colonies of Apis mellifera ligustica at the apiary of the Institute of Apicultural Research, Chinese Academy of Agricultural Science, Beijing. RJ proteins were extracted immediately after harvest according to previously described methods [72]. The resulting pellets were divided into three parts for the following analyses.

In-solution digestion
The first part of the above protein pellets (1 mg RJ/100 μl buffer) was dissolved in 40 mM NH₄HCO₃ (Sigma). The sample was used for in-solution digestion (trypsin, modified sequencing grade, Promega) according to our previous methods [72]. Finally, the peptide-containing solution was concentrated using a Speed-Vac system (RVC 2-18, Martin Christ) for MS/MS analysis.

N-linked glycopeptide enrichment with hydrazide chemistry
The second part of the protein pellet (1 mg RJ/100 μl buffer) was suspended in a coupling buffer [100 mM sodium acetate (Sigma), 150 mM NaCl (Sigma), pH 5.5], and the N-linked glycopeptides were then enriched with hydrazide resin according to the method of Zhang et al. [54]. Briefly, the glycoproteins were oxidized, and the oxidized proteins were captured by hydrazide resin (Bio-Rad). The captured glycoproteins were digested overnight by trypsin. Afterwards, the resin-bound glycopeptides were further digested by PNGase F (NEB) in H₂¹⁸O (Sigma) to release them from the attached glycans and to label the N-glycosylation sites for confident assignment. Finally, the collected supernatant was concentrated using a Speed-Vac system for MS/MS analysis.

N-linked glycopeptide enrichment with lectin
The remaining third of the protein pellets (1 mg RJ/100 μl buffer) was suspended in 8 M urea in 100 mM Tris-HCl (pH 8.5), and the mixture was transferred into an Ultracel YM-10 10,000 MWCO centrifugal filter unit (Millipore) and digested by trypsin overnight. Following this, N-linked glycopeptides were enriched from the digested peptides with a lectin mixture (Concanavalin A, wheat germ agglutinin, and RCA 120 agglutinin) (Sigma), and then deglycosylated with PNGase F in H₂¹⁸O to label the sites, according to the N-Glyco-FASP protocol [85]. Finally, the labeled peptide sample was concentrated using a Speed-Vac system for MS/MS analysis.

Mass spectrometric analysis
The three peptide samples were analyzed on the Q-Exactive mass spectrometer (Thermo Fisher Scientific) coupled to an Easy-nLC 1000 (Thermo Fisher Scientific) via a nanoelectrospray ion source. Full MS scans were acquired with a resolution of 70,000 at m/z 400 in the orbitrap analyzer. The 20 most intense ions were fragmented by higher energy collisional dissociation (HCD). The HCD fragment ion spectra were acquired in the orbitrap analyzer with a resolution of 17,500 at m/z 400. Reverse phase chromatography was performed with a binary buffer system consisting of buffer A (0.1% formic acid, 2% acetonitrile in water) and buffer B (0.1% formic acid in acetonitrile). The peptides were separated with a flow rate of 350 nl/min in the EASY-nLC 1000 system by the following gradient program: from 3 to 8% buffer B for 5 min, from 8 to 20% buffer B for 55 min, from 20 to 30% buffer B for 10 min, from 30 to 90% buffer B for 5 min, and 90% buffer B for 15 min.
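The gradient program above can be written down as a small table to check the total run time and to look up the solvent composition at a given time. This is a minimal sketch: linear ramps within each step are an assumption (the text does not state the ramp shape), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Orbitrap gradient from the text: (duration in min, start %B, end %B)
GRADIENT = [
    (5, 3, 8),
    (55, 8, 20),
    (10, 20, 30),
    (5, 30, 90),
    (15, 90, 90),  # hold at 90% B
]

def percent_b(t_min, steps=GRADIENT):
    """Return %B at time t_min, assuming a linear ramp within each step."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in steps:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")

total_run_min = sum(duration for duration, _, _ in GRADIENT)
```

Under these assumptions the five steps add up to a 90 min run, most of which (55 min) is spent on the shallow 8 to 20% buffer B segment.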
To obtain a comprehensive map of N-glycosylation sites in RJ proteins, the glycopeptide samples were also analyzed on an electrospray ionization quadrupole time-of-flight system (Triple TOF 5600, AB SCIEX) coupled with an Eksigent nano liquid chromatography system (Eksigent Technologies). Separation was performed using a 150 × 0.075 mm C18 column (300 Å pore size) packed in-house, at a flow rate of 330 nl/min. Spectra were acquired at a speed of 20 MS/MS per second, and the peptides were eluted using the following gradient program: from 5 to 8% buffer B (0.1% formic acid in acetonitrile) for 0.1 min, from 8 to 30% buffer B for 22 min, from 30 to 48% buffer B for 6 min, from 48 to 80% buffer B for 1 min, and 80% buffer B for 5 min.

Data analysis
Tandem mass spectra were retrieved using Xcalibur (version 2.2, Thermo Fisher Scientific) and AnalystTF (version 1.6, AB SCIEX) software. The MS/MS spectra files were searched against the sequence database (72,672 entries) using in-house PEAKS software (version 6.0, Bioinformatics Solutions Inc.). The database was generated from protein sequences of Apis (downloaded April 2012), augmented with sequences from Saccharomyces cerevisiae (downloaded April 2012) and a common repository of adventitious proteins (cRAP, from The Global Proteome Machine Organization, downloaded April 2012). The precursor and fragment mass tolerances were set to 50 ppm and 0.05 Da, respectively, and tryptic cleavage specificity was set with up to two missed cleavages. Carbamidomethyl (C, +57.02) was set as a fixed modification. Oxidation (M, +15.99) and deamidation (N, +0.98) were the only variable modifications for the RJ sample, whereas oxidation (M, +15.99), deamidation (N, +0.98), and ¹⁸O deamidation (N, +2.998) were set for the glycopeptide-enriched RJ samples. False discovery rate (FDR) was controlled using a target/decoy database approach for both protein identification and modified peptide identification, applying a cut-off FDR of 0.2%. Protein identification was accepted only if it contained at least two unique peptides. All of the identified glycopeptides and assigned sites were manually checked by applying the following cut-off criteria: PEAKS score (-10logP) > 30, FDR < 0.2%, and detection of the majority of y or b ions as continuous series with strong intensity peaks. To predict subcellular localization, newly identified protein sequences were analyzed by PSORT II Prediction [30] (http://psort.hgc.jp/form2.html). To verify the presence of an N-terminal secretion signal peptide, the SignalP 4.1 Server [31] (http://www.cbs.dtu.dk/services/SignalP/) was also used. The D-cut off for signal-TM networks was set to 0.35.
The putative functions of identified proteins and glycoproteins were annotated by searching against the Uniprot database (http://www.uniprot.org/) and grouped on the basis of their molecular behavior and biological process in gene ontology terms. All unique sequences of N-glycopeptides were submitted online to WebLogo [86] in order to extract the N-glycosylated site motif of RJ proteins.
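The target/decoy FDR control mentioned above can be illustrated with a generic sketch. It is not the PEAKS implementation, whose internal procedure is not described in this article; the function name `score_cutoff_for_fdr`, the simple decoys/targets estimator, and the toy scores are all assumptions made for illustration.

```python
def score_cutoff_for_fdr(psms, max_fdr=0.002):
    """Return the lowest score threshold whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, one per peptide-spectrum match.
    The FDR at a threshold is estimated as (#decoy hits) / (#target hits)
    among all matches scoring at or above it. Returns None if no threshold
    qualifies. Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff

# Toy data: four target hits and one decoy hit
toy_psms = [(90, False), (80, False), (70, False), (60, True), (50, False)]
strict_cutoff = score_cutoff_for_fdr(toy_psms)        # default 0.2% FDR
loose_cutoff = score_cutoff_for_fdr(toy_psms, 0.25)   # tolerates 1 decoy per 4 targets
```

With the strict 0.2% threshold, the toy cutoff sits above the first decoy hit; relaxing the threshold admits lower-scoring matches.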

Availability of supporting data
The data sets supporting the results of this article (Additional file 1: Table S1 and Additional file 2: Figure S1) are included within the article and its additional files.

Additional files
Additional file 1: Table S1. Identification of Proteins and Peptides in Royal Jelly Proteins. All of the identified proteins are of Apis mellifera origin. Accession is the unique number given to mark the entry of a protein in the database of Apis (downloaded April 2012, version 4.5 of the honeybee genome). "-10logP" is the score calculated by PEAKS software (version 6.0, Bioinformatics Solutions Inc.). Z is the number of the carrying charge of the peptide. "# Spec" is the number of the spectrum of the peptide. "Start" and "end" correspond to the position of the N-terminal and C-terminal amino acids of the peptide in the protein sequence, respectively. RT is the retention time of the peptide in the mass spectrometry. "ppm" is the deviation value between the experimental mass and the theoretical mass of the peptide. C(+57.02) is the carbamidomethyl modification, M(+15.99) is the oxidation modification, and NQ(+0.98) is the deamidation modification.
Additional file 2: Figure S1. Spectra of N-glycosylated peptides in royal jelly proteins. Tandem mass spectra of the peptides in which N-glycosylated sites were identified using ¹⁸O-water labeling.