RAId_DbS: mass-spectrometry based peptide identification web server with knowledge integration

Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

doi:10.1186/1471-2164-9-505

BMC Genomics

Table 2 Example entries of the processed definition file

>NP_775259,NM_173167,Q8IWX7
60	I00	SAP	gTC → A	|NM_173167|	dbSNP:16970659
60	I00	SAP	v → I	|Q8IWX7|	V → I (dbSNP:16970659). FTId = VAR_027506
199	V00	SAP	GcA → T	|NM_173167|	dbSNP:35749208
377	R00	SAP	AaG → G	|NM_173167|	dbSNP:41389545
496	H00	SAP	d → H	|Q8IWX7|	D → H (breast cancer somatic mutation). FTId = VAR_035870
778	Q00	SAP	CgG → A	|NM_173167|	dbSNP:34242925
852	N00	SAP	AtC → A	|NM_173167|	dbSNP:11654824
852	N00	SAP	i → N	|Q8IWX7|	I → N (dbSNP:11654824). FTId = VAR_027507
>NP_076410,NM_023921,Q9NYW0
92	N08,N09,N10,N11,N12	PTM	N-linked (GlcNAc...)	|Q9NYW0|	N-linked (GlcNAc...) (Potential)
156	M00	SAP	AcG → T	|NM_023921|	dbSNP:597468
156	M00	SAP	m → T	|Q9NYW0|	M → T (dbSNP:597468). FTId = VAR_030009
156	M00	SAP	t → M	|NP_076410|	Alignment with Q9NYW0
158	N08,N09,N10,N11,N12	PTM	N-linked (GlcNAc...)	|Q9NYW0|	N-linked (GlcNAc...) (Potential)

Two sequence clusters are shown in this table to demonstrate the structure of our processed information file. The line beginning with the ">" symbol lists the accession numbers of the members of the cluster. Each of the other rows contains six entries separated by tabs. The first column gives the residue position. The second column gives the modified residue(s) that can occur at the position specified in the first column. The third column, labeled either SAP or PTM, gives the modification type. The fourth column explains the origin of the modification: a lowercase letter indicates the residue content in the source sequence, and an uppercase letter indicates the modified residue in the variant sequence. The notation v → I indicates that amino acid V in the source sequence can change into I, i.e., a SAP. The notation gTC → A is shorthand for a codon change from gtc to atc, i.e., a SNP that likewise changes the coded amino acid from V to I. The fifth column contains the accession number of the source of the modification, which may be a protein sequence or an mRNA. The sixth column contains additional information for the fourth column, such as disease information or a database entry index.

As an example, in the first row of the first cluster, dbSNP:16970659 indicates that this SNP comes from NCBI's dbSNP, with entry index 16970659. In the fifth row, the sixth column contains the disease origin. The additional Feature Identifier (FTId), VAR_xxxxxx, identifies the variant sequence documented by SwissProt. In the fourth row of the second cluster, the sixth column reads "Alignment with Q9NYW0", indicating that this SAP comes from a mismatch in the alignment between protein sequences in the clustering procedure. In the first and last rows of the second cluster, the second column contains N08, N09, ..., N12, all of which are possible post-translational modifications associated with glycosylation [22], as indicated in the sixth column.
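
The row layout described in the caption can be read with a few lines of Python. The following is only an illustrative sketch, not the RAId_DbS implementation: the names `Annotation`, `parse_definition`, and `expand_codon_change` are hypothetical, and the sketch assumes that every annotation row has exactly six tab-separated fields, that accessions in the fifth column are wrapped in "|", and that the fourth column uses the arrow character U+2192.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ARROW = "\u2192"  # arrow character assumed in the fourth column


@dataclass
class Annotation:
    position: int        # column 1: residue position
    residues: List[str]  # column 2: modified residue code(s), e.g. ["I00"]
    kind: str            # column 3: "SAP" or "PTM"
    origin: str          # column 4: origin of the modification
    source: str          # column 5: source accession, pipes stripped
    note: str            # column 6: additional information


def parse_definition(text: str) -> Dict[Tuple[str, ...], List[Annotation]]:
    """Group annotation rows under the accession tuple of their cluster."""
    clusters: Dict[Tuple[str, ...], List[Annotation]] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: comma-separated accessions of the cluster members.
            key = tuple(acc.strip() for acc in line[1:].split(","))
            current = clusters.setdefault(key, [])
            continue
        if current is None:
            raise ValueError("annotation row found before any cluster header")
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) != 6:
            raise ValueError(f"expected 6 tab-separated fields: {line!r}")
        pos, residues, kind, origin, source, note = fields
        current.append(Annotation(
            position=int(pos),
            residues=[r.strip() for r in residues.split(",")],
            kind=kind,
            origin=origin,
            source=source.strip("|"),
            note=note,
        ))
    return clusters


def expand_codon_change(shorthand: str) -> Tuple[str, str]:
    """Expand codon shorthand: the single lowercase base is the one replaced.

    Only meaningful for nucleotide-level rows; amino-acid rows raise ValueError.
    """
    codon, new_base = (part.strip() for part in shorthand.split(ARROW))
    lowered = [i for i, base in enumerate(codon) if base.islower()]
    if len(codon) != 3 or len(lowered) != 1 or len(new_base) != 1:
        raise ValueError(f"not a codon shorthand: {shorthand!r}")
    i = lowered[0]
    old = codon.lower()
    return old, old[:i] + new_base.lower() + old[i + 1:]
```

Calling `parse_definition` on the file contents returns one list of `Annotation` records per cluster, keyed by the accessions from the ">" line. For rows whose source is an mRNA, `expand_codon_change` turns the fourth-column shorthand into the pair of reference and variant codons, which can then be translated with any standard codon table.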


ISSN: 1471-2164
