Table 1 An example set of text strings matching the mass 270.14

From: A graph-based filtering method for top-down mass spectral identification

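| No. | Amino acid string | No. | Amino acid string |
|-----|-------------------|-----|-------------------|
| 1   | QAA               | 7   | GAAA              |
| 2   | AQA               | 8   | AGAA              |
| 3   | AAQ               | 9   | AAGA              |
| 4   | RGG               | 10  | AAAG              |
| 5   | GRG               | 11  | NR                |
| 6   | GGR               | 12  | RN                |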

  1. In proteoform identification, there are 20 common types of amino acids. A scaling factor of 100 is used to discretize the residue masses of the 20 amino acids. The alphabet consists of 19 integers because leucine and isoleucine have the same discretized mass, 11308. There are a total of 12 text strings whose discretized masses sum to 27014. For brevity, all masses in the text strings are represented by their corresponding amino acids. For example, the text string 11404, 15610 is represented by NR.
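
The discretization described in the footnote can be checked with a short sketch. It rests on several assumptions not stated in the source: the residue masses are standard monoisotopic reference values, the discretized mass is taken to be the residue mass times 100 rounded to the nearest integer, and the names `RESIDUE_MASSES`, `discretize`, and `strings_with_mass` are hypothetical helpers introduced only for illustration. Under those assumptions, a brute-force enumeration reproduces the 19-integer alphabet and the 12 strings in the table.

```python
# Monoisotopic residue masses in Da (standard reference values, assumed here;
# the source table does not list them).
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
SCALE = 100  # scaling factor named in the footnote


def discretize(mass, scale=SCALE):
    """Scale a mass and round to the nearest integer (assumed rounding rule)."""
    return round(mass * scale)


# Build the integer alphabet. Residues with the same discretized mass
# (L and I) collapse into one symbol; the first letter seen is kept.
ALPHABET = {}
for aa, mass in RESIDUE_MASSES.items():
    ALPHABET.setdefault(discretize(mass), aa)


def strings_with_mass(target):
    """Return every ordered string over ALPHABET whose masses sum to target."""
    results = []

    def extend(prefix, remaining):
        if remaining == 0:
            results.append("".join(prefix))
            return
        for mass, aa in ALPHABET.items():
            if mass <= remaining:
                prefix.append(aa)
                extend(prefix, remaining - mass)
                prefix.pop()

    extend([], target)
    return results


matches = strings_with_mass(27014)
print(len(ALPHABET), len(matches))
print(sorted(matches))
```

Exhaustive enumeration is only practical here because the target mass is small; it is meant as a check on the table, not as the filtering method of the article.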