Table 3 The similarities of proteins and their frameshifts (aligned by FrameAlign)

From: Frameshift and wild-type proteins are often highly similar because the genetic code and genomes were optimized for frameshift tolerance

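| Type | Species | Number of CDSs | Average Similarity: δ12 | Average Similarity: δ13 | Average Similarity: δ23 | Average Similarity: δ | Average Similarity: MAX | Average Similarity: MIN | Number of Gaps |
|---|---|---|---|---|---|---|---|---|---|
| Real CDSs (FrameAlign) | H. sapiens | 71,853 | 0.492 ± 0.043 | 0.472 ± 0.044 | 0.434 ± 0.040 | 0.466 ± 0.029 | 0.713 | 0.194 | 2 |
| Real CDSs (FrameAlign) | P. troglodytes | 15,781 | 0.491 ± 0.046 | 0.468 ± 0.046 | 0.431 ± 0.042 | 0.463 ± 0.030 | 0.625 | 0.311 | 2 |
| Real CDSs (FrameAlign) | M. musculus | 27,208 | 0.484 ± 0.046 | 0.469 ± 0.042 | 0.426 ± 0.040 | 0.460 ± 0.029 | 0.739 | 0.286 | 2 |
| Real CDSs (FrameAlign) | X. tropicalis | 7,706 | 0.481 ± 0.042 | 0.481 ± 0.041 | 0.439 ± 0.037 | 0.467 ± 0.028 | 0.644 | 0.353 | 2 |
| Real CDSs (FrameAlign) | D. rerio | 14,151 | 0.471 ± 0.044 | 0.468 ± 0.040 | 0.408 ± 0.040 | 0.449 ± 0.030 | 0.614 | 0.314 | 2 |
| Real CDSs (FrameAlign) | D. melanogaster | 23,936 | 0.475 ± 0.046 | 0.457 ± 0.044 | 0.362 ± 0.047 | 0.431 ± 0.030 | 0.689 | 0.236 | 2 |
| Real CDSs (FrameAlign) | C. elegans | 29,227 | 0.450 ± 0.047 | 0.475 ± 0.045 | 0.421 ± 0.043 | 0.449 ± 0.032 | 0.634 | 0.224 | 2 |
| Real CDSs (FrameAlign) | A. thaliana | 35,378 | 0.442 ± 0.045 | 0.477 ± 0.044 | 0.412 ± 0.041 | 0.444 ± 0.031 | 0.882 | 0.244 | 2 |
| Real CDSs (FrameAlign) | S. cerevisiae | 5,889 | 0.461 ± 0.041 | 0.510 ± 0.042 | 0.423 ± 0.038 | 0.465 ± 0.029 | 0.692 | 0.259 | 2 |
| Real CDSs (FrameAlign) | E. coli | 4,140 | 0.435 ± 0.046 | 0.426 ± 0.047 | 0.372 ± 0.043 | 0.411 ± 0.030 | 0.571 | 0.237 | 2 |
| Real CDSs (FrameAlign) | Average | 235,269 | 0.468 ± 0.045 | 0.470 ± 0.043 | 0.413 ± 0.041 | 0.450 ± 0.030 | 0.882ᵃ | 0.194ᵃ | 2 |
| Random CDSs (FrameAlign) | Three frames | 100,000 | 0.394 ± 0.028 | 0.394 ± 0.028 | 0.395 ± 0.028 | 0.394 ± 0.016 | 0.477 | 0.330 | 2 |
| Random CDSs (FrameAlign) | Three random CDSs | 100,000 × 3 | 0.383 ± 0.028 | 0.383 ± 0.028 | 0.383 ± 0.028 | 0.383 ± 0.018 | 0.458 | 0.304 | 0 |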

ᵃ Very large/small similarity values were observed in a few very short or repetitive peptides.
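
The table does not define δ on this page. Its reported values are consistent with δ being the arithmetic mean of the three pairwise similarities δ12, δ13 and δ23, and the count in the "Average" row matches the sum of the per-species CDS counts. The following is a minimal sketch that checks both readings against the tabulated means. The names `ROWS`, `CDS_COUNTS`, `mean_pairwise` and `max_deviation` are illustrative and not taken from the article, and the averaging rule itself is an assumption inferred from the numbers.

```python
# Tabulated means only (standard deviations omitted): (d12, d13, d23, d).
ROWS = {
    "H. sapiens": (0.492, 0.472, 0.434, 0.466),
    "P. troglodytes": (0.491, 0.468, 0.431, 0.463),
    "M. musculus": (0.484, 0.469, 0.426, 0.460),
    "X. tropicalis": (0.481, 0.481, 0.439, 0.467),
    "D. rerio": (0.471, 0.468, 0.408, 0.449),
    "D. melanogaster": (0.475, 0.457, 0.362, 0.431),
    "C. elegans": (0.450, 0.475, 0.421, 0.449),
    "A. thaliana": (0.442, 0.477, 0.412, 0.444),
    "S. cerevisiae": (0.461, 0.510, 0.423, 0.465),
    "E. coli": (0.435, 0.426, 0.372, 0.411),
    "Average": (0.468, 0.470, 0.413, 0.450),
    "Three frames": (0.394, 0.394, 0.395, 0.394),
    "Three random CDSs": (0.383, 0.383, 0.383, 0.383),
}

# Per-species CDS counts from the "Real CDSs" rows.
CDS_COUNTS = {
    "H. sapiens": 71853, "P. troglodytes": 15781, "M. musculus": 27208,
    "X. tropicalis": 7706, "D. rerio": 14151, "D. melanogaster": 23936,
    "C. elegans": 29227, "A. thaliana": 35378, "S. cerevisiae": 5889,
    "E. coli": 4140,
}


def mean_pairwise(d12, d13, d23):
    """Assumed definition: delta is the mean of the three pairwise values."""
    return (d12 + d13 + d23) / 3


def max_deviation(rows):
    """Largest gap between the assumed mean and the tabulated delta."""
    return max(abs(mean_pairwise(*v[:3]) - v[3]) for v in rows.values())


if __name__ == "__main__":
    print(f"max |mean - delta| = {max_deviation(ROWS):.4f}")
    print(f"total CDSs = {sum(CDS_COUNTS.values()):,}")
```

Because the table rounds to three decimals, agreement within about 0.0005 is the most this check can show; it does not confirm how the authors computed δ.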