Table 2 Training set used to build the regular expression and pHMM models

From: Systematic interrogation of the Conus marmoreus venom duct transcriptome with ConoSorter reveals 158 novel conotoxins and 13 new gene superfamilies

 

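| Gene superfamily or class | Signal: total clusters | Signal: unique seq. | Signal: total seq. | Pro-region: total clusters | Pro-region: unique seq. | Pro-region: total seq. | Mature: total clusters | Mature: unique seq. | Mature: total seq. |
|---|---|---|---|---|---|---|---|---|---|
| A | 3 | - | 77 | 21 | 6 | 136 | 23 | 6 | 195 |
| B1 | 2 | 1 | 8 | 2 | - | 13 | 7 | 3 | 13 |
| B2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| B3 | 1 | 1 | 1 | - | - | - | 1 | 1 | 1 |
| C | 1 | - | 3 | 2 | - | 4 | 2 | - | 4 |
| D | 1 | - | 10 | 3 | 1 | 21 | 5 | 1 | 30 |
| E | 1 | 1 | 1 | - | - | - | 1 | 1 | 1 |
| F | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| G | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| H | 1 | - | 2 | 3 | 2 | 5 | 4 | 2 | 7 |
| I1 | 3 | 1 | 12 | 5 | 4 | 13 | 8 | 2 | 48 |
| I2 | 2 | - | 31 | 14 | 6 | 32 | 12 | 5 | 50 |
| I3 | 1 | - | 4 | 3 | 2 | 7 | 2 | - | 8 |
| J | 1 | - | 8 | 2 | 1 | 5 | 2 | - | 11 |
| K | 1 | 1 | 1 | 1 | 1 | 1 | 1 | - | 3 |
| L | 4 | 2 | 10 | 5 | 2 | 10 | 5 | 2 | 11 |
| M | 6 | 2 | 152 | 16 | 5 | 263 | 34 | 16 | 267 |
| N | 1 | - | 2 | 1 | - | 3 | 1 | - | 3 |
| O1 | 14 | 3 | 198 | 20 | 7 | 292 | 32 | 4 | 437 |
| O2 | 4 | 1 | 50 | 15 | 6 | 75 | 12 | 6 | 81 |
| O3 | 1 | - | 16 | 7 | 5 | 21 | 3 | 1 | 26 |
| P | 2 | 1 | 6 | 5 | 3 | 7 | 3 | 1 | 12 |
| S | 4 | 3 | 7 | 3 | - | 9 | 5 | 1 | 14 |
| T | 3 | - | 79 | 18 | 10 | 112 | 12 | 6 | 138 |
| V | 1 | - | 2 | 1 | - | 2 | 1 | - | 2 |
| Y | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| M---L-LTVA | 1 | - | 5 | 2 | 1 | 8 | 4 | 2 | 8 |
| MKFPLLFISL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| MKLCVVIVLL | 1 | - | 2 | 1 | 1 | 1 | 1 | - | 3 |
| MKLLLTLLLG | 1 | 1 | 1 | - | - | - | - | - | - |
| MKVAVVLLVS | 1 | 1 | 1 | - | - | - | - | - | - |
| MRCLSIFVLL | 1 | - | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| MRFLHFLIVA | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| MRFYIGLMAA | 1 | - | 2 | 1 | - | 3 | 1 | - | 4 |
| MSKLVILAVL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| MSTLGMTLL- | 1 | - | 5 | 3 | 2 | 5 | 3 | 1 | 5 |
| MTAKATLLVL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| MTFLLLLVSV | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| MTLTFLLVVA | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Conantokin | 2 | 1 | 8 | 2 | - | 13 | 4 | 1 | 19 |
| Conodipine | - | - | - | - | - | - | 2 | 2 | 2 |
| Conohyal | 1 | - | 2 | - | - | - | 2 | 2 | 2 |
| Conolysin | - | - | - | - | - | - | 1 | - | 2 |
| Conomap | - | - | - | - | - | - | 1 | 1 | 1 |
| Conomarphin | 1 | - | 2 | 1 | - | 2 | 1 | - | 4 |
| Conopeptide Y | 1 | 1 | 1 | 1 | 1 | 1 | 1 | - | 2 |
| Conophan | - | - | - | - | - | - | 1 | - | 2 |
| Conoporin | 1 | 1 | 1 | - | - | - | 1 | 1 | 1 |
| Conopressin | 1 | 1 | 1 | - | - | - | 1 | - | 6 |
| Conorfamide | - | - | - | - | - | - | 1 | - | 2 |
| Conotoxin | 29 | 4 | 697 | 84 | 14 | 1,299 | 189 | 87 | 1,730 |
| Conotoxin-like | 1 | - | 2 | 1 | 1 | 1 | 1 | - | 2 |
| Contryphan | 2 | 1 | 10 | 1 | - | 13 | 2 | 1 | 15 |
| Contulakin | 1 | - | 3 | 1 | - | 3 | 2 | 1 | 4 |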

  1. For each gene superfamily and class, the table shows the total number of clusters (containing conopeptides with high sequence similarity), unique sequences, and total sequences in the ER signal, pro-, and mature regions.