Table 1 Bacterial species used in the analysis.

From: Predicting protein function by machine learning on amino acid sequences – a critical evaluation

| Species | Total no. of proteins | No. of 'unknowns' | 'Unknowns' (%) | Average GC content (%) |
|---|---|---|---|---|
| Haemophilus ducreyi | 1830 | 381 | 21 | 38.22 |
| Neisseria gonorrhoeae | 2188 | 667 | 30 | 52.69 |
| Chlamydia trachomatis | 902 | 318 | 35 | 41.31 |
| Treponema pallidum | 1051 | 28 | 5 | 52.77 |
| Streptococcus agalactiae | 2177 | 567 | 26 | 35.65 |
| Ureaplasma urealyticum | 614 | 275 | 48 | 25.50 |
| Mycoplasma genitalium | 485 | 158 | 33 | 31.69 |