Table 1 Bacterial species used in the analysis.

From: Predicting protein function by machine learning on amino acid sequences – a critical evaluation

| Species | Total # of proteins | # of 'unknowns' | 'Unknowns' (%) | Average GC content (%) |
|---|---|---|---|---|
| Haemophilus ducreyi | 1830 | 381 | 21 | 38.22 |
| Neisseria gonorrhoeae | 2188 | 667 | 30 | 52.69 |
| Chlamydia trachomatis | 902 | 318 | 35 | 41.31 |
| Treponema pallidum | 1051 | 28 | 5 | 52.77 |
| Streptococcus agalactiae | 2177 | 567 | 26 | 35.65 |
| Ureaplasma urealyticum | 614 | 275 | 48 | 25.50 |
| Mycoplasma genitalium | 485 | 158 | 33 | 31.69 |