From: Predicting protein function by machine learning on amino acid sequences – a critical evaluation
Species | Total proteins | 'Unknown' proteins | 'Unknowns' (%) | Average GC content (%) |
---|---|---|---|---|
Haemophilus ducreyi | 1830 | 381 | 21 | 38.22 |
Neisseria gonorrhoeae | 2188 | 667 | 30 | 52.69 |
Chlamydia trachomatis | 902 | 318 | 35 | 41.31 |
Treponema pallidum | 1051 | 28 | 5 | 52.77 |
Streptococcus agalactiae | 2177 | 567 | 26 | 35.65 |
Ureaplasma urealyticum | 614 | 275 | 48 | 25.50 |
Mycoplasma genitalium | 485 | 158 | 33 | 31.69 |