From: Machine learning approaches to predict the Plant-associated phenotype of Xanthomonas strains
Top features enriched in pathogens
 | Domain | Description | RF | Lasso | CART | P (pathogens) | NP (non-pathogens) | Enrichment |
 | PF13855 | Leucine-rich repeat | 100.0 | 74.9 | 100.0 | 0.62 | 0.04 | 2.14e-08 |
 | PF09613 | Type III secretion system, HrpB1/HrpK | 52.1 | 14.3 | 75.6 | 0.83 | 0.30 | 3.83e-06 |
* | PF05932 | Tir chaperone protein (CesT) family | | | | 0.82 | 0.28 | 3.96e-06 |
* | PF09483 | Type III secretion protein HpaP | | | | 0.83 | 0.30 | 3.83e-06 |
* | PF09486 | Type III secretion protein HrpB7 | | | | 0.83 | 0.30 | 3.83e-06 |
* | PF09487 | Type III secretion protein HrpB2 | | | | 0.83 | 0.30 | 3.83e-06 |
* | PF09502 | Type III secretion protein HrpB4 | | | | 0.83 | 0.30 | 3.83e-06 |
* | PF05819 | NolX | | | | 0.69 | 0.30 | 3.34e-03 |
 | PF09994 | Domain of unknown function DUF2235 | 19.0 | 13.0 | 33.3 | 0.54 | 0.09 | 8.84e-05 |
 | PF13276 | HTH-like domain | 23.0 | 8.3 | 26.8 | 0.91 | 0.49 | 1.82e-04 |
 | PF13333 | Integrase, catalytic core | 13.4 | 3.6 | 12.2 | 0.51 | 0.09 | 2.39e-04 |
 | PF13579 | Glycosyltransferase subfamily 4-like | 16.0 | 100.0 | 4.4 | 0.88 | 0.53 | 2.89e-03 |
 | PF14341 | Type 4 fimbrial biogenesis protein PilX | 15.2 | 18.3 | 4.5 | 0.85 | 0.49 | 4.25e-03 |
 | PF01382 | Avidin/streptavidin | 6.3 | 33.6 | 0.0 | 0.26 | 0.04 | 2.74e-02 |
 | PF10117 | 5-methylcytosine restriction system component | 17.0 | 77.6 | 3.8 | 0.32 | 0.08 | 3.30e-02 |
 | PF12161 | N6 adenine-specific DNA methyltransferase | 5.5 | 27.7 | 0.1 | 0.88 | 0.62 | 4.15e-02 |
* | PF01420 | Restriction endonuclease, type I, HsdS | | | | 0.85 | 0.60 | 5.74e-02 |
Top features enriched in non-pathogens
 | Domain | Description | RF | Lasso | CART | P (pathogens) | NP (non-pathogens) | Enrichment |
 | PF12840 | Helix-turn-helix domain | 46.7 | 67.3 | 70.1 | 0.25 | 0.75 | 1.87e-05 |
 | PF13570 | Pyrrolo-quinoline quinone-like domain | 12.7 | 4.3 | 18.2 | 0.60 | 0.98 | 1.01e-04 |
 | PF03552 | Cellulose synthase | 15.5 | 0.0 | 25.5 | 0.35 | 0.81 | 1.80e-04 |
* | PF03170 | Cellulose synthase BcsB, bacterial | | | | 0.37 | 0.83 | 1.01e-04 |
* | PF05420 | Cellulose synthase operon C, C-terminal | | | | 0.38 | 0.81 | 4.28e-04 |
* | PF01270 | Glycoside hydrolase, family 8 | | | | 0.37 | 0.79 | 7.33e-04 |
 | PF13424 | Tetratricopeptide repeat | 16.9 | 26.0 | 26.1 | 0.51 | 0.91 | 4.28e-04 |
* | PF12823 | Domain of unknown function DUF3817 | | | | 0.52 | 0.92 | 2.85e-04 |
 | PF06629 | MltA-interacting MipA | 14.5 | 47.7 | 4.6 | 0.54 | 0.85 | 1.57e-02 |
 | PF00656 | Caspase domain | 6.1 | 0.5 | 19.6 | 0.58 | 0.85 | 4.45e-02 |
 | PF13391 | HNH nuclease | 8.5 | 55.6 | 1.4 | 0.08 | 0.30 | 5.35e-02 |
 | PF10013 | Uncharacterised conserved protein UCP037205 | 4.0 | 27.1 | 0.3 | 0.31 | 0.57 | 8.03e-02 |
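For readers who want to work with these rows programmatically, the following is a minimal sketch of one way to parse a pipe-delimited row into a typed record. It assumes the nine-column layout shown here (marker, Domain, Description, RF, Lasso, CART, P, NP, Enrichment). The names `DomainRow`, `parse_row`, `frac_p` and `frac_np` are hypothetical, and reading P and NP as per-group fractions is an assumption rather than something the table states. Cells that are empty or non-numeric in the three model-score columns are stored as `None`.

```python
from typing import NamedTuple, Optional


class DomainRow(NamedTuple):
    """One table row; field names are illustrative, not from the paper."""
    starred: bool            # True when the row carries a leading "*"
    pfam_id: str
    description: str
    rf: Optional[float]      # model-score columns may be empty
    lasso: Optional[float]
    cart: Optional[float]
    frac_p: float            # "P" column (assumed: fraction in pathogens)
    frac_np: float           # "NP" column (assumed: fraction in non-pathogens)
    enrichment: float


def _score(cell: str) -> Optional[float]:
    """Return the cell as a float, or None if it is empty or not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


def parse_row(line: str) -> DomainRow:
    """Parse one pipe-delimited data row of the table."""
    text = line.strip()
    if text.endswith("|"):
        text = text[:-1]     # drop the single trailing delimiter
    cells = [c.strip() for c in text.split("|")]
    if len(cells) != 9:
        raise ValueError(f"expected 9 cells, got {len(cells)}: {line!r}")
    marker, pfam_id, desc, rf, lasso, cart, p, np_, enr = cells
    return DomainRow(
        starred=(marker == "*"),
        pfam_id=pfam_id,
        description=desc,
        rf=_score(rf),
        lasso=_score(lasso),
        cart=_score(cart),
        frac_p=float(p),
        frac_np=float(np_),
        enrichment=float(enr),
    )


# Two rows copied from the table above.
SAMPLE_LINES = [
    " | PF13855 | Leucine-rich repeat | 100.0 | 74.9 | 100.0 | 0.62 | 0.04 | 2.14e-08 |",
    "* | PF05932 | Tir chaperone protein (CesT) family | | | | 0.82 | 0.28 | 3.96e-06 |",
]

rows = [parse_row(line) for line in SAMPLE_LINES]
for row in rows:
    # Difference between the P and NP columns for each domain.
    print(row.pfam_id, row.starred, round(row.frac_p - row.frac_np, 2))
```

Keeping the starred flag as its own field makes it easy to filter those rows in or out before sorting by any of the score columns.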