
Table 3 List of all attribute categories used to form the data sets in this study, and the number of attributes in each category for all data points in the training data sets for Dickeya dadantii (Dd3937) and Pectobacterium carotovorum (WPP14)

From: Identification of host-microbe interaction factors in the genomes of soft rot-associated pathogens Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 with supervised machine learning

| Category | Subcategory | Dd3937 | WPP14 | Reference |
| --- | --- | --- | --- | --- |
| Sequence homology | Subtotal | 297 | 297 | |
| | Gamma strains | 239 | 239 | Additional file 2a |
| | Non-gamma strains | 58 | 58 | Additional file 2b |
| Phenotypes of interest | Subtotal | 194 | 194 | |
| | Taxonomy statistics | 76 | 76 | Additional file 2c, d |
| | Lifestyle statistics | 118 | 118 | Additional file 2c, d |
| Gene characteristics | Subtotal | 23 | 21 | |
| | GC content | 1 | 1 | This study |
| | Subcellular localization | 1 | 1 | [42, 46] |
| | Phylogenetic profile | 6 | 6 | [40, 41] |
| | Fingerprints scanning | 3 | 3 | [43, 44] |
| | Codon adaptation index (CAI) | 3 | 3 | [47, 48] |
| | Physical adjacency (gene neighbor) | 2 | 2 | [49, 50] |
| | Operon prediction | 1 | 1 | [36, 51] |
| | Phylogenetic conservation | 1 | 1 | This study |
| | COG functional category | 1 | 1 | [52] |
| | Genomic island | 4 | 1 | [53, 54] |
| | Computed structural and physicochemical features of proteins and peptides | 40 | 66 | [35, 55] |
| Functional genomics | Subtotal | 52 | 3 | |
| | Binding site prediction | 32 | 0 | Additional file 3b |
| | Gene expression | 14 | 3 | Additional file 3a |
| | Proteomics | 6 | 0 | Additional file 3a |
| | Total | 606 | 581 | |
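
As a quick consistency check on the counts in Table 3, the two column totals can be reproduced with a short Python sketch. It assumes that the computed structural and physicochemical features row is counted separately from the Gene characteristics subtotal; this is inferred from the arithmetic and is not stated in the table. The variable and function names are illustrative only.

```python
# Counts copied from Table 3 as (Dd3937, WPP14) pairs.
subtotals = {
    "Sequence homology": (297, 297),
    "Phenotypes of interest": (194, 194),
    "Gene characteristics": (23, 21),
    "Functional genomics": (52, 3),
}

# Assumption: this row is added on top of the Gene characteristics
# subtotal rather than being included in it (inferred, not stated).
protein_features = (40, 66)


def column_total(index):
    """Sum the category subtotals and the protein-feature row for one column."""
    return sum(pair[index] for pair in subtotals.values()) + protein_features[index]


totals = {"Dd3937": column_total(0), "WPP14": column_total(1)}
print(totals)  # {'Dd3937': 606, 'WPP14': 581}
```

Under that assumption the sums match the Total row for both genomes.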
