
Table 2 Sequence features found on nodes with the highest Shannon’s index scores

From: Graph mining for next generation sequencing: leveraging the assembly graph for biological insights

| Sample | Shannon's Index Score | Sequence Feature(s) | BLAST E-Values |
| --- | --- | --- | --- |
| Female 33 | 3.69 | Transporter, RelB/DinJ, Transposase | 5e-15, 5e-32, 4e-32 |
| Female 33 | 3.66 | Transposase | 1.00e-45 |
| Female 58 | 2.58 | Hypothetical protein | 6.00e-04 |
| Female 58 | 2.46 | TonB-dependent receptor | 6.00e-51 |
| Female 92 | 2.82 | Delta-lactam-biosynthetic de-N-acetylase | 3.00e-57 |
| Female 92 | 2.74 | Resolvase | 4.00e-87 |
| Male 104 | 2.53 | Transposase, Cbl | 3e-32, 7e-27 |
| Male 104 | 2.47 | Phosphatase, Histidine phosphotransferase | 2e-108, 4e-72 |
| Male 68 | 2.86 | PG1 protein | 1.00e-32 |
| Male 68 | 2.43 | Transposase | 1.00e-45 |
| Female 638754422 | 3.75 | Transposase, IS4 family | 8.00e-61 |
| Female 638754422 | 3.68 | Transposase | 2.00e-22 |
| Female 763820215 | 3.26 | Major facilitator transporter | 0.00e+00 |
| Female 763820215 | 2.51 | ATPase AAA | 1.00e-120 |
| Female 764042746 | 2.69 | Transposase | 1.00e-34 |
| Female 764042746 | 2.64 | DEAD/DEAH box helicase | 1.00e-172 |
| Female 809635352 | 4.65 | 30S ribosomal protein S12 | 1.00e-26 |
| Female 809635352 | 4.58 | Uracil phosphoribosyltransferase | 7.00e-05 |
| Female 764143897 | 3.90 | None | NA |
| Female 764143897 | 3.63 | None | NA |
| Male 604812005 | 2.96 | Tetratricopeptide repeat protein | 6.00e-47 |
| Male 604812005 | 2.71 | ATP-dependent DNA helicase RecQ | 2.00e-149 |
| Male 763435843 | 3.65 | Putative transposase, Major Facilitator Superfamily protein, Glycosyltransferase, Group 1 family protein | 5e-12, 1e-24, 8e-13 |
| Male 763435843 | 3.47 | Transposase | 1.00e-43 |
| Male 763961826 | 2.72 | None | NA |
| Male 763961826 | 2.65 | Transposase family protein, DNA polymerase IV | 4e-67, 2e-51 |