
Table 3 A sample of irrelevant hits identified through the data cleaning process. Irrelevant hits returned by the viral lineage search either had missing or incomplete lineage information (unknown, unclassified, or no species information) or originated from an unrelated lineage (bacterial species)

From: A systematic bioinformatics approach for large-scale identification and characterization of host-pathogen shared sequences

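| Protein ID | "Organism" field value | Remark |
| --- | --- | --- |
| CAA41747.1 | Retroviridae | Species information not available |
| CAA41748.1 | Retroviridae | Species information not available |
| AAB29320.1 | Tobamovirus | Species information not available |
| AAB22506.1 | Orthohantavirus | Species information not available |
| CAM83964.1 | unclassified Parechovirus | Species information not available |
| 5AUM_D | Potyviridae | Species information not available |
| 2MLG_A | Fuselloviridae | Species information not available |
| 1BDE_A | Unknown | Species information not available |
| 1Q3Z_A | Unknown | Species information not available |
| 3F2E_A | Rudivirus | Species information not available |
| ANM47321.1 | Streptococcus suis | Bacterial origin |
| ANM47427.1 | Streptococcus suis | Bacterial origin |
| AAT65035.1 | Mycoplasma fermentans | Bacterial origin |
| AAT65057.1 | Mycoplasma fermentans | Bacterial origin |
| BAA94190.1 | Escherichia coli O157:H7 | Bacterial origin |
| CAC83125.1 | Escherichia coli | Bacterial origin |
| CAH23236.1 | Escherichia coli | Bacterial origin |
| CAH23267.1 | Escherichia coli | Bacterial origin |
| CAH23268.1 | Escherichia coli | Bacterial origin |
| 4PJZ_A | Actinoplanes teichomyceticus | Bacterial origin |
| 4PK0_A | Actinoplanes teichomyceticus | Bacterial origin |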
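
The two categories of irrelevant hit described in the caption can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes each hit has already been annotated with a taxonomic rank and a superkingdom from a separate taxonomy lookup; the `rank` and `superkingdom` arguments, the function name `classify_hit`, and the set `SPECIES_LEVEL_RANKS` are all hypothetical and do not appear in the source.

```python
# Illustrative only: ranks treated as "species-level or below" are an assumption.
SPECIES_LEVEL_RANKS = {"species", "subspecies", "serotype", "strain"}


def classify_hit(organism, rank, superkingdom):
    """Return a remark for an irrelevant hit, or None if the hit is kept.

    `rank` and `superkingdom` are hypothetical annotations assumed to come
    from a taxonomy lookup; they are not part of the "Organism" field.
    """
    name = organism.strip().lower()
    # Missing or incomplete lineage: "Unknown" or "unclassified ..." entries.
    if name == "unknown" or name.startswith("unclassified"):
        return "Species information not available"
    # Unrelated lineage: in the table above, these are all bacterial.
    if superkingdom == "Bacteria":
        return "Bacterial origin"
    # Viral hits resolved only to a higher rank (e.g. family or genus).
    if rank not in SPECIES_LEVEL_RANKS:
        return "Species information not available"
    return None


# A few rows from the table, with assumed rank/superkingdom annotations.
hits = [
    ("CAA41747.1", "Retroviridae", "family", "Viruses"),
    ("AAB29320.1", "Tobamovirus", "genus", "Viruses"),
    ("CAM83964.1", "unclassified Parechovirus", "no rank", "Viruses"),
    ("1BDE_A", "Unknown", None, None),
    ("ANM47321.1", "Streptococcus suis", "species", "Bacteria"),
]

remarks = {pid: classify_hit(org, rank, sk) for pid, org, rank, sk in hits}
for pid, remark in remarks.items():
    print(pid, remark, sep="\t")
```

The name checks run first so that entries with no usable annotation (such as "Unknown") are flagged without relying on the taxonomy lookup.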