Table 1 Real genomic datasets tested in this study

From: HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers

| Category | Selfgroup | Closegroup | No. of genes | Date of BLAST | Max. no. of hits | List of input genomes (organism name and NCBI accession no.) |
| --- | --- | --- | --- | --- | --- | --- |
| Alphaproteobacteria | SFG Rickettsia | Rickettsiales | 8484 | Jan. 2013 | 200 | R. akari str. Hartford [GenBank:NC_009881]; R. felis URRWXCal2 [GenBank:NC_007109]; R. massiliae MTU5 [GenBank:NC_009900]; R. slovaca 13-B [GenBank:CP002428]; R. rickettsii str. ‘Sheila Smith’ [GenBank:NC_009882]; R. africae ESF-5 [GenBank:NC_012633]; R. conorii str. Malish 7 [GenBank:NC_003103]¹ |
| Firmicutes | Streptococcus | Bacilli | 11906 | Nov. 2013 | 100 | S. anginosus C238 [GenBank:NC_022239]; S. gallolyticus UCN34 [GenBank:NC_013798]; S. intermedius B196 [GenBank:NC_022246]; S. mutans LJ23 [GenBank:NC_017768]; S. pneumoniae A026 [GenBank:NC_022655]; S. suis JS14 [GenBank:NC_017618] |
| Epsilonproteobacteria | Helicobacter | Campylobacterales | 10531 | Mar. 2013, Nov. 2013² | 200 | H. acinonychis Sheeba [GenBank:NC_008229]; H. bizzozeronii CIII-1 [GenBank:NC_015674]; H. cinaedi PAGU611 [GenBank:NC_017761]; H. felis ATCC 49179 [GenBank:NC_014810]; H. mustelae 12198 [GenBank:NC_013949]; H. hepaticus ATCC 51449 [GenBank:NC_004917] |
| Gammaproteobacteria | Erwinia | Enterobacteriales | 19013 | Mar. 2013 | 200 | E. amylovora ATCC 49946 [GenBank:NC_013971]; E. billingiae Eb661 [GenBank:NC_014306]; E. sp. Ejp617 [GenBank:NC_017445]; E. pyrifoliae DSM 12163 [GenBank:NC_017390]; E. tasmaniensis Et1/99 [GenBank:NC_010694] |
| Actinobacteria | Mycobacterium africanum | Mycobacterium | 3830 | Oct. 2013 | 100 | M. africanum GM041182 [GenBank:NC_015758] |
| Unicellular red algae | Galdieria sulphuraria | Eukaryota | 7174 | Dec. 2013 | 50 | G. sulphuraria [GenBank:ASM34128v1]³ |
| Higher animal | Homo sapiens | Animalia | 22516⁴ | Nov. 2013 | 1000 | H. sapiens [GenBank:GCF_000001405.13] |

  1. The genomes used in this study are identical to those used in [37].
  2. Two independent analyses were conducted on different dates, and similar outcomes were obtained. The more recent result was reported.
  3. The genome used in this study is identical to that used in [79].
  4. For genes with multiple isoforms, the longest CDS was extracted using an in-house Perl script and used for the analysis.
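
The longest-CDS selection mentioned in the last footnote can be illustrated with a minimal sketch. This is not the authors' in-house Perl script, which is not shown in the source. The function name, the `(gene_id, isoform_id, cds_sequence)` record layout, and the tie-breaking rule (keep the first isoform seen) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple


def longest_cds_per_gene(
    records: Iterable[Tuple[str, str, str]],
) -> Dict[str, Tuple[str, str]]:
    """Keep, for each gene, the isoform whose CDS is longest.

    `records` yields (gene_id, isoform_id, cds_sequence) tuples. This layout
    is hypothetical; the original script's input format is not described.
    When two isoforms have equal length, the one seen first is kept
    (an assumed tie-breaking rule).
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, isoform_id, cds in records:
        current = best.get(gene_id)
        # Replace only when this isoform is strictly longer than the stored one.
        if current is None or len(cds) > len(current[1]):
            best[gene_id] = (isoform_id, cds)
    return best


# Toy input: geneA has two isoforms, geneB has one.
example = [
    ("geneA", "geneA.1", "ATGAAATAA"),
    ("geneA", "geneA.2", "ATGAAACCCTAA"),
    ("geneB", "geneB.1", "ATGTAA"),
]
selected = longest_cds_per_gene(example)
```

With this toy input, `selected` holds one entry per gene, so each gene contributes a single sequence to the downstream analysis regardless of how many isoforms are annotated.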