Table 1 Previous analyses of gene fusions

From: Systematic identification and analysis of frequent gene fusion events in metabolic pathways

| No. of genomes | Organisms analyzed | No. of detected fused proteins | No. of predicted functional linkages** | Ref | Website | Fusion detection method*** | Homology- or orthology-based?*** |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | EC, SC | - | 6,809 in EC; 45,502 in SC | [62] | - | Gene fusion (BLAST) & domain fusion (ProDom) | All homologs (5 % most promiscuous domains removed) |
| 3 | EC, PH, SC | - | 854 in EC; 107 in PH; 918 in SC | [63] | - | Gene fusion (BLAST) | All homologs |
| 4 | EC, HI, MJ, SC | 64 | - | [2] | List of fusions a | Gene fusion (BLAST & S-W) | All homologs |
| 17 | Bact, Arch | 229 | - | [64] | - | Gene fusion (S-W) | Orthologs only (BBH) |
| 24 | Bact, Arch (+SC) | 2,365 (621 families) | - | [65] | - | Gene fusion (BLAST, component overlap <10 %) | All homologs |
| 30 | Bact, Arch (+SC) | 4,515 | - | [3] | DB (not maintained) b; Fusion stats c | Gene fusion (BLAST) | Orthologs only (one link between each COG) |
| 89 | Bact, Arch | ∼20,000 | - | [66] | FusionDB (not maintained) d | Gene fusion (BLAST) | Orthologs only (BBH) |
| 184 | Bact, Arch, Eukar | 130,229 | 2,192,019 | [25] | Results for download e | Domain fusion (Pfam) | All homologs (promiscuous domains removed) |
| 20 | Bact, Arch, Eukar | 49 | - | [67, 68] | SAFE software; FED DB (not maintained) f | Gene fusion (BLAST) | All homologs (promiscuous domains removed) |
| 30 | Bact, Arch | 2,490 by MF; 5,339 by FT | - | [69] | MosaicFinder; FusedTriplets software g | Gene fusion (BLAST) | Graph topology of seq. similarity network is used for scoring |
| 1,895* | Bact, Arch | user set-dependent, 2,193 in EC | - | [70] | MicroScope h | n/a | Synteny-based fusion detection |
| 2,031* | Bact, Arch, Eukar | user set-dependent | - | [24, 71] | String DB i | n/a | n/a |
| 2,291* | Bact, Arch (+SC) | - | 2,209,622 | [72] | Prolinks j | Gene fusion (BLAST) | All homologs (promiscuous domains removed) |
| 31,442* | Bact, Arch, Eukar | user set-dependent, 397 in EC | - | [34, 73] | JGI IMG k | Gene fusion (USEARCH) | All homologs (as in [2]) |
| user set | Eukar | - | user set-dependent | [24] | CODA software l | Domain fusion (Pfam) | All homologs (scoring immune to promiscuous domains) |
| 2 | Eukar (HS, SC) | 235 in HS; 189 in SC | - | [74] | Domain Fusion DB m | Domain fusion (Pfam) | All homologs (promiscuous domains removed) |
| 1 | Eukar (TT) | 80 in TT | - | [17] | DeFuser n | Domain fusion (KOG) | Compares N and C termini of query sequence to KOG DB |

  1. The Table is modified and extended from Table 1 in Reid et al. [24]
  2. Abbreviations: DB database, MF MosaicFinder software, FT FusedTriplets software, n/a information not available, S-W Smith-Waterman. Organisms: Bact Bacteria, Arch Archaea, Eukar Eukaryota, EC E. coli, HI H. influenzae, HS H. sapiens, MJ M. jannaschii, PH P. horikoshii, SC S. cerevisiae, TT T. thermophila
  3. * Statistics as of November 2015
  4. ** Predicted potential protein-protein interactions (‘functional links’) based on gene fusion events; the actual fused proteins were NOT reported in some studies
  5. *** Two main bioinformatics approaches to identify fusion events were used: whole protein sequence comparisons (‘gene fusion’) or domain family comparisons (‘domain fusion’)
  6. a http://www.nature.com/nature/journal/v402/n6757/extref/402086a0-s2.html
  7. b http://fusion.bu.edu
  8. c http://www.pnas.org/content/98/14/7940/T1.expansion.html
  9. d http://www.igs.cnrs-mrs.fr/FusionDB/
  10. e http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2248599/#S8
  11. f Contact Sofia KOSSIDA (sofia.kossida@igh.cnrs.fr)
  12. g http://sourceforge.net/projects/mosaicfinder/
  13. h https://www.genoscope.cns.fr/agc/microscope/compgenomics/fusfis.php?
  14. i http://string-db.org/
  15. j http://prl.mbi.ucla.edu/prlbeta/
  16. k https://img.jgi.doe.gov
  17. l ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/v12.0.0/coda/
  18. m http://calcium.uhnres.utoronto.ca/pi/no_flash.htm
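
The 'gene fusion' approach in the table, including the "component overlap <10 %" criterion listed for [65], can be illustrated with a minimal Python sketch. It is not the implementation used in any of the cited studies. The `Hit` record, the function names, and the choice to measure overlap relative to the shorter aligned region are all assumptions made for illustration; the cited works may define overlap differently.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Hit:
    """One alignment of a component protein onto a candidate composite protein.

    All field names are hypothetical; coordinates are positions on the composite.
    """
    component: str  # identifier of the shorter (component) protein
    start: int      # 1-based start of the aligned region
    end: int        # inclusive end of the aligned region


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared residues divided by the length of the shorter aligned region.

    Assumption: the denominator is the shorter region; other definitions exist.
    """
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return max(shared, 0) / shorter


def candidate_fusion_pairs(hits, max_overlap=0.10):
    """Return component pairs whose aligned regions on the composite barely overlap.

    Such a pair suggests that the composite protein may be a fusion of the two
    components, and hence a predicted functional link between them.
    """
    pairs = set()
    for a, b in combinations(hits, 2):
        if a.component != b.component and overlap_fraction(a, b) < max_overlap:
            pairs.add(tuple(sorted((a.component, b.component))))
    return sorted(pairs)


# Toy example: A and B cover mostly distinct halves of the composite,
# while C overlaps both of them substantially.
example_hits = [Hit("A", 1, 200), Hit("B", 190, 400), Hit("C", 150, 300)]
fusion_pairs = candidate_fusion_pairs(example_hits)
```

In this toy input only the pair (A, B) passes the threshold. A real pipeline would additionally filter hits by alignment score and, as the last table column indicates, decide whether to accept all homologs or orthologs only.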