From: Systematic identification and analysis of frequent gene fusion events in metabolic pathways
ID | Criteria | Biological meaning |
---|---|---|
1 | Protein length must exceed 600 amino acid residues | Fusion proteins should be longer than single-domain proteins |
2 | All non-overlapping CDDs together must align to at least 40% of the gene length | Fused domains should cover the full length of the fused gene |
3 | A minimum alignment length of 50 for all non-overlapping CDDs | Fused domains should represent entire genes and should not be overly short |
4 | Gap between fused domains must be at least 60 residues and 10% of gene length from end of gene | Point of fusion should be fairly centrally located in fused gene |
5 | At least two distinct CDD sets represented in the gene | Fused domains should not belong to the same CDD |
6 | Fewer than half of the CDD alignments for the gene should cross the gap between fused domains | A fused gene should be characterized more as a fusion of multiple domains than as a match to a single domain |
7 | All non-overlapping CDDs must co-occur with fewer than 1500 different CDD sets | Fused domains should not be overly promiscuous |
8 | Fewer than 1000 matches among the non-overlapping CDDs | Fused domains should be different from one another |
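
As an illustration only, criteria 1, 2, 3 and 5 above could be checked with a filter along the following lines. This is a minimal sketch, not the authors' implementation: the `DomainHit` record, the `passes_basic_criteria` function, the 1-based inclusive coordinates, and the assumption that the hits passed in are already non-overlapping are all hypothetical. Criteria 4, 6, 7 and 8 depend on gap positions and database-wide co-occurrence counts and are not modelled here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DomainHit:
    """Hypothetical record for one CDD alignment on a protein."""
    cdd_id: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def passes_basic_criteria(protein_length: int, hits: Sequence[DomainHit]) -> bool:
    """Apply criteria 1, 2, 3 and 5 to a set of non-overlapping CDD hits."""
    # Criterion 1: protein length must exceed 600 residues.
    if protein_length <= 600:
        return False
    # Criterion 2: hits together must align to at least 40% of the length.
    if sum(h.length for h in hits) < 0.40 * protein_length:
        return False
    # Criterion 3: every hit must have an alignment length of at least 50.
    if any(h.length < 50 for h in hits):
        return False
    # Criterion 5: at least two distinct CDDs must be represented.
    if len({h.cdd_id for h in hits}) < 2:
        return False
    return True


if __name__ == "__main__":
    candidate = [DomainHit("cddA", 1, 300), DomainHit("cddB", 400, 700)]
    print(passes_basic_criteria(800, candidate))
```

The thresholds are hard-coded here to mirror the table; in practice they would more likely be exposed as parameters so that their effect on the candidate set can be explored.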