Figure 1 | BMC Genomics

Figure 1

From: Gene fusions and gene duplications: relevance to genomic annotation and functional analysis

Identification and sequence similarity of multimodular E. coli proteins. (a) An E. coli protein (gi1787250) aligns with two smaller proteins from C. acetobutylicum, histidinol phosphatase (gi15026114) and imidazoleglycerol-phosphate dehydratase (gi15023840). The E. coli protein is a fused, or multimodular, protein that encodes the two functions in separate parts of the protein, as indicated by the two non-overlapping alignment regions. Based on the alignment regions, the E. coli protein is divided into two components, termed modules. The modules are identified with the extensions "_1" or "_2" to indicate their location in the gene product as N-terminal or C-terminal, respectively. (b) Sequence similarity between modules of the multimodular proteins is shown. Differing module patterns in the cartoon indicate that no similarity was detected between the joined modules. Similarity was measured with Darwin and means that the proteins align at a distance of ≤ 200 PAM units over at least 83 amino acid residues or >45% of the length of the proteins. This level of similarity also indicates whether the modules belong to the same paralogous group.
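
The module naming in (a) and the similarity threshold in (b) can be illustrated with a minimal Python sketch. It is not the authors' code and does not call Darwin. The names `split_modules`, `Alignment`, and `is_similar` are hypothetical, the residue coordinates in the usage example are invented for illustration, and reading ">45% of the length of the proteins" as a fraction of the longer protein is an assumption.

```python
from dataclasses import dataclass


def split_modules(protein_id, regions):
    """Label non-overlapping alignment regions as modules _1, _2, ...

    `regions` is a list of (start, end) residue coordinates, 1-based and
    inclusive. Modules are numbered from the N-terminus to the C-terminus.
    """
    ordered = sorted(regions)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("alignment regions overlap; cannot split into modules")
    return {f"{protein_id}_{i}": region for i, region in enumerate(ordered, start=1)}


@dataclass
class Alignment:
    pam_distance: float   # PAM distance estimated by the alignment program
    aligned_length: int   # number of aligned residues
    length_a: int         # full length of the first protein or module
    length_b: int         # full length of the second protein or module


def is_similar(aln, max_pam=200.0, min_residues=83, min_fraction=0.45):
    """Apply the caption's threshold: <= 200 PAM over >= 83 residues or > 45% of length.

    Assumption: the 45% fraction is taken relative to the longer sequence.
    """
    if aln.pam_distance > max_pam:
        return False
    if aln.aligned_length >= min_residues:
        return True
    longest = max(aln.length_a, aln.length_b)
    return aln.aligned_length / longest > min_fraction


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the figure.
    print(split_modules("gi1787250", [(200, 350), (1, 160)]))
    print(is_similar(Alignment(150.0, 90, 300, 310)))
```

Under these assumptions, short modules that cannot reach 83 aligned residues can still qualify through the length-fraction test, provided the PAM distance stays within the limit.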
