From: Large gene overlaps in prokaryotic genomes: result of functional constraints or mispredictions?
Reference | Objective | Excluded genes | Accepted gene set | Annotation errors suggested |
---|---|---|---|---|
 | Comparison study of overlapping genes in two Mycoplasma genomes. Study of overlapping genes in bacterial genomes | Homologous genes whose start codons were assigned differently and genes coding for hypothetical or putative proteins | Authentic ORFs, i.e. genes not annotated as hypothetical or putative proteins and conserved in the COG database | Misprediction of the start codons |
Rogozin et al., 2002[12] | Study of non-coding DNA in prokaryotic genomes | Genes coding for hypothetical proteins and genes overlapping by more than 90 bp | Gene pairs not annotated as hypothetical or putative proteins and conserved in the COG database | Misprediction of start codons, falsely predicted genes and missed genes, frameshifts |
Rogozin et al., 2002[14] | Analysis of the purifying and directional selection in overlapping prokaryotic genes | Genes not conserved in the COG database; co-directional and divergent overlapping pairs; overlapping gene pairs not conserved in two or more species | Convergent overlapping genes conserved both in the COG database and in two or more genomes | Misprediction of start codons (affecting co-directional and divergent overlaps) and loss of termination codons (affecting co-directional and convergent overlaps) |
Johnson and Chisholm, 2004[6] | Study of the properties of the overlapping genes in microbial genomes | Genes coding for hypothetical proteins | Gene pairs not annotated as hypothetical or putative proteins | Misidentification of coding sequences |
Sakharkar et al., 2005[13] | Comparison study of overlapping genes in two Rickettsia genomes | Genes coding for hypothetical proteins | Gene pairs not annotated as hypothetical or unknown proteins | Incorrectly annotated ORFs |
Cock and Whitworth, 2007[15] | Study of the relative reading frame bias in prokaryotic two-component system genes, which tend to overlap | Genes with ambiguous locations | Two-component system gene pairs with unambiguous locations in the chromosome | Invalid bacterial start codons or premature stop codons |
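
The table distinguishes co-directional, convergent and divergent overlaps, and several studies exclude genes annotated as hypothetical or putative. The following is a minimal Python sketch of how such a classification and filter could look. It is not the procedure used by any of the cited studies: the `Gene` record, the function names and the keyword list are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int    # leftmost coordinate on the chromosome
    end: int      # rightmost coordinate on the chromosome
    strand: str   # "+" or "-"
    product: str  # free-text product annotation


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of shared base pairs between two genes (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def orientation(a: Gene, b: Gene) -> str:
    """Classify a gene pair by the strands of the left and right gene."""
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        return "co-directional"
    # Left gene on "+" and right gene on "-": the 3' ends face each other.
    return "convergent" if left.strand == "+" else "divergent"


def is_uncharacterized(gene: Gene) -> bool:
    """Assumed keyword filter for hypothetical or putative annotations."""
    text = gene.product.lower()
    return any(word in text for word in ("hypothetical", "putative"))


a = Gene("geneA", 100, 400, "+", "DNA polymerase subunit")
b = Gene("geneB", 390, 700, "-", "hypothetical protein")
print(overlap_length(a, b), orientation(a, b), is_uncharacterized(b))
```

A pair such as `geneA`/`geneB` would be dropped by a filter that excludes hypothetical proteins, whatever its overlap length or orientation.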