Fig. 1 | BMC Genomics

Fig. 1

From: Many purported pseudogenes in bacterial genomes are bona fide genes

Pseudogene discrepancies do not converge to zero near 100% ANI. The percentages of frameshifts (top) and internal stops (bottom) that form incongruent (pseudogene/gene) pairs are shown for 10,362 genomes, each compared with its nearest neighbor in RefSeq. Up to 100 genomes were randomly selected from each genus in RefSeq, and coding sequence pairs were identified with Clusterize at ≥ 90% similarity. Each point represents a genome paired with its nearest genome by average nucleotide identity (ANI), with point size scaled to the number of matched pseudogenes. Spline fits (curves) show that disagreement in whether a coding sequence is a gene or pseudogene does not converge to zero as two genomes approach identity. Inset legends show the scaling of point sizes based on the total number of congruent or incongruent pseudogene pairs shared by both assemblies.
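
The quantity plotted on the vertical axis can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `percent_incongruent`, the status labels, and the choice of denominator (congruent plus incongruent pseudogene pairs, as suggested by the inset legend description) are assumptions made for illustration. Each matched coding sequence pair is taken as a tuple of annotation statuses, one per genome.

```python
from collections import Counter


def percent_incongruent(pairs):
    """Return (percent incongruent, number of pseudogene pairs).

    `pairs` is an iterable of (status_a, status_b) tuples, where each
    status is "gene" or "pseudogene" (hypothetical labels). Pairs in
    which neither member is a pseudogene are ignored. The denominator
    is an assumption: congruent + incongruent pseudogene pairs.
    """
    counts = Counter()
    for status_a, status_b in pairs:
        if status_a == "pseudogene" and status_b == "pseudogene":
            counts["congruent"] += 1      # both genomes call it a pseudogene
        elif "pseudogene" in (status_a, status_b):
            counts["incongruent"] += 1    # pseudogene in one, gene in the other
    total = counts["congruent"] + counts["incongruent"]
    if total == 0:
        return None, 0                    # no pseudogene pairs to score
    return 100.0 * counts["incongruent"] / total, total


# Toy input: one congruent pair, two incongruent pairs, one gene/gene pair.
example_pairs = [
    ("pseudogene", "pseudogene"),
    ("pseudogene", "gene"),
    ("gene", "gene"),
    ("gene", "pseudogene"),
]
pct, n_pairs = percent_incongruent(example_pairs)
```

In a plot like this one, `pct` would give a point's height and `n_pairs` its size, computed separately for frameshifts and internal stops.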
