Fig. 2 | BMC Genomics

Fig. 2

From: Many purported pseudogenes in bacterial genomes are bona fide genes

Potential causes of variability in pseudogene counts. (A) Reported (blue) and automatic (green) metadata may affect the number of putative pseudogenes (red) in a genome assembly. Arrows represent causal connections predicted by Tetrad between metadata variables for available RefSeq genomes. Sequencing platform and read coverage were predicted to have a potential causal influence over observed pseudogene counts. (B) Cumulative distributions of average pseudogene density per assembly identified by PGAP in publicly available E. coli assemblies. Some reported assemblers and sequencing platforms were associated with unusual numbers of frameshifts (left) and internal stops (right). However, this observational dataset cannot do more than suggest hypotheses as to the causes of variability in pseudogene counts. Inset legends are ordered from most to least frequently observed assembler (top row; n = 7480 total) or sequencing platform (bottom row; n = 10,170 total).
