Fig. 1 | BMC Genomics

Fig. 1

From: Recurrent miscalling of missense variation from short-read genome sequence data

Recurrent false positive variant calls result from difficulties in aligning short reads to complex mammalian genomes. a The frequencies of observed variant calls from 2314 exomes of inbred mice show an intermediate category of recurrent-yet-intermittent SNVs, between the frequency extremes of fixed strain-specific variation and rare, pedigree-specific induced mutations. b Recurrent false positive variants can be reproduced for a given individual's sequence by randomly sampling short reads, realigning them to a reference genome, and recalling sequence variants. Variants that are only intermittently called accumulate over multiple cycles of this process, and their number increases following an approximately Poisson distribution. Blue dots show the smaller number of recurrent false positive variants obtained by sampling reads from a C57BL/6 mouse and realigning them to the C57BL/6 reference genome. Greater numbers are obtained through simulation with three non-reference mouse strains: FVB (orange), CBA (red), and C3H (grey).
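
The resampling procedure in panel b can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `accumulate_intermittent_calls`, the per-site call probabilities, and the site count are all hypothetical, and a random draw stands in for the real subsample, realign, and recall steps. It only shows how the count of distinct intermittently called sites grows as cycles are repeated.

```python
import random


def accumulate_intermittent_calls(call_probs, n_cycles, seed=0):
    """Return the cumulative count of distinct called sites after each cycle.

    call_probs is a hypothetical list giving, for each candidate site, the
    probability that it is called in one subsample/realign/recall cycle.
    A random draw replaces the real aligner and variant caller.
    """
    rng = random.Random(seed)
    seen = set()          # sites called in at least one cycle so far
    cumulative = []       # running total after each cycle
    for _ in range(n_cycles):
        for site, p in enumerate(call_probs):
            if rng.random() < p:
                seen.add(site)
        cumulative.append(len(seen))
    return cumulative


# Hypothetical input: 200 candidate sites, each called in about 5% of cycles.
counts = accumulate_intermittent_calls([0.05] * 200, n_cycles=20)
print(counts)
```

Raising the hypothetical call probabilities or the number of candidate sites mimics the larger totals the caption reports for non-reference strains, although the real difference arises from alignment behaviour that this sketch does not model.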
