Fig. 5 | BMC Genomics

Fig. 5

From: Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery


Variant calling and filtering methods perform similarly on GIAB and non-GIAB datasets. a, c Scatterplots showing the results of principal component analysis in the space of discordant variant calls on GIAB (a) and non-GIAB (c) data. b Distributions of the number of variant calling methods that detected each false negative (FN, top) or false positive (FP, bottom) variant. Note that the majority of FN and FP variants are represented by unique non-calls and unique calls, respectively. d Boxplots showing the number of unique calls (top) and unique non-calls (bottom) on GIAB and non-GIAB datasets for the indicated variant calling and filtering methods. Variant callers and filtering strategies: CL - Clair3, DV - DeepVariant, G1 - GATK HaplotypeCaller with 1D CNN filtering, G2 - GATK HaplotypeCaller with 2D CNN filtering, GH - GATK HaplotypeCaller with recommended hard filters, ST - Strelka2, FB - Freebayes, OS - Octopus with standard filtering, OF - Octopus with random forest filtering.
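
The unique calls and unique non-calls counted in panels b and d can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each variant is a hypothetical `(chrom, pos, ref, alt)` tuple and that the discordant variants of each method have already been collected into sets keyed by method abbreviation. The function name `unique_to_method` and the toy data are made up for illustration.

```python
from collections import Counter

def unique_to_method(variants_by_method):
    """Return, for each method, the variants that appear in its set only.

    Given per-method false positive sets, the result holds unique calls.
    Given per-method false negative sets, the result holds unique non-calls.
    """
    # Count how many methods share each discordant variant.
    support = Counter(
        variant
        for variants in variants_by_method.values()
        for variant in variants
    )
    return {
        method: {v for v in variants if support[v] == 1}
        for method, variants in variants_by_method.items()
    }

# Toy false positive sets for two methods (hypothetical data).
false_positives = {
    "DV": {("chr1", 100, "A", "G")},
    "ST": {("chr1", 100, "A", "G"), ("chr2", 5, "C", "T")},
}
unique_calls = unique_to_method(false_positives)
print({method: len(calls) for method, calls in unique_calls.items()})
```

Here the variant on chr1 is reported by both methods, so only the chr2 variant counts as a unique call, and it is attributed to ST.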
