Fig. 5
From: Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery

Variant calling and filtering methods perform similarly on GIAB and non-GIAB datasets. (a, c) Scatterplots showing the results of principal component analysis in the space of discordant variant calls on GIAB (a) and non-GIAB (c) data. (b) Distributions of the number of variant calling methods that detected each false negative (top) or false positive (bottom) variant. Note that the majority of FN and FP variants are represented by unique non-calls and unique calls, respectively. (d) Boxplots showing the number of unique calls (top) and unique non-calls (bottom) on GIAB and non-GIAB datasets for the indicated variant calling and filtering methods.

Variant callers and filtering strategies: CL - Clair3, DV - DeepVariant, G1 - GATK HaplotypeCaller with 1D CNN filtering, G2 - GATK HaplotypeCaller with 2D CNN filtering, GH - GATK HaplotypeCaller with recommended hard filters, ST - Strelka2, FB - Freebayes, OS - Octopus with standard filtering, OF - Octopus with random forest filtering.