Fig. 1 | BMC Genomics

Fig. 1

From: VariantMetaCaller: automated fusion of variant calling pipelines for quantitative, precision-based filtering

Earlier approaches, current study design including data sets and evaluations, and the conceptual overview of VariantMetaCaller. Study design: Simulated sequences of various target region sizes, and real sequence data covering the whole exome of NA12878, were aligned to the human genome with BWA and Bowtie 2. Variants were called by GATK HaplotypeCaller, GATK UnifiedGenotyper, FreeBayes and SAMtools. Evaluation: Variant calling pipelines were compared by calculating concordance rates. Precision-recall curves were plotted and the area under the precision-recall curve was calculated for each method. Earlier approaches: Hard filters can be applied to filter variants by specifying annotation cutoffs. VQSR can be applied to recalibrate variant qualities based on gold standard reference data and variant annotations. BAYSIC combines the unfiltered variant calls by late integration. Overview of VariantMetaCaller: VariantMetaCaller (1) combines the unfiltered call sets using SVMs that take variant annotations as features and (2) estimates the probability of each variant being real. The probabilistic output of VQSR and VariantMetaCaller can be used to estimate the FDR at each probability cutoff and to select the filtered variants optimally with respect to the researchers' cost function. AUPRC = area under the precision-recall curve, FDR = false discovery rate, NGS = next-generation sequencing, SVM = support vector machine.
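
The link between per-variant probabilities and an FDR estimate at a given cutoff can be illustrated with a minimal sketch. It assumes the probabilities are well calibrated, so that the expected fraction of false calls among the retained variants is the mean of (1 - p) over them. The function names `expected_fdr` and `select_cutoff` and the example values are hypothetical and are not taken from VariantMetaCaller or VQSR; this is one standard way to compute such an estimate, not necessarily the one the authors use.

```python
def expected_fdr(probs, cutoff):
    """Expected FDR among variants with probability >= cutoff.

    Assumes each value in `probs` is a calibrated probability that the
    corresponding variant is real (an assumption of this sketch).
    """
    kept = [p for p in probs if p >= cutoff]
    if not kept:
        return 0.0  # nothing retained, so no false discoveries
    return sum(1.0 - p for p in kept) / len(kept)


def select_cutoff(probs, max_fdr):
    """Lowest cutoff (largest call set) whose expected FDR is <= max_fdr.

    Returns None if even the most confident variants exceed max_fdr.
    """
    best = None
    # Lowering the cutoff only adds less confident variants, so the
    # expected FDR never decreases and the scan can stop at the first failure.
    for cutoff in sorted(set(probs), reverse=True):
        if expected_fdr(probs, cutoff) <= max_fdr:
            best = cutoff
        else:
            break
    return best


if __name__ == "__main__":
    # Hypothetical probabilities for five candidate variants.
    example = [0.99, 0.95, 0.90, 0.60, 0.20]
    print(expected_fdr(example, 0.90))
    print(select_cutoff(example, max_fdr=0.10))
```

A fixed FDR ceiling is only the simplest possible cost function; a researcher who weights false positives and false negatives differently would replace the `max_fdr` test with their own criterion.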
