Skip to main content
Fig. 1 | BMC Genomics

Fig. 1

From: Muver, a computational framework for accurately calling accumulated mutations

Fig. 1

Overview of muver operation. White boxes represent input/output files, purple boxes represent tasks implemented in the muver codebase, while green boxes represent tasks performed by external tools. a Given a list of FASTQ files, muver initiates alignment, quality filtering, and deduplication of reads. b For each sample, Gaussian distributions of per-nucleotide coverage and bias in coverage per-strand are fit to observed values to aid in filtering of sites with abnormal coverage patterns, and indel error rates are estimated as a function of repeat length. Optionally, allelic fraction distributions are generated to facilitate determination of sample ploidy. Read counts supporting each observed allele, per-strand, are determined for potential reference variants using GATK HaplotypeCaller. c Outgrowth (descendant) sample read counts are compared to t0 (ancestor) using three separate statistical tests; genotypes are called for each sample by comparing observed read count distributions to those expected for all possible combinations of observed alleles at the provided ploidy. Regions with abnormally high or low coverage, to be excluded from mutation calling, are identified based on the previous fit. d For each outgrowth sample, mutations are inferred at sites where read counts differ sufficiently from t0 by enumerating all possible series of events that explain the called t0 and outgrowth genotypes. The most parsimonious of these are accepted as the most likely, and events occurring in all cases are reported to the output

Back to article page