Fig. 2 | BMC Genomics

Fig. 2

From: Mid-pass whole genome sequencing enables biomedical genetic studies of diverse populations

Performance of mid-pass whole genome sequencing across self-reported ethnicities and compared to array genotyping. A) Principal component analysis of imputed genotype data from 1410 mid-pass and 100 high-pass sequenced Polynesian individuals’ genomes. Data points are colored by self-reported ethnicity (EURO, European; MACI, Cook Islands Māori; MANZ, Aotearoa New Zealand Māori; NIUE, Niuean; OTHR, other; PNMI, Mixed Ethnicity Polynesian; PUKA, Pukapukan; SAMO, Samoan; TONG, Tongan; listed in alphabetical order), with symbols corresponding to the broader regional division of Polynesia (East, West, or NA, not applicable). B) Performance measured using recall, precision, and non-reference concordance rate (NCR) for mid-pass derived imputed genotype calls across self-reported ethnicities. Metrics were calculated for the genomes of 100 individuals sequenced as part of this study at both high- and mid-pass, using the high-pass genotype calls as a truth set. C) Performance as a function of cohort size for individuals with self-reported Aotearoa New Zealand Māori ethnicity. Individuals were selected such that the smaller cohorts have less European ancestry admixture (Fig. S8). D) Performance calculated from imputed genotypes for 84 individuals binned by sequencing coverage, with corresponding array data for comparison, using previously available 30x whole-genome sequencing genotype calls as a truth set. For boxplots, bottom whisker: Q1 − 1.5*interquartile range (IQR); top whisker: Q3 + 1.5*IQR; box: IQR; center: median. Outliers are not plotted for ease of viewing.
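
The boxplot convention stated in the caption can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `boxplot_summary` and the choice of the "inclusive" quartile method are assumptions made for illustration, and the whisker positions are taken literally as Q1 − 1.5*IQR and Q3 + 1.5*IQR, as the caption words them.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the boxplot statistics described in the caption.

    Assumption: quartiles use the 'inclusive' interpolation method;
    the caption does not say which quantile definition was used.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    bottom_whisker = q1 - 1.5 * iqr  # caption: Q1 - 1.5*IQR
    top_whisker = q3 + 1.5 * iqr     # caption: Q3 + 1.5*IQR
    # Points beyond the whiskers are the outliers that the figure omits.
    hidden = [v for v in values if v < bottom_whisker or v > top_whisker]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "bottom_whisker": bottom_whisker,
        "top_whisker": top_whisker,
        "hidden_outliers": hidden,
    }
```

Applied to a vector of per-individual metric values (for example, recall within one coverage bin), the returned dictionary gives the box edges, the center line, the whisker positions, and the points that would be left off the plot.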
