Table 2 Overview of SNP discovery and genotype calling using three different callers

From: Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms

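| | GATK_v.2.5-0, Pop_SK | GATK_v.2.5-0, Pop_WA | GATK_v.2.5-0, Overall | CLC_v.5.0.1, Pop_SK | CLC_v.5.0.1, Pop_WA | CLC_v.5.0.1, Overall | SAMtools_v.0.1.19, Pop_SK | SAMtools_v.0.1.19, Pop_WA | SAMtools_v.0.1.19, Overall |
|---|---|---|---|---|---|---|---|---|---|
| No. of SNPs | 34257 | 40248 | 57396 | 34788 | 55585 | 75364 | 14494 | 14903 | 24103 |
| No. of private SNPs | 17148 | 23139 | 40287 | 19779 | 40576 | 60355 | 9200 | 9609 | 18809 |
| % singletons | 7.68 | 10.83 | 12.18 | 11.53 | 27.47 | 25.59 | 14.63 | 21.66 | 22.19 |
| Median site heterozygosity [a] | 0.267 | 0.250 | / | 0.236 | 0.200 | / | 0.266 | 0.231 | / |
| Median coverage per individual | 93× | 70× | 82× | 66× | 29× | 48× | 66× | 19× | 27× |

| | GATK-CLC intersect, Pop_SK | GATK-CLC intersect, Pop_WA | GATK-CLC intersect, Overall | SAMtools-GATK intersect, Pop_SK | SAMtools-GATK intersect, Pop_WA | SAMtools-GATK intersect, Overall | SAMtools-CLC intersect, Pop_SK | SAMtools-CLC intersect, Pop_WA | SAMtools-CLC intersect, Overall |
|---|---|---|---|---|---|---|---|---|---|
| No. of SNPs | 21475 | 24936 | 37085 | 11325 | 12350 | 18933 | 9861 | 11310 | 17163 |
| No. of private SNPs | 12149 | 15610 | 27759 | 6583 | 7608 | 14191 | 5853 | 7302 | 13155 |
| % singletons | 9.91 | 17.98 | 12.82 | 9.99 | 20.53 | 19.37 | 10.54 | 23.08 | 21.60 |
| Median site heterozygosity [a] | 0.250 | 0.222 | / | 0.286 | 0.231 | / | 0.266 | 0.222 | / |
| Median coverage per individual [b] | 107× (65) | 81× (27) | 96× (37) | 55× (98) | 18× (98) | 20× (99) | 69× (76) | 19× (35) | 26× (46) |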
  1. We required all SNPs to have a genotype call passing all stringent quality filters in a minimum of eight individuals per population (population-based filtering). The intersect datasets contain exclusively concordant genotype calls between the designated SNP callers. Pop_SK: South Kinabatangan population, Pop_WA: West Alas population.
  2. [a] Based on the sites being polymorphic within the population.
  3. [b] Coverage values of intersect datasets are taken from the first named SNP caller. The coverage values of the second named caller are given in brackets.
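
The SNP counts in the table can be cross-checked with simple arithmetic. The sketch below assumes that "private" means polymorphic in only one of the two populations, which the table itself does not define. Under that reading, each dataset's overall SNP count should equal the SNPs shared by both populations plus all private SNPs. The names `ROWS`, `shared_snps`, and `SHARED` are illustrative only and do not come from the source.

```python
# Counts copied from Table 2, one tuple per dataset:
# (dataset, SNPs Pop_SK, SNPs Pop_WA, SNPs overall,
#  private Pop_SK, private Pop_WA, private overall)
ROWS = [
    ("GATK", 34257, 40248, 57396, 17148, 23139, 40287),
    ("CLC", 34788, 55585, 75364, 19779, 40576, 60355),
    ("SAMtools", 14494, 14903, 24103, 9200, 9609, 18809),
    ("GATK-CLC", 21475, 24936, 37085, 12149, 15610, 27759),
    ("SAMtools-GATK", 11325, 12350, 18933, 6583, 7608, 14191),
    ("SAMtools-CLC", 9861, 11310, 17163, 5853, 7302, 13155),
]


def shared_snps(row):
    """Return the SNP count shared by both populations.

    Assumption (not stated in the table): a private SNP is polymorphic
    in exactly one population. Raises ValueError if the row's counts
    do not add up under that assumption.
    """
    name, sk, wa, overall, priv_sk, priv_wa, priv_all = row
    shared = sk - priv_sk  # SNPs in Pop_SK that are not private to it
    if shared != wa - priv_wa:
        raise ValueError(f"{name}: shared count differs between populations")
    if priv_all != priv_sk + priv_wa:
        raise ValueError(f"{name}: overall private count is not the sum")
    if overall != shared + priv_all:
        raise ValueError(f"{name}: overall count is not shared + private")
    return shared


# Shared SNP count per dataset; building this dict runs every check.
SHARED = {row[0]: shared_snps(row) for row in ROWS}
```

All six datasets pass these checks, so the shared counts in `SHARED` follow directly from the published numbers rather than from any additional data.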