Table 3 Efficiency and SNP detection rates of non-barcoded and pooled samples

From: Accurate variant detection across non-amplified and whole genome amplified DNA using targeted next generation sequencing

| Minimum read count for SNP call | Library ID | Positive control SNPs | Positive control SNPs in sample | Total SNPs in sample | Sensitivity | False discovery rate |
|---|---|---|---|---|---|---|
| 5 | 768_1L | 244 | 226 | 376 | 92.6% | 39.9% |
| 5 | 768_2L | 244 | 230 | 371 | 94.3% | 38.0% |
| 10 | 768_1L | 244 | 212 | 277 | 86.9% | 23.5% |
| 10 | 768_2L | 244 | 212 | 267 | 86.9% | 20.6% |
| 20 | 768_1L | 244 | 193 | 214 | 79.1% | 9.8% |
| 20 | 768_2L | 244 | 198 | 221 | 81.1% | 10.4% |

  1. Minimum read count for SNP call: The minimum number of non-reference allele counts required for a SNP to be considered detected.
  2. Positive control SNPs: Positive control SNPs generated from the non-pooled, non-barcoded data (samples 759–764). Since the HapMap genotyping data were incomplete, even for known SNPs, we attempted to create a positive control set of SNPs within the targeted regions. If a SNP was detected within samples 759–764, a combined genotype was determined for that SNP position. For example, if position X was determined to have a “CG” genotype in sample 759 and the reference genotype “CC” in samples 760–764, the predicted allele frequency would be 8.3% (1 in 12). In the non-pooled samples, a SNP with a non-reference allele frequency of 10–90% was considered a heterozygote. A homozygous SNP in non-pooled samples was defined as having >90% non-reference allele frequency. The number in this column represents the total number of SNPs that have a non-reference allele within a given pooled sample. Note that these positive control SNPs include HapMap samples with rs IDs, non-HapMap samples with rs IDs, and potentially novel SNPs.
  3. Positive control SNPs found (column “Positive control SNPs in sample”): The number of positive control SNPs that were detected in a given pool with a given set of parameters.
  4. Total SNPs detected (column “Total SNPs in sample”): The total number of SNPs found in a given pool with a given set of parameters. This number comprises the positive control SNPs found plus other SNPs. Most of these other SNPs are assumed to be false positives, since the total decreases significantly as the stringency of the SNP detection parameters increases. However, some novel SNPs could exist in this set.
  5. Sensitivity: In this case, simply the percentage of positive control SNPs found in a given pool with a given set of parameters. Sensitivity decreases as SNP detection stringency increases.
  6. False discovery rate: Defined as (total SNPs detected − positive control SNPs found) / total SNPs detected × 100.
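
The definitions in the footnotes can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not the authors' analysis code; the function names are hypothetical and chosen only for illustration. It recomputes sensitivity and false discovery rate from the counts in the table, and the predicted pooled allele frequency from the "CG"/"CC" example in footnote 2.

```python
def sensitivity_pct(controls_found, controls_total):
    """Percentage of positive control SNPs that were detected in the pool."""
    return 100.0 * controls_found / controls_total


def false_discovery_rate_pct(total_detected, controls_found):
    """(total SNPs detected - positive control SNPs found) / total SNPs detected * 100."""
    return 100.0 * (total_detected - controls_found) / total_detected


def pooled_allele_frequency_pct(genotypes, ref_allele):
    """Predicted non-reference allele frequency when diploid genotypes are pooled.

    genotypes: one two-letter genotype string per sample, e.g. "CG".
    """
    alleles = "".join(genotypes)
    non_ref = sum(1 for allele in alleles if allele != ref_allele)
    return 100.0 * non_ref / len(alleles)


# Counts taken from the table:
# (minimum read count, library ID, positive controls, controls found, total SNPs)
rows = [
    (5, "768_1L", 244, 226, 376),
    (5, "768_2L", 244, 230, 371),
    (10, "768_1L", 244, 212, 277),
    (10, "768_2L", 244, 212, 267),
    (20, "768_1L", 244, 193, 214),
    (20, "768_2L", 244, 198, 221),
]

for min_reads, library, controls, found, total in rows:
    sens = sensitivity_pct(found, controls)
    fdr = false_discovery_rate_pct(total, found)
    print(f"{min_reads:>2} {library} sensitivity={sens:.1f}% FDR={fdr:.1f}%")

# Footnote 2 example: one heterozygous "CG" sample among six, reference allele "C".
example = ["CG", "CC", "CC", "CC", "CC", "CC"]
print(f"predicted allele frequency: {pooled_allele_frequency_pct(example, 'C'):.1f}%")
```

Rounded to one decimal place, the recomputed values agree with the Sensitivity and False discovery rate columns, and the pooled example gives the 8.3% (1 in 12) stated in footnote 2.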