Table 4 Descriptive statistics of the additional non-GIAB samples used for overfitting analysis

From: Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery

| Sample | Type | Source (SRA ID) | Ethnicity | Mean coverage | Fraction of 10x bases | Median variant count** |
|---|---|---|---|---|---|---|
| NA18870 | WGS | ERX3266761 | African | 123.8 | 0.999 | 25,528 |
| NA18871 | WGS | ERX3266762 | African | 104.5 | 0.999 | 25,277 |
| NA18874 | WGS | ERX3270176 | African | 70.1 | 0.998 | 25,269 |
| RUSZ02 | WES | [18] | Russian | 154.3 | 0.986 | 20,184 |
| RUSZ05 | WES | [18] | Russian | 178.0 | 0.986 | 19,972 |
| RUSZ07 | WES | [18] | Russian | 174.2 | 0.984 | 20,092 |