Table 1 Prediction accuracy of PHLAT and other methods in benchmarking datasets

From: Inference of high resolution HLA types using genome-wide RNA or DNA sequencing reads

| HLA resolution | Dataset | Read length | PHLAT accuracy | HLAminer accuracy | HLAminer apparent accuracy | HLAforest accuracy | seq2HLA accuracy |
|---|---|---|---|---|---|---|---|
| 4-digit | HapMap RNAseq | 2×37 bp | 92.3% | 39.8% | 43.0% | 84.2% | ~32% |
| 4-digit | 1000 Genome WXS | 2×100 bp | 95.0% | 55.0% | 71.0% | 77.0% | - |
| 4-digit | HapMap WXS | 2×101 bp | 93.3% | 53.3% | 84.4% | 45.6% | - |
| 4-digit | Amplicon seq | 2×250 bp | 100% | 50.0% | 55.0% | - | - |
| 2-digit | HapMap RNAseq | 2×37 bp | 99.1% | 71.1% | 71.6% | 97.3% | 97.2% |
| 2-digit | 1000 Genome WXS | 2×100 bp | 97.0% | 83.0% | 85.0% | 95.0% | 90.0% |
| 2-digit | HapMap WXS | 2×101 bp | 95.6% | 78.9% | 88.9% | 81.1% | 93.3% |
| 2-digit | Amplicon seq | 2×250 bp | 100% | 95.0% | 95.0% | - | - |

  1. The accuracies and apparent accuracies are calculated as described in Methods. The accuracies of the existing methods are taken from their original publications where those publications examined the datasets; otherwise they are derived by applying the methods locally (Additional file 1: Table S1, Additional file 4: Table S4, Additional file 5: Table S5 and Additional file 6: Table S6). The four-digit accuracy of seq2HLA in the HapMap RNAseq dataset (~32%) is taken from the main text of its publication [28]. For all other datasets, seq2HLA is applied only at two-digit resolution. The accuracy of seq2HLA predictions is calculated without any p-value threshold, which produces fewer false negatives and hence higher accuracies than imposing the p-value cutoff of 0.1 described earlier [28].
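
The effect of the p-value threshold mentioned in the footnote can be illustrated with a minimal sketch. It is not the calculation used in the paper, whose definition is given in Methods. It assumes that accuracy is the fraction of known alleles called correctly and that a prediction discarded by the cutoff counts as a missed allele. The `Call` type, the `accuracy` function and the toy data are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Call(NamedTuple):
    truth: str       # known allele group (toy value)
    predicted: str   # allele group reported by the tool (toy value)
    p_value: float   # p-value reported with the prediction (toy value)


def accuracy(calls: Sequence[Call], p_cutoff: Optional[float] = None) -> float:
    """Fraction of alleles called correctly.

    Assumption: a prediction whose p-value exceeds the cutoff is discarded
    and therefore counted as a missed allele (false negative).
    """
    correct = 0
    for call in calls:
        kept = p_cutoff is None or call.p_value <= p_cutoff
        if kept and call.predicted == call.truth:
            correct += 1
    return correct / len(calls)


# Toy data, invented for illustration only.
calls = [
    Call("A*02", "A*02", 0.001),
    Call("A*24", "A*24", 0.30),  # correct, but would fail a 0.1 cutoff
    Call("B*07", "B*07", 0.02),
    Call("B*44", "B*15", 0.50),  # incorrect call
]

no_cutoff = accuracy(calls)                   # 3 of 4 alleles correct
with_cutoff = accuracy(calls, p_cutoff=0.1)   # 2 of 4 alleles correct
print(no_cutoff, with_cutoff)
```

Under these assumptions a cutoff can only remove predictions, so the accuracy with a cutoff is never higher than the accuracy without one.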