
Table 2. Negative predictions and accuracy of the ERPIN and POLYADQ programs, evaluated on control sequences that do not contain polyadenylation sites: coding sequences (CDS), introns, and two types of randomized UTR sequences (simple shuffling or first-order Markov simulation).

From: Sequence determinants in human polyadenylation site selection

| Negative set | AWUAAA per 100 kb | Program | TN | FP | FP per 100 kb | Specificity (SP) | Accuracy (CC) |
|---|---|---|---|---|---|---|---|
| CDS | 31.2 | Erpin | 880 | 102 | 3.7 | 84.33 % | 0.483 |
| | | Polyadq | 862 | 120 | 3.8 | 82.01 % | 0.459 |
| Introns | 156.4 | Erpin | 741 | 241 | 38.9 | 69.49 % | 0.320 |
| | | Polyadq | 718 | 264 | 42.0 | 67.45 % | 0.293 |
| UTR shuffled | 109.6 | Erpin | 888 | 94 | 11.0 | 85.38 % | 0.494 |
| | | Polyadq | 826 | 156 | 17.4 | 77.81 % | 0.415 |
| UTR Markov 1st order | 94.49 | Erpin | 772 | 210 | 21.9 | 72.33 % | 0.354 |
| | | Polyadq | 733 | 249 | 23.9 | 68.72 % | 0.309 |

  1. See Methods for information on database construction. Each row shows the number of potential A(A/U)UAAA signals per 100 kb in the dataset, True Negatives (TN), False Positives (FP), False Positives per 100 kb, Specificity (SP) and Accuracy (CC). Calculation of CC uses TP and TN from Table 1.
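
The SP and CC columns can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes SP is defined as TP / (TP + FP), as is common in the gene-prediction literature, and that CC is the Matthews correlation coefficient. The function names are hypothetical. The positive-set counts (TP and FN) are not part of this table, so the values used below are placeholders chosen for illustration and are not taken from Table 1.

```python
import math


def specificity(tp: int, fp: int) -> float:
    """Assumed definition: fraction of predicted sites that are real, TP / (TP + FP)."""
    return tp / (tp + fp)


def correlation_coefficient(tp: int, fn: int, tn: int, fp: int) -> float:
    """Assumed definition: Matthews correlation coefficient over the four counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when any marginal total is zero, to avoid dividing by zero.
    return numerator / denominator if denominator else 0.0


# TN and FP are read from the CDS / Erpin row of this table.
TN, FP = 880, 102
# Placeholder positive-set counts (assumption, NOT taken from Table 1).
TP, FN = 549, 433

sp = specificity(TP, FP)
cc = correlation_coefficient(TP, FN, TN, FP)
print(f"SP = {sp:.2%}, CC = {cc:.3f}")
```

With these placeholder counts the result is consistent with the CDS / Erpin row (84.33 % and 0.483), which suggests, but does not prove, that the assumed definitions match those used for the table.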