
Table 1 Sensitivity of the algorithms on the analysis of the influenza data

From: Identifying genetic determinants of complex phenotypes from whole genome sequence data

| Site | AB | RRF | Phenotype | References |
|---|---|---|---|---|
| PB2 9 | ✓ | ✓ | Infectivity | [26] |
| PB2 105 | ✓ | ✓ | Pathogenicity/Infectivity | [27] |
| PB2 339 | ✓ | ✓ | Infectivity | [28, 29] |
| PB2 391 | ✓ |  | Transmissibility | [30] |
| PB2 627 | ✓ | ✓ | Infectivity | [31] |
| PB2 667 | ✓ |  | Infectivity | [32] |
| PB1 215 | ✓ | ✓ | Pathogenicity | [33] |
| PB1 375 |  | ✓ | Pathogenicity | [34] |
| PB1 757 | ✓ |  | Infectivity | [35] |
| HA 163 |  | ✓ | Pathogenicity/Infectivity | [36] |
| HA 212 |  | ✓ | Pathogenicity/Infectivity | [37] |
| HA 246 | ✓ | ✓ | Transmissibility | [38] |
| HA 536 |  | ✓ | Infectivity | [39] |
| NP 400 |  |  | Pathogenicity | [40] |
| NA 49 |  | ✓ | Transmissibility | [41] |
| NA 75 |  |  | Transmissibility | [42] |
| M2 31 |  | ✓ | Pathogenicity/Infectivity | [41] |
| NS1 127 |  |  | Pathogenicity | [43] |
| NS1 195 |  |  | Transmissibility/Infectivity | [44] |
| NS1 212 |  | ✓ | Pathogenicity/Infectivity | [45] |

  1. This table lists the genes and amino acid positions known to be involved in the three phenotypes studied here, and which of these were rediscovered by our algorithms. For AB, chunk sizes of 75, 125, and 175 were used to calculate the importance value of each site for adaptive boosting. An importance threshold of 1 was used to determine whether a site was a potential genetic determinant. For RRF, chunk sizes of 80, 125, and 175 were used with a threshold of the 90th percentile and a 60% consensus. Data on experimental validations are from the Influenza Research Database [24]. Genes are ordered by segment size. See Figs. 2 and 3 for the specificity of these algorithms.
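
The table reports sensitivity only as check marks, so a short tally may help when comparing the two algorithms. The sketch below transcribes the AB and RRF columns and counts how many of the listed sites each algorithm rediscovered. It assumes that sensitivity means rediscovered sites divided by all known sites in the table, which the footnote does not state explicitly; the names `TABLE_1` and `sensitivity` are hypothetical and not taken from the paper.

```python
# Rows transcribed from Table 1: (site, rediscovered by AB, rediscovered by RRF).
TABLE_1 = [
    ("PB2 9", True, True),
    ("PB2 105", True, True),
    ("PB2 339", True, True),
    ("PB2 391", True, False),
    ("PB2 627", True, True),
    ("PB2 667", True, False),
    ("PB1 215", True, True),
    ("PB1 375", False, True),
    ("PB1 757", True, False),
    ("HA 163", False, True),
    ("HA 212", False, True),
    ("HA 246", True, True),
    ("HA 536", False, True),
    ("NP 400", False, False),
    ("NA 49", False, True),
    ("NA 75", False, False),
    ("M2 31", False, True),
    ("NS1 127", False, False),
    ("NS1 195", False, False),
    ("NS1 212", False, True),
]


def sensitivity(rows, column):
    """Return (hits, total, fraction) for one algorithm column.

    Assumption: sensitivity = rediscovered known sites / all known sites.
    column 1 selects AB, column 2 selects RRF.
    """
    hits = sum(1 for row in rows if row[column])
    return hits, len(rows), hits / len(rows)


ab_hits, total, ab_frac = sensitivity(TABLE_1, 1)
rrf_hits, _, rrf_frac = sensitivity(TABLE_1, 2)

# Sites found by both algorithms, and by at least one of them.
both = sum(1 for _, ab, rrf in TABLE_1 if ab and rrf)
either = sum(1 for _, ab, rrf in TABLE_1 if ab or rrf)

print(f"AB:     {ab_hits}/{total} = {ab_frac:.2f}")
print(f"RRF:    {rrf_hits}/{total} = {rrf_frac:.2f}")
print(f"both:   {both}/{total}")
print(f"either: {either}/{total}")
```

Under this reading, AB rediscovers 9 of the 20 listed sites and RRF 13 of 20; 6 sites are found by both and 16 by at least one algorithm, leaving NP 400, NA 75, NS1 127, and NS1 195 undetected by either.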