
Table 2. List of all the programs run in ReVac and their predicted features, with the scoring scheme for each program's output. Additional scoring descriptions based on outputs from multiple programs are listed at the bottom.

From: ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates

Module (Reference)

Gene property

Evidence

Output

Scoring weight (points)

Example Protein (M. catarrhalis NAO366_1291)

Example Feature

Example Weight

Example Cumulative Score

PSORTb* [13]

Surface exposure^

Sub-cellular localization

Surface localization prediction

+ 1 if surface exposed

9.52|OuterMembrane

Positive surface exposure

1

1

−1 if cytoplasmic

    

LipoP [14]

Surface exposure^

Lipoprotein motif

Presence or absence of a motif

1 or 0

SpI|18.809

Positive for lipoprotein motif

1

2

TMHMM [15]

Surface exposure^

Transmembrane spans

Number of helices

If surface exposed < 2: + 0.5

1

Presence of 1 TMH

0.5

2.5

2: 0

    

3: −0.2

    

≥4: −2

    

If cytoplasmic

    

−2

    

SignalP [16]

Surface exposure^

Signal peptide

Signal peptide

+ 1 for presence

MNKTSTQLGLLAVSVSLIMASLPAHA

Signal peptide present

1

3.5

SPAAN [17]

Surface exposure^

Adhesin protein

Adhesin protein score

+ 0.5 if above cutoff score (default 0.75)

0.907057

Predicted Adhesin

0.5

4

Surface HMMs [18]ᵃ

Surface exposure

Function

HMM for motif or function

HMM title and score

0.5

None

No HMM alignment

0

4

Antigenic [19]

Antigenicity

Antigenic epitopes

Peptides, scores, protein coverage

0.5

QLGLLAVSVSLIMASLPAHAVYLDR|1.193|10(169)|41.73

Predicted antigenic region.

0.5

4.9173

+ 0–1 proportional to coverage

 

41.73% of the protein is antigenic

0.4173

 

Bcell Pred [20]

Antigenicity

B cell epitopes, 6 prediction methods combined

Number of peptides, protein coverage

+  0–1 proportional to coverage

6(59)|14|14.57

Predicted B-cell Epitopes

0.1457

5.09809

+  0–1 proportional to total number of peptides of a given length per protein

 

14.57% predicted in 14 peptides of 7AA

14/(405 − 7 + 1) = 0.03509

 

MHC class I [20]

Antigenicity

MHC-I epitopes

Number of peptides, protein coverage

+  0–1 proportional to coverage if 80–90%

6(378)|124|73|93.33

Predicted MHC binding

 

6.41039

+  0–1 proportional to total number of peptides of a given length per protein

 

93.33% predicted in 124 peptides of 9AA

124/(405 − 9 + 1) = 0.3123

 

+ 1 if coverage is ≥ 90%

  

1

 

NetCTLpan [20]

Antigenicity

MHC-I epitopes

Number of peptides, protein coverage

+  0–1 proportional to coverage if 80–90%

12(334)|70|12|82.47

Predicted MHC binding

0.8247

7.41139

+  0–1 proportional to total number of peptides of a given length per protein

 

82.47% predicted in 70 peptides

70/(405 − 9 + 1) = 0.1763

 

+ 1 if coverage is ≥ 90%

    

Immunogenicity (MHC-I) [20]

Antigenicity

MHC-I epitopes immunogenicity

Number of peptides, protein coverage

+  0–1 proportional to coverage

7(76)|14|36|18.77

Predicted immunogenic region

0.1877

8.63435

+  0–1 proportional to total number of peptides of a given length per protein

 

14 peptides of 9AA

14/(405 − 9 + 1) = 0.035264

 

+ 1 if coverage is ≥ 10%

  

1

 

MHC class II [20]

Antigenicity

MHC-II epitopes

Number of peptides, protein coverage

+  0–1 proportional to coverage if 80–90%

2(404)|315|61|99.75

Predicted MHC-II binding

 

10.43995

+  0–1 proportional to total number of peptides of a given length per protein

 

99.75% predicted in 315 peptides of 15AA

315/(405 − 15 + 1) = 0.8056

 

+ 1 if coverage is ≥ 90%

  

1

 

BLAT (IEDBᵇ database*) [20]

Antigenicity

Similarity to curated epitopes from IEDB

Protein coverage

+  0–1 proportional to coverage

None

No hits to epitope database

0

10.43995

+ 1 if coverage is > 70%

    

Autoimmunity [5]

Autoimmunity

Similarity to human proteins

Protein coverage

+ 1 if no autoimmunity

None

No hits to Human

1

11.43995

−2 × (0 to 1) proportional to coverage

    

−2 if coverage is > 20%

    

Autoimmunity Commensals [5]

Autoimmunity

Similarity to user-defined commensal organisms’ proteins

Protein coverage

+ 1 if no autoimmunity

3(39)|9.63

9.63% similarity to commensal

(0.0963) × (−2) = −0.1926

11.24735

−2 × (0 to 1) proportional to coverage

  

(Negative feature)

 

−2 if coverage is > 20%

    

SSRᵈ Finder [4]

Variability of expression

Phase variation

Number of simple sequence repeats

+ 1 if no SSR

None

No DNA SSR found

1

12.24735

−0.5 for each SSR

    

−0.25 for each SSR in the promoter

    

−0.5 for each SSR with frameshift potential

    

−0.01 times the length of the SSR.

    

SSRᵈ Finder Protein [4]

Variability of expression

Potential conformational shifts

Number of protein tandem repeats

−0.2 for each protein repeat, max penalty of 1.

None

No protein SSR found

0

12.24735

IslandPath [21]

Potential for horizontal gene transfer

Genomic Islands

Presence in a GI

−1 for each protein in a GI

None

Not present in a GI

0.5

12.74735

+ 0.5 for absence

    

Jaccard Clusters [22]†

Conservation

Orthologous clusters

Presence in an orthologous cluster

+ 1 for each protein in a COG in ≥ 90% of genomes in at least one method

j_ortholog_cluster_3254|63

Present in > 90% of the genomes

  

−0.25 for each protein in a COG in < 90% of genomes

    

PanOCT [23]†

Conservation

Orthologous clusters

Presence in an orthologous cluster

+ 1 for each protein in a COG in ≥ 90% of genomes in at least one method

PanOCT_cluster_108|63

Present in > 90% of the genomes

  

−0.25 for each protein in a COG in < 90% of genomes

  

1

13.74735

OrthoMCL [24]†

Conservation

Orthologous clusters

Presence in an orthologous cluster

+ 1 for each protein in a COG in ≥ 90% of genomes in at least one method

orthomcl_cluster1407|63

Present in > 90% of the genomes

  

−0.25 for each protein in a COG in < 90% of genomes

    

LS-BSR [25]†

Conservation

Orthologous clusters

Presence in an orthologous cluster

+ 1 for each protein in a COG in ≥ 90% of genomes in at least one method

63

Present in > 90% of the genomes

  

−0.25 for each protein in a COG in < 90% of genomes

    

Attributorᶜ

Function

Annotation & GO Terms

Annotation & GO Terms

+ 1 for each GO term in our surface exposed GO db

hypothetical_protein_domain_protein

No conclusive GO terms predicted

0

13.74735

−1 for each GO term in our non-surface exposed GO db

    

+ 1 if surface exposure keywords are present and the protein is predicted periplasmic

    

ᵃ HMM: Hidden Markov Model. This component includes a collection of HMMs selected from the Pfam database for motifs associated with surface proteins.

    

ᵇ IEDB: Immune Epitope Database and Analysis Resource

    

ᶜ In-house Perl/Python script

    

ᵈ SSRs: simple sequence repeats

    

*If any three of the PSORTb, LipoP, SignalP and IEDB database matches are positive, the weight is incremented by 2.

 

True

2

15.74735

^If all surface exposure tools fail to give a conclusive prediction, the weight is decremented by 2.

 

False

0

15.74735

†Each protein is given an additional 0.1 for > 90% presence in each of the clustering algorithms, Jaccard Clusters, PanOCT, OrthoMCL and LS-BSR, and penalized 0.5 for < 90% presence or absence of a cluster for each tool.

 

True

0.4

ReVac Score = 16.14735
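
The worked example in the table can be checked by hand or with a short script. The following is a minimal sketch in Python, not the ReVac implementation: it copies the example weights for NAO366_1291 from the table and adds them in row order. The function names (`peptide_density`, `revac_score`, `cumulative`) and the list `EXAMPLE_WEIGHTS` are hypothetical, and the protein length of 405 residues is inferred from the denominators shown in the table, such as 14/(405 − 7 + 1).

```python
PROTEIN_LENGTH = 405  # residues; inferred from the denominators in the table

def peptide_density(n_peptides, peptide_length, protein_length=PROTEIN_LENGTH):
    """Predicted peptides divided by the number of possible windows of that length."""
    return n_peptides / (protein_length - peptide_length + 1)

# (row, example weight) pairs copied from the table, in row order.
# Rounded values are used exactly as printed in the table.
EXAMPLE_WEIGHTS = [
    ("PSORTb", 1.0),
    ("LipoP", 1.0),
    ("TMHMM", 0.5),
    ("SignalP", 1.0),
    ("SPAAN", 0.5),
    ("Surface HMMs", 0.0),
    ("Antigenic", 0.5 + 0.4173),                       # region + coverage
    ("Bcell Pred", 0.1457 + 0.03509),                  # coverage + peptide density
    ("MHC class I", 0.3123 + 1.0),                     # density + bonus for coverage >= 90%
    ("NetCTLpan", 0.8247 + 0.1763),                    # coverage + density
    ("Immunogenicity (MHC-I)", 0.1877 + 0.035264 + 1.0),
    ("MHC class II", 0.8056 + 1.0),
    ("BLAT (IEDB)", 0.0),
    ("Autoimmunity", 1.0),
    ("Autoimmunity Commensals", -2 * 0.0963),          # -2 x coverage fraction
    ("SSR Finder", 1.0),
    ("SSR Finder Protein", 0.0),
    ("IslandPath", 0.5),
    ("Conservation (clustering tools)", 1.0),
    ("Attributor", 0.0),
    ("* bonus for three positive tools", 2.0),
    ("^ penalty for inconclusive surface exposure", 0.0),
    ("† clustering bonus, 4 tools x 0.1", 0.4),
]

def revac_score(weights=EXAMPLE_WEIGHTS):
    """Sum all example weights into a single score."""
    return sum(weight for _, weight in weights)

def cumulative(weights=EXAMPLE_WEIGHTS):
    """Running total after each row, rounded to 5 decimals as in the table."""
    total, rows = 0.0, []
    for name, weight in weights:
        total += weight
        rows.append((name, round(total, 5)))
    return rows

if __name__ == "__main__":
    for name, running_total in cumulative():
        print(f"{name:45s} {running_total}")
    print("ReVac score:", round(revac_score(), 5))
```

Calling `peptide_density(14, 7)` recomputes the B-cell density term from the raw counts, which agrees with the 0.03509 printed in the table to the shown precision. The sketch only reproduces the example column and does not attempt to encode the conditional scoring rules themselves.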