
Table 1 The features used by EFIN

From: EFIN: predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome

| Name | Description | Value and range |
| --- | --- | --- |
| Reference amino acid (AAref) | The reference amino acid at the query position | Nominal (A,R,N…V)* |
| Mutant amino acid (AAmut) | The mutant amino acid at the query position | Nominal (A,R,N…V)* |
| Frequency of reference amino acid (Fref) | Frequency of the reference amino acid at the query position in each block | Interval [0,1]; 1 means the reference amino acid is perfectly conserved |
| Frequency of mutant amino acid (Fmut) | Frequency of the mutant amino acid at the query position in each block | Interval [0,1]; 1 means all sequences have the mutant amino acid at the position |
| Shannon entropy (H) | Shannon entropy at the query position in each block | Interval [0,4.322]; 0 means no diversity, and larger values mean more diversity at the position |
| NAS of the first sequence in each block (NASfirst) | Normalized alignment score of the first sequence in each block | Interval [0,1]; 1 means the sequence is identical to the query human protein |
| Number of sequences in each block (No_all) | Total number of sequences in each block | Interval [0,5000]; 5000 is the cutoff for each MSA |
| Number of sequences which cover the query position in each block (No_qp) | Number of sequences that cover the query position in each block | Interval [0,5000]; 5000 is the cutoff for each MSA |
| No_qp/No_all (RatioNN) | The ratio of No_qp to No_all | Interval [0,1] |
| Lowest conserved block | The lowest block in which all sequences, together with all sequences in the blocks above it, have the reference amino acid perfectly conserved | Ordinal (primate block, non-primate mammal block, non-mammal vertebrate block, invertebrate block, other species block) |

*The 20 amino acids in human proteins.
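
The per-block features in Table 1 (Fref, Fmut, H, No_all, No_qp, RatioNN, and the lowest conserved block) can be illustrated with a minimal Python sketch. This is not EFIN's implementation. It assumes that each block is given as a string holding one character per sequence at the query position, that `-` marks a sequence that does not cover the position, that frequencies are computed over the covering sequences only, and that an empty or non-conserved block ends the search for the lowest conserved block. The function names `column_features` and `lowest_conserved_block` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # the 20 standard amino acids

# Block order from highest to lowest, as listed in Table 1.
BLOCK_ORDER = [
    "primate",
    "non-primate mammal",
    "non-mammal vertebrate",
    "invertebrate",
    "other species",
]


def column_features(column, aa_ref, aa_mut):
    """Compute per-block features for one alignment column.

    column: one character per sequence in the block at the query
    position; any character outside AMINO_ACIDS (e.g. '-') is treated
    as "does not cover the position" (an assumption of this sketch).
    """
    covered = [aa for aa in column if aa in AMINO_ACIDS]
    no_all, no_qp = len(column), len(covered)
    counts = Counter(covered)
    if no_qp:
        f_ref = counts[aa_ref] / no_qp
        f_mut = counts[aa_mut] / no_qp
        # Base-2 Shannon entropy over the observed residues.
        entropy = sum(-(c / no_qp) * math.log2(c / no_qp) for c in counts.values())
    else:
        f_ref = f_mut = entropy = 0.0
    ratio = no_qp / no_all if no_all else 0.0
    return {
        "Fref": f_ref,
        "Fmut": f_mut,
        "H": entropy,
        "No_all": no_all,
        "No_qp": no_qp,
        "RatioNN": ratio,
    }


def lowest_conserved_block(columns_by_block, aa_ref):
    """Return the lowest block such that it and every block above it
    contain only aa_ref at the query position, or None if even the
    top block is not perfectly conserved."""
    lowest = None
    for name in BLOCK_ORDER:
        covered = [aa for aa in columns_by_block.get(name, "") if aa in AMINO_ACIDS]
        # Assumption: a block with no covering sequences stops the descent.
        if not covered or any(aa != aa_ref for aa in covered):
            break
        lowest = name
    return lowest
```

For example, `column_features("AAAV-", "A", "V")` gives Fref = 0.75, Fmut = 0.25, and RatioNN = 0.8. A column containing each of the 20 amino acids exactly once gives H = log2(20) ≈ 4.322, which matches the upper bound listed for H in the table and is why base-2 logarithms are used in this sketch.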