
Table 1 The features used by EFIN

From: EFIN: predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome

| Name | Description | Value and range |
|---|---|---|
| Reference amino acid (AAref) | The reference amino acid at the query position | Nominal (A,R,N…V)* |
| Mutant amino acid (AAmut) | The mutant amino acid at the query position | Nominal (A,R,N…V)* |
| Frequency of reference amino acid (Fref) | Frequency of the reference amino acid at the query position in each block | Interval [0,1]; 1 means perfect conservation of the reference amino acid |
| Frequency of mutant amino acid (Fmut) | Frequency of the mutant amino acid at the query position in each block | Interval [0,1]; 1 means that all sequences have the mutant amino acid at the position |
| Shannon entropy (H) | Shannon entropy in each block at the query position | Interval [0,4.322]; 0 means no diversity, and larger values mean more diversity at the position |
| NAS of the first sequence in each block (NASfirst) | Normalized alignment score of the first sequence in each block | Interval [0,1]; 1 means the sequence is identical to the query human protein |
| Number of sequences in each block (No_all) | Total number of sequences in each block | Interval [0,5000]; 5000 is the cutoff for each MSA |
| Number of sequences which cover the query position in each block (No_qp) | Number of sequences that cover the query position in each block | Interval [0,5000]; 5000 is the cutoff for each MSA |
| No_qp/No_all (RatioNN) | The ratio of No_qp to No_all | Interval [0,1] |
| Lowest conserved block | The lowest block for which all sequences, together with all the sequences in upper blocks, have the reference amino acid perfectly conserved | Ordinal (primate block, non-primate mammal block, non-mammal vertebrate block, invertebrate block, other species block) |
*The 20 amino acids in human proteins.
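
As an illustration of how the per-block features in Table 1 relate to one another, the following is a minimal Python sketch, not the EFIN implementation. It assumes that each block is given as one alignment column at the query position, with `-` marking a sequence that does not cover the position, and that Shannon entropy is taken in base 2 over the 20 amino acids (consistent with the stated upper bound, since log2(20) ≈ 4.322). The names `block_features`, `lowest_conserved_block`, and `BLOCK_ORDER` are hypothetical, and treating an empty block as not conserved is also an assumption.

```python
import math
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # the 20 standard amino acids

# Block order from upper to lower, as listed in Table 1.
BLOCK_ORDER = [
    "primate",
    "non-primate mammal",
    "non-mammal vertebrate",
    "invertebrate",
    "other species",
]


def block_features(column, aa_ref, aa_mut):
    """Compute Fref, Fmut, H, No_all, No_qp and RatioNN for one block.

    `column` holds one character per sequence in the block; any character
    that is not one of the 20 amino acids (e.g. '-') is treated as
    "does not cover the query position" (an assumption of this sketch).
    """
    no_all = len(column)
    residues = [r for r in column if r in AMINO_ACIDS]
    no_qp = len(residues)
    counts = Counter(residues)

    if no_qp == 0:
        f_ref = f_mut = entropy = 0.0
    else:
        f_ref = counts[aa_ref] / no_qp
        f_mut = counts[aa_mut] / no_qp
        # Base-2 Shannon entropy: 0 for a single residue type,
        # log2(20) when all 20 amino acids are equally frequent.
        entropy = sum(
            (c / no_qp) * math.log2(no_qp / c) for c in counts.values()
        )

    ratio_nn = no_qp / no_all if no_all else 0.0
    return {
        "Fref": f_ref,
        "Fmut": f_mut,
        "H": entropy,
        "No_all": no_all,
        "No_qp": no_qp,
        "RatioNN": ratio_nn,
    }


def lowest_conserved_block(columns_by_block, aa_ref):
    """Return the lowest block that, with all upper blocks, has Fref == 1.

    Returns None when even the primate block is not perfectly conserved.
    """
    lowest = None
    for name in BLOCK_ORDER:
        feats = block_features(columns_by_block.get(name, []), aa_ref, aa_ref)
        if feats["Fref"] < 1.0:
            break
        lowest = name
    return lowest
```

For example, `block_features(list("AAAAVV--"), "A", "V")` describes a block of eight sequences in which six cover the query position, four carrying the reference residue and two the mutant residue.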