Figure 2 | BMC Genomics

Figure 2

From: PNImodeler: web server for inferring protein-binding nucleotides from sequence data


Example of representing a DNA sequence of 9 nucleotides with a sliding window of size 5. The '+' symbol denotes binding and the '-' symbol denotes non-binding. An 'X' in a fragment indicates a null nucleotide at that position. Part A shows how a DNA sequence is divided into sequence fragments, and Part B shows how a sequence fragment is encoded as a feature vector. Each DNA sequence fragment of 5 nucleotides is encoded as 179 feature elements: 4 for the base composition, 5 for mass, 5 for pKa, 1 for the normalized position, 64 for the nucleotide triplet composition, and 100 for the IP of nucleotide triplets. The protein features are encoded in 420 elements: 20 for the sum of the normalized positions of the 20 amino acids and 400 for the dipeptide composition. The feature vector for DPI1 does not include the 420 protein features.
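The windowing and the element counts in the caption can be illustrated with a minimal sketch. It assumes the window is centered on each nucleotide and that the sequence ends are padded with 'X', which the caption suggests but does not state. The function name `sliding_fragments`, the dictionary names, and the example sequence are hypothetical and are not taken from PNImodeler.

```python
def sliding_fragments(seq, window=5, pad="X"):
    """Return one fragment per nucleotide, centered on that nucleotide.

    Assumption: an odd window size, with null nucleotides ('X') padded
    at both ends so that every position has a full-length fragment.
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd to have a center position")
    half = window // 2
    padded = pad * half + seq + pad * half
    return [padded[i:i + window] for i in range(len(seq))]


# Element counts per DNA fragment, as listed in the caption.
DNA_FEATURE_COUNTS = {
    "base_composition": 4,
    "mass": 5,
    "pKa": 5,
    "normalized_position": 1,
    "triplet_composition": 64,
    "triplet_IP": 100,
}

# Element counts for the protein, as listed in the caption.
PROTEIN_FEATURE_COUNTS = {
    "normalized_position_sums": 20,
    "dipeptide_composition": 400,
}

dna_total = sum(DNA_FEATURE_COUNTS.values())          # 179
protein_total = sum(PROTEIN_FEATURE_COUNTS.values())  # 420

# Hypothetical 9-nucleotide sequence, used only for illustration.
fragments = sliding_fragments("ACGTACGTA")
for fragment in fragments:
    print(fragment)
```

Under these assumptions a 9-nucleotide sequence yields 9 fragments of length 5, with 'X' appearing only in the first two and last two fragments. A vector that omits the protein features, as the caption describes for DPI1, would have 179 elements; one that includes them would have 179 + 420 = 599.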
