
Table 1 Illustration of the mining process of the modified PrefixSpan algorithm.

From: Similarity evaluation of DNA sequences based on frequent patterns and entropy

Current pattern | Extended patterns
〈C〉: 7, 10, 14, 16, 17 | 〈CA〉: 8; 〈CC〉: 17; 〈CG〉: Empty; 〈CT〉: 11, 15, 18
〈CT〉: 11, 15, 18 | 〈CTA〉: Empty; 〈CTC〉: 16; 〈CTG〉: 12, 19; 〈CTT〉: Empty
〈CTG〉: 12, 19 | 〈CTGA〉: 13, 20; 〈CTGC〉: Empty; 〈CTGG〉: Empty; 〈CTGT〉: Empty
〈CTGA〉: 13, 20 | 〈CTGAA〉: Empty; 〈CTGAC〉: 14; 〈CTGAG〉: Empty; 〈CTGAT〉: Empty
〈A〉: 1, 8, 13, 20 | 〈AA〉: Empty; 〈AC〉: 14; 〈AG〉: Empty; 〈AT〉: 2, 9
〈AT〉: 2, 9 | 〈ATA〉: Empty; 〈ATC〉: 10; 〈ATG〉: 3; 〈ATT〉: Empty
〈G〉: 3, 4, 6, 12, 19 | 〈GA〉: 13, 20; 〈GC〉: 7; 〈GG〉: 4; 〈GT〉: 5
〈T〉: 2, 5, 9, 11, 15, 18 | 〈TA〉: Empty; 〈TC〉: 10, 16; 〈TG〉: 3, 6, 12, 19; 〈TT〉: Empty
〈TC〉: 10, 16 | 〈TCA〉: Empty; 〈TCC〉: 17; 〈TCG〉: Empty; 〈TCT〉: 11
〈TG〉: 3, 6, 12, 19 | 〈TGA〉: 13, 20; 〈TGC〉: 7; 〈TGG〉: 4; 〈TGT〉: Empty
  1. Each row represents one recursive step. The numbers after each pattern represent the starting locations of the suffixes, which are the so-called pseudo-projections. Patterns in bold are maximal.
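
The single extension step that each row records can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It makes three assumptions: the sequence `SEQ` is not stated in the table and is inferred here from the position lists; the numbers are read as 0-based offsets at which each suffix begins (equivalently, the 1-based position of the pattern's last letter); and the helper name `extend` is hypothetical. The sketch covers only the pseudo-projection bookkeeping and does not model the pruning or the maximality check of the modified algorithm.

```python
# Assumption: this sequence is not given in the table; it is inferred from the
# position lists so that every row can be reproduced.
SEQ = "ATGGTGCATCTGACTCCTGA"


def extend(seq, starts, alphabet="ACGT"):
    """Extend a pattern by one letter (hypothetical helper).

    `starts` holds the 0-based offsets at which the suffixes following the
    current pattern begin. The result maps each letter to the suffix offsets
    of the pattern extended by that letter; an empty list corresponds to
    "Empty" in the table.
    """
    extended = {letter: [] for letter in alphabet}
    for start in starts:
        # A suffix that begins past the end of the sequence cannot be extended.
        if start < len(seq) and seq[start] in extended:
            extended[seq[start]].append(start + 1)
    return extended


# Single-letter patterns: extend the empty pattern, whose suffixes start at
# every offset of the sequence.
single = extend(SEQ, range(len(SEQ)))

# Row 〈C〉 of the table: extend 〈C〉 by each of A, C, G, T.
row_c = extend(SEQ, single["C"])
print(row_c)  # {'A': [8], 'C': [17], 'G': [], 'T': [11, 15, 18]}
```

Only offsets are stored, never copies of the suffixes, which is the point of a pseudo-projection: each recursive step costs one character lookup per surviving occurrence.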