The pipeline of MOST+ system. A set of target genomic sequences are extracted from a genome then indexed by a suffix tree to count occurrence of each word (or K-mer). If under MOST+ mode, histone modification marks and/or DNase I hypersensitivity (referred as tag signals in this schema or mark distribution) of each word are used to yield mark distribution scores. Top ranked words are put into clustering and motifs are generated from the resulted clusters. The strategy for clustering is illustrated on the right panel of this figure.