Table 4 Top 2 clusters for the unidirectional promoter. The word-based clusters for the two most overrepresented words for the bidirectional promoters. Rank 1 refers to word ACCCGCCT and Rank 2 to CTTCTTTC.

From: Word-based characterization of promoters involved in human DNA repair pathways

(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ACCCGCCT | 4 | 0.716577 | 4 | 0.727273 | 6.87826 | AGGCGGGT | 19440 | No |
| ATCCGCCT | 1 | 0.132296 | 1 | 0.133333 | 2.02271 | AGGCGGAT | NA | No |
| ACCAGCCT | 2 | 0.738772 | 2 | 0.75 | 1.99183 | AGGCTGGT | 1303 | No |
| AGCCGCCT | 1 | 0.657331 | 1 | 0.666667 | 0.419567 | AGGCGGCT | 1056 | No |
| ACCCACCT | 1 | 0.738772 | 1 | 0.75 | 0.302766 | AGGTGGGT | NA | No |
| ACGCGCCT | 1 | 1.16147 | 1 | 1.18519 | -0.14969 | AGGCGCGT | NA | No |
| CCCCGCCT | 1 | 2.45503 | 2 | 2.54545 | -0.89814 | AGGCGGGG | 21912 | No |

(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CTTCTTTC | 5 | 1.7686 | 5 | 1.81818 | 5.19624 | GAAAGAAG | 13567 | No |
| CTACTTTC | 1 | 0.180301 | 1 | 0.181818 | 1.71313 | GAAAGTAG | NA | No |
| CTTCTTCC | 1 | 0.304671 | 1 | 0.307692 | 1.18852 | GGAAGAAG | 5306 | No |
| CTGCTTTC | 2 | 1.15305 | 2 | 1.17647 | 1.10147 | GAAAGCAG | 9703 | No |
| CGTCTTTC | 1 | 0.371023 | 1 | 0.375 | 0.991491 | GAAAGACG | 20167 | No |
| CTCCTTTC | 3 | 2.36561 | 3 | 2.45 | 0.712729 | GAAAGGAG | 11346 | No |
| CTTCTATC | 1 | 0.607134 | 1 | 0.615385 | 0.499005 | GATAGAAG | NA | No |
| CTTCCTTC | 1 | 0.921427 | 1 | 0.9375 | 0.0818318 | GAAGGAAG | 10908 | No |
| GTTCTTTC | 1 | 1.07027 | 1 | 1.09091 | -0.067912 | GAAAGAAC | 17502 | No |
| CTTTTTTC | 1 | 1.2055 | 1 | 1.23077 | -0.186894 | GAAAAAAG | NA | No |
| TTTCTTTC | 2 | 3.4628 | 2 | 3.63636 | -1.09786 | GAAAGAAA | NA | No |
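
The derived columns of the table can be reproduced with a short sketch. It assumes the score column is S multiplied by the natural logarithm of S/ES, which agrees with the tabulated values, and that RevComp. is the standard reverse complement of the word. The function names `reverse_complement`, `score`, and `hamming` are illustrative and are not taken from the paper.

```python
import math

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Complement each base, then reverse the string."""
    return word.translate(_COMPLEMENT)[::-1]


def score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES)."""
    return s * math.log(s / es)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


seed = "ACCCGCCT"                       # Rank 1 seed word
print(reverse_complement(seed))         # AGGCGGGT
print(score(4, 0.716577))               # close to the tabulated 6.87826
print(hamming(seed, "CCCCGCCT"))        # 1
```

In both panels, every listed word differs from its seed word at exactly one position, so `hamming(seed, word)` returns 1 for each non-seed row.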