Table 3 Top two clusters for the bidirectional promoters. Word-based clusters for the two most overrepresented words in the bidirectional promoters: TCGCGCCA (Rank 1) and TCCCGGGA (Rank 2).

From: Word-based characterization of promoters involved in human DNA repair pathways

(a) Rank 1
Word S ES O EO S·ln(S/ES) RevComp. Position Palindrome
TCGCGCCA 4 0.918299 4 0.9375 5.88611 TGGCGCGA 12538 No
TCGCCCCA 3 0.805161 3 0.820513 3.94598 TGGGGCGA 2834 No
TAGCGCCA 1 0.263929 1 0.266667 1.33207 TGGCGCTA 4918 No
TCGAGCCA 1 0.469775 1 0.47619 0.755501 TGGCTCGA NA No
TCGCGACA 1 0.655751 1 0.666667 0.421975 TGTCGCGA NA No
TCGGGCCA 1 0.683955 1 0.695652 0.379863 TGGCCCGA NA No
TTGCGCCA 1 0.693903 2 0.705882 0.365423 TGGCGCAA NA No
TCGCGGCA 1 0.826074 1 0.842105 0.191071 TGCCGCGA NA No
TCGCGTCA 1 0.84063 1 0.857143 0.173604 TGACGCGA 4051 No
TCGCGCCC 1 1.51582 1 1.5625 -0.41596 GGGCGCGA 13089 No
CCGCGCCA 2 2.5054 2 2.625 -0.4506 TGGCGCGG NA No
(b) Rank 2
Word S ES O EO S·ln(S/ES) RevComp. Position Palindrome
TCCCGGGA 8 3.97165 8 4.26667 5.60208 TCCCGGGA 2 Yes
TCCAGGGA 2 0.941495 2 0.961538 1.50687 TCCCTGGA NA No
TCCCGAGA 2 1.05556 2 1.08 1.27816 TCTCGGGA 13248 No
TGCCGGGA 1 0.514348 1 0.521739 0.664856 TCCCGGCA NA No
TCCCGTGA 1 0.702073 1 0.714286 0.353718 TCACGGGA NA No
TCCCAGGA 4 3.71413 5 3.97222 0.296597 TCCTGGGA 19059 No
TCTCGGGA 2 1.73986 2 1.8 0.278683 TCCCGAGA 3074 No
ACCCGGGA 1 0.785281 1 0.8 0.241714 TCCCGGGT 20941 No
TCCCCGGA 1 0.852649 1 0.869565 0.159407 TCCGGGGA NA No
TCCCGCGA 1 1.01424 1 1.03704 -0.01414 TCGCGGGA NA No
TCCCGGAA 3 3.29619 3 3.5 -0.28247 TTCCGGGA NA No
TCCTGGGA 1 1.32696 1 1.36364 -0.28289 TCCCAGGA 13129 No
TCCCGGGG 3 3.34568 3 3.55556 -0.32717 CCCCGGGA 21071 No
TCCCGGGT 1 2.38044 1 2.48889 -0.86729 ACCCGGGA 13746 No
CCCCGGGA 1 2.78651 1 2.93333 -1.02479 TCCCGGGG 19211 No
GCCCGGGA 1 3.73853 2 4 -1.31869 TCCCGGGC 21163 No
TCCCGGGC 3 5.1829 4 5.68889 -1.64025 GCCCGGGA 21138 No
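
The derived columns can be recomputed from the raw ones. The sketch below is a minimal illustration in Python, not the authors' code. It assumes the score column is the product S·ln(S/ES) with a natural logarithm, which agrees with the tabulated values, and it assumes a word counts as a palindrome when it equals its own reverse complement. The function names (`reverse_complement`, `is_palindrome`, `hamming`, `score`) are hypothetical. The `hamming` helper is included because every word listed in each cluster differs from its seed word by a single substitution; this is an observation about the table rather than a statement of the clustering rule used by the authors.

```python
import math

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word (the RevComp. column)."""
    return word.translate(COMPLEMENT)[::-1]


def is_palindrome(word: str) -> bool:
    """Assumed definition: a word is a palindrome if it equals its reverse complement."""
    return word == reverse_complement(word)


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two words of equal length."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES)."""
    return s * math.log(s / es)


if __name__ == "__main__":
    # (word, S, ES) copied from the first row of each panel of the table.
    seeds = [("TCGCGCCA", 4, 0.918299), ("TCCCGGGA", 8, 3.97165)]
    for word, s, es in seeds:
        print(word, reverse_complement(word), is_palindrome(word), round(score(s, es), 5))
```

Rows where S is below ES, such as TCGCGCCC in panel (a), produce a negative score under this formula, which matches the sign of the tabulated values.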