Table 5. Edit-distance clusters for bidirectional promoters. Word-based clusters, under the edit distance metric, for the two most overrepresented words in the bidirectional promoters. Rank 1 refers to the word TCGCGCCA and Rank 2 to TCCCGGGA.

From: Word-based characterization of promoters involved in human DNA repair pathways

(a) Rank 1
Word      S  ES        O  EO        S*ln(S/ES)  RevComp   Position  Palindrome
TCGCGCCA  4  0.918299  4  0.9375    5.88611     TGGCGCGA  12538     No
TCGCCCCA  3  0.805161  3  0.820513  3.94598     TGGGGCGA  2834      No
TAGCTCCA  2  0.352982  2  0.357143  3.46897     TGGAGCTA  NA        No
TCTCGCGA  2  0.438673  2  0.444444  3.0343      TCGCGAGA  4937      No
TCGCCACA  2  0.455424  2  0.461538  2.95935     TGTGGCGA  4669      No
...         
(b) Rank 2
Word      S  ES        O  EO        S*ln(S/ES)  RevComp   Position  Palindrome
TCCCGGGA  8  3.97165   8  4.26667   5.60208     TCCCGGGA  2         Yes
TCCCGGCT  6  2.54354   6  2.66667   5.14921     AGCCGGGA  NA        No
ATCCGGGA  2  0.395077  2  0.4       3.24364     TCCCGGAT  NA        No
TCTCGCGA  2  0.438673  2  0.444444  3.0343      TCGCGAGA  4937      No
TTCCTGGA  2  0.493082  2  0.5       2.80045     TCCAGGAA  9505      No
...
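
The derived columns in both panels can be recomputed from the words and counts shown. The Python sketch below is an illustration only, not the authors' code. It assumes that the score column is S times the natural logarithm of S/ES (this reproduces the printed values), that RevComp is the reverse complement of the word, and that Palindrome marks words equal to their own reverse complement. The function names are hypothetical, and `edit_distance` is the standard Levenshtein distance; this excerpt does not state the exact variant or the cluster threshold used in the paper.

```python
import math

# Watson-Crick complement map for uppercase DNA words.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Complement each base, then reverse the word."""
    return word.translate(_COMPLEMENT)[::-1]


def is_palindrome(word: str) -> bool:
    """A DNA word is palindromic if it equals its reverse complement."""
    return word == reverse_complement(word)


def score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES)."""
    return s * math.log(s / es)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Seed words of the two clusters, with S and ES taken from the table.
    for word, s, es in [("TCGCGCCA", 4, 0.918299), ("TCCCGGGA", 8, 3.97165)]:
        print(word, reverse_complement(word), is_palindrome(word),
              round(score(s, es), 5))
    # Distance between the Rank 1 seed and the second word of its cluster.
    print(edit_distance("TCGCGCCA", "TCGCCCCA"))
```

Run on the first row of each panel, the sketch yields reverse complements and palindrome flags that agree with the table, and scores that match the printed values to within rounding.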