Table 4 Top 2 clusters for the unidirectional promoter. The word-based clusters for the two most overrepresented words for the bidirectional promoters. Rank 1 refers to word ACCCGCCT and Rank 2 to CTTCTTTC.

From: Word-based characterization of promoters involved in human DNA repair pathways

(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| ACCCGCCT | 4 | 0.716577 | 4 | 0.727273 | 6.87826 | AGGCGGGT | 19440 | No |
| ATCCGCCT | 1 | 0.132296 | 1 | 0.133333 | 2.02271 | AGGCGGAT | NA | No |
| ACCAGCCT | 2 | 0.738772 | 2 | 0.75 | 1.99183 | AGGCTGGT | 1303 | No |
| AGCCGCCT | 1 | 0.657331 | 1 | 0.666667 | 0.419567 | AGGCGGCT | 1056 | No |
| ACCCACCT | 1 | 0.738772 | 1 | 0.75 | 0.302766 | AGGTGGGT | NA | No |
| ACGCGCCT | 1 | 1.16147 | 1 | 1.18519 | -0.14969 | AGGCGCGT | NA | No |
| CCCCGCCT | 1 | 2.45503 | 2 | 2.54545 | -0.89814 | AGGCGGGG | 21912 | No |

(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| CTTCTTTC | 5 | 1.7686 | 5 | 1.81818 | 5.19624 | GAAAGAAG | 13567 | No |
| CTACTTTC | 1 | 0.180301 | 1 | 0.181818 | 1.71313 | GAAAGTAG | NA | No |
| CTTCTTCC | 1 | 0.304671 | 1 | 0.307692 | 1.18852 | GGAAGAAG | 5306 | No |
| CTGCTTTC | 2 | 1.15305 | 2 | 1.17647 | 1.10147 | GAAAGCAG | 9703 | No |
| CGTCTTTC | 1 | 0.371023 | 1 | 0.375 | 0.991491 | GAAAGACG | 20167 | No |
| CTCCTTTC | 3 | 2.36561 | 3 | 2.45 | 0.712729 | GAAAGGAG | 11346 | No |
| CTTCTATC | 1 | 0.607134 | 1 | 0.615385 | 0.499005 | GATAGAAG | NA | No |
| CTTCCTTC | 1 | 0.921427 | 1 | 0.9375 | 0.0818318 | GAAGGAAG | 10908 | No |
| GTTCTTTC | 1 | 1.07027 | 1 | 1.09091 | -0.067912 | GAAAGAAC | 17502 | No |
| CTTTTTTC | 1 | 1.2055 | 1 | 1.23077 | -0.186894 | GAAAAAAG | NA | No |
| TTTCTTTC | 2 | 3.4628 | 2 | 3.63636 | -1.09786 | GAAAGAAA | NA | No |
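
The derived columns in both panels can be checked from the other columns. The following is a minimal sketch, not the authors' code: it assumes the score column is S multiplied by the natural logarithm of S/ES (which reproduces the tabulated values), that RevComp. is the plain reverse complement of the word, and that Palindrome means a word equal to its own reverse complement. The function names are hypothetical.

```python
import math

# Watson-Crick base pairing used to build the reverse complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word (the RevComp. column)."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES)."""
    return s * math.log(s / es)


def is_palindrome(word: str) -> bool:
    """Assumed meaning of the Palindrome column: word equals its reverse complement."""
    return word == reverse_complement(word)


if __name__ == "__main__":
    # (word, S, ES) for the seed word of each panel, copied from the table.
    for word, s, es in [("ACCCGCCT", 4, 0.716577), ("CTTCTTTC", 5, 1.7686)]:
        print(word, reverse_complement(word), round(score(s, es), 5), is_palindrome(word))
```

Running the sketch on the two seed words should give values that agree with the Rank 1 and Rank 2 rows to within rounding.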