Table 3. Top two clusters for the bidirectional promoters. Word-based clusters for the two most overrepresented words in the bidirectional promoters. Rank 1 refers to the word TCGCGCCA and Rank 2 to the word TCCCGGGA.

From: Word-based characterization of promoters involved in human DNA repair pathways

(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCGCGCCA | 4 | 0.918299 | 4 | 0.9375 | 5.88611 | TGGCGCGA | 12538 | No |
| TCGCCCCA | 3 | 0.805161 | 3 | 0.820513 | 3.94598 | TGGGGCGA | 2834 | No |
| TAGCGCCA | 1 | 0.263929 | 1 | 0.266667 | 1.33207 | TGGCGCTA | 4918 | No |
| TCGAGCCA | 1 | 0.469775 | 1 | 0.47619 | 0.755501 | TGGCTCGA | NA | No |
| TCGCGACA | 1 | 0.655751 | 1 | 0.666667 | 0.421975 | TGTCGCGA | NA | No |
| TCGGGCCA | 1 | 0.683955 | 1 | 0.695652 | 0.379863 | TGGCCCGA | NA | No |
| TTGCGCCA | 1 | 0.693903 | 2 | 0.705882 | 0.365423 | TGGCGCAA | NA | No |
| TCGCGGCA | 1 | 0.826074 | 1 | 0.842105 | 0.191071 | TGCCGCGA | NA | No |
| TCGCGTCA | 1 | 0.84063 | 1 | 0.857143 | 0.173604 | TGACGCGA | 4051 | No |
| TCGCGCCC | 1 | 1.51582 | 1 | 1.5625 | -0.41596 | GGGCGCGA | 13089 | No |
| CCGCGCCA | 2 | 2.5054 | 2 | 2.625 | -0.4506 | TGGCGCGG | NA | No |

(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCCCGGGA | 8 | 3.97165 | 8 | 4.26667 | 5.60208 | TCCCGGGA | 2 | Yes |
| TCCAGGGA | 2 | 0.941495 | 2 | 0.961538 | 1.50687 | TCCCTGGA | NA | No |
| TCCCGAGA | 2 | 1.05556 | 2 | 1.08 | 1.27816 | TCTCGGGA | 13248 | No |
| TGCCGGGA | 1 | 0.514348 | 1 | 0.521739 | 0.664856 | TCCCGGCA | NA | No |
| TCCCGTGA | 1 | 0.702073 | 1 | 0.714286 | 0.353718 | TCACGGGA | NA | No |
| TCCCAGGA | 4 | 3.71413 | 5 | 3.97222 | 0.296597 | TCCTGGGA | 19059 | No |
| TCTCGGGA | 2 | 1.73986 | 2 | 1.8 | 0.278683 | TCCCGAGA | 3074 | No |
| ACCCGGGA | 1 | 0.785281 | 1 | 0.8 | 0.241714 | TCCCGGGT | 20941 | No |
| TCCCCGGA | 1 | 0.852649 | 1 | 0.869565 | 0.159407 | TCCGGGGA | NA | No |
| TCCCGCGA | 1 | 1.01424 | 1 | 1.03704 | -0.01414 | TCGCGGGA | NA | No |
| TCCCGGAA | 3 | 3.29619 | 3 | 3.5 | -0.28247 | TTCCGGGA | NA | No |
| TCCTGGGA | 1 | 1.32696 | 1 | 1.36364 | -0.28289 | TCCCAGGA | 13129 | No |
| TCCCGGGG | 3 | 3.34568 | 3 | 3.55556 | -0.32717 | CCCCGGGA | 21071 | No |
| TCCCGGGT | 1 | 2.38044 | 1 | 2.48889 | -0.86729 | ACCCGGGA | 13746 | No |
| CCCCGGGA | 1 | 2.78651 | 1 | 2.93333 | -1.02479 | TCCCGGGG | 19211 | No |
| GCCCGGGA | 1 | 3.73853 | 2 | 4 | -1.31869 | TCCCGGGC | 21163 | No |
| TCCCGGGC | 3 | 5.1829 | 4 | 5.68889 | -1.64025 | GCCCGGGA | 21138 | No |
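
The derived columns in both panels can be recomputed from the other entries. The sketch below is a minimal illustration, not the authors' code: it assumes the score column is S multiplied by the natural logarithm of S/ES (the tabulated values are consistent with this reading), that RevComp. is the standard DNA reverse complement, and that a word counts as a palindrome when it equals its own reverse complement. The function names are hypothetical.

```python
import math

# Complement map for the four DNA bases (assumes unambiguous, upper-case words).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word, e.g. TCGCGCCA -> TGGCGCGA."""
    return word.translate(_COMPLEMENT)[::-1]


def is_palindrome(word: str) -> bool:
    """A word is treated as palindromic when it equals its reverse complement."""
    return word == reverse_complement(word)


def log_ratio_score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES).

    s  : value from the S column
    es : value from the ES column
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must be positive for ln(S/ES) to be defined")
    return s * math.log(s / es)
```

For example, `log_ratio_score(4, 0.918299)` gives approximately 5.886, in line with the Rank 1 seed word, and `is_palindrome("TCCCGGGA")` is true, in line with the only "Yes" entry in the Palindrome column.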