Table 18 Co-occurrence in Distal Promoters

From: The word landscape of the non-coding segments of the Arabidopsis thaliana genome

Word1      Word2      S      ES         S*ln(S/ES)
TAAAAAAT   ATTTTTTA   1855   898.8038   1344.087
AATATATT   TAAAAAAT   1759   902.7094   1173.429
AATATATT   ATTTTTTA   1692   882.8679   1100.631
TTATATAA   ATTTTTTA   1478   740.7429   1020.99
TTATATAA   TAAAAAAT   1464   757.3903   964.8477
AATATATT   TTATATAA   1447   743.9616   962.6287
AAAAATTG   TAAAAAAT   1301   747.7933   720.4442
CAATTTTT   TAAAAAAT   1279   745.3293   690.6698
AAAAATTG   ATTTTTTA   1237   731.3568   650.0966
ATTTTGTA   ATTTTTTA   1156   665.4975   638.3272
CAATTTTT   ATTTTTTA   1200   728.947    598.171
TAGAAAAT   TAAAAAAT   1024   586.114    571.3484
ATTTTGTA   TAAAAAAT   1108   680.4539   540.2074
CAATTTTT   AATATATT   1162   732.1145   536.7987
ATTTTTCA   ATTTTTTA   1078   666.4705   518.3745
AAAAATTG   AATATATT   1148   734.5348   512.627
CAATTTTT   TTATATAA   1003   614.2579   491.8069
TAGAAAAT   AATATATT   956    575.7221   484.8189
ATTTTCTA   ATTTTTTA   952    574.2477   481.2399
ATTTTCTA   TAAAAAAT   964    587.1534   477.9562
TAGAAAAT   ATTTTTTA   941    573.2313   466.4103
ATTTTTCA   TAAAAAAT   1058   681.4487   465.4297
TGAAAAAT   ATTTTTTA   1020   658.2655   446.7086
TGAAAAAT   TAAAAAAT   1033   673.0593   442.5259
AAAAATTG   TTATATAA   970    616.2886   439.9733

Overrepresented non-overlapping word pairs detected in the distal promoters of Arabidopsis thaliana. Each word pair is characterized by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score quantifying the overrepresentation of the pair in the sequence set (S*ln(S/ES)).
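
The score column can be recomputed from S and ES alone. The following is a minimal Python sketch, assuming the natural logarithm as written in the column header. The function name `overrepresentation_score` is hypothetical, and the two sample rows are copied from the table. How ES is estimated is not described here, so it is taken as given.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and an expected count ES.

    Hypothetical helper for illustration; both inputs must be positive.
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# Two rows copied from the table: (Word1, Word2, S, ES)
sample_rows = [
    ("TAAAAAAT", "ATTTTTTA", 1855, 898.8038),
    ("AATATATT", "TAAAAAAT", 1759, 902.7094),
]

for word1, word2, s, es in sample_rows:
    score = overrepresentation_score(s, es)
    print(f"{word1} {word2} S={s} ES={es} score={score:.3f}")
```

Because the score multiplies the log-ratio by S, a pair that is both frequent and enriched ranks above a rare pair with the same S/ES ratio. The score is positive only when S exceeds ES.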