Table 17 Co-occurrence in Proximal Promoters

From: The word landscape of the non-coding segments of the Arabidopsis thaliana genome

Word1     Word2     S     ES        S*ln(S/ES)
AAATTTTA  TAAAAAAT  996   489.8445  706.8206
ATTTTTTA  TAAAAAAT  869   395.77    683.4771
TAAATTTT  TAAAAAAT  970   501.8706  639.1852
AAAAATTA  TAAAAAAT  1040  565.2386  634.1171
TAAAATTT  TAAAAAAT  963   498.7952  633.5171
TAAAATTT  ATTTTTTA  892   458.4645  593.7003
AAATTTTA  ATTTTTTA  868   450.2375  569.7695
AAAAATTA  ATTTTTTA  947   519.5356  568.5445
AAAATTTA  TAAAAAAT  919   496.1801  566.4231
TAATTTTT  TAAAAAAT  965   539.2575  561.5671
AAAATTTA  ATTTTTTA  865   456.0608  553.6894
TAATTTTT  ATTTTTTA  907   495.6552  548.0656
AATATATT  TAAAAAAT  776   391.8276  530.2646
AAAATTTA  AAATTTTA  973   564.4665  529.8015
AAATTTTA  TAAAATTT  976   567.4415  529.3092
AAAAATTA  TAATTTTT  1125  707.8947  521.1483
AATATATT  ATTTTTTA  730   360.1459  515.7708
TAAATTTT  ATTTTTTA  845   461.2912  511.4845
AAAAATTA  TAAAATTT  1052  654.7789  498.8066
AAAATTTA  AAAAATTA  1044  651.346   492.5318
AAAATTTA  TAAAATTT  958   574.7807  489.4031
AAATTTTA  TAATTTTT  993   613.4724  478.2242
TAATTTTT  TAAAATTT  995   624.6821  463.1724
AAAATTTA  TAATTTTT  990   621.407   461.0615
TTATATAA  TAAAAAAT  645   316.3233  459.5531

Overrepresented non-overlapping word pairs detected in the proximal promoters of Arabidopsis thaliana. Each word pair is characterized by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score quantifying the overrepresentation of the word pair in the sequence set (S*ln(S/ES)).
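
The score column can be recomputed from S and ES alone. The following is a minimal Python sketch, assuming that ln denotes the natural logarithm and that S and ES are taken exactly as printed in the table; the function name `overrepresentation_score` and the `rows` list are illustrative and do not come from the article.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and an expected count ES.

    Hypothetical helper: the name and the input checks are assumptions,
    only the formula itself is taken from the table header.
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# First two rows of the table: (Word1, Word2, S, ES)
rows = [
    ("AAATTTTA", "TAAAAAAT", 996, 489.8445),
    ("ATTTTTTA", "TAAAAAAT", 869, 395.77),
]

for word1, word2, s, es in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1}  {word2}  {score:.4f}")
```

Because ES is printed to at most four decimal places, recomputed scores may differ from the tabulated ones in the last digits. A pair that occurs exactly as often as expected (S equal to ES) scores 0, and the score grows with both the ratio S/ES and the absolute count S.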