Table 16 Co-occurrence in Core Promoters

From: The word landscape of the non-coding segments of the Arabidopsis thaliana genome

Word1 Word2 S ES S*ln(S/ES)
GCCCAATA GCCCATTA 32 2.3492 83.5729
TTTTTTCT TTTTTCTT 68 22.9531 73.8516
AATAAAAA AAGAAAAA 84 41.5798 59.069
CTCTCTTT CTTTCTCT 40 9.1626 58.95
AATAAAAA ATTAAAAA 57 22.4453 53.1222
ACAAAAAA AAGAAAAA 71 35.1265 49.9645
ACAAAAAA AGAAAAAA 66 31.1075 49.6455
ATTTCTCA TATAAATA 30 6.1031 47.772
AATAAAAA TAAAAAAT 38 10.8748 47.5432
AAAAAACA ACAAAAAA 56 24.4921 46.3121
AAAAATAT AAAAAACA 44 15.5191 45.8533
AACAAAAA AAGAAAAA 77 42.5433 45.6828
AACAAAAA AGAAAAAA 69 37.6758 41.7512
TTTCTTTT TTTTTTGT 40 14.2927 41.1653
AAAAAACA ATATAAAG 30 7.659 40.9596
AAAAAACA CTATATAA 36 11.9538 39.689
AAAAATAT CTATATAA 30 8.0863 39.3309
TATATAAA TAAAAAAT 36 12.3623 38.4793
AATAAAAA TTAAAAAA 53 25.8324 38.0892
TTTTATTT TTTTTTAA 38 14.0039 37.9336
TTTTATTT TTTTTCTT 50 23.5743 37.5932
TTCTTTTT TTTTTCTT 46 20.3942 37.416
AAATTAAA ACAAAAAA 44 18.9721 37.0137
AATAAAAA AGAAAAAA 65 36.8225 36.938
TTTCTTTT TTTTTGTT 41 16.8429 36.4755
Overrepresented non-overlapping word-pairs detected in the core promoters of Arabidopsis thaliana. Each word-pair is described by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score, S*ln(S/ES), that measures how strongly the pair is overrepresented in the sequence set.
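
The score column can be recomputed from S and ES with the formula in the column header. The following is a minimal Python sketch, not the authors' implementation: the function name `overrepresentation_score` is hypothetical, and ES is taken as given from the table, since how it is estimated is not described here. Because the table reports ES rounded to four decimals, recomputed scores should match the printed ones only approximately.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES), the score named in the table header.

    s  : observed number of sequences containing the word-pair (S)
    es : expected number of sequences containing the word-pair (ES)
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# First two rows of the table: (Word1, Word2, S, ES, reported score)
rows = [
    ("GCCCAATA", "GCCCATTA", 32, 2.3492, 83.5729),
    ("TTTTTTCT", "TTTTTCTT", 68, 22.9531, 73.8516),
]

for word1, word2, s, es, reported in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1} {word2} recomputed={score:.4f} reported={reported}")
```

The score is positive whenever S exceeds ES, and it grows with both the ratio S/ES and the absolute count S, so a frequently occurring pair with a moderate ratio can rank alongside a rarer pair with a large ratio.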