Table 14 Co-occurrence in 5'UTRs

From: The word landscape of the non-coding segments of the Arabidopsis thaliana genome

| Word1 | Word2 | S | ES | S*ln(S/ES) |
|---|---|---|---|---|
| CTCTCTTT | CTTTCTCT | 209 | 108.1185 | 137.7533 |
| TTTCTCTC | CTCTCTTT | 214 | 139.4419 | 91.6622 |
| TTTCTCTC | CTTTCTCT | 198 | 125.808 | 89.7949 |
| TTTTTTGT | TTTTCTTT | 97 | 41.7516 | 81.7683 |
| CTTCTCTT | CTCTTCTC | 97 | 45.9973 | 72.3745 |
| CTCTGTTT | TTTTTCTT | 105 | 54.0587 | 69.7085 |
| TTTTTTGT | TTTTTCTT | 97 | 48.6186 | 66.9983 |
| TTTTCTTT | TTTTTCTT | 122 | 71.3728 | 65.4048 |
| TTTTTGTT | TTTTTCTT | 115 | 65.2326 | 65.2019 |
| TTTCTCTC | CTCTTCTC | 128 | 78.07 | 63.2863 |
| TTTTCTTT | TTTTTGTT | 103 | 56.0093 | 62.7487 |
| CTCTGTTT | TTTTTGTT | 87 | 42.4337 | 62.4629 |
| AAAGAAAA | AGAAAAAA | 130 | 82.9236 | 58.4498 |
| CTCTCTGT | CTTTCTCT | 90 | 47.3124 | 57.8733 |
| CTTTCTCT | CTCTTCTC | 105 | 60.5869 | 57.7376 |
| TTTTCTCC | CTCTTCTC | 61 | 23.918 | 57.1107 |
| ACAAAAAA | AAAAAACA | 92 | 49.5364 | 56.9554 |
| CTTTCTTC | CTCTTCTC | 88 | 47.0073 | 55.179 |
| AAGAAAAA | AGAAAAAA | 141 | 95.4769 | 54.9724 |
| CTCTCTTT | CTCTTCTC | 109 | 67.1219 | 52.8472 |
| GAAAGAGA | AGAGAAAG | 57 | 22.6518 | 52.6003 |
| TTTCCTCT | CTTTCTCT | 79 | 40.6193 | 52.5511 |
| TTTCCTCT | TTTCTCTC | 91 | 52.3194 | 50.3678 |
| TTTTCTTT | CTCTCTTT | 127 | 85.6598 | 50.013 |
| TTCTCTCC | CTCTTCTC | 53 | 21.4631 | 47.9097 |

  1. Overrepresented non-overlapping word pairs detected in the 5' untranslated regions (5'UTRs) of Arabidopsis thaliana. Each word pair is described by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score, S*ln(S/ES), that measures how strongly the pair is overrepresented in this sequence set.
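
As a rough illustration of how the score column relates to S and ES, the following minimal sketch recomputes S*ln(S/ES) for a few rows of the table. The function name `overrepresentation_score` and the handling of S = 0 are assumptions made for this example and are not taken from the paper. Because the table reports ES rounded to four decimals, recomputed scores may differ from the listed ones in the last digit.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES), the score reported in the last table column.

    s  -- observed number of sequences containing the word pair (S)
    es -- expected number of sequences containing the word pair (ES)

    Hypothetical helper for illustration only; not the authors' code.
    """
    if es <= 0:
        raise ValueError("ES must be positive")
    if s < 0:
        raise ValueError("S must be non-negative")
    if s == 0:
        # Assumption: use the limit of S*ln(S/ES) as S approaches 0.
        return 0.0
    return s * math.log(s / es)


# (Word1, Word2, S, ES, reported score) copied from rows of the table.
rows = [
    ("CTCTCTTT", "CTTTCTCT", 209, 108.1185, 137.7533),
    ("CTTCTCTT", "CTCTTCTC", 97, 45.9973, 72.3745),
    ("TTCTCTCC", "CTCTTCTC", 53, 21.4631, 47.9097),
]

for word1, word2, s, es, reported in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1} {word2} recomputed={score:.4f} reported={reported}")
```

A score above zero means the pair occurs in more sequences than expected (S > ES). Multiplying the log-ratio by S gives more weight to pairs that are both enriched and frequent.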