Table 15 Co-occurrence in Introns

From: The word landscape of the non-coding segments of the Arabidopsis thaliana genome

| Word1 | Word2 | S | ES | S*ln(S/ES) |
|---|---|---|---|---|
| TTTTATTT | ATTTTTTA | 393 | 217.8144 | 231.9354 |
| TTTTTATT | ATTTTTTA | 334 | 186.0726 | 195.3914 |
| TAAAAAAT | AATATATT | 147 | 39.3119 | 193.8792 |
| TTTTTAAT | TTTTTATT | 460 | 306.2869 | 187.084 |
| TAAAAAAT | TTTTATTT | 273 | 140.3538 | 181.6284 |
| TAATTTTT | ATTTTTTA | 238 | 113.2939 | 176.6639 |
| CTCTGTTT | CTGTTTTT | 346 | 208.3136 | 175.5583 |
| TTTTATTT | AATATATT | 308 | 175.8151 | 172.6854 |
| TTTTATTT | TTTTTAAT | 505 | 358.7745 | 172.6415 |
| TAAAAAAT | ATTTTTTA | 149 | 48.6332 | 166.8264 |
| TAAAAAAT | TTTTTAAT | 189 | 79.759 | 163.0573 |
| TAAAAAAT | TAATTTTT | 179 | 73.1119 | 160.2756 |
| TTTTATTT | TAATTTTT | 461 | 328.5857 | 156.0948 |
| TTTTTAAT | ATTTTTTA | 238 | 123.6151 | 155.9133 |
| TAAAAAAT | TTTTTCTT | 305 | 185.7949 | 151.1788 |
| TAAAAAAT | TTTTTATT | 230 | 119.9486 | 149.7338 |
| TTTTTATT | AATATATT | 261 | 150.2261 | 144.1709 |
| TAATTTTT | TTTTTAAT | 300 | 186.1617 | 143.1501 |
| TTTTTAAT | AATATATT | 202 | 99.8493 | 142.3303 |
| TTTTATTT | TTTTTATT | 670 | 542.1648 | 141.8441 |
| TAAAAAAT | TTTTTTGT | 262 | 157.163 | 133.898 |
| TAATTTTT | AATATATT | 187 | 91.5206 | 133.6198 |
| ATTTTTTA | TTTTTTGT | 354 | 243.9756 | 131.769 |
| TAAAAAAT | TTTTGTTT | 357 | 246.9371 | 131.5909 |
| TTTTTAAT | TTTTTGTT | 638 | 519.9558 | 130.5312 |

  1. Overrepresented non-overlapping word pairs detected in the introns of Arabidopsis thaliana. Each word pair is characterized by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score, S*ln(S/ES), that measures the overrepresentation of the pair in the sequence set.
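
As a rough illustration of the score named in the caption, the following minimal Python sketch recomputes S*ln(S/ES) from the S and ES values of two rows of the table. The function name `overrepresentation_score` and the input validation are assumptions made for this example; the sketch does not show how ES itself was estimated, which the table does not specify.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and expected count ES.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# (Word1, Word2, S, ES) copied from two rows of the table.
rows = [
    ("TTTTATTT", "ATTTTTTA", 393, 217.8144),
    ("TAAAAAAT", "AATATATT", 147, 39.3119),
]

for word1, word2, s, es in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1} {word2} S={s} ES={es} score={score:.4f}")
```

Because the tabulated ES values are rounded to four decimal places, the recomputed scores should agree with the listed ones (231.9354 and 193.8792) only up to small rounding differences. Note that the score weights the log-ratio by S, so a pair with a modest ratio but many occurrences can rank above a pair with a larger ratio but few occurrences.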