From: Word-based characterization of promoters involved in human DNA repair pathways
(a) Bidirectional

Word | S | ES | O | EO | Sln(S/ES) | RevComp | Position | Palindrome | P-Value |
---|---|---|---|---|---|---|---|---|---|
TCGCGCCA | 4 | 0.918299 | 4 | 0.9375 | 5.88611 | TGGCGCGA | 12538 | No | 0.015391 |
TCCCGGGA | 8 | 3.97165 | 8 | 4.26667 | 5.60208 | TCCCGGGA | 2 | Yes | 0.068606 |
GGCCCGCC | 10 | 5.85012 | 11 | 6.5 | 5.36123 | GGCGGGCC | 21073 | No | 0.066821 |
TCCCGGCT | 6 | 2.54354 | 6 | 2.66667 | 5.14921 | AGCCGGGA | NA | No | 0.054084 |
CAGGGGCC | 4 | 1.1085 | 4 | 1.13514 | 5.13315 | GGCCCCTG | 14546 | No | 0.028413 |
AGGGCCGT | 5 | 1.80245 | 5 | 1.86667 | 5.10145 | ACGGCCCT | 613 | No | 0.04142 |
TCTGAGGA | 5 | 1.84222 | 6 | 1.90909 | 4.99234 | TCCTCAGA | 5391 | No | 0.013499 |
CGTGGGGG | 5 | 1.86693 | 5 | 1.93548 | 4.92572 | CCCCCACG | 20402 | No | 0.047015 |
TGCTGAGA | 4 | 1.17067 | 4 | 1.2 | 4.91487 | TCTCAGCA | NA | No | 0.033766 |
CGCGGCCG | 4 | 1.17067 | 4 | 1.2 | 4.91487 | CGGCCGCG | 20259 | No | 0.033766 |
TCTGGGAT | 2 | 0.180188 | 2 | 0.181818 | 4.8138 | ATCCCAGA | 2854 | No | 0.014655 |
GGGGCCGG | 5 | 1.92725 | 5 | 2 | 4.76672 | CCGGCCCC | 20866 | No | 0.052648 |
AGGGAGGG | 6 | 2.73111 | 6 | 2.87234 | 4.7223 | CCCTCCCT | 9852 | No | 0.07159 |
AGAAAAGA | 3 | 0.632564 | 3 | 0.642857 | 4.66976 | TCTTTTCT | NA | No | 0.027559 |
CGACTCCG | 3 | 0.632564 | 3 | 0.642857 | 4.66976 | CGGAGTCG | NA | No | 0.027559 |
GGGCCAGG | 7 | 3.61284 | 7 | 3.85714 | 4.6299 | CCTGGCCC | 19875 | No | 0.096315 |
ACTCCAGC | 5 | 2.02051 | 5 | 2.1 | 4.53045 | GCTGGAGT | NA | No | 0.062121 |
CGGGCCGA | 5 | 2.05153 | 5 | 2.13333 | 4.45426 | TCGGCCCG | 6128 | No | 0.065478 |
TGCGGAAT | 2 | 0.220092 | 2 | 0.222222 | 4.41371 | ATTCCGCA | NA | No | 0.021321 |
GCCCCTCC | 8 | 4.63031 | 9 | 5.03226 | 4.37454 | GGAGGGGC | 7041 | No | 0.070206 |
GCCGGCGA | 3 | 0.707627 | 3 | 0.72 | 4.33335 | TCGCCGGC | 20143 | No | 0.036618 |
TGAAGCCA | 4 | 1.38876 | 4 | 1.42857 | 4.23154 | TGGCTTCA | NA | No | 0.056996 |
GGCAGGGA | 6 | 3.01111 | 6 | 3.18182 | 4.1367 | TCCCTGCC | 10531 | No | 0.103337 |
TGCCCGCG | 5 | 2.19845 | 5 | 2.29167 | 4.10844 | CGCGGGCA | NA | No | 0.082773 |
CAGCAGCC | 6 | 3.02748 | 6 | 3.2 | 4.10418 | GGCTGCTG | 19198 | No | 0.105399 |

(b) Unidirectional

Word | S | ES | O | EO | Sln(S/ES) | RevComp | Position | Palindrome | P-Value |
---|---|---|---|---|---|---|---|---|---|
ACCCGCCT | 4 | 0.716577 | 4 | 0.727273 | 6.87826 | AGGCGGGT | 19440 | No | 0.006562 |
CTTCTTTC | 5 | 1.7686 | 5 | 1.81818 | 5.19624 | GAAAGAAG | 13567 | No | 0.037733 |
AGGAAACA | 4 | 1.16659 | 4 | 1.19048 | 4.92885 | TGTTTCCT | 21667 | No | 0.032947 |
GCAGGGCG | 6 | 2.75716 | 6 | 2.86957 | 4.66535 | CGCCCTGC | 1311 | No | 0.071337 |
GGGGCTGC | 5 | 2.036 | 5 | 2.1 | 4.49226 | GCAGCCCC | 16359 | No | 0.062122 |
TCTTCTTC | 4 | 1.30438 | 4 | 1.33333 | 4.48225 | GAAGAAGA | NA | No | 0.046491 |
GGGGAGTA | 3 | 0.682407 | 3 | 0.692308 | 4.44222 | TACTCCCC | 17991 | No | 0.033211 |
ATTAAAAT | 4 | 1.36853 | 4 | 1.4 | 4.29023 | ATTTTAAT | 16078 | No | 0.053723 |
CGGAAACC | 3 | 0.750393 | 3 | 0.761905 | 4.15731 | GGTTTCCG | NA | No | 0.042101 |
TGGGCGGA | 4 | 1.44679 | 4 | 1.48148 | 4.06778 | TCCGCCCA | NA | No | 0.063337 |
CGGCGGCG | 3 | 0.787559 | 3 | 0.8 | 4.01229 | CGCCGCCG | 22091 | No | 0.047421 |
TTTTTTGA | 3 | 0.787559 | 3 | 0.8 | 4.01229 | TCAAAAAA | NA | No | 0.047421 |
TTTCTCCA | 4 | 1.48541 | 4 | 1.52174 | 3.96242 | TGGAGAAA | 2378 | No | 0.068398 |
AGCCGGCT | 3 | 0.805285 | 3 | 0.818182 | 3.94551 | AGCCGGCT | 14 | Yes | 0.050071 |
CCTCTTTA | 2 | 0.282982 | 2 | 0.285714 | 3.91104 | TAAAGAGG | NA | No | 0.033814 |
CGCCCCTT | 6 | 3.12976 | 6 | 3.27273 | 3.90482 | AAGGGGCG | 21917 | No | 0.113859 |
GCGCCGCG | 5 | 2.33164 | 5 | 2.41379 | 3.81433 | CGCGGCGC | 15062 | No | 0.097601 |
ATTCCCAG | 3 | 0.843245 | 3 | 0.857143 | 3.80733 | CTGGGAAT | 21297 | No | 0.055985 |
TCTCCCCT | 4 | 1.56036 | 4 | 1.6 | 3.7655 | AGGGGAGA | 18183 | No | 0.07881 |
TCCGCCGG | 3 | 0.855341 | 3 | 0.869565 | 3.7646 | CCGGCGGA | NA | No | 0.057938 |
CTCCCGCT | 3 | 0.867789 | 3 | 0.882353 | 3.72126 | AGCGGGAG | NA | No | 0.059981 |
TGCGCCGA | 2 | 0.316812 | 2 | 0.32 | 3.68519 | TCGGCGCA | 3202 | No | 0.041483 |
GGGCGCCC | 4 | 1.59514 | 4 | 1.63636 | 3.67732 | GGGCGCCC | 23 | Yes | 0.083901 |
GTGCGTTT | 3 | 0.884961 | 3 | 0.9 | 3.66247 | AAACGCAC | NA | No | 0.062855 |
TTGGTCTC | 4 | 1.60537 | 4 | 1.64706 | 3.65176 | GAGACCAA | NA | No | 0.085429 |
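
The derived columns in both panels can be checked against the raw ones. The sketch below is illustrative only and is not the authors' code: it assumes that Sln(S/ES) means S multiplied by the natural logarithm of S/ES (which matches the tabulated values), that RevComp is the reverse complement of Word, and that Palindrome is "Yes" when a word equals its own reverse complement. The function names and the `rows` sample are hypothetical; the numbers are copied from the table.

```python
import math

# Translation table for complementing DNA bases (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def is_palindrome(word: str) -> bool:
    """A word is a reverse-complement palindrome if it equals its own RevComp."""
    return word == reverse_complement(word)


def score(s: float, es: float) -> float:
    """Assumed form of the Sln(S/ES) column: S * ln(S / ES)."""
    return s * math.log(s / es)


# (Word, S, ES) triples copied from panels (a) and (b).
rows = [
    ("TCGCGCCA", 4, 0.918299),
    ("TCCCGGGA", 8, 3.97165),
    ("ACCCGCCT", 4, 0.716577),
]

for word, s, es in rows:
    print(word, reverse_complement(word), is_palindrome(word), round(score(s, es), 3))
```

Run on these three rows, the recomputed reverse complements, palindrome flags, and scores should agree with the RevComp, Palindrome, and Sln(S/ES) columns to within rounding.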