Table 2 Extending transcriptional modules of E. coli

From: Prosecutor: parameter-free inference of gene function for prokaryotes using DNA microarray data, genomic context and multiple gene annotation sources

| Transcriptional module | Gene | Prosecutor rank | Motif rank | Motif sequence in the intergenic region of either the gene or its operon | Literature reference |
|---|---|---|---|---|---|
| ArgR Amino acid biosynthesis: Arginine. AUC 0.92 | artJ | 21 | 8 | TGCATAACATTGCG | [56] |
| | aroP | 58 | 39 | TGATTTTTAATTCA | [57] |
| | artI | 131 | 50 | TGCATAATTATTCT | [56] |
| | hisL | 16 | 4 | TGAATAAACATTCA | putative |
| | pyrL | 32 | 61 | TGACTTTTAATTCA | putative |
| | metH | 36 | 76 | TGAATTTTTATTAA | putative |
| | ydcS | 43 | 63 | TGAATAAATTTTCT | putative |
| | stpA | 132 | 21 | TGCATTTTTATTCA | putative |
| | hisG | 141 | 8 | TGAATAAACATTCA | putative |
| | hisJ | 144 | 27 | TGCATTGAAATGCA | putative |
| | hisC | 145 | 13 | TGAATAAACATTCA | putative |
| | hisA | 147 | 14 | TGAATAAACATTCA | putative |
| | potF | 162 | 46 | TGCATAAAAATTTG | putative |
| CysB Amino acid biosynthesis: Cysteine AUC 0.91 | sbp | 12 | 0 | CGCAAGTTATAGCCAATCTTTTTTTATTCTT | [48, 58] |
| Fur iron regulatory gene AUC 0.84 | yncD | 74 | 32 | GGGAATGGTAATCATTATT | [44] |
| | ybaN | 37 | 5 | GAAAATGATAATTGTTATG | putative |
| | folE | 101 | 29 | GGCAATTACAATAATTATC | putative |
| LexA major regulator of DNA repair AUC 0.87 | yebG | 0 | 21 | CTGTATAAAATCACAG | [59, 60] |
| | dinI | 2 | 6 | CTGTATAAATAACCAG | [61, 62] |
| | yjiW | 39 | 20 | CTGATGATATATACAG | [45] |
| | ybfE | 120 | 31 | CTGATTAAAAACCCAG | [45] |
| | sbmC | 125 | 4 | CTGTATATAAAAACAG | [64] |
| MetJ Amino acid biosynthesis: Methionine AUC 0.88 | ybdH | 10 | 6 | AGACGTTTAGATGTCT | [65] |
| | ybdL | 106 | 0 | AGACATCTAAACGTCT | [65] |
| | ycbK | 198 | 17 | AGTCATCTTGACGTCT | [65] |
| | mmuP | 14 | 15 | GGATGTTTAGATGTCC | putative |
  1. Transcriptional module member predictions identified by Prosecutor when applied to well-known transcriptional modules of E. coli.

  Gene: the gene for which a validated functional prediction with a transcriptional module (column three) was found.

  Rank: the position of a gene in the list sorted by p-value. These p-values describe the significance of the functional prediction for each individual gene with a specific functional category. More significant p-values correspond to lower ranks.

  AUC: describes the association efficiency for a transcriptional module with respect to its own members.

  In addition to the rank information provided by Prosecutor, supplemental motif information is provided. These data are obtained by applying a position-specific scoring matrix (PSSM) to the upstream sequences of all genes in the genome. The PSSM is derived by aggregating all known consensus target sequences (DNA regulatory binding sites). The additional motif information allows users to concentrate on genes that both exhibit coexpression with a transcriptional module and possess a predicted consensus sequence. This additional evidence increases the confidence in assigning a gene to a particular transcriptional module.

  Motif rank: based on the results of matching a PSSM to every upstream sequence in the genome. For example, based on the PSSM of the regulator LexA, the upstream region of gene dinD contains the best-ranking motif (rank 1).
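
The motif rank described in the footnote can be illustrated with a minimal sketch. This is not Prosecutor's implementation: the log-odds scoring, the pseudocount of 1, the uniform background of 0.25, forward-strand-only scanning, and the function names `build_pssm`, `best_window_score` and `motif_ranks` are all assumptions made for illustration. The five LexA sites are the motif sequences listed in the table; the three upstream sequences (`geneA`, `geneB`, `geneC`) are invented toy inputs.

```python
import math

BASES = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length binding sites.

    Assumptions (not from the source): pseudocount of 1 per base and a
    uniform background frequency of 0.25.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pssm


def best_window_score(pssm, sequence):
    """Return the highest PSSM score over all windows of the sequence.

    Only the forward strand is scanned; bases other than A, C, G, T
    contribute a score of 0 (both are simplifying assumptions).
    """
    width = len(pssm)
    scores = [
        sum(pssm[i].get(sequence[start + i], 0.0) for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def motif_ranks(pssm, upstream):
    """Rank genes by their best motif score; rank 1 is the best match."""
    ordered = sorted(
        upstream,
        key=lambda gene: best_window_score(pssm, upstream[gene]),
        reverse=True,
    )
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


# LexA motif sequences taken from the table above.
lexa_sites = [
    "CTGTATAAAATCACAG",
    "CTGTATAAATAACCAG",
    "CTGATGATATATACAG",
    "CTGATTAAAAACCCAG",
    "CTGTATATAAAAACAG",
]
pssm = build_pssm(lexa_sites)

# Hypothetical upstream regions; geneA embeds one of the LexA sites.
upstream = {
    "geneA": "TTGACC" + "CTGTATAAAATCACAG" + "GGCATT",
    "geneB": "GATTACAGATTACAGATTACAGATTACA",
    "geneC": "ACGTACGTACGTACGTACGTACGTACGT",
}
ranks = motif_ranks(pssm, upstream)
print(ranks)
```

In this toy input, `geneA` receives motif rank 1 because its upstream region contains a sequence used to build the PSSM. On a real genome the same ranking would be computed over the upstream regions of all genes.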