Transcriptional double-autorepression feedforward circuits act for multicellularity and nervous system development
© Iwama et al; licensee BioMed Central Ltd. 2011
Received: 23 December 2010
Accepted: 11 May 2011
Published: 11 May 2011
The transcriptional regulatory network is considered to be built from a set of circuit patterns called network motifs. Experimental studies have provided instances where a feedforward circuit (FFC) appears with modification of autoregulation, but little is known systematically about such autoregulation-integrated FFCs. Therefore, we aimed to examine whether the autoregulation-integrated FFC is a network motif relevant to describing the human transcriptional regulatory systems, and explored the relationship of such network motifs with biological functions.
Based on human-mouse evolutionarily conserved transcription factor binding sites (TFBSs) in 76600 conserved blocks for 5169 genes, we compiled the human transcriptional connections into a matrix, and examined the number of FFC appearances in comparison with randomized networks. The results revealed that the configuration of autoregulation integrated in the FFC critically affects the abundance or avoidance of FFC appearances. In particular, an FFC comprising two repressors that are both autoregulated was revealed as a significant network motif, which we termed the double-autorepression FFC (DAR-FFC). Interestingly, this network motif preferentially constitutes effecter transcriptional circuits with functions in cell-cell signaling and multicellular organization, and is particularly related to nervous system development.
We have revealed that the configuration of autoregulation integrated in the FFCs is a critical factor for abundance or avoidance of the appearance of the FFCs. In particular, we have identified the DAR-FFC as a distinctive integrated network motif endowed with properties that are indispensable for forming the transcriptional regulatory circuits involved in multicellular organization and nervous system development. This is the first report showing that the DAR-FFC is a significant network motif.
It has been proposed that a complex system can be grasped using a small set of network motifs that recur significantly more often in the complex system than expected in its randomized network [1–5]. The transcriptional regulatory network of a cell is a complex system in which many transcription factor (TF) proteins turn gene expressions on and off according to spatiotemporal contexts. A feedforward circuit (FFC) comprising two regulators has systematically been shown to be a network motif in the transcriptional regulatory networks of Escherichia coli [1–5], yeast [2–6] and higher eukaryotes [4, 7, 8]. It is crucial to find an appropriate set of such basic network motifs to intelligibly describe the gene regulatory system.
One of the simplest network motifs is autoregulation, in which a regulator (TF) regulates the gene that encodes the regulator itself. An FFC is known to appear in some instances with modification of autoregulation [3, 9, 10], but no systematic studies on autoregulation-integrated FFCs have yet been carried out. Therefore, we aimed to elucidate whether the autoregulation-integrated FFC forms a distinct network motif relevant to describing the transcriptional network, and explored the relationships of this network motif with biological functions. For this purpose, we examined the influence of integrating autoregulation into the FFC on the number of FFC appearances by surveying the human transcriptional regulatory network.
In the present study, we further expanded the scope of network motifs to a category of small-scale integrated circuits by analyzing the autoregulation-integrated FFCs. In contrast to the conservativeness of the usage of simple FFC network motifs, our results revealed that the configuration of autoregulation integrated in the FFCs is a critical factor for abundance or avoidance of the appearance of the FFCs. In particular, we found that an FFC composed of two repressor regulators with autoregulation on each of them is a significant network motif. We report here that this novel network motif is preferentially involved in intercellular communication and multicellularity-oriented functions and particularly related to nervous system development.
Overrepresentation of coherent type-1 FFCs
We first classified the FFCs into four classes based on the combinations of the modes (activator or repressor) of the originating and intermediary regulators [1, 5] (Figure 1a). Following the FFC classification of Mangan and Alon, coherent and incoherent FFCs were further categorized as type 1 when the originating regulator was an activator and type 2 when it was a repressor. In addition, we evaluated the FFCs targeting TF genes separately from those targeting effecter (non-TF) genes: a targeted TF gene can itself act as a node that integrates further circuits, whereas a targeted effecter gene is the terminal output of the FFC.
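The mode-based classification can be illustrated with a short Python sketch. This is not the authors' code: the function name and string labels are hypothetical, and the sketch assumes that each regulator exerts a single mode on all of its targets, so that an FFC is coherent exactly when the direct path and the indirect path carry the same sign.

```python
# Hypothetical helper; names and labels are illustrative assumptions.
SIGN = {"activator": 1, "repressor": -1}


def classify_ffc_by_mode(originating: str, intermediary: str) -> str:
    """Return the mode-based class of an FFC as '<coherence> type-<n>'.

    Assumption: a regulator's mode applies to every connection it makes,
    so the direct path carries the originating regulator's sign and the
    indirect path carries the product of both regulators' signs.
    """
    direct = SIGN[originating]
    indirect = SIGN[originating] * SIGN[intermediary]
    coherence = "coherent" if direct == indirect else "incoherent"
    # Type 1: originating activator; type 2: originating repressor.
    ffc_type = 1 if originating == "activator" else 2
    return f"{coherence} type-{ffc_type}"
```

Under this assumption the four mode combinations map one-to-one onto the four classes, which is why two regulator modes suffice to name the class.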
Table 1. Statistical data for feedforward circuits and autoregulation in the real human transcriptional network in comparison with 1000 randomized networks.
Columns: Appearance in the real network; Appearance in the randomized networks (mean ± s.d.*)
378991 ± 13650; P ≈ 0.035
105296 ± 7675; P ≈ 0.406
484287 ± 13837; P < 0.05
36.42 ± 2.95; P ≈ 0.060
The configuration of autoregulation critically affects the FFC abundance
To examine the influence of autoregulation, we classified the FFCs according to the presence or absence of autoregulation on each regulator of an FFC without respect to the regulators' modes. This classification created four classes, which we termed (I) double-autoregulation, (II) intermediary-only autoregulated, (III) originating-only autoregulated and (IV) no autoregulation (Figure 1b). We found that categories III and IV, both of which have no autoregulation on the intermediary regulator, were consistently underrepresented, while in sharp contrast, categories I and II, both of which have autoregulation on the intermediary regulator, were overrepresented (Figure 1b).
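As an illustration of the four autoregulation classes, the following Python sketch counts FFCs per class in a toy network. It is a sketch under stated assumptions rather than the authors' implementation: the network is represented as a hypothetical dictionary mapping each TF to the set of genes it targets, and the target of an FFC is assumed to be distinct from both regulators.

```python
from collections import Counter


def autoregulation_class(targets, originating, intermediary):
    """Return 'I'..'IV' according to which regulators target their own gene."""
    orig_auto = originating in targets[originating]
    inter_auto = intermediary in targets[intermediary]
    if orig_auto and inter_auto:
        return "I"    # double-autoregulation
    if inter_auto:
        return "II"   # intermediary-only autoregulated
    if orig_auto:
        return "III"  # originating-only autoregulated
    return "IV"       # no autoregulation


def count_ffcs_by_autoregulation(targets):
    """Count FFCs (X->Y, X->Z, Y->Z) per autoregulation class.

    `targets` maps each TF name to the set of gene names it targets.
    Assumption: the target Z differs from both X and Y.
    """
    counts = Counter()
    for x, x_targets in targets.items():
        for y in x_targets:
            # The intermediary must be a TF other than the originating one.
            if y == x or y not in targets:
                continue
            shared = (x_targets & targets[y]) - {x, y}
            counts[autoregulation_class(targets, x, y)] += len(shared)
    return counts
```

For example, with `{"A": {"A", "B", "g1"}, "B": {"B", "g1", "g2"}, "C": {"A", "g1"}}` the circuit A, B, g1 falls in class I and the circuit C, A, g1 falls in class II.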
Therefore, the absence of intermediary autoregulation was a hallmark for avoidance of FFC formation. In particular, the FFC composed of an autoregulated originating regulator and an autoregulation-less intermediary regulator was a significant antimotif common to TF and effecter target genes. Regarding the regulator chain backbone, autoregulation on the intermediary regulator was shown to be a significant factor for a higher frequency of appearance than expected in the random networks. These results demonstrated that the FFC, which was overrepresented as a whole, could be broken down into distinct classes that were heterogeneous in abundance or avoidance of the appearance of the FFC depending on the configuration of autoregulation.
Autoregulation-and-mode combined classification of FFCs
Next, we combined the autoregulation-based classification with the classification based on the regulators' modes, and evaluated its applicability. This combined classification revealed two significant antimotifs of FFCs common to the TF and effecter target genes (Figure 1c). Both of these antimotifs were composed of an originating repressor regulator with autoregulation and an intermediary regulator without autoregulation irrespective of the mode. Since the degrees of representation of the backbone regulator chain for these motifs did not deviate much from the random expectations, the configuration of the autoregulated repressor with autoregulation-less intermediation is likely to have a specific disadvantage for feedforward synergistic control of the effecter.
A novel network motif, the double-autorepression FFC (DAR-FFC)
The autoregulation-and-mode combined classification further revealed a significantly overrepresented network motif to which little attention has been paid. Specifically, this motif is the DAR-FFC that targets effecter genes in which both regulators are repressors with autoregulation (Figure 1d). Compared with the degree of representation of the backbone regulatory chain for this double-autorepression configuration, the DAR-FFC targeting an effecter showed a remarkable increase in its representation. This result indicates that the DAR-FFC has specific features that are advantageous for controlling effecter genes.
Preferred functions of DAR-FFC-targeted effecter genes
Table 2. Preferred molecular functions of DAR-FFC-targeted effecters.
GO molecular functions
growth factor activity: 1.47 × 10^-17
(term missing): 3.09 × 10^-16
voltage-gated ion channel activity: 1.50 × 10^-13
transcription regulator activity: 1.30 × 10^-12
transcription factor binding: 2.00 × 10^-12
potassium ion binding: 3.45 × 10^-12
voltage-gated potassium channel activity: 4.99 × 10^-12
extracellular matrix structural constituent: 5.70 × 10^-12
(term missing): 1.36 × 10^-7
(term missing): 1.56 × 10^-6
calcium ion binding: 5.40 × 10^-6
Table 3. Preferred biological processes of DAR-FFC-targeted effecters.
GO biological processes
multicellular organismal development: 5.31 × 10^-32
nervous system development: 2.67 × 10^-17
Wnt receptor signaling pathway: 3.20 × 10^-16
(term missing): 5.48 × 10^-12
regulation of transcription, DNA-dependent: 3.05 × 10^-10
potassium ion transport: 8.34 × 10^-10
homophilic cell adhesion: 6.96 × 10^-8
(term missing): 3.43 × 10^-7
(term missing): 3.66 × 10^-7
transmembrane receptor protein tyrosine kinase signaling pathway: 3.95 × 10^-7
(term missing): 7.25 × 10^-7
DAR-FFC interlinks forming higher-order DAR-FFCs
As a result, we identified five higher-order DAR-FFCs in 13 TF-TF connections of the DAR-FFCs. This densely connected feature of the higher-order DAR-FFCs coincided with the results of motif representation analyses in the autoregulation-and-mode combined classification in which the DAR-FFC targeting a TF was the most overrepresented (P ≈ 0.060, see the Methods) among the FFCs that target TFs.
The arrangement of these higher-order DAR-FFCs reveals another interesting network feature. Among the five higher-order DAR-FFCs (Figure 3), KLF12 is the only node that always acts as an originating regulator in all three higher-order DAR-FFCs in which it is involved. KLF12 thus stands in clear contrast to ZBTB7A and GFI1B, which play different roles across the circuits in which they are involved: ZBTB7A acts once as an originating regulator and twice as an intermediary regulator, whereas GFI1B acts once as an originating regulator and once as a target. These findings suggest the possibility that KLF12 has the distinctive function of being a 'higher-order originating regulator hub' that potentially spreads regulatory effects through DAR-FFC-mediated transmissions. The delineation of the DAR-FFC as a network motif provides further potential to reveal such complexity of the higher-order features of regulatory networks.
The confidence of the results is robust against TF mode perturbation
We have demonstrated that the DAR-FFC in which the two repressor regulators are both autoregulated is a distinct and significant network motif in the human transcriptional regulatory network. We were able to improve the resolution power for analyzing the network circuitry by devising an autoregulation-based FFC classification and combining it with the mode-based FFC classification. By focusing on the autoregulation-integrated FFC, we demonstrated the analytical relevance of such small-scale integrated circuits that are positioned between elementary circuit units (e.g. autoregulation and simple FFCs) and large-scale integrated circuits (e.g. 'dense overlapping regulon'). This autoregulation-based classification has an important logical feature in that it merely depends on the circuit structure without introducing any qualitative features (i.e. repressor or activator) of the regulator, and it still enabled the delineation of specific antimotifs.
We also showed that the overall FFC, coherent FFC and coherent type-1 FFC appearances were each significantly overrepresented in the human transcriptional regulatory network based on the conventional mode-based classification. These results are consistent with previous findings in E. coli and yeast, suggesting strong evolutionary conservation of the usage of FFCs, particularly type-1 coherent FFCs, at the system level.
The autoregulation-and-mode combined classification enabled us to identify two antimotifs that were both composed of an originating repressor regulator with autoregulation and an intermediary regulator without autoregulation irrespective of the mode. This autoregulation configuration is likely to have a specific disadvantage for feedforward synergistic control of the effecter. From the viewpoint of circuit dynamics, negative autoregulation formed by a repressor has been reported to speed up the response of its own regulator. Therefore, the aforementioned antimotifs that each form a circuit of a quick originating regulator aided by a slower modifier (i.e. intermediary regulator) seem to be inappropriate for fine-tuning of the target gene, possibly because the modifier is required to monitor the originating regulator at a higher frequency than the switching frequency of the originating regulator.
The combined classification further revealed the DAR-FFC that targets effecter genes as a novel network motif that comprises two repressors with autoregulation. The overrepresentation of the number of DAR-FFC appearances is possibly explained by the facts that the autorepression integrated in the DAR-FFC provides robustness against stochastic perturbation and accelerates reaching a stable transcription level, which consequently make DAR-FFCs advantageous for controlling the effecters. This notion is specifically supported by the finding that the most preferred molecular function of the DAR-FFC-targeted effecters was 'growth factor activity' (Table 2), since cellular responses to a growth factor have been reported to require a set of key 'repressive' transcriptional regulators to achieve tightly controlled signal attenuation processes. Consequently, the swiftness and robustness of DAR-FFCs are suitable properties for the signaling systems of various growth factors.
In addition to 'growth factor activity', the effecter functions targeted by the DAR-FFCs demonstrated a marked preference for intercellular communications through humoral, neuronal and ECM-mediated signaling modalities, representing the major intercellular communications in higher eukaryotes. Another functionality significantly preferred by the DAR-FFC-targeted effecters was transcription-modifying activity, which increases the information integration capacity at the transcriptional level. These properties give DAR-FFCs a crucial advantage in enhancing the information transmission and integration needed to meet the increased information-processing demands possibly brought about by multicellularization.
Notably, and consistent with the above notion, we found that 'multicellular organismal development' was by far the most preferred biological process among DAR-FFC-targeted effecters. It is also noteworthy that 'multicellular organismal development' was followed in the list by biological processes related to the neuronal cell fate program, namely 'nervous system development', 'Wnt receptor signaling pathway' and 'axon guidance'. This preference for nervous system development is a general feature of DAR-FFCs because all the individual DAR-FFC connections identified in the present study had effecter genes with preferred functions within 'multicellular organismal development' and 'nervous system development'. Furthermore, our network analyses revealed that the DAR-FFCs were densely interconnected, and that individual DAR-FFCs formed a higher-order DAR-FFC topology. This topology included a possible 'higher-order originating regulator hub' that potentially spreads regulatory effects through DAR-FFC-mediated transmissions. These dense interlinking features of DAR-FFCs are likely to provide further robustness for transcriptional regulatory networks involved in multicellularity and nervous system development.
Based on the results of the network motif representation analyses, GO functional analyses and higher-order network structures, we suggest that the DAR-FFC identified in the present study is a distinctive integrated network motif endowed with properties that are indispensable for forming the transcriptional regulatory circuits essential for multicellular organization and nervous system development. An intelligible and comprehensive description of the transcriptional regulatory system requires an appropriate set of network motifs that would include integrated network motifs such as the DAR-FFC. It is necessary to elucidate other potential integrated network motifs with a sufficient descriptive power for understanding the gene regulatory system. The degree of integration or abstraction of these motifs would vary depending on the purpose of the description of the system.
We have demonstrated that the configuration of autoregulation integrated in the FFC critically affects the FFC abundance or avoidance. In particular, we found that the DAR-FFC composed of two repressors with autoregulation is a novel significantly overrepresented network motif. Notably, we have revealed that DAR-FFCs constitute transcriptional circuits that preferentially control multicellular organization and nervous system development. These results suggest that the DAR-FFC is a key component that is essential for the multicellularization of higher eukaryotes.
Human-mouse orthologous upstream sequences
We downloaded genome sequences and annotations for humans (Build 37.1) and mice (Build 36.3) from NCBI at http://ftp.ncbi.nih.gov/genomes/. We parsed the annotations and selected protein-coding genes that did not overlap with other genes. We further chose genes whose upstream sequences did not overlap with other genes for at least 8 kb upstream of the translation start site. If a gene had alternative transcription/translation start or end sites, we always adopted the 5'-most site for the start site and the 3'-most for the end site. These retrieval processes yielded 14021 genes for humans and 16412 genes for mice. To stringently identify human-mouse orthologs, we selected genes whose official gene symbols were identical between humans and mice. This ortholog assignment approach is effective for human-mouse comparisons [17, 18], because the same official gene symbol is endowed based on well-curated functional experimental evidence. Finally, we obtained 5169 orthologous 8-kb upstream sequence pairs (Additional file 2 and our website).
TF gene selection
Among the 5169 orthologs, we identified TF genes according to the GO 'molecular functions' category by surveying the gene2go file of NCBI reference sequences (RefSeq) at http://ftp.ncbi.nlm.nih.gov/gene/DATA/. We regarded genes assigned to 'transcription factor activity' (GO:0003700) as TF genes, which amounted to 307 TF genes. We further selected the TF genes with known binding motifs by searching TRANSFAC Professional 12.2 for the binding motif matrices under the conditions that (I) an official gene symbol of the TF was unequivocally designated, (II) the TF gene was included in the 5169 orthologs, (III) the TF was able to function even if it did not form a complex with other TFs, (IV) the motif matrix was not a composite one with other kinds of TFs, and (V) the motif matrix existed in humans or mice. A total of 82 TFs were assigned to corresponding motif matrices that fulfilled the above five conditions. To identify the mode of each of the 82 TFs, i.e. positive (activator), negative (repressor) or bimodal regulator, we surveyed the descriptions of GO, Entrez Gene and TRANSFAC (Additional file 3).
In the analyses in which the TF mode affected the results, we excluded the counts of every circuit that included a bimodal TF or a mode-unspecified TF, since the particular connections and the proportion of connections that were positive or negative for these TFs were mostly uncertain with regard to our predicted regulatory connections. Conversely, in cases where the TF mode did not affect the results, the counts of the circuits that included a bimodal or a mode-unspecified TF were retained in the statistics. These cases were the analyses of the overall FFCs, the overall autoregulations, and the autoregulation-configuration classification regardless of the mode (Figure 1b).
We searched the 8-kb upstream sequence of each of the 82 TFs for TFBS motifs using the MATCH™ program with a score cutoff of 0.9 and a motif-core score cutoff of 0.8. To identify conserved TFBS motifs, we performed a genomic alignment for each of the 5169 8-kb orthologous upstream sequence pairs using ReAlignerV with its default settings. We obtained 76600 conserved blocks whose identities were ≥70%. Next, we retrieved TFBSs that met the following three conditions between the human and mouse motifs: (I) the motif directions were the same; (II) both motifs were located within a conserved block; and (III) the identity of the pair of motifs was ≥90%. Finally, we obtained 386237 conserved TFBSs (Additional file 4 and our website).
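The three conservation conditions can be expressed as a simple filter. The Python sketch below is only an illustration: the `MotifHit` record, the coordinate convention, and the way identity is computed (matching positions over the aligned motif length) are assumptions, since the exact definitions used with MATCH™ and ReAlignerV are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotifHit:
    """Hypothetical record for one predicted TFBS motif occurrence."""
    strand: str   # "+" or "-"
    start: int    # start coordinate within the upstream sequence
    end: int      # end coordinate (exclusive)
    aligned: str  # motif sequence as it appears in the pairwise alignment


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions with identical, non-gap characters."""
    if len(a) != len(b) or not a:
        raise ValueError("aligned motif sequences must be non-empty and equal in length")
    matches = sum(1 for p, q in zip(a, b) if p == q and p != "-")
    return 100.0 * matches / len(a)


def inside(hit: MotifHit, block: tuple) -> bool:
    """True if the hit lies entirely within the (start, end) block."""
    return block[0] <= hit.start and hit.end <= block[1]


def is_conserved_tfbs(human, mouse, human_block, mouse_block, min_identity=90.0):
    """Apply conditions (I) same direction, (II) within a conserved block,
    and (III) motif-pair identity of at least `min_identity` percent."""
    return (
        human.strand == mouse.strand
        and inside(human, human_block)
        and inside(mouse, mouse_block)
        and percent_identity(human.aligned, mouse.aligned) >= min_identity
    )
```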
Connection matrix randomization
Transcriptional targeting of the 82 TFs to the 5169 genes was represented as a connection matrix, $M$, in which $M_{ij} = 1$ if TF $j$ targets gene $i$ and $M_{ij} = 0$ in the absence of such targeting (Additional file 5). Since all of the 82 TF genes were included in the 5169 genes, it was possible to examine autoregulation for the 82 TFs. Starting from the original connection matrix of the real data, we created randomized matrices, so that every TF $j$ should target the same number of genes as in the original matrix and every gene $i$ should be targeted by the same number of TFs as in the original matrix [1–4]. According to a previously described method, we performed the randomization using stepwise processes as follows. We randomly chose a pair of columns (TFs), $a$ and $b$ ($a \neq b$), and a pair of rows (genes), $m$ and $n$ ($m \neq n$). When $M_{ma} = 1$ and $M_{nb} = 1$, we further checked whether $M_{na} = 0$ and $M_{mb} = 0$. If the four equations were satisfied, we swapped the elements such that $M_{ma} = 0$, $M_{nb} = 0$, $M_{na} = 1$ and $M_{mb} = 1$. We obtained 1000 randomized connection matrices by repeating these swapping steps 500000 times for each randomization.
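The swapping procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' program: the function name, the list-of-lists matrix representation, and the explicit random number generator argument are assumptions made for the example.

```python
import random


def randomize_connections(matrix, n_steps, rng):
    """Return a randomized copy of a 0/1 connection matrix.

    Rows are genes and columns are TFs. Each accepted swap preserves every
    row sum (number of TFs targeting a gene) and every column sum (number
    of genes targeted by a TF). The matrix needs at least 2 rows and 2 columns.
    """
    m = [row[:] for row in matrix]  # work on a copy; the input is left intact
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_steps):
        a, b = rng.sample(range(n_cols), 2)          # two distinct TFs
        row_m, row_n = rng.sample(range(n_rows), 2)  # two distinct genes
        # Swap only if M_ma = 1, M_nb = 1, M_na = 0 and M_mb = 0.
        if (m[row_m][a] == 1 and m[row_n][b] == 1
                and m[row_n][a] == 0 and m[row_m][b] == 0):
            m[row_m][a], m[row_n][b] = 0, 0
            m[row_n][a], m[row_m][b] = 1, 1
    return m
```

A call such as `randomize_connections(M, 500000, random.Random(0))` would correspond to one randomization with the step count given in the text; the seed is arbitrary.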
To determine the number of swapping steps that resulted in sufficient randomization of the original connection matrix, we monitored the number of value-changed elements from the original connection matrix by varying the number of swapping steps (100, 1000, 10000, 50000, 100000, 500000, 1000000 and 5000000 steps), and for each number of swapping steps, we repeated the randomization 100 times. As a result, the number of value-changed elements reached a plateau phase with 500000 swapping steps, and the use of 1000000 and 5000000 swapping steps yielded no significant increases in the numbers of value-changed elements. Therefore, we determined that 500000 swapping steps were sufficient for randomization of the original connection matrix.
GO enrichment analyses
First, we examined the background set of GO 'molecular functions' terms for the 4862 effecter genes (i.e. 5169 total genes - 307 TF genes). The total number of redundant appearances of the GO terms assigned to all the effecters was denoted as $E$ ($E = 12319$), and the number of non-redundant appearances of the GO terms was denoted as $N$ ($N = 1562$). For each non-redundant GO term $g$ of the $N$ terms, the number of appearances within all the effecters was denoted as $F_g$.
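The background quantities $E$, $N$ and $F_g$ can be computed from a gene-to-term annotation table as in the following Python sketch. The function name and the dictionary input are hypothetical, and the sketch assumes that a term is counted once per gene to which it is assigned.

```python
from collections import Counter


def go_background(gene_to_terms):
    """Return (E, N, F) for a mapping of gene -> iterable of GO terms.

    E: total (redundant) number of term appearances over all genes.
    N: number of distinct (non-redundant) terms.
    F: Counter giving F_g, the number of appearances of each term g.
    Assumption: a term counts once per gene, even if listed repeatedly.
    """
    appearances = Counter()
    for terms in gene_to_terms.values():
        appearances.update(set(terms))
    total = sum(appearances.values())
    distinct = len(appearances)
    return total, distinct, appearances
```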
We ranked the $P_g$ values in increasing order to show the highly enriched GO terms. The GO terms that appeared at frequencies of ≤0.1% of $E$ were discarded from the list. We separately conducted the same procedures described above for the GO 'biological processes' terms.
The GO terms with frequencies of ≤0.05% of $E$ or $P(g, i)$ values of ≥0.01 were discarded from the list for these analyses.
For each circuit, the mean and standard deviation were computed based on the appearance in each of the 1000 randomized matrices, and the P value and Z score were estimated by assuming a standard normal distribution.
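A minimal Python sketch of this estimate is given below. It is not the authors' code: the function name is hypothetical, the sample standard deviation is assumed, and the P value is taken as a one-sided tail probability of the standard normal distribution, which may differ from the convention actually used.

```python
import math
from statistics import mean, stdev


def z_score_and_p(real_count, randomized_counts):
    """Return (Z, P) for a circuit count against its randomized counts.

    Assumptions: sample standard deviation (n - 1 denominator) and a
    one-sided tail probability under a standard normal distribution.
    """
    mu = mean(randomized_counts)
    sd = stdev(randomized_counts)
    z = (real_count - mu) / sd
    # Upper-tail probability of |Z| for a standard normal variable.
    p = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

A positive Z score indicates more appearances in the real network than in the randomized networks (a candidate motif), and a negative Z score indicates fewer (a candidate antimotif).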
TF mode perturbation analyses
To assess the robustness against perturbation of the mode assignments of the TFs, we randomly selected the connections regulated by positive TFs and negative TFs with proportions of 1%, 5%, 10%, 15% and 20%, and stored the selected connections for each proportion. This procedure was repeated 10 times for each of the five proportions. To the selected connections stored for the resulting 50 sets, we applied swapped modes between positive and negative, and subsequently computed the Z scores based on the 1000 randomized and one real connection matrices that were used in the main analyses.
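The selection-and-swap step of the perturbation can be sketched in Python as follows. This is an illustrative assumption about the procedure, not the authors' implementation: connections are represented as hypothetical (TF, gene) pairs mapped to +1 (positive) or -1 (negative), and the requested proportion is drawn from all mode-assigned connections together.

```python
import random


def perturb_modes(connection_modes, fraction, rng):
    """Return a copy of `connection_modes` with a random fraction of signs swapped.

    `connection_modes` maps (tf, gene) pairs to +1 or -1. Assumption: the
    fraction is applied to the pooled set of positive and negative connections.
    """
    keys = sorted(connection_modes)  # fixed order for reproducible sampling
    n_flip = round(fraction * len(keys))
    flipped = set(rng.sample(keys, n_flip))
    return {c: (-s if c in flipped else s) for c, s in connection_modes.items()}
```

Repeating the call 10 times for each of the proportions 0.01, 0.05, 0.10, 0.15 and 0.20 would yield the 50 perturbed sets described in the text.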
List of abbreviations
DAR-FFC: double-autorepression feedforward circuit
TFBS: transcription factor binding site
NCBI: National Center for Biotechnology Information
This work was supported by a Grant-in-Aid for Scientific Research (KAKENHI) 20570222. H. Iwama thanks Y. I. for support.
- Shen-Orr SS, Milo R, Mangan S, Alon U: Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 2002, 31: 64-68. 10.1038/ng881.
- Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: simple building blocks of complex networks. Science. 2002, 298: 824-827. 10.1126/science.298.5594.824.
- Alon U: Network motifs: theory and experimental approaches. Nat Rev Genet. 2007, 8: 450-461. 10.1038/nrg2102.
- Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U: Superfamilies of evolved and designed networks. Science. 2004, 303: 1538-1542. 10.1126/science.1089167.
- Mangan S, Alon U: Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci USA. 2003, 100: 11980-11985. 10.1073/pnas.2133841100.
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne JB, Volkert TL, Fraenkel E, Gifford DK, Young RA: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298: 799-804. 10.1126/science.1075090.
- Boyer LA, Lee TI, Cole MF, Johnstone SE, Levine SS, Zucker JP, Guenther MG, Kumar RM, Murray HL, Jenner RG, Gifford DK, Melton DA, Jaenisch R, Young RA: Core transcriptional regulatory circuitry in human embryonic stem cells. Cell. 2005, 122: 947-956. 10.1016/j.cell.2005.08.020.
- Odom DT, Zizlsperger N, Gordon DB, Bell GW, Rinaldi NJ, Murray HL, Volkert TL, Schreiber J, Rolfe PA, Gifford DK, Fraenkel E, Bell GI, Young RA: Control of pancreas and liver gene expression by HNF transcription factors. Science. 2004, 303: 1311-1312. 10.1126/science.1095486.
- Mangan S, Itzkovitz S, Zaslaver A, Alon U: The incoherent feed-forward loop accelerates the response-time of the gal system of Escherichia coli. J Mol Biol. 2005, 356: 1073-1081.
- Kalir S, Mangan S, Alon U: A coherent feed-forward loop with a SUM input function prolongs flagella expression in Escherichia coli. Mol Syst Biol. 2005, 1: 2005.0006.
- Itzkovitz S, Milo R, Kashtan N, Ziv G, Alon U: Subgraphs in random networks. Phys Rev E. 2003, 68: 026127.
- Shalgi R, Lieber D, Oren M, Pilpel Y: Global and local architecture of the mammalian microRNA-transcription factor regulatory network. PLoS Comput Biol. 2007, 3: e131. 10.1371/journal.pcbi.0030131.
- Barrell D, Dimmer E, Huntley RP, Binns D, O'Donovan C, Apweiler R: The GOA database in 2009 - an integrated Gene Ontology Annotation resource. Nucleic Acids Res. 2009, 37 (Database): D396-D403.
- Rosenfeld N, Elowitz MB, Alon U: Negative autoregulation speeds the response times of transcription networks. J Mol Biol. 2002, 323: 785-793. 10.1016/S0022-2836(02)00994-4.
- Becskei A, Serrano L: Engineering stability in gene networks by autoregulation. Nature. 2000, 405: 590-593. 10.1038/35014651.
- Amit I, Citri A, Shay T, Lu Y, Katz M, Zhang F, Tarcic G, Siwak D, Lahad J, Jacob-Hirsch J, Amariglio N, Vaisman N, Segal E, Rechavi G, Alon U, Mills GB, Domany E, Yarden Y: A module of negative feedback regulators defines growth factor signaling. Nat Genet. 2007, 39: 503-512. 10.1038/ng1987.
- Iwama H, Gojobori T: Highly conserved upstream sequences for transcription factor genes and implications for the regulatory network. Proc Natl Acad Sci USA. 2004, 101: 17156-17161. 10.1073/pnas.0407670101.
- Iwama H, Hori Y, Matsumoto K, Murao K, Ishida T: ReAlignerV: web-based genomic alignment tool with high specificity and robustness estimated by species-specific insertion sequences. BMC Bioinform. 2008, 9: 112. 10.1186/1471-2105-9-112.
- Iwama H, Murao K, Imachi H, Ishida T: Alignments and conserved TFBSs for 5,169 human-mouse orthologs within 8-kb upstream sequences. [http://genet.med.kagawa-u.ac.jp/pub/HMO_5169/List.htm]
- Pruitt KD, Tatusova T, Maglott DR: NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007, 35 (Database): D61-D65.
- Wingender E: The TRANSFAC project as an example of framework technology that supports the analysis of genomic regulation. Brief Bioinform. 2008, 9: 326-332. 10.1093/bib/bbn016.
- Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2007, 35 (Database): D26-D31.
- Kel AE, Gössling E, Reuter I, Cheremushkin E, Kel-Margoulis OV, Wingender E: MATCH: a tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 2003, 31: 3576-3579. 10.1093/nar/gkg585.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.