Volume 9 Supplement 1

The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07)

Open Access

Selection of thermodynamic models for combinatorial control of multiple transcription factors in early differentiation of embryonic stem cells

BMC Genomics20089(Suppl 1):S18

DOI: 10.1186/1471-2164-9-S1-S18

Published: 20 March 2008

Abstract

Background

Transcription factors (TFs) have multiple combinatorial forms to regulate the transcription of a target gene. For example, one TF can help another TF to stabilize onto regulatory DNA sequence and the other TF may attract RNA polymerase (RNAP) to start transcription; alternatively, two TFs may both interact with both the DNA sequence and the RNAP. The different forms of TF-TF interaction have different effects on the probability of RNAP's binding onto the promoter sequence and therefore confer different transcriptional efficiencies.

Results

We have developed an analytical method to identify the thermodynamic model that best describes the form of TF-TF interaction among a set of TF interactions for every target gene. In this method, time-course microarray data are used to estimate the steady state concentration of the transcript of a target gene, as well as the relative changes of the active concentration for each TF. These estimated concentrations and changes of concentrations are fed into an inference scheme to identify the most compatible thermodynamic model. Such a model represents a particular way of combinatorial control by multiple TFs on a target gene.

Conclusions

Applying this approach to a time-course microarray dataset of embryonic stem cells, we have inferred five interaction patterns among three regulators, Oct4, Sox2 and Nanog, on ten target genes.

Background

Quantitative models describing gene expression in terms of quantity, speed and timing in different environmental contexts are essential for the study of many biological processes. Thermodynamic models are based on the assumption that the level of gene expression is proportional to the equilibrium probability that RNA polymerase (RNAP) is bound to the promoter of the interested gene and these probabilities can be computed in a statistical mechanics framework. In prokaryotes under well studied assumptions, a function is available to relate each particular form of interaction among transcription factors (TFs) and RNAP to the level of the expression of the target gene [13]. Such functions are termed “regulation factors” [1]. There are to date few discussions on the extent to which these regulation factors hold for eukaryotes [3].

In this paper we propose a method to select regulation factors, i.e. to infer the form of TF-TF and TF-RNAP interactions for each target gene. This method enables the investigation of regulation factors from empirical data in eukaryotic systems. Applying this method to a time course microarray dataset of retinoid acid induced differentiation of mouse embryonic stem cells (ESCs) [4], we clearly distinguish different interaction forms among Oct4, Sox2 and Nanog, and their roles as activators, repressors and helpers on each target gene. The detailed characterization of interaction forms among multiple transcription factors allows us to build a core transcription network in ESCs using a bottom-up approach.

ESCs are derived from early mammalian embryos and can be propagated through apparently unlimited, undifferentiated proliferation (self-renewal) in cultured cell lines (mouse: [5, 6], human: [7]). ESCs possess several notable properties that account for their exceptional scientific and medical importance. ESCs have remarkable potential to develop into many different cell types in the body (known as pluripotency [8]) and therefore they may be used to study both normal and abnormal body developments. A major challenge in the study of ESCs is to explain how the complex gene network is wired to control their properties of pluripotency and self-renewal. Transcriptional control is thought to be a key control mechanism for ESCs to maintain their undifferentiated state [916]. Regulatory proteins and relevant genomic sequences work together to precisely tune the expression levels of thousands of target genes in ESCs. The interactions among these regulatory proteins and their interactions with particular genomic sequences collectively define a transcription network. Understanding of the part of the network at work in ESCs, i.e. the functional state of the transcription network in ESCs, can reveal how the undifferentiated state of ESCs is maintained, and how it can be disrupted to initiate different routes of differentiation.

Results

Simulation data

We use three regulatory patterns to test our new algorithm (Figure 1). Under the first regulatory pattern (Row 1, Figure 1), we do two simulations . First, TF's expression increases linearly over time: real_TF_exp = 500+500T, where T=2, 4, 8, 16, 32, 64 and 128. In the second simulation the TF's expression increases exponentially over time: real_TF_exp = 500+200logT, where T=2, 4, 8, 16, 32, 64 and 128. Because there is only one TF in consideration, there are only two candidate regulatory models, either repression (Model 1) or activation (Model 2). In both simulations our method correctly picks Model 2 (Row 1, Figure 1). Two simulations are performed under the second regulatory pattern (Row 2, Figure 1). For each simulation, our method consistently identifies the correct regulatory model out of five candidate models (Row 2, Figure 1). Under the third regulatory pattern, we do a two-step analysis. In the first step, we apply the method to identify the regulatory relationship between TFs A and B (Row 3, Figure 1), i.e. one TF controls the expression of another TF. After a regulatory model is determined between A and B, we use the expression pattern of B derived from the Step 1 to identify the interaction form between TFs B and C. There are two candidate models for Step 1 and five candidate models for Step 2. Altogether 10 potential regulatory models exist among the four genes. In two independent simulations, our method has both identified the correct regulatory models (Row 3, Figure 1).
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig1_HTML.jpg
Figure 1

Results from synthetic data using the Interaction-Identifier algorithm. The concentration of A is simulated using either a linear function: [TF] = 500+500T, or an exponential function [TF] = 500+200logT, where T is the time in both equations.

Sensitivity analysis

We check to what extent the choices of parameters affect the method performance. Regulatory model 7 (the regulatory pattern between B, C, D in Row 3, Figure 1) is chosen to perform the sensitivity analysis. We vary KTF, Kg, Kd and qp in wide ranges, for example an 10000 fold range for KTF, and re-run our algorithm. Results in Table 1 show that the method can robustly identify the correct regulatory model even if the parameters are varied by 100 fold. The only exceptions are the cases where the synthesis rates of mRNA are set to be too slow – below 1 mRNA molecule per 10 minute, as compared to the default of 10 mRNA per minute from empirical data. We therefore do not suggest using a very small synthesis rate.
Table 1

Sensitivity test for KTF , Kg , qp, Kd and H.

Numbers in bold represent the highest correlations under each parameter set. The results indicate that the correct model can be identified even with drastic variation in parameters used in the model.

T

KTF*

Pearson

Kg

Pearson

Kd

Pearson

qp

Pearson

H

Pearson

D

1

 

10

 

60/36

 

0.05

 

2

 

model 3

0.01

0.950042

1/60

0.967087

60/24

0.967081

1/35

0.967742

1

0.971055

0.1

0.950513

1/6

0.967081

60/30

0.967081

0.05

0.967081

2

0.967081

 

1

0.967081

10

0.967081

60/36

0.967081

0.10

0.966161

3

0.963737

 

10

0.957063

600

0.967081

60/42

0.967081

1

0.964237

4

0.964052

 

100

0.956187

1000

0.967081

60/48

0.967081

10

0.963906

5

0.968799

 

model 4

0.01

-0.95142

1/60

-0.96956

60/4

-0.96954

1/35

-0.96947

1

-0.97152

0.1

-0.95166

1/6

-0.96953

60/30

-0.96954

0.05

-0.96954

2

-0.96954

 

1

-0.96954

10

-0.96954

60/36

-0.96954

0.10

-0.96969

3

-0.96932

 

10

-0.95665

600

-0.96954

60/42

-0.96954

1

-0.97154

4

-0.97191

 

100

-0.95622

1000

-0.96954

60/48

-0.96954

10

-0.97186

5

-0.97484

 

model 5

0.01

-0.88219

1/60

-0.79531

60/24

-0.79529

1/35

-0.79534

1

-0.92423

0.1

-0.97198

1/6

-0.7953

60/30

-0.79529

0.05

-0.79529

2

-0.79529

 

1

-0.79529

10

-0.79529

60/36

-0.79529

0.10

-0.79518

3

-0.59358

 

10

-0.61602

600

-0.79529

60/42

-0.79529

1

-0.79363

4

-0.40171

 

100

-0.6125

1000

-0.79529

60/48

-0.79529

10

-0.78977

5

-0.2617

 

model 6

0.01

0

1/60

-0.96681

60/24

-0.96681

1/35

-0.96684

1

-0.96778

0.1

-0.97196

1/6

-0.9668

60/30

-0.96681

0.05

-0.96681

2

-0.96681

 

1

-0.96681

10

-0.96681

60/36

-0.96681

0.10

-0.96674

3

-0.95978

 

10

-0.66535

600

-0.96681

60/42

-0.96681

1

-0.96544

4

-0.93539

 

100

-0.61378

1000

-0.96681

60/48

-0.96681

10

-0.95794

5

-0.8747

 

model 7

0.01

0.96079

1/60

0.969306

60/24

0.969321

1/35

0.969576

1

0.97157

0.1

0.961556

1/6

0.96932

60/30

0.969321

0.05

0.969321

2

0.969321

 

1

0.969321

10

0.969321

60/36

0.969321

0.10

0.96904

3

0.968987

 

10

0.709205

600

0.96932

60/42

0.969321

1

0.968629

4

0.973865

 

100

0.614337

1000

0.96932

60/48

0.96932

10

0.968572

5

0.98013

 

model 8

0.01

-0.97205

1/60

-0.92839

60/24

-0.9284

1/35

-0.92828

1

-0.94378

0.1

-0.97213

1/6

-0.9284

60/30

-0.9284

0.05

-0.9284

2

-0.9284

 

1

-0.9284

10

-0.9284

60/36

-0.9284

0.10

-0.92855

3

-0.86111

 

10

-0.95055

600

-0.9284

60/42

-0.9284

1

-0.92884

4

-0.73939

 

100

-0.95336

1000

-0.9284

60/48

-0.9284

10

-0.92888

5

-0.68121

 

* The unit number of KTF is the maximum expression value /10.

Interaction models for Oct4 and Nanog in mouse embryonic stem cells

Oct4, Sox2 and Nanog are the key transcription factors to maintain pluripotency ESCs. Nanog is known to be jointly regulated by Oct4 and Sox2.

Time course microarray data have been generated for retinoid acid induced differentiation of mouse ESCs [4]. Genes that are jointly regulated by Oct4 and Nanog have been reliably identified [13]. Among these target genes, nine genes (Jarid2, Sall4, Rif1, Gbx2, REST, Zin3, Foxc1, Smarcad1 and Atbf1) are represented on the Affymetrix U133 microarray and therefore their time course data are available [4]. We first apply the Interaction-Identifier method to identify the regulatory model for Nanog, following the same procedure as we did for the synthetic data. The time course expression data suggest that Oct4 and Sox2 help each other to stabilize onto the regulatory sequence and attract the RNAP (Figure 2). We then identify the regulatory models for the Oct4 and Nanog regulated genes. Although these nine genes are all regulated by Oct4 and Nanog in ESCs, they are not regulated under the same mechanism. Jarid2, Sall4, Rif1 and Gbx2, are regulated under model 7 (Figure 3), where Oct4 and Nanog are synergistic activators. REST and Zic3 are both regulated under model 3, with one TF being an activator and the other a helper (Figure 4). Atbf1 is regulated under model 5 where Oct4 and Nanog are independent repressors (Figure 5A). Foxc1 is regulated under model 4 where Nanog is a helper and Oct4 is a repressor (Figure 5B). These results suggest that Atbf1 and Foxc1 are probably involved in lineage differentiation and therefore need to be repressed by key transcription factors in ESCs. Indeed, Foxc1 is involved in ocular development [17] and Abf1 mRNA is found to be abundant in prostate [18]. Finally, none of the models being considered derives an expression pattern similar to the observed expression pattern of Smarcad1 (All Pearson correlations are smaller than 0.5). This may suggest that besides Oct4 and Nanog, there are other mechanisms responsible for the transcriptional control of Smarcad1.
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig2_HTML.jpg
Figure 2

The i dentified interation form of Oct4 and Sox2 on Nanog.

https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig3_HTML.jpg
Figure 3

Forms of TF-RNAP interactions and their corresponding parameters for modeling the probability of RNAP binding. A1 and A2 are transcription factors acting as activators of genes. R1 and R2 are transcription factors acting as repressors of genes. A box without label represents RNAP. A curve with a bar at the end represents a repression effect; a curve with an arrow at the end indicates either cooperation between transcription factors or activation of a gene by a transcription factor.

https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig4_HTML.jpg
Figure 4

The different regulatory networks of different groups of target genes identified using Interaction-Identification algorithm. The directed arrows represent activation and the dotted line represents the function of a helper. The relationship between Nanog and Oct 4 with these target genes follows the model 3 in Figure 3.

https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig5_HTML.jpg
Figure 5

The different regulatory networks of different groups of target genes identified using the Interaction-Identifier algorithm. (a) model 5 (Figure 3) (b) model 4 (Figure 3), where the dotted line represents the function of a helper; a line with an arrow at the end represents the effect of an activator; a line with a solid dot at the end represents the effect of a repressor.

Discussion

New algorithms combining the strengths of both mechanistic modeling and statistical inference approaches to identify genetic regulatory networks are in demand. The methodology proposed in this study is one step towards this goal. This new method integrates three pieces of information together to infer a genetic regulatory network: a) mechanistic models of transcription factor binding and RNA transcription [3], b) prior knowledge of network components based on ChIP-chip data, c) time course expression data. Furthermore, this method combines two methodologies together, kinetic modelling and correlation analysis. In the following, we further compare this new method with previous efforts in the same vein and explain the rationale and assumptions of this new approach.

We choose to represent the expression levels as continuous instead of discrete variables in this study. Reverse engineering approaches have been developed to infer boolean networks underlying changes in the gene expression levels, assuming that expression levels of different genes can be categorized into different states [19]. In reality, gene expression levels reflected by microarray data tend to be continuous rather than discrete. Furthermore, continuous signals have a greater capacity over discrete signals in implementing different control functions, such as signal transformation and transduction, precise feedback and feed forward loops and maintaining homeostasis [20]. An implicit assumption of using continuous concentrations of the chemical species (mRNA and protein) is that the stochastic fluctuations due to single molecules are ignored. In both prokaryotic and eukaryotic cells, noises in gene expression levels have been observed and suggested to be an evolvable trait, which possibly plays a role in cellular phenotypic variation and cellular differentiation [2124]. Both stochasticity inherent in the biochemical process of gene expression (intrinsic noise) and fluctuations in other cellular components (extrinsic noise) contribute substantially to overall phenotypic variation [21]. In this study, the mRNA signals obtained are effectively averages of pooled populations of cells; where the influence of stochastic noise of single molecules on chemical concentration (mRNA and protein) are presumably effectively decreased.

Three other assumptions are made in the methodological framework. First the form of the interaction among TFs and RNAP are assumed to be invariant for the different conditions under which the gene expression data are obtained. This assumption can be violated when the experimental conditions are dramatically different from each other, for example, under different stress conditions. This assumption is better satisfied by using data from one biological process, for example, a developmental process. For this reason we suggest using time course gene expression data rather than data generated from different experimental conditions. Even for time course data, the users should exercise caution, because the regulation factor can still change in some circumstances, such as when the cell goes through different phases of the cell cycle [25, 26]. The second assumption is that the transcriptome is at equilibrium state at each time point when the gene expression is measured. This assumption is satisfied by most of the time course microarray data that the authors are aware of, and users can check this assumption by examining the reproducibility of the data among biological replicates. The third and the biggest assumption is that the thermodynamic models derived and tested for prokaryotes can be applied to eukaryote systems. This is essentially ignoring a number of transcriptional regulatory mechanisms that eukaryotes utilize, such as chromatin modification and enhancer effects. However as an approximation, the Interaction-Identifier method is still useful to analyze the biophysical properties of known TFs. Another point in favor of the validity of this method is that the absolute value of the model-derived gene expression level does not influence the correlation calculation. Only the pattern of change of the expression levels over time influence the correlation calculation. Many of the eukaryotic specific regulatory features, such as the distance between the enhancer and the promoter, are invariant for the target gene over the time course, and therefore such features should not affect the selection of the corrected model.

Previously, models were developed to infer genetic regulatory networks from time series data that are generated before the equilibrium is reached [27, 28]. There are, however, a lot more experiments generating gene expression data at steady states in a time series manner. In this paper we demonstrate that steady state time series data can be utilized to effectively characterize the interaction forms among multiple transcription factors. The Interaction-Identifier method should therefore be applicable to analyze a larger number of biological processes where steady state time course data are available.

Conclusions

We developed Interaction-Identifier methods for identifying interaction forms of TFs. We applied it to analyze the combinatorial control of the key transcription factors in mouse ESCs. ESCs are pluripotent cells derived from the inner cell mass of the mammalian blastocyst. They are capable of indefinite self-renewing expansion in culture. Depending on culture conditions, these cells can differentiate into a variety of cell types [29]. The ability to steer ESC differentiation into specific cell types holds great promise for regenerative medicine [13, 3032]. A few transcription factors have shown to be key transcriptional regulators in ESCs. These include Oct4, Sox2, Nanog and others [4, 13]. Large scale genomic data have been generated for these regulators, including ChIP-PET (a technology close to ChIP-chip) [13] and time course microarray data [4]. Albeit the availability of the high-throughput genomic data, the regulatory circuit in ESC still await quantitative and realistic models to be described. We regard a realistic model for quantifying the effect of combinatorial control of multiple ESC regulators as a firm building block towards understanding the whole network. In this paper we explored Interaction-Identifier method to infer the interaction patterns between multiple ESC regulators. In particular, Interaction-Identifier method identifies that Oct4 and Sox2 help each other to stabilize onto DNA and attract the RNAP. This indicates that the DNA-bound Oct4 will be less in Sox2 knock-down ESCs, and vice versa. This is in line with the fact that the knock-down of either of the two transcription factors will decrease the expression levels of the mutual target genes and start the differentiation process [4]. We have subsequently categorized the mutual targets of Oct4 and Nanog according to the pattern of their combinatorial effect. Although Oct4 and Nanog often serve as activators for maintaining the expression of ESC specific genes, they also inhibit genes for lineage specific differentiation. Little is known about how Oct4 and Nanog switch their tasks between activators and repressors. Among all the identified regulatory patterns, Oct4 and Sox2 generally do not attract RNAP at the same time, but rather one serves as the helper to the other (Figures 4, 5). Only in one case Oct4 and Nanog both interact with RNAP, where both serve as repressors (Figure 5A). This result suggests that when both of the two transcription factors interact with RNAP, they perform an inhibition task.

Methods

We propose an Interaction-Identifier method to identify the candidate form of interaction among the TFs and RNA polymerase (RNAP) on the promoter of a target gene. This method begins by using a thermodynamic function, termed regulation factor, to predict the equilibrium probability that RNAP binds to the promoter of its targeted gene (PRNAP) based on concentrations of associated TFs and interaction forms among TFs and RNAP. Next, it uses systems of ordinary differential equations to simulate the dynamics of expression interested genes, assuming a) the transcription rate is proportional to the PRNAP; b) mRNA degradation rate is linearly dependent on the RNA concentration; c) the concentration changes of TF factor can be inferred from the changes in the mRNA levels of TFs. Thirdly, using measured time course gene expression data from microarray experiments, we compute the Pearson correlation coefficient and Euclidean distance between the observed expression pattern and the predicted expression pattern. Since different interaction forms among TFs and RNAP will lead to different regulation factors, Interaction-Identifier method conducts these first three steps for all interaction forms between TFs and promoters. Finally, the regulation factor that predicts an expression pattern with highest correlation to the observed expression pattern is identified as most plausible interaction form that TFs take to regulate this target gene (Figure 6).
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Fig6_HTML.jpg
Figure 6

Flowchart of the Interaction-Identifier algorithm.

Thermodynamic models for RNAP binding

Cells receive a wide variety of cellular and environmental signals, which are often processed combinatorially to generate specific genetic responses. We follow Buchler et al (2003) to integrate combinatorial signal at the level of cis-regulatory transcription control. Also see Bintu et al for review [13]. In this section this theoretical framework [13] is briefed.

RNAP binds to promoter of a target gene to initiate gene transcription. The promoter of a given gene can only take one of the two states in a given time in a cell: with or without RNAP binding. We denote the ratio of the probability of promoter bound by RNAP to unbound by RNAP as q p (Table 2).
Table 2

The Bernoulli distribution for the two states of a promoter.

State

RNAP

Weight

1

0

1

2

1

q p

The weight q p denotes the ratio between the probabilities of the two states of promoter.

It follows that the percentage of the promoter of the target gene being bound with an RNAP is P [ R N A P b i n d i n g ] = q p 1 + q p MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGqbaucqGGBbWwcqWGsbGucqWGobGtcqWGbbqqcqWGqbaujuaGdaWgaaWcbaqcLbuacqWGIbGycqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzaSqabaqcLbuacqGGDbqxcqGH9aqpjuaGdaWcaaGcbaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaaakeaajugqbiabigdaXiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaaaaaaaa@51EC@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_IEq1_HTML.gif.

If we consider there is an TF interacting with RNAP, a promoter can then take one of the four possible states: (1) both the TF and the RNAP bind to the promoter; (2) Only the RNAP binds to the promoter; (3) Only the TF binds to the promoter; (4) neither the TF nor the RNAP binds to the promoter (Table 3).
Table 3

The multinomial distribution of a promoter with one TF and its RNAP.

State

TF

RNAP

Weight

1

0

0

1

2

0

1

q p

3

1

0

q TF

4

1

1

w TFp q p q TF

The weights q p , q TF and w TFp q p q TF denote the ratios between the probabilities of States 2,3,4 respectively assuming the weight of promoter with no RNAP or TF is 1.

Let Pi denote the probability of a promoter in the ith state. It follows that the probability of the promoter of the target gene being bound with an RNAP is
P [ R N A P b i n d i n g ] = P 2 + P 4 P 1 + P 2 + P 3 + P 4 = q p + w T F p q T F q p 1 + q p + q T F + w T F p q T F q p MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGqbaucqGGBbWwcqWGsbGucqWGobGtcqWGbbqqcqWGqbaujuaGdaWgaaWcbaqcLbuacqWGIbGycqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzaSqabaqcLbuacqGGDbqxcqGH9aqpjuaGdaWcaaGcbaqcLbuacqWGqbaujuaGdaWgaaWcbaqcLbuacqaIYaGmaSqabaqcLbuacqGHRaWkcqWGqbaujuaGdaWgaaWcbaqcLbuacqaI0aanaSqabaaakeaajugqbiabdcfaqLqbaoaaBaaaleaajugqbiabigdaXaWcbeaajugqbiabgUcaRiabdcfaqLqbaoaaBaaaleaajugqbiabikdaYaWcbeaajugqbiabgUcaRiabdcfaqLqbaoaaBaaaleaajugqbiabiodaZaWcbeaajugqbiabgUcaRiabdcfaqLqbaoaaBaaaleaajugqbiabisda0aWcbeaaaaqcLbuacqGH9aqpjuaGdaWcaaGcbaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaqcLbuacqGHRaWkcqWG3bWDjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqWGWbaCaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGraSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaaakeaajugqbiabigdaXiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaajugqbiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabgUcaRiabdEha3LqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabdchaWbWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaaaaaaaa@9DF9@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Equa_HTML.gif

, where w T F p = { 1 10 ~ 100 0 no interaction activation repression MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWG3bWDkmaaBaaaleaajugqbiabdsfaujabdAeagjabdchaWbWcbeaajugqbiabg2da9OWaaiqaaeaajugqbuaabeqadeaaaOqaaKqzafGaeGymaedakeaajugqbiabigdaXiabicdaWiabc6ha+jabigdaXiabicdaWiabicdaWaGcbaqcLbuacqaIWaamaaaakiaawUhaaKqzafqbaeqabmqaaaqcfayaaKqzafGamaiGKgaaaeOBa4MamaiGKgaaae4Ba8MaiaiGKgaaaGjbVlacaciPbaaayIW7cWaGasAaaaqGPbqAcWaGasAaaaqGUbGBcWaGasAaaaqG0baDcWaGasAaaaqGLbqzcWaGasAaaaqGYbGCcWaGasAaaaqGHbqycWaGasAaaaqGJbWycWaGasAaaaqG0baDcWaGasAaaaqGPbqAcWaGasAaaaqGVbWBcWaGasAaaaqGUbGBcWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaicWaGacBaaaqGGaaiaKqbagaajugqbiadacih=daabggaHjadacih=daabogaJjadacih=daabsha0jadacih=daabMgaPjadacih=daabAha2jadacih=daabggaHjadacih=daabsha0jadacih=daabMgaPjadacih=daab+gaVjadacih=daab6gaUbqcfayaaKqzafGamaiGW9paaeOCaiNamaiGW9paaeyzauMamaiGW9paaeiCaaNamaiGW9paaeOCaiNamaiGW9paaeyzauMamaiGW9paae4CamNamaiGW9paae4CamNamaiGW9paaeyAaKMamaiGW9paae4Ba8MamaiGW9paaeOBa4gaaaaa@D72E@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_IEq2_HTML.gif

A TF can serve as either an activator or a repressor, or simply does not interact with the RNAP, represented by different w TFp (Table 3). If w is set to 1, it represents that RNAP and the TF bind independently to the promoter. If w is set to 10~100, it represents that the TF helps to recruit RNAP to the promoter. The larger w is the higher the synergism is. If w is set to 0 or close to 0, it represents that the TF blocks the RNAP binding, and thus the TF is a repressor.
Table 4

The multinomial distribution of a promoter with its RNAP and two regulatory TFs.

(TF1, TF2) RNAP

(0, 0)

(1, 0)

(0, 1)

(1, 1)

0

1

q TF1

q TF2

w TF1TF2 q TF1 q TF2

1

q p

w TF1 p q TF1 q p

w TF2 p q TF2 q p

(w TF1 p + w TF2 p ) w TF1TF2 q TF1 q TF2 q p

The state of the promoter is represented by the row name and the column name. For example, the first row and the second column has RNAP=0 and (TF1, TF2)=(1,0), which means the promoter is bound by TF1 only. The quantity in a cell represents the ratio between the probability of a particular state and the probability of the base state (no RNAP binding and no TF binding).

Similar expressions can be derived for genes with two regulatory TFs capable of binding to a promoter together with RNAP (Table 4). The parameter w TF1TF2 is used to simulate the interaction between the two TFs. A large w TF1TF2 (10~100) represents that the two TFs stabilize each other onto the promoter. If the two TFs have no interaction, w TF1TF2 should be set to 1. If the two TFs compete for the binding, w TF1TF2 should be set to 0 or close to 0. The other two parameters, w TF1 p and w TF2 p , represent the interaction between each TF and RNAP, respectively. They can be set to reflect the nature of these interactions similar to w TF1TF2.

The marginal probability of RNAP binding to the promoter is:
P [ R N A P b i n d i n g ] = q p + w T F 1 p q p q T F 1 + w T F 2 p q p q T F 2 + ( w p T F 1 + w p T F 2 ) w T F 1 T F 2 q T F 1 q T F 2 q p 1 + q T F 1 + q T F 2 + w q T F 1 q T F 2 + q p + w T F 1 p q p q T F 1 + w p T F 2 q p q T F 2 + ( w p T F 1 + w p T F 2 ) w T F 1 T F 2 q T F 1 q T F 2 q p MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGqbaucqGGBbWwcqWGsbGucqWGobGtcqWGbbqqcqWGqbaujuaGdaWgaaWcbaqcLbuacqWGIbGycqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzaSqabaqcLbuacqGGDbqxcqGH9aqpjuaGdaWcaaGcbaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaqcLbuacqGHRaWkcqWG3bWDjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIXaqmcqWGWbaCaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIXaqmaSqabaqcLbuacqGHRaWkcqWG3bWDjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIYaGmcqWGWbaCaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIYaGmaSqabaqcLbuacqGHRaWkjuaGdaqadaqaaiabdEha3naaBaaabaGaemiCaaNaemivaqLaemOrayKaeGymaedabeaacqGHRaWkcqWG3bWDdaWgaaqaaiabdchaWjabdsfaujabdAeagjabikdaYaqabaaacaGLOaGaayzkaaqcLbuacqWG3bWDjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIXaqmcqWGubavcqWGgbGrcqaIYaGmaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIXaqmaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGrcqaIYaGmaSqabaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGWbaCaSqabaaakeaajugqbiabigdaXiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXaWcbeaajugqbiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabikdaYaWcbeaajugqbiabgUcaRiabdEha3jabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXaWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabikdaYaWcbeaajugqbiabgUcaRiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaajugqbiabgUcaRiabdEha3LqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXiabdchaWbWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXaWcbeaajugqbiabgUcaRiabdEha3LqbaoaaBaaaleaajugqbiabdchaWjabdsfaujabdAeagjabikdaYaWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabikdaYaWcbeaajugqbiabgUcaRKqbaoaabmaabaGaem4DaC3aaSbaaeaacqWGWbaCcqWGubavcqWGgbGrcqaIXaqmaeqaaiabgUcaRiabdEha3naaBaaabaGaemiCaaNaemivaqLaemOrayKaeGOmaidabeaaaiaawIcacaGLPaaajugqbiabdEha3LqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXiabdsfaujabdAeagjabikdaYaWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabigdaXaWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagjabikdaYaWcbeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdchaWbWcbeaaaaaaaa@1F83@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Equb_HTML.gif

By adjusting the parameters w TF1 p , w TF2 p and w TF1TF2 , we can obtain an analytical form for the probability of RNAP binding under different forms of interactions among RNAP and the two TFs. Figure 3 summarizes the parameter choices for two forms of simple interactions and six forms of three-way interactions.

Linking TF concentration to the probability of promoter occupancy

In this section we describe the influence of TF concentration on the probability of TF binding to the promoter of its target gene. In other words, we seek a function fsuch that q TF =f([TF]). This function will be used to predict changes in the transcription rate upon changes in TF concentration. Let T F D N A MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGubavcqWGgbGrcqGHsislcqWGebarcqWGobGtcqWGbbqqaaa@3721@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_IEq3_HTML.gif represent the promoter bound by TF, and the binding process can be denoted as:
T F + D N A T F D N A MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGubavcqWGgbGrcqGHRaWkcqWGebarcqWGobGtcqWGbbqqcqGHsgIRcqWGubavcqWGgbGrcqGHsislcqWGebarcqWGobGtcqWGbbqqaaa@3F77@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Equc_HTML.gif

At equilibrium the concentrations of the substrates are described using the Hill equation: P ( T F b i n d i n g ) = [ T F ] n [ T F ] n + [ K T F ] n = ( [ T F ] / K T F ] ) n ( [ T F ] / K T F ] ) n + 1 , MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGqbaucqGGOaakcqWGubavcqWGgbGrjuaGdaWgaaWcbaqcLbuacqWGIbGycqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzaSqabaqcLbuacqGGPaqkcqGH9aqpjuaGdaWcaaGcbaqcLbuacqGGBbWwcqWGubavcqWGgbGrcqGGDbqxjuaGdaahaaWcbeqaaKqzafGaemOBa4gaaaGcbaqcLbuacqGGBbWwcqWGubavcqWGgbGrcqGGDbqxjuaGdaahaaWcbeqaaKqzafGaemOBa4gaaiabgUcaRiabcUfaBjabdUealLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabc2faDLqbaoaaCaaaleqabaqcLbuacqWGUbGBaaaaaiabg2da9KqbaoaalaaakeaajugqbiabcIcaOiabcUfaBjabdsfaujabdAeagjabc2faDjabc+caViabdUealLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabc2faDjabcMcaPKqbaoaaCaaaleqabaqcLbuacqWGUbGBaaaakeaajugqbiabcIcaOiabcUfaBjabdsfaujabdAeagjabc2faDjabc+caViabdUealLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabc2faDjabcMcaPKqbaoaaCaaaleqabaqcLbuacqWGUbGBaaGaey4kaSIaeGymaedaaKqbakabcYcaSaaa@87A3@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_IEq4_HTML.gif Where K TF is the concentration required for half of the TF binding to the promoter and n is the Hill coefficient.

Recall the percentage of promoter bound by TF can also be described using q TF , the ratio of the probabilities of the promoter in the bound and free states,
P ( T F b i n d i n g ) = ( [ T F ] / K T F ] ) n ( [ T F ] / K T F ] ) n + 1 = q T F q T F + 1 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGqbaucqGGOaakcqWGubavcqWGgbGrcqWGIbGycqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzcqGGPaqkcqGH9aqpjuaGdaWcaaGcbaqcLbuacqGGOaakcqGGBbWwcqWGubavcqWGgbGrcqGGDbqxcqGGVaWlcqWGlbWsjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGraSqabaqcLbuacqGGDbqxcqGGPaqkjuaGdaahaaWcbeqaaKqzafGaemOBa4gaaaGcbaqcLbuacqGGOaakcqGGBbWwcqWGubavcqWGgbGrcqGGDbqxcqGGVaWlcqWGlbWsjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGraSqabaqcLbuacqGGDbqxcqGGPaqkjuaGdaahaaWcbeqaaKqzafGaemOBa4gaaiabgUcaRiabigdaXaaacqGH9aqpjuaGdaWcaaGcbaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGraSqabaaakeaajugqbiabdghaXLqbaoaaBaaaleaajugqbiabdsfaujabdAeagbWcbeaajugqbiabgUcaRiabigdaXaaaaaa@75FA@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Equd_HTML.gif

Therefore, we can obtain: q T F = ( [ T F ] K T F ) n MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcLbuacqWGXbqCjuaGdaWgaaWcbaqcLbuacqWGubavcqWGgbGraSqabaqcLbuacqGH9aqpcqGGOaakjuaGdaWcaaGcbaqcLbuacqGGBbWwcqWGubavcqWGgbGrcqGGDbqxaOqaaKqzafGaem4saSucfa4aaSbaaSqaaKqzafGaemivaqLaemOrayealeqaaaaajugqbiabcMcaPKqbaoaaCaaaleqabaqcLbuacqWGUbGBaaaaaa@4866@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_IEq5_HTML.gif

We use the unit of [TF] and K TF as the number of TFs per cell. There have been a few efforts to estimate K TF from empirical data [33]. In this study, we assume at each time point in the time course, [TF] is linearly related to the expression level of the TF, as did in earlier module network studies [34]. It follows that [TF] peaks at the same time as its gene expression peaks. We further assume q TF is maximized at the maximum [TF] (see sensitivity analysis for further discussion on this assumption). In this study, we assume that K TF equates the maximum [TF] and it is a linear transformation of the maximum expression value of the gene coding this TF. We adopt the value 1/20 for q p from Buchler et al [13].

A kinetic model for the quantity of the mRNA of the target gene

Assuming the expression level of a gene is proportional to the probability of RNAP binding to its promoter [13], we use a differential equation to model the dynamic changes in RNA expression level.
d G d t = K g ( Pr ( R N A P B i n d i n g ) ) K d G MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXafv3ySLgzGmvETj2BSbqeeuuDJXwAKbsr4rNCHbGeaGqipu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaaeaabaWaaaGcbaqcfa4aaSaaaOqaaKqzafGaemizaqMaem4raCeakeaajugqbiabdsgaKjabdsha0baacqGH9aqpcqWGlbWsjuaGdaWgaaWcbaqcLbuacqWGNbWzaSqabaqcLbuacqGGOaakcyGGqbaucqGGYbGCcqGGOaakcqWGsbGucqWGobGtcqWGbbqqcqWGqbaujuaGdaWgaaWcbaqcLbuacqWGcbGqcqWGPbqAcqWGUbGBcqWGKbazcqWGPbqAcqWGUbGBcqWGNbWzaSqabaqcLbuacqGGPaqkcqGGPaqkcqGHsislcqWGlbWsjuaGdaWgaaWcbaqcLbuacqWGKbazaSqabaqcLbuacqWGhbWraaa@5946@ https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-S1-S18/MediaObjects/12864_2008_Article_1849_Eque_HTML.gif

, where G is the concentration (number per cell) of the transcript; K g is the maximum number of transcripts synthesized per minute per cell and K d is the degradation rate of transcripts (per minute per cell).

The maximum rate of mRNA synthesis rate has been estimated to be about one mRNA per 6-8 seconds [35]. Following previous estimates [36, 37], we assume that the rate of degradation around 1/6 of the maximum transcription rate. Therefore, we use K g =10 counts per minute and K d =10/6 counts per minute in this study.

Test with synthetic data

As a proof of principle, we first use synthetic data to show the validity of the method. We choose three commonly seen regulatory patterns (Figure 1). These regulatory patterns are: 1. a target gene is activated by one TF (Model 2 in Figure 3); 2. RNAP is blocked by a TF (repressor), and this TF is stabilized to DNA by a helper TF (Model 4 in Figure 3); 3. a target gene is regulated by two interacting activators (Model 7 in Figure 3), and one of the two activators is transcriptionally repressed by a third TF. For each of these three regulatory patterns, we do simulations as follows. First, we simulate the concentration change of each TF over time, which we call real_TF_exp, using equation: EA = aA+bAlog t +ε, where aA and bA are background gene expression index and coefficient describing changes of expression index with time. The ε represents the variability of expression index for gene A. Different patterns of transcription factor expression can be obtained by using different parameters of aA, bA and ε. Assuming that the concentration of TF is a linear transformation of EA, we feed these simulated concentrations of the TFs into a chosen regulatory pattern described in Figure 1 and derive the expression pattern of the target gene (real_target_exp) according to equations 1 and 2. Noises (normal(0,1)) are added to all the real expression patterns for both TFs and the target gene. We assume only the noise-added expression patterns are observed, and we denote the observed expression values as obs_TF_exp and obs_target_exp. The obs_TF_exp for all TFs in consideration are used to derive expression pattern for the target gene under each model in Figure 3. The model derived expression patterns are termed model_target_exp. For each model, obs_target_exp is compared to model_target_exp in terms of Pearson correlation. To test the robustness of the model, we have assessed the effects of choices of parameters on method performance (see Sensitivity analysis).

Model fitting

For each target gene, we identify its TF from either literature survey or ChIP-chip data. In this study we focus on genes regulated by two key transcription factors in ESCs: Oct4 and Nanog [13]. For each interaction form in Figure 3, we use the differential equation (eq. 1) to simulate the steady state level of mRNA expression level using a) the estimated [TF] and K TF based on measured mRNA levels. We simulate a series of steady state mRNA concentrations corresponding to measured expression profile of the target gene. We then compute the Pearson correlation between the simulated concentrations of target genes over time and the observed concentrations from the time course microarray data. The interaction form that predicts a concentration dynamics with a largest correlation to the measured expression level is regarded as the model-identified interaction form.

Declarations

Acknowledgements

This work is supported by National Center for Supercomputing Applications (SZ). XGZ is supported by U.S. National Science Foundation IBN 04-17126.

This article has been published as part of BMC Genomics Volume 9 Supplement 1, 2008: The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07). The full contents of the supplement are available online at http://​www.​biomedcentral.​com/​1471-2164/​9?​issue=​S1.

Authors’ Affiliations

(1)
Department of Bioengineering, University of Illinois at Urbana Champaign
(2)
Department of Computer Science, University of Illinois at Urbana Champaign
(3)
Department of Plant Biology, University of Illinois at Urbana Champaign
(4)
Department of Statistics, University of Illinois at Urbana Champaign
(5)
Institute of Genomic Biology, University of Illinois at Urbana Champaign

References

  1. Bintu L: Transcriptional regulation by the numbers: applications. Curr Opin Genet Dev. 2005, 15 (2): 125-35. 10.1016/j.gde.2005.02.006.PubMedPubMed CentralView Article
  2. Bintu L: Transcriptional regulation by the numbers: models. Curr Opin Genet Dev. 2005, 15 (2): 116-24. 10.1016/j.gde.2005.02.007.PubMedPubMed CentralView Article
  3. Buchler NE, Gerland U, Hwa T: On schemes of combinatorial transcription logic. Proc Natl Acad Sci U S A. 2003, 100 (9): 5136-41. 10.1073/pnas.0930314100.PubMedPubMed CentralView Article
  4. Ivanova N: Dissecting self-renewal in stem cells with RNA interference. Nature. 2006, 442 (7102): 533-538. 10.1038/nature04915.PubMedView Article
  5. Evans MJ, Kaufman MH: Establishment in culture of pluripotential cells from mouse embryos. Nature. 1981, 292 (5819): 154-6. 10.1038/292154a0.PubMedView Article
  6. Martin GR: Isolation of a pluripotent cell line from early mouse embryos cultured in medium conditioned by teratocarcinoma stem cells. Proc Natl Acad Sci U S A. 1981, 78 (12): 7634-8. 10.1073/pnas.78.12.7634.PubMedPubMed CentralView Article
  7. Thomson JA: Embryonic stem cell lines derived from human blastocysts. Science. 1998, 282 (5391): 1145-7. 10.1126/science.282.5391.1145.PubMedView Article
  8. Pease S: Isolation of embryonic stem (ES) cells in media supplemented with recombinant leukemia inhibitory factor (LIF). Dev Biol. 1990, 141 (2): 344-52. 10.1016/0012-1606(90)90390-5.PubMedView Article
  9. Abeyta MJ: Unique gene expression signatures of independently-derived human embryonic stem cell lines. Hum Mol Genet. 2004, 13 (6): 601-8. 10.1093/hmg/ddh068.PubMedView Article
  10. Bhattacharya B: Gene expression in human embryonic stem cell lines: unique molecular signature. Blood. 2004, 103 (8): 2956-64. 10.1182/blood-2003-09-3314.PubMedView Article
  11. Catena R: Conserved POU binding DNA sites in the Sox2 upstream enhancer regulate gene expression in embryonic and neural stem cells. J Biol Chem. 2004, 279 (40): 41846-57. 10.1074/jbc.M405514200.PubMedView Article
  12. Boyer LA: Core transcriptional regulatory circuitry in human embryonic stem cells. Cell. 2005, 122 (6): 947-56. 10.1016/j.cell.2005.08.020.PubMedPubMed CentralView Article
  13. Loh YH: The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat Genet. 2006, 38 (4): 431-40. 10.1038/ng1760.PubMedView Article
  14. Golan-Mashiach M: Design principle of gene expression used by human stem cells: implication for pluripotency. Faseb J. 2005, 19 (1): 147-9.PubMed
  15. Skottman H: Gene expression signatures of seven individual human embryonic stem cell lines. Stem Cells. 2005, 23 (9): 1343-56. 10.1634/stemcells.2004-0341.PubMedView Article
  16. Bernstein BE: A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006, 125 (2): 315-26. 10.1016/j.cell.2006.02.041.PubMedView Article
  17. Nishimura DY: A spectrum of FOXC1 mutations suggests gene dosage as a mechanism for developmental defects of the anterior chamber of the eye. Am J Hum Genet. 2001, 68 (2): 364-72. 10.1086/318183.PubMedPubMed CentralView Article
  18. Sun X: Frequent somatic mutations of the transcription factor ATBF1 in human prostate cancer. Nat Genet. 2005, 37 (4): 407-12. 10.1038/ng1528.PubMedView Article
  19. Liang S, Fuhrman S, Somogyi R: Reveal, a general reverse engineering algorithm for inference of genetic network architectures. Pac Symp Biocomput. 1998, 18-29.
  20. Tyson JJ, Chen K, Novak B: Network dynamics and cell physiology. Nat Rev Mol Cell Biol. 2001, 2 (12): 908-16. 10.1038/35103078.PubMedView Article
  21. Elowitz MB: Stochastic gene expression in a single cell. Science. 2002, 297 (5584): 1183-6. 10.1126/science.1070919.PubMedView Article
  22. Rosenfeld N: Gene regulation at the single-cell level. Science. 2005, 307 (5717): 1962-5. 10.1126/science.1106914.PubMedView Article
  23. Raser JM, O'Shea EK: Noise in gene expression: origins, consequences, and control. Science. 2005, 309 (5743): 2010-3. 10.1126/science.1105891.PubMedPubMed CentralView Article
  24. Blake WJ: Noise in eukaryotic gene expression. Nature. 2003, 422 (6932): 633-7. 10.1038/nature01546.PubMedView Article
  25. Spellman PT: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 3273-97.PubMedPubMed CentralView Article
  26. Banerjee N, Zhang MQ: Identifying cooperativity among transcription factors controlling the cell cycle in yeast. Nucleic Acids Res. 2003, 31 (23): 7024-31. 10.1093/nar/gkg894.PubMedPubMed CentralView Article
  27. Gardner TS, Faith JJ: Reverse-engineering transcription control networks. Physics of Life Reviews. 2005, 2: 65-88. 10.1016/j.plrev.2005.01.001.PubMedView Article
  28. Schmitt WA, Raab RM, Stephanopoulos G: Elucidation of gene interaction networks through time-lagged correlation analysis of transcriptional data. Genome Res. 2004, 14 (8): 1654-63. 10.1101/gr.2439804.PubMedPubMed CentralView Article
  29. Smith AG: Embryo-derived stem cells: of mice and men. Annu Rev Cell Dev Biol. 2001, 17: 435-62. 10.1146/annurev.cellbio.17.1.435.PubMedView Article
  30. Pera MF, Reubinoff B, Trounson A: Human embryonic stem cells. J Cell Sci. 2000, 113 (Pt 1): 5-10.PubMed
  31. Donovan PJ, Gearhart J: The end of the beginning for pluripotent stem cells. Nature. 2001, 414 (6859): 92-7. 10.1038/35102154.PubMedView Article
  32. Loebel DA: Lineage choice and differentiation in mouse embryos and embryonic stem cells. Dev Biol. 2003, 264 (1): 1-14. 10.1016/S0012-1606(03)00390-7.PubMedView Article
  33. Stormo GD: DNA binding sites: representation and discovery. Bioinformatics. 2000, 16 (1): 16-23. 10.1093/bioinformatics/16.1.16.PubMedView Article
  34. Segal E: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003, 34 (2): 166-76. 10.1038/ng1165.PubMedView Article
  35. Iyer V, Struhl K: Absolute mRNA levels and transcriptional initiation rates in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 1996, 93 (11): 5208-12. 10.1073/pnas.93.11.5208.PubMedPubMed CentralView Article
  36. Lewis J: Autoinhibition with transcriptional delay: a simple mechanism for the zebrafish somitogenesis oscillator. Curr Biol. 2003, 13 (16): 1398-408. 10.1016/S0960-9822(03)00534-7.PubMedView Article
  37. Brandman O: Interlinked fast and slow positive feedback loops drive reliable cell decisions. Science. 2005, 310 (5747): 496-8. 10.1126/science.1113834.PubMedPubMed CentralView Article

Copyright

© Chen et al.; licensee BioMed Central Ltd. 2008

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://​creativecommons.​org/​licenses/​by/​2.​0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Advertisement