Discrete diffusion models to study the effects of Mg 2+ concentration on the PhoPQ signal transduction system
© Ghosh et al. 2010
Published: 01 December 2010
Skip to main content
© Ghosh et al. 2010
Published: 01 December 2010
The challenge today is to develop a modeling and simulation paradigm that integrates structural, molecular and genetic data for a quantitative understanding of physiology and behavior of biological processes at multiple scales. This modeling method requires techniques that maintain a reasonable accuracy of the biological process and also reduces the computational overhead. This objective motivates the use of new methods that can transform the problem from energy and affinity based modeling to information theory based modeling. To achieve this, we transform all dynamics within the cell into a random event time, which is specified through an information domain measure like probability distribution. This allows us to use the “in silico” stochastic event based modeling approach to find the molecular dynamics of the system.
In this paper, we present the discrete event simulation concept using the example of the signal transduction cascade triggered by extra-cellular Mg 2+ concentration in the two component PhoPQ regulatory system of Salmonella Typhimurium. We also present a model to compute the information domain measure of the molecular transport process by estimating the statistical parameters of inter-arrival time between molecules/ions coming to a cell receptor as external signal. This model transforms the diffusion process into the information theory measure of stochastic event completion time to get the distribution of the Mg 2+ departure events. Using these molecular transport models, we next study the in-silico effects of this external trigger on the PhoPQ system.
Our results illustrate the accuracy of the proposed diffusion models in explaining the molecular/ionic transport processes inside the cell. Also, the proposed simulation framework can incorporate the stochasticity in cellular environments to a certain degree of accuracy. We expect that this scalable simulation platform will be able to model more complex biological systems with reasonable accuracy to understand their temporal dynamics.
Advancement in high-throughput biological experiments has generated huge amounts of empirical data on the molecular foundations of biological structures and functions that require computer models for analysis. The next challenge is to understand the complex interactions of biological processes and functions creating the intelligence of life. The complexity increases manifold as we move into higher scales: interaction of large ensemble of cells in a tissue or interaction of tissues in an organ. Thus, we need to develop a comprehensive model integrating molecular and genetic data for quantitative studies of physiology and behavior of biological processes at multiple scales .
Existing models used in the understanding of biological processes can be divided into three main classes. Quantum Mechanics based models (femtosecond-picosecond; ) are used to understand the structure of macro molecules. The functional understanding of biological molecules (like binding properties, configuration states after binding and other individual properties of the molecular reactions) are well studied by the Molecular dynamics based model (picosecond-nanosecond; 1nm-10nm). The next challenge is to understand the biological intelligence created by the usage of the macro molecules in the cell. These are accomplished by the Mesoscale Dynamics (nanosecond-seconds; 10nm-1mm) models, and Cellular-level/Organ-level simulation schemes. Such simulation schemes are again broadly classified into two categories: (a) Continuous system models [2–6], employing differential equations to simulate cellular dynamics used in tools like Dizzy  and JARNAC ; (b) Stochastic discrete time models, like StochSim  and M-cell , that have been developed for capturing the stochastic nature of molecular interactions within the existing framework of rate equations in continuous time domain. Most of these models focus on intracellular biochemical reactions and require accurate estimation of a very large number of system parameters for providing systemic understanding of underlying processes. More integrative tools at the whole cell level have also been developed, which try to model cellular mechanisms and present visual representation of their functionality [9, 10].
Recently, it was shown that gene expression type interactions create a stochastic resonance  within the system and hence deterministic models are inappropriate for this process. The Gilliespie simulation  incorporates the dynamics of the chemical master equation by approximately handling the stochasticity of the mass-kinetic equations. As the temporal variability of reaction time is appreciable within a biological process, this method suffers from simulation stiffness. Moreover, this model represents the biological pathways through a set of reaction equations without showing their relation to the biological functions creating the pathway. Any change in the pathway description may change the complete set of equations. Also their approach has computational overheads and require estimates of all the rate constants. With the existing systems in perspective, we present a discrete-event driven paradigm - modeling a composite system by combining the state variables in the time-space domain as events and determining the immediate dynamics between the events by using statistical analysis or simulation methods. To reduce computational overhead we transform the different diffusion and molecular interactions within a cell from the thermodynamic energy based fields to the information theory field by suitable abstraction of the energy field profiles into selected statistical distributions. Our goals are to : (1) Use the results of the Quantum Mechanics based models (molecular structure data) and Molecular dynamics models (molecular binding data) to create the micro-level biological event models. (2) Transform the energy driven biological effects to information theory parameters in the probabilistic domain considering the biological functions. (3) Develop event models to estimate the statistics of the biological event. (4) Develop a discrete-event based “in silico” simulation for complex systems.
The rest of the paper is organized as follows. We briefly present an overview on the different modeling and simulation paradigms in Section Modeling and Simulation landscape. Next, we introduce the discrete event based simulation framework in Section Discrete Event Simulation Technique. In Section PhoPQ Biological System Model, we present the concept of event abstraction of biological pathways using the PhoPQ biological system in Salmonella. We introduce the analytical models for the molecular transport mechanisms in Section Analytical Models for Molecular Transport. In Section Numerical Results for the Molecular Transport Models, we present the performance results and validation of the molecular transport models. Section Simulation Results of the PhoPQ system presents some in silico results from the discrete event simulation of the PhoPQ system, to validate the performance of the simulation engine and also provides some examples of in silico hypothesis tests. Finally, we conclude in Section Conclusion and provide some directions for future work. A short version of this paper appeared in .
The inherent complexity involved in the molecular processes governing life has motivated the development of computational modeling and simulation techniques to decipher their ensemble dynamics. In this section, we provide an overview of the wide spectrum of in silico modeling and simulation methodologies available for system-wide study of biological processes.
Mathematical models have being extensively used for intracellular molecular networks like kinase cascades and metabolic pathways, gene regulatory networks and protein interaction networks. A large section of the work in computational models of biological systems is based on classical chemical kinetic (CCK) formalism based on a set of ordinary differential equations (ODE), also known as reaction rate equations or mass action kinetics . Representing a homogeneous biological system as a set of biochemical reactions, the temporal dynamics of the molecular species is studied in the continuous-deterministic domain. A large number of computational tools, which provide a software platform for building, storing and parameterizing a set of biochemical reactions and solving those using numerical techniques, are available, like Gepasi , Jarnac , CyberCell , Promot/DIVA , Stode . These rate-based models have been successfully applied to study gene expression and other molecular reaction systems.
While continuous-deterministic reaction models are capable of capturing behavioral dynamics for spatially homogeneous systems with large number of molecular species, the inherent stochasticity observed in many biological processes (gene expression and protein synthesis) have proven the limitation of CCK in accurately representing biological processes. In a recent article , Arkin et.al have shown the limitations of CCK in several common biological scenarios, where stochastic reaction dynamics present a more accurate picture of the systems behavior. Stochastic models, which present an accurate approximation for the chemical master equation (CME), have been developed, largely based on Gillespie’s algorithm [12, 20, 12]. In this method, the next reaction event and the time associated with it are computed based on a probability distribution (Monte Carlo Step). Stochastic tools, like StochSim , have been developed based on Gillespie’s technique and its computationally efficient variants like Gibson-Bruck  and tau-leaping [22–24]. A large number of tools, which provide an integrative environment to build and study biochemical reaction systems in an exchangeable format (like Systems Biology Markup Language (SBML)) using deterministic as well as stochastic techniques are available, like E-Cell , Virtual Cell , Dizzy , CyberCell , and M-Cell . These techniques are based on treating a biological process as a system of equations, represented by their rate constants and other parameters (like volume, cell density etc.) and simulating their interactions through numerical techniques or Monte Carlo based stochastic simulations.
Another technique in building abstract computational models for biosimulation has been developed based on Petri nets [29–31] and stochastic process algebra . These methods present a mathematical formalism for representing biochemical pathways within and between cells. In , the authors present a stochastic Petri net (SPN) model for studying simple chemical reactions (SPN model of ColE1 plasmid replication) and show how existing softwares can be used to perform structural analysis based on numerical techniques. Discrete event system specifications based on Devs & StateCharts  and Stochastic π calculus  have been successfully demonstrated to provide a computational platform for temporal simulation of complex biological systems. Hillston et. al have developed a mathematical technique, Performance Evaluation Process Algebra (PEPA) , wherein functionality is captured at the level of pathways rather than molecules and the system is represented as a continuous time Markov chain.
Other simulation methodologies, based on object oriented and agent based (ABM) paradigms have also been studied for in silico modeling of complex bio-processes by Uhramacher et.al [35–37]. In , the authors have developed AgentCell, an ABM based digital assay for the study of bacterial chemotaxis. Simulation platforms, based on discrete events, where the events are modeled on rate constants and measured experimental data, have been demonstrated in [9, 39].
The overarching theme, guiding the development of in silico modeling and simulation tools, is developing models based on continuous-deterministic ODEs or using stochastic simulation algorithms (SSA) for approximating the chemical master equation, which capture the temporal evolution of the biological process dynamics. Most of these techniques focus on molecular pathways, which are represented in graphical and mathematical formalisms, treat spatial dynamics in terms of well-defined cellular compartments, and abstract the complexity in terms of estimated parameters and rate constants. In the next section, we briefly outline our modeling and simulation technique, based on a discrete event system specification, where the molecular events (representing reactions, molecular/ionic transport etc) are mechanistically modeled depending on their biophysical characteristics to compute the probability distribution of their execution times. A discrete event simulation system then links the biological processes to simulate the behavior emerging from the interaction of the events in time.
In a discrete-event based approach, the dynamics of the system are captured by discrete time-space state variables, where an event is a combined process of large number of state transitions between a set of state variables accomplished within event execution time. The underlying assumption is that it is possible to segregate the complete state-space into such disjoint sets of independent events which can execute simultaneously without any interaction.
In Salmonella, virulence is produced by the two-component PhoPQ system that is activated by Mg 2+ concentration change. We identify the key biological functions involved in the PhoPQ regulatory network (from the sensing of Mg 2+ at the cell membrane to the expression of virulent genes in the nucleus). The schematic block diagram of the processes which we have identified to capture the pathway details is presented first. For each process block, we have some input signal(s) coming into the process and output signal(s) which can be considered as the outcome of the process and can trigger one or more processes (or the same process itself in a feedback mechanism). Figure 3 captures the high-level biological functions involved.
Normally a biological process is defined by a pathway (experimentally determined by biologists) that shows the cascade of biological functions in time. Currently, many pathway databases have been established maintaining this record for different species which we use to understand this process. With the departure of a Mg 2+ molecule, the phoQ protein auto-phosphorylates (kinase activity) by making use of an ATP molecule from the cell. The phosphatase activity of phoQ regulates the phosphotransfer mechanism to phosphorylate the phoP protein under micromolar Mg 2+ concentrations, and dephosphorylates the phosphorylated phoP molecules under millimolar Mg 2+ concentrations. Generally, Mg 2+ concentrations higher than 250 mM stimulate the dephosphorylation of phospho-phoP (also called phoPp). Two independent mechanisms of dephosphorylation of phoPp occur. One involves the reversion of the reaction that takes place to phosphorylate the response regulator, and the other is a specific phoPp phosphatase induced by high concentrations of Mg 2+ that renders the release of inorganic phosphate.
Thus we can identify the following discrete events from the PhoPQ pathway: with the departure of a Mg 2+ molecule (event: ion diffusion from membrane protein), the phoQ protein autophosphorylates (kinase activity) by making use of an ATP molecule from the cell (event: membrane reaction). The phosphate activity of the phoQ regulates the phosphotransfer mechanism to phosphorylate the phoP protein under micro molar Mg 2+ concentrations, and dephosphorylates the phosphorylated phoP molecules under millimolar Mg 2+ concentrations (event: cytoplasmic reaction). The Phospho PhoP (phoPp) activates the promoter loci and there is only one activation per phoPp. The loci are obtained from the determination of regulatory pathway. phoPp binding to DNA site is required for transcription (event: DNA protein binding). RNA polymerases are involved in the process of transcription (event: cytoplasmic multi molecule reaction). We also need to consider translation (including steps such as binding of polymerases, regulatory factors, subunits etc) and transport processes.
Thus we can identify many different biological functions and separate models are required to estimate their characteristics. The models for cytoplasmic reactions [42–45], DNA-protein binding [46, 47], protein-ligand docking [48, 49] and protein synthesis  have been reported separately. Here, we present the model for the molecular transport time event. We also explain, how we validate the mathematical model by considering actual molecular data on the PhoPQ system and published experimental results (similar analysis has been done for other model systems e.g., the RNAi pathway in ). Based on this model and the other models we mentioned, we can complete the simulation of the PhoPQ system.
From the PhoPQ system, we find that an important process that we have to model is the movement of molecules (Mg 2+ ions, phoPp etc). We have identified the following movement models for biological processes: (a) diffusion of charged ions (e.g. Mg 2+) in the cell (to model the Mg 2+ arrival/departure process); (b) diffusion of non-charged molecules (to model the transport function of phospho-PhoP in the cytosol); (c) diffusion of charged ions out of the cell (to model the Mg 2+ departure process out of the cell). This movement model should also consider the breakage of the ionic bond between Mg 2+ and phoQ molecules for the diffusion to occur; (d) The fourth movement model is the movement of ions or molecules due to additional energy provided by the pump system. Here, we present the analytical solution of the first two models.
where, for convenience, we assume that the capillary is infinitely long. Here, D = diffusion constant having units length 2 /time, c = concentration of the chemical, t = time and x = distance traversed inside the capillary by the chemical.
Because the solute bath in which the capillary sits is large, it is reasonable to assume that the chemical concentration at the tip is fixed at C (0, t) = C 0, and because the tube is initially filled with pure water, C (x,0) = 0.
where . We can compute the inter-arrival time between the diffused molecules from the following theorem:
where I i +1 and I i are the times taken for diffusion of the (i + 1) th and i th molecules respectively, and G is the cross-sectional area of the capillary.
It is also possible to determine the diffusion coefficient by solving Eqn 4 for D :
In , this expression was used to measure the diffusion constant in bacteria. With concentration C 0 = 7 × 107/ml, and times t = 2, 5, 10, 12.5, 15 and 20 minutes, they counted N = 1800, 3700, 4800, 5500, 6700 and 8000 bacteria in a capillary of length 32 mm with 1 μ l total capacity. In addition, with C 0 = 2.5, 4.6, 5.0, and 12.0 × 107 bacteria per millimeter, counts of 1350, 2300, 3400, and 6200 were found at t = 10 minutes. A value of D in the range of 0.1 — 0.3 cm 2 /hour was estimated using Eqn 4.
Estimates of diffusion times for typical cellular structures, computed from the relation t = x 2/D using D = 10–5 cm 2/s
thickness of cell membrane
1 μ m
size of mitochondrion
10 μ m
radius of small mammalian cell
250 μ m
radius of squid giant axon
half-thickness of frog sartorius muscle
half-thickness of lens in the eye
radius of mature ovarian follicle
thickness of ventricular myocardium
length of a nerve axon
where, a = αV/l. As it is difficult to achieve a closed form solution of the above equation, we modify the boundary conditions leading to the following theorem:
A standard method for obtaining the solution of the above partial differential equation (PDE) is to assume that the variables are separable. Thus we may attempt to find a solution of Eqn 7 by putting
C = Y (x) Z (t) (9)
Y ″(x) + aY ′(x) + λ 2 Y (x) = 0 (12)
The solution for the first equation is:
Z = e – λ 2 Dt (13)
The previous capillary model cannot be used in this case to obtain a solution because the underlying complexity becomes immense. We will now consider diffusion out of a plane sheet of thickness l through which the diffusing substance is initially uniformly distributed and the surfaces of which are kept at zero concentration. Mapping this model to our case, the ion channel of length l is assumed to contain the entire diffusing substance. Every single molecule coming out of this sheet is assumed to enter the cell membrane (Mg 2+ arrival process). This model thus approximately characterizes the Mg 2+ diffusion process. The corresponding boundary conditions are as follows:
Thus we get the time domain analysis for the concentration of Mg 2+ molecules from which we can derive the mean Mg 2+ departure rate. The inter-arrival time between the diffused molecules can be computed from the following theorem:
The total number of molecules/ions, N, present inside the sheet of area G in a fixed time I N is given by:
The inter-arrival time can be computed in a straight-forward way by noting that diffusion occurs when a molecule/ion goes out off the plane sheet.
The third diffusion model basically characterizes reaction-diffusion systems and can be simply computed by convoluting the event time distributions (which are random variables) of a reaction model and any of the above diffusion models.
Parameter Estimation for the numerical plots
Diameter of an ion-channel (d)
10 × 10–10 m
Cross-sectional area of ion-channel (G ′)
Number of ion-channels (N ′)
N ′ × G ′
10–5 cm 2/s
Note that model 1 is standard and we estimated the inter-arrival times of Mg 2+ molecules using it. The transient analysis of model 2 is hard to solve and hence we chose a specific boundary condition (as mentioned before) to derive a closed form expression. The corresponding results compare well with model 1 indicating its validity.
As the arrival/departure of Mg 2+ molecules into the cell membrane is essentially a stochastic process, a constant diffusion rate is not suitable to trigger the input process of the PhoPQ system. Hence we use an exponential distribution (as indicated by the numerical plots above) to estimate the inter-arrival times for diffusion of Mg 2+ (which is considered to be a random variable) to generate the results. The mean of this exponential distribution is obtained from similar plots of inter-arrival times as shown above and corresponding curve-fitting. As mentioned before, the PhoPQ system is triggered at micromolar concentrations of Mg 2+ outside the cell, i.e., with millimolar Mg 2+ concentration inside the cell. Thus it is fair to assume C 0 ≃ 10–3 moles. The mean of the inter-arrival times of Mg 2+ for this concentration is estimated to be ≈ 10–6 secs for Model 1 and 10 msecs for Model 2 respectively. The discrete-event simulation framework correspondingly uses a Poisson distribution with the same mean (as the inter-arrival times follow an exponential distribution) to estimate the time taken for the departure process of Mg 2+triggering the signal transduction cascade (following Model 2) and an exponential distribution to estimate the phoPp molecule transport times (following Model 1).
The simulation framework also uses the holding time estimates of other elementary biological processes such as cytoplasmic reactions , , ,  (models 2, 3, 4 and 5 in Figure 3), protein-DNA binding  (model 6 in Figure 3) and gene transcription/translation times . Here, we present the results illustrating the sensitivity of the simulation to the diffusion models used.
The efficacy of an in silico modeling and simulation approach is governed by
validation of the model against existing wet-lab experimental results,
effective calibration and sensitivity analysis of the key parameters governing the biological model and
hypothesis testing of different conditions on the biological system which can give further insights for novel experiments in the future.
In this section, we employ the discrete event based stochastic simulation framework to model the dynamics of single cell dynamics, specifically, the effect of the PhoPQ two-component signal transduction pathway on the expression of virulence genes involved in bacterial pathogenesis of the gram-negative bacteria Salmonella Typhimurium. While the simulation system can be used to model the temporal dynamics of different regulatory pathways in a bacterial cell, we focus on the particular system in this work as it provides,
Existing wet-lab experimental setup and results  which allow the validation of the in silico results
The system involves the interaction of signal transduction with subsequent expression of genes governed by the upstream signals
The gene regulation pathway as built based on existing literature on the two-component system provides various regulatory mechanisms including up and down regulation of genes, and positive feedback effects which can serve to test different hypothesis in silico.
As the system involves complex biological functions like gene regulation and protein expression, whose exact molecular mechanisms are not always well known, it provides a platform to test the efficacy of granular model abstraction based on available knowledge, on the behavior at a systems level.
In the rest of the section, we start with a brief description of the wet lab experimental system, moving on to present the detailed results of in silico analysis. We show how the discrete event simulation framework can be used for hypothesis-driven analysis of different conditions in silico for the PhoPQ system.
The experimental setup, explained in details in , consists of reporting the system output of the phoPQ pathway on bacterial cells. As reported in , fluorescence measure of expression of destabilized green fluorescence protein (dEGFP) under the control of a phoPp (phosphorylated phoP) responsive promoter was used as the reporter system. Thus, the system measure of the dEGFP was in essence an indication of the phoPp concentration in the system.
Also, the platform provides flexibility in changing these conditions and resources to generate synthetic, hypothetical results for a better understanding of the test system. In the next subsection, we outline the system and simulation parameters and present the results of the in silico experiment.
System model and simulation parameters
Length of Genome
Number of Genes
Rate of transcription
Rate of translation
Area of cell
6 × 10–12 m 2
Volume of cell
10–18 m 3
Diffusion coefficient of Mg 2+ ion
10–9 m 2/s
Diffusion coefficient of protein molecule
7.7 × 10–6 m 2/s
Average mass of a protein molecule
Average diameter of a protein molecule
For the current system, the simulation focused on tracing the effects of signaling events (Mg 2+ ion arrival and departures) on the expression dynamics of the PhoPQ pathway. Also, as a reporter protein (GFP) has been used in the wet-lab scenario to trace the system behavior, our results are focused primarily on phoPp as the main resource whose dynamic temporal behavior was observed in the simulation. Although, the simulation can be configured to monitor and generate results for a wide range of system resources, phoPp was chosen primarily to verify the wet-lab tests. The in-silico results denote resource states averaged over 100 runs of the simulation under the same initial conditions.
Also, the condition of no resource starvation shows relative smoothness in output as obtained from continuous system models since the effect of low copy number of molecules on stochasticity  is not displayed. The in silico platform allows the analysis of the effects of stochasticity on the model by varying the resource states of the molecules involved in the simulation and also the sensitivity of the system outputs to the different parameters governing the event holding time distributions. In the next sub-section, we present a systematic analysis of the different in silico hypothesis tests.
The in-silico simulation allows the modeler to test the system under various synthetic conditions, in terms of system resource states, initial conditions and different combinations of environmental cues driving the systems (for example, the diffusion of Mg 2+ through the cell membrane in our case study).
The in silico results on the test-bed pathway demonstrate the efficacy of the modeling and simulation approach for studying single cell dynamics. Particularly, the flexibility in event scheduling and resource state specifications allows a modeler to validate the effects of high and low copy number of molecules on different parts of the biological system. Moreover, the flexibility allows the simulation to be computationally efficient depending on the required granularity of the biological model and the resource state space considered .
We have proposed a new “in silico” modeling technique capturing the temporal dynamics of biological systems at multiple scales that can be simulated by the discrete event technique. For this, we need the transformation of biological functions into information theory based measure like probability distributions of event time. We have presented one example of the transformation of a biological function (i.e., molecular transport time) driven by concentration and potential gradients in this paper. We also validated the molecular transport models and put together a discrete event simulation for the PhoPQ system to validate the system level dynamics based results with experimental estimates. We also used this molecular transport model to generate some in-silico hypothesis testing results on the PhoPQ system.
The proposed stochastic models meet the accuracy and computational speed requirements for modeling complex biological processes. These models are parametric and can be used for different cases of molecular transport. Once the complete set of mathematical models for the different biological functions are in place, it should be possible to reuse these models to construct other biological process models with marginal changes. The models provide for both speed of computation and flexibility that is required to model the dynamics of an entire cell. We envisage the development of an efficient tool for understanding the dynamics of complex biological systems that can model the multi-scale biological process at a coarse grain accuracy.
The authors would like to acknowledge the National Science Foundation (NSF) and the University of Southern Mississippi for providing generous funds to accomplish this project. The authors also thank Dr. Simon Daefler’s lab at the University of Texas, Southwestern Medical Center for providing the experimental results on the PhoPQ system. Publication of this supplement was made possible with support from the International Society of Intelligent Biological Medicine (ISIBM).
This article has been published as part of BMC Genomics Volume 11 Supplement 3, 2010: The full contents of the sup-plement are available online at http://www.biomedcentral.com/1471-2164/11?issue=S3.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.