Volume 10 Supplement 1
The 2008 International Conference on Bioinformatics & Computational Biology (BIOCOMP'08)
Reverse engineering module networks by PSORNN hybrid modeling
 Yuji Zhang^{1, 2},
 Jianhua Xuan^{2},
 Benildo G de los Reyes^{3},
 Robert Clarke^{1} and
 Habtom W Ressom^{2}Email author
DOI: 10.1186/1471216410S1S15
© Zhang et al; licensee BioMed Central Ltd. 2009
Published: 7 July 2009
Abstract
Background
Inferring a gene regulatory network (GRN) from high throughput biological data is often an underdetermined problem and is a challenging task due to the following reasons: (1) thousands of genes are involved in one living cell; (2) complex dynamic and nonlinear relationships exist among genes; (3) a substantial amount of noise is involved in the data, and (4) the typical small sample size is very small compared to the number of genes. We hypothesize we can enhance our understanding of gene interactions in important biological processes (differentiation, cell cycle, and development, etc) and improve the inference accuracy of a GRN by (1) incorporating prior biological knowledge into the inference scheme, (2) integrating multiple biological data sources, and (3) decomposing the inference problem into smaller network modules.
Results
This study presents a novel GRN inference method by integrating gene expression data and gene functional category information. The inference is based on module network model that consists of two parts: the module selection part and the network inference part. The former determines the optimal modules through fuzzy cmean (FCM) clustering and by incorporating gene functional category information, while the latter uses a hybrid of particle swarm optimization and recurrent neural network (PSORNN) methods to infer the underlying network between modules. Our method is tested on real data from two studies: the development of rat central nervous system (CNS) and the yeast cell cycle process. The results are evaluated by comparing them to previously published results and gene ontology annotation information.
Conclusion
The reverse engineering of GRNs in time course gene expression data is a major obstacle in system biology due to the limited number of time points. Our experiments demonstrate that the proposed method can address this challenge by: (1) preprocessing gene expression data (e.g. normalization and missing value imputation) to reduce the data noise; (2) clustering genes based on gene expression data and gene functional category information to identify biologically meaningful modules, thereby reducing the dimensionality of the data; (3) modeling GRNs with the PSORNN method between the modules to capture their nonlinear and dynamic relationships. The method is shown to lead to biologically meaningful modules and networks among the modules.
Background
In recent years, high throughput biotechnologies have made largescale gene expression surveys a reality. Gene expression data provide an opportunity to directly review the activities of thousands of genes simultaneously. However, computational methods that can handle the complexity (noisy, substantial amount of variables, high dimensionality, etc.) of these biological data are often unavailable [1]. Powerful computational methods and data mining tools are needed for biologically meaningful inferences from gene expression data.
Cluster analysis has been used to separate genes into groups based on their expression profiles [2], in which similar expression profiles will be more likely in the same group. Although cluster analysis gives insight into the groups of genes that may share similar functions, the inference of the relationships among these groups is beyond what cluster analysis can do.
A variety of continuous or discrete, static or dynamic, quantitative or qualitative models have been proposed for inference of biological networks. These include biochemically driven methods [3], linear models [4, 5], Boolean networks [6], fuzzy logic [7, 8], Bayesian networks [9], and recurrent neural networks [10–12]. Biochemically inspired models are developed on the basis of the reaction kinetics between different components of a network. However, most of the biochemically relevant reactions under participation of proteins do not follow linear reaction kinetics, and the full network of regulatory reactions is very complex and hard to unravel in a single step. Linear models attempt to solve a weight matrix that represents a series of linear combinations of the expression level of each gene as a function of other genes, which is often underdetermined since gene expression data usually have far fewer dimensions than the number of genes. In a Boolean network, the interactions between genes are modeled as Boolean function. Boolean networks assume that genes are either "on" or "off" and attempt to solve the state transitions for the system. The validity of the assumptions that genes are only in one of these two states has been questioned by a number of researchers, particularly among those in the biological community. In [7], an approach is proposed based on fuzzy rules of a known activator/repressor model of gene interaction. This algorithm transforms expression values into qualitative descriptors that can be evaluated by using a set of heuristic rules and searches for regulatory triplets consisting of activator, repressor, and target gene. This approach, though logical, is a brute force technique for finding gene relationships. It involves significant computation time, which restricts its practical usefulness. In [8], we propose the use of clustering as an interface to a fuzzy logicbased method to improve the computational efficiency. In a Bayesian network model, each gene is considered as a random variable and the edges between a pair of genes represent the conditional dependencies entailed in the network structure. Bayesian statistics are applied to find certain network structure and the corresponding model parameters that maximize the posterior probability of the structure given the data. Unfortunately, this learning task is NPhard, and it also has the underdetermined problem. The recurrent neural network (RNN) model has received considerable attention because it can capture the nonlinear and dynamic aspects of gene regulatory interactions. Several algorithms have been applied for RNN training in network inference tasks, such as fuzzylogic [11] and genetic algorithm [12]. In [10, 13], we applied particle swarm optimization (PSO) method to train the RNN for network inference, yielding promising results.
As variant sources of biological data are becoming available now, it is very necessary and helpful to infer gene regulatory network (GRN) not only from one single data source, but from data fusion of multiple complementary data sources. A few previous studies combined time course gene expression data with other data sources, such as genomic location data [14] and sequence motif [15]. Prior knowledge of GRN helps understand gene interactions in important biological processes such as differentiation, cell cycle, and development. Due to the specific properties of gene expression data, the task of inferring GRNs involves several challenges including: (1) living cells contain thousands of genes (high dimensionality); (2) each gene interacts with one or more other genes directly or indirectly with complex dynamic and nonlinear relationships, (3) current technologies generate data that involve a substantial amount of noise, and (4) due to the cost of largescale gene expression profiling experiments, the sample size is extremely low compared with the number of genes. In this study, we address these challenges by: (1) preprocessing gene expression data (e.g. normalization and missing value imputation) to reduce the data noise; (2) clustering genes with gene expression data and gene functional category information to find the optimal modules with biological significance and reduce the problem dimensionality; (3) modeling GRNs with the particle swarm optimization – recurrent neural network (PSORNN) method between the modules to capture their nonlinear and dynamic relationships.
Our previous studies [10, 13] demonstrate that we can benefit by incorporating known gene functional category information in terms of improving the inferential power of our framework. Moreover, instead of using fully connected RNN model, we propose a network pruning method to select the statistically significant weights for the final GRN structure using PSO. The hybrid PSORNN algorithm is applied to infer networks of interactions from two realworld gene expression data. The inferred GRNs are confirmed with previous studies.
Results and discussion
In this section, we demonstrate the inference ability of our proposed method via two experimental studies: the rat central nervous system (CNS) and yeast cell cycle process. Both data were preprocessed in the original studies [16, 17]. To proceed with the module network inference process, we first imputed the missing values in the data by using the Bayesian principal component analysis (BPCA) method [18]. Following that, we standardized the data between zero and one.
Rat CNS data
This case study is based on the data published in [16], consisting of gene expression levels for 112 genes during the development of the CNS of rats. Each gene was measured at nine different points in time (of which the last, measured for the adult animal, was not used here). The first measurement was made 10 days before birth, and the intervals between measurements were 2 or 3 days in the period before birth and 7 days after birth. The gene functional category information can also be found in [16].
Yeast cell cycle data
The yeast cell cycle data presented in [17] consist of six time series (cln3, clb2, alpha, cdc15, cdc28, and elu) expression measurements of the transcript (mRNA) levels of S. cerevisiae genes. 800 genes were identified as cell cycle regulated based on cluster analysis in [17]. Here, we used the cdc15 time course data of the 800 genes since it has the largest number of time points (24).
Mapping of expression clusters to functional gene classes.
G1  S  S/G2  G2/M  M/G1  

wave1  210  2  2  2  25 
wave2  46  63  67  2  1 
wave3  0  1  38  125  3 
wave4  9  1  2  24  58 
wave5  35  4  12  26  42 
Conclusion
Reverse engineering of GRNs from time course gene expression data is a major obstacle in system biology due to the limited number of time points. We demonstrate that our method can address this challenge by decomposing the reverse engineering problem into modules, where two steps are involved: the gene expression data is clustered into modules with biological significances to reduce the problem dimensionality, and the network is built based on the expression profiles of modules. We evaluate the performance of the algorithm using two real data sets: rat CNS data and yeast cell cycle data. The results indicate that biologically meaningful modules are selected and biologically plausible networks between modules are estimated. For example, in CNS data, the inferred network at module level is a combination of the networks verified in the other two studies [19, 20]. Our future research will focus on network module inference with more detailed gene category/regulation information. Multiple data sources (e.g. ChIPonChip data [21], motif information, and gene ontology annotation) can be used for this purpose. Also, the data fusion from complementary data sources will not only help solve the underdetermined problem in GRN inference, but also increase the prediction accuracy. Another direction to address the underdetermined reverse engineering problem is to decompose the GRN into small subnetworks, called network motifs (NMs) [22].
Methods
The proposed method includes two parts: module selection and network inference. In the module selection part, we cluster the genes by FCM clustering. The optimal number of clusters is determined by the relative entropy estimate method, which incorporates the gene functional category information; each cluster is considered as a module representing certain coregulated genes. After the modules are determined, the PSORNN inference algorithm is applied. In this algorithm, each module is considered as a neuron in the RNN structure, and any regulation between two modules is a weight in the RNN. To find the best fit network among the modules, a generalized PSO method, including basic PSO and neural network pruning technique, is used to determine RNN structure and its parameters.
Module selection
Clustering has been a major method to partition the genes into groups of coexpressed genes [23]. However, most of these clustering methods are purely datadriven with no prior biological knowledge. Here we present a new clustering method based on FCM clustering. Instead of using purely datadriven estimate methods, we propose a new estimate method to select the optimal number of clusters by incorporating gene functional category information.
FCM clustering
FCM is a method of clustering which allows a data point to belong to two or more clusters. The detailed description of FCM method can be found in [24]. Several methods have been used for estimating the optimal number of clusters, e.g. XieBeni statistic [24]and gap statistic [25]. Because all these methods are purely datadriven, it is not suitable to estimate the clustering of gene expression data.
Estimating the number of modules
where Λ is the sample space of x. The goal is to identify the clusters with significant relative entropy.
Network inference
In building an RNN to infer a network of interactions, the identification of the correct structure and determination of the free parameters (weights and biases) to mimic measured data is a challenging task given the limited available quantity of data and complex search space. In this paper, we apply PSO and neural network pruning methods to select the optimal architecture of an RNN and update its free parameters.
Network model
where x_{ i }is the gene expression level of the i^{ th }gene (1 ≤ i ≤ N), N is the number of genes in the model), φ(·) is a activation function, w_{ ij }represents the effect of j^{ th }gene on the i^{ th }gene (1 ≤ i, j ≤ N), b_{ i }denotes the bias for the i^{ th }gene, and τ is the decay rate parameter. The function φ (·) introduces nonlinearity to the model.
where x_{ i }(t) and are the true and predicted values (expression levels) for the i^{ th }neuron (entity) at time t. The goal is to determine the structure and weights that minimize this cost function.
Training algorithm
There exist many algorithms for RNN training in the literature, e.g., backpropagation through time (BPTT) [27] and genetic algorithm (GA) [12]. BPTT is an extension of the standard backpropagation algorithm, using gradient descent method to find the best solution. However, the use of the gradient descent requires the error function to be differentiable, and also makes the procedure easy to get stuck in local minima. GA, inspired by the natural evolution process, has been applied to optimize the GRN in some applications [12, 28].
Here, we use PSO [29] for RNN structure training. It has been shown that PSO requires less computational cost and can achieve faster convergence than conventional backpropagation in training neural networks for approximating a nonlinear function [30]. Compared with GA, PSO is easy to implement and there are few parameters to adjust. Particularly, PSO has memory for the previous best solutions to avoid the possible loss of learned knowledge. All these features make PSO suitable for GRN inference.
In PSO, each particle is represented as a vector and instantaneous trajectory vector , describing its direction of motion in the search space at iteration k. The index i refers to the i^{ th }particle. The core of the PSO algorithm is the position update rule (7) which governs the movement of each of the n particles through the search space.
The key strength of the PSO algorithm is the interaction among particles. The second term in (7), , is considered to be a "social influence" term. While this term tends to pull the particle towards the globally best solution, the first term, , allows each particle to think for itself. The net combination is an algorithm with excellent tradeoff between total swarm convergence, and each particle's capability for global exploration. Moreover, the relative contribution of the two terms is weighted stochastically.
The algorithm consists of repeated application of the velocity and position update rules presented above. Termination can occur by specification of a minimum error criterion, maximum number of iterations, or alternately when the position change of each particle is sufficiently small as to assume that each particle has converged.
PSO Parameter setting
Parameter  Value 

Maximum search space range, W_{ max }  [5, 5] 
Acceleration constants, c_{1} &c_{2}  2.05, 2.05 
Size of swarm  50–150 
PSORNN hybrid algorithm
In this section, we illustrate how PSO optimizes the parameters of RNN and how the structure of RNN is pruned to mimic the response of an unknown network of interactions. Since PSO is a stochastic algorithm, a single solution may not reflect the underlying network. We therefore collect a number of solutions from the PSORNN algorithm and use them to determine a single output network that receives the majority vote. Specifically, we applied 100 runs for each network inference. If the absolute value of the average of one parameter in hundred runs is larger than its standard deviation, it is said significant and will be selected for the final network, otherwise it will be set to zero. The following reverse engineering procedure is utilized:
1. Run the reverse engineering algorithm without introducing any particular constraints (except the maximumallowed values) in the network parameters. Perform hundred runs, and select the networks with mean squared error (MSE) less than certain threshold for further network parameter evaluation.
2. Determine the average and standard deviations of the network parameters using the results from Step 1.
3. Set nonsignificant parameters (if any) to zero. If there is no nonsignificant parameter, the procedure is stopped.
4. Return to the reverse engineering algorithm, with nonsignificant weights set to zero. If the results (measured by the fitness) are as good, or almost good, as for the previous sets of runs, form the network averages, and return to Step 3. If instead the results are worse than in the previous run, discontinue the procedure.
List of abbreviations used
 (CNS):

Central nervous system
 (FCM):

fuzzy cmeans
 (GRN):

Gene regulatory network
 (MSE):

mean square error
 (NM):

network motif
 (PSO):

particle swarm optimization
 (RNN):

recurrent neural network.
Declarations
Acknowledgements
This article has been published as part of BMC Genomics Volume 10 Supplement 1, 2009: The 2008 International Conference on Bioinformatics & Computational Biology (BIOCOMP'08). The full contents of the supplement are available online at http://www.biomedcentral.com/14712164/10?issue=S1.
Authors’ Affiliations
References
 Nasmyth K, Dirick L: The role of SW14 and SW16 in the activity of Gl cyclins in yeast. Cell. 1991, 66: 9951013.View ArticlePubMedGoogle Scholar
 D'Haeseleer P, Liang S, Somogyi R: Genetic network inference: from coexpression clustering to reverse engineering. Bioinformatics. 2000, 16 (8): 707726.View ArticlePubMedGoogle Scholar
 Naraghi M, Neher E: Linearized buffered Ca2+ diffusion in microdomains and its implications for calculation of [Ca2+] at the mouth of a calcium channel. J Neurosci. 1997, 17 (18): 69616973.PubMedGoogle Scholar
 Chen T, He HL, Church GM: Modeling gene expression with differential equations. Pac Symp Biocomput. 1999, 2940.Google Scholar
 D'Haeseleer P, Wen X, Fuhrman S, Somogyi R: Linear modeling of mRNA expression levels during CNS development and injury. Pac Symp Biocomput. 1999, 4152.Google Scholar
 Shmulevich I, Dougherty ER, Kim S, Zhang W: Probabilistic Boolean Networks: a rulebased uncertainty model for gene regulatory networks. Bioinformatics. 2002, 18 (2): 261274.View ArticlePubMedGoogle Scholar
 Woolf PJ, Wang Y: A fuzzy logic approach to analyzing gene expression data. Physiol Genomics. 2000, 3 (1): 915.PubMedGoogle Scholar
 Ressom H, Reynolds R, Varghese RS: Increasing the efficiency of fuzzy logicbased gene expression data analysis. Physiol Genomics. 2003, 13 (2): 107117.View ArticlePubMedGoogle Scholar
 Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data. J Comput Biol. 2000, 7: 601620.View ArticlePubMedGoogle Scholar
 Ressom HW, Zhang Y, Xuan J, Wang J, Clarke R: Inferring network interactions using recurrent neural networks and particle swarm optimization. Proceedings of the First International Conference on Computational Systems Biology. 2006Google Scholar
 Maraziotis I, Dragomir A, Bezerianos A: Gene networks inference from expression data using a recurrent neurofuzzy approach. Conf Proc IEEE Eng Med Biol Soc: 2005. 2005, 48344837.Google Scholar
 Chiang JH, Chao SY: Modeling human cancerrelated regulatory modules by GARNN hybrid algorithms. BMC Bioinformatics. 2007, 8: 91PubMed CentralView ArticlePubMedGoogle Scholar
 Zhang Y, Xuan J, de los Reyes BG, Clarke R, Ressom HW: Network motifbased identification of transcription factortarget gene relationships by integrating multisource biological data. BMC Bioinformatics. 2008, 9: 203PubMed CentralView ArticlePubMedGoogle Scholar
 Yeang C, Jaakkola T: Time Series Analysis of Gene Expression and Location Data. Proc Of the 3rd IEEE Symposium on Bioinformatics and BioEngineering (BIBE'03): 2003. 2003Google Scholar
 Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nat Genet. 1999, 22 (3): 281285.View ArticlePubMedGoogle Scholar
 Wen X, Fuhrman S, Michaels GS, Carr DB, Smith S, Barker JL, Somogyi R: Largescale temporal gene expression mapping of central nervous system development. Proc Natl Acad Sci USA. 1998, 95 (1): 334339.PubMed CentralView ArticlePubMedGoogle Scholar
 Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycleregulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 32733297.PubMed CentralView ArticlePubMedGoogle Scholar
 Oba S, Sato MA, Takemasa I, Monden M, Matsubara K, Ishii S: A Bayesian missing value estimation method for gene expression profile data. Bioinformatics. 2003, 19 (16): 20882096.View ArticlePubMedGoogle Scholar
 Tang Y, Kitisin K, Jogunoori W, Li C, Deng CX, Mueller SC, Ressom HW, Rashid A, He AR, Mendelson JS, et al: Progenitor/stem cells give rise to liver cancer due to aberrant TGFbeta and IL6 signaling. Proc Natl Acad Sci USA. 2008, 105 (7): 24452450.PubMed CentralView ArticlePubMedGoogle Scholar
 Wahde M, Hertz J: Modeling genetic regulatory dynamics in neural development. J Comput Biol. 2001, 8 (4): 429442.View ArticlePubMedGoogle Scholar
 Lee TI, Rinaldi NJ, Robert F, Odom DT, BarJoseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298 (5594): 799804.View ArticlePubMedGoogle Scholar
 Alon U: Network motifs: theory and experimental approaches. Nat Rev Genet. 2007, 8 (6): 450461.View ArticlePubMedGoogle Scholar
 BenDor A, Shamir R, Yakhini Z: Clustering gene expression patterns. J Comput Biol. 1999, 6 (3–4): 281297.View ArticlePubMedGoogle Scholar
 Xie XL, Beni G: A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1991, 13 (8): 841847.View ArticleGoogle Scholar
 Tibshirani R, Walther G, Hastie T: Estimating the number of clusters in a datset via the Gap statistic. Royal Statistical Society: Series B (Statistical Methodology). 2001, 63 (2): 411423.View ArticleGoogle Scholar
 Xu R, Wunsch DC: Gene regulatory networks inference with recurrent neural network models. IEEE International Joint Conference on Neural Networks: 31 July–4 Aug. 2005. 2005, 286291.Google Scholar
 Werbos PJ: Backpropagation Through Time: What It Does And How to Do It. Proceedings of IEEE. 1990, 78 (10): 15501560.View ArticleGoogle Scholar
 Keedwell E, Narayanan A: Discovering gene regulatory networks with a neuralgenetic hybrid. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2005, 2 (3): 231243.View ArticlePubMedGoogle Scholar
 Kennedy J, Eberhart RC: Particle swarm optimization. Proceedings of the 1995 IEEE International Conference on Neural Networks (Perth, Australia). 1995, IV: 19421948.Google Scholar
 Gudise V, Venayagamoorthy G: Comparison of Particle Swarm Optimization and Backpropagation as Training Algorithms for Neural Networks. Proceedings of the 2003 IEEE Svarm Intelligence Symposium. 2003, 110117.Google Scholar
 Birge B: PSOt – a particle swarm optimization toolbox for use with Matlab. Swarm Intelligence Symposium, 2003 SIS '03 Proceedings of the 2003 IEEE: 2003. 2003, 182186.Google Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.