Based on the immune response principle and the ε-dominance strategy [19], this paper incorporates dynamic population size [18] into the MOIB [15] algorithm and proposes a novel dynamic multi-objective immune optimization biclustering (DMOIOB) algorithm to find one or more significant biclusters of maximum size in microarray data. In the proposed algorithm, feasible solutions are regarded as antibodies, and Pareto optimal solutions are preserved in an antigen population that is updated using the ε-dominance relation and the crowding distance. In this way, many Pareto optimal solutions can be obtained effectively and distributed along the Pareto front. Three objectives, the size, homogeneity and row variance of biclusters, are optimized simultaneously by applying three fitness functions in the optimization framework. A low mean squared residue (MSR) score of a bicluster indicates that the expression level of each gene within the bicluster is similar over the range of conditions. Therefore, we focus on finding biclusters of maximum size whose mean squared residue is lower than a given δ and whose row variance is relatively high.

### Biclusters

Given a gene expression data matrix D = G×C = {*d*_{ij}} (where *i*∊[1, *n*], *j*∊[1, *m*]), a real-valued *n*×*m* matrix, G is a set of *n* genes {g_{1}, g_{2}, ⋯, g_{n}} and C is a set of *m* biological conditions {c_{1}, c_{2}, ⋯, c_{m}}. Entry *d*_{ij} is the expression level of gene *g*_{i} under condition *c*_{j}. If a submatrix B = *g*×*c*, where *g*⊂G and *c*⊂C, satisfies a given homogeneity criterion and a minimal size, we say that B is a bicluster.

### Bicluster encoding

Each bicluster is encoded as an individual of the population. Each individual is represented by a binary string of fixed length *n*+*m*, where *n* and *m* are the numbers of genes and conditions of the microarray dataset, respectively. The first *n* bits correspond to the *n* genes and the following *m* bits to the *m* conditions. If a bit is set to 1, the corresponding gene or condition belongs to the encoded bicluster; otherwise it does not. This encoding has the advantage of a fixed size, so standard variation operators can be applied directly. For example, the string “0110100010#0110100110” encodes a bicluster with 4 genes and 5 conditions, whose size is 4×5=20. Here, # is a symbol used only to delimit the row bits from the column bits.
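
The encoding can be illustrated with a minimal Python sketch that decodes the example string into gene and condition indices. The function name `decode_bicluster` and the 0-based indexing are assumptions made for illustration only; they do not come from the original implementation.

```python
def decode_bicluster(bits: str, n_genes: int, n_conds: int):
    """Split an (n + m)-bit string into selected gene and condition indices.

    Hypothetical helper: indices are 0-based, and '#' is treated as a purely
    visual delimiter that is not part of the n + m bits.
    """
    bits = bits.replace("#", "")
    if len(bits) != n_genes + n_conds:
        raise ValueError("expected exactly n + m bits")
    genes = [i for i, b in enumerate(bits[:n_genes]) if b == "1"]
    conds = [j for j, b in enumerate(bits[n_genes:]) if b == "1"]
    return genes, conds


# The example string from the text: 10 gene bits, then 10 condition bits.
genes, conds = decode_bicluster("0110100010#0110100110", 10, 10)
size = len(genes) * len(conds)  # number of genes times number of conditions
```

For the example string this yields 4 selected genes and 5 selected conditions, which matches the size of 20 stated above.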

### Fitness function

We aim to mine biclusters with low mean squared residue, high volume and high gene-dimensional variance. These three objectives conflict with each other and are therefore well suited to multi-objective modelling. To achieve these aims, this paper uses the same fitness functions as [20].
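
The fitness functions of [20] are not reproduced here. As a rough illustration of the three underlying quantities, the following Python sketch computes the volume, the mean squared residue and the mean row variance of a bicluster, assuming the commonly used Cheng and Church definitions. The function name `bicluster_scores` and the list-of-lists data layout are assumptions, and the way [20] scales or combines these quantities into fitness values may differ.

```python
def bicluster_scores(data, genes, conds):
    """Return (volume, mean squared residue, mean row variance).

    Assumes the usual Cheng-and-Church style definitions; this is a sketch,
    not the fitness functions of [20].
    """
    sub = [[data[i][j] for j in conds] for i in genes]
    n_r, n_c = len(sub), len(sub[0])
    row_mean = [sum(row) / n_c for row in sub]
    col_mean = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    all_mean = sum(row_mean) / n_r
    # Residue of a cell: value minus row mean minus column mean plus overall mean.
    msr = sum(
        (sub[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in range(n_r) for j in range(n_c)
    ) / (n_r * n_c)
    # Row variance: squared deviation of each cell from its own row mean.
    row_var = sum(
        (sub[i][j] - row_mean[i]) ** 2
        for i in range(n_r) for j in range(n_c)
    ) / (n_r * n_c)
    return n_r * n_c, msr, row_var


# A perfectly additive matrix (row effect plus column effect) has zero residue.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
volume, msr, row_var = bicluster_scores(toy, [0, 1, 2], [0, 1, 2])
```

In this toy case the residue vanishes while the row variance stays positive, which is the kind of bicluster the three objectives jointly favour.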

### Update of ϵ-Pareto set of the population

In order to guarantee convergence and maintain diversity in the population at the same time, we update the ϵ-Pareto set of the population during the clonal selection operation. A general scheme of the updating algorithm is given in [19].
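
To give an intuition for the ϵ-dominance relation, the sketch below shows an additive ϵ-dominance test and a deliberately simplified archive update. It assumes that all objectives are minimized and that a single ϵ is shared by all objectives; the names `eps_dominates` and `update_archive` are hypothetical, and the sketch is not the updating scheme of [19].

```python
def eps_dominates(a, b, eps):
    """True if a additively eps-dominates b (all objectives minimized)."""
    return all(x - eps <= y for x, y in zip(a, b))


def update_archive(archive, candidate, eps):
    """Simplified update: reject an eps-dominated candidate, otherwise
    insert it and drop the members that it eps-dominates."""
    if any(eps_dominates(member, candidate, eps) for member in archive):
        return archive
    kept = [m for m in archive if not eps_dominates(candidate, m, eps)]
    return kept + [candidate]


archive = []
archive = update_archive(archive, (1.0, 5.0), eps=0.1)
# Rejected: lies within eps of an existing member in every objective.
archive = update_archive(archive, (1.05, 5.02), eps=0.1)
# Accepted: a genuinely different trade-off between the two objectives.
archive = update_archive(archive, (0.5, 6.0), eps=0.1)
```

The rejection of near-duplicate points is what bounds the archive size and keeps its members spread out.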

### Immune response principle

An immune system is a collection of biological processes within an organism that protects against disease by identifying and killing pathogens and tumour cells. It can detect a wide variety of viruses and parasitic worms and distinguish them from the organism's own healthy cells and tissues in order to protect the organism. It is highly distributed, highly adaptive and self-organizing in nature [21]. The Artificial Immune System (AIS) is a new computational approach for the computational intelligence community. It has been widely applied in areas such as pattern recognition, data analysis, function approximation and optimization.

The immune selection principle [22] describes the basic properties of an adaptive immune response to an antigenic stimulus [21]. When applied to a multi-objective problem, the immune selection principle can generate several elements of the Pareto optimal set in a single run. The clonal selection operation implements local search in many different directions along the Pareto front. The mutation operator explores the whole search space in order to reach the exact Pareto front of the problem.

### DMOIO biclustering algorithm

Multi-objective optimization aims at the following two competing objectives: 1) to quickly obtain a non-dominated front that is close to the true Pareto front and 2) to maintain the diversity of the solutions along the resulting Pareto front. These two objectives conflict with each other because maintaining diversity slows down convergence and may degrade the quality of the resulting Pareto front. On the one hand, MOIO algorithms tend towards the optimal regions. On the other hand, the clonal selection behaviour may lead to premature convergence in the search space and produce a uniformly distributed Pareto front. The population size influences the performance of MOIO through its computational cost. It is difficult for a MOIO with a fixed population size to deal with this conflict because a predetermined computational resource has to be allocated and properly distributed between the two competing objectives. Hence, inspired by [18], this paper dynamically adjusts the population size during biclustering of the microarray datasets, so that the search space is explored in balance between the two competing objectives.

### Initial population

In most multi-objective optimization methods the initial archive is empty, and the first archive contains the non-dominated solutions of the initial population. Each antigen selects its best local guide from the archive members using the Sigma method [23]. The selection of the first local guides from the archive has a great impact on the diversity of solutions in the following generations, so the diversity of solutions depends on the first non-dominated solutions. If, however, the initial archive is not empty and contains some well-distributed non-dominated solutions, the solutions converge faster while keeping good diversity. There are two ways to find a good initial archive. The first is to run the MOIO with an empty archive for a large population and a few generations. The large population provides good diversity, and the few generations (e.g., 10 generations) bring the population to a slight degree of convergence. Alternatively, a MOEA can produce some good solutions with very good diversity after a few generations, so the second way is to use the results of a small MOEA. Here, small means a MOEA with few individuals and few generations (e.g., 10 individuals and 10 generations). This paper first runs a state-of-the-art MOEA (NSGA-II [24]) with 30 individuals and 10 generations to produce the initial archive of DMOIOB.

### Finding the global best solution

In order to find the global best solutions, this paper uses the basic idea of the Sigma method [23] and, considering the objective space, finds the best local guide p_{g} among the archive members for antigen *i* of the population as follows. In the first step, we assign the value σ_{j} to each antigen *j* in the archive. In the second step, σ_{i} for antigen *i* of the population is calculated, and then the distance between σ_{i} and σ_{j} is calculated for all *j* = 1, ⋯, |A|. Finally, the antigen *k* in the archive A whose σ_{k} has the minimum distance to σ_{i} is selected as the best local guide for antigen *i*. Therefore, p_{g} = x_{k} is the best local guide for antigen *i*. In other words, each antigen selects as its best local guide the archive member whose sigma value is closest to its own. In a two-dimensional objective space, closer refers to the difference between the sigma values; in an *m*-dimensional objective space, it refers to the *m*-dimensional Euclidean distance between the sigma values. Algorithm 1 shows the Sigma method for finding the best local guide p_{g} for antigen *i* of the population [23]. Here, the function Sigma calculates the σ value and dist computes the Euclidean distance. y_{i} denotes the objective value of the *i*th element of the antigen population *P*.
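
The guide selection can be sketched in Python as follows. The sketch assumes the usual form of the sigma value, σ = (f_1² − f_2²)/(f_1² + f_2²) for two objectives, generalized to a vector of pairwise differences for more objectives, and it assumes a non-empty archive. The names `sigma` and `best_local_guide` are chosen for illustration and are not taken from Algorithm 1.

```python
import math
from itertools import combinations


def sigma(f):
    """Sigma vector of an objective vector f.

    Assumed form: one entry (f_a^2 - f_b^2) / sum(f^2) per pair of objectives;
    for two objectives this reduces to a single value.
    """
    total = sum(v * v for v in f)
    if total == 0:
        return tuple(0.0 for _ in combinations(f, 2))
    return tuple((a * a - b * b) / total for a, b in combinations(f, 2))


def best_local_guide(y_i, archive):
    """Index k of the archive member whose sigma is closest to sigma(y_i)."""
    s_i = sigma(y_i)
    dists = [math.dist(s_i, sigma(y_j)) for y_j in archive]
    return min(range(len(archive)), key=dists.__getitem__)


# Objective vectors of three archive members (two objectives each).
archive = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
k_balanced = best_local_guide((2.0, 1.9), archive)  # nearly equal objectives
k_skewed = best_local_guide((3.0, 0.1), archive)    # dominated by objective 1
```

An antigen with nearly equal objective values is guided by the archive member on the diagonal, while one with a dominant first objective is guided by the member lying along that axis.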

### Population adding method

The population adding strategy mainly consists in increasing the population size to ensure that a sufficient number of individuals contribute to the search process, and in placing the new individuals in unexplored areas to discover new possible solutions. Based on the strategies of dynamic population size [18], the following procedure is proposed to facilitate the exploration and exploitation capabilities of DMOIOB.

**Step 1:** Selecting candidate antibodies to be added

The non-dominated set, considered as the set of candidate antibodies, must have the highest probability of generating new antibodies that will improve the convergence toward the Pareto front. Therefore, a number of potential antibodies, determined via *n*_{s} = INT(*r*_{1} × (total no. of antibodies in the non-dominated set)), is randomly selected from the non-dominated set, where *r*_{1} denotes a random number drawn from a uniform distribution within [0, 1].
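
Step 1 can be sketched in a few lines of Python. The function name `select_candidates` and the use of sampling without replacement are assumptions; the text only states that the selection is random.

```python
import random


def select_candidates(nondominated, rng=random):
    """Pick ns = INT(r1 * |non-dominated set|) antibodies at random.

    Assumption: antibodies are sampled without replacement.
    """
    r1 = rng.random()                  # uniform random number in [0, 1)
    ns = int(r1 * len(nondominated))   # INT truncates towards zero
    return rng.sample(nondominated, ns)


rng = random.Random(0)  # seeded generator, only to make the example repeatable
picked = select_candidates(list(range(10)), rng)
```

Note that *n*_{s} can be zero when *r*_{1} is small, in which case no antibody is selected in that iteration.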

**Step 2:** Defining the number of mutations

The number of mutations of a selected antibody is determined adaptively at every iteration. Each selected antibody is responsible for generating a certain number of new antibodies. A probability value is used to determine the number of perturbations adaptively, and the number of mutations (the number of new antibodies to be generated) is bounded by a minimum and a maximum number of mutations.

**Step 3:** Limiting the range of new antibodies

In the proposed algorithm, to balance the exploitation and exploration capabilities and to avoid generating too many new antibodies far away from the selected antibodies, a higher number of new antibodies is generated within the neighbourhood than outside it, similar to [16].

### Population decreasing method

To prevent excessive growth of the population, a population decreasing strategy similar to [16] is proposed to control the population size adaptively. In DMOIOB, the condition for removing an antibody depends on Sigma values, which are used to select potential antibodies to be deleted. After computing the distance between the Sigma value of each antibody and the Sigma value of its corresponding best local guide, the rank of the distance of each antibody can be obtained. If the removal of antibodies were based only on the distance rank of each antibody, an excessively large number of antibodies might be eliminated, some of which may carry unique schemata that contribute to the search process. A selection ratio is therefore implemented to regulate the number of antibodies to be removed and to provide some degree of diversity preservation at the same time. The selection ratio, inspired by Coello and Montes [25], is used to stochastically allocate a small percentage of antibodies in the population for removal. Hence, given a selection ratio *S* ∊ [0, 1], at iteration *t* the number of antibodies to be eliminated is *S*×|*P*_{t}|. The choice of the selection ratio depends on the user’s preference: it can be a function of the population size or a fixed ratio. In this paper, the selection ratio is a fixed small number, i.e., *S* ≤ 0.2. With a small selection ratio, it is possible that some selected antibodies in *P*_{t} are not eliminated. In other words, some of the selected antibodies in *P*_{t} whose rank indicators are low may remain in the next iteration. In addition, a small selection ratio prevents the removal of an uncontrollably large number of antibodies while providing some degree of diversity preservation. This paper sets *S* = 0.02.
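
A rough Python sketch of the size control is given below. It rests on two assumptions that the text does not settle: that a larger Sigma distance to the best local guide means a worse rank, and that the INT(*S*×|*P*_{t}|) worst-ranked antibodies are removed deterministically, whereas the described procedure is stochastic. The function name `shrink_population` is hypothetical.

```python
def shrink_population(population, distances, ratio=0.02):
    """Drop INT(ratio * |P_t|) antibodies with the largest guide distance.

    Assumptions (not confirmed by the text): larger distance means a worse
    rank, and removal is deterministic rather than stochastic.
    """
    n_remove = int(ratio * len(population))
    order = sorted(range(len(population)),
                   key=lambda i: distances[i], reverse=True)
    doomed = set(order[:n_remove])
    return [p for i, p in enumerate(population) if i not in doomed]


# Toy run with an exaggerated ratio so that the effect is visible.
survivors = shrink_population(["a", "b", "c", "d"], [0.1, 0.9, 0.3, 0.7], ratio=0.5)
# With S = 0.02 and 20 antibodies, INT(0.4) = 0, so nothing is removed.
unchanged = shrink_population(list(range(20)), [0.0] * 20)
```

The second call shows why a small selection ratio is conservative: for small populations it may remove no antibody at all.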

### DMOIOB algorithm

We propose a dynamic MOIO biclustering algorithm (DMOIOB) to mine biclusters from microarray datasets and attain the global optimum solutions. It incorporates the following four strategies: 1) ϵ-dominance to speed up convergence; 2) the Sigma method to find good local guides; 3) a population-growing strategy that increases the population size to promote exploration capability; and 4) a population-declining strategy to prevent the population size from growing excessively.

The pseudo-code of the proposed DMOIOB algorithm is given in Algorithm 2.

The DMOIOB algorithm iteratively updates the antigen population until a user-defined number of generations has been generated, and finally converges to the optimal solution.