 Research
 Open access
 Published:
SEEI: spherical evolution with feedback mechanism for identifying epistatic interactions
BMC Genomics volume 25, Article number: 462 (2024)
Abstract
Background
Detecting epistatic interactions (EIs) involves the exploration of associations among single nucleotide polymorphisms (SNPs) and complex diseases, which is an important task in genomewide association studies. The EI detection problem is dependent on epistasis models and corresponding optimization methods. Although various models and methods have been proposed to detect EIs, identifying EIs efficiently and accurately is still a challenge.
Results
Here, we propose a linear mixed statistical epistasis model (LMSE) and a spherical evolution approach with a feedback mechanism (named SEEI). The LMSE model expands the existing single epistasis models such as LRScore, K2Score, Mutual information, and Gini index. The SEEI includes an adaptive spherical search strategy and population updating strategy, which ensures that the algorithm is not easily trapped in local optima. We analyzed the performances of 8 random disease models, 12 disease models with marginal effects, 30 disease models without marginal effects, and 10 highorder disease models. The 60 simulated disease models and a real breast cancer dataset were used to evaluate eight algorithms (SEEI, EACO, EpiACO, FDHEIW, MPHSDHSI, NHSADHSC, SNPHarvester, CSE). Three evaluation criteria (pow1, pow2, pow3), a Ttest, and a Friedman test were used to compare the performances of these algorithms. The results show that the SEEI algorithm (order 1, averages ranks = 13.125) outperformed the other algorithms in detecting EIs.
Conclusions
Here, we propose an LMSE model and an evolutionary computing method (SEEI) to solve the optimization problem of the LMSE model. The proposed method performed better than the other seven algorithms tested in its ability to identify EIs in genomewide association datasets. We identified new SNP–SNP combinations in the real breast cancer dataset and verified the results. Our findings provide new insights for the diagnosis and treatment of breast cancer.
Availability and implementation: https://github.com/scutdy/SSO/blob/master/SEEI.zip.
Background
Complex disorders are defined as diseases that are driven by more than one genetic factor, including multiple genes, or gene–gene or gene–environment interactions [1]. Highthroughput genotyping technologies have led to the rapid development of genomewide association studies (GWAS), and highrisk loci of many complex disorders have been identified [2]. However, the reported loci explain only part of the genetic mechanism, and the interactions of single nucleotide polymorphisms (SNPs) or epistasis are considered to be important reasons for the “missing” heritability [3]. Epistatic interactions (EIs) are now considered to be one of the important genetic bases for the occurrence and development of complex diseases [4]. However, studying EIs is challenging because of the large number of SNPs that need to be tested and the huge amount of computation that is required [2]. Therefore, an effective and efficient epistasis detection method for 2locus or multilocus SNPs in the whole genome will be of great significance.
Many EI detection methods have been proposed. These methods can be classified as exhaustive search, stochastic search, machine learning, and metaheuristic methods. The exhaustive search method systematically lists all possible candidates and explores all possible SNP combinations [5]. Multifactor dimensionality reduction (MDR) [6] is an exhaustive search method that simplifies highdimensional genotype combination data into a single dimension. MDR is a modelfree method that can identify gene–gene interactions in highorder data [7,8,9]. Extensions of MDR, such as MDRER [10], CoxMDR [11], and MOMDR [12], have been proposed. Stochastic search methods such as random search and filtering methods also perform well. Random search methods include Bayesian epistatic association mapping (BEAM) [13]. Filtering methods reduce the computational burden by discarding large numbers of SNPs, and include EpiMiner [14], weighted heuristic anytime search [15], FDHEIW [16], LRMW [17], BADTrees [18], SNPRuler [19], ensemble learningbased approach (ELSSI) [20], EpiReSIM [21], MDSN [22], EpiMC [23], and modeling epistatic interaction (MEI) [24]. These methods sometimes screen out potentially meaningful single SNPs or SNP–SNP interactions, which reduces their detection power. Machine learning methods include random forest, support vector machine, and neural network [25, 26]. Methods based on machine learning generally have the problem that the results are difficult to interpret [27]. Metaheuristic methods (swarm intelligence algorithms) based on Darwin’s theory of natural selection aim to imitate the behavior of various living beings in nature to find the optimal solution of complex problems. Metaheuristic algorithms have become increasingly popular in solving EI detection problems. They include the cuckoo search epistasis (CSE) algorithm [28], MACOED [29], epiACO [30], MPHSDHSI [31], the extended ant colony optimization (EACO) algorithm [32], NHSADHSC [33], EIMOABC/D [34], Intelligent PrivacyPreserving (IPP) scheme [35], GEPEpiSeeker [36], SFMOABC [37], EpiMOGA [38], and multiobjective evolutionary computation (MEC) [39].
Many metaheuristic methods aim to solve a specific optimization epistasis model (singleobjective model or multiobjective model), and therefore cannot be applied widely to different optimization epistasis models. For example, the CSE algorithm uses only the K2 score model, the epiACO algorithm uses only a mixed model of K2 score and mutual information (MI), the EACO algorithm combines only MI and the Gini index (GINI), the EIMOABC/D algorithm uses only the K2 score and GINI as a multiobjective model, and the GEPEpiSeeker algorithm uses only the K2 score model. These optimization epistasis models have their own advantages and disadvantages when they are applied to EI problems on different datasets. Therefore, a metaheuristic method that can be widely applied in multiple optimization epistasis models is needed to increase the generalization ability of the algorithms. When using metaheuristic methods, the balance between exploitation and exploration is key to ensure that the algorithm escapes from the local optima and ultimately finds the global optimal solution. Furthermore, most metaheuristic algorithms search in hypercube space (e.g., differential evolution, particle swarm optimization, artificial bee colony), which limits their search ability. In 2019, we proposed a spherical evolution algorithm that uses a super spherical search style that has a larger search space than the hypercube search style [40]. However, the standard spherical search algorithm can easily fall into a local optimum because it lacks a systematic feedback mechanism. To address these issues, we implemented the following innovations to solve the EI detection problem.

1.
Develop a spherical evolution framework that uses the spherical search approach based on a feedback mechanism. Circle and spherical searches are executed alternately to solve any korder (k ≥ 2) SNP combination problem. The individual arrangement and parameter adjustment of the search are dependent on a feedback mechanism.

2.
Develop the feedback mechanism by spherical search scale parameter adjustment and a population updating strategy. The spherical search scale parameter is updated according to the winning ability of each individual for each generation. The excellent individuals in each generation are saved in a set and the population size can be decreased by linear reduction.

3.
Apply a linear mixed optimization epistasis (LMOE) model as the fitness function. The LMOE model simplifies a multiobjective problem as a singleobjective problem.
We tested the proposed spherical evolution approach with a feedback mechanism algorithm that we called SEEI on the single optimization epistasis model (singleobjective optimization problem) and the LMOE model (multiobjective optimization problem) and verified its performance and generalization ability.
The SEEI algorithm (Fig. 1) has four main components: spherical search, fitness function, spherical search factor, and population updating strategy. The spherical search components (Xi, Xbest, Xr1, Xr2) are used for two spherical searches. The fitness function uses the LMOE model, which can be used for singleobjective and twoobjective optimization models because it converts twoobjective optimization problems to singleobjective optimization problems. The spherical search factor can be considered as an adaptive parameter adjustment strategy in which the search factor parameter controls the spherical search scale; large scale values will make the search for large search space, whereas small scale values will make the search for small search space. Therefore, adaptive adjustment of the search scale is a key task of the SEEI algorithm.
Experimental results and analysis
We tested the performance of the SEEI approach on 60 simulated datasets in various types of disease models and a real breast cancer dataset, and compared its performance with those of seven stateoftheart algorithms, namely CSE [28], EACO [32], EpiACO [30], FDHEIW [16], MPHSDHSI [31], NHSADHSC [33], and SNPHarvester [41], on the same datasets. To ensure the fairness of the comparisons, we used the parameter settings of the algorithms from the original articles. Details of each algorithm are provided in Supplementary Files. To fully evaluate the performance of these algorithms in different disease models, three evaluation criteria were used, namely \(Power1\), \(Power2\), and \(Power3\). \(Power1\) is defined as \(Power1=\frac{\#S}{\#T}\), where #T is the number of datasets generated by the same model (#T = 100 in our study), #S is the number of pathogenic datasets found from #T datasets. (The best SNP combination can be successfully detected in the dataset.) \(Power2\) uses the Gtest statistical method [33, 42] to test the significance level of candidate SNP combinations under \(Power1\). Gtest is a likelihood ratio test asymptotically similar to Pearson’s chisquare and superior to the approximation to the theoretical chisquare distribution. \(Power3\) uses the balanced accuracy [7] and predictive error rate [43] of the MDR classifier under \(Power1\). The three evaluation criteria were used to evaluate the ability of SEEI and the seven selected optimization algorithms to detect EIs.
Parameter setting for the eight algorithms tested
For a fair evaluation of the comparisons, we used the running number of the fitness function with MAX_FES set as the maximum running number of the fitness function. The parameters of the eight algorithms were from the original papers. MAX_FES was set as: 2order MAX_FES = 20000, 3order MAX_FES = 20000 × 3, and 4order MAX_FES = 20000 × 4. The parameter setting of each algorithm are as follows:

(1)
EACO: AntNumber = 200, evaporation = 0.3, Tau0 = 1, Alfa = 1, Beta = 1. AntNumber is the number of ants in the population, the initial pheromone Tau0 was set to 1, and Alfa and Beta, which determine the weights of pheromone and heuristic information, were set to 1. The evaporation coefficient of pheromones was set to 0.3.

(2)
MPHSDHSI: HMS = 10 for each HM, HMCR = 0.98, PAR = 0.35, threshold pvalue for Gtest = 1/Cnk, pre_error_rate = 0.45. Harmony memory (HM) is composed of harmonies. Harmony memory size (HMS) is the population size, harmony memory considering rate (HMCR) is the crossover rate, pitch adjusting rate (PAR) is the selection rate for different dimensions in each individual, and pre_error_rate is the MDR parameter.

(3)
NHSADHSC: HMS = 50, HMCR = 0.95, PAR = 0.35. Harmony memory (HM) is composed of harmonies. HMS is the population size, HMCR is the crossover rate, and PAR is the selection rate for different dimensions in each individual.

(4)
epiACO: AntNumber = 200; τ0 = 1, η = 1, α = 1, β = 1, evaporation coefficient ρ = 0.2, constant ξ = 0.3. AntNumber is the number of the ants in the population, τ0 is the pheromones of the path from position i to position j at iteration t. The heuristic information of the path from position i to position j is denoted as η. α and β are parameters that control the importances of pheromones and heuristic information respectively. The evaporation coefficient of pheromones was set to 0.2. Constant ξ is set as 0.3.

(5)
FDHEIW: K = epi_num, Candatiesize = 2 × epi_num. K is the number of SNP combination and Candatiesize is the number of candidate solution sets.

(6)
SNPHarvester: SuccessiveRun = 50. SuccessiveRun is the number of searches in each iteration.

(7)
CSE: Fraction of eggs discarded each generation = 0.25, maximum number of steps to take in a levy flight = 1, number of groups = 5, and number of nests = 100 (population size).

(8)
SEEI: pop_size = 50, p_best_rate = 0.11, arc_rate = 2.6, memory_size = 6, min_pop_size = 4. pop_size is the number of the population, p_best_rate is the rate of the best solutions in the population, arc_rate is the number of a population of outstanding individuals in history, memory_size is the search factor number in memory, and min_pop_size is the population size at the end of the final iteration.
Experiments on simulated datasets
To comprehensively compare the eight algorithms (SEEI, EACO, EpiACO, FDHEIW, MPHSDHSI, NHSADHSC, SNPHarvester, CSE), we built 12 disease models with marginal effect, 30 disease models without marginal effect, and 8 random disease models. Details of the disease models are provided in Supplementary files.
Disease models with marginal effect (DMEs)
Twelve DMEs were used to evaluate the performances of SEEI and the seven other algorithms for detecting EIs [9, 44, 45]. The DMEs were designed according to the interaction structure with different diseases. Details of the multilocus penetrances are presented in Table S1. The heritability (h2) values were between 0.031 and 0.008. In each disease model, 100 datasets were randomly generated using GAMETES software, which can generate datasets containing a specific twolocus EI with random architectures. Each dataset included an interacting SNP pair (M0P0–M1P1) that was generated according to the disease model setting, and other SNPs are generated with minimum allele frequency (MAF) selected uniformly in (0.05, 0.5). For each DME, we simulated 100 replicate datasets with sample sizes 800 and balanced cases and controls (DME1–DME6), and sample sizes of 1600 and balanced cases and controls (DME7–DME12), and a total of 1000 SNPs. Power1, Power2, and Power3 values for the eight algorithms in the 12 DMEs were shown in Fig. 2. SEEI had higher Power1, Power2, and Power3 than the other seven algorithms in the 12 DMEs, indicating that the search ability of SEEI was better than that of the other seven algorithms. We also verified the robustness of the SEEI algorithm and found that it outperformed the other seven algorithms when the number of samples was increased from 800 (DME1–DME6) to 1600 (DME7–DME12). Indeed, the performances of all the other algorithms, except for that in DME10, decreased when the number of samples was 1600. NHSADHSC and MPHSDHSI outperformed all the other algorithms, except SEEI.
Disease models without marginal effect (DNMEs)
Thirty 2locus and pure disease models without marginal effects (DNMEs) were from a previous study [19]. The simulated datasets were generated with various parameter settings (heritability (h2) and MAF) using GAMETES [44]. Each dataset contained a specific 2locus interacting SNP pair (M0P0–M1P1) with random architectures. The details of the multilocus penetrances are presented in Table S2. The h2 values controlled the phenotypic variation of all disease models and ranged from 0.025 to 0.2, and the MAFs were 0.2 and 0.4. Each disease model was generated using 100 datasets consisting of 1000 SNPs, two of which (M0P0 and M1P1) were the specific SNPs. The other SNPs were generated with MAFs selected uniformly in (0.05, 0.5). For each selected DNME, we simulated 100 replicate datasets with sample sizes of 800 and balanced cases and controls, and a total of 1000 SNPs. We used 30 different DNMEs to fully test the detection ability of the eight algorithms. The performances of SEEI and the other seven algorithms in the 2order DNMEs (DNME1–DNME30) are shown in Fig. 3. The SEEI algorithm outperformed the other seven algorithms, achieving the highest power among the DNMEs. These results again demonstrate the effectiveness of the SEEI algorithm in screening for significant SNP combinations.
Random disease models
We used GAMETES to generate 100,000 random, strict, and pure disease models for each of the different combinations of genetic constraints that were obtained using different twolocus interacting SNP pairs (M0P0–M1P1); h2 values of 0.001, 0.025, 0.05, and 0.1, and MAFs of 0.2 and 0.4, with a varying population prevalence. For each setting, 100,000 disease models were ranked according to the ease of detection measure (EDM), and the eight disease models with the lowest EDM values were selected as the random disease models for data simulation (Random1–Random8). For each selected disease model, we simulated 100 replicate datasets with sample sizes 800 with balanced cases and controls and a total of 1000 SNPs. Each dataset contained one pair of highly interactive SNPs (M0P0–M1P1), and the other SNPs were generated with MAFs selected uniformly in (0.05, 0.5). Power1, Power2, and Power3 were higher for SEEI than they were for the other seven algorithms (Fig. 4), except in Random2, where Power1 and Power3 were smaller for SEEI than they were for MPHSDHSI and NHSADHSC. In Random1, the power of the eight algorithms was < 0.4, indicating that many algorithms were unable to detect the best SNP combination. For SEEI and NHSADHSC, Power1 was > 0.3 and < 0.2, respectively, implying that SEEI significantly outperformed NHSADHSC. In Random2, only SEEI, MPHSDHSI, and NHSADHSC detected the best SNP combination.
Highorder disease models (HODMs)
We selected the additive models in literature [46] that define EIs with marginal effects. All the simulation configurations and datasets are publicly available at Github. (https://github.com/UDCGAC/epistasissimulationdata). Details of the multilocus penetrances are presented in Table S3. The criteria to create penetrance tables of third and fourthorder were MAF values of 0.10, 0.25, and 0.40, and h2 values of 0.10, 0.25, 0.50, and 0.80 with prevalence > 1E − 06. From each penetrance table, 100 datasets were generated containing 500 SNPs from 2000 individuals (1000 cases and 1000 controls). We generated 10 simulated datasets; six for 3order models and four for 4order models. The performances of SEEI and the other seven algorithms on the 3locus and 4locus models are shown in Fig. 5. Most of the methods produced the highest power value (i.e., 1.0) on the highorder data models, confirming that these HODMs easily detected SNP–SNP interactions. SEEI, FDHEIW, MPHSDHS, and SNPHarvester produced the best performances, whereas CSE performed the worst. These results show that among the 3locus and 4locus models, SEEI was highly effective in detecting highorder interactions.
Comparison of the different disease models
To comprehensively validate the performance of our proposed method, the four disease models, DMEs, DNMEs, Random models, and HODMs were used to compare SEEI and the seven stateoftheart algorithms, EACO, MPHSDHSI, NHSADHSC, epiACO, FDHEIW, SNPHarvester, and CSE. In the 12 DME models (Fig. 2), SEEI, NHSADHSC, MPHSDHSI, and FDHEIW had better pow1, pow2, and pow3 values than the other algorithms. In addition, the power values of SEEI increased when the number of samples was increased from 800 (DME1–DME6) to 1600 (DME7–DME12). Conversely, the power values of some methods, such NHSADHSC, decreased when the number of samples was increased from 800 (DME1–DME6) to 1600 (DME7–DME12). In the 30 DNME models, SEEI, NHSADHSC, MPHSDHSI, epiACO, and EACO had the best pow1, pow2, and pow3 values (Fig. 3). The epiACO and EACO algorithms had better power values in the DNME models than they had in the DME models, whereas the FDHEIW had better power values in DME models than it had in DNME models. In the eight random disease models, CSE, FDHEIW, and SNPHarvester had very low power values and failed to detect the epistatic interaction pairs, whereas SEEI, NHSADHSC, MPHSDHSI, epiACO, and EACO had better power values (Fig. 4). Only the SEEI, NHSADHSC, and MPHSDHSI algorithms had high stable power values in the 2order DME, DNME, and random disease models. In the 10 highorder disease models (HODMs), only SEEI, FDHEIW, MPHSDHS, and SNPHarvester had high power values (Fig. 5). The detection ability of some methods, such as NHSADHSC, decreased from the 2order models to the high order models, whereas the detection ability of FDHEIW, MPHSDHS, and SNPHarvester increased from the 2order models to the high order models. This result indicates that FDHEIW, MPHSDHS, SNPHarvester, and NHSADHSC are more suitable for certain disease models. SEEI performed better than all the other seven methods in the 2order and a high order models, demonstrating that SEEI has strong robustness and can be used for many different types of disease models.
Comprehensive analysis of the 60 disease models
The heap map (Fig. 6) shows the detection powers of the seven algorithms and the SEEI algorithm with 10 epistasis optimization methods in the 60 disease models. SEEI with 10 epistasis optimization methods outperformed the other seven algorithms in the 50 2order disease models, with the exception of MPHSDHS and NHSADHSC in the Random2 model, which outperformed the other methods. In the 10 highorder disease models, SEEI (with 9 of the epistasis optimization methods), FDHEIW, MPHSDHS, and SNPHarvester outperformed the other algorithms. The results also show that the SEEI algorithm had the best performance in the 60 disease models with not only the single epistasis optimization models but also with the linear maxed epistasis optimization models.
The Power1 values for the eight algorithms, SEEI, EACO, EpiACO, FDHEIW, MPHSDHSI, NHSADHSC, SNPHarvester, and CSE, in the 60 disease models by Ttest are shown in Table 1. The results show that SEEI outperformed the other seven algorithms. The average ranks of the 17 algorithms by Friedman test [47] are listed in Table 2. The detection power of the 10 SEEI epistasis optimization methods outperformed those of the other seven algorithms (pvalue = 7.56E − 12; Table 3), showing that SEEI had better robustness and scalability than the other methods. This finding suggests that the SEEI algorithm may be applicable to other epistasis optimization models.
Because of the long running time required for highorder data, the gap in the running times between the highorder disease models for the other disease models was very large, and therefore we used the logarithm of the running times of all the algorithms for a better comparison. Compared with the other seven algorithms, the SEEI algorithm (MI) had the shortest running time in most of the disease models (Fig. 7). Although the running time of the SEEI (MI) algorithm was slightly longer than that of the SNPHarvester algorithm in DME3, DME5, Random43, Random44, and Random50, and slightly longer than the MPHSDHSI and SNPHarvester algorithms in the highorder disease models, the average running time of SEEI was the shortest among the 60 disease models, implying that the SEEI algorithm was more adaptable to different disease models and had a lower computational burden than the other algorithms. We believe that the SEEI algorithm has promise in detecting diseasecausing SNP combinations and has a broad future.
Experiments on a real dataset
We used the breast cancer (BC) dataset of the Welcome Trust Case Control Consortium (WTCCC) project as the real dataset to evaluate the ability of SEEI to detect EIs [48]. Mutual information (MI) was used as the fitness function of SEEI. BC is a complex disease and its etiology is not fully understood [49]. The BC dataset included 15,436 SNPs from 1045 patients with BC and 1438 normal individuals from the 1958 birth cohort. Samples with SNP genotype deletion rates ≥ 2% were excluded, and for a SNP, if the genotype deletion rate of all the samples was ≥ 5%, or its pvalue (Hardy–Weinberg equilibrium) was < 0.0001 in the controls, or the MAF was < 0.1, the SNP was excluded. After quality control, 3386 SNPs from 1045 patient and 1329 control samples in the BC dataset were used in this study.
The population size of SEEI was set to 400 and the maximum number of iterations was set to 5000. The SEEI algorithm identified some 2way and 3way SNP combinations that may be associated with BC (Table 4 and Fig. 8). The single SNP (rs1402954) that was detected most frequently in the twoway SNP combinations is located in FBXO3 on chromosome 11. FBXO3 is involved in ΔNp63α degradation to empower TGFβ signaling in promoting tumor metastasis and the TβRI–FBXO3–ΔNp63α axis is critically important in BC development and its clinical prognosis [50]. SNP rs2290501 is located in HSPG2 on chromosome 1. HSPG2 is involved in tumor development and progression, and, in BC, HSPG2 was fragmented or completely lost in epithelial basement membrane, with a marked increase in HSPG2 abundance in the stroma [51]. SNP rs11211247 is located in MAST2 on chromosome 1. Gene rearrangements in the microtubuleassociated serinethreonine kinase (MAST) gene family have been identified in 5%–7% of invasive BCs [52]. SNP rs3765966, which is located in CA6 on chromosome 1, was detected more than once in 3way SNP combinations. CA6 is highly expressed by many BCs [53]. SNP rs6662382 was also detected in 3way SNP combinations. The relationship of rs6662382 with BC has not been reported so far, implying that this may be a new SNP combination associated with BC. The SNP interaction network for BC is shown in Fig. 8.
Discussion
We proposed a spherical evolution with feedback mechanism method called SEEI to effectively detect 2order and highorder epistasis from genomewide case–control data. To verify the robustness and scalability of SEEI, we used a linear mixed statistical epistasis model that converts the multiobjective problem to a singleobjective problem. SEEI was compared with seven stateoftheart algorithms (EACO, EpiACO, FDHEIW, MPHSDHSI, NHSADHSC, SNPHarvester) for four single epistasis models (LR, K2, MI, GINI) and six linear mixed statistical epistasis models (LRK2, LRMI, LRGINI, K2MI, K2GINI, MIGINI). We found that SEEI outperformed the other seven comparison methods for detection accuracy on simulated and real datasets. Furthermore, SEEI showed strong robustness and scalability, indicating that it can be used for both single and linear mixed statistical epistasis models. The proposed SEEI method performed well in identifying EIs, in particular, in solving multiple different epistasis optimization models, indicating that the SEEI algorithm has good generalization ability. However, the good performance of SEEI in all disease models or all real datasets cannot be guaranteed. We found that the SEEI algorithm still falls into local optima when tackling different kinds of data. To overcome these problems, further focus on the feedback mechanism of metaheuristic methods, multisubpopulation search methods, and the ensemble learning framework for different algorithms may help to greatly improve the generalization ability of algorithms in the future.
Methods
Related work
In the last four years, the EI identification problem has developed rapidly. The main focus has been on how to establish an epistasis interaction model and how to solve the model. Stochastic search methods and metaheuristic methods are still hot topics in current research. Some models and search approaches for stochastic search methods have been proposed. Sun et al. [22] proposed a module detection method in which a SNP network was constructed and a node was comprehensively evaluated by the topological characteristics of the neighborhood. Wang et al. [23] proposed a matrix factorization based multiple clustering algorithm to generate multiple diverse clusters and then used Jaccard similarity to obtain candidate sets. David et al. [24] proposed a model that scores candidate SNP sets by computing maximum likelihood distribution for the observed phenotypes. Different from stochastic search method, metaheuristic method is the general term for several optimization algorithms that are inspired by the Darwinian principles of nature’s capability to evolve organisms that are well adapted to their environment. Because of its strong search ability, many metaheuristic methods have been used to solve the EI identification problem. Tuo et al. [31] proposed a multipopulation harmony search (HS) algorithm and multiple criteria (Bayesian networkbased K2score, JensenShannon (JS) divergence, likelihood ratio (LR) and normalized distance with joint entropy (NDJE)) were adopted by four harmony memories to improve its ability to discriminate diverse disease models. Sun et al. [32] detected epistatic interactions based on the ant colony optimization (ACO) algorithm, the highlights of which were the introduced heuristic information, fitness function, and a candidate solutions filtration strategy. Two functionally complementary fitness functions, mutual information, and the Gini index, were combined to effectively evaluate the associations between SNP combinations and the phenotype. Li et al. [34] proposed and formulated an epistatic interaction multiobjective artificial bee colony algorithm based on decomposition (EIMOABC/D) to address those problems for genetic interaction detection in GWAS. First, to direct the genetic interaction detection, two objective functions were formulated to characterize various epistatic models, and a rank probability model was proposed to sort each population into different nondomination levels based on the fast nondominated sorting approach. Wang et al. [35] proposed a method based on the ACO algorithm to detect SNP–SNP interactions for GWAS, an IPP scheme. Initially, a multiobjective search algorithm was designed to discover candidate SNP sets related to a disease phenotype, which used a Differential Privacy method by disturbing the multiobjective function to construct a rational epistatic privacy protection strategy. Peng et al. [36] proposed several tailormade chromosome rules to describe SNP combinations, and incorporated Bayesian networkbased fitness evaluation into the evolution of tailormade chromosomes to find candidate SNP combinations, and adopted the chisquare test to identify optimal solutions from suspected SNP combinations. Gu et al. [37] proposed a multiobjective artificial bee colony (ABC) algorithm based on the scalefree network (SFMOABC), which incorporated the scalefree network into the ABC algorithm to guide the update and selection of solutions. In addition, SFMOABC used the mutual information (MI) and the K2Score of the Bayesian network as objective functions, and the oppositionbased learning strategy was used to improve the search ability. Chen et al. [38] proposed a multiobjective genetic algorithm (EpiMOGA) for SNP epistasis detection. The K2 score based on the Bayesian network criterion and the Gini index of the diversity of the binary classification problem were used to guide the search process of the genetic algorithm. M. GCJ et al. [39] proposed the application of two of the most successful multiobjective evolutionary algorithms to solve the binary classification problem, namely the referencepoint based Manyobjective Fast Nondominated Sorting Genetic Algorithm (NSGAIII) and a Multiobjective Evolutionary Algorithm based on Decomposition with Dynamical Resource Allocation (MOEA/DDRA). Yang et al. [12] proposed a novel ensemble learningbased approach (ELSSI) that can significantly reduce the bias of individual detectors and their computational load. ELSSI randomly divides SNPs into different subsets and evaluates them by multitype detectors in parallel.
Clearly, a large number of manyobjective metaheuristic methods have been used to solve EI problems. The classical ABC algorithm, genetic algorithm, ant colony optimization, harmony search algorithm, and many epistasis interaction models (e.g., K2score, JS, LR, MI) are among those that have been used. Novel metaheuristic algorithms and epistasis interaction models have not yet been developed. We considered the spherical evolution (SE) method for solving the EI problem. The SE algorithm was first proposed by our team in 2019 [40]. Although the SE algorithm performed well in solving optimization problems, it was easily trapped in local optima. Here, we propose a SE method based on the feedback mechanism and a linear mixed statistical epistasis model (LMSE) instead of the traditional manyobjective epistasis models that have been used to solve the EI problem.
EI identification problem
The EI identification problem can be represented as a matrix D = (X,Y),\(X\in {\mathbb{Z}}^{ m\times n}\), \(Y\in {\mathbb{S}}^{ m\times 1}\), where m is the number of samples and n is the number of SNPs. In X (\(\forall {x}_{i,j}\in \{\mathrm{0,1},2\}\), 0 represents the homozygous major allele, 1 represents the heterozygous allele, and 2 represents the homozygous minor allele, and \({x}_{i,j}\) is the genotype of the jth SNP and the ith sample in dataset D. In Y (\(\forall {y}_{i}\in \{\mathrm{0,1}\}\)), 0 represents the control and 1 represents the case, and \({y}_{i}\) is the phenotype of the ith sample in dataset D, which represents the disease status of the ith sample corresponding to its SNPs.
If the kSNP combination set is defined as \({X}_{m,k}^{\mathrm{^{\prime}}}=\left\{({x}_{1..m,{j}_{1}},{x}_{1..m,{j}_{2}}...{x}_{1..m,{j}_{p}}...{x}_{1..m,{j}_{k}})1\le p\le k,k<<n\right\}\), then the optimization epistasis model can be represented as: \({\min}_{x' \subset x}f\left(x'_{m,k},Y_m\right)\), where \({{\varvec{X}}\mathrm{^{\prime}}}_{{\varvec{m}},{\varvec{k}}}\) is a set of korder SNPs and \({\varvec{f}}\) is an optimization epistasis model (a scoring function) for evaluating the association between a kSNPs (\({{\varvec{X}}\mathrm{^{\prime}}}_{{\varvec{m}},{\varvec{k}}}\)) and disease status (\({Y}_{m}\)).
Spherical evolution approach with feedback mechanism
The SE algorithm [40] searches for all regions of a sphere by continuously adjusting the radius and angle of the sphere. However, the current SE algorithm lacks a good feedback mechanism. Adaptive parameter control and population size control strategies have been proposed to improve the accuracy of the differential evolution algorithm [54]. Inspired by these strategies, we used adaptive spherical search and population updating strategies for the feedback mechanism of the SE algorithm. We also designed a fitness function using a linear mix optimization epistasis model. The main steps are:

Step 1: Population initialization
Randomly generate N individuals (solution vectors) in the search space. In the \({{\varvec{j}}}_{{\varvec{t}}{\varvec{h}}}\) dimension, the search space is limited in [\({a}_{\mathit{min},j}\),\({a}_{\mathit{max},j}\)], where \({a}_{\mathit{min},j}\) is 1 and \({a}_{\mathit{max},j}\) is equal to the order of SNP. If \({a}_{i,j}\) denotes a solution vector \(i\in \left[\mathrm{1,2},\dots ,{\text{N}}\right],j\in [\mathrm{1,2}\dots ,\mathrm{ SNP}\)], then
where \(rand[\mathrm{0,1}]\) is a number of uniform distribution from 0 to 1.

Step 2: Spherical search operator
To obtain a better individual (solution vector), each individual \({a}_{i,*}^{g}\) in the population performs a spherical search operation as shown in Eqs. (5), (6), and (7) or Eqs. (8) and (9), where \({a}_{pbest,j}\) is an individual chosen from a set of the better individuals in the current generation (sort and select based on the better fitness values), \({a}_{r1,j}\) is an individual selected randomly from the population, and \({a}_{h,j}\) is an individual selected randomly from a set of the excellent individuals in each generation. \(Scale\) is the only important parameter (search factor), which represents the radius of a sphere. In threedimensional (3D) space, the spherical search can be formulated as:
In twodimensional (2D) space, the spherical search can be formulated as:
where is a random number of uniform distribution between [0, 2\(\mathrm\pi\)].
\({S}_{dim}\) numbers are selected randomly in the search space [1, 2, …, DIM], where DIM is the maximum number of SNPs, and \({S}_{dim}\) is the spherical search dimension (\({S}_{dim}\)=2 is the search in 2D space for 2order SNP combinations or \({S}_{dim}\)=3 is the search in 3D space for highorder SNP combinations). The search dimension can use 2D and 3D spherical search styles, with the two spherical search styles running alternately.

Step 3: Updating operation
The fitness function of each new individual \({a}_{i}^{g+1}\) in a population is calculated and compared with the fitness function of the old individual \(a_{i}^{g}\) . Excellent individuals are retained in a set H.

Step 4: Adaptive adjustment of spherical search scale
In spherical search operations, the spherical search factor is also known as the spherical search radius, which has a significant impact on the search of each individual. The traditional spherical search factor directly controls the size of the search radius based on the fitness function of the population. This method accelerates the convergence speed of the algorithm, but it also causes the algorithm to easily fall into local optima. Therefore, in this paper, we adopted the adaptive adjustment strategy for the spherical search factor as follows:
where \(f\left({a}_{i}^{g+1}\right)\) is the fitness value of the new individual i in the population, \(f\left({a}_{i}^{g}\right)\) is the fitness function value of the previous generation, and \(\Delta {f}_{i}^{g+1}\) is the winning intensity of the individual.
where \({w}_{i}^{g+1}\) is the weight of the winning ability of the individual in the current generation of the population.
where \({\varvec{S}}{\varvec{c}}{\varvec{a}}{\varvec{l}}{{\varvec{e}}}_{{\varvec{n}}}^{{\varvec{g}}}\) is the value of the spherical search factor in generation \(g\) and \(m{s}^{g+1}\) is the weighted value of the spherical search factor of the population in generation \(g+1\).
where MS is a set of the weighted value of the spherical search factor for consecutive r generations.
A spherical search factor is selected randomly from the MS set and the Cauchy distribution, represented by \(randc\left(m{s}_{r1},0.1\right),\) is used to generate a new search factor value \({t}_{i}\).

Step 5: Population updating strategy
To enhance the diversity of the population, the excellent individuals who won in previous consecutive generations of the population are preserved and stored in a set of excellent individuals (H). In a spherical search operation, individual \({{\varvec{a}}}_{{\varvec{h}},{\varvec{j}}}\) are generated randomly from H, seen as Eqs. (5)– (6). \({H}^{\boldsymbol{^{\prime}}}\) is the new set of the current excellent individuals.
After each generation evolution, the next generation population size \(Popsiz{e}_{g+1}\) is calculated as:
\(Popsiz{e}_{init}\) is set to the smallest possible value so that the reproduction operator can be applied; FES is the current fitness evaluation number and \(FE{S}_{max}\) is the maximum fitness evaluation number.
Linear mixed statistical epistasis model (LMSE)
In recent years, multiobjective algorithms have been used to solve EI problems. However, these algorithms do not converge efficiently for solving multiobjective optimization models. We combined two approximately normalized functions to convert a multiobjective optimization problem to a singleobjective optimization problem for the evolutionary iterations of the algorithm, and calculated the lowest values of each \(scoreFu{n}_{i}\) as:
where \(Ma{x}_{i}\) and \(Mi{n}_{i}\) are estimates of the maximum and minimum values of \(scoreFu{n}_{i}\) that are obtained during the initialization process.
This approach attempts to address the difficulty of convergence of multiobjective evolutionary algorithms to guarantee metric accuracy with high probability. Four statistical epistasis models were chosen to verify the performance of the proposed algorithm, namely K2Score, LR, MI, and GINI. Pseudo code of the SEEI algorithm is provided in Fig. 9.
LRScore [31]
LR is an indicator that reflects authenticity. It is a composite index that reflects both sensitivity and specificity and is calculated as:
where \({o}_{ij}\) and \({e}_{ij}\) are the observed and expected genotype numbers, respectively, when SNPs are combined for genotype j, and phenotype is the disease state i. Low LRScore values indicate strong association between a SNP combination and a phenotype.
K2Score [31]
K2Score, based on Bayesian networks, is calculated as:
where I is the number of all genotype combinations of a SNP and J is the number of all sample state sets. GWAS data usually contain only disease and control samples, so J is usually 2. \({N}_{i}\) is the number of SNP combinations of the \({i}_{th}\) genotype, and \({N}_{ij}\) is the number SNP combinations in the \({j}_{th}\) state. Low K2Score values indicate high correlation of a SNP combination with a disease state.
Mutual information [32]
Mutual information (MI) is one of the most commonly used measures for feature selection. We used MI to measure the correlation between SNP combinations and phenotypes as:
where A is the SNP combination, Y is the phenotype, H(A) is the entropy of A, H(Y) is the entropy of Y, and H(A,Y) is the joint entropy of A and Y. High MI scores indicate strong correlation between SNP combinations and phenotypes.
Gini index (GINI) [32]
The Gini index can be used to measure the degree of impurities and inequality in a dataset, as well as to analyze how dispersed the data are. The Gini Index is defined as:
where \({P}_{i}\) is the probability of the \({i}_{th}\) genotype combination in the sample set, and \({P}_{i,j}\) is the estimated probability of the sample where the ith genotype combination is related to phenotype y. \(1\sum_{j1}^{J}{P}_{i,j}^{2}\) is the estimated probability that the genotype combination will be misclassified as phenotype Y. A low Gini coefficient indicates a high degree of equality and a strong association between a SNP combination and a phenotype.
The landscape of 2order SNP combinations that are constructed for the epistasis models by the first DME model (DME01 with 1000 SNPs and 400 samples) randomly produce 100 SNP combination pairs (2order) in the 1000 SNPs. The search space is limited in a set as [1,2,…,1000] for each dimension. The values of the epistasis optimization model is as the fitness value, which is used to evaluate the association between SNP combinations and disease status. The fitness values of the four single epistasis models (LR, K2, MI, GINI) are shown in Figs. 10, 11, 12 and 13, and the fitness values of the six linear mixed statistical epistasis models (LRK2, LRMI, LRGINI, K2MI, K2GINI, MIGINI) are shown in Figs. 14, 15, 16, 17, 18 and 19. Each epistasis model can be considered an objective function with multiple local optima. The landscape of the linear mixed statistical epistasis model is similar to the single epistasis model. Indeed, the linear mixed statistical epistasis model convert the twoobjective optimization problem into a single optimization problem.
Availability of data and materials
The Simulation datasets of the models were generated by GAMETES 2.0, that is available from https://surveillance.cancer.gov/geneticsimulationresources/packages/gametes/.The simulation datasets can be produced by the GAMETES 2.0 according to the supplementary file. The Wellcome Trust Case Control Consortium (WTCCC) is a collaboration of many British research groups. The experimental protocols of the breast cancer (BC) dataset were approved by the Wellcome Trust Case Control Consortium. The consent was obtained from all subjects. The WTCCC data set in this study are available at https://www.wtccc.org.uk. The all datasets used during the current study available from the corresponding author on reasonable request.
References
Uitterlinden AG. An Introduction to GenomeWide Association Studies: GWAS for Dummies. Semin Reprod Med. 2016;34(4):196–204.
Dehghan A. GenomeWide Association Studies. Methods Mol Biol. 2018;1793:37–49.
Tam V, Patel N, Turcotte M, Bossé Y, Paré G, Meyre D. Benefits and limitations of genomewide association studies. Nat Rev Genet. 2019;20(8):467–84.
Ding X, Wang J, Zelikovsky A, Guo X, Xie M, Pan Y. Searching HighOrder SNP Combinations for Complex Diseases Based on Energy Distribution Difference. IEEE/ACM Trans Comput Biol Bioinform. 2015;12(3):695–704.
STOJANOVSKI TD. Performance of exhaustive search with parallel agents. Turk J Elect Eng Comp Sci. 2014;22:1382–94.
Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF, Moore JH. Multifactordimensionality reduction reveals highorder interactions among estrogenmetabolism genes in sporadic breast cancer. Am J Hum Genet. 2001;69(1):138–47.
Velez DR, White BC, Motsinger AA, Bush WS, Ritchie MD, Williams SM, Moore JH. A balanced accuracy function for epistasis modeling in imbalanced datasets using multifactor dimensionality reduction. Genet Epidemiol. 2007;31(4):306–15.
Yang CH, Chuang LY, Lin YD. Multiobjective multifactor dimensionality reduction to detect SNPSNP interactions. Bioinformatics. 2018;34(13):2228–36.
Ritchie MD, Hahn LW, Moore JH. Power of multifactor dimensionality reduction for detecting genegene in teractions in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity. Genet Epidemiol. 2003;24(2):150–7.
Yang CH, Lin YD, Chuang LY, Chen JB, Chang HW. MDRER: balancing functions for adjusting the ratio in risk classes and classification errors for imbalanced cases and controls using multifactordimensionality reduction. PLoS One. 2013;8(11):e79387.
Lee S, Kwon MS, Oh JM, Park T. Genegene interaction analysis for the survival phenotype based on the Cox model. Bioinformatics. 2012;28(18):i582–8.
Yang CH, Hou MF, Chuang LY, Yang CS, Lin YD. Dimensionality reduction approach for manyobjective epistasis analysis. Brief Bioinform. 2023;24(1):1–13.
Zhang Y, Liu JS. Bayesian inference of epistatic interactions in casecontrol studies. Nat Genet. 2007;39(9):1167–73.
Shang J, Zhang J, Sun Y, Zhang Y. EpiMiner: A threestage coinformation based method for detecting and visualizing epistatic interactions. Digital Signal Processing. 2014;24:1–13.
Flerova N, Marinescu R, Dechter R. Weighted heuristic anytime search: new schemes for optimization over graphical models. Ann Math Artif Intell. 2017;79(1–3):77–128.
Tuo S. FDHEIW: A Fast Approach for Detecting HighOrder Epistasis in GenomeWide CaseControl Studies. Genes (Basel). 2018;9(9):435.
Wei C, Lu Q. GWGGI: software for genomewide genegene interaction analysis. BMC Genet. 2014;15:101.
Guy RT, Santago P, Langefeld CD. Bootstrap aggregating of alternating decision trees to detect sets of SNPs that associate with disease. Genet Epidemiol. 2012;36(2):99–106.
Wan X, Yang C, Yang Q, Xue H, Tang NL, Yu W. Predictive rule inference for epistatic interaction detection in genomewide association studies. Bioinformatics. 2010;26(1):30–7.
Wang X, Cao X, Feng Y, Guo M, Yu G, Wang J. ELSSI: parallel SNPSNP interactions detection by ensemble multitype detectors. Brief Bioinform. 2022;23(4):1–13.
Shang J, Cai X, Zhang T, Sun Y, Zhang Y, Liu J, Guan B. EpiReSIM: A resampling method of epistatic model without marginal effects using underdetermined system of equations. Genes. 2022;13:2286.
Sun Y, Gu Y, Ren Q, Li Y, Shang J, Liu JX, Guan B. MDSN: A module detection method for identifying highorder epistatic interactions. Genes. 2022;13:2403.
Wang J, Zhang H, Ren W, Guo M, Guoxian Yu. EpiMC: detecting epistatic interactions using multiple clusterings. IEEE/ACM Trans Comput Biol Bioinform. 2022;19(1):243–54.
David BB, Jan B, Markus H, Tim K, Markus L. A framework for modeling epistatic interaction. Bioinformatics. 2021;37(12):1708–16.
Uppu S, Krishna A. A deep hybrid model to detect multilocus interacting SNPs in the presence of noise. Int J Med Inform. 2018;119:134–51.
Uppu S, Krishna A, Gopalan RP. A review on methods for detecting SNP interactions in highdimensional genomic data. IEEE/ACM Trans Comput Biol Bioinform. 2018;15(2):599–612.
Wang X, Zhang H, Wang J, Yu G, Cui L, Guo M. EpiHNet: Detecting epistasis by heterogeneous molecule network. Methods. 2022;198:65–75.
Aflakparast M, Salimi H, Gerami A, Dubé MP, Visweswaran S, MasoudiNejad A. Cuckoo search epistasis: a new method for exploring significant genetic interactions. Heredity (Edinb). 2014;112(6):666–74.
Jing PJ, Shen HB. MACOED: a multiobjective ant colony optimization algorithm for SNP epistasis detection in genomewide association studies. Bioinformatics. 2015;31(5):634–41.
Sun Y, Shang J, Liu JX, Li S, Zheng CH. epiACO  a method for identifying epistasis based on ant Colony optimization algorithm. BioData Min. 2017;10:23.
Tuo S, Liu H, Chen H. Multipopulation harmony search algorithm for the detection of highorder SNP interactions. Bioinformatics. 2020;36(16):4389–98.
Sun Y, Wang X, Shang J, Liu JX, Zheng CH, Lei X. Introducing Heuristic Information Into Ant Colony Optimization Algorithm for Identifying Epistasis. IEEE/ACM Trans Comput Biol Bioinform. 2020;17(4):1253–61.
Tuo S, Zhang J, Yuan X, He Z, Liu Y, Liu Z. Niche harmony search algorithm for detecting complex disease associated highorder SNP combinations. Sci Rep. 2017;7(1):11529.
Li X, Zhang S, Wong KC. NatureInspired Multiobjective Epistasis Elucidation from GenomeWide Association Studies. IEEE/ACM Trans Comput Biol Bioinform. 2020;17(1):226–37.
Wang H, Wu X. IPP: An Intelligent PrivacyPreserving Scheme for Detecting Interactions in Genome Association Studies. IEEE/ACM Trans Comput Biol Bioinform. 2023;20(1):455–64.
Peng YZ, Lin Y, Huang Y, Li Y, Luo G, Liao J. GEPEpiSeeker: a gene expression programmingbased method for epistatic interaction detection in genomewide association studies. BMC Genomics. 2021;22(Suppl 1):910.
Gu Y, Sun Y, Shang J, Li F, Guan B, Liu JX. MultiObjective Artificial Bee Colony Algorithm Based on ScaleFree Network for Epistasis Detection. Genes (Basel). 2022;13(5):871.
Chen Y, Xu F, Pian C, Xu M, Kong L, Fang J, Li Z, Zhang L. E, EpiMOGA: An Epistasis Detection Method Based on a MultiObjective Genetic Algorithm. Genes. 2021;12:191.
M. GCJ, Álvaro RL, Sergio SJ, et al. Multiobjective evolutionary computation for highorder genetic interactions. Appl Soft Comput J. 2022;128:1–13.
Tang D. Spherical evolution for solving continuous optimization problems. Appl Soft Comput J. 2019;81:105499.
Yang C, He Z, Wan X, Yang Q, Xue H, Yu W. SNPHarvester: a filteringbased approach for detecting epistatic interactions in genomewide association studies. Bioinformatics. 2009;25(4):504–11.
Hoey J .The TwoWay Likelihood Ratio (G) Test and comparison to twoway chi squared test.Statistics. 2012. https://doi.org/10.48550/arXiv.1206.4881.
Yang CH, Chuang LY, Lin YD. CMDR based differential evolution identifies the epistatic interaction in genomewide association studies. Bioinformatics. 2017;33(15):2354–62.
Urbanowicz RJ, Kiralis J, SinnottArmstrong NA, Heberling T, Fisher JM, Moore JH. GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures. BioData Min. 2012;5(1):16.
Namkung J, Kim K, Yi S, Chung W, Kwon MS, Park T. New evaluation measures for multifactor dimensionality reduction classifiers in genegene interaction analysis. Bioinformatics. 2009;25(3):338–45.
PonteFernandez C, GonzalezDominguez J, CarvajalRodriguez A, Martin MJ. Evaluation of existing methods for highorder epistasis detection. IEEE/ACM Trans Comput Biol Bioinform. 2022;19(2):912–26.
Derrac J, García S, Molina D, Herrera F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol Comput. 2011;1(1):3–18.
Burton PR, Clayton DG, Cardon LR, Craddock N, Deloukas P, Duncanson A, Kwiatkowski DP, McCarthy MI, Ouwehand WH, Samani NJ, et al. Association scan of 14,500 nonsynonymous SNPs in four diseases identifies autoimmunity variants. Nat Genet. 2007;39(11):1329–37.
Colletti JA 2nd, LelandWavrin KM, Kurz SG, Hickman MP, Seiler NL, Samanas NB, Eckert QA, Dennison KL, Ding L, Schaffer BS, et al. Validation of six genetic determinants of susceptibility to estrogeninduced mammary cancer in the rat and assessment of their relevance to breast cancer risk in humans. G3 (Bethesda). 2014;4(8):1385–94.
Niu M, He Y, Xu J, Ding L, He T, Yi Y, Fu M, Guo R, Li F, Chen H, et al. Noncanonical TGFβ signaling leads to FBXO3mediated degradation of ΔNp63α promoting breast cancer metastasis and poor clinical prognosis. PLoS Biol. 2021;19(2):e3001113.
Jansson M, Billing O, Herdenberg C, Lundin C, Tolockiene E, Nazemroaya A, Sund M. Expression and Circulating Levels of Perlecan in Breast Cancer  Implications for Oestrogen Dependent Stromal Remodeling. J Mammary Gland Biol Neoplasia. 2020;25(1):69–77.
Clay MR, Varma S, West RB. MAST2 and NOTCH1 translocations in breast carcinoma and associated preinvasive lesions. Hum Pathol. 2013;44(12):2837–44.
Smith NL, Halliday BE, Finley JL, Wennerberg AE. The spectrum of immunohistochemical reactivity of monoclonal antibody DS6 in nongynecologic neoplasms. Appl Immunohistochem Mol Morphol. 2002;10(2):152–8.
Tanabe R, Fukunaga A. Successhistory based parameter adaptation for differential evolution. IEEE Congr Evol Comput. 2013;2013:71–8.
Funding
This work was supported by the National Natural Science Foundation of China (61976239); Guang Dong Provincial Natural Fund project (2020A1515010783); the Guangdong provincial medical research foundation of China (No. A2022531); the National Key Laboratory of Data Space Technology and System, National Key R&D Program of China（Grant No. 2021YFD1300100).
Author information
Authors and Affiliations
Contributions
Deyu Tang and Junxi Zheng conceived the study. Deyu Tang, Jin Yang and Yijun Mao designed the experiment. Jin Yang and Jie Zhao performed the experiment and wrote the manuscript. Shiyin Li, Fuxiang Ren, and Junxi Zheng revised and polished the manuscript. All authors have participated in the study discussion and manuscript preparation. All of the authors have read and approved the final manuscript. Deyu Tang, Yijun Mao, Jin Yang and Junxi Zheng are the cocorresponding authors.
Corresponding authors
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Tang, Dy., Mao, Yj., Zhao, J. et al. SEEI: spherical evolution with feedback mechanism for identifying epistatic interactions. BMC Genomics 25, 462 (2024). https://doi.org/10.1186/s12864024103734
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s12864024103734