Armañanzas R, Inza I, Santana R, Saeys Y, Flores JL, Lozano JA, Van de Peer Y, Blanco R, Robles V, Bielza C and Larrañaga P (2008), "A review of estimation of distribution algorithms in bioinformatics", BioData Mining. Vol. 1(6), pp. doi:10.1186/1756-0381-1-6. |
Abstract: Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems in across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. |
BibTeX:
@article{Armananzas_et_al:2008, author = {R. Armañanzas and I. Inza and R. Santana and Y. Saeys and J. L. Flores and J. A. Lozano and Y. Van de Peer and R. Blanco and V. Robles and C. Bielza and P. Larrañaga}, title = {A review of estimation of distribution algorithms in bioinformatics}, journal = {BioData Mining}, year = {2008}, volume = {1}, number = {6}, pages = {doi:10.1186/1756-0381-1-6}, url = {http://www.biodatamining.org/content/1/1/6} } |
Echegoyen C, Santana R, Lozano JA and Larrañaga P (2008), "The impact of probabilistic learning algorithms in EDAs based on Bayesian networks", In Linkage in Evolutionary Computation. , pp. 109-139. Springer. |
Abstract: This paper discusses exact learning of Bayesian networks in estimation of distribution algorithms. The estimation of Bayesian network algorithm (EBNA) is used to analyze the impact of learning the optimal (exact) structure in the search. By applying recently introduced methods that allow learning optimal Bayesian networks, we investigate two important issues in EDAs. First, we analyze the question of whether learning more accurate (exact) models of the dependencies implies a better performance of EDAs. Secondly, we are able to study the way in which the problem structure is translated into the probabilistic model when exact learning is accomplished. The results obtained reveal that the quality of the problem information captured by the probability model can improve when the accuracy of the learning algorithm employed is increased. However, improvements in model accuracy do not always imply a more efficient search. |
BibTeX:
@inproceedings{Echegoyen_et_al:2008, author = {C. Echegoyen and R. Santana and J. A. Lozano and P. Larrañaga}, title = {The impact of probabilistic learning algorithms in EDAs based on Bayesian networks}, booktitle = {Linkage in Evolutionary Computation}, publisher = {Springer}, year = {2008}, pages = {109-139}, url = {http://dx.doi.org/10.1007/978-3-540-85068-7_6} } |
González-Arenas Z, Jiménez-Sobrino JC, Lozada-Chang L and Santana R (2008), "Parameter estimation of diffusion processes using EDAs", In Proceedings of VIII International Conference on Operations Research. La Habana, Cuba , pp. 55. |
Abstract: In this work we introduced a specific EDA, a continuous version of the Univariate Marginal Distribution Algorithm (UMDAc), to seek a solution for the optimization problem related with the estimation of unknown parameters in a diffusion process. There were considered two different examples for evaluating our proposal behavior and a comparison was made with the case of using a local search algorithm. |
BibTeX:
@inproceedings{Gonzalez_et_al::2008, author = {Z. González-Arenas and J. C. Jiménez-Sobrino and L. Lozada-Chang and R. Santana}, title = {Parameter estimation of diffusion processes using EDAs}, booktitle = {Proceedings of VIII International Conference on Operations Research}, year = {2008}, pages = {55} } |
Santana R, Larrañaga P and Lozano JA (2008), "Estimation of distribution algorithms with affinity propagation methods". Research Report at: Department of Computer Science and Artificial Intelligence, University of the Basque Country., January, 2008. (EHU-KZAA-IK-1/08) |
Abstract: Estimation of distribution algorithms (EDAs) that use marginal product model factorizations have been widely applied to a broad range of, mainly binary, optimization problems.In this paper, we introduce the affinity propagation EDA which learns a marginal product model by clustering a matrix of mutual information learned from the data using a very efficient message-passing algorithm known as affinity propagation. The introduced algorithm is tested on a set of binary and non-binary decomposable functions and using a hard combinatorial class of problem known as the HP protein model. The results show that the algorithm is a very efficient alternative to other EDAs that use marginal product model factorizations such as the extended compact genetic algorithm (ECGA) and improves the quality of the results achieved by ECGA when the cardinality of the variables is increased. |
BibTeX:
@techreport{Santana_et_al:2008, author = {R. Santana and P. Larrañaga and J. A. Lozano}, title = {Estimation of distribution algorithms with affinity propagation methods}, school = {Department of Computer Science and Artificial Intelligence, University of the Basque Country}, year = {2008}, number = {EHU-KZAA-IK-1/08}, url = {https://www.researchgate.net/publication/220375120_Learning_Factorizations_in_Estimation_of_Distribution_Algorithms_Using_Affinity_Propagation} } |
Santana R, Larrañaga P and Lozano JA (2008), "Protein folding in simplified models with estimation of distribution algorithms", IEEE Transactions on Evolutionary Computation. Vol. 12(4), pp. 418-438. |
Abstract: Simplified lattice models have played an important role in protein structure prediction and protein folding problems. These models can be useful for an initial approximation of the protein structure, and for the investigation of the dynamics that govern the protein folding process. Estimation of distribution algorithms (EDAs) are efficient evolutionary algorithms that can learn and exploit the search space regularities in the form of probabilistic dependencies. This paper introduces the application of different variants of EDAs to the solution of the protein structure prediction problem in simplified models, and proposes their use as a simulation tool for the analysis of the protein folding process. We develop new ideas for the application of EDAs to the bidimensional and tridimensional (2-d and 3-d) simplified protein folding problems. This paper analyzes the rationale behind the application of EDAs to these problems, and elucidates the relationship between our proposal and other population-based approaches proposed for the protein folding problem. We argue that EDAs are an efficient alternative for many instances of the protein structure prediction problem and are indeed appropriate for a theoretical analysis of search procedures in lattice models. All the algorithms introduced are tested on a set of difficult 2-d and 3-d instances from lattice models. Some of the results obtained with EDAs are superior to the ones obtained with other well-known population-based optimization algorithms. |
BibTeX:
@article{Santana_et_al:2008a, author = {R. Santana and P. Larrañaga and J. A. Lozano}, title = {Protein folding in simplified models with estimation of distribution algorithms}, journal = {IEEE Transactions on Evolutionary Computation}, year = {2008}, volume = {12}, number = {4}, pages = {418--438}, url = {http://dx.doi.org/10.1109/TEVC.2007.906095} } |
Santana R, Larrañaga P and Lozano JA (2008), "Component weighting functions for adaptive search with EDAs", In Proceedings of the 2008 Congress on Evolutionary Computation CEC-2008. Hong Kong , pp. 4067-4074. IEEE Press. |
Abstract: This paper introduces the component weighting approach as a general optimization heuristic to increase the likelihood of escaping from local optima by dynamically modifying the fitness function. The approach is tested on the optimization of the simplified hydrophobic-polar (HP) protein problem using estimation of distribution algorithms (EDAs). We show that the use of component weighting together with statistical information extracted from the set of selected solutions considerably improve the results of EDAs for the HP problem. The paper also elaborates on the use of probabilistic modeling for the definition of dynamic fitness functions and on the use of combinations of models. |
BibTeX:
@inproceedings{Santana_et_al:2008b, author = {R. Santana and P. Larrañaga and J. A. Lozano}, title = {Component weighting functions for adaptive search with EDAs}, booktitle = {Proceedings of the 2008 Congress on Evolutionary Computation CEC-2008}, publisher = {IEEE Press}, year = {2008}, pages = {4067--4074}, url = {http://dx.doi.org/10.1109/CEC.2008.4631352} } |
Santana R, Larrañaga P and Lozano JA (2008), "Adaptive estimation of distribution algorithms", In Adaptive and Multilevel Metaheuristics. Vol. 136, pp. 177-197. Springer. |
Abstract: Estimation of distribution algorithms (EDAs) are evolutionary methods that use probabilistic models instead of genetic operators to lead the search. Most of current proposals on EDAs do not incorporate adaptive techniques. Usually, the class of probabilistic model employed as well as the learning and sampling methods are static. In this paper, we present a general framework for introducing adaptation in EDAs. This framework allows the possibility of changing the class of probabilistic models during the evolution. We present a number of measures, and techniques that can be used to evaluate the effect of the EDA components in order to design adaptive EDAs. As a case of study we present an adaptive EDA that combines different classes of probabilistic models and sampling methods. The algorithm is evaluated in the solution of the satisfiability problem.. |
BibTeX:
@incollection{Santana_et_al:2008c, author = {R. Santana and Pedro Larrañaga and J. A. Lozano}, editor = {C. Cotta and M. Sevaux and K. Sörensen}, title = {Adaptive estimation of distribution algorithms}, booktitle = {Adaptive and Multilevel Metaheuristics}, publisher = {Springer}, year = {2008}, volume = {136}, pages = {177-197}, url = {http://dx.doi.org/10.1007/978-3-540-79438-7_9} } |
Santana R, Larrañaga P and Lozano JA (2008), "Adding probabilistic dependencies to the search of protein side chain configurations using EDAs", In Parallel Problem Solving from Nature - PPSN X. Dortmund, Germany Vol. 5199, pp. 1120-1129. Springer. |
Abstract: The problem of finding an optimal positioning for the side chain residues of a protein is called the side chain placement or side chain prediction problem. It can be posed as an optimization problem in the discrete domain. In this paper we use an estimation of distribution algorithm to address this optimization problem. Using a set of 50 difficult protein instances, it is shown that the addition of dependencies between the variables in the probabilistic model can improve the quality of the solutions achieved for most of the instances considered. However, we also show that only when information about the known interactions between the residues is considered in the creation of the probabilistic model, the addition of the dependencies contributes to improve the quality of the solutions obtained. |
BibTeX:
@inproceedings{Santana_et_al:2008d, author = {R. Santana and P. Larrañaga and J. A. Lozano}, editor = {G. Rudolph and T. Jansen and S. Lucas and C. Poloni and N. Beume}, title = {Adding probabilistic dependencies to the search of protein side chain configurations using EDAs}, booktitle = {Parallel Problem Solving from Nature - PPSN X}, publisher = {Springer}, year = {2008}, volume = {5199}, pages = {1120--1129}, url = {http://dx.doi.org/10.1007/978-3-540-87700-4_111} } |
Santana R, Larrañaga P and Lozano JA (2008), "Combining Variable Neighborhood Search and Estimation of Distribution Algorithms in the Protein Side Chain Placement Problem", Journal of Heuristics. Vol. 14, pp. 519-547. |
Abstract: The aim of this work is to introduce several proposals for combining two metaheuristics: variable neighborhood search (VNS) and estimation of distribution algorithms (EDAs). Although each of these metaheuristics has been previously hybridized in several ways, this paper constitutes the first attempt to combine both optimization methods. The different ways of combining VNS and EDAs will be classified into three groups. In the first group, we will consider combinations where the philosophy underlying VNS is embedded in EDAs. Considering different neighborhood spaces (points, populations or probability distributions), we will obtain instantiations for the approaches in this group. The second group of algorithms is obtained when probabilistic models (or any other machine learning paradigm) are used in order to exploit the good and bad shakes of the randomly generated solutions in a reduced variable neighborhood search. The last group of algorithms contains the results of alternating VNS and EDAs. An application of the first approach is presented in the protein side chain placement problem. The results obtained show the superiority of the hybrid algorithm in comparison with EDAs and VNS. |
BibTeX:
@article{Santana_et_al:2008f, author = {R. Santana and P. Larrañaga and J. A. Lozano}, title = {Combining Variable Neighborhood Search and Estimation of Distribution Algorithms in the Protein Side Chain Placement Problem}, journal = {Journal of Heuristics}, year = {2008}, volume = {14}, pages = {519--547}, url = {http://dx.doi.org/10.1007/s10732-007-9049-8} } |
Santana R, Mendiburu A and Lozano JA (2008), "An empirical analysis of loopy belief propagation in three topologies: Grids, small-world networks and random graphs", In Proceedings of the Fourth European Workshop on Probabilistic Graphical Models (PGM-2008). Hirtshals, Denmark , pp. 249-256. |
Abstract: Recently, much research has been devoted to the study of loopy belief propagation algorithm. However, little attention has been paid to the change of its behavior in relation with the problem graph topology. In this paper we empirically study the behavior of loopy belief propagation on different network topologies which include grids, small-world networks and random graphs. In our experiments, several descriptors of the algorithm are collected in order to analyze its behavior. We show that the performance of the algorithm is highly sensitive to changes in the topologies. Furthermore, evidence is given showing that the addition of shortcuts to grids can determine important changes in the dynamics of the algorithm. |
BibTeX:
@inproceedings{Santana_et_al:2008g, author = {R. Santana and A. Mendiburu and J. A. Lozano}, editor = {Manfred Jaeger and Thomas D. Nielsen}, title = {An empirical analysis of loopy belief propagation in three topologies: Grids, small-world networks and random graphs}, booktitle = {Proceedings of the Fourth European Workshop on Probabilistic Graphical Models (PGM-2008)}, year = {2008}, pages = {249--256} } |
Shakya S and Santana R (2008), "An EDA based on local Markov property and Gibbs sampling", In Proceedings of the 2008 Genetic and evolutionary computation conference (GECCO). New York, NY, USA , pp. 475-476. ACM. |
Abstract: The key ideas behind most of the recently proposed Markov networks based EDAs were to factorise the joint probability distribution in terms of the cliques in the undirected graph. As such, they made use of the global Markov property of the Markov network. Here we presents a Markov Network based EDA that exploits Gibbs sampling to sample from the Local Markov property, the Markovianity, and does not directly model the joint distribution. We call it Markovianity based Optimisation Algorithm. Some initial results on the performance of the proposed algorithm shows that it compares well with other Bayesian network based EDAs |
BibTeX:
@inproceedings{Shakya_and_Santana:2008, author = {Siddhartha Shakya and R. Santana}, editor = {Maarten Keijzer}, title = {An EDA based on local Markov property and Gibbs sampling}, booktitle = {Proceedings of the 2008 Genetic and evolutionary computation conference (GECCO)}, publisher = {ACM}, year = {2008}, pages = {475--476}, url = {http://dl.acm.org/citation.cfm?id=1389185} } |