In continuous black-box optimization, various stochastic local search techniques are often employed, with various remedies for fighting premature convergence. This paper surveys recent developments in the field (the most important from the author's perspective), analyzes their differences and similarities, and proposes a taxonomy of these methods. Based on this taxonomy, a variety of novel, previously unexplored, and potentially promising techniques may be envisioned.
Dynamic system modeling of evolutionary algorithms
ACM SIGAPP Applied Computing Review, 2016
Evolutionary algorithms are population-based, metaheuristic, black-box optimization techniques from the wider family of evolutionary computation. Optimization algorithms within this family are often based on similar principles and routines inspired by biological evolution. Due to their robustness, the scope of their application is broad and varies from physical engineering to software design problems. Despite sharing similar principles rooted in common biological inspiration, these algorithms themselves are typically viewed as black-box program routines by the end user, without a deeper insight into the underlying optimization process. We believe that shedding some light on the underlying routines of evolutionary computation algorithms can make them more accessible to the wider engineering public. In this paper, we formulate the evolutionary optimization process as a dynamic system simulation, and provide means to prototype evolutionary optimization routines in a visually comprehensible...
Proceedings of the Companion Publication of the 2015 on Genetic and Evolutionary Computation Conference - GECCO Companion '15, 2015
The recently proposed Brent-STEP algorithm was generalized for separable functions by performing axis-parallel searches, interleaving the steps in individual dimensions in a round-robin fashion. This article explores the possibility of choosing the dimension for the next step in a more "intelligent" way, i.e., optimizing first along dimensions that are believed to bring the highest profit. We present results for the epsilon-greedy strategy and for a method based on the internals of the Brent-STEP algorithm. Although the proposed methods work better than the round-robin strategy in some situations, given the marginal improvement they bring, we suggest using the round-robin strategy, thanks to its simplicity.
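The epsilon-greedy dimension-selection strategy mentioned above can be sketched in a few lines. This is a hypothetical helper, not the paper's code; `estimated_gains` stands in for whatever per-dimension profit estimate the algorithm maintains:

```python
import random

def choose_dimension(estimated_gains, epsilon=0.1, rng=random):
    """Epsilon-greedy choice of the next dimension to optimize.

    With probability epsilon pick a random dimension (exploration);
    otherwise pick the dimension with the highest estimated gain
    (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(estimated_gains))
    return max(range(len(estimated_gains)), key=estimated_gains.__getitem__)
```

With `epsilon=0` the choice degenerates to pure exploitation (always the most promising dimension); with `epsilon=1` it degenerates to the uniform random choice.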
Proceedings of the Companion Publication of the 2015 on Genetic and Evolutionary Computation Conference - GECCO Companion '15, 2015
Genetic Programming has been very successful in solving a wide range of problems, but its use as a machine learning algorithm has been limited so far. One of the reasons is the problem of overfitting, which cannot be solved or suppressed as easily as in more traditional approaches. Another problem, closely related to overfitting, is the selection of the final model from the population. In this article we present our research that addresses both problems: overfitting and model selection. We compare several ways of dealing with overfitting, based on the Random Sampling Technique (RST) and on using a validation set, all with an emphasis on model selection. We subject each approach to thorough testing on artificial and real-world datasets and compare them with the standard approach, which uses the full training data, as a baseline.
Proceedings of the 2015 on Genetic and Evolutionary Computation Conference - GECCO '15, 2015
We propose a novel hybrid algorithm "Brent-STEP" for univariate global function minimization, based on the global line search method STEP and accelerated by Brent's method, a local optimizer that combines quadratic interpolation and golden section steps. We analyze the performance of the hybrid algorithm on various one-dimensional functions and experimentally demonstrate a significant improvement relative to its constituent algorithms in most cases. We then generalize the algorithm to multivariate functions, adopting the recently proposed [8] scheme to interleave evaluations across dimensions to achieve smoother and more efficient convergence. We experimentally demonstrate the highly competitive performance of the proposed multivariate algorithm on separable functions of the BBOB benchmark. The combination of good performance and smooth convergence on separable functions makes the algorithm an interesting candidate for inclusion in algorithmic portfolios or hybrid algorithms that aim to provide good performance on a wide range of problems.
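The multivariate generalization interleaves one-dimensional searches across dimensions. A minimal sketch of the axis-parallel idea, with the actual Brent-STEP machinery replaced by SciPy's bounded scalar minimizer for brevity (an assumption; the real algorithm interleaves single steps across dimensions rather than running full per-axis minimizations):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def coordinate_minimize(f, x0, bounds, sweeps=3):
    """Round-robin axis-parallel minimization of a separable-ish function.

    In each sweep, every coordinate is minimized in turn while the
    others are held fixed; on separable functions this converges to the
    global optimum coordinate by coordinate.
    """
    x = np.array(x0, dtype=float)
    for _ in range(sweeps):
        for i in range(len(x)):
            def f_i(t):                      # f restricted to axis i
                y = x.copy()
                y[i] = t
                return f(y)
            res = minimize_scalar(f_i, bounds=bounds[i], method='bounded')
            x[i] = res.x
    return x
```

On a separable quadratic, a single sweep already lands each coordinate at its one-dimensional minimizer.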
This paper introduces a new scoring method for company default prediction. The method is based on a modified magic square (a spider diagram with four perpendicular axes), which is used to evaluate the economic performance of a country. The evaluation is quantified by the area of a polygon whose vertices are points lying on the axes. The axes represent economic indicators of significant importance for economic performance evaluation. The proposed method deals with the magic square's limitations, e.g., an axis zero point not placed at the origin, and extends its usage to an arbitrary number of variables (higher than 3). This approach is applied to corporations to evaluate their economic performance and identify companies suspected of default. In general, a company's score reflects its economic performance; it is calculated as a polygon area. The proposed method is based on the identification of the parameters (axes order, parameter weights and angles between axes) needed to achieve the maximum possible model performance. The developed method uses a company's financial ratios from its financial statements (debt ratio, return on costs, etc.) and information about a company's default or bankruptcy as primary input data. The method is based on obtaining a maximum value of the Gini (or Kolmogorov–Smirnov) index, which reflects the quality of the ordering of companies according to their score values. Defaulted companies should have a lower score than non-defaulted companies. The number of parameter groups (axes order, parameter weights and angles between axes) can be reduced without a negative impact on the model performance. Historical data is used to set up model parameters for the prediction of possible future company defaults.
In addition, the methodology allows calculating a threshold score value to separate companies suspected of default from other companies. A threshold value is also necessary for calculating the model's true positive and true negative rates. Training and validation of the developed model were performed on two independent and disjoint datasets. The performance of the proposed method is comparable to other methods such as logistic regression and neural networks. One of the major advantages of the proposed method is the graphical interpretation of a company's score in the form of a diagram, enabling a simple illustration of each factor's contribution to the total score value.
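The core scoring idea, the area of a polygon whose vertices lie on axes at given mutual angles, reduces to a sum of triangle areas. A minimal sketch of that geometric part only (indicator selection, weighting, and the search over axis orders are omitted):

```python
import math

def polygon_score(values, angles):
    """Area of the polygon whose i-th vertex lies on axis i at distance
    values[i] from the origin; angles[i] is the angle (radians) between
    axis i and axis i+1.  Each consecutive pair of vertices and the
    origin form a triangle of area (1/2)*v_i*v_{i+1}*sin(angle).
    """
    n = len(values)
    area = 0.0
    for i in range(n):
        j = (i + 1) % n
        area += 0.5 * values[i] * values[j] * math.sin(angles[i])
    return area
```

For four unit values on perpendicular axes (the classic magic square), the polygon is a square with diagonal 2 and area 2.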
When a simple real-valued estimation of distribution algorithm (EDA) with a Gaussian model and maximum likelihood estimation of parameters is used, it converges prematurely even on the slope of the fitness function. The simplest way of preventing premature convergence, multiplying the variance estimate by a constant factor k each generation, is studied. Recent works have shown that when increasing the dimensionality of the search space, such an algorithm very quickly becomes unable to traverse the slope and focus on the optimum at the same time. In this paper it is shown that when isotropic distributions with Gaussian or Cauchy distributed norms are used, a simple constant setting of k is able to ensure reasonable behaviour of the EDA on the slope and in the valley of the fitness function at the same time.
In real-valued estimation-of-distribution algorithms, the Gaussian distribution is often used along with maximum likelihood (ML) estimation of its parameters. Such a process is highly prone to premature convergence. The simplest method for preventing premature convergence of the Gaussian distribution is enlarging the maximum likelihood estimate of σ by a constant factor k each generation. Such a factor should be large enough to prevent convergence on slopes of the fitness function, but not so large that it prevents the algorithm from converging in the neighborhood of the optimum. Previous work showed that for truncation selection such an admissible k exists in the 1D case. In this article it is shown experimentally that for the Gaussian EDA with truncation selection in high-dimensional spaces, no admissible k exists.
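The variance-enlargement remedy discussed in the two abstracts above can be illustrated with a minimal 1-D Gaussian EDA with truncation selection. This is a sketch, not the papers' code; the parameter values (k, population size, truncation ratio) are illustrative only:

```python
import numpy as np

def gaussian_eda_1d(f, k=1.5, pop=100, tau=0.3, gens=60, seed=0):
    """Minimal 1-D Gaussian EDA minimizing f.

    Each generation: sample the population, keep the best tau*pop
    individuals (truncation selection), take the ML estimates of the
    mean and std of the selected set, and enlarge the std by the
    constant factor k -- the simplest remedy against premature
    convergence.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 10.0
    for _ in range(gens):
        xs = rng.normal(mu, sigma, pop)
        sel = xs[np.argsort(f(xs))[: int(tau * pop)]]   # truncation selection
        mu, sigma = sel.mean(), k * sel.std()           # ML estimate, enlarged
    return mu, sigma
```

In 1D an admissible k of this kind exists; the second abstract's point is precisely that no single constant works once the dimensionality grows.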
Proceedings of the 9th annual conference on Genetic and evolutionary computation - GECCO '07, 2007
Evolutionary algorithms applied in real-valued domains should profit from information about the local fitness function curvature. This paper presents an initial study of an evolutionary strategy with a novel approach to learning the covariance matrix of a Gaussian distribution. The learning method is based on estimating the fitness landscape contour line between the selected and discarded individuals. The distribution learned this way is then used to generate new population members. The algorithm presented here is the first attempt to construct the Gaussian distribution this way and should be considered only a proof of concept; nevertheless, the empirical comparison on low-dimensional quadratic functions shows that our approach is viable: with respect to the number of evaluations needed to find a solution of a certain quality, it is comparable to the state-of-the-art CMA-ES on the sphere function and outperforms CMA-ES on the elliptical function.
This article describes a new model of a probability density function and its use in estimation of distribution algorithms. The new model, the distribution tree, has interesting properties and can form a solid basis for further improvements which will make it even more competitive. Several comparative experiments on continuous real-valued optimization problems were carried out and the results are promising. It outperformed the genetic algorithm using the traditional crossover operator several times; in the majority of the remaining experiments it was comparable to the genetic algorithm's performance.
The good performance of a traditional genetic algorithm is determined by its ability to identify building blocks and grow them into larger ones. To attain this objective, a properly arranged chromosome is needed to ensure that building blocks survive the application of recombination operators. The proposed algorithm periodically rearranges the order of genes in the chromosome, while the actual information about inter-gene dependencies is calculated on-line throughout the run. Standard 2-point crossover, operating on the adapted chromosomal structure, is used to generate new solutions. Experimental results show that this algorithm is able to solve separable problems with strong intra-building-block dependencies among genes as well as hierarchical problems.
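The recombination step can be sketched as standard 2-point crossover applied in a rearranged gene order. The helper below is hypothetical and deliberately omits how the inter-gene dependencies are estimated and the permutation chosen; it only shows why the rearrangement matters: once dependent genes are adjacent, the two cut points rarely separate a building block.

```python
import random

def two_point_crossover(a, b, rng=random):
    """Standard 2-point crossover: swap the middle segment of two parents."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def crossover_in_order(a, b, order, rng=random):
    """Apply 2-point crossover in a rearranged gene order.

    `order` is a permutation placing strongly dependent genes next to
    each other; the parents are permuted, recombined, and the offspring
    genes mapped back to their original loci.
    """
    pa = [a[g] for g in order]
    pb = [b[g] for g in order]
    ca, cb = two_point_crossover(pa, pb, rng)
    child_a, child_b = list(a), list(b)
    for pos, g in enumerate(order):
        child_a[g], child_b[g] = ca[pos], cb[pos]
    return child_a, child_b
```

At every locus one child inherits the gene from the first parent and the other child from the second, exactly as in ordinary 2-point crossover; only the segment boundaries are defined in the rearranged order.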
Comparison of Cauchy EDA and pPOEMS algorithms on the BBOB noiseless testbed
Proceedings of the 12th annual conference companion on Genetic and evolutionary computation - GECCO '10, 2010
An estimation-of-distribution algorithm using a Cauchy sampling distribution is compared with the iterative prototype optimization algorithm with evolved improvement steps. While Cauchy EDA is better on unimodal functions, iterative prototype optimization is more suitable for ...
Proceedings of the 11th annual conference companion on Genetic and evolutionary computation conference - GECCO '09, 2009
The generalized generation gap (G3) model of an evolutionary algorithm equipped with the parent-centric crossover (PCX) is tested on the BBOB 2009 benchmark testbed. To improve its behavior on multimodal functions, a multistart strategy is used. The algorithm shows promising results especially for the group of 'moderate' functions, but fails systematically on the group of 'multimodal' functions.
Proceedings of the 11th annual conference companion on Genetic and evolutionary computation conference - GECCO '09, 2009
The restarted estimation of distribution algorithm (EDA) with the Cauchy distribution as the probabilistic model is tested on the BBOB 2009 testbed. These tests show that with the Cauchy distribution and a suitably chosen variance enlargement factor, the algorithm is usable for a broad range of fitness landscapes, which is not the case for the EDA with a Gaussian distribution, which converges prematurely. The results of the algorithm are of mixed quality and its scaling is at least quadratic.
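A single generation of such a Cauchy-model EDA might look as follows. This is a sketch under assumptions: the half-interquartile-range scale estimator is chosen here because it is consistent for the Cauchy scale parameter, and is not the paper's estimator.

```python
import numpy as np

def cauchy_eda_step(f, mu, gamma, k=1.0, pop=50, tau=0.3, rng=None):
    """One generation of a univariate EDA with a Cauchy model.

    Sample pop points from Cauchy(mu, gamma), keep the best tau*pop
    under f (truncation selection), re-estimate the location by the
    median and the scale by half the interquartile range (for a Cauchy
    distribution, half the IQR equals gamma), then enlarge by k.
    """
    rng = rng or np.random.default_rng()
    xs = mu + gamma * rng.standard_cauchy(pop)
    sel = xs[np.argsort(f(xs))[: int(tau * pop)]]
    new_mu = np.median(sel)
    q75, q25 = np.percentile(sel, [75, 25])
    return new_mu, k * (q75 - q25) / 2
```

The heavy Cauchy tails keep producing occasional far-away samples, which is what lets the restarted algorithm cope with landscapes on which the Gaussian model stalls.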
Papers by Petr Pošík