DOE2VEC: REPRESENTATION LEARNING FOR EXPLORATORY LANDSCAPE ANALYSIS

Abstract

We propose DoE2Vec, a variational autoencoder (VAE)-based methodology to learn optimization landscape characteristics for downstream meta-learning tasks, e.g., automated selection of optimization algorithms. Using large training data sets generated with a random function generator, DoE2Vec self-learns an informative latent representation for any design of experiments (DoE). Unlike the classical exploratory landscape analysis (ELA) method, our approach does not require any feature engineering and is easily applicable to high-dimensional search spaces. For validation, we inspect the quality of latent reconstructions and analyze the latent representations in different experiments. The latent representations not only show promising potential for identifying similar (cheap-to-evaluate) surrogate functions, but can also significantly boost performance when used complementarily to the ELA features in classification tasks.
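To make the pipeline concrete, the following is an illustrative sketch (not the authors' implementation) of the DoE2Vec idea: evaluate a function on a fixed DoE, rescale the responses, and map them through a VAE-style encoder using the reparameterization trick to obtain a latent representation. The layer sizes and the random-weight "encoder" are hypothetical stand-ins; a real VAE would train the encoder weights with a reconstruction loss plus a KL-divergence term.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_doe(n_samples: int, dim: int) -> np.ndarray:
    """Random-uniform DoE in [0, 1]^dim (a stand-in for e.g. a Sobol design)."""
    return rng.random((n_samples, dim))

def evaluate(doe: np.ndarray) -> np.ndarray:
    """Example objective: the sphere function, centered at 0.5."""
    return np.sum((doe - 0.5) ** 2, axis=1)

def rescale(y: np.ndarray) -> np.ndarray:
    """Min-max rescale the responses so the encoder input lies in [0, 1]."""
    return (y - y.min()) / (y.max() - y.min() + 1e-12)

class ToyEncoder:
    """Untrained encoder mapping n DoE responses to a latent vector.

    A trained VAE would learn W_mu / W_logvar by minimizing reconstruction
    loss plus a KL term; here they are random, for shape and flow only.
    """
    def __init__(self, n_in: int, latent_dim: int):
        self.W_mu = rng.normal(0.0, 0.1, (n_in, latent_dim))
        self.W_logvar = rng.normal(0.0, 0.1, (n_in, latent_dim))

    def encode(self, y: np.ndarray) -> np.ndarray:
        mu = y @ self.W_mu
        logvar = y @ self.W_logvar
        eps = rng.standard_normal(mu.shape)
        # Reparameterization trick: z = mu + sigma * eps
        return mu + np.exp(0.5 * logvar) * eps

doe = make_doe(n_samples=64, dim=5)
y = rescale(evaluate(doe))
encoder = ToyEncoder(n_in=64, latent_dim=8)
z = encoder.encode(y)
print(z.shape)  # (8,)
```

Because the DoE locations are fixed, the vector of rescaled responses is a consistent "fingerprint" of the function, which is what makes a learned latent code comparable across functions.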

1. INTRODUCTION

Solving real-world black-box optimization problems can be extremely complicated, particularly when they are strongly nonlinear and require expensive function evaluations. As suggested by the no free lunch theorem of Wolpert & Macready (1997), there is no single best optimization algorithm capable of optimally solving all kinds of problems. The task of identifying the most time- and resource-efficient optimization algorithm for each specific problem, also known as the algorithm selection problem (ASP) (see Rice (1976)), is tedious and challenging, even with domain knowledge and experience. In recent years, landscape-aware algorithm selection has gained increasing attention from the research community, where fitness landscape characteristics are exploited to explain the effectiveness of an algorithm across different problem instances (see van Stein et al. (2013); Simoncini et al. (2018)). Beyond that, it has been shown that landscape characteristics are sufficiently informative to reliably predict the performance of optimization algorithms, e.g., using machine learning approaches (see Bischl et al. (2012); Dréo et al. (2019); Kerschke & Trautmann (2019a); Jankovic & Doerr (2020); Jankovic et al. (2021); Pikalov & Mironovich (2021)). In other words, the expected performance of an optimization algorithm on an unseen problem can be estimated once the corresponding landscape characteristics have been identified. Interested readers are referred to Muñoz et al. (2015b;a); Kerschke et al. (2019); Kerschke & Trautmann (2019a); Malan (2021).

Exploratory landscape analysis (ELA), for instance, considers six classes of expertly designed features, including y-distribution, level set, meta-model, local search, curvature and convexity, to numerically quantify the landscape complexity of an optimization problem, such as multimodality, global structure, separability, plateaus, etc. (see Mersmann et al. (2010; 2011)). Each feature class consists of a set of features that can be computed relatively cheaply. Beyond typical ASP tasks, ELA has shown great potential in a wide variety of applications, such as understanding the underlying landscape of neural architecture search problems in van Stein et al. (2020) and classifying the Black-Box Optimization Benchmarking (BBOB) problems in Renau et al. (2021).

Recently, ELA has been applied not only to analyze the landscape characteristics of crash-worthiness optimization problems from the automotive industry, but also to identify appropriate cheap-to-evaluate functions as representatives of expensive real-world problems (see Long et al. (2022)). While ELA is well established in capturing optimization landscape characteristics, we would like to raise our concerns regarding the following aspects.
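As a concrete example of the classical ELA approach that DoE2Vec contrasts with, the sketch below computes two of the "y-distribution" features described by Mersmann et al. (2011): the skewness and kurtosis of the objective values on a DoE sample. The feature names and helper function are our own illustration; dedicated packages such as flacco provide the full feature sets.

```python
import numpy as np
from scipy.stats import kurtosis, skew

rng = np.random.default_rng(0)

def y_distribution_features(y: np.ndarray) -> dict:
    """Cheap distribution-based landscape features of sampled y-values.

    Key names follow the ela_distr.* convention used in the ELA
    literature; this is an illustrative subset, not a full feature set.
    """
    return {
        "ela_distr.skewness": float(skew(y)),
        "ela_distr.kurtosis": float(kurtosis(y)),
    }

# DoE sample of the 5-D sphere function on [-5, 5]^5
X = rng.uniform(-5, 5, size=(250, 5))
y = np.sum(X ** 2, axis=1)
feats = y_distribution_features(y)
print(feats)
```

Note that such features are hand-engineered per class and must be recomputed and revalidated as dimensionality grows, which is exactly the burden a learned latent representation aims to remove.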

