BLACK-BOX OPTIMIZATION REVISITED: IMPROVING ALGORITHM SELECTION WIZARDS THROUGH MASSIVE BENCHMARKING

Abstract

Existing studies in black-box optimization for machine learning suffer from low generalizability, caused by a typically selective choice of problem instances used for training and testing different optimization algorithms. Among other issues, this practice promotes overfitting and poor-performing user guidelines. To address this shortcoming, we propose in this work a benchmark suite, OptimSuite, which covers a broad range of black-box optimization problems: from academic benchmarks to real-world applications, from discrete through numerical to mixed-integer problems, from small- to very large-scale problems, and from noisy and dynamic to static problems. We demonstrate the advantages of such a broad collection by deriving from it Automated Black Box Optimizer (ABBO), a general-purpose algorithm selection wizard. Using three different types of algorithm selection techniques, ABBO achieves competitive performance on all benchmark suites. It significantly outperforms the previous state of the art on some of them, including YABBOB and LSGO. ABBO relies on many high-quality base components, and its excellent performance is obtained without any task-specific parametrization. The benchmark collection, the ABBO wizard, its base solvers, and all experimental data are reproducible and open source in OptimSuite.

1. INTRODUCTION: STATE OF THE ART

Many real-world optimization challenges are black-box problems; i.e., instead of having an explicit problem formulation, they can only be accessed through the evaluation of solution candidates. These evaluations often require simulations or even physical experiments. Black-box optimization methods are particularly widespread in machine learning (Salimans et al., 2016; Wang et al., 2020), to the point that black-box optimization is considered a key research area of artificial intelligence. Black-box optimization algorithms are typically easy to implement and easy to adjust to different problem types. To achieve peak performance, however, proper algorithm selection and configuration are key, since black-box optimization algorithms have complementary strengths and weaknesses (Rice, 1976; Smith-Miles, 2009; Kotthoff, 2014; Bischl et al., 2016; Kerschke & Trautmann, 2018; Kerschke et al., 2018). But whereas automated algorithm selection has become standard in SAT solving (Xu et al., 2008) and AI planning (Vallati et al., 2015), a manual selection and configuration of the algorithms is still predominant in the broader black-box optimization context. To reduce the bias inherent to such manual choices, and to support the automation of algorithm selection and configuration, sound comparisons of the different black-box optimization approaches are needed. Existing benchmarking suites, however, are rather selective in the problems they cover. This leads to specialized algorithm frameworks whose performance suffers from poor generalizability. Addressing this flaw in black-box optimization, we present a unified benchmark collection which covers a previously unseen breadth of problem instances. We use this collection to develop a high-performing algorithm selection wizard, ABBO. ABBO uses high-level problem characteristics to select one or several algorithms, which are then run for the allocated budget of function evaluations.
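To make the wizard idea concrete, the following is a minimal sketch of a rule-based algorithm selection wizard in this spirit: high-level problem descriptors (dimension, evaluation budget, noise, variable types) are mapped to a base solver before any evaluations are spent. The descriptor fields, solver names, and thresholds below are illustrative assumptions, not ABBO's actual decision rules.

```python
from dataclasses import dataclass


@dataclass
class ProblemDescriptor:
    """High-level, cheaply available characteristics of a black-box problem."""
    dimension: int          # number of decision variables
    budget: int             # allowed number of function evaluations
    noisy: bool = False     # are evaluations stochastic?
    discrete: bool = False  # does the search space contain discrete variables?


def select_solver(p: ProblemDescriptor) -> str:
    """Return the name of a base solver suited to the given problem features.

    The rules and solver names are hypothetical placeholders illustrating
    the selection-wizard pattern, not ABBO's real policy.
    """
    if p.discrete:
        return "GeneticAlgorithm"      # discrete search spaces
    if p.noisy:
        return "PopulationControlES"   # resampling copes with noisy evaluations
    if p.dimension > 1000:
        return "CoordinateDescent"     # large scale: avoid full covariance models
    if p.budget < 30 * p.dimension:
        return "CobylaLocalSearch"     # tiny budgets favor cheap local models
    return "CMA-ES"                    # default continuous optimizer


# Example: a 50-dimensional noiseless continuous problem with a generous budget.
print(select_solver(ProblemDescriptor(dimension=50, budget=10000)))  # CMA-ES
```

A full wizard would additionally allow returning several solvers to be run in sequence or in a portfolio, splitting the budget among them.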
Originally derived from a subset of the available benchmark collection, in particular YABBOB, the excellent performance of ABBO generalizes across almost all settings of our broad benchmark suite. Implemented as a fork of Nevergrad (Rapin & Teytaud, 2018), the benchmark collection, the ABBO wizard, the base solvers, and all performance data are open source. The algorithms are automatically rerun at certain time intervals and all

