BOPTFORMER: BEYOND TRANSFORMER FOR BLACK-BOX OPTIMIZATION

Abstract

We design a novel Transformer for continuous unconstrained black-box optimization, called BOptformer. Inspired by the similarity between Vision Transformer and evolutionary algorithms (EAs), we modify the Transformer's multi-head self-attention layer, feed-forward network, and residual connection to implement the functions of the crossover, mutation, and selection operators. Moreover, we devise an iterative mode that generates and retains promising solutions, as EAs do. BOptformer learns optimization strategies from the target task automatically, without human intervention, which addresses the poor generalization of human-designed EAs on new tasks. Compared to baselines such as EAs, Bayesian optimization, and learning-to-optimize (L2O) methods, BOptformer achieves the best performance on six black-box functions and two real-world applications. We also find that an untrained BOptformer can achieve good performance on simple tasks, and that deeper BOptformer models outperform shallow ones. We thus provide a new and efficient Transformer-based black-box optimization framework for the L2O and EA communities.

1. INTRODUCTION

Many tasks, such as neural architecture search (Elsken et al., 2019) and hyperparameter optimization (Hutter et al., 2019; Golovin et al., 2017), can be abstracted as black-box optimization problems: although we can evaluate f(x) for any x ∈ X, we have no access to any other information about f, such as its gradients or Hessian. A series of hand-designed algorithms, such as evolutionary algorithms (EAs) (Mitchell, 1998; Khadka & Tumer, 2018; Zhang & Li, 2007), Bayesian optimization (Snoek et al., 2012; Mutny & Krause, 2018; Li et al., 2017; Kandasamy et al., 2015; Balandat et al., 2020), and evolution strategies (ES) (Wierstra et al., 2014; Hansen & Ostermeier, 2001; Auger & Hansen, 2005; Salimans et al., 2017), have been developed to solve black-box optimization. Recently, the learning to optimize (L2O) framework (Chen et al., 2022) has offered a new perspective on optimization by employing a recurrent neural network (RNN), a long short-term memory architecture (LSTM) (Chen et al., 2020; Andrychowicz et al., 2016; Chen et al., 2017; Li & Malik, 2016; Wichrowska et al., 2017; Bello et al., 2017), or a multilayer perceptron (MLP) (Metz et al., 2019) as the optimizer, aiming to reduce the laborious iterations of hand engineering (Sun et al., 2018; Vicol et al., 2021; Flennerhag et al., 2021; Li & Malik, 2016). However, most of these works do not focus on black-box optimization. The core of L2O is constructing a strong mapping from the initial solutions to the optimal solution. Although several efforts, such as (Cao et al., 2019; Chen et al., 2017), have addressed black-box problems, their effectiveness may be hindered by the limited representational capabilities of RNNs, LSTMs, and MLPs. In EAs, hand-designed crossover, mutation, and selection operators move the initial population toward the optimal solution. This update paradigm has stood the test of time.
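To make the EA update paradigm concrete, the following is a minimal, generic sketch of a population-based EA with crossover, mutation, and selection, minimizing a black-box objective. It is an illustration of the classical scheme described above, not the paper's BOptformer method; all function names and hyperparameters here are illustrative choices.

```python
import random

def evolutionary_algorithm(f, dim=2, pop_size=20, generations=100,
                           mutation_rate=0.1, bounds=(-5.0, 5.0)):
    """Minimize a black-box function f using only its evaluations."""
    lo, hi = bounds
    # Initialize a random population of candidate solutions.
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        pop.sort(key=f)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = random.sample(parents, 2)
            # Crossover: uniformly mix coordinates of the two parents.
            child = [random.choice(pair) for pair in zip(p1, p2)]
            # Mutation: perturb each coordinate with small probability.
            child = [x + random.gauss(0, 0.1) if random.random() < mutation_rate else x
                     for x in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=f)

# Example: minimize the 2-D sphere function, whose optimum is the origin.
best = evolutionary_algorithm(lambda x: sum(v * v for v in x))
```

Each operator here is hand-designed and fixed; tuning them to a new task is exactly the manual effort that a learned optimizer aims to eliminate.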
Because the evolutionary operators must be tailored to maximize performance on the target task, human-designed EAs generalize poorly to new black-box problems. Most notably, the limited use of target-function information in EA design, constrained by expert knowledge, makes it difficult to adapt to the target task. Learning optimization strategies from the target task is the key step to overcoming this limitation. This paper designs a novel L2O framework, termed BOptformer, that combines the advantages of Vision Transformer (Dosovitskiy et al., 2021) and EAs to overcome the above limitations. Moreover,

