POISSON PROCESS FOR BAYESIAN OPTIMIZATION

Abstract

Bayesian Optimization (BO) is a sample-efficient, model-based method for optimizing black-box functions that can be expensive to evaluate. Traditionally, BO fits a probabilistic surrogate model, such as the Tree-structured Parzen Estimator (TPE), Sequential Model-based Algorithm Configuration (SMAC), or a Gaussian process (GP), to the exact observed values. However, compared to the value response, the relative ranking of candidates is less easily disrupted by noise and is therefore more robust. Moreover, ranking is more practical when exact value responses are intractable but information about candidate preferences can be acquired. This work introduces an efficient BO framework, namely Poisson Process Bayesian Optimization (PoPBO), consisting of a novel ranking-based response surface based on the Poisson process and two acquisition functions tailored to the proposed surrogate model. We show empirically that PoPBO improves efficacy and efficiency on both simulated and real-world benchmarks, including HPO and NAS.

1. INTRODUCTION

Bayesian optimization (BO) (Mockus et al., 1978) is a popular black-box optimization paradigm that has achieved great success in a number of challenging fields, such as robotic control (Calandra et al., 2016), biology (González et al., 2015), and hyperparameter tuning for complex learning tasks (Bergstra et al., 2011). A standard BO routine usually consists of two steps: (1) learning a probabilistic response surface that captures the distribution of an unknown function f(x); (2) optimizing an acquisition function that suggests the most valuable points to query in the next iteration. Popular response surfaces for the first step include Random Forests (SMAC) (Hutter et al., 2011), the Tree-structured Parzen Estimator (TPE) (Bergstra et al., 2011), Gaussian Processes (GP) (Snoek et al., 2012), and Bayesian Neural Networks (BNN) (Springenberg et al., 2016; Snoek et al., 2015). Acquisition functions for the second step include Expected Improvement (EI) (Mockus, 1994), Thompson Sampling (TS) (Chapelle & Li, 2011; Agrawal & Goyal, 2013), and the Upper/Lower Confidence Bound (UCB/LCB) (Srinivas et al., 2012), all designed to trade off exploration and exploitation. Most existing BO methods (Bergstra et al., 2011; Hutter et al., 2011; Snoek et al., 2012) adopt absolute response surfaces[1] that attempt to fit the black-box function based on the observed absolute function values. However, such an absolute metric has the following disadvantages. 1) Absolute responses can be difficult to obtain, or even unavailable, in some practical scenarios, such as sports games and recommender systems, where only relative evaluations[2] can be provided by pairwise comparison (He et al., 2022). 2) Absolute responses can be sensitive to noise, as also pointed out by Rosset et al. (2005). This issue degrades the performance of BO in real-world scenarios, where absolute responses are usually noisy. 3) It can be challenging to directly transfer absolute response surfaces.
In particular, multi-fidelity metrics usually yield different absolute responses for the same candidate, making it hard to utilize historical observations on a coarse-fidelity metric to warm up the training of surrogate models on a fine-grained one. Similarly, in hyperparameter optimization (HPO) and neural architecture search (NAS) tasks, the performance of the same hyperparameter configuration or neural architecture differs across datasets and is hard to transfer.
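The standard two-step BO routine described above can be sketched as follows. This is a minimal, illustrative implementation with a GP surrogate (scikit-learn) and the EI acquisition function; the target function, search interval, candidate grid, kernel choice, and budget are all assumptions made for the sketch, not part of the proposed method.

```python
# Minimal sketch of the two-step BO loop: (1) fit a probabilistic
# response surface, (2) optimize an acquisition function (EI here).
# All problem-specific choices below are illustrative.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):
    # Toy black-box function to minimize (illustrative).
    return np.sin(3 * x) + x ** 2 - 0.7 * x

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 2.0, size=(4, 1))            # initial design
y = f(X).ravel()
grid = np.linspace(-1.0, 2.0, 500).reshape(-1, 1)  # candidate pool

for _ in range(10):
    # Step 1: learn the probabilistic response surface.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    # Step 2: maximize Expected Improvement over the candidates
    # (minimization form: improvement = best - mu).
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = grid[np.argmax(ei)].reshape(1, -1)
    # Query the black box and augment the observation set.
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print(float(y.min()))
```

Because the surrogate here is fitted to the exact observed values of f, it is precisely the "absolute response surface" that points 1)-3) criticize; PoPBO replaces this step with a ranking-based surrogate built on a Poisson process.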



[1] In this work, the 'absolute evaluation (response)' of one query is defined as its exact black-box function value.
[2] In this work, the 'relative evaluation (response)' of one query is defined as its ranking, which can be computed by comparing it with other candidates.

