MODEL-BASED CAUSAL BAYESIAN OPTIMIZATION

Abstract

How should we intervene on an unknown structural causal model to maximize a downstream variable of interest? This optimization of the output of a system of interconnected variables, also known as causal Bayesian optimization (CBO), has important applications in medicine, ecology, and manufacturing. Standard Bayesian optimization algorithms fail to effectively leverage the underlying causal structure. Existing CBO approaches assume noiseless measurements and do not come with guarantees. We propose model-based causal Bayesian optimization (MCBO), an algorithm that learns a full system model instead of only modeling intervention-reward pairs. MCBO propagates epistemic uncertainty about the causal mechanisms through the graph and trades off exploration and exploitation via the optimism principle. We bound its cumulative regret, and obtain the first non-asymptotic bounds for CBO. Unlike in standard Bayesian optimization, our acquisition function cannot be evaluated in closed form, so we show how the reparameterization trick can be used to apply gradient-based optimizers. Empirically we find that MCBO compares favorably with existing state-of-the-art approaches.

1. INTRODUCTION

Many applications, such as drug and material discovery, robotics, agriculture, and automated machine learning, require optimizing an unknown function that is expensive to evaluate. Bayesian optimization (BO) is an efficient framework for sequential optimization of such objectives (Močkus, 1975) . The key idea in BO is to quantify uncertainty in the unknown function via a probabilistic model, and then use this to navigate a trade-off between selecting inputs where the function output is favourable (exploitation) and selecting inputs to learn more about the function in areas of uncertainty (exploration). While most standard BO methods focus on a black-box setup (Figure 1 a ), in practice, we often have more structure on the unknown function that can be used to improve sample efficiency. In this paper, we exploit structural knowledge in the form of a causal graph specified by a directed acyclic graph (DAG). In particular, we assume that actions can be modeled as interventions on a structural causal model (SCM) (Pearl, 2009) that contains the reward (function output) as a variable (Figure 1 b ). While we assume the graph structure to be known, we consider the functional relations in the SCM as unknown. All variables in the SCM are observed along with the reward after each action. This Causal BO setting has important potential applications, such as optimizing medical and ecological interventions (Aglietti et al., 2020b) . For illustrative purposes, consider the example of an agronomist trying to find the optimal Nitrogen fertilizer schedule for maximizing crop yield, described in Figure 1 . There, the concentration of Nitrogen in the soil causes its concentration in the soil at the later timesteps. To exploit the causal graph structure for optimization, we propose model-based causal Bayesian optimization (MCBO). MCBO explicitly models the full SCM and the accompanying uncertainty of all SCM components. This allows our algorithm to select interventions based on an optimistic strategy similar to that used by the upper confidence bound algorithm (Srinivas et al., 2010) . We show that this strategy leads to the first CBO algorithm with a cumulative regret guarantee. For a practical algorithm, maximizing the upper confidence bound in our setting is computationally more difficult, because uncertainty in all system components must be propagated through the entire estimated SCM to the reward variable. We show that an application of the reparameterization trick allows MCBO to be practically implemented with common gradient-based optimizers. Empirically,

