CHARACTERIZING LOOKAHEAD DYNAMICS OF SMOOTH GAMES

Abstract

As multi-agent systems proliferate in machine learning research, games have attracted much attention as a framework for understanding the optimization of multiple interacting objectives. However, a key challenge in game optimization is that, in general, standard gradient-based methods are not guaranteed to converge to a local solution of the game. Recent work by Chavdarova et al. (2020) reports that the Lookahead optimizer (Zhang et al., 2019) significantly improves the performance of Generative Adversarial Networks (GANs) and reduces the rotational force of bilinear games. While promising, their observations were purely empirical, and Lookahead optimization of smooth games still lacks theoretical understanding. In this paper, we fill this gap by theoretically characterizing the Lookahead dynamics of smooth games. We provide an intuitive geometric explanation of how and when Lookahead can improve game dynamics in terms of stability and convergence. Furthermore, we present sufficient conditions under which Lookahead optimization of bilinear games provably stabilizes or accelerates convergence to a Nash equilibrium of the game. Finally, we show that the Lookahead optimizer preserves locally asymptotically stable equilibria of the base dynamics, and can either stabilize or accelerate local convergence to a given equilibrium under suitable assumptions. We verify our theoretical predictions by conducting numerical experiments on two-player zero-sum (non-linear) games.

1. INTRODUCTION

Recently, a plethora of learning problems have been formulated as games between multiple interacting agents, including Generative Adversarial Networks (GANs) (Goodfellow et al., 2014; Brock et al., 2019; Karras et al., 2019), adversarial training (Goodfellow et al., 2015; Madry et al., 2018), self-play (Silver et al., 2018; Bansal et al., 2018), inverse reinforcement learning (RL) (Fu et al., 2018), and multi-agent RL (Lanctot et al., 2017; Vinyals et al., 2019). However, the optimization of interdependent objectives is a non-trivial problem, in terms of both computational complexity (Daskalakis et al., 2006; Chen et al., 2009) and convergence to an equilibrium (Goodfellow, 2017; Mertikopoulos et al., 2018; Mescheder et al., 2018; Hsieh et al., 2020). In particular, gradient-based optimization methods often fail to converge and instead oscillate around a (local) Nash equilibrium of the game, even in very simple settings (Mescheder et al., 2018; Daskalakis et al., 2018; Mertikopoulos et al., 2019; Gidel et al., 2019b;a). To tackle such non-convergent game dynamics, a huge effort has been devoted to developing efficient optimization methods with convergence guarantees in smooth games (Mescheder et al., 2017; 2018; Daskalakis et al., 2018; Balduzzi et al., 2018; Gidel et al., 2019b;a; Schäfer & Anandkumar, 2019; Yazici et al., 2019; Loizou et al., 2020). Meanwhile, Chavdarova et al. (2020) have recently reported that the Lookahead optimizer (Zhang et al., 2019) significantly improves the empirical performance of GANs and reduces the rotational force of bilinear game dynamics. Specifically, they demonstrate that class-unconditional GANs trained with a Lookahead optimizer can outperform the class-conditional BigGAN (Brock et al., 2019) trained with Adam (Kingma & Ba, 2015), using a model with 1/30 of the parameters and negligible computational overhead.
They also show that Lookahead optimization of a stochastic bilinear game tends to be more robust against large gradient variance than other popular first-order methods, and converges to a Nash equilibrium of the game where the other methods fail. Despite its great promise, the study of Chavdarova et al. (2020) relied on purely empirical observations, and the dynamics of Lookahead game optimization still lacks theoretical understanding. Specifically, many open questions, such as the convergence properties of Lookahead dynamics and the impact of its hyperparameters on convergence, remain unexplained. In this work, we fill this gap by theoretically characterizing the Lookahead dynamics of smooth games. Our contributions are summarized as follows:

• We provide an intuitive geometric explanation of how and when Lookahead can improve game dynamics in terms of stability and convergence to an equilibrium.

• We analyze the convergence of Lookahead dynamics in bilinear games and present sufficient conditions under which the base dynamics can be either stabilized or accelerated.

• We characterize the limit points of Lookahead dynamics in terms of their stability and local convergence rates. Specifically, we show that Lookahead (i) preserves locally asymptotically stable equilibria of the base dynamics and (ii) can either stabilize or accelerate local convergence to a given equilibrium by carefully choosing its hyperparameters.

• Each of our theoretical predictions is verified with numerical experiments on two-player zero-sum (non-linear) smooth games.
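For reference, the Lookahead optimizer of Zhang et al. (2019) discussed above maintains "slow" weights that are periodically pulled toward "fast" weights produced by an inner optimizer. The following is a minimal sketch of this scheme wrapping an arbitrary base update; the function names and hyperparameter values are illustrative, not taken from the paper:

```python
import numpy as np

def lookahead(base_step, x0, k=5, alpha=0.5, n_outer=100):
    """Lookahead wrapper: run k 'fast' steps of base_step, then move the
    'slow' weights a fraction alpha toward the resulting fast weights."""
    slow = np.asarray(x0, dtype=float)
    for _ in range(n_outer):
        fast = slow.copy()
        for _ in range(k):
            fast = base_step(fast)           # inner (fast) updates
        slow = slow + alpha * (fast - slow)  # slow-weight interpolation
    return slow

# Illustrative base dynamics: simultaneous gradient descent on f(x, y) = x*y,
# which on its own spirals away from the Nash equilibrium (0, 0).
base = lambda z: z - 0.1 * np.array([z[1], -z[0]])
print(np.linalg.norm(lookahead(base, [1.0, 1.0])))  # well below the initial norm of ~1.41
```

Here the slow-weight interpolation damps the rotational component of the base dynamics, which is the effect this paper analyzes.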

2. PRELIMINARIES

We briefly review the objective of smooth game optimization, first-order game dynamics, and the Lookahead optimizer. Finally, we discuss previous work on game optimization. We summarize the notation used throughout this paper in Table A.1.

2.1. SMOOTH GAMES

Following Balduzzi et al. (2018), a smooth game between players $i = 1, \dots, n$ can be defined as a set of smooth scalar functions $\{f_i\}_{i=1}^n$ with $f_i : \mathbb{R}^d \to \mathbb{R}$, where $d = \sum_{i=1}^n d_i$. Each $f_i$ represents the cost of player $i$'s strategy $x_i \in \mathbb{R}^{d_i}$ with respect to the other players' strategies $x_{-i}$. The goal of game optimization is to find a (local) Nash equilibrium of the game (Nash, 1951), i.e., a strategy profile where no player has a unilateral incentive to change its own strategy.

Definition 1 (Nash equilibrium). Let $\{f_i\}_{i=1}^n$ be a smooth game with strategy spaces $\{\mathbb{R}^{d_i}\}_{i=1}^n$ such that $d = \sum_{i=1}^n d_i$. Then $x^* \in \mathbb{R}^d$ is a local Nash equilibrium of the game if, for each $i = 1, \dots, n$, there is a neighborhood $U_i$ of $x_i^*$ such that $f_i(x_i, x_{-i}^*) \geq f_i(x^*)$ holds for any $x_i \in U_i$. Such $x^*$ is said to be a global Nash equilibrium of the game when $U_i = \mathbb{R}^{d_i}$ for each $i = 1, \dots, n$.

A straightforward computational approach to finding a (local) Nash equilibrium of a smooth game is to carefully design a gradient-based strategy update rule for each player. Such update rules, which define iterative plays between players, are referred to as the dynamics of the game.

Definition 2 (Dynamics of a game). A dynamics of a smooth game $\{f_i\}_{i=1}^n$ is a differentiable operator $F : \mathbb{R}^d \to \mathbb{R}^d$ that describes the players' iterative strategy updates as $x^{(t+1)} = F(x^{(t)})$.

One might expect that a simple myopic game dynamics, such as gradient descent, would suffice to find a (local) Nash equilibrium of a game, as in traditional minimization problems. However, in general, gradient descent optimization of smooth games often fails to converge and instead oscillates around an equilibrium of the game (Daskalakis et al., 2018; Gidel et al., 2019b;a; Letcher et al., 2019).
Such non-convergent behavior of game dynamics is mainly due to the (non-cooperative) interaction between multiple cost functions, and is considered a key challenge in game optimization (Mescheder et al., 2017; 2018; Mazumdar et al., 2019; Hsieh et al., 2020).
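This non-convergence can be observed concretely on the standard bilinear example $\min_x \max_y f(x, y) = xy$; the following is a minimal numerical sketch (the step size is illustrative):

```python
import numpy as np

# Simultaneous gradient descent on min_x max_y f(x, y) = x * y, whose
# unique Nash equilibrium is (0, 0): each player follows its own partial
# gradient, x <- x - eta * y (descent) and y <- y + eta * x (ascent).
eta = 0.1
z = np.array([1.0, 1.0])
norms = [np.linalg.norm(z)]
for _ in range(200):
    x, y = z
    z = np.array([x - eta * y, y + eta * x])
    norms.append(np.linalg.norm(z))

# The iterates rotate around the equilibrium while their norm grows by a
# factor of sqrt(1 + eta**2) at every step: the dynamics spirals outward.
assert all(b > a for a, b in zip(norms, norms[1:]))
```

The update is a scaled rotation with eigenvalues $1 \pm i\eta$ of modulus greater than one, which is exactly the rotational force that motivates methods such as Lookahead.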

2.2. FIRST-ORDER METHODS FOR SMOOTH GAME OPTIMIZATION

We introduce well-known first-order methods for smooth game optimization. To ease the notation, we use $\nabla_x f(\cdot)$ to denote the concatenated partial derivatives $(\nabla_{x_1} f_1(\cdot), \dots, \nabla_{x_n} f_n(\cdot))$ of a smooth game $\{f_i\}_{i=1}^n$, where $\nabla_{x_i} f_i(\cdot)$ is the partial derivative of player $i$'s cost function with respect to its own strategy.
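As a concrete illustration of this notation, the concatenated field can be approximated numerically. The sketch below uses central finite differences; the function name `simultaneous_grad` and its interface are our own, for illustration only:

```python
import numpy as np

def simultaneous_grad(costs, x, dims, eps=1e-6):
    """Approximate the concatenated field (grad_{x_1} f_1, ..., grad_{x_n} f_n):
    each player's partial gradient w.r.t. its own strategy block,
    computed here by central finite differences."""
    x = np.asarray(x, dtype=float)
    grads, offset = [], 0
    for f, d in zip(costs, dims):
        for j in range(offset, offset + d):  # coordinates of player i's block
            e = np.zeros_like(x)
            e[j] = eps
            grads.append((f(x + e) - f(x - e)) / (2 * eps))
        offset += d
    return np.array(grads)

# Two-player zero-sum bilinear game: f_1(x, y) = x*y, f_2(x, y) = -x*y.
costs = [lambda z: z[0] * z[1], lambda z: -z[0] * z[1]]
print(simultaneous_grad(costs, [2.0, 3.0], dims=[1, 1]))  # ~[ 3. -2.]
```

Note that $(\nabla_{x_1} f_1, \nabla_{x_2} f_2) = (y, -x)$ here is a pure rotation field, consistent with the oscillatory behavior of bilinear games discussed above.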

