FINDING AND ONLY FINDING LOCAL NASH EQUILIBRIA BY BOTH PRETENDING TO BE A FOLLOWER

Abstract

Finding Nash equilibria in two-player differentiable games is a classical problem in game theory with important relevance in machine learning. We propose double Follow-the-Ridge (double-FTR), an algorithm that locally converges to and only to local Nash equilibria in general-sum two-player differentiable games. To our knowledge, double-FTR is the first algorithm with such guarantees for general-sum games. Furthermore, we show that by varying its preconditioner, double-FTR leads to a broader family of algorithms with the same convergence guarantee. In addition, double-FTR avoids oscillation near equilibria due to the real eigenvalues of its Jacobian at fixed points. Empirically, we validate the double-FTR algorithm on a range of simple zero-sum and general-sum games, as well as simple Generative Adversarial Network (GAN) tasks.

1. INTRODUCTION

Much of the recent success in deep learning can be attributed to the effectiveness of gradient-based optimization. It is well known that for a minimization problem, with an appropriate choice of learning rates, gradient descent is guaranteed to converge to local minima (Lee et al., 2016; 2019). Based on this foundational result, an array of accelerated and higher-order methods have since been proposed and widely applied in training neural networks (Duchi et al., 2011; Kingma and Ba, 2014; Reddi et al., 2018; Zhang et al., 2019b). However, once we leave the realm of minimization problems and consider the multi-agent setting, the optimization landscape becomes much more complicated. Multi-agent optimization problems arise in diverse fields such as robotics, economics and machine learning (Foerster et al., 2016; Von Neumann and Morgenstern, 2007; Goodfellow et al., 2014; Ben-Tal and Nemirovski, 2002; Gemp et al., 2020; Anil et al., 2021). A classical abstraction that is especially relevant for machine learning is the two-player differentiable game, where the objective is to find global or local Nash equilibria. The equivalent of gradient descent in such a game is for each agent to apply gradient descent to minimize their own objective function. However, in stark contrast with gradient descent on minimization problems, this gradient-descent-style algorithm may converge to spurious critical points that are not local Nash equilibria, and in the general-sum case, local Nash equilibria might not even be stable critical points for this algorithm (Mazumdar et al., 2020b)! These negative results have driven a surge of recent interest in developing other gradient-based algorithms for finding Nash equilibria in differentiable games. Among them is Mazumdar et al. (2019), who proposed an update algorithm whose attracting critical points are only local Nash equilibria in the special case of zero-sum games.
However, to the best of our knowledge, such guarantees have not been extended to general-sum games. We propose double Follow-the-Ridge (double-FTR), a gradient-based algorithm for general-sum differentiable games that locally converges to and only to differential Nash equilibria. Double-FTR is closely related to the Follow-the-Ridge (FTR) algorithm for Stackelberg games (Wang et al., 2019), which converges to and only to local Stackelberg equilibria (Fiez et al., 2019). Double-FTR can be viewed as its counterpart for simultaneous games, where each player adopts the "follower" strategy in FTR.

The rest of this paper is organized as follows. In Section 2, we give background on two-player differentiable games and equilibrium concepts. We also explain the issues with using gradient-descent-style algorithms on such games. In Section 3, we present the double-FTR algorithm and prove its local convergence to and only to differential Nash equilibria. We also identify a more general class of algorithms that share these properties. We discuss recent works directly relevant to double-FTR in Section 4 and other related work in Section 5. In Section 6, we show empirical evidence of double-FTR's convergence to and only to local Nash equilibria.

2. BACKGROUND

2.1. TWO-PLAYER DIFFERENTIABLE GAMES AND EQUILIBRIUM CONCEPTS

In a general-sum two-player differentiable game, player 1 aims to minimize $f: \mathbb{R}^{n+m} \to \mathbb{R}$ with respect to $x \in \mathbb{R}^n$, whereas player 2 aims to maximize $g: \mathbb{R}^{n+m} \to \mathbb{R}$ with respect to $y \in \mathbb{R}^m$. Following the notation in Mazumdar et al. (2019), we denote the game as $\{(f, g), \mathbb{R}^{n+m}\}$. We also make the following assumption on the twice-differentiability of $f$ and $g$.

Assumption 1. For all $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, $f$ and $g$ are twice-differentiable, and the second derivatives are continuous. Also, $\nabla^2_{xx} f$ and $\nabla^2_{yy} g$ are invertible.

For two rational, non-cooperative players, the optimal outcome is to achieve a local Nash equilibrium (Ratliff et al., 2013). A point $(x^*, y^*)$ is a local Nash equilibrium¹ of $\{(f, g), \mathbb{R}^{n+m}\}$ if there exist open sets $S_x \subset \mathbb{R}^n$, $S_y \subset \mathbb{R}^m$ such that $x^* \in S_x$, $y^* \in S_y$, and
$$f(x^*, y^*) \le f(x, y^*), \quad g(x^*, y^*) \ge g(x^*, y), \quad \forall x \in S_x, \; \forall y \in S_y.$$
A closely related notion of equilibrium is the differential Nash equilibrium (DNE) (Ratliff et al., 2013), which satisfies a second-order sufficient condition for local Nash equilibrium.

Definition 2.1 (Differential Nash equilibrium). $(x^*, y^*)$ is a differential Nash equilibrium of $\{(f, g), \mathbb{R}^{n+m}\}$ if the following two conditions hold:
• $\nabla_x f(x^*, y^*) = 0$ and $\nabla_y g(x^*, y^*) = 0$;
• $\nabla^2_{xx} f(x^*, y^*) \succ 0$ and $\nabla^2_{yy} g(x^*, y^*) \prec 0$.

The conditions of DNE are slightly stronger than those of local Nash equilibria in that the second-order conditions are definite instead of semi-definite. In this paper, we focus on DNE, as they make up almost all local Nash equilibria in the mathematical sense, and are well-suited for the analysis of second-order algorithms.
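To make the DNE conditions concrete, here is a minimal numerical check on a toy quadratic game in the scalar case $n = m = 1$ (the game, function names, and tolerances below are illustrative choices of ours, not from this paper):

```python
# Toy general-sum game (an illustrative choice): player 1 minimizes f
# over x, player 2 maximizes g over y. The point (0, 0) is a DNE.
def f(x, y):  # player 1's cost
    return x**2 + x * y

def g(x, y):  # player 2's payoff
    return -y**2 + x * y

def is_differential_nash(x, y, eps=1e-5, tol=1e-6):
    """Check the two DNE conditions by central finite differences."""
    # First-order condition: each player is stationary in its own variable.
    gx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)  # approximates grad_x f
    gy = (g(x, y + eps) - g(x, y - eps)) / (2 * eps)  # approximates grad_y g
    # Second-order condition (scalar case): d2f/dx2 > 0 and d2g/dy2 < 0.
    hxx = (f(x + eps, y) - 2 * f(x, y) + f(x - eps, y)) / eps**2
    hyy = (g(x, y + eps) - 2 * g(x, y) + g(x, y - eps)) / eps**2
    return abs(gx) < tol and abs(gy) < tol and hxx > 0 and hyy < 0

print(is_differential_nash(0.0, 0.0))  # True: (0, 0) is a DNE
print(is_differential_nash(1.0, 1.0))  # False: not even stationary
```

For vector-valued $x$ and $y$, the same check would instead test positive definiteness of $\nabla^2_{xx} f$ and negative definiteness of $\nabla^2_{yy} g$, e.g. via their eigenvalues.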

2.2. ISSUES WITH GRADIENT-BASED ALGORITHMS

A natural strategy for agents to search for local Nash equilibria in a differentiable game is to use gradient-based algorithms. The simplest gradient-based algorithm is gradient descent-ascent (GDA) (Ryu and Boyd, 2016; Zhang et al., 2021b) (Algorithm 1) or its variants (Zhang et al., 2021a; Korpelevich, 1976; Mokhtari et al., 2020).

Algorithm 1 Gradient descent-ascent (GDA)
Require: Number of iterations $T$, learning rate $\eta$
1: for $t = 1, \ldots, T$ do
2:   $x_{t+1} = x_t - \eta \nabla_x f(x_t, y_t)$
3:   $y_{t+1} = y_t + \eta \nabla_y g(x_t, y_t)$
4: end for

Let $z = \begin{bmatrix} x \\ y \end{bmatrix}$ and $\eta > 0$ be the learning rate; a gradient-based update algorithm can be written as
$$z_{t+1} = z_t - \eta\, \omega(z_t). \quad (1)$$
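As an illustration, the simultaneous GDA update of Algorithm 1 can be sketched in a few lines of Python. The quadratic game below is an illustrative choice of ours (not from the paper), and `grad_f`, `grad_g` are hypothetical stand-ins for $\nabla_x f$ and $\nabla_y g$:

```python
# Minimal sketch of Algorithm 1 (simultaneous GDA) on a toy quadratic game.
def gda(grad_f, grad_g, x0, y0, lr=0.1, steps=1000):
    x, y = x0, y0
    for _ in range(steps):
        # Both players update simultaneously from the same iterate (x_t, y_t).
        x, y = x - lr * grad_f(x, y), y + lr * grad_g(x, y)
    return x, y

# f(x, y) = x^2 + x*y (player 1 minimizes), g(x, y) = -y^2 + x*y (player 2 maximizes):
grad_f = lambda x, y: 2 * x + y   # grad_x f
grad_g = lambda x, y: -2 * y + x  # grad_y g

x, y = gda(grad_f, grad_g, x0=1.0, y0=-1.0)
# Both coordinates shrink toward the DNE at (0, 0) for this game and step size.
```

On this particular game GDA happens to converge; the point of Sections 2-3 is that for other games its attracting fixed points need not be local Nash equilibria.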



¹Note that a local Nash equilibrium is not guaranteed to exist in nonconvex-nonconcave games (Jin et al., 2020, Proposition 6), although the (non-)existence of local NE is outside the scope of this paper.

