SOLVING MIN-MAX OPTIMIZATION WITH HIDDEN STRUCTURE VIA GRADIENT DESCENT ASCENT

Abstract

Many recent AI architectures are inspired by zero-sum games; however, the behavior of their dynamics is still not well understood. Motivated by this, we study standard gradient descent ascent (GDA) dynamics in a specific class of non-convex non-concave zero-sum games that we call hidden zero-sum games. In this class, players control the inputs of smooth but possibly non-linear functions whose outputs are fed as inputs to a convex-concave game. Unlike general zero-sum games, these games have a well-defined notion of solution: outcomes that implement the von Neumann equilibrium of the "hidden" convex-concave game. We prove that if the hidden game is strictly convex-concave, then vanilla GDA converges not merely to a local Nash equilibrium but, typically, to the von Neumann solution. If the game lacks strict convexity properties, GDA may fail to converge to any equilibrium; however, by applying standard regularization techniques we can prove convergence to a von Neumann solution of a slightly perturbed zero-sum game. Our convergence guarantees are non-local, which, as far as we know, is a first-of-its-kind result for non-convex non-concave games. Finally, we discuss connections between our framework and generative adversarial networks.

1. INTRODUCTION

Traditionally, our understanding of convex-concave games revolves around von Neumann's celebrated minimax theorem, which implies the existence of saddle-point solutions with a uniquely defined value. Although many learning algorithms are known to compute such saddle points (Cesa-Bianchi & Lugosi, 2006), recently there has been a fervor of activity in proving stronger results, such as faster regret-minimization rates or analyses of the day-to-day behavior (Mertikopoulos et al., 2018; Daskalakis et al., 2018; Bailey & Piliouras, 2018; Abernethy et al., 2018; Wang & Abernethy, 2018; Daskalakis & Panageas, 2019; Abernethy et al., 2019; Mertikopoulos et al., 2019; Bailey & Piliouras, 2019; Gidel et al., 2019; Zhang & Yu, 2019; Hsieh et al., 2019; Bailey et al., 2020; Mokhtari et al., 2020; Hsieh et al., 2020; Pérolat et al., 2020). This interest has been largely triggered by the impressive successes of AI architectures inspired by min-max games, such as Generative Adversarial Networks (GANs) (Goodfellow et al., 2014a), adversarial training (Madry et al., 2018), and reinforcement learning self-play in games (Silver et al., 2017). Critically, however, all these applications are based upon non-convex non-concave games, our understanding of which is still nascent. Nevertheless, some important early work in the area has focused on identifying new solution concepts that are widely applicable in general min-max games, such as (local/differential) Nash equilibrium (Adolphs et al., 2019; Mazumdar & Ratliff, 2019), local min-max (Daskalakis & Panageas, 2018), local minimax (Jin et al., 2019), (local/differential) Stackelberg equilibrium (Fiez et al., 2020), and local robust points (Zhang et al., 2020). This plethora of solution concepts is perhaps suggestive that "solving" general min-max games unequivocally may be too ambitious a task.
Attraction to spurious fixed points (Daskalakis & Panageas, 2018), cycles (Vlatakis-Gkaragkounis et al., 2019), robustly chaotic behavior (Cheung & Piliouras, 2019; Cheung & Piliouras, 2020), and computational hardness issues (Daskalakis et al., 2020) all suggest that general min-max games might inherently involve messy, unpredictable, and complex behavior. Are there rich classes of non-convex non-concave games with an effectively unique game-theoretic solution that is selected by standard optimization dynamics (e.g., gradient descent)?

Our class of games. We define a general class of min-max optimization problems, where each agent selects its own vector of parameters, which is then processed separately by smooth functions. Each agent receives their respective payoff after entering the outputs of the processed decision vectors as inputs to a standard convex-concave game. Formally, there exist functions $F: \mathbb{R}^N \to X \subset \mathbb{R}^n$ and $G: \mathbb{R}^M \to Y \subset \mathbb{R}^m$ and a continuous convex-concave function $L: X \times Y \to \mathbb{R}$, such that the min-max game is
$$\min_{\theta \in \mathbb{R}^N} \max_{\phi \in \mathbb{R}^M} L(F(\theta), G(\phi)). \qquad \text{(Hidden Convex-Concave (HCC))}$$
We call this class of min-max problems Hidden Convex-Concave (HCC) games. It generalizes the recently defined hidden bilinear games of Vlatakis-Gkaragkounis et al. (2019).

Our solution concept. Among all the local Nash equilibria of HCC games, there is a special subclass: the vectors $(\theta^*, \phi^*)$ that implement the von Neumann solution of the convex-concave game. This solution has a strong and intuitive game-theoretic justification; indeed, it is stable even if the agents could perform arbitrary deviations directly on the output spaces $X, Y$. These parameter combinations $(\theta^*, \phi^*)$ "solve" the "hidden" convex-concave game $L$, and thus we call them von Neumann solutions. Naturally, HCCs will typically have numerous local saddles/Nash equilibria/fixed points that do not satisfy this property.
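As a concrete toy instantiation of the setup above (the game $L$, the maps $F = G = \tanh$, and all step-size constants are illustrative assumptions, not an example from the paper), consider the strictly convex-concave hidden game $L(x, y) = x^2 - y^2 + xy$, whose von Neumann solution $(0, 0)$ is implemented by $\theta^* = \phi^* = 0$. A minimal forward-Euler sketch of the continuous GDA dynamics:

```python
import numpy as np

# Hypothetical hidden strictly convex-concave game (illustrative only):
#   L(x, y) = x^2 - y^2 + x*y,  von Neumann solution (x*, y*) = (0, 0),
# with smooth non-linear output maps F = G = tanh.
F = np.tanh

def dF(t):
    # derivative of tanh
    return 1.0 / np.cosh(t) ** 2

def gda(theta, phi, dt=0.01, steps=20_000):
    """Forward-Euler discretization of continuous gradient descent ascent."""
    for _ in range(steps):
        x, y = F(theta), F(phi)
        dL_dx = 2 * x + y        # partial derivative of L in the output space
        dL_dy = -2 * y + x
        theta = theta - dt * dF(theta) * dL_dx   # descent on theta (chain rule)
        phi = phi + dt * dF(phi) * dL_dy         # ascent on phi
    return theta, phi

theta, phi = gda(theta=1.5, phi=-1.0)
print(F(theta), F(phi))  # the *outputs* approach the von Neumann solution (0, 0)
```

Note that the dynamics act on the parameters $\theta, \phi$, with the chain-rule factors $F'(\theta), G'(\phi)$ distorting the hidden convex-concave landscape.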
Instead, they correspond to stationary points of $F, G$ where the output is stuck, e.g., due to an unfortunate initialization. At these points the agents may be receiving payoffs that can be arbitrarily smaller or larger than the game-theoretic value of $L$. Fortunately, we show that Gradient Descent Ascent (GDA) strongly favors von Neumann solutions over generic fixed points.

Our results. In this work, we study the behavior of continuous GDA dynamics for the class of HCC games in which each coordinate of $F, G$ is controlled by disjoint sets of variables. In a nutshell, we show that GDA trajectories stabilize around, or converge to, the corresponding von Neumann solutions of the hidden game. Despite restricting our attention to a subset of HCC games, our analysis has to overcome unique hurdles not shared by standard convex-concave games.

Challenges of HCC games. In convex-concave games, the stability of the von Neumann solutions follows from the Euclidean distance to the equilibrium being a Lyapunov function. In contrast, in HCC games, where optimization happens in the parameter space of $\theta, \phi$, the non-linear nature of $F, G$ distorts the convex-concave landscape in the output space. Thus, the Euclidean distance will not in general be a Lyapunov function. Moreover, the existence of a Lyapunov function for the trajectories in the output space of $F, G$ does not translate to a well-defined function in the parameter space (unless $F, G$ are trivial, invertible maps). Worse yet, even if $L$ has a unique solution in the output space, this solution could be implemented by multiple equilibria in the parameter space, and thus none of them can be individually globally attracting. Clearly, any transfer of stability or convergence properties from the output space to the parameter space needs to be initialization-dependent.

Lyapunov stability. Our first step is to construct an initialization-dependent Lyapunov function that accounts for the curvature induced by the operators $F$ and $G$ (Lemma 2). Leveraging a potentially infinite number of initialization-dependent Lyapunov functions, in Theorem 4 we prove that under mild assumptions the outputs of $F, G$ stabilize around the von Neumann solution of $L$.

Convergence. Mirroring convex-concave games, we require strict convexity or concavity of $L$ to provide convergence guarantees to von Neumann solutions (Theorem 5). Barring initializations from which von Neumann solutions are unreachable due to the limitations imposed by $F$ and $G$, the set of von Neumann solutions is globally asymptotically stable (Corollary 1). Even in non-strict HCC games, we can add regularization terms to make $L$ strictly convex-concave. Small amounts of regularization allow for convergence without significantly perturbing the von Neumann solution (Theorem 6), while increasing the regularization enables exponentially faster convergence rates (Theorem 7).

Organization. In Section 2 we provide preliminary notation, the definition of our model, and some useful technical lemmas. Section 3 presents our main results. Section 4 discusses applications of our framework to specific GAN formulations. Section 5 concludes with a discussion of future directions and challenges. We defer the full proofs of our results, as well as further discussion of applications, to the Appendix.
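The contrast between non-strict and regularized hidden games can be illustrated with a hypothetical sketch (the bilinear game $L(x, y) = xy$, the maps $F = G = \tanh$, and the regularization weight below are illustrative assumptions, not the paper's constructions). Without regularization GDA cycles around the von Neumann solution; adding a term that makes $L$ strictly convex-concave yields convergence:

```python
import numpy as np

F = np.tanh

def dF(t):
    # derivative of tanh
    return 1.0 / np.cosh(t) ** 2

def run(lam, dt=0.01, steps=50_000):
    """Euler-discretized GDA on the hidden game
    L(x, y) = x*y + (lam/2) * (x^2 - y^2), with outputs x = F(theta), y = F(phi).
    lam = 0 is the plain (non-strict) bilinear game; lam > 0 regularizes it."""
    theta, phi = 1.0, 1.0
    for _ in range(steps):
        x, y = F(theta), F(phi)
        theta = theta - dt * dF(theta) * (y + lam * x)   # descent step
        phi = phi + dt * dF(phi) * (x - lam * y)         # ascent step
    return F(theta), F(phi)

x0, y0 = run(lam=0.0)   # non-strict: GDA cycles, outputs never settle at (0, 0)
x1, y1 = run(lam=0.5)   # strictly convex-concave: outputs converge to (0, 0)
print(np.hypot(x0, y0), np.hypot(x1, y1))
```

Here the regularizer happens not to move the von Neumann solution at all; in general a small weight only perturbs it slightly, matching the trade-off described above.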

