TOWARDS CONVERGENCE TO NASH EQUILIBRIA IN TWO-TEAM ZERO-SUM GAMES

Abstract

Contemporary applications of machine learning in two-team e-sports and the superior expressivity of multi-agent generative adversarial networks raise important and overlooked theoretical questions regarding optimization in two-team games. Formally, two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents, each experiencing a utility identical to that of their teammates and opposite to that of the opposing team. We focus on the solution concept of Nash equilibria (NE). We first show that computing NE for this class of games is hard for the complexity class CLS. To further examine the capabilities of online learning algorithms in games with full-information feedback, we propose a benchmark of a simple yet nontrivial family of such games. These games do not enjoy the properties used to prove convergence for relevant algorithms. In particular, we use a dynamical systems perspective to demonstrate that gradient descent-ascent, its optimistic variant, optimistic multiplicative weights update, and extra gradient fail to converge (even locally) to a Nash equilibrium. On a brighter note, we propose a first-order method that leverages control theory techniques and, under some conditions, enjoys last-iterate local convergence to a Nash equilibrium. We also believe our proposed method is of independent interest for general min-max optimization.

1. INTRODUCTION

Online learning shares an enduring relationship with game theory, with a very early onset dating back to the analysis of fictitious play by Robinson (1951) and Blackwell's approachability theorem (Blackwell, 1956). A key question within this context is whether self-interested agents can arrive at a game-theoretic equilibrium in an independent and decentralized manner with only limited feedback from their environment. Learning dynamics that converge to different notions of equilibria are known to exist for two-player zero-sum games (Robinson, 1951; Arora et al., 2012; Daskalakis et al., 2011), potential games (Monderer & Shapley, 1996), near-potential games (Anagnostides et al., 2022b), socially concave games (Golowich et al., 2020), and extensive-form games (Anagnostides et al., 2022a). We try to push the boundary further and explore whether equilibria, in particular Nash equilibria, can be reached by agents that follow decentralized learning algorithms in two-team zero-sum games.

Team competition has played a central role in the development of game theory (Marschak, 1955; von Stengel & Koller, 1997; Bacharach, 1999; Gold, 2005), economics (Marschak, 1955; Gottinger, 1974), and evolutionary biology (Nagylaki, 1993; Nowak et al., 2004). Recently, competition among teams has attracted the interest of the machine learning community due to the advances that multi-agent systems have accomplished: e.g., multi-GANs (Hoang et al., 2017; Hardy et al., 2019) for generative tasks, adversarial regression with multiple learners (Tong et al., 2018), and AI agents competing in e-sports (e.g., CTF (Jaderberg et al., 2019) or StarCraft (Vinyals et al., 2019)) as well as card games (Moravčík et al., 2017; Brown & Sandholm, 2018; Bowling et al., 2015).

Our class of games. We turn our attention to two-team zero-sum games, a quite general class of min-max optimization problems that includes bilinear games as well as a wide range of nonconvex-nonconcave games.
In this class of games, players fall into two teams of sizes n and m, respectively, and submit their own randomized strategy vectors independently. We note that the games that we focus on are
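As a minimal illustration of the non-convergence phenomenon described in the abstract (this is a toy sketch, not the paper's benchmark family), consider simultaneous gradient descent-ascent (GDA) on the bilinear game f(x, y) = xy, i.e., a two-team game with one player per team, whose unique Nash equilibrium is (0, 0). Each simultaneous GDA step multiplies the distance to the equilibrium by sqrt(1 + lr^2), so the iterates spiral outward rather than converge. The step size and horizon below are arbitrary choices for illustration:

```python
import math

def gda(x, y, lr=0.1, steps=200):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.

    The min-player descends in x (grad_x f = y); the max-player
    ascends in y (grad_y f = x). Returns the trajectory of iterates.
    """
    traj = [(x, y)]
    for _ in range(steps):
        gx, gy = y, x
        x, y = x - lr * gx, y + lr * gy  # simultaneous update
        traj.append((x, y))
    return traj

traj = gda(1.0, 1.0)
dist0 = math.hypot(*traj[0])    # initial distance to the equilibrium (0, 0)
distT = math.hypot(*traj[-1])   # final distance: strictly larger, since each
                                # step scales the distance by sqrt(1 + lr**2)
print(dist0, distT)
```

Running this shows the final iterate strictly farther from (0, 0) than the initial one, matching the divergent spiral that the dynamical systems analysis predicts for GDA on such games.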

