A RISK-AVERSE EQUILIBRIUM FOR MULTI-AGENT SYSTEMS

Anonymous

Abstract

In multi-agent systems, intelligent agents must make decisions that lead to optimal outcomes when the other agents act as expected, whilst also being prepared for their unexpected behaviour. In this work, we introduce a novel risk-averse solution concept that allows the learner to accommodate low-probability actions by finding the strategy with minimum variance, given any level of expected utility. We first prove the existence of such a risk-averse equilibrium, and propose a fictitious-play-type learning algorithm for smaller games that enjoys provable convergence guarantees in game classes including zero-sum and potential games. Furthermore, we propose an approximation method for larger games based on iterative population-based training that generates a population of risk-averse agents. Empirically, our equilibrium is shown to reduce utility variance, specifically in the sense that other agents' low-probability behaviour is better accounted for by our equilibrium than by other solutions. Importantly, we show that our population of agents approximating a risk-averse equilibrium is particularly effective against unseen opposing populations, especially in guaranteeing a minimum level of performance, which is critical to safety-aware multi-agent systems.

1. INTRODUCTION

Game Theory (GT) has become an important analytical tool for solving Machine Learning (ML) problems; the idea of "gamification" has become popular in recent years (Wellman, 2006; Lanctot et al., 2017), particularly in multi-agent systems research. The importance of risk-aversion in the single-agent decision-making literature (Zhang et al., 2020; Mihatsch & Neuneier, 2002; Chow et al., 2017) is well established, whilst many open questions remain in the game theory research domain. This paper aims to add to the multi-agent strategic decision-making literature by developing the notion of risk-aversion through the lens of a new equilibrium concept. One reason risk-aversion is important is that multi-agent interaction is rife with strategic uncertainty: performance does not depend solely on one's own action. It is rarely the case that one has certainty over both the execution and the strategy of the opponent, in situations ranging from board games to economic negotiations (Calford, 2020). This presents a dilemma for autonomous decision-makers in human-AI interaction, as one can no longer rely on perfect execution or complete knowledge of strategies. An important issue, therefore, is what happens when actors take dangerous, low-probability actions that could be considered mistakes. Such deviations can arise in an array of circumstances, from misunderstandings of reward structures to execution fatigue, leading to the execution of an unexpected pure strategy. Hedging against unexpected play is important for the agents, as it can otherwise lead to large costs. As demonstrated in Fig. 1, a mistake in the execution of the pure-strategy Nash equilibrium (NE) could lead to both cars overtaking and crashing into each other, a negative yet critical outcome in a multi-agent system. Traditional equilibrium solutions in GT (e.g.
NE, Trembling Hand Perfect Equilibrium (THPE) (Bielefeld, 1988)) lack the ability to handle this style of risk, as either: 1) they assume strategies are executed perfectly, and/or 2) large costs may be undervalued if a low probability is attached to them. We address these shortcomings by introducing a new framework for studying risk in multi-agent systems through mean-variance analysis. In our framework, strategies are evaluated not only in terms of expected utility against the opponent, but also in terms of the potential utility variance if the opponent played
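To make the mean-variance evaluation concrete, the following is a minimal sketch in NumPy. The payoff matrix, the 5% tremble probability, and the strategy names are illustrative assumptions (not taken from the paper), loosely modelled on the driving example: action 1 is "overtake", and both players overtaking incurs a large crash cost. The sketch computes the mean and variance of the row player's utility when the opponent's intended strategy is perturbed by low-probability mistakes.

```python
import numpy as np

# Hypothetical payoff matrix for the row player (assumed values):
# action 0 = "stay", action 1 = "overtake"; both overtaking -> large crash cost.
U = np.array([[0.0,   -1.0],
              [1.0, -100.0]])

def tremble(p, eps):
    """Perturb a strategy so every action keeps probability at least eps/n,
    modelling low-probability execution mistakes."""
    n = len(p)
    return (1 - eps) * p + eps * np.ones(n) / n

def mean_variance(x, y, U):
    """Mean and variance of the row player's utility under joint play (x, y)."""
    probs = np.outer(x, y)                 # probability of each pure outcome
    mean = float(np.sum(probs * U))
    var = float(np.sum(probs * U**2)) - mean**2
    return mean, var

# Opponent is expected to stay (action 0) but trembles with probability 5%.
y = tremble(np.array([1.0, 0.0]), 0.05)

# Two candidate strategies for the learner.
overtake = np.array([0.0, 1.0])   # higher payoff if the opponent never errs
stay     = np.array([1.0, 0.0])   # far lower variance against mistakes

print(mean_variance(overtake, y, U))
print(mean_variance(stay, y, U))
```

Against a perfectly executed "stay", overtaking is the best response; once mistakes carry 5% probability, its utility variance explodes because of the rare crash outcome, while the "stay" strategy keeps both its mean loss and variance small. This is the trade-off the proposed equilibrium is designed to navigate.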

