PETTINGZOO: GYM FOR MULTI-AGENT REINFORCEMENT LEARNING

Abstract

This paper introduces PettingZoo, a library of diverse sets of multi-agent environments under a single elegant Python API. PettingZoo was developed with the goal of accelerating research in multi-agent reinforcement learning by creating a set of benchmark environments easily accessible to all researchers and a standardized API for the field. This goal is inspired by what OpenAI's Gym library did to accelerate research in single-agent reinforcement learning, and PettingZoo draws heavily from Gym in terms of API and user experience. PettingZoo differs from other multi-agent environment libraries in that its API is based on the model of Agent Environment Cycle ("AEC") games, which allows for the sensible representation of all varieties of games under one API for the first time. While retaining a very simple and Gym-like API, PettingZoo still allows access to low-level environment properties required by non-traditional learning methods.

1. INTRODUCTION

Reinforcement Learning ("RL") considers learning a policy (a function that takes in an observation from an environment and emits an action) that achieves the maximum expected discounted reward when acting in an environment, and its capabilities have been one of the great successes of modern machine learning. Multi-Agent Reinforcement Learning ("MARL") in particular has been behind many of the most publicized achievements of modern machine learning, including AlphaGo Zero (Silver et al., 2017), OpenAI Five (OpenAI, 2018), and AlphaStar (Vinyals et al., 2019), and has seen a boom in recent years. However, popular benchmark environments are scattered across many different locations (or made from scratch), are based around heterogeneous APIs, and are often unmaintained. Because of this, highly influential research in the field is generally restricted to institutions with dedicated engineering teams, new methods are generally not compared on the same environments, and progress has been slow compared to single-agent reinforcement learning (though this obviously cannot be attributed to benchmarks alone). Motivated by this, we introduce PettingZoo, a Python library collecting maintained versions of all popular MARL environments under a single simple Python API similar to that of OpenAI's Gym library. It is available on PyPI and can be installed via pip install pettingzoo.
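The definition of a policy and of discounted reward at the start of this section can be made concrete with a small sketch. It is illustrative only and assumes nothing about PettingZoo itself: the names discounted_return and threshold_policy, the discount factor, and the reward values are all hypothetical choices made for this example.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * rewards[t] for a single episode.

    The sum is accumulated back to front, so each step folds the
    discounted future into the current reward.
    """
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def threshold_policy(observation: float) -> int:
    """Hypothetical toy policy: maps an observation to one of two actions."""
    return 1 if observation > 0.0 else 0


# Illustrative episode: three steps, each with reward 1.0, discount 0.5.
episode_rewards = [1.0, 1.0, 1.0]
episode_return = discounted_return(episode_rewards, gamma=0.5)
```

A learning algorithm would search for the policy whose expected value of this quantity, taken over episodes, is largest; the fixed threshold policy here only shows the observation-to-action signature.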

2. A TALE OF TOO MANY LIBRARIES

OpenAI Gym (Brockman et al., 2016) was introduced shortly after the potential of reinforcement learning became widely known with Mnih et al. (2015). At the time, doing basic research in reinforcement learning was a large engineering challenge. The most popular set of environments was the Atari games provided by the Arcade Learning Environment ("ALE") (Bellemare et al., 2013). The ALE was originally challenging to compile and install, and had an involved C API; a Python wrapper became available only later, through an unofficial fork (Goodrich, 2015). A scattering of other environments existed as independent projects, in various languages, all with unique APIs. This level of heterogeneity meant that reinforcement learning code had to be adapted to every environment (including bridging programming languages). Accordingly, standardized reinforcement learning implementations weren't possible, comparisons against a wide variety of environments were very difficult, and doing simple research in reinforcement learning was generally restricted to organizations with software engineering divisions. Gym was created to promote research in reinforcement learning by making comprehensive

