EMERGENT COLLECTIVE INTELLIGENCE FROM MASSIVE-AGENT COOPERATION AND COMPETITION

Anonymous authors
Paper under double-blind review

Abstract

Inspired by organisms on Earth evolving through cooperation and competition between populations, we study the emergence of artificial collective intelligence through massive-agent reinforcement learning. To this end, we propose a new massive-agent reinforcement learning environment, Lux, in which two teams with massive and dynamically varying numbers of agents scramble for limited resources and fight off the darkness. In Lux, we build our agents with a standard reinforcement learning algorithm trained in curriculum learning phases, and leverage centralized control via a pixel-to-pixel policy network. As agents co-evolve through self-play, we observe several stages of intelligence, from the acquisition of atomic skills to the development of group strategies. Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition. We further analyze the emergence of various learned strategies through metrics and ablation studies, aiming to provide insights for reinforcement learning implementations in massive-agent environments.

1. INTRODUCTION

Complex group and social behaviors are widespread among humans and animals on Earth. In a vast ecosystem, the simultaneous cooperation and competition between populations, together with a changing environment, serve as a natural driving force for the co-evolution of massive numbers of organisms (Wolpert & Tumer, 1999; Dawkins & Krebs, 1979). This large-scale co-evolution between populations has enabled group strategies for tasks that individuals cannot accomplish alone (Ha & Tang, 2022). Inspired by this self-organizing mechanism in nature, i.e., collective intelligence emerging from massive-agent cooperation and competition, we propose to simulate the emergence of collective intelligence by training reinforcement learning agents in a massive-agent environment. We hope this can become a stepping stone for massive-agent reinforcement learning research and an inspiring approach to complex massive-agent problems.

Recent progress in multi-agent reinforcement learning (MARL) demonstrates its potential to complete complex tasks through multi-agent cooperation, such as playing StarCraft II (Vinyals et al., 2019) and Dota 2 (Berner et al., 2019). However, the number of agents in those scenarios is still limited to dozens, far from the scale of natural populations. To support large-scale multi-agent cooperation and competition, we reintroduce the massive-agent setting into multi-agent reinforcement learning. To this end, we propose Lux, a cooperative and competitive environment in which hundreds of agents in two populations scramble for limited resources and fight off the darkness. We believe Lux is a suitable testbed for experimenting with collective intelligence because it provides an open environment for hundreds of agents to cooperate, compete, and evolve. From the algorithmic perspective, the massive-agent setting poses substantial difficulty for reinforcement learning algorithms, since the credit assignment problem becomes increasingly challenging as the number of agents grows.
Some research (Lowe et al., 2017) addresses the credit assignment problem among multiple agents; however, these methods lack scalability to massive-agent scenarios. To overcome this, we present a centralized control solution for Lux that couples a pixel-to-pixel modeling architecture (Han et al., 2019) with the Proximal Policy Optimization (PPO) algorithm (Schulman et al., 2017). With this solution, we sidestep the credit assignment problem, achieving up to a 90% win rate versus the state-of-the-art policy.
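To make the centralized pixel-to-pixel idea concrete, the following is a minimal numpy sketch, not the paper's actual network: a single forward pass maps a global map observation to per-cell action logits, and each agent simply reads off the action at its own cell. All shapes, the 1x1 linear head, and the agent positions are hypothetical; the real architecture is a deeper convolutional encoder-decoder trained with PPO.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: observation channels, map height/width, action count.
C, H, W, A = 4, 8, 8, 5
obs = rng.normal(size=(C, H, W))   # one global observation for the whole team
kernel = rng.normal(size=(A, C))   # toy 1x1 "conv": a per-cell linear head

# One forward pass yields action logits for EVERY map cell at once.
logits = np.einsum('ac,chw->ahw', kernel, obs)          # shape (A, H, W)
probs = np.exp(logits - logits.max(axis=0, keepdims=True))
probs /= probs.sum(axis=0, keepdims=True)               # per-cell softmax

# Each agent takes the action at its own cell; a single shared network
# controls all agents, so no per-agent credit assignment is needed.
agent_cells = [(1, 2), (5, 5), (7, 0)]                  # hypothetical positions
actions = {cell: int(probs[:, cell[0], cell[1]].argmax())
           for cell in agent_cells}
print(actions)
```

Because the number of output cells is fixed by the map size rather than the agent count, this style of policy handles a dynamically varying population of hundreds of agents without changing the network.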

