LEARNING TO COMMUNICATE THROUGH IMAGINATION WITH MODEL-BASED DEEP MULTI-AGENT REINFORCEMENT LEARNING

Abstract

The human imagination is an integral component of our intelligence. Furthermore, the core utility of our imagination is deeply coupled with communication. Language, argued to have developed through complex interaction within growing collective societies, serves as an instruction to the imagination, giving us the ability to share abstract mental representations and perform joint spatiotemporal planning. In this paper, we explore communication through imagination with multi-agent reinforcement learning. Specifically, we develop a model-based approach where agents jointly plan through recurrent communication of their respective predictions of the future. Each agent has access to a learned world model capable of producing model rollouts of future states and predicted rewards, conditioned on the actions sampled from the agent's policy. These rollouts are then encoded into messages and used to learn a communication protocol during training via differentiable message passing. We highlight the benefits of our model-based approach, compared to a set of strong baselines, by developing a set of specialised experiments using both novel and well-known multi-agent environments.

1. INTRODUCTION

"We use imagination in our ordinary perception of the world. This perception cannot be separated from interpretation." (Warnock, 1976). The human brain, and the mind that emerges from its working, is currently our best example of a general purpose intelligent learning system. Our ability to imagine is an integral part of it (Abraham, 2020). The imagination is furthermore intimately connected to other parts of our cognition, such as our use of language (Shulman, 2012). In fact, Dor (2015) argues that: "The functional specificity of language lies in the very particular functional strategy it employs. It is dedicated to the systematic instruction of imagination: we use it to communicate directly with our interlocutors' imaginations." However, the origin of language resides not only in individual cognition, but in society (Von Humboldt, 1999), grounded in part through interpersonal experience (Bisk et al., 2020). The complexity of the world necessitates our use of individual mental models (Forrester, 1971) to store abstract representations of the information we perceive through the direct experiences of our senses (Chang and Tsao, 2017). As society expanded, the sharing of direct experiences within groups reached its limit. Growing societies could only continue to function through the invention of language, a unique and effective communication protocol where a sender's coded message of abstract mental representations, delivered through speech, could serve as a direct instruction to the receiver's imagination (Dor, 2015). Therefore, the combination of language and imagination gave us the ability to solve complex tasks by performing abstract reasoning (Perkins, 1985) and joint spatiotemporal planning (Reuland, 2010). In this work, we explore a plausible learning system architecture for the development of an artificial multi-agent communication protocol of the imagination.
Based on the above discussion, the minimum set of required features of such a system includes: (1) that it be constructed from multiple individual agents, where (2) each agent possesses an abstract model of the world that can serve as an imagination, (3) has access to a communication medium, or channel, and (4) jointly learns and interacts in a

