INDIVIDUALITY IN THE HIVE - LEARNING TO EMBED LIFETIME SOCIAL BEHAVIOR OF HONEY BEES

Abstract

Honey bees are a popular model for complex social systems, in which global behavior emerges from the actions and interactions of thousands of individuals. While the average life of a bee is organized as a sequence of tasks roughly determined by age, there is substantial variation at the individual level. For example, young bees can become foragers early in life, depending on the colony's needs. Using a unique dataset containing lifetime trajectories of all individuals over multiple generations in two honey bee colonies, we propose a new temporal matrix factorization model that jointly learns the average developmental path and structured variations of individuals in the social network over their entire lives. Our method yields inherently interpretable embeddings that are biologically plausible and consistent over time, which allows comparing individuals regardless of when or in which colony they lived. Our method provides a quantitative framework for understanding behavioral heterogeneity in complex social systems applicable in fields such as behavioral biology, social sciences, neuroscience, and information science.

1. INTRODUCTION

Animals living in large groups often coordinate their behaviors, resulting in emergent properties at the group level, from flocking birds to democratic elections. In most animal groups, the role an individual plays in this process is thought to be reflected in the way it interacts with group members. Technological advances have made it possible to track all individuals and their interactions in animal societies, ranging from social insects to primate groups (Mersch et al., 2013; Gernat et al., 2018; Mathis et al., 2018; Graving et al., 2019; Pereira et al., 2019). These datasets have unprecedented scale and complexity, but understanding them has emerged as a new and challenging problem in itself (Pinter-Wollman et al., 2014; Krause et al., 2015; Brask et al., 2020). A popular approach to understanding high-dimensional data is to learn semantic embeddings (Frome et al., 2013; Asgari & Mofrad, 2015; Camacho-Collados & Pilehvar, 2018; Nelson et al., 2019). Such embeddings can be learned without supervision, are interpretable, and are useful for accomplishing downstream tasks. Individuals in animal societies can be described with semantic embeddings extracted from social interaction networks using matrix factorization methods. For example, in symmetric non-negative matrix factorization (SymNMF), the dot product of any two animals' factor vectors reconstructs the corresponding entry of the interaction matrix (Wang et al., 2011; Shi et al., 2015; see Figure 1a and b). If the embeddings allow us to predict relevant behavioral properties, they serve our understanding as semantic representations. However, in temporal settings where the interaction matrices change over time, there is no straightforward extension of this algorithm. The interaction matrices at different time points can be factorized individually, but there is no guarantee that the embeddings stay semantically consistent over time, i.e., the same factor may capture different behavior at different time points, and the prediction of relevant behavioral properties will deteriorate.
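To make the SymNMF idea concrete, the following is a minimal sketch in Python with NumPy. It assumes a damped multiplicative update rule, which is one common way to fit SymNMF and not necessarily the variant used in the cited works. The function names (symnmf, reconstruction_error), the parameter values, and the toy interaction matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def symnmf(A, rank, n_iter=500, beta=0.5, eps=1e-9, seed=0):
    """Approximate a symmetric non-negative matrix A by F @ F.T with F >= 0.

    Assumption: uses a damped multiplicative update (beta = 0.5), one common
    choice for SymNMF; other solvers exist.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Non-negative random initialization, scaled to the magnitude of A.
    F = rng.random((n, rank)) * np.sqrt(A.mean() / rank)
    for _ in range(n_iter):
        numer = A @ F
        denom = F @ (F.T @ F) + eps  # eps avoids division by zero
        F *= (1.0 - beta) + beta * numer / denom
    return F


def reconstruction_error(A, F):
    """Relative Frobenius error of the reconstruction F @ F.T."""
    return np.linalg.norm(A - F @ F.T) / np.linalg.norm(A)


# Toy interaction matrix (illustrative only): six individuals in two
# groups that interact within, but not across, groups.
F_true = np.zeros((6, 2))
F_true[:3, 0] = 1.0
F_true[3:, 1] = 1.0
A = F_true @ F_true.T

F = symnmf(A, rank=2)
err = reconstruction_error(A, F)

# Swapping the factor columns leaves the reconstruction unchanged, so the
# meaning of "factor 0" is arbitrary for any single factorization.
F_swapped = F[:, ::-1]
same_reconstruction = np.allclose(F @ F.T, F_swapped @ F_swapped.T)
```

The last two lines illustrate the consistency problem described above: because any reordering of the factors reconstructs the matrix equally well, independently factorizing the matrices of different time points gives no guarantee that a given factor index refers to the same behavior at each time point.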
In living systems, interaction dynamics are highly variable; individuals differ in when they appear in the data and how long they live. Different non-overlapping groups of individuals, e.g. from different years, may not interact with each other at all. How can we find a common semantic embedding even in these extreme cases? How do we learn embeddings that generalize to different groups and still provide insights into each individual's functional role? If animals take on roles partially determined by a common factor, such as age, how can we learn this dependency? Several approaches to extend NMF to temporal settings have been proposed in a variety of problem settings. Yu et al. (2016) and Mackevicius et al. (2019) propose a factorization method for time

