EIGEN MEMORY TREES

Abstract

This work introduces the Eigen Memory Tree (EMT), a novel online memory model for sequential learning scenarios. EMTs store data at the leaves of a binary tree and route new samples through the structure using the principal components of previous experiences, facilitating efficient (logarithmic) access to relevant memories. We demonstrate that EMT outperforms existing online memory approaches, and provide a hybridized EMT-parametric algorithm that enjoys drastically improved performance over purely parametric methods with nearly no downsides. Our findings are validated using 206 datasets from the OpenML repository in both bounded and infinite memory budget situations.

1. INTRODUCTION

A sequential learning framework (also known as online or incremental learning (Hoi et al., 2021; Losing et al., 2018)) considers a setting in which data instances x_t ∈ R^d arrive incrementally. After each instance, the agent is required to make a decision from a set of |A| possibilities, a_t ∈ A. The agent then receives scalar feedback y_t regarding the quality of the action, and the goal is for the agent to learn a mapping from x_t to a_t that maximizes the sum of all observed y_t. This general paradigm accommodates a wide array of well-studied machine learning scenarios. For example, in online supervised learning, A is a set of labels: the agent is required to predict a label for each x_t, and the feedback y_t indicates the quality of the prediction. In a contextual bandit or reinforcement learning setting, x_t acts as a context or state, a_t is an action, and y_t corresponds to a reward provided by the environment. Contextual bandits have proven useful in a wide variety of settings; their properties are extremely well studied (Langford & Zhang, 2007), and they have numerous theoretical and real-world applications (Bouneffouf et al., 2020).

Regardless of the particulars of the learning scenario, a primary consideration is sample complexity: how can we obtain the highest-performing model given a fixed interaction budget? This concern is especially acute when agents only receive feedback corresponding to the chosen action a_t, i.e., partial feedback. Here, after an interaction with the environment, the agent does not get access to what the best action in hindsight would have been. As a consequence, learners in a partial-feedback setting need to explore different actions, even for a fixed x_t, in order to discover optimal behavior.
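The interaction protocol above can be sketched concretely. The following is a minimal illustration, not code from the paper: ToyEnv, EpsilonGreedy, and run are hypothetical names, and the tabular epsilon-greedy learner merely stands in for whatever policy the agent uses. It shows the essential loop (observe x_t, choose a_t, receive y_t for the chosen action only) and why exploration is required under partial feedback.

```python
import random

class ToyEnv:
    """Hypothetical toy contextual bandit: context x in {0,1}; reward 1 iff a == x."""
    def __init__(self, rng):
        self.rng = rng
    def context(self):
        return self.rng.randint(0, 1)
    def reward(self, x, a):
        return 1.0 if a == x else 0.0

class EpsilonGreedy:
    """Tabular epsilon-greedy learner; it must explore because feedback is partial."""
    def __init__(self, n_contexts, n_actions, eps, rng):
        self.q = [[0.0] * n_actions for _ in range(n_contexts)]  # reward estimates
        self.n = [[0] * n_actions for _ in range(n_contexts)]    # update counts
        self.eps, self.rng = eps, rng
    def act(self, x):
        if self.rng.random() < self.eps:                 # explore a random action
            return self.rng.randrange(len(self.q[x]))
        return max(range(len(self.q[x])), key=lambda a: self.q[x][a])  # exploit
    def learn(self, x, a, y):
        # Running-mean update using only the feedback for the chosen action.
        self.n[x][a] += 1
        self.q[x][a] += (y - self.q[x][a]) / self.n[x][a]

def run(env, policy, T):
    """The sequential learning loop: observe x_t, act, receive y_t, update."""
    total = 0.0
    for _ in range(T):
        x_t = env.context()
        a_t = policy.act(x_t)
        y_t = env.reward(x_t, a_t)   # feedback only for the chosen a_t
        policy.learn(x_t, a_t, y_t)
        total += y_t
    return total
```

With eps = 0, the learner can lock onto a suboptimal action for some contexts and never discover its mistake; a small positive eps is the simplest remedy.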
Recent work in reinforcement learning has demonstrated that episodic memory mechanisms can facilitate more efficient learning (Lengyel & Dayan, 2007; Blundell et al., 2016; Pritzel et al., 2017; Hansen et al., 2018; Lin et al., 2018; Zhu et al., 2020). Episodic memory (Tulving, 1972) refers to memory of specific past experiences (e.g., what did I have for breakfast yesterday?). This is in contrast to semantic memory, which generalizes across many experiences (e.g., what is my favorite meal for breakfast?). Semantic memory is functionally closer to parametric approaches to learning, which also rely on generalizations while discarding information about specific events or items. This paper investigates the use of episodic memory for accelerating learning in sequential problems. We introduce the Eigen Memory Tree (EMT), a model that stores past observations in the leaves of a binary tree. Each leaf contains experiences that are similar to one another, and the EMT is structured such that new samples are routed through the tree based on the statistical properties of previously encountered data. When the EMT is queried with a new observation, this structure affords an efficient way to compare it against only the most relevant memories. A learned "scoring" function w is then used to identify the most salient memory within the leaf for decision making.
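The structure described above can be sketched as follows. This is an illustrative pure-Python sketch, not the paper's implementation: the leaf capacity, the power-iteration routine for the top principal component, the median-projection split, and the use of Euclidean distance in place of the learned scoring function w are all simplifying assumptions made here; the class and function names are hypothetical.

```python
import math
import random

class EMTNode:
    def __init__(self, capacity):
        self.memories = []      # (x, y) pairs stored at a leaf
        self.capacity = capacity
        self.direction = None   # top principal component (internal nodes only)
        self.threshold = None   # median projection, used as the routing boundary
        self.left = self.right = None

def _top_eigenvector(xs, iters=50):
    """Power iteration approximating the top principal component of xs."""
    d = len(xs[0])
    mean = [sum(x[i] for x in xs) / len(xs) for i in range(d)]
    centered = [[x[i] - mean[i] for i in range(d)] for x in xs]
    v = [1.0] * d
    for _ in range(iters):
        # Apply the sample covariance to v without materializing the matrix.
        w = [0.0] * d
        for c in centered:
            proj = sum(c[i] * v[i] for i in range(d))
            for i in range(d):
                w[i] += proj * c[i]
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        v = [wi / norm for wi in w]
    return v

class EigenMemoryTree:
    def __init__(self, capacity=32):
        self.root = EMTNode(capacity)

    def _route(self, node, x):
        proj = sum(a * b for a, b in zip(node.direction, x))
        return node.left if proj <= node.threshold else node.right

    def insert(self, x, y):
        node = self.root
        while node.direction is not None:   # logarithmic descent to a leaf
            node = self._route(node, x)
        node.memories.append((x, y))
        if len(node.memories) > node.capacity:
            self._split(node)

    def _split(self, node):
        xs = [m[0] for m in node.memories]
        node.direction = _top_eigenvector(xs)
        projs = sorted(sum(a * b for a, b in zip(node.direction, x)) for x in xs)
        node.threshold = projs[len(projs) // 2]   # median keeps the split balanced
        node.left, node.right = EMTNode(node.capacity), EMTNode(node.capacity)
        for x, y in node.memories:
            self._route(node, x).memories.append((x, y))
        node.memories = []

    def query(self, x):
        """Return the memory in x's leaf closest to x (Euclidean distance here
        stands in for the learned scoring function w)."""
        node = self.root
        while node.direction is not None:
            node = self._route(node, x)
        if not node.memories:
            return None
        return min(node.memories,
                   key=lambda m: sum((a - b) ** 2 for a, b in zip(m[0], x)))
```

Because insert and query descend the tree with the same deterministic routing rule, a query is only ever compared against the leaf's handful of memories rather than the full history, which is what gives the logarithmic access mentioned in the abstract.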

