EIGEN MEMORY TREES

Abstract

This work introduces the Eigen Memory Tree (EMT), a novel online memory model for sequential learning scenarios. EMTs store data at the leaves of a binary tree and route new samples through the structure using the principal components of previous experiences, facilitating efficient (logarithmic) access to relevant memories. We demonstrate that EMT outperforms existing online memory approaches, and provide a hybridized EMT-parametric algorithm that enjoys drastically improved performance over purely parametric methods with nearly no downsides. Our findings are validated using 206 datasets from the OpenML repository in both bounded and infinite memory budget situations.

1. INTRODUCTION

A sequential learning framework (also known as online or incremental learning (Hoi et al., 2021; Losing et al., 2018)) considers a setting in which data instances x_t ∈ R^d arrive incrementally. After each instance, the agent must choose a decision from a set of |A| possibilities, a_t ∈ A. The agent then receives scalar feedback y_t on the quality of the action, and its goal is to learn a mapping from x_t to a_t that maximizes the sum of all observed y_t. This general paradigm accommodates a wide array of well-studied machine learning scenarios. For example, in online supervised learning, A is a set of labels: the agent predicts a label for each x_t, and the feedback y_t indicates the quality of the prediction. In a contextual bandit or reinforcement learning setting, x_t acts as a context or state, a_t is an action, and y_t corresponds to a reward provided by the environment. Contextual bandits have proven useful in a wide variety of settings; their properties are extremely well studied (Langford & Zhang, 2007), and they have tremendous theoretical and real-world applications (Bouneffouf et al., 2020).

Regardless of the particulars of the learning scenario, a primary consideration is sample complexity: how can we obtain the highest-performing model given a fixed interaction budget? This question is especially pressing when agents only receive feedback for the chosen action a_t, i.e., partial feedback. Here, after an interaction with the environment, the agent does not learn what the best action in hindsight would have been. As a consequence, learners in a partial-feedback setting must explore different actions, even for a fixed x_t, in order to discover optimal behavior.
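To make the partial-feedback loop above concrete, the sketch below runs an epsilon-greedy linear learner against a synthetic contextual bandit environment. The environment, learner, and all variable names here are hypothetical stand-ins for illustration, not the paper's experimental setup: the learner observes y_t only for the action it actually chose, so it must explore to estimate the other actions' rewards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: 3 actions, reward is linear in the context
# via weight vectors that are unknown to the learner.
d, n_actions, epsilon = 5, 3, 0.1
true_w = rng.normal(size=(n_actions, d))

# Simple parametric learner: one linear reward model per action,
# updated online by SGD (a stand-in for any contextual bandit learner).
w_hat = np.zeros((n_actions, d))
lr, total_reward = 0.05, 0.0

for t in range(2000):
    x = rng.normal(size=d)                      # context x_t arrives
    if rng.random() < epsilon:                  # explore uniformly
        a = int(rng.integers(n_actions))
    else:                                       # exploit current estimates
        a = int(np.argmax(w_hat @ x))
    y = true_w[a] @ x + rng.normal(scale=0.1)   # partial feedback: only for chosen a
    w_hat[a] += lr * (y - w_hat[a] @ x) * x     # online update on (x_t, a_t, y_t)
    total_reward += y
```

Note that only `w_hat[a]` is updated each round: the agent never sees what the other actions would have paid, which is exactly why exploration is required.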
Recent work in reinforcement learning has demonstrated that episodic memory mechanisms can facilitate more efficient learning (Lengyel & Dayan, 2007; Blundell et al., 2016; Pritzel et al., 2017; Hansen et al., 2018; Lin et al., 2018; Zhu et al., 2020). Episodic memory (Tulving, 1972) refers to memory of specific past experiences (e.g., what did I have for breakfast yesterday?). This is in contrast to semantic memory, which generalizes across many experiences (e.g., what is my favorite meal for breakfast?). Semantic memory is functionally closer to parametric approaches to learning, which also rely on generalizations while discarding information about specific events or items.

This paper investigates the use of episodic memory for accelerating learning in sequential problems. We introduce the Eigen Memory Tree (EMT), a model that stores past observations in the leaves of a binary tree. Each leaf contains experiences that are similar to one another, and the EMT is structured such that new samples are routed through the tree based on the statistical properties of previously encountered data. When the EMT is queried with a new observation, this property affords an efficient way to compare it with only the most relevant memories. A learned "scoring" function w is then used to identify the most salient memory in the leaf for decision making.

In summary, this work:

• introduces the Eigen Memory Tree, an efficient tool for storing, accessing, and comparing memories to current observations (see Figure 1);
• shows that the EMT gives drastically improved performance over comparable episodic memory data structures, and sometimes even outperforms parametric approaches that have no explicit memory mechanism;
• proposes a simple combined EMT-parametric (PEMT) approach, which outperforms both purely parametric and pure EMT methods with nearly no downsides (see Table 1).
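The routing idea can be sketched as follows: an internal node keeps the leading eigenvector (principal component) of the data that passed through it, and a new sample goes left or right according to its projection onto that direction. This is an illustrative reconstruction under simplifying assumptions (a single precomputed node, a median split threshold), not the paper's exact algorithm; the `node` dictionary and function names are hypothetical.

```python
import numpy as np

def principal_direction(X):
    """Top principal component of the rows of X, via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the
    # leading eigenvector of its covariance matrix.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

def route(x, node):
    """Send x left or right by its projection onto the node's eigenvector."""
    return "left" if x @ node["u"] <= node["split"] else "right"

# A toy internal node built from 100 previously observed contexts.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
u = principal_direction(X)
node = {"u": u, "split": np.median(X @ u)}

x_new = rng.normal(size=4)
side = route(x_new, node)
```

Splitting along the principal component separates the stored points along their direction of greatest variance, so each projection-and-compare step costs O(d) and a balanced tree reaches a leaf in logarithmically many steps.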
In the following section, we introduce the Eigen Memory Tree and overview the algorithms required for storing and retrieving memories; a schematic of EMT and its high-level algorithms is shown in Figure 1. Section 3 then describes related work. Section 4 follows with an exhaustive set of experiments on a substantial number of datasets from the OpenML repository (Vanschoren et al., 2014), demonstrating the superiority of EMT over previous episodic memory models and motivating the EMT-parametric (PEMT) method, a simple and powerful hybrid strategy for obtaining competitive performance on sequential learning problems. Importantly, we show that the PEMT performance advantage holds even when it is constrained to a fixed limit on the number of memories it may store. All experiments consider the contextual bandit setting, but EMTs are applicable in broader domains as well. Throughout, we reference Table 1, which identifies the methods that outperform other methods by a statistically significant margin across all datasets and replicates. Section 5 summarizes our findings, discusses limitations, and overviews directions for future work.

2. EIGEN MEMORY TREE

As with episodic memory, EMT is structured around storing and retrieving exact memories. We formalize this notion as the self-consistency property: if a memory has previously been Inserted, then a Query with the same key should return the previously inserted value. The self-consistency property encodes the assumption that the optimal memory to return, when possible, is an exact match for the current observation. EMT is a memory model with four key characteristics: (1) self-consistency, (2) incremental memory growth, (3) incremental query improvement via supervised feedback, and (4) sub-linear computational complexity with respect to the number of memories in the tree. As we discuss in the literature review below, this combination of characteristics separates EMT from previous approaches.

Memory. EMT represents memories as a mapping from keys to values, M : R^d → R, where d is the dimensionality of the context x_t and y_t ∈ R corresponds to observed feedback. A query to the EMT requires an x_t and returns a previously observed value ŷ ∈ R from its bank of memories. EMT learning, which updates both the underlying data structure and the scoring mechanism, requires a full (x_t, y_t) observation pair.
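A minimal sketch of this key-value memory interface and the self-consistency property, using a flat exhaustive nearest-neighbor store in place of the tree: the tree's role in EMT is only to make the lookup sub-linear, so a flat store exposes the same Insert/Query behavior. The class and method names here are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

class FlatMemory:
    """Flat stand-in for EMT's memory bank: keys in R^d map to scalar values.
    Query returns the value of the nearest stored key, so querying with an
    exact previously inserted key returns its inserted value (self-consistency).
    """
    def __init__(self):
        self.keys, self.values = [], []

    def insert(self, x, y):
        """Store a full (x_t, y_t) observation pair."""
        self.keys.append(np.asarray(x, dtype=float))
        self.values.append(float(y))

    def query(self, x):
        """Return the value y-hat of the closest stored key (O(n) here;
        EMT's tree structure makes this step logarithmic)."""
        x = np.asarray(x, dtype=float)
        dists = [np.linalg.norm(x - k) for k in self.keys]
        return self.values[int(np.argmin(dists))]

mem = FlatMemory()
mem.insert([1.0, 2.0], 0.5)
mem.insert([3.0, -1.0], 0.9)
# Self-consistency: a query with a previously inserted key returns its value.
assert mem.query([1.0, 2.0]) == 0.5
```

A query with a nearby but unseen key (e.g. `[1.1, 2.1]`) falls back to the closest stored memory, which is the behavior the learned scoring function refines inside each EMT leaf.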



Figure 1: A schematic of the Eigen Memory Tree (EMT) algorithm. The outer boxes indicate the two operations supported by EMT, Learn and Query. The inner boxes show the high-level subroutines that occur to accomplish the outer operation.

Table 1: Cells indicate the number of datasets in which the row algorithm beats the column algorithm with statistical significance. The stacked algorithm we propose (PEMT) has the most wins in each column, as indicated by bold.

