CONTINUAL PROTOTYPE EVOLUTION: LEARNING ONLINE FROM NON-STATIONARY DATA STREAMS

Abstract

Attaining prototypical features to represent class distributions is well established in representation learning. However, learning prototypes online from streams of data is a challenging endeavor, as they rapidly become outdated due to the ever-changing parameter space of the learner. Additionally, continual learning does not assume the data stream to be stationary, which typically results in catastrophic forgetting of previous knowledge. We are the first to introduce a system that addresses both problems: prototypes evolve continually in a shared latent space, enabling learning and prediction at any point in time. In contrast to the major body of work in continual learning, data streams are processed in an online fashion, without additional task information, and an efficient memory scheme provides robustness to imbalanced data streams. Besides nearest-neighbor based prediction, learning is facilitated by a novel objective function that encourages cluster density around the class prototype and increased inter-class variance. Furthermore, latent space quality is elevated by pseudo-prototypes in each batch, constituted by replay of exemplars from memory. We generalize the existing paradigms in continual learning to incorporate data-incremental learning from data streams by formalizing a two-agent learner-evaluator framework, and obtain state-of-the-art performance by a significant margin on eight benchmarks, including three highly imbalanced data streams.

1. INTRODUCTION

The prevalence of data streams in contemporary applications urges systems to learn in a continual fashion. Autonomous vehicles, sensory robot data, and video streaming yield never-ending streams of data, with abrupt changes in the observed environment behind every vehicle turn, robot entering a new room, or camera cut to a subsequent scene. Alas, learning from streaming data is far from trivial due to these changes, as neural networks tend to forget the knowledge they previously acquired. The data stream presented to the network is not independently and identically distributed (i.i.d.), giving rise to a trade-off between neural stability to retain the current state of knowledge and neural plasticity to swiftly adopt new knowledge (Grossberg, 1982). Finding the balance in this stability-plasticity dilemma addresses the catastrophic forgetting (French, 1999) induced by the non-i.i.d. nature of the data stream, and is considered the main hurdle for continually learning systems. Although a lot of progress has been established in the literature, strong assumptions often apply, impeding applicability for real-world systems. The static training and testing paradigms prevail, whereas a true continual learner should enable both simultaneously and independently. Therefore, we propose the two-agent learner-evaluator framework to redefine perspective on existing paradigms in the field. Within this framework, we introduce data-incremental learning, enabling completely task-free learning and evaluation. Furthermore, we introduce Continual Prototype Evolution (CoPE), a new online data-incremental learner wherein prototypes perpetually represent the most salient features of the class population, shifting the catastrophic forgetting problem from the full network parameter space to the lower-dimensional latent space. For the first time, our prototypes evolve continually with the data stream, enabling learning and evaluation at any point in time.
Similar to representativeness heuristics in human cognition (Kahneman & Tversky, 1972), the class prototypes are the cornerstone for nearest-neighbor classification. Additionally, the system is robust to highly imbalanced data streams by the combination
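The prototype mechanism described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the exponential-moving-average update (momentum `alpha`) and the plain cosine nearest-prototype rule are simplifications, and in CoPE the embeddings would come from the continually trained encoder rather than being given directly.

```python
import math


def normalize(v):
    """L2-normalize a vector (returns v unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


class PrototypeClassifier:
    """Class prototypes in a shared latent space, updated online so they
    evolve with the stream; prediction picks the prototype with the
    highest cosine similarity to the query embedding."""

    def __init__(self, alpha=0.9):
        self.alpha = alpha   # momentum: fraction of the old prototype kept per update
        self.protos = {}     # class label -> L2-normalized prototype vector

    def update(self, z, y):
        """Fold one embedding z of class y into that class's prototype."""
        z = normalize(z)
        if y not in self.protos:
            self.protos[y] = z
        else:
            p = self.protos[y]
            self.protos[y] = normalize(
                [self.alpha * pi + (1.0 - self.alpha) * zi
                 for pi, zi in zip(p, z)]
            )

    def predict(self, z):
        """Nearest-prototype classification on normalized embeddings."""
        z = normalize(z)
        return max(self.protos,
                   key=lambda y: sum(a * b for a, b in zip(z, self.protos[y])))
```

Because each update only blends a new embedding into a per-class vector, prototypes remain valid between updates, which is what enables prediction at any point in time without a separate training phase.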

