INCREMENTAL LEARNING OF STRUCTURED MEMORY VIA CLOSED-LOOP TRANSCRIPTION

Abstract

This work proposes a minimal computational model for learning structured memories of multiple object classes in an incremental setting. Our approach is based on establishing a closed-loop transcription between the classes and a corresponding set of subspaces, known as a linear discriminative representation, in a low-dimensional feature space. Our method is simpler than existing approaches for incremental learning, and more efficient in terms of model size, storage, and computation: it requires only a single, fixed-capacity autoencoding network with a feature space that is used for both discriminative and generative purposes. Network parameters are optimized simultaneously without architectural manipulations, by solving a constrained minimax game between the encoding and decoding maps over a single rate reduction-based objective. Experimental results show that our method can effectively alleviate catastrophic forgetting, achieving significantly better performance than prior work of generative replay on MNIST, CIFAR-10, and ImageNet-50, despite requiring fewer resources.

1. INTRODUCTION

Artificial neural networks have demonstrated a great ability to learn representations for hundreds or even thousands of classes of objects, in both discriminative and generative contexts. However, networks typically must be trained offline, with uniformly sampled data from all classes simultaneously. When the same network is updated to learn new classes without data from the old ones, previously learned knowledge will fall victim to the problem of catastrophic forgetting (McCloskey & Cohen, 1989) . This is known in neuroscience as the stability-plasticity dilemma: the challenge of ensuring that a neural system can learn from a new environment while retaining essential knowledge from previous ones (Grossberg, 1987) . In contrast, natural neural systems (e.g. animal brains) do not seem to suffer from such catastrophic forgetting at all. They are capable of developing new memory of new objects while retaining memory of previously learned objects. This ability, for either natural or artificial neural systems, is often referred to as incremental learning, continual learning, sequential learning, or life-long learning (Allred & Roy, 2020). While many recent works have highlighted how incremental learning might enable artificial neural systems that are trained in more flexible ways, the strongest existing efforts toward answering the stability-plasticity dilemma for artificial neural networks typically require raw exemplars (Rebuffi et al., 2017; Chaudhry et al., 2019b) or require external task information (Kirkpatrick et al., 2017) . Raw exemplars, particularly in the case of high-dimensional inputs like images, are costly and difficult to scale, while external mechanisms -which, as surveyed in Section 2, include secondary networks and representation spaces for generative replay, incremental allocation of network resources, network duplication, or explicit isolation of used and unused parts of the network -require heuristics and incur hidden costs. In this work, we are interested in an incremental learning setting that counters these trends with two key qualities. (1) The first is that it is memory-based. When learning new classes, no raw exemplars of old classes are available to train the network together with new data. This implies that one has to rely on a compact and thus structured "memory" of old classes, such as incrementally learned generative representations of the old classes, as well as the associated encoding and decoding mappings (Kemker

