INCREMENTAL LEARNING OF STRUCTURED MEMORY VIA CLOSED-LOOP TRANSCRIPTION

Abstract

This work proposes a minimal computational model for learning structured memories of multiple object classes in an incremental setting. Our approach is based on establishing a closed-loop transcription between the classes and a corresponding set of subspaces, known as a linear discriminative representation, in a low-dimensional feature space. Our method is simpler than existing approaches for incremental learning, and more efficient in terms of model size, storage, and computation: it requires only a single, fixed-capacity autoencoding network with a feature space that is used for both discriminative and generative purposes. Network parameters are optimized simultaneously without architectural manipulations, by solving a constrained minimax game between the encoding and decoding maps over a single rate reduction-based objective. Experimental results show that our method can effectively alleviate catastrophic forgetting, achieving significantly better performance than prior generative-replay methods on MNIST, CIFAR-10, and ImageNet-50, despite requiring fewer resources.

1. INTRODUCTION

Artificial neural networks have demonstrated a great ability to learn representations for hundreds or even thousands of classes of objects, in both discriminative and generative contexts. However, networks typically must be trained offline, with uniformly sampled data from all classes simultaneously. When the same network is updated to learn new classes without data from the old ones, previously learned knowledge falls victim to catastrophic forgetting (McCloskey & Cohen, 1989). This is known in neuroscience as the stability-plasticity dilemma: the challenge of ensuring that a neural system can learn from a new environment while retaining essential knowledge from previous ones (Grossberg, 1987). In contrast, natural neural systems (e.g., animal brains) do not seem to suffer from such catastrophic forgetting at all: they are capable of forming memories of new objects while retaining memories of previously learned ones. This ability, for either natural or artificial neural systems, is often referred to as incremental learning, continual learning, sequential learning, or life-long learning (Allred & Roy, 2020). While many recent works have highlighted how incremental learning might enable artificial neural systems to be trained in more flexible ways, the strongest existing efforts toward resolving the stability-plasticity dilemma for artificial neural networks typically require raw exemplars (Rebuffi et al., 2017; Chaudhry et al., 2019b) or external task information (Kirkpatrick et al., 2017). Raw exemplars, particularly in the case of high-dimensional inputs like images, are costly to store and difficult to scale, while external mechanisms (which, as surveyed in Section 2, include secondary networks and representation spaces for generative replay, incremental allocation of network resources, network duplication, and explicit isolation of used and unused parts of the network) require heuristics and incur hidden costs.
In this work, we are interested in an incremental learning setting that counters these trends with two key qualities. (1) The first is that it is memory-based. When learning new classes, no raw exemplars of old classes are available to train the network together with new data. This implies that one has to rely on a compact and thus structured "memory" of old classes, such as incrementally learned generative representations of the old classes, together with the associated encoding and decoding mappings (Kemker & Kanan, 2018). (2) The second is that it is self-contained. Incremental learning takes place in a single neural system with a fixed capacity, and in a common representation space. The ability to minimize forgetting follows from optimizing an overall learning objective, without external networks, architectural modifications, or resource allocation mechanisms. Concretely, the contributions of our work are as follows: (1) We demonstrate how the closed-loop transcription (CTRL) framework (Dai et al., 2022; 2023) can be adapted for memory-based, self-contained mitigation of catastrophic forgetting (Figure 1). To the best of our knowledge, these qualities have not yet been demonstrated by existing methods. Closed-loop transcription aims to learn linear discriminative representations (LDRs) via a rate reduction-based (Yu et al., 2020; Ma et al., 2007; Ding et al., 2023) minimax game: our method, which we call incremental closed-loop transcription (i-CTRL), shows how these principled representations and objectives can uniquely facilitate incremental learning of stable and structured class memories. This requires only a fixed-size neural system and a common learning objective, which transforms the standard CTRL minimax game into a constrained one, where the goal is to optimize a rate reduction objective for each new class while keeping the memory of old classes intact.
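To make the rate reduction objective concrete, the sketch below implements the coding-rate function and its class-conditional reduction, following the formulation of Yu et al. (2020). This is a minimal numpy illustration only: the precision parameter eps and the matrix shapes are illustrative assumptions, not the configuration used in our experiments, and the actual i-CTRL objective applies these quantities inside a constrained minimax game.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps) = 1/2 * logdet(I + d/(n*eps^2) * Z @ Z.T),
    the number of bits needed to encode the n feature vectors
    (columns of the d x n matrix Z) up to precision eps."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Delta R = R(Z) - sum_c (n_c/n) * R(Z_c): the coding rate of all
    features minus the average rate of each class's features. Maximizing
    it makes the whole representation expansive while keeping each class
    subspace compact."""
    n = Z.shape[1]
    per_class = sum(
        (np.sum(labels == c) / n) * coding_rate(Z[:, labels == c], eps)
        for c in np.unique(labels)
    )
    return coding_rate(Z, eps) - per_class
```

For example, two classes whose features lie on orthogonal lines yield a strictly positive rate reduction, since the union of the two subspaces occupies more volume than either class alone.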
(2) We quantitatively evaluate i-CTRL on class-incremental learning for a range of datasets: MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and ImageNet-50 (Deng et al., 2009). Despite requiring fewer resources (a smaller network and nearly no extra memory buffer), i-CTRL outperforms comparable alternatives: it achieves a 5.8% improvement in average classification accuracy over the previous state of the art on CIFAR-10, and a 10.6% improvement in average accuracy on ImageNet-50. (3) We qualitatively verify the structure and generative abilities of the learned representations. Notably, the self-contained i-CTRL system's common representation is used for both classification and generation, which eliminates the redundant external generative-replay representations used by prior work. (4) We demonstrate a "class-unsupervised" incremental reviewing process for i-CTRL. As an incremental neural system learns more classes, the memory of previously learned classes inevitably degrades: by seeing a class only once, we can only expect to form a temporary memory. Facilitated by the structure of our linear discriminative representations, the incremental reviewing process shows that the standard i-CTRL objective can reverse forgetting in a trained i-CTRL system using samples from previously seen classes, even when they are unlabeled. The resulting semi-supervised process improves generative quality and raises the accuracy of i-CTRL from 59.9% to 65.8% on CIFAR-10, approaching jointly-optimal performance even though class labels are only provided incrementally.
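The reason unlabeled review samples can be used at all is that an LDR represents each class by an (approximately) linear subspace, so a new feature can be assigned to the class whose subspace it lies closest to. The numpy sketch below illustrates this nearest-subspace assignment; the per-class bases, the subspace dimension r, and the SVD-based construction are illustrative assumptions rather than our exact reviewing procedure.

```python
import numpy as np

def class_subspace_basis(Z_class, r):
    """Orthonormal basis for one class's feature subspace: the top-r
    principal directions of the stored d x n feature matrix Z_class."""
    U, _, _ = np.linalg.svd(Z_class, full_matrices=False)
    return U[:, :r]

def nearest_subspace_label(z, bases):
    """Assign feature vector z to the class whose subspace captures the
    most of its energy, i.e. the largest projection norm ||U_c^T z||."""
    scores = [np.linalg.norm(U.T @ z) for U in bases]
    return int(np.argmax(scores))
```

In a reviewing pass, each unlabeled sample would first be encoded, then labeled this way, and finally used to refresh the corresponding class memory under the same objective.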

2. RELATED WORK

A significant body of work has studied methods for addressing forms of the incremental learning problem. In this section, we discuss a selection of representative approaches, and highlight relationships to i-CTRL.



Figure 1: Overall framework of our closed-loop transcription based incremental learning for a structured LDR memory. Only a single, entirely self-contained, encoding-decoding network is needed: for a new data class X_new, a new LDR memory Z_new is incrementally learned as a minimax game between the encoder and decoder, subject to the constraint that the old memory of past classes Z_old is kept intact through the closed-loop transcription (or replay): Z_old ≈ Ẑ_old = f(g(Z_old)).
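The replay constraint in the caption can be expressed as a simple consistency term on the old memory: how far Z_old drifts after one trip around the loop. The numpy sketch below illustrates this; the placeholder encoder f and decoder g (a random linear map and its pseudo-inverse, so the loop is exact by construction) are toy assumptions standing in for the actual networks.

```python
import numpy as np

def replay_consistency(f, g, Z_old):
    """Per-sample drift of the old memory around the closed loop:
    ||Z_old - f(g(Z_old))||_F / n. During incremental training this
    quantity is constrained to stay near zero while the minimax game
    fits the new class."""
    Z_hat = f(g(Z_old))  # decode old memory to input space, re-encode
    return np.linalg.norm(Z_old - Z_hat) / Z_old.shape[1]

# Toy stand-ins: g maps 8-dim features to 32-dim "images", f inverts it.
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 8))   # placeholder "decoder" weights
W_pinv = np.linalg.pinv(W)         # placeholder "encoder" weights
g = lambda Z: W @ Z
f = lambda X: W_pinv @ X
Z_old = rng.standard_normal((8, 16))
drift = replay_consistency(f, g, Z_old)  # ~0 for this exact loop
```

With real nonlinear networks the loop is only approximate, which is precisely why the constraint must be enforced during the minimax optimization.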

