A DISTINCT UNSUPERVISED REFERENCE MODEL FROM THE ENVIRONMENT HELPS CONTINUAL LEARNING

Anonymous authors
Paper under double-blind review

Abstract

Existing continual learning methods focus mainly on fully supervised scenarios and are still unable to take advantage of unlabeled data available in the environment. Some recent works have investigated semi-supervised continual learning (SSCL) settings in which unlabeled data are available, but only when drawn from the same distribution as the labeled data. This assumption is still not general enough for real-world applications and restricts the utilization of unsupervised data. In this work, we introduce Open-Set Semi-Supervised Continual Learning (OSSCL), a more realistic semi-supervised continual learning setting in which out-of-distribution (OoD) unlabeled samples in the environment are assumed to coexist with the in-distribution ones. Under this configuration, we present a model with two distinct parts: (i) the reference network captures general-purpose, task-agnostic knowledge from the environment by using a broad spectrum of unlabeled samples, and (ii) the learner network learns task-specific representations by exploiting supervised samples. The reference model both provides a pivotal representation space and segregates unlabeled data so that they can be exploited more efficiently. Through a diverse range of experiments, we show the superior performance of our model compared with other competitors and demonstrate the effectiveness of each component of the proposed model.
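The segregation step described above can be sketched in a minimal form: the frozen reference network embeds each unlabeled sample, and samples whose embeddings lie close to a labeled class prototype are treated as in-distribution while the rest are set aside as OoD. The function name, the prototype-based similarity rule, and the fixed threshold below are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
import numpy as np

def segregate_unlabeled(ref_embed, unlabeled, class_prototypes, threshold=0.5):
    """Split unlabeled samples into in-distribution and OoD subsets.

    ref_embed: callable mapping an (N, D) array of samples to
        L2-normalized reference-network embeddings of shape (N, E).
    class_prototypes: (C, E) array of L2-normalized per-class mean
        embeddings computed from the labeled set.
    threshold: illustrative cosine-similarity cutoff.
    """
    z = ref_embed(unlabeled)                 # (N, E) reference embeddings
    sims = z @ class_prototypes.T            # cosine similarity to each class
    max_sim = sims.max(axis=1)               # similarity to the closest class
    in_dist = max_sim >= threshold           # boolean mask over samples
    return unlabeled[in_dist], unlabeled[~in_dist]
```

In practice the threshold would be tuned (or replaced by a learned criterion), but the sketch shows how a task-agnostic reference space can route unlabeled data before the learner consumes it.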

1. INTRODUCTION

In a real-world continual learning (CL) problem, the agent has to learn from a non-i.i.d. stream of samples under serious restrictions on storing data. In this case, the agent is prone to catastrophic forgetting during training (French, 1999). Existing CL methods focus mainly on supervised scenarios and can be categorized into three main approaches (Parisi et al., 2019): (i) Replay-based methods reuse samples from previous tasks, either by keeping raw samples in a limited memory buffer (Rebuffi et al., 2017; Lopez-Paz & Ranzato, 2017; Aljundi et al., 2019) or by generating pseudo-samples from previous classes (Shin et al., 2017; Wu et al., 2018; van de Ven et al., 2020). (ii) Regularization-based methods aim to maintain the stability of the network across tasks by penalizing deviation from previously learned representations or parameters (Nguyen et al., 2018; Cha et al., 2021; Rebuffi et al., 2017; Li & Hoiem, 2016). (iii) Parameter-isolation methods dedicate distinct parameters to each task by introducing new task-specific weights or masks (Rusu et al., 2016; Yoon et al., 2018; Wortsman et al., 2020).

Humans, as intelligent agents, are constantly in contact with vast amounts of unsupervised data streamed endlessly in the environment, which can be used to facilitate concept learning in the brain (Zhuang et al., 2021; Bi & Poo, 1998; Hinton & Sejnowski, 1999). With this in mind, an important but less explored issue in many practical CL applications is how to effectively utilize a vast stream of unlabeled data along with limited labeled samples. Recently, efforts have been made in this direction, leading to the investigation of three different configurations: Wang et al. (2021) introduced a very restricted scenario for semi-supervised continual learning in which the unsupervised data come only from the classes being learned at the current time step. On the other hand, Lee et al.
(2019) introduced a configuration that is "more similar to self-taught learning rather than semi-supervised learning". In fact, they introduced a setting in which the model is exposed to plenty of labeled samples, which is a necessary assumption for their model to achieve good performance; in addition, their model has access to a large corpus of unsupervised data in an environment that typically does not include samples related to the current CL problem. Adopting this idea, Smith et al. (2021) proposed a
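As a concrete, deliberately simplified instance of the replay-based family mentioned above, a fixed-size memory buffer can be maintained with reservoir sampling, so that every example in the stream has an equal chance of being retained regardless of when it arrives. The class name and interface below are illustrative, not taken from any of the cited methods.

```python
import random

class ReservoirBuffer:
    """Fixed-capacity replay memory filled by reservoir sampling:
    after seeing n examples, each one is kept with probability
    capacity / n, uniformly over the whole stream."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []       # stored examples
        self.seen = 0        # total stream elements observed
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)      # buffer not yet full
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example     # replace a random slot

    def sample(self, k):
        """Draw a replay mini-batch (without replacement)."""
        return self.rng.sample(self.data, min(k, len(self.data)))
```

During training, such a buffer would be updated as the stream arrives and its `sample` output interleaved with current-task batches to mitigate forgetting.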

