ONLINE BOUNDARY-FREE CONTINUAL LEARNING BY SCHEDULED DATA PRIOR

Abstract

The typical continual learning setup assumes that the dataset is split into multiple discrete tasks. We argue that this is unrealistic, as streamed real-world data have no notion of task boundaries. Here, we take a step toward more realistic online continual learning: learning a continuously changing data distribution without explicit task boundaries, which we call the boundary-free setup. Without boundaries, it is not obvious when and what past information should be preserved to better address the stability-plasticity dilemma. To this end, we propose a scheduled transfer of previously learned knowledge. In addition, we propose a data-driven balancing between past and present knowledge in the learning objective. Moreover, since previously proposed forgetting measures cannot be used directly without task boundaries, we further propose novel forgetting and knowledge-gain measures based on information theory. We empirically evaluate our method on a Gaussian data stream and its periodic extension, which are frequently observed in real-life data, as well as on the conventional disjoint task split. Our method outperforms prior arts by large margins in various setups on four benchmark datasets from the continual learning literature: CIFAR-10, CIFAR-100, TinyImageNet, and ImageNet. Code is available at https://github.com/yonseivnl/sdp.

1. INTRODUCTION

In real-world continual learning (CL) scenarios (He et al., 2020), data arrive in a streamed manner (Aljundi et al., 2019a; Cai et al., 2021), whereas typical continual learning setups split the data into multiple discrete tasks whose data distributions differ from each other. Moreover, most CL algorithms are studied in an offline CL setup (Kirkpatrick et al., 2017; Rebuffi et al., 2017; Saha et al., 2021), where the model can access the data multiple times. While prevalent in the literature, this setup is far from realistic scenarios in several respects. Although the task setup has been partly addressed by (Prabhu et al., 2020; Koh et al., 2021; Kim et al., 2021b; Bang et al., 2022), the revised setups still retain the notion of a task boundary, whereas real-world data may have no explicit task boundaries because the data distribution changes continuously. Although many methods update the model in a boundary-agnostic manner, called task-free CL (Aljundi et al., 2019b; Koh et al., 2021), they still leverage the notion of a task boundary for knowledge transfer and evaluation, e.g., by assuming that distribution shifts in the data stream occur only at task boundaries. In addition, the definition of forgetting depends on the notions of 'old' and 'new' tasks, which are themselves defined by the task boundary. We therefore address an online CL setup in which data are learned online (allowing only a single access to each datum) under continuous distribution shift without explicit task boundaries. We refer to this setup as online boundary-free continual learning. In this setup, a small batch of data is streamed to the model at a time, and the model has access only to the current data batch (Aljundi et al., 2019c; a), without any notion of task boundary.
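The boundary-free stream described above can be illustrated with a minimal toy generator in which each class's samples arrive at times drawn from a class-specific Gaussian, so class distributions overlap continuously and no explicit boundary exists. All names and parameter values here are illustrative assumptions, not taken from the paper:

```python
import random

def gaussian_stream(num_classes=10, samples_per_class=100, spread=0.1, seed=0):
    """Toy boundary-free data stream: each class's samples arrive at times
    drawn from a Gaussian centered at that class's position in [0, 1].
    Adjacent classes overlap in time, so the class distribution shifts
    continuously and there is no explicit task boundary."""
    rng = random.Random(seed)
    events = []
    for c in range(num_classes):
        center = (c + 0.5) / num_classes  # class-specific arrival peak
        for _ in range(samples_per_class):
            events.append((rng.gauss(center, spread), c))
    events.sort()  # stream order = arrival time
    return [c for _, c in events]

# The learner would consume this stream one small batch at a time,
# with only a single pass over each sample and no boundary signal.
stream = gaussian_stream()
```

Early portions of such a stream are dominated by low-index classes and later portions by high-index classes, with a gradual handover in between rather than a hard switch.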
For the distribution of a continuous data stream, following (Shanahan et al., 2021; Wang et al., 2022), we consider a Gaussian distribution as an instance of data-streaming distributions. The Gaussian

† indicates the corresponding author.

