ON THE SOFT-SUBNETWORK FOR FEW-SHOT CLASS INCREMENTAL LEARNING

Abstract

Inspired by the Regularized Lottery Ticket Hypothesis, which states that competitive smooth (non-binary) subnetworks exist within a dense network, we propose a few-shot class-incremental learning method referred to as Soft-SubNetworks (SoftNet). Our objective is to learn a sequence of sessions incrementally, where each session includes only a few training instances per class, while preserving the knowledge of previously learned sessions. SoftNet jointly learns the model weights and adaptive non-binary soft masks at a base training session, where each mask consists of a major and a minor subnetwork; the former aims to minimize catastrophic forgetting during training, and the latter aims to avoid overfitting to the few samples in each new training session. We provide comprehensive empirical validations demonstrating that SoftNet effectively tackles the few-shot class-incremental learning problem, surpassing the performance of state-of-the-art baselines on benchmark datasets.
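To make the soft-mask construction concrete, the following is a minimal NumPy sketch (not the authors' implementation) of one plausible reading of the mechanism: weights with the highest importance scores form the major subnetwork and receive a mask value of 1, while the remaining (minor) weights receive smooth mask values drawn from U(0, 1). The function name `soft_mask` and the use of weight magnitude as the importance score are illustrative assumptions.

```python
import numpy as np

def soft_mask(scores: np.ndarray, top_frac: float, rng=None) -> np.ndarray:
    """Build a non-binary soft mask: the top `top_frac` fraction of weights
    (by importance score) form the major subnetwork with mask value 1.0;
    the rest form the minor subnetwork with mask values drawn from U(0, 1).
    Note: this is an illustrative sketch, not the paper's exact procedure."""
    rng = rng or np.random.default_rng(0)
    k = int(top_frac * scores.size)
    flat = scores.ravel()
    # Boolean indicator of the major subnetwork (top-k scores).
    major = np.zeros(scores.size, dtype=bool)
    major[np.argpartition(flat, -k)[-k:]] = True
    # Minor subnetwork: smooth mask values in [0, 1).
    mask = rng.uniform(0.0, 1.0, size=scores.size)
    # Major subnetwork: mask value fixed to 1 (weights pass unchanged).
    mask[major] = 1.0
    return mask.reshape(scores.shape)

# Masked forward pass: major weights pass unchanged, minor weights are damped.
W = np.random.default_rng(1).normal(size=(4, 4))
m = soft_mask(np.abs(W), top_frac=0.5)   # magnitude as a proxy importance score
masked_W = W * m
```

In this reading, updating only the minor subnetwork during incremental sessions leaves the major subnetwork (and hence base-session knowledge) untouched, which is one way the soft partition can trade off forgetting against overfitting.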

1. INTRODUCTION

Lifelong Learning, or Continual Learning, is a learning paradigm that expands knowledge and skills through sequential training on multiple tasks (Thrun, 1995). According to the accessibility of task identity during training and inference, the community often categorizes the field into specific problems, such as task-incremental (Pfülb and Gepperth, 2019; Delange et al., 2021; Yoon et al., 2020; Kang et al., 2022), class-incremental (Chaudhry et al., 2018; Kuzborskij et al., 2013; Li and Hoiem, 2017; Rebuffi et al., 2017; Kemker and Kanan, 2017; Castro et al., 2018; Hou et al., 2019; Wu et al., 2019), and task-free continual learning (Aljundi et al., 2019; Jin et al., 2021; Pham et al., 2022; Harrison et al., 2020). While standard continual learning scenarios assume a sufficiently large number of instances per task, a lifelong learner in real-world applications often suffers from insufficient training instances for each problem to solve. This paper aims to tackle the issue of limited training instances in practical Class-Incremental Learning (CIL), referred to as Few-Shot CIL (FSCIL) (Ren et al., 2019; Chen and Lee, 2020; Tao et al., 2020; Zhang et al., 2021; Cheraghian et al., 2021; Shi et al., 2021). There are two critical challenges in solving FSCIL problems: catastrophic forgetting and overfitting. Catastrophic forgetting (Goodfellow et al., 2013; Kirkpatrick et al., 2017), or catastrophic interference (McCloskey and Cohen, 1989), is a phenomenon in which a continual learner loses previously learned task knowledge by updating its weights to adapt to new tasks, resulting in significant performance degradation on previous tasks. Such undesired knowledge drift is irreversible since the scenario does not allow the model to revisit past task data.
Recent works propose to mitigate catastrophic forgetting in class-incremental learning and are often categorized along multiple directions, such as constraint-based (Rebuffi et al., 2017; Castro et al., 2018; Hou et al., 2018; 2019; Wu et al., 2019), memory-based (Rebuffi et al., 2017; Chen and Lee, 2020; Mazumder et al., 2021; Shi et al., 2021), and architecture-based methods (Mazumder et al., 2021; Serra et al., 2018; Mallya and Lazebnik, 2018; Kang et al., 2022). However, we note that catastrophic forgetting becomes further challenging

Code availability: //github.com/ihaeyong/