SEMANTIC PRIOR FOR WEAKLY SUPERVISED CLASS-INCREMENTAL SEGMENTATION

Abstract

Class-incremental semantic image segmentation assumes multiple model updates, each enriching the model to segment new categories. This is typically carried out by providing pixel-level manual annotations for all new objects, limiting the adoption of such methods. Approaches that solely require image-level labels offer an attractive alternative, yet such annotations lack crucial information about the location and boundaries of new objects. In this paper we argue that, since classes represent not just indices but semantic entities, the conceptual relationships between them provide valuable information that should be leveraged. We propose a weakly supervised approach that exploits such semantic relations to transfer cues from the previously learned classes to the new ones, complementing the supervisory signal from image-level labels. We validate our approach on a number of continual learning tasks, and show how even a simple pairwise interaction between classes can significantly improve the segmentation mask quality of both old and new classes. We show that these conclusions still hold for longer and, hence, more realistic sequences of tasks, as well as for a challenging few-shot scenario.

1. INTRODUCTION

When working towards the real-world deployment of artificial intelligence systems, two main challenges arise: such systems should possess the ability to learn continuously, and this learning process should require only limited human intervention. While deep learning models have proven effective in tackling tasks for which large amounts of curated data as well as abundant computational resources are available, they still struggle to learn over continuous and potentially heterogeneous sequences of tasks, especially if supervision is limited.

In this work, we focus on the task of semantic image segmentation (SiS). A reliable and versatile SiS model should be able to seamlessly add new categories to its repertoire without forgetting the old ones. Considering for instance a house robot or a self-driving vehicle with such segmentation capability, we would like it to handle new classes without having to retrain the segmentation model from scratch. Such ability is at the core of continual learning research, the main challenge being to mitigate catastrophic forgetting of what has been previously learned (Parisi et al., 2019).

Figure 1: Our proposed Relation-aware Semantic Prior (RaSP) loss is based on the intuition that predictions from existing classes provide valuable cues to better segment new, semantically related classes. This allows reducing supervision to image-level labels for incremental SiS.

Most learning algorithms for SiS assume training samples with accurate pixel-level annotations, a time-consuming and tedious operation. We argue that this is cumbersome and severely hinders continual learning; adding new classes over time should be a lighter-weight process. This is why, here, we focus on the case where only image-level labels are provided (e.g., adding the 'sheep' class comes as easily as providing images guaranteed to contain at least one sheep).
This weakly supervised task is extremely challenging in itself, and very few attempts have been made to address it in the context of continual learning (Cermelli et al., 2022).
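To make the intuition behind the semantic prior concrete, the sketch below shows one minimal way such a prior could be formed: old-class score maps are combined, weighted by their semantic similarity to the new class, to yield a spatial hint of where the new class may appear. This is only an illustration of the idea, not the paper's actual loss; the similarity values, class names, and the `semantic_prior` function are hypothetical placeholders (in practice similarities would come from a semantic space such as word embeddings).

```python
import numpy as np

# Hypothetical similarities between a new class ('sheep') and the classes
# the model already knows. Illustrative numbers, not learned values.
old_classes = ["horse", "car", "dog"]
similarity_to_sheep = np.array([0.8, 0.1, 0.6])

# Mock per-pixel scores for the old classes over a 4x4 image (values in [0, 1)).
rng = np.random.default_rng(0)
old_scores = rng.random((len(old_classes), 4, 4))

def semantic_prior(scores, similarity):
    """Similarity-weighted combination of old-class score maps,
    used as a spatial prior for the new class."""
    weights = similarity / similarity.sum()       # normalise similarities
    return np.tensordot(weights, scores, axes=1)  # (H, W) prior map

prior = semantic_prior(old_scores, similarity_to_sheep)
print(prior.shape)  # (4, 4)
```

Here the 'sheep' prior is dominated by the 'horse' and 'dog' maps, reflecting the intuition that predictions for semantically related classes carry location cues that image-level labels alone cannot provide.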

