PUSH AND PULL: COMPETING FEATURE-PROTOTYPE INTERACTIONS IMPROVE SEMI-SUPERVISED SEMANTIC SEGMENTATION

Abstract

This paper challenges semi-supervised segmentation with a rethink of the feature-prototype interaction in the classification head. Specifically, we view each weight vector in the classification head as the prototype of a semantic category. The basic practice in the softmax classifier is to pull a feature towards its positive prototype (i.e., the prototype of its class) and to push it away from its negative prototypes. In this paper, we focus on the interaction between a feature and its negative prototypes, which is always "pushing" to make them dissimilar. While this pushing-away interaction is necessary, this paper reveals that the contrary interaction of pulling negative prototypes close is also beneficial. We offer two insights for this counter-intuitive interaction: 1) some pseudo negative prototypes might actually be positive, so the pulling interaction can help resist pseudo-label noise, and 2) some true negative prototypes contain contextual information that is beneficial. We therefore integrate these two competing interactions into a Push-and-Pull Learning (PPL) method. On the one hand, PPL introduces a novel pulling-close interaction between features and negative prototypes through feature-to-prototype attention. On the other hand, PPL reinforces the original pushing-away interaction with multi-prototype contrastive learning. Despite its simplicity, PPL substantially improves semi-supervised segmentation in experiments and sets a new state of the art.

1. INTRODUCTION

This paper considers the semi-supervised semantic segmentation task. We focus on an essential component of the segmentation model, i.e., the classification head, which consists of a set of learnable weight vectors. These weight vectors are usually viewed as a set of prototypes representing the corresponding semantic categories. The essential training process pulls each deep feature towards its positive prototype (i.e., the prototype of its class) and pushes it away from its negative prototypes. The "pushing-away" interaction between a feature and its negative prototypes makes them dissimilar to each other and is critical for discriminating between classes. Throughout the rest of this paper, our discussion focuses on the interaction between features and their negative prototypes and neglects the positive prototypes (unless explicitly pointed out). While this "pushing-away" interaction is necessary, this paper reveals that the contrary interaction of pulling features and their negative prototypes close is also beneficial. This "pulling-close" interaction may seem counter-intuitive at first glance but is reasonable for two reasons: 1) It brings a task-specific benefit to semi-supervised segmentation by resisting pseudo-label noise. Specifically, the popular pseudo-label-based pipeline inevitably suffers from pseudo-label noise: some pseudo negative prototypes might actually be positive. Under this condition, the "pulling-close" interaction gives the feature a chance to approach its actually-positive prototype. Experiments confirm that the pulling interaction effectively reduces pseudo-label noise, and this task-specific benefit is the primary reason for our improvement (Section 4.4). 2) It brings a general benefit to both semi-supervised and fully-supervised segmentation because some negative prototypes contain contextual information.
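To make the push-and-pull terminology concrete, note that the cross-entropy gradient of a linear classification head already decomposes into a pull towards the positive prototype and a push away from the negative prototypes. The following is a minimal NumPy sketch of this decomposition; the tensor sizes, random seed, and learning rate are illustrative and not taken from the paper:

```python
import numpy as np

# View each row of the classification-head weight matrix W as the
# prototype of one semantic category (toy, illustrative sizes).
rng = np.random.default_rng(0)
D, C = 8, 4                       # feature dim, number of classes
W = rng.normal(size=(C, D))       # C prototypes
f = rng.normal(size=D)            # one pixel feature
y = 2                             # its (pseudo) label

def softmax(z):
    z = z - z.max()               # numerical stability
    e = np.exp(z)
    return e / e.sum()

# The cross-entropy gradient w.r.t. the feature decomposes as
#   dL/df = -(1 - p_y) * W[y]           (pull towards the positive prototype)
#           + sum_{c != y} p_c * W[c]   (push away from negative prototypes)
# which collapses to W^T p - W[y]:
p = softmax(W @ f)
grad_f = W.T @ p - W[y]

# One small gradient-descent step on the feature reduces the loss,
# moving f towards W[y] and away from the negative prototypes.
f_new = f - 0.01 * grad_f
loss = lambda x: -np.log(softmax(W @ x)[y])
print(loss(f_new) < loss(f))      # True: the loss decreases
```

Note that when the label `y` is a noisy pseudo-label, the same decomposition pushes the feature away from its actually-positive prototype, which is precisely the failure mode the pulling-close interaction is meant to counteract.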
Specifically, since our prototypes (i.e.,

