NEIGHBOR CLASS CONSISTENCY ON UNSUPERVISED DOMAIN ADAPTATION

Abstract

Unsupervised domain adaptation (UDA) aims to make predictions for unlabeled data in a target domain given labeled data from a source domain. Recent advances exploit entropy minimization and self-training to align the features of the two domains. However, since the decision boundary is largely biased towards the source data, the class-wise pseudo labels generated from target predictions are usually very noisy, and trusting such noisy supervision can deteriorate the intrinsic discriminative structure of the target features. Motivated by agglomerative clustering, which assumes that features in a near neighborhood should be clustered together, we observe that target features from a source pre-trained model are intrinsically discriminative and have a high probability of sharing the same label as their neighbors. Based on these observations, we propose a simple but effective method that imposes Neighbor Class Consistency on target features to preserve and further strengthen the intrinsic discriminative nature of the target data while regularizing the unified classifier to be less biased towards the source data. We also introduce an entropy-based weighting scheme to make our framework more robust to potentially noisy neighbor supervision. We conduct ablation studies and extensive experiments on three UDA image classification benchmarks. Our method outperforms all existing UDA state-of-the-art methods.

1. INTRODUCTION

Recent advances in deep neural networks have dominated many computer vision tasks, such as image recognition (He et al., 2016), object detection (Girshick, 2015), and semantic segmentation (Long et al., 2015). However, data collection and manual annotation require non-trivial human effort, especially for vision tasks like semantic segmentation where dense annotations are required. Thanks to the growth of the computer graphics field, it is now possible to train CNNs on synthetic images with computer-generated annotations (Richter et al., 2016; Ros et al., 2016), so an unlimited amount of freely annotated data is available for training networks at scale. However, directly applying a model trained on synthetic source data to unlabeled target data leads to performance degradation, and Unsupervised Domain Adaptation (UDA) aims to tackle this domain shift problem. A wide spectrum of UDA methods has been proposed to learn domain-invariant representations by simultaneously minimizing the source error and the cross-domain discrepancy. However, since the decision boundary is largely biased towards the source data, trusting biased network predictions will push target features towards their nearest source class prototypes while deteriorating the intrinsic discriminative target structure, as shown in Fig. 1 (2.a). Motivated by agglomerative clustering methods (Sarfraz et al., 2019), which assume that features in a nearby region should be clustered together, we investigate target features from a source pre-trained model and observe that they are intrinsically discriminative and have a very high probability of sharing the same label as their neighbors, as shown in Fig. 1 (2.b). To exploit this high-quality pairwise neighbor supervision, we propose a simple and effective approach that imposes Neighbor Class Consistency between target samples and their neighbors. To alleviate errors propagated from false neighbor supervision, we introduce an entropy-based weighting scheme that emphasizes reliable pairwise neighbor supervision.
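As a concrete sketch of the idea described above (this is not the authors' implementation; the function name, the choice of cosine similarity, and the exact entropy-to-weight mapping are all illustrative assumptions), enforcing class consistency between each target sample and its nearest neighbors, with low-entropy neighbors weighted more heavily, could look like:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def neighbor_consistency_loss(features, logits, k=3):
    """Illustrative sketch of Neighbor Class Consistency (NC).

    For each target sample, retrieve its k nearest neighbors in feature
    space and penalize disagreement between the sample's predicted class
    distribution and those of its neighbors. Reliable (low-entropy)
    neighbor predictions receive larger weights, loosely mirroring the
    entropy-based weighting scheme described in the text.
    """
    probs = softmax(logits)
    n, _ = probs.shape
    # Cosine similarity between L2-normalized features.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)          # exclude each sample itself
    nbrs = np.argsort(-sim, axis=1)[:, :k]  # indices of k nearest neighbors

    # Entropy-based weights: one simple monotone choice (assumption).
    ent = -(probs * np.log(probs + 1e-8)).sum(axis=1)
    w = np.exp(-ent)                        # confident neighbors -> weight near 1

    loss = 0.0
    for i in range(n):
        for j in nbrs[i]:
            # Weighted cross-entropy between sample i's prediction and
            # its neighbor j's prediction.
            loss += -w[j] * (probs[j] * np.log(probs[i] + 1e-8)).sum()
    return loss / (n * k)
```

With two well-separated feature clusters, predictions that agree within each cluster should yield a smaller loss than predictions that disagree with their nearest neighbors, which is the behavior the consistency term is meant to reward.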
Additionally, we categorize Self Class Consistency as a special case of our method in which the nearest neighbor of a sample is its own augmentation. We further explore feature representation learning based on the ranking relationship between the self-augmentation and the first neighbor of a given anchor: we enforce the features of anchors to be closer to their self-augmentations than to their first neighbors. In summary, our main contributions are as follows: (1) We revisit the source pre-trained model and observe the intrinsic discriminative nature of the target features it produces. (2) Based on this observation, we propose Neighbor Class Consistency (NC) to exploit high-quality pairwise neighbor pseudo-supervision rather than the noisy class-wise pseudo-supervision used by Self-Training methods. (3) We introduce an entropy-based weighting scheme to make our framework more robust to unreliable neighbor supervision. (4) We categorize Self Class Consistency as a special case of our framework and explore the first neighbor for feature representation learning. (5) We conduct extensive experiments on three UDA benchmark datasets. NC outperforms all existing methods and achieves new UDA state-of-the-art performance; notably, we achieve 86.2% on the challenging VisDA17 dataset.

Figure 1: (1) Our Neighbor Class Consistency framework. (2) An overview of our approach: (a) Self-Training methods ignore the intrinsic target structure while aligning the features of the two domains based on a biased classifier, potentially deteriorating the intrinsic target clusters. (b) Our approach enforces Neighbor Class Consistency on target features, progressively strengthening the intrinsic discrimination of target features while regularizing the unified classifier to be less biased towards the source data.

2. RELATED WORK

Discrepancy-based domain adaptation. Following the theoretical upper bound proposed in Ben-David et al. (2007), existing methods have explored aligning the feature representations of source and target images by minimizing the distribution discrepancy. For example, Maximum Mean Discrepancy (MMD) (Tzeng et al., 2014) is used to match the source and target feature distributions. Alternatively, adversarial domain adaptation methods (Ganin & Lempitsky, 2015; Tzeng et al., 2017; Radford et al., 2015; Hoffman et al., 2018; Tsai et al., 2018; Sankaranarayanan et al., 2018; Luo et al., 2019) address the domain discrepancy by training a domain-

