JOINT ATTENTION-DRIVEN DOMAIN FUSION AND NOISE-TOLERANT LEARNING FOR MULTI-SOURCE DOMAIN ADAPTATION

Abstract

Multi-source Unsupervised Domain Adaptation (MUDA) transfers knowledge from multiple source domains with labeled data to an unlabeled target domain. Recently, endeavours have been made to establish connections among different domains to enable feature interaction. However, as these approaches essentially enhance category information, they neglect the transfer of domain-specific information. Moreover, little research has explored the connection between pseudo-label generation and the framework's learning capabilities, which is crucial for robust MUDA. In this paper, we propose a novel framework that significantly reduces the domain discrepancy and achieves new state-of-the-art performance. In particular, we first propose a Contrary Attention-based Domain Merge (CADM) module to enable interaction among the features so as to achieve a mixture of domain-specific information instead of focusing on category information. Secondly, to enable the network to correct pseudo-labels during training, we propose an adaptive and reverse cross-entropy loss, which adaptively imposes constraints on the pseudo-label generation process. We conduct experiments on four benchmark datasets, demonstrating that our approach can efficiently fuse all domains for MUDA while substantially outperforming prior methods.
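For readers unfamiliar with the reverse cross-entropy idea underlying our noise-tolerant loss, the sketch below shows a generic (non-adaptive) reverse cross-entropy term in numpy: the roles of prediction and label are swapped relative to standard cross-entropy, so incorrect pseudo-labels contribute a bounded penalty. The function name and the clipping constant are illustrative choices, not the adaptive loss proposed in this paper.

```python
import numpy as np

def reverse_cross_entropy(pred, onehot, clip_min=1e-4):
    # Standard CE is -sum(y * log(p)); reverse CE swaps the roles:
    # -sum(p * log(y)). Since log(0) is undefined, zero label entries
    # are clipped to a small constant, which bounds the per-sample loss.
    y = np.clip(onehot, clip_min, 1.0)
    return -(pred * np.log(y)).sum(axis=-1).mean()

# A confident, correct prediction incurs a much smaller penalty
# than a confident, wrong one (e.g., from a noisy pseudo-label).
pred_good = np.array([[0.90, 0.05, 0.05]])
pred_bad = np.array([[0.05, 0.90, 0.05]])
label = np.array([[1.0, 0.0, 0.0]])
```

In practice such a term is combined with standard cross-entropy; the boundedness of the reverse term is what limits the influence of mislabeled pseudo-labels.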

1. INTRODUCTION

Deep neural networks (DNNs) have achieved excellent performance on various vision tasks under the assumption that training and test data come from the same distribution. However, different scenes have different illumination, viewing angles, and styles, which may cause the domain shift problem (Zhu et al., 2019; Tzeng et al., 2017; Long et al., 2016). This can eventually lead to a significant performance drop on the target task. Unsupervised Domain Adaptation (UDA) aims at addressing this issue by transferring knowledge from the source domain to the unlabeled target domain (Saenko et al., 2010). Early research has mostly focused on Single-source Unsupervised Domain Adaptation (SUDA), which transfers knowledge from one source domain to the target domain. Accordingly, some methods align the feature distributions of the source and target domains (Tzeng et al., 2014), while others (Tzeng et al., 2017) learn domain-invariant representations through adversarial learning. Liang et al. (2020) use label information to maintain a robust training process. However, in real-world scenarios data is usually collected from multiple domains, which gives rise to a more practical task, i.e., Multi-source Unsupervised Domain Adaptation (MUDA) (Duan et al., 2012).

MUDA leverages all of the available data and thus enables performance gains; nonetheless, it introduces the new challenge of reducing the domain shift between all source domains and the target domain. For this, some research (Peng et al., 2019) builds on SUDA, aiming to extract common domain-invariant features for all domains. Moreover, some works, e.g., Venkat et al. (2021) and Zhou et al. (2021), focus on the classifiers' predictions to achieve domain alignment. Recently, some approaches (Li et al., 2021; Wen et al., 2020) exploit the MUDA setting to create connections between domains. Overall, since the main challenge of MUDA is to eliminate the differences between all domains, there are two main ways to achieve this. One is to extract domain-invariant features for all domains, i.e., to filter out the domain-specific information of each domain. The other is to mix the domain-specific information from different domains so that all domains share such information.
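As a concrete illustration of the feature-distribution alignment that SUDA methods such as Tzeng et al. (2014) rely on, the domain discrepancy between source and target feature batches can be estimated with the Maximum Mean Discrepancy (MMD). The sketch below is a simplified numpy version; the function names and the fixed RBF bandwidth are illustrative choices, not details of any cited method.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Pairwise squared Euclidean distances between the rows of x and y,
    # mapped through an RBF (Gaussian) kernel.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(source, target, gamma=1.0):
    # Biased empirical estimate of the squared Maximum Mean Discrepancy
    # between two batches of features; zero iff the kernel mean
    # embeddings of the two batches coincide.
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st
```

Minimizing such a discrepancy over the feature extractor pulls the source and target feature distributions together; adversarial approaches pursue the same goal with a domain discriminator instead of a fixed kernel statistic.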

