A SIMPLE UNIFIED INFORMATION REGULARIZATION FRAMEWORK FOR MULTI-SOURCE DOMAIN ADAPTATION

Abstract

Adversarial learning strategies have demonstrated remarkable performance in dealing with single-source unsupervised Domain Adaptation (DA) problems, and they have recently been applied to multi-source DA problems. Although most existing DA methods use multiple domain discriminators, the effect of using multiple discriminators on the quality of latent space representations has been poorly understood. Here we provide theoretical insights into the potential pitfalls of using multiple domain discriminators: First, domain-discriminative information is inevitably distributed across the multiple discriminators. Second, the approach is not scalable in terms of computational resources. Third, the variance of stochastic gradients from multiple discriminators may increase, which significantly undermines training stability. To fully address these issues, we situate adversarial DA in the context of information regularization. First, we present a unified information regularization framework for multi-source DA. It provides a theoretical justification for using a single, unified domain discriminator to encourage the synergistic integration of the information gleaned from each domain. Second, this motivates us to implement a novel neural architecture called Multi-source Information-regularized Adaptation Networks (MIAN). The proposed model significantly reduces the variance of stochastic gradients and increases computational efficiency. Large-scale simulations on various multi-source DA scenarios demonstrate that MIAN, despite its structural simplicity, reliably outperforms other state-of-the-art methods by a large margin, especially for difficult target domains.

1. INTRODUCTION

Although a large number of studies have demonstrated the ability of deep neural networks to solve challenging tasks, the tasks solved by these networks are mostly confined to a similar type or a single domain. One remaining challenge is the problem known as domain shift (Gretton et al. (2009)), where a direct transfer of information gleaned from a single source domain to unseen target domains may lead to significant performance impairment. Domain adaptation (DA) approaches aim to mitigate this problem by learning to map data of both domains onto a common feature space. Whereas several theoretical results (Ben-David et al. (2007); Blitzer et al. (2008); Zhao et al. (2019a)) and algorithms for DA (Long et al. (2015; 2017); Ganin et al. (2016)) have focused on the case in which only a single source-domain dataset is given, we consider a more challenging and generalized problem of knowledge transfer, referred to as Multi-source unsupervised DA (MDA). Following a seminal theoretical result on MDA (Blitzer et al. (2008); Ben-David et al. (2010)), technical advances have been made, mainly on adversarial methods (Xu et al. (2018); Zhao et al. (2019c)). While most adversarial MDA methods use multiple independent domain discriminators (Xu et al. (2018); Zhao et al. (2018); Li et al. (2018); Zhao et al. (2019c;b)), the potential pitfalls of this setting have not been fully explored. The existing works do not provide a theoretical guarantee that unnecessary domain-specific information is fully filtered out, because the domain-discriminative information is inevitably distributed across the multiple discriminators. For example, the multiple domain discriminators focus only on estimating the domain shift between each source domain and the target, while the discrepancies between the source domains themselves are neglected, making it hard to align all the given domains. This necessitates garnering the domain-discriminative information with a
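The structural contrast at issue can be sketched in a few lines. In the following minimal numpy example (all names, sizes, and weight matrices are hypothetical and purely illustrative, not the authors' implementation), K per-source binary discriminators each see only one source-target pair, whereas a single unified discriminator classifies features among all K+1 domains jointly, so source-source discrepancies also contribute to its decision boundary:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy setup (hypothetical sizes): K source domains plus one target,
# shared feature vectors of dimension d, batch of n examples.
K, d, n = 3, 16, 8

# (a) Multiple binary discriminators: one 2-way head per source-target
# pair, so domain-discriminative information is split across K heads.
W_multi = [rng.normal(size=(d, 2)) for _ in range(K)]

# (b) A single unified discriminator: one (K+1)-way classifier that
# sees every domain at once, including source-vs-source shifts.
W_uni = rng.normal(size=(d, K + 1))

x = rng.normal(size=(n, d))  # a batch of shared latent features

probs_multi = [softmax(x @ W) for W in W_multi]  # K separate 2-way outputs
probs_uni = softmax(x @ W_uni)                   # one joint (K+1)-way output

print(len(probs_multi), probs_multi[0].shape)  # 3 (8, 2)
print(probs_uni.shape)                         # (8, 4)
```

In adversarial training, the feature extractor would then be updated against a single (K+1)-way objective rather than K separate binary ones, which is the design choice the unified framework above motivates.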

