IDENTIFYING LATENT CAUSAL CONTENT FOR MULTI-SOURCE DOMAIN ADAPTATION

Anonymous authors
Paper under double-blind review

Abstract

Multi-source domain adaptation (MSDA) learns to predict labels on target-domain data, under the setting that data from multiple source domains are labelled and data from the target domain are unlabelled. Most methods for this task focus on learning representations that are invariant across domains. However, their success relies heavily on the assumption that the label distribution remains consistent across domains, which may not hold in general real-world problems. In this paper, we propose a new and more flexible assumption, termed latent covariate shift, where a latent content variable z_c and a latent style variable z_s are introduced in the generative process, with the marginal distribution of z_c changing across domains and the conditional distribution of the label given z_c remaining invariant across domains. We show that although (completely) identifying the proposed latent causal model is challenging, the latent content variable can be identified up to scaling by exploiting its dependence on labels from the source domains, together with the identifiability conditions of nonlinear ICA. This motivates a novel method for MSDA, which learns the invariant label distribution conditional on the latent content variable, instead of learning invariant representations. Empirical evaluations on simulated and real data demonstrate the effectiveness of the proposed method.

1. INTRODUCTION

Traditional machine learning requires the training and testing data to be independent and identically distributed (Vapnik, 1999). This strict assumption may not be fulfilled in many real-world applications. For example, in medical applications it is common to train a model on patients from a few hospitals and generalize it to a new hospital (Zech et al., 2018). In this case, it is often reasonable to consider that the distributions of data from the training hospitals differ from that of the new hospital (Koh et al., 2021). Domain adaptation is a promising research area for handling such problems. In this work, we focus on the multi-source domain adaptation (MSDA) setting, where source data are collected from multiple domains. Formally, let x denote the input (e.g., an image), y denote the label in the source and target domains, and D denote the domain index. We observe labeled pairs (x_S, y_S) from the joint distributions p(x, y|D = 1), ..., p(x, y|D = m), ..., p(x, y|D = M) in the source domains, and unlabeled inputs x_T from the marginal distribution p(x|D_T) in the target domain. The training phase of MSDA uses the sets of (x_S, y_S) and x_T to train a predictor that provides a satisfactory estimate of y_T in the target domain.

The key to MSDA is to understand how the joint distribution p_D(x, y) changes across the source and target domains. Most early methods assume that the change in the joint distribution results from Covariate Shift (Huang et al., 2006; Bickel et al., 2007; Sugiyama et al., 2007; Wen et al., 2014), i.e., p_D(x, y) = p_D(y|x)p_D(x), as depicted in Figure 1(a). This setting assumes that p_D(x) changes across domains, while the conditional distribution p_D(y|x) is invariant across domains. Such an assumption may not hold in some real applications, e.g., image classification. For example, invariant p_D(y|x) implies that p_D(y) should change as p_D(x) changes.
However, we can easily change style information (e.g., hue, viewpoint) in images to change p_D(x) while keeping p_D(y) unchanged, which is common in classification but violates the assumption. In contrast to covariate shift, most current works consider Conditional Shift, as depicted in Figure 1(b). It assumes that the conditional p_D(x|y) changes while p_D(y) is invariant across domains.
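To make the contrast concrete, the latent covariate shift assumption proposed above can be illustrated with a toy simulation. The particular distributions, mixing function, and dimensions below are illustrative assumptions, not the model studied in the paper; only the structural constraints matter: the content marginal p(z_c|D) varies across domains, the labelling mechanism p(y|z_c) is fixed, and the style variable z_s affects only the observation x.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_domain(mu_d, n=1000):
    # Latent content: its marginal p(z_c | D) shifts with the domain mean mu_d.
    z_c = rng.normal(mu_d, 1.0, n)
    # Latent style: affects only the observation x, never the label.
    z_s = rng.normal(0.0, 1.0, n)
    # Invariant labelling mechanism p(y | z_c): identical in every domain.
    y = (z_c > 0.0).astype(int)
    # Observed input: an arbitrary nonlinear mixture of content and style.
    x = np.stack([np.tanh(z_c) + 0.3 * z_s, z_c + np.sin(z_s)], axis=1)
    return x, y

# The label marginal p(y | D) differs across domains because p(z_c | D)
# differs, even though p(y | z_c) is shared across all of them.
rates = {mu: sample_domain(mu)[1].mean() for mu in (-1.0, 0.0, 1.5)}
```

Under this process the label marginal shifts with the domain, which conditional shift rules out (it requires invariant p_D(y)) but latent covariate shift permits, while the invariant conditional p(y|z_c) is still available for prediction once z_c is identified.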

