DOMAIN-INDEXING VARIATIONAL BAYES: INTERPRETABLE DOMAIN INDEX FOR DOMAIN ADAPTATION

Abstract

Previous studies have shown that leveraging domain indices can significantly boost domain adaptation performance (Wang et al., 2020; Xu et al., 2022). However, such domain indices are not always available. To address this challenge, we first provide a formal definition of domain index from a probabilistic perspective, and then propose an adversarial variational Bayesian framework that infers domain indices from multi-domain data, thereby providing additional insight into domain relations and improving domain adaptation performance. Our theoretical analysis shows that our adversarial variational Bayesian framework finds the optimal domain index at equilibrium. Empirical results on both synthetic and real data verify that our model can produce interpretable domain indices that enable us to achieve superior performance compared to state-of-the-art domain adaptation methods.

1. INTRODUCTION

In machine learning, it is standard to assume that training data and test data share an identical distribution. However, this assumption is often violated (Ganin & Lempitsky, 2015; Romera et al., 2019; Sun et al., 2017; Yuan et al., 2019; Ramponi & Plank, 2020) when training and test data come from different domains. Domain adaptation (DA) tries to solve this cross-domain generalization problem by producing domain-invariant features. Typically, DA methods enforce independence between a data point's latent representation and its domain identity, a one-hot vector indicating which domain the data point comes from (Ganin et al., 2016; Tzeng et al., 2017; Zhao et al., 2017; Zhang et al., 2019). More recent studies have found that using a domain index, a real-valued scalar (or vector) that embeds domain semantics, in place of the domain identity significantly boosts domain adaptation performance (Wang et al., 2020; Xu et al., 2022). For instance, Wang et al. (2020) adapted sleep-stage prediction models across patients of different ages, with "age" as the domain index, and achieved superior performance compared to traditional models that split patients into groups by age and used discrete group IDs as domain identities (more discussion in Sec. J). Although significant progress has been made in leveraging domain indices to improve domain adaptation (Wang et al., 2020; Xu et al., 2022), a major challenge remains: domain indices are not always available. This severely limits the applicability of such indexed DA methods and motivates a natural question: can one infer the domain index as a latent variable from data? This prompts us to first develop an expressive and formal definition of "domain index". We argue that an effective domain index (1) is independent of the data's encoding, (2) retains as much information on the data as possible, and (3) maximizes adaptation performance, e.g., accuracy (see Sec. 3.2 for rigorous descriptions).
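Criteria (1) and (2) can be phrased in terms of mutual information. The following toy sketch (purely illustrative discrete variables, not VDI's actual model) shows a candidate "index" that is independent of a candidate "encoding" while remaining informative about the data:

```python
import numpy as np
from collections import Counter

def mutual_information(a, b):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    return sum(
        (c / n) * np.log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
        for (x, y), c in pab.items()
    )

# Toy "dataset": x takes 4 values uniformly; the high bit plays the role of a
# domain index beta, and the low bit plays the role of the encoding z.
x = [0, 1, 2, 3] * 250
beta = [v // 2 for v in x]   # candidate domain index
z = [v % 2 for v in x]       # candidate encoding

print(mutual_information(beta, z))  # 0.0 -> criterion (1): independent of encoding
print(mutual_information(beta, x))  # 1.0 -> criterion (2): informative about data
```

Here the index and the encoding are independent (zero mutual information), yet together they fully determine x; a degenerate index that copied z would violate criterion (1), while a constant index would violate criterion (2).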
With this definition, we then develop an adversarial variational Bayesian deep learning model (Wang et al., 2015; Wang & Yeung, 2016; 2020) that describes intuitive conditional dependencies among the input data, labels, encodings, and the associated domain indices. Our theoretical analysis shows that maximizing our model's evidence lower bound while adversarially training an additional discriminator (Ganin et al., 2016; Wang et al., 2020) is equivalent to inferring the optimal domain indices (according to our definition): indices that maximize the mutual information among the input data, labels, encodings, and domain indices while minimizing the mutual information between the data's encodings and the domain indices. Our contributions are as follows:

• We identify the problem of inferring domain indices as latent variables, provide a rigorous definition of "domain index", and develop the first general method, dubbed variational domain indexing (VDI), for inferring such domain indices.

• Our theoretical analysis shows that training with VDI's final objective function is equivalent to inferring the optimal domain indices according to our definition.

• Experiments on both synthetic and real-world datasets show that VDI can infer non-trivial domain indices, thereby significantly improving performance over state-of-the-art DA methods.

2. RELATED WORK

Domain Adaptation. Approaches include … (et al., 2018; Kumar et al., 2020; Prabhu et al., 2021), domain-specific normalization (Maria Carlucci et al., 2017; Mancini et al., 2019; Tasar et al., 2020), and deep learning models with adversarial training (Ganin et al., 2016; Tzeng et al., 2017; Zhang et al., 2019; Zhao et al., 2017; Chen et al., 2019; Dai et al., 2019). Most of these methods rely on (one-hot) domain identities for feature alignment. Several of them (Ganin et al., 2016; Tzeng et al., 2017; Prabhu et al., 2021) serve as our baselines in Sec. 5.
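The role of the adversarial discriminator in such feature alignment can be illustrated with a toy sketch (all names and numbers below are hypothetical, and a nearest-class-mean classifier stands in for the learned discriminator): when the encoding still carries domain information, a domain discriminator beats chance; once the encoding is aligned across domains, its accuracy drops to roughly 50%, which is what the adversarial term drives towards.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
u = rng.integers(0, 2, n)             # domain identity (0 or 1)
x = rng.normal(4.0 * u - 2.0, 1.0)    # inputs shifted by domain: N(-2,1) vs N(+2,1)

# Two candidate encoders:
z_raw = x                          # keeps the domain shift
z_aligned = x - (4.0 * u - 2.0)    # oracle alignment: removes the domain-specific mean

def domain_discriminator_accuracy(z, u):
    """A simple nearest-class-mean domain discriminator evaluated on encoding z."""
    m0, m1 = z[u == 0].mean(), z[u == 1].mean()
    pred = (np.abs(z - m1) < np.abs(z - m0)).astype(int)
    return float(np.mean(pred == u))

print(domain_discriminator_accuracy(z_raw, u))      # high: z still leaks the domain
print(domain_discriminator_accuracy(z_aligned, u))  # near chance: z ~ independent of domain
```

In VDI the discriminator is a learned network and the alignment is achieved by gradient-based adversarial training rather than by the oracle subtraction used above; the sketch only shows why discriminator accuracy on the encoding is a useful proxy for encoding-domain dependence.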

Domain Adaptation with Domain Identities and Domain Indices

Recent studies have found that replacing (one-hot) domain identities with (continuous) domain indices improves adaptation performance (Wang et al., 2020; Xu et al., 2022). However, none of these works provides a canonical definition of "domain index"; instead they mainly rely on intuition, e.g., using rotation angles as domain indices for RotatingMNIST (Wang et al., 2020) and using graph node embeddings as domain indices for adaptation across graph-relational domains (Xu et al., 2022). More importantly, they assume that such domain indices are always available, which may not be true (Matsuura & Harada, 2020; Rebuffi et al., 2017); hence they are not applicable to our setting. Also related to our work is Peng et al. (2020a), which generates embeddings of visual domains to represent domain similarities; however, it does not formally define "domain embedding" and only works for classification tasks. In contrast, our VDI provides a rigorous and formal definition of "domain indices" and handles both classification and regression tasks.
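The difference between one-hot domain identities and continuous domain indices can be made concrete with a minimal sketch (angles chosen arbitrarily for illustration, in the spirit of the RotatingMNIST example above):

```python
import numpy as np

# Hypothetical setup: 4 observed domains of RotatingMNIST at 0, 30, 60, 90 degrees.
domains = [0, 30, 60, 90]

# Domain *identity*: a one-hot vector -- no notion of similarity between domains.
identity = np.eye(len(domains))          # identity[i] encodes domain i

# Domain *index*: the rotation angle itself (a real-valued scalar).
index = np.array(domains, dtype=float)

# With one-hot identities, domains 0 and 30 degrees are exactly as "far apart"
# as 0 and 90 degrees:
d_id_01 = np.linalg.norm(identity[0] - identity[1])
d_id_03 = np.linalg.norm(identity[0] - identity[3])
print(d_id_01 == d_id_03)   # True: identities carry no domain semantics

# The continuous index preserves domain relations (30 is closer to 0 than 90):
print(abs(index[0] - index[1]) < abs(index[0] - index[3]))  # True
```

One-hot identities place all domains at equal distance, whereas the scalar index preserves domain relations and naturally represents unseen intermediate domains (e.g., 45 degrees) as new points on the same axis.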

Generating Domain Identities. There are also works that generate domain identities from data to improve domain adaptation. Chen & Chao (2021) generate a sequence of domain identities for intermediate domains between a source domain and a target domain, trying to facilitate better incremental domain adaptation (Bobu et al., 2018). Deecke et al. (2021) generate a set of domain identities to split a dataset into different domains and perform multi-domain learning. Du et al. (2021) and Lu et al. (2022) partition time series data by learning domain identities that maximize the domain-wise distribution gap. All of these works focus on generating (ordinal or one-hot) domain identities. In contrast, our VDI assumes such domain identities are already given and focuses on inferring (continuous) domain indices, which carry richer and more interpretable information. Note that in our setting, since domain identities are given, methods such as Deecke et al. (2021) reduce to typical domain adaptation methods.

3. METHOD

In this section, we formalize the definition of "domain index" and describe our VDI framework for inferring domain indices. We provide theoretical guarantees that VDI infers optimal domain indices in Sec. 4.

