ENERGY-BASED TEST SAMPLE ADAPTATION FOR DOMAIN GENERALIZATION

Abstract

In this paper, we propose energy-based sample adaptation at test time for domain generalization. Where previous works adapt their models to target domains, we adapt the unseen target samples to source-trained models. To this end, we design a discriminative energy-based model, which is trained on source domains to jointly model the conditional distribution for classification and data distribution for sample adaptation. The model is optimized to simultaneously learn a classifier and an energy function. To adapt target samples to source distributions, we iteratively update the samples by energy minimization with stochastic gradient Langevin dynamics. Moreover, to preserve the categorical information in the sample during adaptation, we introduce a categorical latent variable into the energy-based model. The latent variable is learned from the original sample before adaptation by variational inference and fixed as a condition to guide the sample update. Experiments on six benchmarks for classification of images and microblog threads demonstrate the effectiveness of our proposal.

1. INTRODUCTION

Deep neural networks are vulnerable to domain shifts and suffer from a lack of generalization on test samples that do not resemble those in the training distribution (Recht et al., 2019; Zhou et al., 2021; Krueger et al., 2021; Shen et al., 2022). To deal with domain shifts, domain generalization has been proposed (Muandet et al., 2013; Gulrajani & Lopez-Paz, 2020; Cha et al., 2021). Domain generalization strives to learn a model exclusively on source domains in order to generalize well on unseen target domains. The major challenge stems from the large domain shifts and the unavailability of any target domain data during training. To address the problem, domain invariant learning has been widely studied, e.g., (Motiian et al., 2017; Zhao et al., 2020; Nguyen et al., 2021), based on the assumption that invariant representations obtained on source domains are also valid for unseen target domains. However, since the target data is inaccessible during training, it is likely that an "adaptivity gap" (Dubey et al., 2021) exists between representations from the source and target domains. Therefore, recent works try to adapt the classification model with target samples at test time, by further fine-tuning model parameters (Sun et al., 2020; Wang et al., 2021) or by introducing an extra network module for adaptation (Dubey et al., 2021). Rather than adapting the model to target domains, Xiao et al. (2022) adapt the classifier for each sample at test time. Nevertheless, a single sample would not be able to adjust the whole model, due to the large number of model parameters and the limited information contained in the sample. This makes it challenging for their method to handle large domain gaps. Instead, we propose to adapt each target sample to the source distributions, which does not require any fine-tuning or parameter updates of the source model.

In this paper, we propose energy-based test sample adaptation for domain generalization.
The method is motivated by the fact that energy-based models (Hinton, 2002; LeCun et al., 2006) flexibly model complex data distributions and allow for efficient sampling from the modeled distribution by Langevin dynamics (Du & Mordatch, 2019; Welling & Teh, 2011). Specifically, we define a new discriminative energy-based model as the composition of a classifier and a neural-network-based energy function in the data space, which are trained simultaneously on the source domains. The trained model iteratively

* Currently with United Imaging Healthcare, Co., Ltd., China.
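As a toy illustration of the sample update described above, the following sketch adapts a "target" point toward a source distribution by energy minimization with stochastic gradient Langevin dynamics. It is a minimal sketch, not the paper's method: the quadratic energy stands in for the learned neural energy function, and the names `energy`, `energy_grad`, `sgld_adapt`, and the step size are illustrative assumptions.

```python
import numpy as np

def energy(x, mu):
    # Toy quadratic energy: low energy near the source "mode" mu.
    # In the paper this would be a neural-network energy trained on source domains.
    return 0.5 * np.sum((x - mu) ** 2)

def energy_grad(x, mu):
    # Analytic gradient of the toy energy; a learned model would use autodiff.
    return x - mu

def sgld_adapt(x, mu, step=0.1, n_iters=100, rng=None):
    # SGLD update: x_{t+1} = x_t - (step/2) * grad E(x_t) + sqrt(step) * noise,
    # iteratively moving the sample toward low-energy (source-like) regions.
    rng = np.random.default_rng(0) if rng is None else rng
    x = x.copy()
    for _ in range(n_iters):
        noise = rng.normal(size=x.shape)
        x = x - 0.5 * step * energy_grad(x, mu) + np.sqrt(step) * noise
    return x

mu = np.zeros(4)           # stand-in for the source distribution mode
x0 = np.full(4, 5.0)       # a "target" sample far from the source distribution
x_adapted = sgld_adapt(x0, mu)
print(energy(x_adapted, mu) < energy(x0, mu))  # adapted sample has lower energy
```

The injected Gaussian noise distinguishes Langevin dynamics from plain gradient descent: the update samples from the modeled distribution rather than collapsing every input onto a single energy minimum.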

