ADDRESSING PARAMETER CHOICE ISSUES IN UNSUPERVISED DOMAIN ADAPTATION BY AGGREGATION

Abstract

We study the problem of choosing algorithm hyperparameters in unsupervised domain adaptation, i.e., with labeled data in a source domain and unlabeled data in a target domain drawn from a different input distribution. We follow the strategy of computing several models with different hyperparameters and subsequently computing a linear aggregation of these models. While several heuristics follow this strategy, methods are still missing that rely on thorough theories for bounding the target error. To this end, we propose a method that extends weighted least squares to vector-valued functions, e.g., deep neural networks. We show that the target error of the proposed algorithm is asymptotically not worse than twice the error of the unknown optimal aggregation. We also perform a large-scale empirical comparative study on several datasets, including text, images, electroencephalogram, body sensor signals and signals from mobile phones. Our method¹ outperforms deep embedded validation (DEV) and importance weighted validation (IWV) on all datasets, setting a new state-of-the-art performance for solving parameter choice issues in unsupervised domain adaptation with theoretical error guarantees. We further study several competitive heuristics, all outperforming IWV and DEV on at least five datasets. However, our method outperforms each heuristic on at least five of seven datasets.
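The aggregation step described above can be sketched as an importance-weighted least-squares fit of combination coefficients. The following is a minimal, hypothetical scalar-output sketch (the paper's method extends weighted least squares to vector-valued functions such as deep networks); the function name `aggregate_models` and the assumption of precomputed importance weights are illustrative, not the paper's implementation:

```python
import numpy as np

def aggregate_models(preds, y, weights):
    """Sketch of importance-weighted least-squares aggregation.

    preds:   (n, k) array; column j holds candidate model j's predictions
             on the n labeled source samples.
    y:       (n,) source labels.
    weights: (n,) importance weights, e.g., estimated target/source
             density ratios (assumed given here).

    Returns coefficients c minimizing
        sum_i weights[i] * (preds[i] @ c - y[i])**2,
    i.e., the linear aggregation of the candidate models.
    """
    sw = np.sqrt(weights)[:, None]          # reweight rows of the design matrix
    c, *_ = np.linalg.lstsq(preds * sw, (y[:, None] * sw).ravel(), rcond=None)
    return c

# Example: if the labels are an exact linear combination of the candidate
# models' predictions, the fitted coefficients recover that combination.
rng = np.random.default_rng(0)
preds = rng.normal(size=(50, 2))
y = 0.3 * preds[:, 0] + 0.7 * preds[:, 1]
c = aggregate_models(preds, y, np.ones(50))
```

The aggregated predictor on target data is then `target_preds @ c`, where `target_preds` stacks the candidate models' predictions on target inputs.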

1. INTRODUCTION

The goal of unsupervised domain adaptation is to learn a model on unlabeled data from a target input distribution using labeled data from a different source distribution (Pan & Yang, 2010; Ben-David et al., 2010). If this goal is achieved, medical diagnostic systems can successfully be trained on unlabeled images using labeled images with a different modality (Varsavsky et al., 2020; Zou et al., 2020); segmentation models for natural images can be learned using only labeled data from computer simulations (Peng et al., 2018); natural language models can be learned from unlabeled biomedical abstracts by means of labeled data from financial journals (Blitzer et al., 2006); industrial quality inspection systems can be learned on unlabeled data from new products using data from related products (Jiao et al., 2019; Zellinger et al., 2020). However, missing target labels combined with distribution shift makes parameter choice a hard problem (Sugiyama et al., 2007; You et al., 2019; Saito et al., 2021; Zellinger et al., 2021; Musgrave et al., 2021). Often, one ends up with a sequence of models, e.g., originating from different hyperparameter configurations (Ben-David et al., 2007; Saenko et al., 2010; Ganin et al., 2016; Long et al.,



¹Large scale benchmark experiments are available at https://github.com/Xpitfire/iwa; dinu@ml.jku.at, werner.zellinger@ricam.oeaw.ac.at

