FUNDAMENTAL LIMITS AND TRADEOFFS IN INVARIANT REPRESENTATION LEARNING

Abstract

Many machine learning applications involve learning representations that achieve two competing goals: to maximize information or accuracy with respect to a target while simultaneously maximizing invariance or independence with respect to a subset of features. Typical examples include privacy-preserving learning, domain adaptation, and algorithmic fairness, just to name a few. In fact, all of the above problems admit a common minimax game-theoretic formulation, whose equilibrium represents a fundamental tradeoff between accuracy and invariance. In this paper, we provide an information-theoretic analysis of this general and important problem under both classification and regression settings. In both cases, we analyze the inherent tradeoff between accuracy and invariance by providing a geometric characterization of the feasible region in the information plane, and we connect the geometric properties of this feasible region to the fundamental limitations of the tradeoff problem. In the regression setting, we also derive a tight lower bound on the Lagrangian objective that quantifies the tradeoff between accuracy and invariance. Our results shed new light on the interplay between accuracy and invariance, deepen our understanding of this fundamental problem, and may be useful in guiding the design of adversarial representation learning algorithms.

1. INTRODUCTION

One of the fundamental tasks in both supervised and unsupervised learning is to learn proper representations of data for various downstream tasks. Due to recent advances in deep learning, there has been a surge of interest in learning so-called invariant representations. Roughly speaking, the underlying problem of invariant representation learning is to find a feature transformation of the data that balances two goals simultaneously. First, the features should preserve enough information with respect to the target task of interest, e.g., for good predictive accuracy. Second, the representations should be invariant to changes in a pre-defined attribute; in visual perception, for example, the representations should be invariant to changes of perspective or lighting conditions. Clearly, there is often a tension between these two competing goals of error minimization and invariance maximization, and understanding the fundamental limits and tradeoffs therein remains an important open problem. In practice, the problem of learning invariant representations is often formulated as a minimax sequential game between two agents, a feature encoder and an adversary. Under this framework, the goal of the feature encoder is to learn representations that confuse a worst-case adversary trying to discriminate the pre-defined attribute. Meanwhile, the representations given by the feature encoder should be amenable to a follow-up predictor for the target task. In this paper, we consider the situation where both the adversary and the predictor have infinite capacity, so that the tradeoff between accuracy and invariance depends solely on the representations given by the feature encoder. In particular, our results shed light on the best possible tradeoff attainable by any algorithm.
This leads to a Lagrangian objective with a tradeoff parameter between these two competing goals, and we study the fundamental limitations of this tradeoff by analyzing the extremal values of this Lagrangian in both classification and regression settings. Our results shed new light on the fundamental tradeoff between accuracy and invariance, and give a crisp characterization of how the dependence between the target task and the pre-defined attribute affects the limits of representation learning.
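To make this Lagrangian concrete, the following toy sketch instantiates the game with a linear encoder and squared losses on synthetic data. Under these simplifying assumptions (linear predictor and adversary, one-dimensional representation Z = w'X), both inner players admit closed-form best responses, and minimizing the resulting Lagrangian over the encoder direction reduces to a generalized Rayleigh quotient, i.e., a generalized eigenproblem. All names and data here are illustrative, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: X in R^2; Y depends on the first coordinate, A on the second.
n = 2000
X = rng.normal(size=(n, 2))
Y = X[:, 0] + 0.1 * rng.normal(size=n)
A = X[:, 1] + 0.1 * rng.normal(size=n)

lam = 1.0  # tradeoff parameter lambda

# Centered second moments.
Xc = X - X.mean(0)
S = Xc.T @ Xc / n                # Cov(X)
c_y = Xc.T @ (Y - Y.mean()) / n  # Cov(X, Y)
c_a = Xc.T @ (A - A.mean()) / n  # Cov(X, A)

# For Z = w'X with squared loss, the optimal predictor achieves
# Var(Y) - Cov(Z,Y)^2/Var(Z), and analogously for the adversary, so
# (up to constants) the Lagrangian in w is the generalized Rayleigh
# quotient  w'(-c_y c_y' + lam * c_a c_a')w / (w' S w),
# minimized by a generalized eigenvector.
M = -np.outer(c_y, c_y) + lam * np.outer(c_a, c_a)
L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
vals, vecs = np.linalg.eigh(Linv @ M @ Linv.T)  # eigenvalues ascending
w = Linv.T @ vecs[:, 0]                         # smallest-eigenvalue direction
w /= np.linalg.norm(w)
```

With λ = 1 the recovered direction aligns with the coordinate driving Y and suppresses the one driving A; increasing λ pushes the encoder further toward invariance at the cost of accuracy, which is exactly the tradeoff studied in this paper.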

Contributions

We geometrically characterize the tradeoff between accuracy and invariance via an information-plane analysis (Shwartz-Ziv & Tishby, 2017) under both classification and regression settings, where each feature transformation corresponds to a point on the information plane. For the classification setting, we provide a fundamental characterization of the feasible region in the information plane, including its boundedness, convexity, and extremal vertices. For the regression setting, we provide an analogous characterization of the feasible region by replacing mutual information with conditional variances. Finally, in the regression setting, we prove a tight information-theoretic lower bound on a Lagrangian objective that trades off accuracy and invariance. The proof relies on an interesting SDP relaxation, which may be of independent interest.

Related Work

There are abundant applications of learning invariant representations in various downstream tasks, including domain adaptation (Ben-David et al., 2007; 2010; Ganin et al., 2016; Zhao et al., 2018), algorithmic fairness (Edwards & Storkey, 2015; Zemel et al., 2013; Zhang et al., 2018; Zhao et al., 2019b), privacy-preserving learning (Hamm, 2015; 2017; Coavoux et al., 2018; Xiao et al., 2019), invariant visual representations (Quiroga et al., 2005; Gens & Domingos, 2014; Bouvrie et al., 2009; Mallat, 2012; Anselmi et al., 2016), and causal inference (Johansson et al., 2016; Shalit et al., 2017; Johansson et al., 2020), just to name a few. To the best of our knowledge, no previous work studies the particular tradeoff problem in this paper. Closest to our work are results in domain adaptation (Zhao et al., 2019a) and algorithmic fairness (Menon & Williamson, 2018; Zhao & Gordon, 2019) showing a lower bound on the classification accuracy on two groups, e.g., source vs. target in domain adaptation and majority vs. minority in algorithmic fairness. Compared to these previous results, our work directly characterizes the tradeoff between accuracy and invariance using information-theoretic concepts in both classification and regression settings. Furthermore, we also give an approximation to the Pareto frontier between accuracy and invariance in both cases.

2. BACKGROUND AND PRELIMINARIES

Notation We adopt the usual setup: given (X, Y) ∈ 𝒳 × 𝒴, where Y is the response and X ∈ R^p is the input vector, we seek a classification/regression function f(X) that minimizes E ℓ(f(X), Y), where ℓ : 𝒴 × 𝒴 → R is a loss function depending on the context of the underlying problem. In this paper, we consider two typical choices of ℓ: (1) the cross-entropy loss, i.e., ℓ(y, y′) = −y log(y′) − (1 − y) log(1 − y′), which is typically used when Y is a discrete variable, as in classification; (2) the squared loss, i.e., ℓ(y, y′) = (y − y′)², which is suitable for continuous Y, as in regression. Throughout the paper, we will assume that all random variables have finite second-order moments.

Problem Setup Apart from the input/output pairs, in our setting there is a third variable A to which a predictor should be invariant. Depending on the particular application, A could correspond to a protected attribute in algorithmic fairness, e.g., the ethnicity or gender of an individual, or to the domain index in domain adaptation, etc. In general, we assume that there is a joint distribution D over the triple (X, A, Y), from which our observational data are sampled. Upon receiving the data, the goal of the learner is twofold. On one hand, the learner aims to accurately predict the target Y. On the other hand, it also tries to be insensitive to variation in A. To achieve this dual goal, one standard approach in the literature (Zemel et al., 2013; Edwards & Storkey, 2015; Hamm, 2015; Ganin et al., 2016; Zhao et al., 2018) is through the lens of representation learning. Specifically, let Z = g(X), where g(·) is a (possibly randomized) transformation function that takes X as input and gives the corresponding feature encoding Z. The hope is that, by learning the transformation function g(·), Z contains as much information as possible about the target Y while at the same time filtering out information related to A.
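The two loss choices above can be written out directly; the sketch below implements them for scalar labels, with a small clipping constant (an implementation detail we add, not part of the setup) to keep the logarithm finite at the endpoints.

```python
import math

def cross_entropy(y, y_hat, eps=1e-12):
    """Cross-entropy loss l(y, y') = -y*log(y') - (1-y)*log(1-y'),
    the typical choice when Y is a discrete (binary) label.
    eps clips y_hat away from 0 and 1 to avoid log(0)."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -y * math.log(y_hat) - (1.0 - y) * math.log(1.0 - y_hat)

def squared_loss(y, y_hat):
    """Squared loss l(y, y') = (y - y')^2, suitable for continuous Y."""
    return (y - y_hat) ** 2
```

For instance, a confident wrong prediction under cross-entropy incurs a large loss, whereas the squared loss grows only quadratically in the prediction error.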
This problem is often phrased as an adversarial game:

min_{f,g} max_{f′} E_D[ℓ(f(g(X)), Y)] − λ · E_D[ℓ(f′(g(X)), A)],    (1)

where the two competing agents are the feature transformation g (with its target predictor f) and the adversary f′, and λ > 0 is a tradeoff hyperparameter between the task variable Y and the attribute A. For example, the adversary f′ could be understood as a domain discriminator in applications related to domain adaptation, or an auditor of the sensitive attribute in algorithmic fairness. In the above minimax game, the first term corresponds to the accuracy of the target task, and the second term is the loss incurred by the adversary. It is worth pointing out that the minimax problem in (1) is separable for any fixed feature transformation g, in the sense that once g has been fixed, the optimizations of f and f′ are independent of each other. Formally, define R*_Y(g) := inf_f E_D ℓ(f(g(X)), Y) to be the optimal risk in predicting

