DISENTANGLEMENT OF CORRELATED FACTORS VIA HAUSDORFF FACTORIZED SUPPORT

Abstract

A grand goal in deep learning research is to learn representations capable of generalizing across distribution shifts. Disentanglement is one promising direction aimed at aligning a model's representation with the underlying factors generating the data (e.g. color or background). Existing disentanglement methods, however, rely on an often unrealistic assumption: that factors are statistically independent. In reality, factors (like object color and shape) are correlated. To address this limitation, we consider the use of a relaxed disentanglement criterion, the Hausdorff Factorized Support (HFS) criterion, which encourages only a pairwise factorized support, rather than a factorial distribution, by minimizing a Hausdorff distance. This allows for arbitrary distributions of the factors over their support, including correlations between them. We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks, even under severe training correlations and correlation shifts, with relative improvements of over 60% compared to existing disentanglement methods in some settings. In addition, we find that leveraging HFS for representation learning can even facilitate transfer to downstream tasks such as classification under distribution shifts. We hope our original approach and positive empirical results inspire further progress on the open problem of robust generalization. Code available at https://github.com/facebookresearch/disentangling-correlated-factors.

1. INTRODUCTION

Figure 1: Real data exhibits correlations between generative factors: cows are likely on grass, camels on sand. This contradicts disentanglement methods assuming statistically independent factors. Instead, we show that merely assuming and aiming for a factorized support can yield robust disentanglement even under correlated factors.

Disentangled representation learning (Bengio et al., 2013; Higgins et al., 2018) is a promising path to facilitating reliable generalization to in- and out-of-distribution downstream tasks (Bengio et al., 2013; Higgins et al., 2018; Milbich et al., 2020; Dittadi et al., 2021; Horan et al., 2021), on top of being more interpretable and fair (Locatello et al., 2019a; Träuble et al., 2021). While Higgins et al. (2018) propose a formal definition based on group equivariance, and various metrics have been proposed to measure disentanglement (Higgins et al., 2017; Chen et al., 2018; Eastwood & Williams, 2018), the most commonly understood definition is as follows:

Definition 1.1 (Disentanglement). Assuming data generated by a set of unknown ground-truth latent factors, a representation is said to be disentangled if there exists a one-to-one correspondence between each factor and each dimension of the representation.
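To make the factorized-support idea concrete, the sketch below estimates, for a single pair of latent dimensions, how far the empirical factorized support (the Cartesian product of the marginal samples) lies from the empirical joint support, via a directed Hausdorff distance. This is a hypothetical, minimal numpy illustration under the assumption that sample points approximate each support; the actual HFS criterion aggregates such pairwise terms across all dimension pairs during training.

```python
import numpy as np

def pairwise_support_gap(z: np.ndarray, i: int, j: int) -> float:
    """Directed Hausdorff distance from the factorized support of latent
    dims (i, j) to their joint support, both approximated by samples.
    A value of 0 means the joint samples already cover every combination
    of marginal values, i.e. the support factorizes for this pair."""
    joint = z[:, [i, j]]  # samples from the joint support, shape (n, 2)
    # Cartesian product of the marginal samples approximates the
    # factorized support, shape (n*n, 2).
    xi, xj = np.meshgrid(z[:, i], z[:, j], indexing="ij")
    product = np.stack([xi.ravel(), xj.ravel()], axis=-1)
    # For each product point, distance to its nearest joint sample;
    # the worst case over product points is the directed Hausdorff distance.
    dists = np.linalg.norm(product[:, None, :] - joint[None, :, :], axis=-1)
    return float(dists.min(axis=1).max())

# Perfectly correlated dims: the combinations (0,1) and (1,0) are missing
# from the joint support, so the gap is positive.
z_corr = np.array([[0.0, 0.0], [1.0, 1.0]])
# All four combinations present: the support factorizes, gap is 0.
z_fact = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```

Note that this penalty is zero whenever the joint samples cover all marginal combinations, regardless of how unevenly the distribution weights them, which is exactly why factorized support tolerates correlated factors where independence-based criteria do not.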

