FAILURE MODES OF VARIATIONAL AUTOENCODERS AND THEIR EFFECTS ON DOWNSTREAM TASKS

Anonymous

Abstract

Variational Auto-encoders (VAEs) are deep generative latent variable models that are widely used for a number of downstream tasks. While it has been demonstrated that VAE training can suffer from a number of pathologies, the existing literature lacks characterizations of exactly when these pathologies occur and how they impact downstream task performance. In this paper, we concretely characterize conditions under which VAE training exhibits pathologies and connect these failure modes to undesirable effects on specific downstream tasks, such as learning compressed and disentangled representations, adversarial robustness, and semi-supervised learning.

1. INTRODUCTION

Variational Auto-encoders (VAEs) are deep generative latent variable models that transform simple distributions over a latent space to model complex data distributions (Kingma & Welling, 2013). They have been used for a wide range of downstream tasks, including: generating realistic-looking synthetic data (e.g. Pu et al. (2016)), learning compressed representations (e.g. Miao & Blunsom (2016); Gregor et al. (2016); Alemi et al. (2017)), adversarial defense using de-noising (Luo & Pfister, 2018; Ghosh et al., 2018), and, when expert knowledge is available, generating counter-factual data using weak or semi-supervision (e.g. Kingma et al. (2014); Siddharth et al. (2017); Klys et al. (2018)). Variational auto-encoders are widely used by practitioners due to the ease of their implementation and the simplicity of their training. In particular, the common choice of mean-field Gaussian (MFG) approximate posteriors for VAEs (MFG-VAE) results in an inference procedure that is straightforward to implement and stable in training. Unfortunately, a growing body of work has demonstrated that MFG-VAEs suffer from a variety of pathologies, including learning uninformative latent codes (e.g. van den Oord et al. (2017); Kim et al. (2018)) and unrealistic data distributions (e.g. Tomczak & Welling (2017)).

When the data consists of images or text, rather than evaluating the model based on metrics alone, we often rely on "gut checks" to ensure that the latent representations the model learns, as well as the synthetic and counterfactual data it generates, are of high quality (e.g. by reading generated text or visually inspecting generated images (Chen et al., 2018; Klys et al., 2018)). However, as VAEs are increasingly being used in applications where the data is numeric, e.g. in medical or financial domains (Pfohl et al., 2019; Joshi et al., 2019; Way & Greene, 2017), these intuitive qualitative checks no longer apply.
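To make the MFG-VAE setup above concrete, the following is a minimal sketch of a one-sample ELBO estimate for a VAE with a mean-field Gaussian approximate posterior and a unit-variance Gaussian likelihood. All names, dimensions, and the toy linear decoder are illustrative assumptions, not components of the model studied in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def mfg_elbo_terms(x, mu, logvar, decode):
    """One-sample ELBO estimate for an MFG-VAE.

    mu, logvar parameterize the mean-field Gaussian q(z|x);
    decode maps latent codes z back to data space (illustrative stand-in
    for a decoder network). The likelihood p(x|z) is assumed to be a
    unit-variance Gaussian.
    """
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    x_hat = decode(z)
    # Reconstruction term: log p(x|z) up to an additive constant.
    rec = -0.5 * np.sum((x - x_hat) ** 2, axis=-1)
    # KL(q(z|x) || N(0, I)) in closed form for a mean-field Gaussian.
    kl = 0.5 * np.sum(mu ** 2 + np.exp(logvar) - 1.0 - logvar, axis=-1)
    return rec, kl

# Toy linear decoder standing in for a neural network (illustrative only).
W = rng.standard_normal((2, 4))
x = rng.standard_normal((8, 4))
mu = rng.standard_normal((8, 2))
logvar = np.zeros((8, 2))
rec, kl = mfg_elbo_terms(x, mu, logvar, lambda z: z @ W)
elbo = np.mean(rec - kl)  # training maximizes this quantity
```

The closed-form KL term is what makes MFG-VAE training simple and stable, as noted above; it is also the term through which the inference model exerts its regularizing pressure on the generative model.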
For example, in many medical applications, the original data features (e.g. biometric readings) are difficult for human experts to analyze in raw form. In these cases, where the application touches human lives and potential model errors/pathologies are particularly consequential, we need a clear theoretical understanding of the failure modes of our models as well as their potential negative consequences on downstream tasks. Recent work (Yacoby et al., 2020) attributes a number of the pathologies of MFG-VAEs to properties of the training objective; in particular, the objective may compromise learning a good generative model in order to learn a good inference model; in other words, the inference model over-regularizes the generative model. While this pathology has been noted in the literature (Burda et al., 2016; Zhao et al., 2017; Cremer et al., 2018), no prior work has characterized the conditions under which the MFG-VAE objective compromises learning a good generative model in order to learn a good inference model; moreover, no prior work has related MFG-VAE pathologies to performance on downstream tasks. Rather, the existing literature focuses on mitigating the regularizing effect of the inference model on the VAE generative model by using richer variational families (e.g. Kingma et al.
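The over-regularization described above can be seen directly in the training objective. Both identities below are standard (following the notation of Kingma & Welling (2013)) and are sketched here only for concreteness:

```latex
\begin{align}
\mathcal{L}(\theta, \phi; x)
  &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
     - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right) \\
  &= \log p_\theta(x)
     - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right).
\end{align}
```

The second form shows that maximizing the ELBO trades off the data log-likelihood $\log p_\theta(x)$ against the quality of the posterior approximation: when the mean-field Gaussian family cannot capture $p_\theta(z \mid x)$, the objective may prefer generative parameters $\theta$ with lower data likelihood but an easier-to-approximate posterior, which is precisely the sense in which the inference model over-regularizes the generative model.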

