HOW DO VARIATIONAL AUTOENCODERS LEARN? INSIGHTS FROM REPRESENTATIONAL SIMILARITY

Abstract

The ability of Variational Autoencoders (VAEs) to learn disentangled representations has made them popular for practical applications. However, their behaviour is not yet fully understood: for example, when they can provide disentangled representations, and when they suffer from posterior collapse, are still areas of active research. Despite this, there have been no layer-wise comparisons of the representations learned by VAEs, which would further our understanding of these models. In this paper, we therefore investigate the internal behaviour of VAEs using representational similarity techniques. Specifically, using the CKA and Procrustes similarities, we find that the encoders' representations are learned long before the decoders', and that this behaviour is independent of hyperparameters, learning objectives, and datasets. Moreover, the encoders' representations are similar across hyperparameters and learning objectives in all but the mean and variance layers.

1. INTRODUCTION

Variational Autoencoders (VAEs) are considered state-of-the-art techniques for learning unsupervised disentangled representations, and such representations have been shown to be beneficial for fairness (Locatello et al., 2019a). As a result, VAEs producing disentangled representations have been extensively studied in the last few years (Locatello et al., 2019b; Mathieu et al., 2019; Rolinek et al., 2019; Zietlow et al., 2021), but they still suffer from poorly understood issues such as posterior collapse (Dai et al., 2020). While some work using explainability techniques has been done to shed light on the behaviour of VAEs (Liu et al., 2020), a comparison of the representations learned by different methods is still lacking (Zietlow et al., 2021). Moreover, the layer-by-layer similarity of the representations within models has yet to be investigated. Fortunately, deep representational similarity is an active area of research, and metrics such as SVCCA (Raghu et al., 2017; Morcos et al., 2018), the Procrustes distance (Schönemann, 1966), and Centred Kernel Alignment (CKA) (Kornblith et al., 2019) have proven very useful in analysing the learning dynamics of various models (Wang et al., 2019; Kudugunta et al., 2019; Raghu et al., 2019; Neyshabur et al., 2020); they even helped to design Unsupervised Disentanglement Ranking (UDR) (Duan et al., 2020), an unsupervised model-selection metric for VAEs. The observation that good models are more similar to each other than bad ones, made in the context of classification (Morcos et al., 2018), generalised well to UDR (Rolinek et al., 2019; Duan et al., 2020). However, such a generalisation may not always be possible, and without sufficient evidence it would be wise to expect substantial differences between the representations learned by supervised and unsupervised models.
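To make the first of these metrics concrete, the linear variant of CKA (Kornblith et al., 2019) can be sketched in a few lines of NumPy. This is an illustrative implementation only, not necessarily the estimator used in any particular study; `linear_cka` and the toy activation matrices below are our own names. Linear CKA compares two activation matrices evaluated on the same examples and is invariant to orthogonal transformations and isotropic scaling of either representation:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices.

    X: (n_examples, d1) and Y: (n_examples, d2) are activations of two
    layers (or two models) recorded on the same n_examples inputs.
    Returns a similarity in [0, 1]."""
    # Centre each feature dimension.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))        # toy "layer activations"
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))  # random orthogonal map
Z = rng.normal(size=(100, 16))        # unrelated activations

print(linear_cka(X, X))      # 1.0: a representation matches itself
print(linear_cka(X, X @ Q))  # 1.0: invariant to orthogonal transforms
print(linear_cka(X, Z))      # small: unrelated representations
```

Because CKA accepts representations of different widths and needs no learned alignment, it is well suited to the layer-by-layer and cross-model comparisons discussed above; the Procrustes distance plays a similar role but measures dissimilarity after an optimal orthogonal alignment.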
In this paper, we take a first step toward investigating the representational similarity of generative models: we analyse the similarity scores obtained for a variety of VAEs learning disentangled representations, and provide some insights into why various VAE-specific methods that prevent posterior collapse (Bowman et al., 2016; He et al., 2019) or provide better reconstruction (Liu et al., 2021) are successful. Our contributions are as follows:

