GM-VAE: REPRESENTATION LEARNING WITH VAE ON GAUSSIAN MANIFOLD

Abstract

We propose a Gaussian manifold variational auto-encoder (GM-VAE) whose latent space consists of a set of diagonal Gaussian distributions. It is known that the set of diagonal Gaussian distributions equipped with the Fisher information metric forms a product of hyperbolic spaces, which we call a Gaussian manifold. To learn a VAE endowed with the Gaussian manifold, we first propose a pseudo-Gaussian manifold normal distribution, which defines a density over the latent space using the Kullback-Leibler divergence, a local approximation of the squared Fisher-Rao distance. With the newly proposed distribution, we introduce geometric transformations at the output of the encoder and the input of the decoder to ease the transition between the Euclidean and Gaussian manifolds. In empirical experiments, GM-VAE shows competitive generalization performance against hyperbolic and Euclidean variants of the VAE, and it achieves strong numerical stability, addressing a common limitation reported for previous hyperbolic VAEs.

1. INTRODUCTION

The geometry of the latent space in generative models, such as variational auto-encoders (VAE) (Kingma & Welling, 2013) and generative adversarial networks (GAN) (Goodfellow et al., 2020), reflects the structure of the learned representation of the data. Mathieu et al. (2019); Nagano et al. (2019); Cho et al. (2022) show that employing a hyperbolic space as the latent space better preserves the hierarchical structure of the data. The choice of geometry is not limited to hyperbolic space: the latent space can also be another type of Riemannian manifold, such as a spherical manifold (Xu & Durrett, 2018; Davidson et al., 2018) or a product of Riemannian manifolds with mixed curvatures (Skopek et al., 2019).

Meanwhile, it is known that the univariate Gaussian distributions equipped with the Fisher information metric (FIM) form a Riemannian manifold that shares its geometry with the Poincaré half-plane, one of the four isometric models of hyperbolic space. This statistical manifold has a metric tensor akin to that of the Poincaré half-plane (Costa et al., 2015), which makes it possible to view it as a hyperbolic space. Furthermore, the diagonal Gaussian distributions form a product of such manifolds, yielding an extended statistical manifold.

Based on this connection between hyperbolic spaces and statistical manifolds, in this work we add an alternative, information-geometric perspective on hyperbolic VAEs. Previously proposed hyperbolic VAEs rely on distributions defined over the hyperbolic space: the Riemannian normal and the wrapped normal are commonly used as prior and variational distributions. Unlike the Gaussian distribution in Euclidean space, these distributions suffer from numerical instability (Mathieu et al., 2019; Skopek et al., 2019). In addition, the Riemannian normal requires rejection sampling, which often discards many samples.
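The connection between the univariate Gaussian family and the Poincaré half-plane can be checked directly: the FIM of N(μ, σ²) in the (μ, σ) parametrization is diag(1/σ², 2/σ²), i.e. the half-plane metric up to a rescaling of the μ-axis. The following NumPy sketch (not from the paper; the Monte Carlo setup is our own illustration) estimates the FIM as the expected outer product of the score and compares it to this analytic form.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x, mu, sigma):
    """Gradient of log N(x; mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = (x - mu) ** 2 / sigma**3 - 1.0 / sigma
    return np.stack([d_mu, d_sigma], axis=-1)

mu, sigma = 0.5, 2.0
x = rng.normal(mu, sigma, size=2_000_000)
s = score(x, mu, sigma)

# Monte Carlo estimate of the FIM: E[score score^T] under N(mu, sigma^2).
fim = s.T @ s / len(x)

# Analytic FIM of a univariate Gaussian in (mu, sigma) coordinates:
# diag(1/sigma^2, 2/sigma^2) -- the Poincare half-plane metric up to
# rescaling the mu-axis by sqrt(2).
analytic = np.diag([1 / sigma**2, 2 / sigma**2])
print(np.round(fim, 3))
```

With the substitution x = μ/√2, y = σ, the line element ds² = (dμ² + 2dσ²)/σ² becomes 2(dx² + dy²)/y², i.e. twice the standard half-plane metric.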
From this information-geometric perspective on the hyperbolic space, we introduce a new distribution, named the pseudo-Gaussian manifold normal distribution (PGM normal). The Gaussian manifold here refers to the statistical manifold of univariate Gaussian distributions. The newly proposed distribution uses the KL divergence as a statistical distance between two points on the Gaussian manifold. Since the KL divergence locally approximates the squared Riemannian distance derived from the FIM, the proposed distribution follows the geometric property of the Gaussian
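The local approximation underlying the PGM normal can be verified numerically: for two nearby univariate Gaussians, 2·KL(p‖q) is close to the squared Fisher-Rao distance. The sketch below (our own illustration, not code from the paper) uses the closed-form KL divergence between Gaussians and the known closed-form Fisher-Rao distance, which is √2 times the Poincaré half-plane distance between the points (μ/√2, σ).

```python
import numpy as np

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2 * s2**2) - 0.5

def fisher_rao(mu1, s1, mu2, s2):
    """Fisher-Rao distance between univariate Gaussians:
    sqrt(2) times the half-plane distance between (mu/sqrt(2), sigma)."""
    t = ((mu1 - mu2) ** 2 / 2 + (s1 - s2) ** 2) / (2 * s1 * s2)
    return np.sqrt(2) * np.arccosh(1 + t)

# Two nearby Gaussians: the approximation 2*KL ~ d_FR^2 is local,
# so it holds up to higher-order terms in the perturbation.
mu, s, dmu, ds = 0.0, 1.0, 0.01, 0.01
kl = kl_gauss(mu, s, mu + dmu, s + ds)
d2 = fisher_rao(mu, s, mu + dmu, s + ds) ** 2
print(2 * kl, d2)  # nearly equal for nearby distributions
```

For distant pairs the two quantities diverge, which is why the KL divergence serves only as a local surrogate for the squared Fisher-Rao distance.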

