DENSITY ESTIMATION ON LOW-DIMENSIONAL MANIFOLDS: AN INFLATION-DEFLATION APPROACH

Abstract

Normalizing Flows (NFs) are universal density estimators based on Neural Networks. However, this universality is limited: the density's support needs to be diffeomorphic to a Euclidean space. In this paper, we propose a novel method to overcome this limitation without sacrificing universality. The proposed method inflates the data manifold by adding noise in the normal space, trains an NF on this inflated manifold, and finally deflates the learned density. Our main result provides sufficient conditions on the manifold and the specific choice of noise under which the corresponding estimator is exact. Our method has the same computational complexity as NFs and does not require computing an inverse flow. We also show that, if the embedding dimension is much larger than the manifold dimension, noise in the normal space can be well approximated by Gaussian noise. This allows using our method for approximating arbitrary densities on non-flat manifolds, provided that the manifold dimension is known.

1. INTRODUCTION

Many modern problems involving high-dimensional data are formulated probabilistically. Key concepts, such as Bayesian classification, denoising, or anomaly detection, rely on the data-generating density p*(x). Learning this density p*(x) from samples is therefore a central research problem. For the case where the corresponding random variable X ∈ R^D takes values on a manifold diffeomorphic to R^D, a Normalizing Flow (NF) can be used to learn p*(x) exactly (Huang et al., 2018). Recently, a few attempts have been made to overcome this topological constraint. However, to do so, all of these methods either need to know the manifold beforehand (Gemici et al. (2016), Rezende et al. (2020)) or sacrifice the exactness of the estimate (Cornish et al. (2019), Dupont et al. (2019)).

Our goal in this paper is to overcome both of the aforementioned limitations of using NFs for density estimation on Riemannian manifolds. Given data points from a d-dimensional Riemannian manifold embedded in R^D, d < D, we first inflate the manifold by adding a specific noise in the normal space of the manifold, then train an NF on this inflated manifold, and, finally, deflate the trained density by exploiting the choice of noise and the geometry of the manifold. See Figure 1 for a schematic overview of these steps. Our main theorem states sufficient conditions on the manifold and on the type of noise used in the inflation step under which the deflation becomes exact. To guarantee exactness, we do need to know the manifold, as in e.g. Rezende et al. (2020), because we need to be able to sample in the manifold's normal space. However, as we will show, for the special case D ≫ d, the usual Gaussian noise is an excellent approximation of noise in the normal space. This allows using our method for approximating arbitrary densities on Riemannian manifolds provided that the manifold dimension is known. In addition, our method is based on a single NF without the necessity to invert it. Hence, we add no complexity to the usual training procedure of NFs.

Notation: We denote the determinant of the Gram matrix of f by g_f(x) := |det J_f(x)^T J_f(x)|, where J_f(x) is the Jacobian of f. We denote the Lebesgue measure in R^n by λ^n. Random variables will be denoted with a capital letter, say X, and their corresponding state spaces with the calligraphic counterpart.
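To make the inflate-train-deflate pipeline concrete, the following toy sketch runs it on the unit circle in R^2 (d = 1, D = 2). It is an illustration under our own simplifying assumptions, not the paper's implementation: a Gaussian KDE stands in for the NF, the noise scale σ and the uniform target density are arbitrary choices, and the deflation factor σ√(2π) is the closed-form normal-noise density at offset zero for this particular setting.

```python
# Toy sketch of the inflation-deflation idea (illustrative assumptions:
# manifold = unit circle in R^2, uniform target density, Gaussian KDE as a
# stand-in for the Normalizing Flow).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
n, sigma = 100_000, 0.2

# Ground truth on the manifold: uniform in arc length, p*(theta) = 1/(2*pi).
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)

# 1) Inflate: push each point off the manifold along its normal (radial)
#    direction with Gaussian noise of scale sigma.
radius = 1.0 + sigma * rng.standard_normal(n)
x = np.stack([radius * np.cos(theta), radius * np.sin(theta)])  # shape (2, n)

# 2) "Train" a density estimator on the inflated point cloud in R^2.
kde = gaussian_kde(x)
kde.set_bandwidth(bw_method=0.05)  # narrow kernel so the ring is not over-smoothed

# 3) Deflate: on the circle, the inflated density factorizes as
#    p_tilde(m(theta)) = p*(theta) * N(0; 0, sigma^2), so multiplying by
#    sigma * sqrt(2*pi) (the reciprocal of the noise density at offset 0)
#    recovers the density on the manifold.
eval_theta = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
on_manifold = np.stack([np.cos(eval_theta), np.sin(eval_theta)])
p_deflated = kde(on_manifold) * sigma * np.sqrt(2.0 * np.pi)

print(p_deflated)  # each entry should lie near 1/(2*pi) ~ 0.159
```

With an actual NF in place of the KDE, step 2 is ordinary maximum-likelihood flow training on the inflated samples, which is why the method inherits the computational complexity of standard NFs.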


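The claim that Gaussian noise approximates normal-space noise when D ≫ d can be checked numerically. The sketch below (with arbitrarily chosen dimensions d = 2, D = 512) verifies that an isotropic Gaussian in R^D puts only a fraction of roughly d/D of its squared norm into any fixed d-dimensional tangent space, so almost all of the noise energy lies in the (D − d)-dimensional normal space.

```python
# Numerical check: for z ~ N(0, I_D), the expected fraction of |z|^2 falling
# into a fixed d-dimensional subspace is exactly d/D, which is tiny for D >> d.
import numpy as np

rng = np.random.default_rng(1)
d, D, n = 2, 512, 10_000

z = rng.standard_normal((n, D))      # isotropic Gaussian noise in R^D
tangent = z[:, :d]                   # w.l.o.g. tangent space = first d axes
tangent_fraction = (tangent**2).sum(axis=1) / (z**2).sum(axis=1)

print(tangent_fraction.mean())       # ~ d / D = 2/512 ~ 0.004
```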