ON THE OPTIMAL PRECISION OF GANS

Abstract

Generative adversarial networks (GANs) are known to face model misspecification when learning disconnected distributions. Indeed, no continuous mapping can push a unimodal latent distribution onto a disconnected one, so GANs necessarily generate samples outside of the support of the target distribution. In this paper, we make the connection between the performance of GANs and their latent space configuration. In particular, we raise the following question: what is the latent space partition that minimizes the measure of out-of-manifold samples? Building on a recent result of geometric measure theory, we prove a sufficient condition for GANs to be optimal when the dimension of the latent space is larger than the number of modes. In particular, we show the optimality of generators that structure their latent space as a 'simplicial cluster': a Voronoi partition whose centers are pairwise equidistant. We derive both an upper and a lower bound on the optimal precision of GANs learning disconnected manifolds. Interestingly, these two bounds have the same order of decrease: √(log m), where m is the number of modes. Finally, we perform several experiments to exhibit the geometry of the latent space and show experimentally that GANs reach a geometry with properties similar to the theoretical one.

1. INTRODUCTION

GANs (Goodfellow et al., 2014), a family of deep generative models, have shown a great capacity to generate photorealistic images (Karras et al., 2019). State-of-the-art models, like StyleGAN (Karras et al., 2019) or TransformerGAN (Jiang et al., 2021), benefit empirically from relying on overparametrized networks with high-dimensional latent spaces. Besides, manipulating the latent representation of a GAN is also helpful for diverse tasks such as image editing (Shen et al., 2020; Wu et al., 2021) or unsupervised learning of image segmentation (Abdal et al., 2021). However, there is still a poor theoretical understanding of how GANs organize their latent space, and we argue that such an understanding is a crucial step in better apprehending their behavior. In this respect, the setting of learning disconnected distributions is enlightening. Experimental and theoretical works (Khayatkhoei et al., 2018; Tanielian et al., 2020) have shown a fundamental limitation of GANs when dealing with such distributions. Since the distribution modeled by a GAN is connected, some areas of its support are necessarily mapped outside the true data distribution. When covering the modes of a disconnected distribution, GANs must therefore minimize the measure of the generated distribution lying outside the true modes (e.g., the purple area on the right of Figure 1). In other words, GANs need to minimize the measure of the borders between the modes in the latent space. For a Gaussian latent space, minimizing this measure is closely linked to the field of Gaussian isoperimetric inequalities (Ledoux, 1996), which aims at deriving the partitions that decompose a Gaussian space with minimal Gaussian-weighted perimeter. We argue that the optimal partitions derived in this field cast a light on the structure of the latent space of GANs.
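As a concrete illustration of this 'border measure' (our own sketch, not code from the paper), the following Monte Carlo estimate computes the Gaussian mass of an ε-tube around the boundary of the simplest two-cell partition of the latent space, a hyperplane split. This mass is roughly what a generator covering two modes would map off the true support.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
d = 3            # latent dimension
eps = 0.1        # half-width of the "border" tube around the boundary
z = rng.standard_normal((100_000, d))   # samples from the Gaussian latent distribution

# Partition the latent space into two cells by the hyperplane z_1 = 0.
# The "border" is the tube |z_1| < eps; its Gaussian measure approximates
# the mass a generator covering two modes maps off the true support.
border_mass = np.mean(np.abs(z[:, 0]) < eps)

# Closed form: P(|Z| < eps) = erf(eps / sqrt(2)) for a standard normal coordinate.
exact = erf(eps / sqrt(2.0))
print(border_mass, exact)
```

Among all two-cell partitions, the hyperplane split is the one with minimal Gaussian-weighted perimeter, which is the classical two-component case of the isoperimetric results discussed above.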
Most notably, a recent result (Milman and Neeman, 2022) shows that, as long as the number of components m in the partition and the dimension d of the Gaussian space satisfy m ≤ d + 1, the optimal partition is a 'simplicial cluster': a Voronoi diagram obtained from the cells of equidistant points (see the left of Figure 1 for m = 3 and d = 3). In this paper, we apply this result to the field of GANs and show, both experimentally and theoretically, that GANs with a 'simplicial cluster' latent space minimize the measure of out-of-distribution generated samples. We draw the connection between GANs and Gaussian isoperimetric inequalities through the precision metric (Sajjadi et al., 2018; Kynkäänniemi et al., 2019), which quantifies the portion of generated points lying on the support of the target distribution. We show that GANs with a latent space organized as a simplicial cluster reach optimal precision levels, and we derive both an upper and a lower bound on the precision of such GANs. Experimentally, we show that GANs with higher performance tend to organize their latent space as simplicial clusters. To summarize, our contributions are the following:
• We are the first to import the latest results on Gaussian isoperimetric inequalities by Milman and Neeman (2022) to the study and understanding of GANs. We use them to show that the latent space structure has major implications for the precision of GANs.
• We derive a new theoretical analysis, stating both an upper bound and a lower bound on the precision of GANs. We show that GANs with a latent space organized as a simplicial cluster have an optimal precision whose lower bound decreases at the same rate as the upper bound: √(log m), where m is the number of modes.
• Experimentally, we show that GANs tend to structure their latent space as 'simplicial clusters' on image datasets. First, we explore two properties of the latent space: linear separability and convexity of classes. Then, we vary the latent space dimension and highlight how it impacts the performance of GANs. Finally, we show that overparametrization helps approach the optimal structure and improves GAN performance.
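The 'simplicial cluster' structure can be made concrete with a minimal NumPy sketch (ours, for intuition only): for m ≤ d, the centered standard basis vectors of R^d are pairwise equidistant, and their Voronoi cells partition the Gaussian latent space into m cells of equal mass (the propeller of Figure 1 when d = m = 3).

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 3, 3                      # latent dimension and number of modes, m <= d
n = 100_000

# m equidistant centers in R^d: centered standard basis vectors.
# All pairwise distances equal sqrt(2), so their Voronoi diagram is a
# 'simplicial cluster'.
e = np.eye(m, d)
centers = e - e.mean(axis=0)
dists = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)

# Voronoi assignment of Gaussian latent samples: each z goes to its
# nearest center, i.e. one latent cell per mode.
z = rng.standard_normal((n, d))
labels = np.argmin(np.linalg.norm(z[:, None] - centers[None, :], axis=-1), axis=1)

# By symmetry, each cell carries Gaussian mass 1/m.
masses = np.bincount(labels, minlength=m) / n
print(dists, masses)
```

A generator with this latent structure dedicates each of the m equal-mass cells to one mode, so the only out-of-support mass comes from thin neighborhoods of the cell boundaries, which the Milman and Neeman (2022) result shows are of minimal Gaussian-weighted perimeter.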

2. RELATED WORK

2.1 NOTATION

Data. We assume that the target distribution µ is defined on the Euclidean space R^D (potentially a high-dimensional space), equipped with the Euclidean norm ‖·‖. We denote by S_µ the support of this unknown distribution µ. In practice, however, we only have access to a finite collection of i.i.d. observations X_1, ..., X_n distributed according to µ. Thus, for the remainder of the article, we let µ_n be the empirical measure based on X_1, ..., X_n.

Generative model. We consider G_L, the set of L-Lipschitz continuous functions from the latent space R^d to the high-dimensional space R^D. Each generator aims at producing realistic samples. The latent distribution, defined on R^d, is assumed to be Gaussian and is denoted γ. Thus, each candidate distribution is the push-forward of γ by a generator G, denoted G♯γ. This Lipschitzness assumption on G_L is reasonable since Virmaux and Scaman (2018) presented an algorithm that upper-bounds the Lipschitz constant of any deep neural network. In practice, one can enforce the Lipschitzness of generator functions by clipping the neural networks' weights.
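The notation above can be illustrated with a toy example (our sketch; the paper's generators are deep networks): a one-layer linear generator, samples from the push-forward G♯γ, and entrywise weight clipping as a crude way of controlling the Lipschitz constant.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 2, 5          # latent and ambient dimensions (small, for illustration)

# A toy one-layer generator G(z) = W z + b; its Lipschitz constant is
# exactly the spectral norm of W (for deep nets, the product of layer
# norms gives an upper bound, as in Virmaux and Scaman (2018)).
W = rng.standard_normal((D, d))
b = rng.standard_normal(D)
G = lambda z: z @ W.T + b

# The candidate distribution G#gamma: push Gaussian latent samples through G.
z = rng.standard_normal((10_000, d))
x = G(z)

lip = np.linalg.norm(W, ord=2)          # spectral norm = Lipschitz constant of G

# Entrywise clipping to [-c, c] bounds the spectral norm by c * sqrt(d * D),
# via the Frobenius norm, so the clipped generator lies in G_L for L = c * sqrt(d * D).
c = 0.1
W_clipped = np.clip(W, -c, c)
lip_clipped = np.linalg.norm(W_clipped, ord=2)
print(lip, lip_clipped, c * np.sqrt(d * D))
```

Note that clipping only enforces a coarse bound; tighter control of the Lipschitz constant (e.g., via spectral normalization) is common in practice but orthogonal to the notation introduced here.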



Figure 1: Illustration of the ability of GANs to find an optimal configuration in the latent space. On the left, the propeller shape is a partition of the 3D Gaussian space with the smallest Gaussian-weighted perimeter (figure from Heilman et al. (2013)). On the right, we show the 3D Gaussian latent space of a GAN trained on three classes of MNIST. Each area colored in blue, green, or red maps to samples of one of the three classes. In purple, we observe the samples that are classified with low confidence. We see that the partition reached by the GAN (right) is close to optimality (left), since the latent space partition is similar to the intersection of the propeller with a sphere.

