MONOFLOW: A UNIFIED GENERATIVE MODELING FRAMEWORK FOR DIVERGENCE GANS

Abstract

Generative adversarial networks (GANs) play a minimax two-player game via adversarial training. The conventional understanding of adversarial training is that the discriminator is trained to estimate a divergence and the generator learns to minimize this divergence. We argue that, although many variants of GANs have been developed following this paradigm, the existing theoretical understanding of GANs and the practical algorithms are inconsistent. In order to gain deeper theoretical insights and algorithmic inspiration for these GAN variants, we leverage Wasserstein gradient flows, which characterize the evolution of particles in the sample space. Based on this, we introduce a unified generative modeling framework, MonoFlow: the particle evolution is rescaled via an arbitrary monotonically increasing mapping. Under our framework, adversarial training can be viewed as a procedure that first obtains MonoFlow's vector field via the discriminator, after which the generator learns to parameterize the flow defined by the corresponding vector field. We also reveal the fundamental difference between variational divergence minimization and adversarial training. This analysis helps us identify which types of generator loss functions can lead to the successful training of GANs, and it suggests that GANs may admit more loss designs beyond those developed in the literature (e.g., the non-saturated loss), as long as they realize MonoFlow. Consistent empirical studies are also included to validate the effectiveness of our framework.

1. INTRODUCTION

Generative adversarial nets (GANs) (Goodfellow et al., 2014; Jabbar et al., 2021) are a powerful generative modeling framework that has gained tremendous attention in recent years. GANs have achieved significant successes in applications, especially in high-dimensional image processing such as high-fidelity image generation (Brock et al., 2018; Karras et al., 2019), super-resolution (Ledig et al., 2017) and domain adaptation (Zhang et al., 2017). In the GAN framework, a discriminator d and a generator g play a minimax game. The discriminator is trained to distinguish real and fake samples, and the generator is trained to generate fake samples to fool the discriminator. The equilibrium of the vanilla GAN is defined by¹

$$\min_g \max_d V(g, d) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log \sigma(d(x))\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - \sigma(d(g(z)))\big)\big].$$

The elementary optimization approach to solving the minimax game is adversarial training. Previous perspectives explained it as first estimating the Jensen-Shannon divergence, after which the generator learns to minimize this divergence.
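As a concrete illustration of the objective above, the following minimal NumPy sketch evaluates the standard losses on logit outputs d(x), including both the minimax ("saturating") generator loss from the value function and the non-saturating heuristic mentioned later in the paper. The function names are ours, introduced purely for illustration:

```python
import numpy as np

def sigmoid(t):
    """Sigmoid activation sigma(t) applied to the discriminator logits."""
    return 1.0 / (1.0 + np.exp(-t))

def discriminator_loss(d_real_logits, d_fake_logits):
    """Negative of V(g, d): the discriminator maximizes
    E[log sigma(d(x))] + E[log(1 - sigma(d(g(z))))] over real/fake batches."""
    return -(np.mean(np.log(sigmoid(d_real_logits)))
             + np.mean(np.log(1.0 - sigmoid(d_fake_logits))))

def generator_loss_saturating(d_fake_logits):
    """Minimax generator loss: minimize E[log(1 - sigma(d(g(z))))]."""
    return np.mean(np.log(1.0 - sigmoid(d_fake_logits)))

def generator_loss_non_saturating(d_fake_logits):
    """Non-saturating heuristic: maximize E[log sigma(d(g(z)))] instead,
    which gives stronger gradients when the discriminator rejects fakes."""
    return -np.mean(np.log(sigmoid(d_fake_logits)))
```

When the discriminator outputs zero logits (i.e., assigns probability 1/2 everywhere), the discriminator loss equals 2 log 2, the value at the equilibrium of the vanilla GAN.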



¹ We use a slightly different notation: d(x) is the logit output of the classifier and σ(·) is the sigmoid activation.



Several variants of GANs have been developed based on this point of view for other probability divergences, e.g., the χ² divergence (Mao et al., 2017), the Kullback-Leibler (KL) divergence (Arbel et al., 2021) and general f-divergences (Nowozin et al., 2016; Uehara et al., 2016), while others are developed with Integral Probability Metrics (Arjovsky et al., 2017; Dziugaite et al., 2015; Mroueh et al., 2018b). However, we emphasize that the traditional understanding of GANs is incomplete, and here we present three non-negligible facts commonly associated with adversarial training that make it different from the standard variational divergence minimization (VDM) problem (Blei et al., 2017):

