GENERALIZED ENERGY BASED MODELS

Abstract

We introduce the Generalized Energy Based Model (GEBM) for generative modelling. These models combine two trained components: a base distribution (generally an implicit model), which can learn the support of data with low intrinsic dimension in a high dimensional space; and an energy function, to refine the probability mass on the learned support. Both the energy function and base jointly constitute the final model, unlike GANs, which retain only the base distribution (the "generator"). GEBMs are trained by alternating between learning the energy and the base. We show that both training stages are well-defined: the energy is learned by maximising a generalized likelihood, and the resulting energy-based loss provides informative gradients for learning the base. Samples from the posterior on the latent space of the trained model can be obtained via MCMC, thus finding regions in this space that produce better quality samples. Empirically, the GEBM samples on image-generation tasks are of much better quality than those from the learned generator alone, indicating that all else being equal, the GEBM will outperform a GAN of the same complexity. When using normalizing flows as base measures, GEBMs succeed on density modelling tasks, returning comparable performance to direct maximum likelihood of the same networks.

1. INTRODUCTION

Energy-based models (EBMs) have a long history in physics, statistics and machine learning (LeCun et al., 2006). They belong to the class of explicit models, and can be described by a family of energies E which define probability distributions with density proportional to exp(-E). These models are typically known only up to a normalizing constant Z(E), also called the partition function. The learning task consists of finding the energy function that best describes a given system or target distribution P. This can be achieved using maximum likelihood estimation (MLE); however, the intractability of the normalizing partition function makes this learning task challenging. Various methods have therefore been proposed to address this (Hinton, 2002; Hyvärinen, 2005; Gutmann and Hyvärinen, 2012; Dai et al., 2019a;b). All these methods estimate EBMs that are supported over the whole space. In many applications, however, P is believed to be supported on an unknown lower-dimensional manifold. This happens in particular when there are strong dependencies between variables in the data (Thiry et al., 2021), and suggests incorporating a low-dimensionality hypothesis in the model.

Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) are a particular way to enforce low-dimensional structure in a model. They rely on an implicit model, the generator, which produces samples supported on a low-dimensional manifold by mapping pre-defined latent noise to the sample space through a trained function. GANs have been very successful in generating high-quality samples on various tasks, especially unsupervised image generation (Brock et al., 2018). The generator is trained adversarially against a discriminator network whose goal is to distinguish samples produced by the generator from the target data.
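To make the energy-based parameterization above concrete, the following is a minimal one-dimensional sketch (illustrative only, not the paper's model): a quadratic energy defines an unnormalized density exp(-E), and the partition function Z(E), intractable in high dimensions, is approximated here on a grid.

```python
import numpy as np

# Illustrative 1-D energy E(x) = x^2 / 2; the model density is
# proportional to exp(-E(x)).
def energy(x):
    return 0.5 * x ** 2

def unnormalized_density(x):
    return np.exp(-energy(x))

# The partition function Z(E) = \int exp(-E(x)) dx has no closed form for
# general energies and is intractable in high dimensions; in 1-D it can be
# approximated by a Riemann sum on a fine grid.
xs = np.linspace(-10.0, 10.0, 200_001)
dx = xs[1] - xs[0]
Z = float(np.sum(unnormalized_density(xs)) * dx)

# For this quadratic energy the model is a standard Gaussian, so Z should be
# close to sqrt(2 * pi) ~ 2.5066.
print(Z)
```

MLE for an EBM requires exactly this constant (or its gradient), which is why methods such as contrastive divergence and score matching, cited above, avoid computing Z directly.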
This has inspired further research extending the training procedure to more general losses (Nowozin et al., 2016; Arjovsky et al., 2017; Li et al., 2017; Bińkowski et al., 2018; Arbel et al., 2018) and improving its stability (Miyato et al., 2018; Gulrajani et al., 2017; Nagarajan and Kolter, 2017; Kodali et al., 2017). While the generator of a GAN effectively has low-dimensional support, it remains challenging to refine the distribution of mass on that support using pre-defined latent noise. For instance, as shown by Cornish et al. (2020) for normalizing flows, when the latent distribution is unimodal and the target distribution possesses multiple disconnected low-dimensional components, the generator, as a continuous map, compensates for this mismatch using steeper slopes. In practice, this implies the need for more complicated generators.

* Correspondence: michael.n.arbel@gmail.com.
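The steep-slopes phenomenon can be illustrated numerically with a construction of our own (not from the paper): take the generator to be the monotone map g(z) = F_mix^{-1}(Phi(z)), where Phi is the standard normal CDF and F_mix is the CDF of an equal mixture of N(-4, 1) and N(4, 1). This continuous map pushes unimodal Gaussian noise onto a bimodal target, and its slope in the low-density region between the modes is orders of magnitude larger than near a mode.

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mixture_cdf(x):
    # CDF of the bimodal target: 0.5 * N(-4, 1) + 0.5 * N(4, 1).
    return 0.5 * normal_cdf(x + 4.0) + 0.5 * normal_cdf(x - 4.0)

def generator(z, lo=-20.0, hi=20.0, iters=100):
    # g(z) = F_mix^{-1}(Phi(z)), with the inverse CDF found by bisection.
    u = normal_cdf(z)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope(z, eps=1e-4):
    # Central finite-difference estimate of g'(z).
    return (generator(z + eps) - generator(z - eps)) / (2.0 * eps)

# The slope between the modes (z near 0) dwarfs the slope near a mode,
# since the map must race across the near-empty region between components.
print(slope(0.0), slope(1.0))
```

The slope at z = 0 is roughly the ratio of the latent density to the (tiny) target density between the modes, which is why a neural generator facing the same mismatch needs sharper, more complicated functions.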

