APPROXIMATE PROBABILISTIC INFERENCE WITH COMPOSED FLOWS

Abstract

We study the problem of probabilistic inference on the joint distribution defined by a normalizing flow model. Given a pre-trained flow model p(x), we wish to estimate p(x_2 | x_1) for some arbitrary partitioning of the variables x = (x_1, x_2). We first show that this task is computationally hard for a large class of flow models. Motivated by this hardness result, we propose a framework for approximate probabilistic inference. Specifically, our method trains a new generative model with the property that its composition with the given model approximates the target conditional distribution. By parametrizing this new distribution as another flow model, we can efficiently train it using variational inference and also handle conditioning under arbitrary differentiable transformations. Since the resulting approximate posterior remains a flow, it offers exact likelihood evaluation, inversion, and efficient sampling. We provide extensive empirical evidence showcasing the flexibility of our method on a variety of inference tasks with applications to inverse problems. We also demonstrate experimentally that our approach is comparable to simple MCMC baselines in terms of sample quality. Further, we explain the failure of naively applying variational inference and show that our method does not suffer from the same issue.

1. INTRODUCTION

Generative modeling has seen unprecedented growth in recent years. Building on the success of deep learning, deep generative models have shown an impressive ability to model complex distributions in a variety of domains and modalities. Among them, normalizing flow models (see Papamakarios et al. (2019) and references therein) stand out due to their computational flexibility, as they offer efficient sampling, likelihood evaluation, and inversion. While other types of models currently outperform flow models in terms of likelihood and sample quality, flow models have the advantage that they are relatively easy to train using maximum likelihood and do not suffer from issues that other models possess (e.g., mode collapse for GANs, posterior collapse for VAEs, slow sampling for autoregressive models). These characteristics make normalizing flows attractive for a variety of downstream tasks, including density estimation, inverse problems, semi-supervised learning, reinforcement learning, and audio synthesis (Ho et al., 2019; Asim et al., 2019; Atanov et al., 2019; Ward et al., 2019; Oord et al., 2018).

Even with such computational flexibility, how to perform efficient probabilistic inference on a flow model remains largely unknown. This question is becoming increasingly important as generative models grow in size and the computational resources necessary to train them from scratch are out of reach for many researchers and practitioners [1]. If it were possible to perform probabilistic inference on flow models, we could re-purpose these powerful pre-trained generators for numerous custom tasks. This is the central question we study in this paper: given a flow model p(x), estimate the conditional distribution p(x_2 | x_1) for some partitioning of the variables x = (x_1, x_2). Existing methods for this task largely fall under two categories: Markov Chain Monte Carlo (MCMC) and variational inference (VI).
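As a concrete toy instance of this inference task, the sketch below fits a one-dimensional affine flow q(x_2) to the conditional of a known joint density by maximizing a Monte Carlo estimate of the evidence lower bound (ELBO). The bivariate-Gaussian "pretrained model", the affine parametrization, and all hyperparameters are illustrative assumptions for exposition, not the method proposed in this paper:

```python
import numpy as np

# Illustrative setup: the "pretrained model" is a bivariate Gaussian with a
# tractable log-density, p(x) = N(0, Sigma), and we approximate the
# conditional p(x2 | x1 = 1.0) with a one-dimensional affine flow
# q(x2) = mu + s * eps, eps ~ N(0, 1), trained by stochastic gradient
# ascent on a Monte Carlo ELBO estimate (reparametrization trick).
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
Lam = np.linalg.inv(Sigma)          # precision matrix of the joint
x1 = 1.0                            # observed value we condition on

mu, log_s = 0.0, 0.0                # variational parameters of q
lr, batch = 0.01, 256
mu_avg, s_avg, n_avg = 0.0, 0.0, 0

for step in range(3000):
    s = np.exp(log_s)
    eps = rng.standard_normal(batch)
    x2 = mu + s * eps               # reparametrized samples from q
    # For a Gaussian joint, d/dx2 log p(x1, x2) = -(Lam[1,0]*x1 + Lam[1,1]*x2)
    g = -(Lam[1, 0] * x1 + Lam[1, 1] * x2)
    # ELBO gradients: pathwise term plus d(entropy of q)/d(log s) = 1
    mu += lr * g.mean()
    log_s += lr * (np.mean(g * s * eps) + 1.0)
    if step >= 2500:                # average late iterates to reduce MC noise
        mu_avg += mu
        s_avg += np.exp(log_s)
        n_avg += 1

mu_hat, s_hat = mu_avg / n_avg, s_avg / n_avg
# Exact Gaussian conditional for comparison: mean 0.8 * x1, std sqrt(1 - 0.8**2)
print(mu_hat, s_hat)
```

Because the joint here is Gaussian, the exact conditional (mean 0.8, standard deviation 0.6) is available in closed form, so the quality of the variational fit can be checked directly; with a generic flow model, only the joint log-density would be available and the same ELBO objective would be estimated through the model's log_prob.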
While MCMC methods can perform exact conditional sampling in theory, they often have prohibitively long mixing times for complex high-dimensional distributions and also



[1] For example, Kingma & Dhariwal (2018) report that their largest model had 200M parameters and was trained on 40 GPUs for a week.

