MULTISCALE INVERTIBLE GENERATIVE NETWORKS FOR HIGH-DIMENSIONAL BAYESIAN INFERENCE

Anonymous

Abstract

High-dimensional Bayesian inference problems pose a long-standing challenge for sample generation, especially when the posterior has multiple modes. For a wide class of Bayesian inference problems equipped with a multiscale structure, in which a low-dimensional (coarse-scale) surrogate can approximate the original high-dimensional (fine-scale) problem well, we propose to train a Multiscale Invertible Generative Network (MsIGN) for sample generation. A novel prior conditioning layer is designed to bridge networks at different resolutions, enabling coarse-to-fine multi-stage training. The Jeffreys divergence is adopted as the training objective to avoid mode dropping. On two high-dimensional Bayesian inverse problems, MsIGN approximates the posterior accurately and clearly captures multiple modes, showing superior performance compared with previous deep generative network approaches. On the natural image synthesis task, MsIGN achieves superior performance in bits-per-dimension compared with our baseline models and yields great interpretability of the neurons in its intermediate layers.

1. INTRODUCTION

Bayesian inference provides a powerful framework to blend prior knowledge, the data-generation process, and (possibly scarce) data for statistical inference. Given some prior distribution ρ for the quantity of interest x ∈ R^d and a (noisy) measurement y ∈ R^{d_y}, it casts on x a posterior

q(x|y) ∝ ρ(x) L(y|x), where L(y|x) = N(y − F(x); 0, Γ), (1)

where L(y|x) is the likelihood that compares the data y with the system prediction F(x) from the candidate x; here F denotes the forward process. Different distributions can be used to model the mismatch ε = y − F(x), and for simplicity of illustration we assume a Gaussian with noise covariance Γ in Equation 1. For example, Bayesian deep learning generates model-predicted logits F(x) from model parameters x and compares them with discrete labels y through a binomial or multinomial distribution. Sampling from q is a long-standing challenge, especially in high-dimensional (high-d) cases. An arbitrary high-d posterior can have its important regions (also called "modes") anywhere in the high-d space, and finding these modes requires computational cost that grows exponentially with the dimension d. This intrinsic difficulty is a consequence of the curse of dimensionality, from which all existing Bayesian inference methods suffer.
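The unnormalized log-posterior implied by Equation 1 can be sketched in a few lines. This is an illustrative example, not part of the paper: it assumes a zero-mean Gaussian prior and a linear forward map A (both hypothetical choices) to make the log-density log ρ(x) + log L(y|x) concrete, dropping additive constants.

```python
import numpy as np

def log_unnormalized_posterior(x, y, forward, noise_cov, prior_cov):
    """Unnormalized log-posterior log rho(x) + log L(y|x) from Equation 1,
    assuming a zero-mean Gaussian prior (an illustrative choice)."""
    resid = y - forward(x)  # mismatch eps = y - F(x)
    log_lik = -0.5 * resid @ np.linalg.solve(noise_cov, resid)
    log_prior = -0.5 * x @ np.linalg.solve(prior_cov, x)
    return log_lik + log_prior

# Toy setup: a hypothetical linear forward process F(x) = A x and noiseless data.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
x_true = rng.standard_normal(5)
y = A @ x_true
lp = log_unnormalized_posterior(x_true, y, lambda x: A @ x, np.eye(3), np.eye(5))
```

With noiseless data the residual at x_true vanishes, so the value reduces to the Gaussian log-prior term alone; MCMC or variational methods would then explore this density, which is exactly where the cost of high dimension appears.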



In this paper, we focus on Bayesian inference problems with multiscale structure and exploit this structure to sample from a high-d posterior. While the original problem has a high spatial resolution (fine-scale), its low-resolution (coarse-scale) analogue is computationally attractive because it lies in a low-dimensional (low-d) space. A problem has a multiscale structure if such a coarse-scale low-d surrogate exists and gives a good approximation to the fine-scale high-d problem; see Section 2.1. Such a multiscale property is very common in high-d Bayesian inference problems. For example, inferring the 3-D permeability field of the subsurface at the scale of meters is a reasonable approximation of the same problem at the scale of centimeters, while the problem dimension is 10^6 times smaller.

We propose a Multiscale Invertible Generative Network (MsIGN) to sample from high-d Bayesian inference problems with multiscale structure. MsIGN is a flow-based generative network that can both generate samples and give density evaluation. It consists of multiple scales that recursively
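The coarse-scale surrogate idea can be illustrated on a 1-D field: block-averaging maps a fine-scale quantity to a low-d coarse grid, and a piecewise-constant refinement maps it back with small error when the field is smooth. This is a minimal sketch of the multiscale relationship, not the paper's construction; the function names and the factor-of-2 coarsening are illustrative assumptions.

```python
import numpy as np

def coarsen(x_fine, factor=2):
    """Block-average a 1-D fine-scale field onto a coarse grid
    (an illustrative coarse-scale surrogate of the quantity of interest)."""
    return x_fine.reshape(-1, factor).mean(axis=1)

def refine(x_coarse, factor=2):
    """Piecewise-constant upsampling from the coarse grid back to the fine grid."""
    return np.repeat(x_coarse, factor)

# A smooth fine-scale field of dimension 64 vs. its coarse surrogate of dimension 32.
x_fine = np.sin(np.linspace(0, 2 * np.pi, 64))
x_coarse = coarsen(x_fine)
x_back = refine(x_coarse)
err = np.max(np.abs(x_fine - x_back))  # small because the field is smooth
```

The point of the multiscale structure is that inference on x_coarse is far cheaper (half the dimension here, orders of magnitude in 3-D), yet its solution carries most of the information about the fine-scale field.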

