RG-FLOW: A HIERARCHICAL AND EXPLAINABLE FLOW MODEL BASED ON RENORMALIZATION GROUP AND SPARSE PRIOR

Abstract

Flow-based generative models have become an important class of unsupervised learning approaches. In this work, we incorporate the key ideas of the renormalization group (RG) and sparse prior distributions to design a hierarchical flow-based generative model, called RG-Flow, which can separate information at different scales of images, with disentangled representations at each scale. We demonstrate our method mainly on the CelebA dataset and show that the disentangled representations at different scales enable semantic manipulation and style mixing of images. To visualize the latent representations, we introduce receptive fields for flow-based models and find that the receptive fields learned by RG-Flow are similar to those of convolutional neural networks. In addition, we replace the widely adopted Gaussian prior distribution with a sparse prior distribution to further enhance the disentanglement of representations. From a theoretical perspective, the proposed method has O(log L) complexity for image inpainting, compared to the O(L^2) complexity of previous generative models.

1. INTRODUCTION

One of the most important unsupervised learning tasks is to learn the data distribution and build generative models. Over the past few years, various types of generative models have been proposed. Flow-based generative models are a particular family of generative models with tractable distributions (Dinh et al., 2017; Kingma & Dhariwal, 2018; Chen et al., 2018b; 2019; Behrmann et al., 2019; Hoogeboom et al., 2019; Brehmer & Cranmer, 2020; Rezende et al., 2020; Karami et al., 2019). Yet their latent variables are on an equal footing and mixed globally. Here, we propose a new flow-based model, RG-Flow, which is inspired by the idea of the renormalization group in statistical physics. RG-Flow imposes locality and hierarchical structure on its bijective transformations, which allows us to access information at different scales of the original images through latent variables at different locations, offering better explainability. Combined with sparse priors (Olshausen & Field, 1996; 1997; Hyvärinen & Oja, 2000), we show that RG-Flow achieves hierarchically disentangled representations.

The renormalization group (RG) is a powerful tool for analyzing statistical mechanics models and quantum field theories in physics (Kadanoff, 1966; Wilson, 1971). It progressively extracts coarser-scale statistical features of the physical system and decimates irrelevant fine-grained statistics at each scale. Typically, the local transformations used in RG are designed by human physicists and are not bijective. Flow-based models, on the other hand, use cascaded invertible global transformations to progressively turn a complicated data distribution into a Gaussian distribution. Here, we combine the key ideas of RG and flow-based models: RG-Flow lets the machine learn the optimal RG transformation from data by constructing local invertible transformations, and builds a hierarchical generative model for the data distribution.
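To make the notion of an invertible transformation concrete, the following sketch implements a RealNVP-style affine coupling layer of the kind such flow models are built from. This is an illustrative toy, not the paper's architecture: the conditioners `W_s` and `W_t` are fixed random linear maps standing in for the small neural networks used in practice, and the 4-dimensional input is a placeholder for an image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conditioner parameters. In a real coupling layer these would be
# learned neural networks producing scale (s) and translation (t).
W_s = rng.normal(scale=0.1, size=(2, 2))
W_t = rng.normal(scale=0.1, size=(2, 2))

def forward(x):
    """Affine coupling x -> z: transform x2 conditioned on x1."""
    x1, x2 = x[:2], x[2:]
    s = np.tanh(W_s @ x1)          # scale depends only on the untouched half
    t = W_t @ x1                   # translation likewise
    z2 = x2 * np.exp(s) + t        # elementwise affine map, trivially invertible
    return np.concatenate([x1, z2])

def inverse(z):
    """Exact inverse: recompute s, t from the untouched half and undo."""
    z1, z2 = z[:2], z[2:]
    s = np.tanh(W_s @ z1)
    t = W_t @ z1
    x2 = (z2 - t) * np.exp(-s)
    return np.concatenate([z1, x2])

x = rng.normal(size=4)
assert np.allclose(inverse(forward(x)), x)  # bijectivity holds by construction
```

Because half of the variables pass through unchanged, the inverse never needs to invert the conditioner networks themselves; this is what lets flow models stack arbitrarily expressive conditioners while remaining exactly invertible. RG-Flow restricts such transformations to act locally and arranges them hierarchically.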
Latent representations are introduced at different scales and capture the statistical features at the corresponding scales. Together, the latent representations of all scales can be jointly inverted to generate the data. This construction was recently proposed in the physics community as NeuralRG (Li & Wang, 2018; Hu et al., 2020). Our main contributions are two-fold: First, RG-Flow naturally separates the statistics of different scales in the input distribution and represents the information at each scale in its latent vari-

