COLD DIFFUSION: INVERTING ARBITRARY IMAGE TRANSFORMS WITHOUT NOISE

Abstract

Standard diffusion models involve an image transform (adding Gaussian noise) and an image restoration operator that inverts this degradation. We observe that the generative behavior of diffusion models is not strongly dependent on the choice of image degradation, and in fact an entire family of generative models can be constructed by varying this choice. Even when using completely deterministic degradations (e.g., blur, masking, and more), the training and test-time update rules that underlie diffusion models can be easily generalized to create generative models. The success of these fully deterministic models calls into question the community's understanding of diffusion models, which relies on noise in either gradient Langevin dynamics or variational inference, and paves the way for generalized diffusion models that invert arbitrary processes.

[Figure: panels show Original → (Forward) Degraded → (Reverse) Generated for six transforms: Noise, Blur, Animorph, Mask, Pixelate, and Snow.]

Figure 1: Demonstration of the forward and backward processes for both hot and cold diffusions. While standard diffusions are built on Gaussian noise (top row), we show that generative models can be built on arbitrary and even noiseless/cold image transforms, including the ImageNet-C snowification operator and an animorphosis operator that adds a random animal image from AFHQ.

1. INTRODUCTION

Diffusion models have recently emerged as powerful tools for generative modeling (Ramesh et al., 2022). Diffusion models come in many flavors, but all are built around the concept of random noise removal; one trains an image restoration/denoising network that accepts an image contaminated with Gaussian noise and outputs a denoised image. At test time, the denoising network is used to convert pure Gaussian noise into a photo-realistic image using an update rule that alternates between
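To make the alternating update rule concrete, the following is a minimal toy sketch (not the paper's implementation): a deterministic "cold" degradation `D(x, t)` built from repeated local averaging stands in for blur, and sampling alternates a restoration estimate with re-degradation at the next-lower severity. The `degrade` operator and the oracle restoration function are illustrative assumptions; in the paper the restoration is a trained network.

```python
import numpy as np

def degrade(x, t):
    """Hypothetical deterministic degradation D(x, t): blur x with
    t steps of local averaging (a stand-in for a blur operator)."""
    out = x.astype(float).copy()
    for _ in range(t):
        out = 0.5 * out + 0.25 * (np.roll(out, 1) + np.roll(out, -1))
    return out

def naive_sample(x_T, T, restore):
    """Alternating update rule: estimate the clean image with the
    restoration operator R(x_t, t), then re-degrade to severity t-1."""
    x = x_T
    for t in range(T, 0, -1):
        x0_hat = restore(x, t)       # restoration step
        x = degrade(x0_hat, t - 1)   # re-apply degradation at level t-1
    return x

# Toy demonstration with an oracle restoration for illustration.
x0 = np.sin(np.linspace(0, 2 * np.pi, 64))
T = 10
x_T = degrade(x0, T)                 # fully degraded (heavily blurred) signal
oracle = lambda x, t: x0             # perfect restoration, for illustration only
x_rec = naive_sample(x_T, T, oracle)
assert np.allclose(x_rec, x0)        # at t=1 we degrade by 0 steps, recovering x0
```

With a perfect restoration operator the loop recovers the clean signal exactly; the paper's interest lies in how this alternation behaves when restoration is an imperfect learned network.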

