FREQUENCY DECOMPOSITION IN NEURAL PROCESSES

Abstract

Neural Processes are a powerful tool for learning representations of function spaces purely from examples, in a way that allows them to make predictions at test time conditioned on so-called context observations. The learned representations are finite-dimensional, while function spaces are infinite-dimensional, and so far it has been unclear how these representations are learned and what kinds of functions can be represented. We show that deterministic Neural Processes implicitly decompose the training signals into different frequency components, similar to a Fourier transform. In this context, we derive a theoretical upper bound on the maximum frequency Neural Processes can reproduce, depending on their representation size. This bound is confirmed empirically. Finally, we show that Neural Processes can be trained to represent only a subset of possible frequencies and suppress others, which turns them into programmable band-pass or band-stop filters.
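The upper bound on reproducible frequencies is reminiscent of classical sampling results: a signal described by a fixed, finite number of Fourier coefficients cannot contain arbitrarily high frequencies. As a purely illustrative analogy (a minimal numpy sketch, not code from our experiments), truncating a signal's Fourier representation to a small number of coefficients acts as a low-pass filter:

import numpy as np

N = 256                                      # samples over the unit interval
t = np.linspace(0.0, 1.0, N, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

k = 8                                        # size of the truncated representation
coeffs = np.fft.rfft(signal)
coeffs[k:] = 0.0                             # keep only the first k coefficients
reconstruction = np.fft.irfft(coeffs, n=N)

# The 3-cycle component survives, the 40-cycle component is suppressed.
print(np.abs(reconstruction - np.sin(2 * np.pi * 3 * t)).max())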

1. INTRODUCTION

Neural Processes (Garnelo et al., 2018a; b) are a class of models that can learn a distribution over functions, or more generally a function space. In contrast to many other approaches with the same goal, for example Bayesian Neural Networks, Neural Processes learn an explicit representation of such a function space, which allows them to condition their predictions on an arbitrary number of observations that only become available at test time (a minimal sketch of this conditioning mechanism follows the contribution list below). This representation is finite-dimensional, while function spaces are infinite-dimensional, and so far it has not been understood how Neural Processes bridge this gap and under what conditions they can do so successfully. Our work reveals how Neural Processes learn to represent infinite-dimensional function spaces in a finite-dimensional space and, in the process, describes the constraints and conditions that determine which function spaces can be represented.

We begin with the observation that prior work on learning on sets can be reinterpreted from a signal-processing perspective, which allows us to derive a theoretical upper bound on the frequencies, i.e. Fourier components, of functions that can be represented. We subsequently confirm this bound empirically, which suggests that the learned representations contain a notion of frequency. To investigate this hypothesis further, we visualize the learned representations, which reveals that Neural Processes can decompose a function space into different frequency components, essentially finding a representation in Fourier space without any explicit supervision on the representations to elicit such behaviour. As further evidence of this, we train Neural Processes to represent only certain frequencies, which results in them suppressing the frequencies that were not observed in the training data. Our contributions can be summarized as follows:¹

• We derive a theoretical upper bound on the signal frequency that Neural Processes of a given representation size can reconstruct. As we show, the bound is observed either in the expected way, by suppressing high frequencies, or by implicitly limiting the signal interval.

• We investigate the learned representations qualitatively, presenting evidence that Neural Processes perform a frequency decomposition of the function space, akin to a Fourier transform. This behaviour is not incentivized externally but rather emerges naturally.
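To make the conditioning mechanism referenced above concrete, the following is a minimal PyTorch sketch of a deterministic Neural Process in the spirit of the Conditional Neural Process of Garnelo et al. (2018a). The 1-D inputs and outputs, layer sizes, and mean aggregation are illustrative assumptions rather than the exact architecture used in our experiments: each context pair is encoded, the encodings are averaged into a finite-dimensional representation, and a decoder predicts target outputs conditioned on that representation.

import torch
import torch.nn as nn

class DeterministicNP(nn.Module):
    def __init__(self, repr_dim: int = 128, hidden: int = 128):
        super().__init__()
        # Encoder maps each context pair (x_c, y_c) to a latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, repr_dim)
        )
        # Decoder maps (representation, target input x_t) to a prediction y_t.
        self.decoder = nn.Sequential(
            nn.Linear(repr_dim + 1, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x_ctx, y_ctx, x_tgt):
        # x_ctx, y_ctx: [batch, num_context, 1]; x_tgt: [batch, num_target, 1]
        pairs = torch.cat([x_ctx, y_ctx], dim=-1)
        r = self.encoder(pairs).mean(dim=1)          # finite-dimensional representation
        r = r.unsqueeze(1).expand(-1, x_tgt.shape[1], -1)
        return self.decoder(torch.cat([r, x_tgt], dim=-1))

model = DeterministicNP()
x_c, y_c, x_t = torch.randn(4, 10, 1), torch.randn(4, 10, 1), torch.randn(4, 20, 1)
print(model(x_c, y_c, x_t).shape)                    # torch.Size([4, 20, 1])

Note that the entire context set is compressed into the fixed-size vector r; this finite-dimensional bottleneck is precisely what our analysis is concerned with.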



¹ The complete source code to reproduce our experiments is available at https://github.com/***

