UNSUPERVISED NON-PARAMETRIC SIGNAL SEPARATION USING BAYESIAN NEURAL NETWORKS

Abstract

Bayesian neural networks (BNNs) take the best from two worlds: that of flexible and scalable neural networks, and that of probabilistic graphical models, the latter allowing for a probabilistic interpretation of inference results. We take one more step towards the unification of these two domains and render the BNN an elementary unit of abstraction in the framework of probabilistic modeling, which allows us to promote well-known distributions to distribution fields. We use transformations to obtain field versions of several popular distributions and demonstrate the utility of our approach on the problem of signal/background separation. Starting from the prior knowledge that a certain region of space contains predominantly one of the components, we recover, in an unsupervised and non-parametric manner, the representations of both previously unseen components as well as their proportions.

1. INTRODUCTION

Neural networks as predictive models have been wildly successful across a variety of domains, be it image recognition or language modeling. And while they may be used to make predictions on previously unseen samples, one of the fundamental weaknesses of traditional neural networks is their inability to quantify prediction uncertainty. Evaluation of prediction uncertainty is important in basic research (identification of fundamental laws), reinforcement learning (identification of value functions), anomaly detection, etc. Uncertainty quantification in neural networks has been addressed from both the frequentist (see, for instance, Pearce et al. (2018)) and Bayesian (Kendall and Gal (2017)) sides. In the Bayesian setting it was naturally proposed to promote the weights of neural layers to normally distributed random variables (MacKay (1992)). Later it was shown that the learnt uncertainty in the weights improves generalization in non-linear regression problems, and that it can be applied to drive the exploration-exploitation trade-off in reinforcement learning (Blundell et al. (2015)). Depeweg et al. (2017) designed a method for separating uncertainty into epistemic and aleatoric components. Epistemic uncertainty expresses uncertainty about the model itself and can be reduced with additional observations, whereas aleatoric uncertainty captures the noise inherent in the data and cannot. In physics the former and the latter are referred to as the systematic and statistical uncertainties, respectively. By treating both types of uncertainty within the same framework, the authors essentially bridged the gap towards graphical models. Graphical models, unlike traditional neural networks, are probabilistic in nature and allow for the incorporation of prior beliefs about models. They are flexible in representing various processes and allow for the introduction of latent degrees of freedom.
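The weight-level treatment above can be sketched in a few lines. The following is a minimal illustrative example, not the paper's implementation: all names, shapes, and the softplus parameterization of the weight scales are our own assumptions, in the spirit of Blundell et al. (2015). A single linear layer with normally distributed weights is sampled by Monte Carlo, and the predictive variance is split into an epistemic part (spread across weight draws) and an aleatoric part (observation noise).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Bayesian linear layer: each weight is a normal random
# variable with mean mu and stddev sigma (shapes are illustrative).
n_in, n_out, n_samples = 3, 1, 500
mu = rng.normal(size=(n_in, n_out))      # posterior means of the weights
rho = rng.normal(size=(n_in, n_out))     # unconstrained scale parameters
sigma = np.log1p(np.exp(rho))            # softplus keeps sigma positive
noise_std = 0.1                          # assumed aleatoric (observation) noise

x = rng.normal(size=(10, n_in))          # a batch of inputs

# Monte Carlo over weight samples: each draw is one plausible network.
preds = np.stack([
    x @ (mu + sigma * rng.normal(size=mu.shape))
    for _ in range(n_samples)
])                                       # shape: (n_samples, 10, n_out)

# Decomposition of the predictive variance into the two components
# discussed in the text (Depeweg et al., 2017):
epistemic = preds.var(axis=0)            # reducible with more data
aleatoric = np.full_like(epistemic, noise_std**2)  # irreducible noise
total = epistemic + aleatoric
```

In a full variational treatment, mu and rho would be learned by maximizing the evidence lower bound rather than drawn at random as here.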
Initially, graphical models used various point distributions as building blocks, while mostly the normal distribution has been promoted to a random field, in the notable example of Gaussian random fields. In this work we propose using Bayesian neural networks (BNNs) as building blocks in graphical models and demonstrate the power of the synthesis of probabilistic graphical models (PGMs) and BNNs on a synthetic example of signal/background separation. As a demonstration of our approach we propose an additive mixture model: a superposition of signal and background spectra whose proportions vary in space. During inference we are able to learn the proportions of signal and background, as well as their spectral shapes, which match the ground-truth values to adequate precision. The paper is organized as follows. In Section 2 we recapitulate the feed-forward (vanilla) BNN and the variational inference approach. In Section 3 we present the transformations of a vanilla BNN that allow us to emulate various distribution fields. To illustrate the power of composition of transformed

