BAYESIAN NEURAL NETWORKS WITH VARIANCE PROPAGATION FOR UNCERTAINTY EVALUATION

Abstract

Uncertainty evaluation is a core technique when deep neural networks (DNNs) are used in real-world problems. In practical applications, we often encounter unexpected samples that have not been seen in the training process. Not only achieving high prediction accuracy but also detecting uncertain data is significant for safety-critical systems. In statistics and machine learning, Bayesian inference has been exploited for uncertainty evaluation. Bayesian neural networks (BNNs) have recently attracted considerable attention in this context, as a DNN trained using dropout can be interpreted as a Bayesian method. Based on this interpretation, several methods to calculate the Bayes predictive distribution for DNNs have been developed. Although the Monte-Carlo method called MC dropout is a popular method for uncertainty evaluation, it requires a large number of repeated feed-forward calculations of DNNs with randomly sampled weight parameters. To overcome this computational issue, we propose a sampling-free method to evaluate uncertainty. Our method converts a neural network trained using dropout into the corresponding Bayesian neural network with variance propagation. The method applies not only to feed-forward NNs but also to recurrent NNs, including LSTMs. We report the computational efficiency and statistical reliability of our method in numerical experiments on language modeling using RNNs and on out-of-distribution detection with DNNs.
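As background for the MC dropout baseline discussed above, the following is a minimal sketch of its mechanics: dropout masks are kept active at test time, and the predictive mean and variance are estimated from repeated stochastic forward passes. The toy one-hidden-layer network, its random weights, and the dropout rate are assumptions chosen purely for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer regression network; weights are arbitrary for illustration.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))

def forward(x, p=0.5):
    """One stochastic forward pass with a fresh dropout mask (MC dropout)."""
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    mask = rng.random(h.shape) < (1.0 - p)    # keep each unit with prob. 1 - p
    h = h * mask / (1.0 - p)                  # inverted-dropout rescaling
    return h @ W2

x = rng.normal(size=(1, 4))                   # a single test input
T = 1000                                      # number of stochastic passes
samples = np.stack([forward(x) for _ in range(T)])

pred_mean = samples.mean(axis=0)              # Monte-Carlo predictive mean
pred_var = samples.var(axis=0)                # predictive variance = uncertainty
```

The T forward passes are exactly the repeated computation that the proposed variance-propagation method avoids by propagating means and variances through the network in a single pass.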

1. INTRODUCTION

Uncertainty evaluation is a core technique in practical applications of deep neural networks (DNNs). As an example, consider Cyber-Physical Systems (CPS) such as automated driving systems. In the past decade, machine learning methods have been widely utilized to realize the environment-perception and path-planning components of a CPS. In particular, the automated driving system has drawn considerable attention as a safety-critical and real-time CPS (NITRD CPS Senior Steering Group, 2012; Wing, 2009). In the automated driving system, the environment-perception component is built using DNN-based predictive models. In real-world applications, the CPS is required to deal with unexpected samples that have not been seen in the training process. Therefore, not only achieving high prediction accuracy under the ideal environment but also providing uncertainty evaluation for real-world data is significant for safety-critical systems (Henne et al., 2019). The CPS should prepare options such as rejecting the recommended action and prompting the user's intervention when the uncertainty is high. Such an interactive system is necessary to build fail-safe systems (Varshney & Alemzadeh, 2017; Varshney, 2016). On the other hand, uncertainty evaluation is also useful for enhancing the efficiency of learning algorithms: samples with high uncertainty are thought to convey important information for training networks. Active data selection based on uncertainty has been studied for a long time under the name of active learning (David et al., 1996; Gal et al., 2017; Holub et al., 2008; Li & Guo, 2013; Shui et al., 2020). In statistics and machine learning, Bayesian estimation has been commonly exploited for uncertainty evaluation (Bishop, 2006). In the Bayesian framework, prior knowledge is represented as the prior distribution of the statistical model. The prior distribution is updated to the posterior distribution based on observations.
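The prior-to-posterior update described above can be made concrete with the simplest conjugate case: a Beta prior over the success probability of Bernoulli observations. The coin-flip setting, the uniform Beta(1, 1) prior, and the observation sequence below are assumptions chosen only to illustrate how observations sharpen a prior belief.

```python
# Conjugate Beta-Bernoulli update: each observation y updates the
# Beta(alpha, beta) prior to the posterior Beta(alpha + y, beta + 1 - y).
alpha, beta_ = 1.0, 1.0                  # uniform Beta(1, 1) prior
observations = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical binary outcomes

for y in observations:                   # sequential Bayesian updating
    alpha += y
    beta_ += 1 - y

post_mean = alpha / (alpha + beta_)      # posterior mean of the success prob.
post_var = (alpha * beta_) / ((alpha + beta_) ** 2 * (alpha + beta_ + 1.0))
```

With 6 successes in 8 trials the posterior is Beta(7, 3), so the posterior mean is 0.7; the posterior variance shrinks as more observations arrive, which is the epistemic-uncertainty reduction that BNNs aim to capture at the scale of network weights.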
The epistemic model uncertainty is represented in the prior distribution,

