Islands of Confidence: Robust Neural Network Classification with Uncertainty Quantification

Abstract

We propose a Gaussian confidence measure, and its optimization, for use in neural network classifiers. The measure comes with theoretical results, simultaneously addressing two pressing problems in deep neural network classification: uncertainty quantification and robustness. Existing research in uncertainty quantification mostly revolves around the confidence reflected in the input feature space. Instead, we focus on the learned representation of the network and analyze the confidence in the penultimate-layer space. We formally prove that, independent of optimization-procedural effects, a set of centroids always exists such that softmax classifiers are nearest-centroid classifiers. Softmax confidence, however, does not reflect that the classification is based on nearest centroids: artificially inflated confidence is also given to out-of-distribution samples that are not near any centroid, but merely slightly less distant from one centroid than from the others. Our new confidence measure is centroid-based, and hence no longer suffers from this artificial confidence inflation on out-of-distribution samples. We also show that our proposed centroidal confidence measure provides a robustness certificate against attacks. As such, it manages to reflect what the model does not know (as demanded by uncertainty quantification), and to resolve the issue of robustness of neural networks.

1. Introduction

The last layer of state-of-the-art neural networks computes the final classifications by approximation through the softmax function (Boltzmann, 1868). This function partitions the transformed input space into Voronoi cells, each of which encompasses a single class. Conceptually, this is equivalent to placing a number of centroids in this transformed space and clustering the data points in the dataset by proximity to these centroids, as in k-means. Several recent papers posited that exploring the relation between softmax and k-means can be beneficial (Kilinc & Uysal, 2018; Peng et al., 2018; Schilling et al., 2018). The current state of scientific knowledge on the relation between k-means and softmax is empirical. In this paper, we theoretically prove that softmax is a centroid-based classifier, and we derive a centroid-based robustness certificate. This certificate motivates the usage of a confidence measure¹, the Gauss confidence, which reflects the distance of observations to their assigned centroids. Gauss confidence therefore expresses the uncertainties of the model; moreover, it indicates vulnerabilities to attacks. We show that our Gauss networks can match (median absolute difference: 0.45 percentage points) the test accuracy of softmax networks, but at a lower confidence (as desired); both outperform the competing DUQ networks (van Amersfoort et al., 2020) when the dataset has many classes. The lower confidence also results in Gauss networks being much less susceptible to adversarial attacks. Hence, the islands of confidence illustrated in the rightmost plot of Figure 1 reflect reality much better than the confidence landscapes of existing methods (cf. the other two plots in Figure 1).
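The softmax-as-nearest-centroid view, and the contrast between softmax confidence and a centroid-based one, can be illustrated numerically. The following sketch uses an assumed parameterization (weights equal to centroids, biases equal to minus half the squared centroid norms, unit variance for the Gaussian score); the specific centroids and test points are illustrative, not taken from the paper:

```python
import numpy as np

# Two illustrative class centroids in a 2-D penultimate-layer space.
centroids = np.array([[0.0, 0.0],
                      [4.0, 0.0]])

def softmax_logits(z):
    # A linear softmax layer with weights w_c = mu_c and biases
    # b_c = -||mu_c||^2 / 2. Since w_c.z + b_c = ||z||^2/2 - ||z - mu_c||^2/2
    # and ||z||^2/2 is shared across classes, its argmax is the nearest centroid.
    return centroids @ z - 0.5 * np.sum(centroids**2, axis=1)

def softmax_confidence(z):
    logits = softmax_logits(z)
    e = np.exp(logits - logits.max())
    return (e / e.sum()).max()

def gauss_confidence(z):
    # Illustrative centroid-based confidence: decays with the squared
    # distance to the *nearest* centroid (unit variance assumed).
    d2 = np.sum((centroids - z)**2, axis=1)
    return np.exp(-0.5 * d2.min())

in_dist = np.array([0.2, 0.1])   # near the first centroid
ood = np.array([3.0, 50.0])      # far from both, slightly closer to the second

print(softmax_confidence(in_dist), gauss_confidence(in_dist))  # both high
print(softmax_confidence(ood), gauss_confidence(ood))          # softmax high, Gauss near zero
```

The out-of-distribution point is far from both centroids, yet softmax still assigns it high confidence because it is somewhat less distant from one centroid than from the other; the distance-based score instead collapses toward zero, which is the behavior the paper's islands of confidence depict.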



¹ Previously published informally (non-peer-reviewed) in January as (NN et al., 2020).

