PAC CONFIDENCE PREDICTIONS FOR DEEP NEURAL NETWORK CLASSIFIERS

Abstract

A key challenge for deploying deep neural networks (DNNs) in safety-critical settings is the need to provide rigorous ways to quantify their uncertainty. In this paper, we propose a novel algorithm for constructing predicted classification confidences for DNNs that comes with provable correctness guarantees. Our approach uses Clopper-Pearson confidence intervals for the binomial distribution in conjunction with the histogram binning approach to calibrated prediction. In addition, we demonstrate how our predicted confidences can be used to enable downstream guarantees in two settings: (i) fast DNN inference, where we demonstrate how to compose a fast but inaccurate DNN with an accurate but slow DNN in a rigorous way to improve performance without sacrificing accuracy, and (ii) safe planning, where we guarantee safety when using a DNN to predict whether a given action is safe based on visual observations. In our experiments, we demonstrate that our approach can be used to provide guarantees for state-of-the-art DNNs.

1. INTRODUCTION

Due to the recent success of machine learning, there has been increasing interest in using predictive models such as deep neural networks (DNNs) in safety-critical settings, such as robotics (e.g., obstacle detection (Ren et al., 2015) and forecasting (Kitani et al., 2012)) and healthcare (e.g., diagnosis (Gulshan et al., 2016; Esteva et al., 2017) and patient care management (Liao et al., 2020)). One of the key challenges is the need to provide guarantees on the safety or performance of DNNs used in these settings. Some failures are unavoidable when using DNNs, since they will inevitably make mistakes in their predictions. Thus, rather than trying to eliminate mistakes, our goal is to design tools for quantifying the uncertainty of these models; then, the overall system can estimate and account for the risk inherent in using their predictions. For instance, a medical decision-making system may want to fall back on a doctor when it is uncertain whether its diagnosis is correct, and a robot may want to stop moving and ask a human for help if it is uncertain whether it can act safely. Uncertainty estimates can also be useful for human decision-makers, e.g., for a doctor deciding whether to trust their intuition over the predicted diagnosis. While many DNNs provide confidences in their predictions, especially in the classification setting, these are often overconfident. This phenomenon most likely arises because DNNs are trained to overfit the training data (e.g., to avoid local minima (Safran & Shamir, 2018)), which results in the predicted probabilities on the training data being very close to one for the correct prediction. Recent work has demonstrated how to recalibrate these confidences to significantly reduce overconfidence (Guo et al., 2017). Intuitively, these techniques rescale the confidences on a held-out calibration set; because they fit only a small number of parameters, they do not overfit the data the way the original DNN training does.
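The recalibration heuristics referenced above (Guo et al., 2017) can be illustrated with temperature scaling, which fits a single parameter on a held-out calibration set. The sketch below is a minimal illustration only (our own function name, and a grid search over the temperature in place of gradient-based fitting), not part of the method proposed in this paper:

```python
import numpy as np

def temperature_scale(logits, labels, temps=np.linspace(0.5, 5.0, 46)):
    """Pick the temperature T minimizing negative log-likelihood (NLL)
    on a held-out calibration set, via a simple grid search."""
    best_T, best_nll = 1.0, np.inf
    for T in temps:
        z = logits / T
        z = z - z.max(axis=1, keepdims=True)  # stable log-softmax
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        nll = -log_probs[np.arange(len(labels)), labels].mean()
        if nll < best_nll:
            best_T, best_nll = T, nll
    return best_T
```

Dividing the logits by a single learned T > 1 softens overconfident predictions while leaving the predicted label unchanged, which is why such a small parameterization does not overfit the calibration data.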
However, these techniques do not provide theoretical guarantees on the correctness of the predicted confidences, which can be necessary in safety-critical settings. We propose a novel algorithm for calibrated prediction in the classification setting that provides theoretical guarantees on the predicted confidences. We focus on on-distribution guarantees, i.e., where the test distribution is the same as the training distribution. In this setting, we can build on ideas from statistical learning theory to provide probably approximately correct (PAC) guarantees (Valiant, 1984). Our approach is based on a calibrated prediction technique called histogram binning (Zadrozny & Elkan, 2001), which rescales the confidences by binning them and then rescaling each bin independently. We use Clopper-Pearson bounds on the tails of the binomial distribution to obtain PAC upper/lower bounds on the predicted confidences. Next, we study how our algorithm enables theoretical guarantees in two applications. First, we consider the problem of speeding up DNN inference by composing a fast but inaccurate model with a slow but accurate model, i.e., using the accurate model for inference only when the fast model's predicted confidence is too low (Teerapittayanon et al., 2016). We use our algorithm to obtain guarantees on the accuracy of the composed model. Second, for safe planning, we consider using a DNN to predict whether a given action (e.g., move forward) is safe (e.g., does not run into obstacles) given an observation (e.g., a camera image). The robot only continues to act if the predicted confidence is above some threshold. We use our algorithm to ensure safety with high probability. Finally, we evaluate the efficacy of our approach in the context of these applications.

Related work. Calibrated prediction (Murphy, 1972; DeGroot & Fienberg, 1983; Platt, 1999) has recently gained attention as a way to improve DNN confidences (Guo et al., 2017).
Histogram binning is a non-parametric approach that sorts the data into finitely many bins and rescales the confidences per bin (Zadrozny & Elkan, 2001; 2002; Naeini et al., 2015). However, traditional approaches do not provide theoretical guarantees on the predicted confidences. There has been work on predicting confidence sets (i.e., predicting a set of labels instead of a single label) with theoretical guarantees (Park et al., 2020a), but this approach does not provide the confidence of the most likely prediction, as is often desired. There has also been work providing guarantees on the overall calibration error (Kumar et al., 2019), but this approach does not provide per-prediction guarantees. There has also been work on speeding up DNN inference (Hinton et al., 2015). One approach is to allow intermediate layers to be dynamically skipped (Teerapittayanon et al., 2016; Figurnov et al., 2017; Wang et al., 2018), which can be thought of as composing multiple models that share a backbone. Unlike our approach, these methods do not provide guarantees on the accuracy of the composed model. Finally, there has been work on safe learning-based control (Akametalu et al., 2014; Fisac et al., 2019; Bastani, 2019; Li & Bastani, 2020; Wabersich & Zeilinger, 2018; Alshiekh et al., 2018); however, these approaches are not applicable to perception-based control. The most closely related work is Dean et al. (2019), which handles perception but is restricted to known linear dynamics.
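The fast/slow model composition discussed above can be sketched as a simple thresholded cascade. The function and its inputs below are illustrative placeholders for the two models' batch outputs, not the guaranteed construction developed later in the paper:

```python
import numpy as np

def cascade_predict(fast_conf, fast_pred, slow_pred, threshold):
    """Use the fast model's prediction whenever its predicted confidence
    clears the threshold; otherwise fall back to the slow, accurate model.
    Returns the composed predictions and the fraction of examples served
    by the fast model (a proxy for the speedup)."""
    fast_conf = np.asarray(fast_conf)
    use_fast = fast_conf >= threshold
    return np.where(use_fast, fast_pred, slow_pred), use_fast.mean()
```

The accuracy guarantee for such a cascade hinges on the fast model's confidences being trustworthy, which is exactly what the PAC bounds developed in this paper provide.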

2. PAC CONFIDENCE PREDICTION

In this section, we begin by formalizing the PAC confidence prediction problem; then, we describe our algorithm for solving this problem based on histogram binning.

Calibrated prediction. Let x ∈ X be an example and y ∈ Y be one of a finite set of labels, and let D be a distribution over X × Y. A confidence predictor is a model f : X → P_Y, where P_Y is the space of probability distributions over labels. In particular, f(x)_y is the predicted confidence that the true label for x is y. We let ŷ : X → Y be the corresponding label predictor, i.e., ŷ(x) := arg max_{y∈Y} f(x)_y, and let p : X → R≥0 be the corresponding top-label confidence predictor, i.e., p(x) := max_{y∈Y} f(x)_y. While traditional DNN classifiers are confidence predictors, a naively trained DNN is not reliable, i.e., the predicted confidence does not match the true confidence; recent work has studied heuristics for improving reliability (Guo et al., 2017). In contrast, our goal is to construct a confidence predictor that comes with theoretical guarantees.

We first introduce the definition of calibration (DeGroot & Fienberg, 1983; Zadrozny & Elkan, 2002; Park et al., 2020b), i.e., what we mean for a predicted confidence to be "correct". In many cases, the main quantity of interest is the confidence of the top prediction. Thus, we focus on ensuring that the top-label predicted confidence p(x) is calibrated (Guo et al., 2017); our approach can easily be extended to providing guarantees on all confidences predicted using f. Then, we say a confidence predictor f is well-calibrated with respect to distribution D if

P_{(x,y)∼D}[y = ŷ(x) | p(x) = t] = t    (∀t ∈ [0, 1]).

That is, among all examples x such that the label prediction ŷ(x) has predicted confidence t = p(x), ŷ(x) is the correct label for exactly a t fraction of these examples. Using a change of variables (Park
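The Clopper-Pearson bounds our algorithm relies on can be computed directly from the exact binomial tails. The sketch below is our own illustrative implementation, using bisection on the tail probabilities rather than the usual beta-quantile formulation; it returns a two-sided (1 - alpha) interval for a binomial proportion from k successes in n trials:

```python
from math import comb

def binom_tail_ge(k, n, p):
    """P[Bin(n, p) >= k], computed exactly from the binomial pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided (1 - alpha) Clopper-Pearson interval for a binomial
    proportion, via bisection: each endpoint is the p at which the
    relevant exact tail probability equals alpha / 2."""
    def bisect(f, target):  # f must be increasing in p
        lo, hi = 0.0, 1.0
        for _ in range(60):  # 60 halvings: far below float precision
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if k == 0 else bisect(lambda p: binom_tail_ge(k, n, p), alpha / 2)
    upper = 1.0 if k == n else bisect(lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2)
    return lower, upper
```

Because these bounds are exact rather than asymptotic, they hold for any sample size n, which is what makes per-bin PAC guarantees possible even for sparsely populated bins.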


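Combining the two ingredients above, histogram binning with per-bin Clopper-Pearson lower bounds can be sketched as follows. Equal-width bins, a single alpha shared across bins, and reporting only lower bounds are simplifying assumptions for illustration, not the paper's exact construction:

```python
from math import comb
import numpy as np

def cp_lower(k, n, alpha):
    """One-sided (1 - alpha) Clopper-Pearson lower bound on a binomial
    proportion, via bisection on the exact upper tail P[Bin(n, p) >= k]."""
    if k == 0:
        return 0.0
    tail = lambda p: sum(comb(n, i) * p**i * (1 - p)**(n - i)
                         for i in range(k, n + 1))
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if tail(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo

def histogram_binning_lower(confidences, correct, n_bins=10, alpha=0.05):
    """Group calibration examples by predicted top-label confidence into
    equal-width bins; for each bin, report a Clopper-Pearson lower bound
    on the accuracy among examples falling in that bin. Empty bins get a
    trivial bound of 0."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(confidences, edges) - 1, 0, n_bins - 1)
    bounds = np.zeros(n_bins)
    for b in range(n_bins):
        mask = idx == b
        n = int(mask.sum())
        if n > 0:
            k = int(np.asarray(correct)[mask].sum())
            bounds[b] = cp_lower(k, n, alpha)
    return edges, bounds
```

At prediction time, an example whose raw confidence falls in bin b would be assigned that bin's bound, so the reported confidence is, with high probability, an underestimate of the true accuracy rather than an overestimate.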