PAC CONFIDENCE PREDICTIONS FOR DEEP NEURAL NETWORK CLASSIFIERS

Abstract

A key challenge for deploying deep neural networks (DNNs) in safety-critical settings is the need to provide rigorous ways to quantify their uncertainty. In this paper, we propose a novel algorithm for constructing predicted classification confidences for DNNs that comes with provable correctness guarantees. Our approach uses Clopper-Pearson confidence intervals for the Binomial distribution in conjunction with the histogram binning approach to calibrated prediction. In addition, we demonstrate how our predicted confidences can be used to enable downstream guarantees in two settings: (i) fast DNN inference, where we demonstrate how to compose a fast but inaccurate DNN with an accurate but slow DNN in a rigorous way to improve performance without sacrificing accuracy, and (ii) safe planning, where we guarantee safety when using a DNN to predict whether a given action is safe based on visual observations. In our experiments, we demonstrate that our approach can be used to provide guarantees for state-of-the-art DNNs.

1. INTRODUCTION

Due to the recent success of machine learning, there has been increasing interest in using predictive models such as deep neural networks (DNNs) in safety-critical settings, such as robotics (e.g., obstacle detection (Ren et al., 2015) and forecasting (Kitani et al., 2012)) and healthcare (e.g., diagnosis (Gulshan et al., 2016; Esteva et al., 2017) and patient care management (Liao et al., 2020)). One of the key challenges is the need to provide guarantees on the safety or performance of DNNs used in these settings. Some failures are unavoidable when using DNNs, since they will inevitably make some mistakes in their predictions. Rather than trying to eliminate these failures, our goal is to design tools for quantifying the uncertainty of these models; then, the overall system can estimate and account for the risk inherent in using the predictions made by these models. For instance, a medical decision-making system may want to fall back on a doctor when it is uncertain whether its diagnosis is correct, or a robot may want to stop moving and ask a human for help if it is uncertain whether it can act safely. Uncertainty estimates can also be useful for human decision-makers, e.g., for a doctor to decide whether to trust their intuition over the predicted diagnosis.

While many DNNs provide confidences in their predictions, especially in the classification setting, these are often overconfident. This phenomenon most likely arises because DNNs are designed to overfit the training data (e.g., to avoid local minima (Safran & Shamir, 2018)), which results in the predicted probabilities on the training data being very close to one for the correct prediction. Recent work has demonstrated how to calibrate the confidences to significantly reduce overconfidence (Guo et al., 2017). Intuitively, these techniques rescale the confidences on a held-out calibration set. Because they fit only a small number of parameters, they do not overfit the calibration data the way the original DNN training overfits the training data.
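To make the rescaling idea concrete, the following is a minimal sketch of temperature scaling, one of the calibration techniques studied by Guo et al. (2017): a single scalar T is fit on a held-out calibration set by minimizing the negative log-likelihood, and confidences are then computed from the logits divided by T. It illustrates this family of prior methods, not the algorithm proposed in this paper. The function names (`log_prob`, `nll`, `fit_temperature`, `confidence`), the search bounds, and the toy calibration set are assumptions made purely for illustration.

```python
import math


def log_prob(logits, label, temperature):
    """Log-softmax probability of `label` for logits / temperature (stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return scaled[label] - log_norm


def nll(logits_list, labels, temperature):
    """Average negative log-likelihood of the true labels at a given T."""
    total = sum(log_prob(z, y, temperature) for z, y in zip(logits_list, labels))
    return -total / len(labels)


def fit_temperature(logits_list, labels, lo=0.05, hi=20.0, iters=100):
    """Fit the single parameter T by ternary search over log(T).

    The NLL is convex in 1/T, hence unimodal in log(T), so ternary search
    applies. The bounds `lo` and `hi` are illustrative assumptions.
    """
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if nll(logits_list, labels, math.exp(m1)) < nll(logits_list, labels, math.exp(m2)):
            b = m2
        else:
            a = m1
    return math.exp((a + b) / 2)


def confidence(logits, temperature=1.0):
    """Predicted confidence: the largest softmax probability of logits / T."""
    return math.exp(max(log_prob(logits, k, temperature) for k in range(len(logits))))


# Toy held-out calibration set (hypothetical): the model always outputs the
# same confident logits for class 0 but is correct on only 3 of 4 examples.
calib_logits = [[4.0, 0.0]] * 4
calib_labels = [0, 0, 0, 1]

T = fit_temperature(calib_logits, calib_labels)
print(f"fitted T = {T:.3f}")
print(f"confidence before = {confidence([4.0, 0.0]):.3f}, "
      f"after = {confidence([4.0, 0.0], T):.3f}")
```

On this toy set the model is correct on 3 of 4 examples, so the fitted temperature lowers the reported confidence from about 0.98 to about 0.75, matching the empirical accuracy. Only the single parameter T is fit, which is why such methods are unlikely to overfit the calibration set.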
However, these techniques do not provide theoretical guarantees on their correctness, which can be necessary in safety-critical settings. We propose a novel algorithm for calibrated prediction in the classification setting that provides theoretical guarantees on the predicted confidences. We focus on on-distribution guarantees, i.e., where the test distribution is the same as the training distribution. In this setting, we can build on ideas from statistical learning theory to provide probably approximately correct (PAC) guarantees (Valiant, 1984). Our approach is based on a calibrated prediction technique called histogram

