INTERPRETABLE OUT-OF-DISTRIBUTION DETECTION USING PATTERN IDENTIFICATION

Abstract

Out-of-distribution (OoD) detection for data-based programs is a goal of paramount importance. Common approaches in the literature tend to train detectors requiring inside-of-distribution (in-distribution, or IoD) and OoD validation samples, and/or implement confidence metrics that are often abstract and therefore difficult to interpret. In this work, we propose to use existing work from the field of explainable AI, namely the PARTICUL pattern identification algorithm, in order to build more interpretable and robust OoD detectors for visual classifiers. Crucially, this approach does not require retraining the classifier and is tuned directly to the IoD dataset, making it applicable to domains where OoD does not have a clear definition. Moreover, pattern identification allows us to provide images from the IoD dataset as reference points to better explain the confidence scores. We demonstrate that the detection capabilities of this approach are on par with existing methods through an extensive benchmark across four datasets and two definitions of OoD. In particular, we introduce a new benchmark based on perturbations of the IoD dataset which provides a known and quantifiable evaluation of the discrepancy between the IoD and OoD datasets, and which serves as a reference value for the comparison between various OoD detection methods. Our experiments show that the robustness of all metrics under test does not solely depend on the nature of the IoD dataset or the OoD definition, but also on the architecture of the classifier, which stresses the need for thorough experimentation in future work on OoD detection.

1. INTRODUCTION

A fundamental aspect of software safety is arguably the modelling of its expected operational domain through a formal or semi-formal specification, giving clear boundaries on when it is sensible to deploy the program, and when it is not. It is however difficult to define such boundaries for machine learning programs, especially for visual classifiers based on artificial neural networks (ANNs), which are the subject of this paper. Indeed, such programs process high-dimensional data (images, videos) and are the result of a complex optimization procedure, but they do not embed clear failure modes that could be triggered in the case of an unknown distribution, with potentially dire consequences in critical applications. Although it is difficult to characterize an operational distribution, one could still measure its dissimilarity with other distributions. In this context, Out-of-Distribution (OoD) detection, which aims to detect whether an input of an ANN is inside-of-distribution (IoD) or outside of it, serves several purposes. It helps characterize the extent to which the ANN can operate outside a bounded dataset (which is important due to the incompleteness of the training set w.r.t. the operational domain). It also constitutes a surrogate measure of the generalization abilities of the ANN. Finally, OoD detection can help assess when an input is too far away from the operational domain, which prevents misuses of the program and increases its safety.

2. RELATED WORK AND CONTRIBUTION

Out-of-distribution detection The maximum class probability (MCP) obtained after softmax normalization of the classifier logits already constitutes a good baseline for OoD detection (Hendrycks & Gimpel, 2017). However, neural networks tend to be overconfident in their predictions Szegedy et al.

