A SIMPLE AND EFFECTIVE BASELINE FOR OUT-OF-DISTRIBUTION DETECTION USING AN ABSTENTION CLASS

Anonymous

Abstract

Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems. While simple to state, this has been a particularly challenging problem in deep learning, where models often end up making overconfident predictions in such situations. In this work we present a simple but highly effective approach to out-of-distribution detection that uses the principle of abstention: when encountering a sample from an unseen class, the desired behavior is to abstain from predicting. Our approach uses a network with an extra abstention class and is trained on a dataset augmented with an uncurated set consisting of a large number of out-of-distribution (OoD) samples that are assigned the label of the abstention class; the model is then trained to learn an effective discriminator between in-distribution and out-of-distribution samples. We compare this relatively simple approach against a wide variety of more complex methods that have been proposed both for out-of-distribution detection and for uncertainty modeling in deep learning, and empirically demonstrate its effectiveness on a wide variety of benchmarks and deep architectures for image recognition and text classification, often outperforming existing approaches by significant margins. Given the simplicity and effectiveness of this method, we propose that it be used as a new additional baseline for future work in this domain.

1. INTRODUCTION AND RELATED WORK

Most of supervised machine learning has been developed under the assumption that the distribution of classes seen at train and test time is the same. However, the real world is unpredictable and open-ended, and making machine learning systems robust to the presence of unknown categories and out-of-distribution samples has become increasingly essential for their safe deployment. While refraining from predicting when uncertain should be intuitively obvious to humans, the peculiarities of DNNs make them overconfident on unknown inputs Nguyen et al. (2015), making this a challenging problem to solve in deep learning. A very active sub-field of deep learning, known as out-of-distribution (OoD) detection, has emerged in recent years that attempts to impart to deep neural networks the quality of "knowing when it doesn't know". The most straightforward approach in this regard is based on using the DNN's output as a proxy for predictive confidence. For example, a simple baseline for detecting OoD samples using thresholded softmax scores was presented in Hendrycks & Gimpel (2016), where the authors provided empirical evidence that for DNN classifiers, in-distribution predictions do tend to have higher winning scores than OoD samples, thus empirically justifying the use of softmax thresholding as a useful baseline. However, this approach is vulnerable to the pathologies discussed in Nguyen et al. (2015). Subsequently, increasingly sophisticated methods have been developed to attack the OoD problem. Liang et al. (2018) introduced a detection technique that involves perturbing the inputs in the direction of increasing the confidence of the network's predictions on a given input, based on the observation that the magnitude of gradients on in-distribution data tends to be larger than for OoD data. The method proposed in Lee et al.
(2018) also involves input perturbation, but confidence in this case is measured by a Mahalanobis distance score using the computed mean and covariance of the pre-softmax scores. A drawback of such methods, however, is that they introduce a number of hyperparameters that need to be tuned on the OoD dataset, which is infeasible in many real-world scenarios as one does not often know in advance the properties of unknown classes. A modified version of the perturbation approach was recently proposed in Hsu et al. (2020) that circumvents some of these issues, though one still needs to ascertain an ideal perturbation magnitude, which might not generalize from one OoD set to another. Given that one might expect a classifier to be more uncertain when faced with OoD data, many methods developed for estimating uncertainty in DNN predictions have also been used for OoD detection. A useful baseline in this regard is the temperature scaling method of Guo et al. (2017), which was proposed for calibrating DNN predictions on in-distribution data and has been observed to also serve as a useful OoD detector in some scenarios. Training the model to recognize unknown classes by using data from categories that do not overlap with the classes of interest has been shown to be quite effective for out-of-distribution detection, and a slew of methods that use additional data for discriminating between in-distribution (ID) and OoD data have been proposed. DeVries & Taylor (2018) describe a method that uses a separate confidence branch and misclassified training data samples that serve as a proxy for OoD samples. In the outlier exposure technique described in Hendrycks et al. (2018), the predictions on natural outlier images used in training are regularized against the uniform distribution to encourage high-entropy posteriors on outlier samples. An approach that uses an extra class for outlier samples is described in Neal et al.
(2018), where instead of natural outliers, counterfactual images that lie just outside the class boundaries of known classes are generated using a GAN and assigned the extra class label. A similar approach using generated samples for the extra class, but using a conditional Variational Auto-Encoder Kingma & Welling (2013) for generation, is described in Vernekar et al. (2019). A method to force a DNN to produce high-entropy (i.e., low-confidence) predictions and suppress the magnitude of feature activations for OoD samples was discussed in Dhamija et al. (2018): arguing that methods that use an extra background class for OoD samples force all such samples to lie in one region of the feature space, that work also enforces separation by suppressing the activation magnitudes of samples from unknown classes. The above works have shown that the use of known OoD samples (or known unknowns) often generalizes well to unknown unknown samples. Indeed, even though the space of unknown classes is potentially infinite, and one can never know in advance the myriad of inputs that can occur at test time, this approach has empirically been shown to work. The abstention method that we describe in the next section borrows ideas from many of the above methods: as in Hendrycks et al. (2018), we use additional samples of real images and text from non-overlapping categories to train the model to abstain, but instead of entropy regularization over OoD samples, our method uses an extra abstention class. While it has sometimes been argued in the literature that using an additional abstention (or rejection) class is not an effective approach for OoD detection Dhamija et al. (2018); Lee et al. (2017), comprehensive experiments we conduct in this work demonstrate that this is not the case.
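To make the contrast with entropy regularization concrete, the outlier exposure penalty of Hendrycks et al. (2018) can be viewed as a cross-entropy between the uniform distribution and the model's predictions on outlier samples, minimized (at log K) when the prediction is maximally uncertain. The sketch below is an illustrative numpy-only mock-up under our own naming, not the authors' released implementation:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def outlier_exposure_penalty(outlier_logits: np.ndarray) -> float:
    """Cross-entropy H(uniform, p) averaged over a batch of outlier
    samples; its minimum value, log K, is attained when the model's
    predictions on outliers are uniform over the K classes."""
    probs = softmax(outlier_logits)
    return float(-np.log(probs).mean())
```

In outlier exposure this term is added, with a weighting coefficient, to the usual cross-entropy on in-distribution data; the abstention approach instead folds outliers directly into training through an extra class label and needs no such coefficient.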
Indeed, we find that such an approach is not only simple but also highly effective for OoD detection, often outperforming existing methods that are more complicated and involve tuning of multiple hyperparameters. The main contributions of this work are as follows:
• To the best of our knowledge, this is the first work to comprehensively demonstrate the efficacy of using an extra abstention (or rejection) class in combination with outlier training data for effective OoD detection.
• In addition to being effective, our method is also simple: we introduce no additional hyperparameters in the loss function, and train with regular cross entropy. From a practical standpoint, this is especially useful for deep learning practitioners who might not wish to deal with extensive hyperparameter tuning.
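The setup just described can be sketched in a few lines. The following numpy-only mock-up (class counts and function names are illustrative, not from a released implementation) shows the two essential pieces: outlier samples simply receive the label of the extra abstention class before ordinary cross-entropy training over K+1 outputs, and at test time the softmax mass on the abstention class serves as the OoD score:

```python
import numpy as np

NUM_CLASSES = 10        # in-distribution classes 0..9 (illustrative)
ABSTAIN = NUM_CLASSES   # index of the extra abstention class

def build_training_labels(in_dist_labels, num_outliers: int) -> np.ndarray:
    """Augment the in-distribution labels with outlier samples, all of
    which receive the abstention-class label; the model is then trained
    with ordinary cross-entropy over NUM_CLASSES + 1 outputs."""
    outlier_labels = np.full(num_outliers, ABSTAIN, dtype=np.int64)
    return np.concatenate([np.asarray(in_dist_labels, dtype=np.int64),
                           outlier_labels])

def ood_score(logits: np.ndarray) -> np.ndarray:
    """At test time, the softmax probability assigned to the abstention
    class is used directly as the out-of-distribution score."""
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs[:, ABSTAIN]
```

No loss-function hyperparameters are introduced: the outliers are just additional labeled training data, and any framework's standard (K+1)-way cross-entropy applies unchanged.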



Further, label smoothing techniques like mixup Zhang et al. (2017) have also been shown to improve OoD detection performance in DNNs Thulasidasan et al. (2019). An ensemble-of-deep-models approach, augmented with adversarial examples during training and described in Lakshminarayanan et al. (2017), was also shown to improve predictive uncertainty and was successfully applied to OoD detection. In the Bayesian realm, methods such as Maddox et al. (2019) and Osawa et al. (2019) have also been used for OoD detection, though at increased computational cost. However, it has been argued that for OoD detection, Bayesian priors on the data are not completely justified, since one does not have access to the prior of the open set Boult et al. (2019). Nevertheless, simple approaches like dropout, which has been shown to be equivalent to a deep Gaussian process Gal & Ghahramani (2016), have been used as baselines for OoD detection.
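As a concrete illustration of how such dropout-based baselines score inputs, one common recipe (sketched here with our own naming, in numpy) keeps dropout active at test time, averages the softmax outputs over T stochastic forward passes, and treats the entropy of the averaged prediction as the uncertainty signal, with higher entropy suggesting an out-of-distribution input:

```python
import numpy as np

def predictive_entropy(prob_samples: np.ndarray) -> np.ndarray:
    """Entropy of the mean prediction over T stochastic (dropout)
    forward passes; prob_samples has shape (T, batch, num_classes).
    Higher entropy is taken as evidence that an input is OoD."""
    mean_probs = prob_samples.mean(axis=0)  # average over the T passes
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
```

A threshold on this entropy then plays the same role as the softmax-score threshold in the Hendrycks & Gimpel (2016) baseline, at the cost of T forward passes per input.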

