UNLEASHING MASK: EXPLORE THE INTRINSIC OUT-OF-DISTRIBUTION DETECTION CAPABILITY

Anonymous

Abstract

Out-of-distribution (OOD) detection is an important aspect of safely deploying machine learning models in real-world applications. Previous approaches either design better scoring functions or utilize the knowledge of outliers to equip well-trained models with the ability of OOD detection. However, few of them attempt to excavate the intrinsic OOD detection capability of a given model. In this work, we discover that a model trained on in-distribution data passes through an intermediate stage with higher OOD detection performance than its final stage, consistently across different settings, and we further identify the critical cause to be learning with atypical samples. Based on these empirical insights, we propose a new method, Unleashing Mask (UM), that restores the OOD discriminative capability of the model. Specifically, we utilize a mask to identify the memorized atypical samples and fine-tune the model to forget them. Extensive experiments have been conducted to characterize and verify the effectiveness of our method.

1. INTRODUCTION

Out-of-distribution (OOD) detection has drawn increasing attention when deploying machine learning models in open-world scenarios (Nguyen et al., 2015; Lee et al., 2018a). Since test samples can naturally arise from a label-different distribution, identifying OOD inputs is important, especially for safety-critical applications like autonomous driving and medical intelligence. Previous studies focus on designing a series of scoring functions (Hendrycks & Gimpel, 2017b; Liang et al., 2018; Lee et al., 2018a; Liu et al., 2020; Sun et al., 2021; 2022) for OOD uncertainty estimation, or on fine-tuning with auxiliary outlier data to better distinguish OOD inputs (Hendrycks et al., 2019c; Tack et al., 2020; Mohseni et al., 2020; Sehwag et al., 2021; Wei et al., 2022; Ming et al., 2022). Despite the promising results achieved by previous methods (Hendrycks & Gimpel, 2017a; Hendrycks et al., 2019c; Liu et al., 2020; Ming et al., 2022), little attention has been paid to whether the well-trained given model is the most appropriate one for OOD detection. In general, models deployed for various applications are trained for different targets (e.g., multi-class classification) (Goodfellow et al., 2016) rather than OOD detection (Nguyen et al., 2015; Lee et al., 2018a). However, most representative scoring functions, e.g., MSP (Hendrycks et al., 2019c), ODIN (Liang et al., 2018), and Energy (Liu et al., 2020), uniformly leverage the given models for OOD detection. Considering this target-oriented discrepancy, a critical question arises: does the well-trained given model have the optimal OOD detection capability? If not, how can we find a more appropriate model for OOD detection?

In this work, we start by revealing an important observation (as illustrated in Figure 1): there exists a historical training stage where the model has higher OOD detection performance than the final well-trained one.
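As a concrete illustration of the post-hoc scores mentioned above, the sketch below computes the MSP and Energy scores from a model's logits. The function names and the NumPy implementation are ours, not from the paper, but the formulas follow the cited works: MSP takes the maximum softmax probability, and the energy score is E(x) = -T log Σ_k exp(f_k(x)/T), with lower energy indicating more ID-like inputs.

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability: higher means more ID-like."""
    z = np.asarray(logits, dtype=np.float64)
    z = z - z.max(axis=-1, keepdims=True)           # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

def energy_score(logits, temperature=1.0):
    """Energy score E(x) = -T * logsumexp(f(x) / T): lower means more ID-like."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    m = z.max(axis=-1, keepdims=True)               # numerical stability
    logsumexp = np.log(np.exp(z - m).sum(axis=-1)) + m.squeeze(-1)
    return -temperature * logsumexp
```

For instance, a confident logit vector such as `[10, 0, 0]` yields a much lower energy than a diffuse one such as `[1, 1, 1]`, which is exactly the gap these detectors threshold on.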
This is generally true across different OOD/ID datasets (Netzer et al., 2011; Van Horn et al., 2018; Cimpoi et al., 2014), learning rate schedules (Loshchilov & Hutter, 2017), and model structures (Huang et al., 2017; Zagoruyko & Komodakis, 2016). The empirical results in Figure 1 reflect an inconsistency between gaining better OOD detection capability (Nguyen et al., 2015) and pursuing better performance on ID data. We delve into the differences between the intermediate and final models by visualizing the misclassified examples. As shown in Figure 2, one possible cause of the covered detection capability is the memorization of atypical samples (at the semantic level) that are hard for the model to learn. Seeking zero error on those samples makes the model more confident on OOD data (see Figures 1(b) and 1(c)).

The above analysis inspires us to propose a new strategy, namely Unleashing Mask (UM), to excavate the once-covered detection capability of a well-trained given model by alleviating the memorization of those atypical ID samples (as illustrated in Figure 3). In general, we aim to backtrack to a previous stage with better OOD detection capability. Achieving this target raises two essential issues: (1) the model well-trained on ID data has already memorized some atypical samples, which need to be identified; and (2) how can we make the given model forget those memorized atypical samples? Accordingly, our proposed UM contains two parts that utilize different insights to address these problems. First, since atypical samples are more sensitive to changes in model parameters, we initialize a mask with a specific cutting rate to mine these samples via the constructed discrepancy. Then, with the loss reference estimated by the mask, we conduct constrained gradient ascent (i.e., Eq. 3) for model forgetting, which encourages the model to finally stabilize around the optimal stage.
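A minimal toy sketch of the forgetting step just described, under our own simplifying assumptions (the paper's actual objective is its Eq. 3): treating the masked-model loss as a fixed reference `loss_ref`, one can take gradient steps on |L(θ) - loss_ref|, which descend when the ID loss is above the reference and ascend (i.e., forget) when it is below, so the loss stabilizes around the reference rather than collapsing to zero. The 1-D `loss_fn`/`grad_fn` interface is hypothetical and only for illustration.

```python
import numpy as np

def um_step(params, loss_fn, grad_fn, loss_ref, lr=0.1):
    """One gradient step on |loss_fn(params) - loss_ref| (toy 1-D sketch).

    sign(L - loss_ref) flips the update direction: above the reference
    we descend as usual; below it we ascend (partially forget), so the
    ID loss hovers around the reference estimated with the mask.
    """
    sign = np.sign(loss_fn(params) - loss_ref)
    return params - lr * sign * grad_fn(params)

# Toy demonstration: quadratic "loss" q**2 with reference level 1.0.
p = 0.1                                    # starts well below the reference
for _ in range(200):
    p = um_step(p, lambda q: q * q, lambda q: 2.0 * q, loss_ref=1.0)
```

Starting from a near-zero loss, the iterate ascends until the loss overshoots the reference, then oscillates around it instead of diverging, which mirrors the "stabilize around the optimal stage" behavior described above.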
To avoid severely sacrificing the original task performance on ID data, we further propose UM Adopts Pruning (UMAP), which tunes the introduced mask, rather than the model weights, with a newly designed objective. We conduct extensive experiments to characterize and understand the working mechanism of our methods (in Section 4 and Appendix F), and the comprehensive results demonstrate their effectiveness. We have verified the effectiveness of UM on a series of OOD detection benchmarks with two different ID datasets, i.e., CIFAR-10 and CIFAR-100. Under various evaluation metrics, our UM, as well as UMAP, can indeed excavate the better OOD detection capability of given models, reducing the averaged FPR95 by a significant margin. Finally, a range of ablation studies and further discussions related to our proposed strategy are provided. We summarize our main contributions as follows:

• Conceptually, we explore OOD detection performance from a new perspective, i.e., backtracking the model training phase without regularization by any auxiliary outliers, different from most previous works that start with the model well-trained on ID data.

• Empirically, we reveal the potential detection capability of the well-trained model. We observe the general existence of an intermediate stage where the model has more appropriate discriminative features that can be utilized for OOD detection.

• Technically, we introduce a new strategy, i.e., Unleashing Mask, to excavate the once-covered OOD detection capability of a given model. By introducing the mask, we estimate the loss constraint for forgetting the atypical samples and empower the detection performance.

• Experimentally, we conduct extensive explorations to verify the general effectiveness of our methods in improving OOD detection performance. Using various ID and OOD benchmarks, we provide comprehensive results across different setups and further discussion.
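To illustrate the kind of parameter mask UM and UMAP operate with, the sketch below builds a binary magnitude-pruning mask that zeroes out a fraction `cutting_rate` of the smallest-magnitude weights. This is a generic pruning recipe of our own, not necessarily the paper's exact mask construction; it only shows the mechanism of cutting a fixed rate of parameters.

```python
import numpy as np

def magnitude_mask(weights, cutting_rate):
    """Binary mask that cuts the `cutting_rate` fraction of weights with
    the smallest magnitude (generic magnitude pruning; ties at the
    threshold are also cut)."""
    w = np.asarray(weights, dtype=np.float64)
    k = int(cutting_rate * w.size)                  # number of weights to cut
    if k == 0:
        return np.ones_like(w)
    # k-th smallest magnitude is the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return (np.abs(w) > threshold).astype(w.dtype)
```

Applying such a mask (`w * magnitude_mask(w, rate)`) perturbs the network's predictions most on memorized atypical samples, which is the discrepancy UM exploits to mine them; UMAP then fine-tunes the mask itself so the original ID-task weights stay untouched.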



Figure 1: (a) curves of FPR95 (false positive rate of OOD examples when the true positive rate of in-distribution examples is at 95%) based on the Energy score (Liu et al., 2020) across three different OOD datasets during training on the CIFAR-10 dataset; (b) comparison between the ID and OOD score distributions at Epoch 60; (c) the same comparison at Epoch 100. All experiments testing OOD detection performance were conducted multiple times. By backtracking the training phase, we can observe the existence of a model stage with better OOD detection capability when using the Energy score to distinguish OOD inputs. Zooming in on the ID and OOD distributions at Epoch 60 and Epoch 100 respectively shows that the overlap between them grows along with training at the later stage. Figure 2 contains further exploration.
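For reference, the FPR95 metric defined in the caption can be computed as follows. This is a standard implementation sketch of ours (assuming higher scores indicate more ID-like inputs, e.g., negative energy), not code from the paper.

```python
import numpy as np

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """FPR at a given TPR on ID data (FPR95 by default).

    Picks the threshold that keeps `tpr` of the ID samples (higher
    score = more ID-like) and returns the fraction of OOD samples
    scored at or above that threshold, i.e., the false positive rate.
    """
    id_sorted = np.sort(np.asarray(id_scores, dtype=np.float64))
    cut = int(round((1.0 - tpr) * id_sorted.size))  # discard the lowest 5%
    threshold = id_sorted[cut]
    return float(np.mean(np.asarray(ood_scores, dtype=np.float64) >= threshold))
```

A lower FPR95 means fewer OOD inputs slip past the threshold that still admits 95% of ID data, which is why the dips in the Figure 1 curves correspond to training stages with better detection capability.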

