AN EMPIRICAL EXPLORATION OF OPEN-SET RECOGNITION VIA LIGHTWEIGHT STATISTICAL PIPELINES

Anonymous

Abstract

Machine-learned safety-critical systems need to be self-aware and reliably know their unknowns in the open world. This is often explored through the lens of anomaly/outlier detection or out-of-distribution modeling. One popular formulation is that of open-set classification, where an image classifier trained for 1-of-K classes should also recognize images belonging to a (K+1)-th "other" class, not present in the training set. Recent work has shown that, somewhat surprisingly, most if not all existing open-world methods do not work well on high-dimensional open-world images (Shafaei et al., 2019). In this paper, we carry out an empirical exploration of open-set classification, and find that combining classic statistical methods with carefully computed features can dramatically outperform prior work. We extract features from off-the-shelf (OTS) state-of-the-art networks trained for the underlying K-way closed-world task. We leverage insights from the retrieval community for computing feature descriptors that are low-dimensional (via pooling and PCA) and normalized (via L2-normalization), enabling the modeling of training data densities via classic statistical tools such as kmeans and Gaussian Mixture Models (GMMs). Finally, we (re)introduce the task of open-set semantic segmentation, which requires classifying individual pixels into one of K known classes or an "other" class. In this setting, our feature-based statistical models noticeably outperform prior open-world methods.

1. INTRODUCTION

Embodied perception and autonomy require systems to be self-aware and reliably know their unknowns. This requirement is often formulated as the open-set recognition problem (Scheirer et al., 2012), meaning that the system, e.g., a K-way classification model, should recognize anomalous examples that do not belong to one of the K closed-world classes. This is a significant challenge for machine-learned systems, which notoriously over-generalize to anomalies and unknowns on which they should instead raise a warning flag (Amodei et al., 2016). In this paper, we carry out a rigorous empirical exploration of open-set recognition of high-dimensional images. We explore simple statistical models such as Nearest Class Means (NCMs), kmeans, and Gaussian Mixture Models (GMMs). Our hypothesis is that such classic statistical methods can reliably model the closed-world distribution (through the closed-world training data), allowing examples that fall outside it to be flagged as open-world.
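The pipeline sketched above can be illustrated with a minimal scikit-learn example. The feature dimensions, component counts, and threshold choice below are illustrative assumptions, not the paper's exact settings, and random vectors stand in for pooled network features:

```python
# Illustrative sketch (assumed settings, not the paper's exact recipe):
# model PCA-reduced, L2-normalized closed-world features with a GMM, then
# flag test examples with low log-likelihood as open-set ("other").
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)

# Stand-ins for pooled off-the-shelf network features (assumed 512-D).
closed_feats = rng.normal(size=(500, 512))  # closed-world training set
test_feats = rng.normal(size=(10, 512))     # unseen test examples

# 1) Reduce dimensionality with PCA, then L2-normalize (retrieval-style).
pca = PCA(n_components=64).fit(closed_feats)
train = normalize(pca.transform(closed_feats))

# 2) Fit a GMM density model on the closed-world training features.
gmm = GaussianMixture(n_components=4, covariance_type="full",
                      random_state=0).fit(train)

# 3) Score test features; low log-likelihood suggests "other".
scores = gmm.score_samples(normalize(pca.transform(test_feats)))
tau = np.percentile(gmm.score_samples(train), 5)  # assumed 5th-percentile threshold
is_open = scores < tau
```

A kmeans or NCM variant follows the same template, replacing the GMM log-likelihood with (negative) distance to the nearest centroid or class mean.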



Open-world benchmarks: Curating open-world benchmarks is hard (Liu et al., 2019). One common strategy re-purposes existing classification datasets into closed vs. open examples, e.g., declaring MNIST digits 0-5 as closed and 6-9 as open (Neal et al., 2018; Oza & Patel, 2019; Geng et al., 2020). In contrast, anomaly/out-of-distribution (OOD) benchmarks usually generate anomalous samples by adding examples from different datasets, e.g., declaring CIFAR as anomalous for MNIST (Ge et al., 2017; Oza & Patel, 2019; Liu et al., 2019). Most open-world protocols assume open-world data is not available during training (Liang et al., 2018; Oza & Patel, 2019). Interestingly, Dhamija et al. (2018); Hendrycks et al. (2019b) find that, if some open examples are available during training, one can learn simple open-vs-closed binary classifiers that are remarkably effective. However, Shafaei et al. (2019) comprehensively compare various well-known open-world methods through rigorous experiments, and empirically show that none of the compared methods generalize to high-dimensional open-world images. Intuitively, classifiers can easily overfit to the available set of open-world images, which is unlikely to exhaustively span the open world outside the K classes of interest.

