CADET: FULLY SELF-SUPERVISED ANOMALY DETECTION WITH CONTRASTIVE LEARNING

Abstract

Handling out-of-distribution (OOD) samples has become a major challenge in the real-world deployment of machine learning systems. This work explores the application of self-supervised contrastive learning to the simultaneous detection of two types of OOD samples: unseen classes and adversarial perturbations. Since in practice the distribution of such samples is not known in advance, we do not assume access to OOD examples. We show that similarity functions trained with contrastive learning can be leveraged with the maximum mean discrepancy (MMD) two-sample test to verify whether two independent sets of samples are drawn from the same distribution. Inspired by this approach, we introduce CADet (Contrastive Anomaly Detection), a method based on image augmentations to perform anomaly detection on single samples. CADet compares favorably to adversarial detection methods at detecting adversarially perturbed samples on ImageNet. Simultaneously, it achieves performance comparable to unseen-label detection methods on two challenging benchmarks: ImageNet-O and iNaturalist. CADet is fully self-supervised and requires neither labels for in-distribution samples nor access to OOD examples.

1. INTRODUCTION

While modern machine learning systems have achieved countless successful real-world applications, handling out-of-distribution (OOD) inputs remains a difficult challenge of significant importance. The problem is especially acute for high-dimensional problems such as image classification. Models are typically trained in a closed-world setting but are inevitably faced with novel input classes when deployed in the real world. The impact can range from a degraded customer experience to dire consequences in safety-critical applications such as autonomous driving (Kitt et al., 2010) or medical analysis (Schlegl et al., 2017a). Although achieving high accuracy under all meaningful distributional shifts is the most desirable solution, it is particularly challenging. An efficient way to mitigate the consequences of unexpected inputs is to perform anomaly detection, which allows the system to anticipate its inability to process unusual inputs and react adequately.

Anomaly detection methods generally rely on one of three types of statistics: features, logits, and softmax probabilities, with some systems leveraging a mix of these (Wang et al., 2022). An anomaly score f(x) is computed, and detection with threshold τ is performed based on whether f(x) > τ. The goal of a detection system is to find an anomaly score that efficiently discriminates between in-distribution and out-of-distribution samples. However, the common problem with these systems is that different distributional shifts affect these statistics unpredictably. Accordingly, detection systems either achieve good performance only on specific types of distributional shift or require tuning on OOD samples. In both cases, their practical use is severely limited. Motivated by these issues, recent work has tackled the challenge of designing detection systems for unseen classes without prior knowledge of the unseen label set or access to OOD samples (Winkens et al., 2020; Tack et al., 2020; Wang et al., 2022).
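The threshold rule f(x) > τ can be sketched concretely. In this minimal illustration, the anomaly score is the negative maximum softmax probability (a common baseline statistic, not CADet's own score), and τ is calibrated so that a chosen fraction of held-out in-distribution samples is flagged; the function names and synthetic logits are illustrative assumptions.

```python
import numpy as np

def max_softmax_score(logits):
    """Anomaly score: negative maximum softmax probability.
    Higher scores indicate more anomalous inputs."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -p.max(axis=1)

def calibrate_threshold(in_dist_scores, fpr=0.05):
    """Choose tau so that roughly `fpr` of in-distribution samples are flagged."""
    return np.quantile(in_dist_scores, 1.0 - fpr)

def detect(scores, tau):
    """Flag a sample as OOD when f(x) > tau."""
    return scores > tau

rng = np.random.default_rng(0)
val_logits = rng.normal(0.0, 4.0, size=(1000, 10))  # stand-in for in-distribution logits
test_logits = rng.normal(0.0, 1.0, size=(100, 10))  # flatter logits mimic unfamiliar inputs

tau = calibrate_threshold(max_softmax_score(val_logits))
flags = detect(max_softmax_score(test_logits), tau)
```

The calibration step only uses in-distribution data, which mirrors the setting above: no OOD samples are needed to pick τ, at the cost of fixing the false-positive rate rather than optimizing detection power.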
We first investigate the use of the maximum mean discrepancy (MMD) two-sample test (Gretton et al., 2012) in conjunction with self-supervised contrastive learning to assess whether two sets of samples have been drawn from the same distribution. Motivated by the strong testing power of this method, we then introduce a statistic inspired by MMD that leverages contrastive transformations. Based on this statistic, we propose CADet (Contrastive Anomaly Detection), which detects OOD samples from single inputs and performs well on both label-based and adversarial detection benchmarks, without requiring access to any OOD samples to train or tune the method.

Only a few works have addressed these tasks simultaneously. These works either focus on particular in-distribution data such as medical imaging for specific diseases (Uwimana & Senanayake, 2021) or evaluate their performance on datasets with very distant classes such as CIFAR10 (Krizhevsky, 2009), SVHN (Netzer et al., 2011), and LSUN (Yu et al., 2015), resulting in simple benchmarks that do not translate to general real-world applications (Lee et al., 2018).

Contributions: Our main contributions are as follows:
• We use similarity functions learned by self-supervised contrastive learning with MMD to show that the test sets of CIFAR10 and CIFAR10.1 (Recht et al., 2019) have different distributions.
• We propose a novel improvement to MMD and show it can also be used to confidently detect distributional shifts when given a small number of samples.
• We introduce CADet, a fully self-supervised method for anomaly detection, and show it outperforms current methods on adversarial detection tasks while performing well on class-based OOD detection.

The outline is as follows: in Section 2, we discuss relevant previous work. Section 3 describes the self-supervised contrastive method based on SimCLRv2 (Chen et al., 2020b) used in this work.
Section 4 explores the application of learned similarity functions in conjunction with MMD to verify whether two independent sets of samples are drawn from the same distribution. Section 5 presents CADet and evaluates its empirical performance. Finally, we discuss results and limitations in Section 6.
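The two-sample test at the heart of Section 4 can be illustrated with a minimal sketch. Here a Gaussian RBF kernel stands in for the similarity function that would be learned by contrastive training, the squared-MMD estimator follows the unbiased form of Gretton et al. (2012), and a permutation test supplies the p-value; all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """RBF kernel matrix; a learned contrastive similarity could be substituted here."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2_unbiased(x, y, kernel=gaussian_kernel):
    """Unbiased estimator of squared MMD between samples x and y."""
    m, n = len(x), len(y)
    kxx, kyy, kxy = kernel(x, x), kernel(y, y), kernel(x, y)
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))  # off-diagonal mean
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()

def permutation_test(x, y, n_perm=200, seed=0):
    """p-value for H0: x and y are drawn from the same distribution."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to the two groups at random
        if mmd2_unbiased(pooled[:len(x)], pooled[len(x):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(100, 2))  # "in-distribution" samples
y = rng.normal(2.0, 1.0, size=(100, 2))  # shifted samples
p_value = permutation_test(x, y, n_perm=100)
```

A small p-value rejects the hypothesis that the two sets share a distribution; replacing the RBF kernel with a similarity function trained by contrastive learning is the substitution this work investigates.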

2. RELATED WORK

We propose a self-supervised contrastive method for anomaly detection (covering both unknown classes and adversarial attacks) inspired by MMD. Our work thus intersects with the literature on MMD, label-based OOD detection, adversarial detection, and self-supervised contrastive learning.

MMD two-sample tests have been extensively studied (Gretton et al., 2012; Wenliang et al., 2019; Gretton et al., 2009; Sutherland et al., 2016; Chwialkowski et al., 2015; Jitkrittum et al., 2016), though, to the best of our knowledge, this is the first time a similarity function trained via contrastive learning has been used in conjunction with MMD. Liu et al. (2020a) use MMD with a deep kernel trained on a fraction of the samples to argue that CIFAR10 and CIFAR10.1 have different test distributions. We build upon that work by confirming their finding with higher confidence levels while using fewer samples.

Label-based OOD detection methods discriminate samples that differ from those in the training distribution. We focus on unsupervised OOD detection in this work, i.e., we do not assume access to data labeled as OOD. Unsupervised OOD detection methods include density-based (Zhai et al., 2016; Nalisnick et al., 2018; 2019; Choi et al., 2018; Du & Mordatch, 2019; Ren et al., 2019; Serrà et al., 2019; Grathwohl et al., 2019; Liu et al., 2020b; Dinh et al., 2016), reconstruction-based (Schlegl et al., 2017b; Zong et al., 2018; Deecke et al., 2018; Pidhorskyi et al., 2018; Perera et al., 2019; Choi et al., 2018), one-class classifiers (Schölkopf et al., 1999; Ruff et al., 2018), self-supervised (Golan & El-Yaniv, 2018; Hendrycks et al., 2019b; Bergman & Hoshen, 2020; Tack et al., 2020), and supervised approaches (Liang et al., 2017; Hendrycks & Gimpel, 2016), though some works do not fall into any of these categories (Wang et al., 2022).

Adversarial detection discriminates adversarial samples from the original data. Adversarial samples are generated by minimally perturbing real samples to produce a change in the model's output, such as a misclassification. Most works rely on knowledge of some attacks for training (Abusnaina et al., 2021; Metzen et al., 2017; Feinman et al., 2017; Lust & Condurache, 2020; Zuo & Zeng, 2021; Papernot & McDaniel, 2018; Ma et al., 2018), with the exception of Hu et al. (2019).

Self-supervised contrastive learning methods (Wu et al., 2018; He et al., 2020; Chen et al., 2020a;b) are commonly used to pre-train a model from unlabeled data for a downstream task such as image classification. Contrastive learning relies on instance discrimination trained with a contrastive loss (Hadsell et al., 2006) such as InfoNCE (Gutmann & Hyvärinen, 2010).
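As a point of reference for the contrastive objective mentioned above, the following is a minimal NumPy sketch of an InfoNCE-style (NT-Xent) loss over a batch of paired augmentation embeddings, in the spirit of SimCLR-type training; the function name, temperature value, and synthetic embeddings are illustrative assumptions, not the exact training code used here.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE / NT-Xent loss. z1[i] and z2[i] embed two augmentations of
    the same image; all other rows in the batch act as negatives."""
    z = np.concatenate([z1, z2])                       # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity via unit norm
    sim = z @ z.T / temperature                        # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # i's positive is i+n (and vice versa)
    row_max = sim.max(axis=1, keepdims=True)           # numerically stable log-softmax
    log_prob = sim - (row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True)))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 32))
aligned = info_nce_loss(z1, z1 + 0.01 * rng.normal(size=(8, 32)))  # near-identical views
mismatched = info_nce_loss(z1, rng.normal(size=(8, 32)))           # unrelated "views"
```

The loss drops as each embedding becomes more similar to its positive pair than to the other batch elements, which is what makes the learned representation usable as a similarity function downstream.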

