EXPLAINABLE DEEP ONE-CLASS CLASSIFICATION

Abstract

Deep one-class classification variants for anomaly detection learn a mapping that concentrates nominal samples in feature space, causing anomalies to be mapped away. Because this transformation is highly non-linear, finding interpretations poses a significant challenge. In this paper we present an explainable deep one-class classification method, Fully Convolutional Data Description (FCDD), where the mapped samples themselves serve as an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD sets a new state of the art in the unsupervised setting. Our method can incorporate ground-truth anomaly explanations during training, and using even a few of these (∼5) improves performance significantly. Finally, using FCDD's explanations, we demonstrate the vulnerability of deep one-class classification models to spurious image features such as image watermarks.

1. INTRODUCTION

Anomaly detection (AD) is the task of identifying anomalies in a corpus of data (Edgeworth, 1887; Barnett and Lewis, 1994; Chandola et al., 2009; Ruff et al., 2021). Powerful new anomaly detectors based on deep learning have made AD more effective and scalable to large, complex datasets such as high-resolution images (Ruff et al., 2018; Bergmann et al., 2019). While there exists much recent work on deep AD, there is limited work on making such techniques explainable. Explanations are needed in industrial applications to meet safety and security requirements (Berkenkamp et al., 2017; Katz et al., 2017; Samek et al., 2020), to avoid unfair social biases (Gupta et al., 2018), and to support human experts in decision making (Jarrahi, 2018; Montavon et al., 2018; Samek et al., 2020). One typically makes anomaly detection explainable by annotating pixels with an anomaly score; in some applications, such as finding tumors in cancer detection (Quellec et al., 2016), these annotations are the primary goal of the detector.

One approach to deep AD, known as Deep Support Vector Data Description (DSVDD) (Ruff et al., 2018), is based on finding a neural network that transforms data such that nominal data is concentrated at a predetermined center while anomalous data lies elsewhere. In this paper we present Fully Convolutional Data Description (FCDD), a modification of DSVDD in which the transformed samples are themselves an image that corresponds to a downsampled anomaly heatmap. The pixels in this heatmap that are far from the center correspond to anomalous regions in the input image. FCDD achieves this by using only convolutional and pooling layers, thereby limiting the receptive field of each output pixel. Our method is based on the one-class classification paradigm (Moya et al., 1993; Tax, 2001; Tax and Duin, 2004; Ruff et al., 2018), which can naturally incorporate known anomalies (Ruff et al., 2021) but is also effective when simply using synthetic anomalies.

We show that FCDD's anomaly detection performance is close to the state of the art on the standard AD benchmarks with CIFAR-10 and ImageNet while providing transparent explanations. On MVTec-AD, an AD dataset containing ground-truth anomaly maps, we demonstrate the accuracy of FCDD's explanations (see Figure 1), where FCDD sets a new state of the art. In further experiments we find that deep one-class classification models (e.g. DSVDD) are prone to the "Clever Hans" effect (Lapuschkin et al., 2019), where a detector fixates on spurious features such as image watermarks. In general, we find that the generated anomaly heatmaps are less noisy and provide more structure than those of the baselines, including gradient-based methods (Simonyan et al., 2013; Sundararajan et al., 2017) and autoencoders (Sakurada and Yairi, 2014; Bergmann et al., 2019).
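To make this concrete, the following is a minimal PyTorch sketch of such a fully convolutional one-class model. The layer sizes, the center fixed at zero, and the pseudo-Huber distance map are illustrative assumptions for exposition, not the exact configuration used in our experiments.

```python
# Minimal sketch of an FCDD-style model: only convolutions and pooling,
# so every output pixel has a limited receptive field and corresponds to
# a region of the input image. All layer sizes are illustrative.
import torch
import torch.nn as nn

class FCDDSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.LeakyReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.LeakyReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 1, 1),  # 1x1 convolution: one score channel
        )

    def forward(self, x):
        z = self.features(x)  # (B, 1, H/4, W/4) spatial feature map
        # Pseudo-Huber distance to a center fixed at zero; pixels far
        # from the center mark anomalous regions, so this downsampled
        # map is itself the explanation heatmap.
        return torch.sqrt(z ** 2 + 1) - 1

model = FCDDSketch()
heatmap = model(torch.randn(8, 3, 224, 224))  # downsampled anomaly heatmap
score = heatmap.mean(dim=(1, 2, 3))           # per-image anomaly score
```

The image-level score here is simply the mean of the heatmap, so detection and explanation come from the same forward pass.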

2. RELATED WORK

Here we outline related work on deep AD, focusing on explanation approaches.

Classically, deep AD used autoencoders (Hawkins et al., 2002; Sakurada and Yairi, 2014; Zhou and Paffenroth, 2017; Zhao et al., 2017). Trained on a nominal dataset, autoencoders are assumed to reconstruct anomalous samples poorly. Thus, the reconstruction error can be used as an anomaly score and the pixel-wise difference as an explanation (Bergmann et al., 2019), thereby naturally providing an anomaly heatmap (see the sketch at the end of this section). Recent works have incorporated attention into reconstruction models, which can be used as explanations (Venkataramanan et al., 2019; Liu et al., 2020). In the domain of videos, Sabokrou et al. (2018) used a pre-trained fully convolutional architecture in combination with a sparse autoencoder to extract 2D features and provide bounding boxes for anomaly localization. One drawback of reconstruction methods is that they offer no natural way to incorporate known anomalies during training.

More recently, one-class classification methods for deep AD have been proposed. These methods attempt to separate nominal samples from anomalies in an unsupervised manner by concentrating nominal data in feature space while mapping anomalies to distant locations (Ruff et al., 2018; Chalapathy et al., 2018; Goyal et al., 2020). In the domain of NLP, DSVDD has been successfully applied to text, which yields a form of interpretation using attention mechanisms (Ruff et al., 2019). For images, Kauffmann et al. (2020) have used a deep Taylor decomposition (Montavon et al., 2017) to derive relevance scores.

Some of the best performing deep AD methods are based on self-supervision. These methods transform nominal samples, train a network to predict which transformation was used on the input, and use the confidence of this prediction to compute an anomaly score.
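The sketch below illustrates the reconstruction-based scheme referenced above: an autoencoder trained on nominal data whose pixel-wise reconstruction error yields both an anomaly heatmap and an image-level score. The architecture and layer sizes are illustrative assumptions, not a reimplementation of any cited method.

```python
# Minimal sketch of the autoencoder baseline: after training on nominal
# data only, the pixel-wise reconstruction error serves as an anomaly
# heatmap and its mean as the image-level anomaly score.
import torch
import torch.nn as nn

autoencoder = nn.Sequential(
    # encoder
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    # decoder
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
)

def reconstruction_heatmap(x):
    """Pixel-wise squared error between input and reconstruction."""
    with torch.no_grad():
        recon = autoencoder(x)
    return ((x - recon) ** 2).mean(dim=1)  # (B, H, W) anomaly heatmap

x = torch.randn(8, 3, 64, 64)              # stand-in for test images
heatmap = reconstruction_heatmap(x)
scores = heatmap.flatten(1).mean(dim=1)    # image-level anomaly scores
```

Note that, unlike the one-class objective, nothing in this training scheme offers a natural place to plug in known or synthetic anomalies, which is the drawback discussed above.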



Figure 1: FCDD explanation heatmaps for MVTec-AD (Bergmann et al., 2019). Rows from top to bottom show: (1) nominal samples, (2) anomalous samples, (3) FCDD anomaly heatmaps, (4) ground-truth anomaly maps.
