LEARNING AND EVALUATING REPRESENTATIONS FOR DEEP ONE-CLASS CLASSIFICATION

Abstract

We present a two-stage framework for deep one-class classification: we first learn self-supervised representations from one-class data, and then build one-class classifiers on the learned representations. The framework not only allows learning better representations, but also permits building one-class classifiers that are faithful to the target task. We argue that classifiers inspired by the statistical perspective of generative or discriminative models are more effective than existing approaches, such as a normality score from a surrogate classifier. We thoroughly evaluate different self-supervised representation learning algorithms under the proposed framework for one-class classification. Moreover, we present a novel distribution-augmented contrastive learning method that extends training distributions via data augmentation to obstruct the uniformity of contrastive representations. In experiments, we demonstrate state-of-the-art performance on visual-domain one-class classification benchmarks, including novelty and anomaly detection. Finally, we present visual explanations confirming that the decision-making process of deep one-class classifiers is intuitive to humans.

1. INTRODUCTION

One-class classification aims to identify whether an example belongs to the same distribution as the training data. It has several applications, such as anomaly detection and outlier detection, where we learn a classifier that distinguishes anomaly/outlier data, without access to them, from the normal/inlier data available at training. This problem is common in various domains, such as manufacturing defect detection and financial fraud detection. Generative models, such as kernel density estimation (KDE), are popular for one-class classification [1, 2], as they model the distribution by assigning high density to the training data. At test time, low-density examples are flagged as outliers. Unfortunately, the curse of dimensionality hinders accurate density estimation in high dimensions [3]. Deep generative models (e.g., [4, 5, 6]) have demonstrated success in modeling high-dimensional data (e.g., images) and have been applied to anomaly detection [7, 8, 9, 10, 11]. However, learning deep generative models on raw inputs remains challenging, as they appear to assign high density to background pixels [10] or to learn local pixel correlations [12]. A good representation might still be beneficial to those models. Alternatively, discriminative models such as the one-class SVM (OC-SVM) [13] or support vector data description (SVDD) [14] learn classifiers that describe the support of the one-class distribution to distinguish it from outliers. These methods are powerful when equipped with non-linear kernels, but their performance is still limited by the quality of the input data representations. In either generative or discriminative approaches, the fundamental limitation of one-class classification centers on learning good high-level data representations.
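The KDE-based scoring described above can be sketched in a few lines. The following is a minimal illustration, not the paper's method: a 1-D Gaussian KDE fit to inlier data, where test points with low log-density are treated as outliers. The data, bandwidth, and function names are illustrative assumptions.

```python
import math

def kde_log_density(x, train, bandwidth=0.5):
    """Log of the average Gaussian kernel density of scalar x under the training set."""
    n = len(train)
    density = sum(
        math.exp(-((x - t) ** 2) / (2 * bandwidth ** 2))
        / (bandwidth * math.sqrt(2 * math.pi))
        for t in train
    ) / n
    return math.log(density + 1e-300)  # guard against log(0) for very distant points

# One-class training data: only inliers, clustered around 0.
train = [-0.3, -0.1, 0.0, 0.2, 0.4]

# A test point near the training data gets higher log-density than a far-away one,
# so thresholding the score separates inliers from outliers.
inlier_score = kde_log_density(0.1, train)
outlier_score = kde_log_density(5.0, train)
```

In high dimensions this estimator degrades rapidly (the curse of dimensionality noted above), which is precisely why the paper argues for learning good representations first.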
Following the success of deep learning [15], deep one-class classifiers [16, 17, 18], which extend discriminative one-class classification with trainable deep neural networks, have shown promising results compared to their kernel counterparts. However, naively training a deep one-class classifier leads to a degenerate solution that maps all data to a single representation, also known as "hypersphere collapse" [16]. Previous works circumvent this issue by constraining network architectures [16], autoencoder

* Equal contribution.
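The hypersphere collapse above is easy to see numerically. The following is a toy illustration under assumed names (it is not Deep SVDD itself): a 1-D "network" f(x) = w*x + b trained to minimize the mean squared distance to a fixed center c. If the network has an unconstrained bias, the constant map w = 0, b = c attains exactly zero loss while discarding all information about the input.

```python
def svdd_style_loss(w, b, center, data):
    """Mean squared distance of f(x) = w*x + b to the center (one-class objective)."""
    return sum((w * x + b - center) ** 2 for x in data) / len(data)

data = [0.5, 1.2, -0.7, 2.0]
center = 0.0

# Degenerate solution: every input is mapped onto the center, so the loss is zero
# and every example, including any outlier, looks perfectly "normal".
collapsed_loss = svdd_style_loss(w=0.0, b=center, center=center, data=data)

# A non-constant mapping necessarily pays a positive loss on spread-out data.
identity_loss = svdd_style_loss(w=1.0, b=0.0, center=center, data=data)
```

This is why the works cited above must remove biases, fix the center, or otherwise constrain the architecture to keep the objective meaningful.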

