FEW-SHOT ANOMALY DETECTION ON INDUSTRIAL IMAGES THROUGH CONTRASTIVE FINE-TUNING

Anonymous

Abstract

Detecting abnormal products from imagery data is essential to quality control in manufacturing. Existing approaches to anomaly detection (AD) often rely on substantial amounts of anomaly-free samples to train representation and density models. However, large anomaly-free datasets may not always be available before the inference stage; this requires building an anomaly detection framework with only a handful of normal samples, a.k.a. few-shot anomaly detection (FSAD). We propose two techniques to address the challenges of FSAD. First, we employ a model pretrained on a large source dataset to initialize model weights. To ameliorate the covariate shift between the source and target domains, we adopt contrastive training on the few-shot target-domain data. Second, to encourage learning representations suitable for downstream AD, we further incorporate cross-instance pairs to increase tightness within the normal-sample cluster and better separate normal samples from synthesized negative samples. Extensive evaluations on six few-shot anomaly detection benchmarks demonstrate the effectiveness of the proposed method.
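To make the second technique concrete, the following is a minimal NumPy sketch of a cross-instance contrastive objective: all embeddings of the few-shot normal samples are treated as mutual positives (tightening the normal cluster), while embeddings of synthesized defects act as negatives. The function name, the InfoNCE-style loss form, and the toy 2-D embeddings are illustrative assumptions, not the paper's actual implementation; in practice the embeddings would come from the fine-tuned backbone and gradients would flow to the network weights via autodiff.

```python
import numpy as np

def contrastive_loss(normal_feats, negative_feats, temperature=0.1):
    """Illustrative cross-instance contrastive loss (not the paper's exact form).

    normal_feats:   (n, d) embeddings of few-shot normal samples (mutual positives).
    negative_feats: (m, d) embeddings of synthesized anomalies (negatives).
    """
    # L2-normalize so dot products are cosine similarities
    z_pos = normal_feats / np.linalg.norm(normal_feats, axis=1, keepdims=True)
    z_neg = negative_feats / np.linalg.norm(negative_feats, axis=1, keepdims=True)
    z_all = np.concatenate([z_pos, z_neg], axis=0)

    n = len(z_pos)
    loss = 0.0
    for i in range(n):
        sims = z_all @ z_pos[i] / temperature
        # exclude the anchor's self-similarity from the softmax denominator
        mask = np.ones(len(z_all), dtype=bool)
        mask[i] = False
        log_denom = np.log(np.exp(sims[mask]).sum())
        # every other normal sample forms a cross-instance positive pair
        pos_idx = [j for j in range(n) if j != i]
        loss += -np.mean(sims[pos_idx] - log_denom)
    return loss / n
```

Under this sketch, the loss decreases as the normal embeddings cluster more tightly and as the synthesized negatives move further from that cluster, which is exactly the geometry the abstract asks of a representation suited to downstream AD.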

1. INTRODUCTION

Industrial defect detection is an important real-world use case for visual anomaly detection methods. In this setting, anomaly detection models typically have to be trained with only defect-free, or normal, images, as defects rarely occur on functioning production lines. Anomaly detection methods for this one-class classification setting typically assume that normal images are available in abundance, even though this may not always be the case. For example, in applications such as semiconductor manufacturing, where image acquisition requires 3D scans using specialized equipment (Pahwa et al., 2021), acquiring defect-free images is time-consuming and costly. Flexible manufacturing systems also require rapid adaptation to changes in the type and quantity of products to be manufactured (Shivanand, 2006). As a result, large numbers of defect-free images may not be available for new products, or in the initial stages of bootstrapping a visual inspection system.

Although anomaly detection in general is a well-studied topic (Chandola et al., 2009; Pang et al., 2021b), anomaly detection on images with only a few normal and no abnormal images, or few-shot anomaly detection (FSAD), has only recently begun to receive attention from the community (Sheynin et al., 2021; Huang et al., 2022). In their pioneering work, Sheynin et al. (2021) developed a generative adversarial model to distinguish transformed image patches from generated ones. However, such adversarial models may be tricky to tune (Kodali et al., 2017), and the method requires multiple transformations on test samples at inference time, resulting in additional computational overhead. The more recent work of Huang et al. (2022) learns a common model over multiple classes of normal images using a feature-registration proxy task, but their method requires a training set with normal images from multiple known classes, which is a more restrictive setting.
In this work, we develop a simple yet effective method for few-shot anomaly detection. We achieve this by synergistically combining transfer learning from a pretrained model with representation learning on the few-shot normal data. Finetuning from a backbone network pretrained on a large source-domain dataset, e.g. ImageNet (Russakovsky et al., 2015), allows reusing good low-level feature extractors and provides a better initialization of network parameters (Kornblith et al., 2019). We believe finetuning from pretrained weights can particularly benefit few-shot anomaly detection, where not enough data is available to train good representations from scratch. However, as pointed

