PUSHING THE LIMITS OF FEW-SHOT ANOMALY DETECTION IN INDUSTRY VISION: GRAPHCORE

Abstract

In few-shot anomaly detection (FSAD), efficient visual features play an essential role in methods based on a memory bank M. However, these methods do not account for the relationship between a visual feature and its rotated counterpart, which drastically limits anomaly detection performance. To push the limits, we reveal that the rotation-invariant feature property has a significant impact on industrial FSAD. Specifically, we use graph representation in FSAD and propose a novel visual isometric invariant feature (VIIF) as the anomaly measurement feature. As a result, VIIF robustly improves the anomaly discriminating ability and substantially reduces the number of redundant features stored in M. In addition, we propose a novel model, GraphCore, built on VIIFs, which enables fast unsupervised FSAD training and improves anomaly detection performance. A comprehensive evaluation compares GraphCore with other SOTA anomaly detection models under our proposed few-shot anomaly detection setting and shows that GraphCore increases average AUC by 5.8%, 4.1%, 3.4%, and 1.6% on MVTec AD and by 25.5%, 22.0%, 16.9%, and 14.1% on MPDD for the 1-, 2-, 4-, and 8-shot cases, respectively.

1. INTRODUCTION

With the rapid development of deep vision detection technology in artificial intelligence, detecting anomalies/defects on the surface of industrial products has received unprecedented attention. Changeover in manufacturing refers to converting a line or machine from processing one product to another. Because the equipment has not been completely fine-tuned right after the production line starts, changeover frequently results in unsatisfactory anomaly detection (AD) performance. How to train industrial product models rapidly in the changeover scenario while ensuring accurate anomaly detection is a critical issue in actual production. The current state of AD in industry is as follows: (1) In terms of detection accuracy, the performance of state-of-the-art (SOTA) AD models degrades dramatically during changeover. Current mainstream work uses a considerable amount of training data to train the model, as shown in Fig. 1(a). However, this makes data collection challenging, even for unsupervised learning. As a result, many approaches based on few-shot learning have been proposed, at the price of accuracy. For instance, Huang et al. (2022) employ meta-learning, as shown in Fig. 1(b). However, because of its complicated settings, such a method cannot be migrated flexibly to the new product during changeover, and the detection accuracy cannot be guaranteed. (2) In terms of training speed, when a large amount of data is used for training, training for new products is slowed on the actual production line. As is well known, vanilla unsupervised AD requires collecting a large amount of information. Even though meta-learning works in the few-shot setting, as shown in Fig. 1(b), it is still necessary to train on a large portion of previously collected data.
We state that AD of industrial products requires just a small quantity of data to achieve performance comparable to that obtained with a large amount of data, i.e., a small quantity of image data can contain sufficient information to represent a large number of data. Because industrial products are manufactured with high stability (no evident distortion of shape or color cast), the captured images lack the diversity of natural images, and the main issue is the shooting angle or rotation. Therefore, it is essential to extract rotation-invariant structural features. Graph neural networks (GNNs) are capable of robustly extracting non-serialized structural features (Han et al. (2022), Bruna et al. (2013), Hamilton et al. (2017), Xu et al. (2018)), and they integrate global information better and faster (Wang et al. (2020); Li et al. (2020)). They are therefore better suited than convolutional neural networks (CNNs) to extracting rotation-invariant features. For this reason, the core idea of the proposed GraphCore method is to use visual isometric invariant features (VIIFs) as the anomaly measurement features. Among methods that use a memory bank (M) as the AD paradigm, PatchCore (Roth et al. (2022)) uses ResNet (He et al. (2016)) as the feature extractor. However, since features obtained by CNNs do not have rotation invariance (Dieleman et al. (2016)), a large number of redundant features are stored in M. Note that these redundant features may come from multiple rotated versions of the same patch structure. A huge quantity of training data is hence required to ensure high accuracy on the test set. To avoid these redundant features, we propose VIIFs, which not only produce more robust visual features but also dramatically reduce the size of M and accelerate detection.

Based on the previous considerations, the goal of our work is to handle the cold start of the production line during changeover. As shown in Fig. 1(c), a new FSAD method, called GraphCore, is developed that employs a small number of normal samples to achieve fast training and competitive AD accuracy on the new product. On the one hand, by using a small amount of data, we can train rapidly and accelerate anomaly inference. On the other hand, because we train directly on new product samples, adaptation and migration of anomalies from the old product to the new product do not occur.

Contributions. In summary, the main contributions of this work are as follows:
• We present a feature-augmented method for FSAD in order to investigate the property of visual features generated by CNNs.
• We propose a novel anomaly detection model, GraphCore, which adds a new VIIF to the memory bank-based AD paradigm and can drastically reduce the quantity of redundant visual features.

Figure 1: Comparison of (a) vanilla unsupervised AD, (b) few-shot unsupervised AD with meta-learning, and (c) our setting. Unlike (a) and (b), our setting (c) uses only a small number of normal samples as training input and does not require aggregating training categories in advance. The proposed model, vision isometric invariant GNN, can quickly obtain the invariant feature from a few normal samples, and its accuracy outperforms models trained in a meta-learning context.
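
The redundancy argument in the introduction (rotated copies of the same patch structure filling M when features are not rotation-invariant) can be illustrated with a toy sketch. This is not the GraphCore implementation and the descriptor below is not the VIIF, which is built on a GNN. It is a hypothetical stand-in that canonicalizes a patch over its four 90-degree rotations. All function names (`raw_feature`, `invariant_feature`, `build_memory_bank`, `anomaly_score`) are assumptions made for illustration, and the nearest-neighbor score is only a simplified version of memory bank-based scoring.

```python
import numpy as np

def raw_feature(patch):
    # Flattened pixels: the vector changes whenever the patch is rotated.
    return patch.reshape(-1).astype(float)

def invariant_feature(patch):
    # Toy rotation-invariant descriptor (an assumption, NOT the paper's VIIF):
    # among the four 90-degree rotations, keep the lexicographically smallest.
    candidates = [np.rot90(patch, k).reshape(-1).astype(float) for k in range(4)]
    return min(candidates, key=lambda v: tuple(v))

def build_memory_bank(patches, feature_fn):
    # Keep one entry per distinct feature vector.
    feats = np.stack([feature_fn(p) for p in patches])
    return np.unique(feats, axis=0)

def anomaly_score(patch, bank, feature_fn):
    # Distance from the patch feature to its nearest neighbor in the bank.
    f = feature_fn(patch)
    return float(np.min(np.linalg.norm(bank - f, axis=1)))

# One normal patch structure observed under four orientations.
base = np.arange(9).reshape(3, 3)
normal_patches = [np.rot90(base, k) for k in range(4)]

raw_bank = build_memory_bank(normal_patches, raw_feature)        # 4 entries
inv_bank = build_memory_bank(normal_patches, invariant_feature)  # 1 entry

# One-shot case: only the unrotated patch is available for training.
one_shot_raw = build_memory_bank([base], raw_feature)
one_shot_inv = build_memory_bank([base], invariant_feature)

rotated_normal = np.rot90(base, 1)   # normal sample in an unseen orientation
defect = base.copy()
defect[1, 1] = 100                   # corrupted center pixel

print(raw_bank.shape[0], inv_bank.shape[0])
print(anomaly_score(rotated_normal, one_shot_raw, raw_feature))        # nonzero
print(anomaly_score(rotated_normal, one_shot_inv, invariant_feature))  # zero
print(anomaly_score(defect, one_shot_inv, invariant_feature))          # nonzero
```

In this toy setting the raw bank stores four entries for one patch structure while the invariant bank stores one, and a rotated normal patch is only scored as normal from a single training sample when the invariant descriptor is used.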

