DEEP GRAPH-LEVEL ORTHOGONAL HYPERSPHERE COMPRESSION FOR ANOMALY DETECTION

Anonymous

Abstract

Graph-level anomaly detection aims to identify abnormal samples in a set of graphs in an unsupervised manner. Finding a reasonable decision boundary between normal and anomalous data is non-trivial when no anomalous data are available in the training stage, especially for graph-structured data. This paper first proposes a novel deep graph-level anomaly detection model that learns graph representations by maximizing the mutual information between substructure features and global structure features while learning a hypersphere anomaly decision boundary. We implement an orthogonal projection layer to keep the training data distribution consistent with the decision hypersphere, thus avoiding erroneous evaluations. More importantly, we further propose projecting the normal data into the interval region between two co-centered hyperspheres, which makes the normal data distribution more compact and effectively overcomes the issue of outliers falling close to the center of the hypersphere. Numerical and visualization results on a few graph datasets demonstrate the effectiveness and superiority of our methods compared with many baselines and state-of-the-art methods.

1. INTRODUCTION

Anomaly detection is an essential task with various applications, such as detecting abnormal patterns or actions in credit-card fraud, medical diagnosis, and sudden natural disasters (Aggarwal, 2017). Usually, in anomaly detection, the training data contain only normal data and are used to train a model that can distinguish unusual patterns from normal ones. Anomaly detection on tabular data and images has been extensively studied recently (Ruff et al., 2018; Goyal et al., 2020; Chen et al., 2022; Liznerski et al., 2021; Sohn et al., 2021). In contrast, there is little work on graph data, even though graph anomaly detection is very useful in various problems, such as identifying abnormal communities in social networks or detecting unusual protein structures in biology experiments. Compared with other types of data, graph data are inherently complicated and rich in structural and relational information. This structural richness makes it possible to learn graph-level representations with discriminative patterns in many supervised tasks (e.g., graph classification). For graph-level anomaly detection, however, the intricate graph structure poses many obstacles, because the problem is unsupervised. Graph anomaly detection usually comprises four families: anomalous edge detection (Ouyang et al., 2020; Xu et al., 2020), node detection (Zhu & Zhu, 2020; Bojchevski & Günnemann, 2018), sub-graph detection (Wang et al., 2018; Zheng et al., 2018), and graph-level detection (Zheng et al., 2019; Chalapathy et al., 2018). Here, the goal of graph-level algorithms is to learn a regular group pattern and to distinguish the abnormal manifestations of the group. Group abnormal behaviors usually foreshadow some unusual events and thus play an important role in practical applications. In the past five years, few approaches have focused on graph-level anomaly detection because of the difficulty of representing graphs as feature vectors without using any label information.
Graph kernels measure the similarity between graphs, and the resulting similarities can be regarded, loosely or implicitly, as a representation. Based on this, graph anomaly detection is usually performed in two stages. In our experiments (see Section 4), we also find that one-class SVM with graph kernels sometimes yields unsatisfactory performance, since graph kernels may not be effective enough at quantifying the similarity between graphs. To the best of our knowledge, there is therefore considerable room for improvement in graph anomaly detection.
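To make the two-stage kernel pipeline concrete, the following is a minimal sketch, not the setup used in our experiments. It assumes the two stages are (1) computing a graph kernel and (2) running a one-class detector on the kernel values. Every name in it (`degree_histogram`, `kernel`, `fit`, `anomaly_score`, `cycle`, `star`) is hypothetical. The kernel is a toy degree-histogram kernel rather than an established graph kernel, and the detector is the squared feature-space distance to the mean of the training graphs, used here as a simple stand-in for one-class SVM.

```python
from collections import Counter
from math import sqrt


def degree_histogram(edges):
    """Map an undirected edge list to a histogram {degree: number of nodes}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def kernel(g1, g2):
    """Stage 1: toy graph kernel (cosine similarity of degree histograms)."""
    h1, h2 = degree_histogram(g1), degree_histogram(g2)
    dot = sum(h1[d] * h2[d] for d in h1)
    norm = sqrt(sum(c * c for c in h1.values()) * sum(c * c for c in h2.values()))
    return dot / norm if norm else 0.0


def fit(train_graphs):
    """Stage 2 (training): keep the normal graphs and their mean pairwise kernel."""
    n = len(train_graphs)
    mean_kk = sum(kernel(a, b) for a in train_graphs for b in train_graphs) / (n * n)
    return train_graphs, mean_kk


def anomaly_score(model, graph):
    """Stage 2 (testing): squared feature-space distance to the training mean.

    This is a stand-in for one-class SVM, chosen only to keep the sketch
    dependency-free. Larger values mean "more anomalous".
    """
    train_graphs, mean_kk = model
    mean_kx = sum(kernel(graph, g) for g in train_graphs) / len(train_graphs)
    return kernel(graph, graph) - 2.0 * mean_kx + mean_kk


def cycle(n):
    """Edge list of a cycle on n nodes (hypothetical 'normal' graph)."""
    return [(i, (i + 1) % n) for i in range(n)]


def star(n):
    """Edge list of a star with n leaves (hypothetical 'anomalous' graph)."""
    return [(0, i) for i in range(1, n + 1)]


# Train on cycles only, then score an unseen cycle and a star.
model = fit([cycle(4), cycle(5), cycle(6)])
print("cycle score:", anomaly_score(model, cycle(7)))
print("star score: ", anomaly_score(model, star(5)))  # expected to be the larger one
```

In this toy setting the star receives the larger score because its degree histogram shares no entries with those of the training cycles. The sketch also shows the limitation noted above: the detector can only be as good as the fixed kernel, since any two graphs with the same degree histogram are indistinguishable to it.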

