NODE NUMBER AWARENESS REPRESENTATION FOR GRAPH SIMILARITY LEARNING

Anonymous

Abstract

This work addresses two important issues in graph similarity computation: the Node Number Awareness Issue (N²AI), and accelerating the inference speed of graph similarity computation in downstream tasks. We find that existing Graph Neural Network (GNN) based graph similarity models incur large errors when predicting the similarity score of two graphs with a similar number of nodes. Our analysis shows that this is because the global pooling function in graph neural networks maps graphs with similar numbers of nodes to similar embedding distributions, reducing the separability of their embeddings; we refer to this as the N²AI. Our motivation is to enhance the difference between the two embeddings to improve their separability, so we leverage our proposed Different Attention (DiffAtt) to construct the Node Number Awareness Graph Similarity Model (N²AGim). In addition, we propose Graph Similarity Learning with Landmarks (GSL²) to accelerate similarity computation. GSL² uses the trained N²AGim to generate an individual embedding for each graph without any additional learning, and these individual embeddings effectively improve its inference speed. Experiments demonstrate that N²AGim outperforms the second best approach on Mean Squared Error by 24.3% (1.170 vs. 1.546), 43.1% (0.066 vs. 0.116), and 44.3% (0.308 vs. 0.553) on the AIDS700nef, LINUX, and IMDBMulti datasets, respectively. Our GSL² is up to 47.7 and 1.36 times faster than N²AGim and the second fastest model, respectively. Our code is publicly available at https://github.com/iclr231312/N2AGim.

1. INTRODUCTION

Graph similarity computation is a fundamental problem for graph-based applications, e.g., graph data mining, graph retrieval, and graph clustering (Kriege et al., 2020; Ok & Korea, 2020). Graph Edit Distance (GED), defined as the minimum number of graph edit operations needed to transform graph G_i into graph G_j, is one of the most popular graph similarity metrics (Gao et al., 2010; Neuhaus et al., 2006; Bougleux et al., 2015). The graph edit operations are inserting or deleting a node/edge, and relabeling a node/edge. Unfortunately, exact GED computation is NP-hard in general (Zeng et al., 2009), which makes it too expensive to leverage in downstream tasks.

Recently, many Graph Neural Network (GNN) based graph similarity computation algorithms have been proposed to compute the GED more efficiently (Bai et al., 2019; 2020; Li et al., 2019; Ling et al., 2021; Bai & Zhao, 2021; Wang et al., 2021). These GNN-based algorithms transform the GED value into a similarity score and use an end-to-end framework to learn to map a given pair of graphs to their similarity score. As a general framework, a Siamese neural network aggregates information within each graph, a feature fusion module captures the similarity between the two graphs, and a Multi-Layer Perceptron (MLP) is then leveraged for the regression.

However, existing popular graph similarity models become very inaccurate when predicting the similarity of two graphs with a similar number of nodes, as shown in Fig. 1. It is clear that the MSE of all four models grows as the difference in the number of nodes of the two graphs becomes smaller. To better understand this issue, we present in Section 3 a theoretical analysis of the most widely used modules in graph similarity models from a statistical viewpoint.
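To make the setup concrete, the following sketch computes the exact GED for two tiny graphs with networkx and converts it to a similarity score. The exp(-nGED) normalization shown here is one common convention in this line of work (an assumption for illustration; the paper's exact transformation may differ):

```python
import math
import networkx as nx

# Two small example graphs with the same number of nodes.
g1 = nx.path_graph(4)   # path: 4 nodes, 3 edges
g2 = nx.cycle_graph(4)  # cycle: 4 nodes, 4 edges

# Exact GED is exponential-time in general (NP-hard),
# so this is feasible only for tiny graphs.
ged = nx.graph_edit_distance(g1, g2)  # 1: insert one edge to close the path

# Normalize GED by the average graph size, then map to (0, 1]
# so that identical graphs score 1 and dissimilar graphs approach 0.
nged = ged / ((g1.number_of_nodes() + g2.number_of_nodes()) / 2)
score = math.exp(-nged)
```

Learned models such as those cited above regress this kind of score directly from the two graphs, avoiding the exponential-time exact computation at inference.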
As shown in Fig. 2(a)-(e), our conclusion is that all global pooling functions, also called graph readout functions, map graphs with similar numbers of nodes to similar embeddings, which reduces the separability

