RETHINKING THE EXPLANATION OF GRAPH NEURAL NETWORK VIA NON-PARAMETRIC SUBGRAPH MATCHING

Anonymous

Abstract

The great success of graph neural networks (GNNs) provokes a question about explainability: "Which fraction of the input graph is the most determinant of the prediction?" However, current approaches usually resort to a black-box to decipher another black-box (i.e., the GNN), making it difficult to understand how the explanation is produced. Based on the observation that graphs typically share joint motif patterns, we propose a novel subgraph matching framework named MatchExplainer to explore explanatory subgraphs. It couples the target graph with other counterpart instances and identifies the most crucial joint substructure by minimizing the node-correspondence-based distance between them. An external graph ranking step then selects the most informative substructure from all subgraph candidates. Thus, MatchExplainer is entirely non-parametric. Moreover, existing graph sampling and node dropping methods usually suffer from the false positive sampling problem. To ameliorate this issue, we leverage MatchExplainer to fix the most informative portion of the graph and apply graph augmentations only to the remaining, less informative part, a technique we dub MatchDrop. We conduct extensive experiments on both synthetic and real-world datasets, showing the effectiveness of MatchExplainer, which outperforms all parametric baselines by large margins. Additional results also demonstrate that MatchDrop is a general paradigm that can be equipped with GNNs for enhanced performance.

1. INTRODUCTION

Graph neural networks (GNNs) have drawn broad interest due to their success in learning representations of graph-structured data, such as social networks (Fan et al., 2019), knowledge graphs (Schlichtkrull et al., 2018), traffic networks (Geng et al., 2019), and microbiological graphs (Gilmer et al., 2017). Despite their remarkable efficacy, GNNs lack transparency, as the rationales behind their predictions are not easy for humans to comprehend. This prevents practitioners not only from gaining an understanding of the network's characteristics, but also from correcting systematic patterns of mistakes made by models before deploying them in real-world applications.

Recently, extensive efforts have been devoted to studying the explainability of GNNs (Yuan et al., 2020). Researchers strive to answer questions like "What knowledge of the input graph is the most dominant in the model's decision?" Toward this end, feature attribution and selection (Selvaraju et al., 2017; Sundararajan et al., 2017; Ancona et al., 2017) is a prevalent paradigm. These methods distribute the model's outcome prediction over the input graph via gradient-like signals (Baldassarre & Azizpour, 2019; Pope et al., 2019; Schnake et al., 2020), mask or attention scores (Ying et al., 2019; Luo et al., 2020), or prediction changes on perturbed features (Schwab & Karlen, 2019; Yuan et al., 2021), and then choose a salient substructure as the explanation. Nonetheless, the latest approaches are all deep-learning-based and rely on a network to parameterize the generation process of explanations (Vu & Thai, 2020; Wang et al., 2021b). We argue that depending on another black-box to comprehend the prediction of the target black-box (i.e., the GNN) is sub-optimal, since the behavior of those explainers is itself hard to interpret. Indeed, these black-boxes fail to give a clue as to how they find proper explanatory subgraphs.
In contrast, a decent explainer ought to provide clear insight into how it captures and values this substructure; otherwise, a lack of interpretability in explainers can undermine our trust in them. Moreover, some prior works (Chen et al., 2018; Ying et al., 2019; Yuan et al., 2021) independently excavate explanations for each instance without explicitly referring to other training data in the inference phase. They ignore the fact that different essential subgraph patterns are shared by different groups of graphs, which can be the key to deciphering the decisions of GNNs. These frequently occurring motifs usually carry rich semantic meaning and indicate the characteristics of the whole graph instance (Henderson et al., 2012; Zhang et al., 2020; Banjade et al., 2021). For example, the hydroxyl group (-OH) in small molecules typically results in higher water solubility, and the pivotal role of functional groups has also been demonstrated in protein structure prediction (Senior et al., 2020).

To overcome these drawbacks, we propose to mine the explanatory motif in a subgraph matching manner. In contrast to a learnable network, we design a non-parametric algorithm dubbed MatchExplainer that requires no training and is composed of two stages. In the first stage, it pairs the target graph iteratively with other counterpart graphs and explores the most crucial joint substructure by minimizing the node-correspondence-based distance in the high-dimensional feature space. Since the counterpart graphs are diverse, the explanations produced in the first stage can be non-unique for the same instance. Thus, an external graph ranking technique follows as the second stage to pick out the most appropriate one. Explicitly, it examines the important role these substructures play in determining the graph property by subtracting each subgraph from the original input graph and testing the prediction on the remaining part.
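The two stages can be illustrated with a minimal NumPy sketch. This is a simplified illustration, not the paper's exact procedure: the greedy nearest-neighbour node matching, the `predict` callable, and the embedding matrices `h_target`/`h_counter` are all assumptions made for clarity.

```python
import numpy as np

def match_subgraph(h_target, h_counter, k):
    """Stage 1 (sketch): match target nodes to counterpart nodes by
    embedding distance and keep the k target nodes whose best match
    is closest -- a proxy for the joint substructure."""
    # pairwise Euclidean distances between node embeddings
    dists = np.linalg.norm(h_target[:, None, :] - h_counter[None, :, :], axis=-1)
    # for each target node, the distance to its nearest counterpart node
    best = dists.min(axis=1)
    # the k target nodes with the smallest correspondence distance
    return np.argsort(best)[:k]

def rank_candidates(predict, node_feats, candidates, label):
    """Stage 2 (sketch): prefer the candidate subgraph whose removal
    hurts the prediction of the true label the most, i.e. the remainder
    carries the least label information."""
    def remainder_score(nodes):
        keep = [i for i in range(len(node_feats)) if i not in set(nodes)]
        return predict(node_feats[keep])[label]
    return min(candidates, key=remainder_score)
```

Here `predict` stands in for the frozen, well-trained GNN; in practice the counterpart graphs would be drawn from the same predicted class as the target.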
Our MatchExplainer not only shows great potential for quickly discovering explanations for GNNs, but can also be employed to enhance traditional graph augmentation methods. Though exhibiting strong power in preventing over-fitting and over-smoothing, existing graph sampling or node dropping mechanisms suffer from the false positive sampling problem: nodes or edges of the most informative substructure are accidentally dropped or erased, yet the model is still required to predict the original property, which can be misleading. To alleviate this obstacle, we take advantage of MatchExplainer and introduce a simple technique called MatchDrop. Specifically, it first digs out the explanatory subgraph by means of MatchExplainer and keeps this part unchanged; graph sampling or node dropping is then applied solely to the remaining, less informative part. As a consequence, the core fraction of the input graph that reveals the label information is unaffected and the false positive sampling issue is effectively mitigated.

To summarize, to the best of our knowledge we are the first to investigate the explainability of GNNs from the perspective of non-parametric subgraph matching. Extensive experiments on synthetic and real-world applications demonstrate that MatchExplainer finds explanatory subgraphs quickly and accurately, achieving state-of-the-art performance. Apart from that, we empirically show that MatchDrop serves as an efficient way to improve graph augmentation methods.
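The MatchDrop idea above can be sketched in a few lines. This is a hedged illustration rather than the paper's implementation: the function name, the node-index representation of the graph, and the uniform dropping rule are all assumptions.

```python
import numpy as np

def match_drop(num_nodes, explan_nodes, drop_ratio, rng=None):
    """MatchDrop-style node dropping (sketch): the explanatory nodes
    found by the matcher are kept intact, and random dropping is applied
    only to the remaining, less informative nodes. Returns the indices
    of the nodes that survive the augmentation."""
    rng = rng or np.random.default_rng()
    # the less informative part: everything outside the explanation
    rest = [i for i in range(num_nodes) if i not in set(explan_nodes)]
    n_drop = int(len(rest) * drop_ratio)
    dropped = set(rng.choice(rest, size=n_drop, replace=False))
    return [i for i in range(num_nodes) if i not in dropped]
```

Because the explanatory nodes can never be dropped, the label-revealing core of the graph survives every augmented view, which is exactly how the false positive sampling issue is avoided.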

2. PRELIMINARY AND TASK DESCRIPTION

In this section, we begin with a description of the GNN explanation task and then briefly review the relevant background on graph matching and mutual information theory.

Explanations for GNNs. Let f_Y denote a well-trained GNN to be explained, which gives the prediction ŷ_G for the input graph G to approximate the ground-truth label y_G. Without loss of generality, we consider the problem of explaining a graph classification task (Ying et al., 2019; Yuan et al., 2020) as finding an explainer f_S that discovers the subgraph G_S from the input graph G by:

    arg min_{f_S} R(f_Y ∘ f_S(G), ŷ_G),  s.t. |f_S(G)| ≤ K,

where R(·, ·) is the risk function, usually implemented as a cross-entropy (CE) loss or a mean squared error (MSE) loss, | · | returns the graph size (namely the number of nodes in this paper), and K is a prefixed size constraint.

Graph Matching. As a classic combinatorial problem, graph matching is known to be NP-hard (Loiola et al., 2007). Solving it exactly requires expensive, complex, and impractical solvers, hence plenty of inexact but practical solutions (Wang et al., 2020) have been proposed. Given two different

