RETHINKING THE EXPLANATION OF GRAPH NEURAL NETWORKS VIA NON-PARAMETRIC SUBGRAPH MATCHING

Anonymous authors

Abstract

The great success of graph neural networks (GNNs) provokes the question of explainability: "Which fraction of the input graph is the most decisive for the prediction?" However, current approaches usually resort to a black-box to decipher another black-box (i.e., the GNN), making it difficult to understand how the explanation is produced. Based on the observation that graphs typically share joint motif patterns, we propose a novel subgraph matching framework, named MatchExplainer, to explore explanatory subgraphs. It couples the target graph with other counterpart instances and identifies the most crucial joint substructure by minimizing a node correspondence-based distance between them. A subgraph ranking step then selects the most informative substructure from all candidates. Thus, MatchExplainer is entirely non-parametric. Moreover, existing graph sampling or node dropping methods usually suffer from the false positive sampling problem. To ameliorate this issue, we take advantage of MatchExplainer to fix the most informative portion of the graph and apply graph augmentations only to the remaining, less informative part, a scheme we dub MatchDrop. We conduct extensive experiments on both synthetic and real-world datasets, demonstrating the effectiveness of MatchExplainer, which outperforms all parametric baselines by large margins. Additional results also show that MatchDrop is a general paradigm that can be equipped with GNNs for enhanced performance.
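The MatchDrop scheme described above can be sketched in a few lines. In this illustrative toy version (function and parameter names are our own, not from the paper's implementation), the informative node set is assumed to be precomputed, standing in for the subgraph found by MatchExplainer's matching step:

```python
import random

def match_drop(nodes, informative, drop_ratio=0.2, seed=0):
    """Toy sketch of the MatchDrop idea: the nodes judged most
    informative (here a precomputed set, abstracting away the
    MatchExplainer matching step) are always kept, and node dropping
    is applied only to the remaining, less informative nodes."""
    rng = random.Random(seed)
    rest = [n for n in nodes if n not in informative]
    n_drop = int(len(rest) * drop_ratio)
    dropped = set(rng.sample(rest, n_drop))
    # The informative portion is never dropped, so the augmented graph
    # cannot lose the label-relevant substructure -- this is how the
    # false positive sampling problem is avoided.
    return [n for n in nodes if n not in dropped]

kept = match_drop(list(range(10)), informative={0, 1, 2}, drop_ratio=0.5)
```

A full implementation would operate on edges and node features as well, but the key design choice is visible here: augmentation randomness is confined to the uninformative remainder of the graph.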

1. INTRODUCTION

Graph neural networks (GNNs) have drawn broad interest due to their success in learning representations of graph-structured data, such as social networks (Fan et al., 2019), knowledge graphs (Schlichtkrull et al., 2018), traffic networks (Geng et al., 2019), and microbiological graphs (Gilmer et al., 2017). Despite their remarkable efficacy, GNNs lack transparency, as the rationales behind their predictions are not easy for humans to comprehend. This prohibits practitioners not only from gaining an understanding of the network's characteristics, but also from correcting systematic patterns of mistakes made by models before deploying them in real-world applications. Recently, extensive efforts have been devoted to studying the explainability of GNNs (Yuan et al., 2020). Researchers strive to answer questions such as "What knowledge of the input graph is the most important in the model's decision?" Towards this end, feature attribution and selection (Selvaraju et al., 2017; Sundararajan et al., 2017; Ancona et al., 2017) is a prevalent paradigm. These methods attribute the model's prediction to the input graph via gradient-like signals (Baldassarre & Azizpour, 2019; Pope et al., 2019; Schnake et al., 2020), mask or attention scores (Ying et al., 2019; Luo et al., 2020), or prediction changes under feature perturbations (Schwab & Karlen, 2019; Yuan et al., 2021), and then choose a salient substructure as the explanation. Nonetheless, the latest approaches are all deep learning-based and rely on a network to parameterize the generation process of explanations (Vu & Thai, 2020; Wang et al., 2021b). We argue that depending on another black-box to comprehend the prediction of the target black-box (i.e., the GNN) is sub-optimal, since the behavior of those explainers is itself hard to interpret. Indeed, these black-boxes often fail to give any clue as to how they find proper explanatory subgraphs. In contrast, a decent

