GNNINTERPRETER: A PROBABILISTIC GENERATIVE MODEL-LEVEL EXPLANATION FOR GRAPH NEURAL NETWORKS

Abstract

Recently, Graph Neural Networks (GNNs) have significantly advanced the performance of machine learning tasks on graphs. However, this technological breakthrough makes people wonder: how does a GNN make such decisions, and can we trust its predictions with high confidence? In critical fields such as biomedicine, where making wrong decisions can have severe consequences, it is crucial to interpret the inner working mechanisms of GNNs before applying them. In this paper, we propose GNNInterpreter, a model-agnostic, model-level explanation method for GNNs that follow the message-passing scheme, to explain the high-level decision-making process of the GNN model. More specifically, GNNInterpreter learns a probabilistic generative graph distribution that produces the most discriminative graph pattern the GNN tries to detect when making a certain prediction, by optimizing a novel objective function specifically designed for model-level explanation of GNNs. Compared to existing works, GNNInterpreter is more flexible and computationally efficient in generating explanation graphs with different types of node and edge features, without introducing another black box or requiring manually specified domain-specific rules. In addition, experimental studies conducted on four different datasets demonstrate that the explanation graphs generated by GNNInterpreter match the desired graph pattern if the model is ideal; otherwise, potential model pitfalls can be revealed by the explanation.

1. INTRODUCTION

Graphs are widely used to model data in many applications such as chemistry, transportation, etc. Since a graph is a unique non-Euclidean data structure, modeling graph data remained a challenging task until Graph Neural Networks (GNNs) emerged (Hamilton et al., 2017; Cao et al., 2016). As a powerful tool for representation learning on graph data, GNNs have achieved state-of-the-art performance on various machine learning tasks on graphs. As the popularity of GNNs rapidly increases, people begin to wonder why one should trust this model and how the model makes decisions. However, the complexity of GNNs prevents humans from interpreting the underlying mechanism in the model. The lack of self-explainability becomes a serious obstacle to applying GNNs to real-world problems, especially when making wrong decisions may incur an unaffordable cost. Explaining deep learning models on text or image data (Simonyan et al., 2014; Selvaraju et al., 2019) has been well studied. However, explaining deep learning models on graphs is still less explored. Compared with explaining models on text or image data, explaining deep graph models is a more challenging task for several reasons (Yuan et al., 2020b): (i) the adjacency matrix representing the topological information has only discrete values, which cannot be directly optimized via gradient-based methods (Duval & Malliaros, 2021); (ii) in some application domains, a graph is valid only if it satisfies a set of domain-specific graph rules, so generating a valid explanation graph that depicts the underlying decision-making process of GNNs is a nontrivial task; and (iii) the graph data structure is heterogeneous in nature, with different types of node features and edge features, which makes developing a one-size-fits-all explanation method for GNNs even more challenging.
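To make challenge (i) concrete, one common workaround (used here purely as an illustration, not as the paper's exact formulation) is to relax each discrete edge into a continuous value via the binary Concrete (Gumbel-Sigmoid) distribution, so that gradients can flow back to learnable edge logits:

```python
# Sketch: relaxing a discrete adjacency matrix so it can be optimized with
# gradients. The function name and temperature value are illustrative only.
import torch

def sample_soft_adjacency(theta: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Sample a differentiable 'soft' adjacency matrix from edge logits theta.

    Uses the binary Concrete (Gumbel-Sigmoid) relaxation: each entry is a
    continuous value in (0, 1) that concentrates toward {0, 1} as tau -> 0.
    """
    u = torch.rand_like(theta).clamp(1e-6, 1 - 1e-6)   # uniform noise
    logistic_noise = torch.log(u) - torch.log1p(-u)    # Logistic(0, 1) sample
    return torch.sigmoid((theta + logistic_noise) / tau)

theta = torch.zeros(5, 5, requires_grad=True)  # edge logits for a 5-node graph
adj = sample_soft_adjacency(theta)
adj.sum().backward()                           # gradients reach the edge logits
```

As the temperature `tau` decreases, the sampled entries approach hard 0/1 edges while remaining differentiable with respect to `theta`.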
In this paper, we attempt to interpret the high-level decision-making process of GNNs and to identify potential model pitfalls, by resolving these three challenges respectively. In recent years, explaining GNNs has attracted great interest, and thus many research works have been conducted. The existing works can be classified into two categories: instance-level explanations (Luo et al., 2020; Ying et al., 2019; Vu & Thai, 2020) and model-level explanations (Yuan et al., 2020a). Instance-level explanation methods focus on explaining the model prediction for a given graph instance, whereas model-level explanation methods aim at understanding the general behavior of the model, not specific to any particular graph instance. If the ultimate goal is to examine model reliability, one would need to examine many instance-level explanations one by one to draw a rigorous conclusion, which is cumbersome and time-consuming. Conversely, a model-level explanation method can directly explain the high-level decision-making rule inside the black box (GNN) for a target prediction, which is less time-consuming and more informative regarding the trustworthiness of the GNN. Besides, it has been shown that any instance-level explanation method would fail to provide a faithful explanation for a GNN that suffers from the bias attribution issue (Faber et al., 2021), while model-level explanation methods can not only provide a faithful explanation in this case but also diagnose the bias attribution issue. Even though model-level explanation methods for GNNs have such advantages, they are much less explored than instance-level explanation methods. In this paper, we propose a probabilistic generative model-level explanation method for explaining GNNs on the graph classification task, called GNNInterpreter.
It learns a generative explanation graph distribution that represents the most discriminative features the GNN tries to detect when making a certain prediction. Precisely, the explanation graph distribution is learned by optimizing a novel objective function specifically designed for the model-level explanation of GNNs, such that the generated explanation is faithful and more realistic with respect to domain-specific knowledge. More importantly, GNNInterpreter is a general approach to generating explanation graphs with different types of node features and edge features, for explaining different GNNs that follow the message-passing scheme. We quantitatively and qualitatively evaluated the efficiency and effectiveness of GNNInterpreter on four different datasets, including synthetic datasets and public real-world datasets. The experimental results show that GNNInterpreter can precisely find the ideal topological structure for the target prediction if the explained model is ideal, and reveal potential pitfalls in the model's decision-making process if there are any. By identifying these potential model pitfalls, people can be mindful when applying the model to unseen graphs with a specific misleading pattern, which is especially important in fields where making wrong decisions may incur an unaffordable cost. Compared with the current state-of-the-art model-level explanation method for GNNs, XGNN (Yuan et al., 2020a), the quantitative and qualitative evaluation results show that the explanation graphs generated by GNNInterpreter are more representative of the target class than the explanations generated by XGNN. Additionally, GNNInterpreter has the following advantages compared with XGNN:

• GNNInterpreter is a more general approach that can generate explanation graphs with different types of node features and edge features, whereas XGNN cannot generate graphs with continuous node features or any type of edge features.
• By taking advantage of the special design of our objective function, GNNInterpreter is more flexible in explaining different GNN models without requiring domain-specific knowledge for the specific task, whereas XGNN requires domain-specific knowledge to manually design the reward function for its reinforcement learning agent.

• GNNInterpreter is more computationally efficient. Its time complexity is much lower than that of training a deep reinforcement learning agent as in XGNN; in practice, it usually takes less than a minute to explain one class of a GNN model.

• GNNInterpreter is a numerical optimization approach that does not introduce another black box to explain GNNs, unlike XGNN, which trains a deep learning model to explain GNNs.

There exists a variety of GNN models (Kipf & Welling, 2017; Gilmer et al., 2017; Veličković et al., 2018), but they often share the common idea of message passing, as described in Section 3. In this paper, we will focus on interpreting the high-level decision-making process of graph classifiers and diagnosing their potential model pitfalls.
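The overall idea described above can be sketched as a small numerical optimization: learn edge logits whose sampled (relaxed) graph maximizes the GNN's score for a target class. The toy one-layer GNN, the temperature, and all names below are illustrative assumptions, not the paper's actual architecture or objective:

```python
# Hypothetical sketch of model-level explanation by gradient-based search
# over a relaxed graph distribution. TinyGNN is a stand-in classifier.
import torch

class TinyGNN(torch.nn.Module):
    """Stand-in one-layer message-passing classifier (not the paper's model)."""
    def __init__(self, dim=8, n_classes=2):
        super().__init__()
        self.lin = torch.nn.Linear(dim, dim)
        self.out = torch.nn.Linear(dim, n_classes)

    def forward(self, x, adj):
        h = torch.relu(self.lin(adj @ x))   # one round of message passing
        return self.out(h.mean(dim=0))      # mean-pool readout -> class logits

n, dim, target_class = 6, 8, 1
gnn = TinyGNN(dim)
x = torch.randn(n, dim)                        # fixed node features for the sketch
theta = torch.zeros(n, n, requires_grad=True)  # learnable edge logits
opt = torch.optim.Adam([theta], lr=0.1)

for _ in range(100):
    # Binary Concrete relaxation: differentiable sample of a soft adjacency
    u = torch.rand(n, n).clamp(1e-6, 1 - 1e-6)
    adj = torch.sigmoid((theta + torch.log(u) - torch.log1p(-u)) / 0.2)
    loss = -gnn(x, adj)[target_class]          # maximize the target-class score
    opt.zero_grad()
    loss.backward()
    opt.step()

explanation_adj = (theta > 0).float()          # threshold logits into a graph
```

Because the search runs entirely via gradient descent on `theta`, it avoids training a separate policy network, which is the intuition behind the efficiency claim above.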

