GLOBAL EXPLAINABILITY OF GNNS VIA LOGIC COMBINATION OF LEARNED CONCEPTS

Abstract

While instance-level explanation of GNNs is a well-studied problem with plenty of approaches being developed, providing a global explanation for the behaviour of a GNN is much less explored, despite its potential in interpretability and debugging. Existing solutions either simply list local explanations for a given class, or generate a synthetic prototypical graph with maximal score for a given class, completely missing any combinatorial aspect that the GNN could have learned. In this work, we propose GLGExplainer (Global Logic-based GNN Explainer), the first Global Explainer capable of generating explanations as arbitrary Boolean combinations of learned graphical concepts. GLGExplainer is a fully differentiable architecture that takes local explanations as inputs and combines them into a logic formula over graphical concepts, represented as clusters of local explanations. Contrary to existing solutions, GLGExplainer provides accurate and human-interpretable global explanations that are perfectly aligned with ground-truth explanations (on synthetic data) or match existing domain knowledge (on real-world data). Extracted formulas are faithful to the model predictions, to the point of providing insights into some occasionally incorrect rules learned by the model, making GLGExplainer a promising diagnostic tool for learned GNNs.

1. INTRODUCTION & RELATED WORK

Graph Neural Networks (GNNs) have become increasingly popular for predictive tasks on graph-structured data. However, like many other deep learning models, their inner workings remain a black box. The ability to understand the reason behind a certain prediction is a critical requirement for any decision-critical application, and this opacity is a major obstacle to the transition of such algorithms from benchmarks to real-world critical applications. Over the last years, many works have proposed Local Explainers (Ying et al., 2019; Luo et al., 2020; Yuan et al., 2021; Vu & Thai, 2020; Shan et al., 2021; Pope et al., 2019; Magister et al., 2021) to explain the decision process of a GNN in terms of factual explanations, often represented as subgraphs of each sample in the dataset. We refer to Yuan et al. (2022) for a detailed overview of Local Explainers, including a recently proposed taxonomy that categorizes their heterogeneity. Overall, Local Explainers shed light on why the network predicted a certain value for a specific input sample. However, they still lack a global understanding of the model. Global Explainers, on the other hand, aim to capture the behaviour of the model as a whole, abstracting away individual noisy local explanations in favor of a single robust overview of the model. Nonetheless, despite this potential in interpretability and debugging, little has been done in this direction. GLocalX (Setzu et al., 2021) is a general solution to produce global explanations of black-box models by hierarchically aggregating local explanations into global rules. This solution is however not readily applicable to GNNs, as it requires local explanations to be expressed as logical rules. Yuan et al. (2020) proposed XGNN, which frames the Global Explanation problem for GNNs as a form of input optimization (Wu et al., 2020), using policy gradient to generate synthetic prototypical graphs for each class.
The approach requires prior domain knowledge, which is not always available, to drive the generation of valid prototypes. Additionally, it cannot identify any compositionality in the returned explanation, and has no principled way to generate alternative explanations for a given class. Indeed, our experimental evaluation shows that XGNN fails to generate meaningful global explanations in all the tasks we investigated. Concept-based Explainability (Kim et al., 2018; Ghorbani et al., 2019; Yeh et al., 2020) is a parallel line of research where explanations are constructed using "concepts", i.e., intermediate, high-level and semantically meaningful units of information commonly used by humans to explain their decisions. Concept Bottleneck Models (Koh et al., 2020) and Prototypical Part Networks (Chen et al., 2019a) are two popular architectures that leverage concept learning to build explainable-by-design neural networks. Similarly to Concept Bottleneck Models, Logic Explained Networks (LENs) (Ciravegna et al., 2021a) generate logic-based explanations for each class, expressed in terms of a set of input concepts. Such concept-based classifiers improve human understanding, as their input and output spaces consist of interpretable symbols (Wu et al., 2018; Ghorbani et al., 2019; Koh et al., 2020). These approaches have recently been adapted to GNNs (Zhang et al., 2022; Georgiev et al., 2022; Magister et al., 2022). However, these solutions are not conceived for explaining already-trained GNNs. Our contribution is the first Global Explainer for GNNs which i) provides a Global Explanation in terms of logic formulas, extracted by combining, in a fully differentiable manner, graphical concepts derived from local explanations; and ii) is faithful to the data domain, i.e., the logic formulas, being derived from local explanations, are intrinsically part of the input domain and require no prior knowledge.
We validated our approach on both synthetic and real-world datasets, showing that it accurately summarizes the behaviour of the model under explanation in terms of concise logic formulas.
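To make the overall idea concrete, the following minimal sketch (Python with NumPy) illustrates how local-explanation embeddings can be mapped to discrete concepts and how a candidate Boolean formula over those concepts can be scored against the GNN's predictions. All function and variable names here are hypothetical illustrations, and the hard nearest-prototype assignment is a simplification: GLGExplainer itself learns concepts and formulas in a fully differentiable manner.

```python
import numpy as np

def concept_activations(expl_embs, prototypes):
    """Assign each local-explanation embedding to its nearest prototype.

    expl_embs:  (n, d) embeddings of local explanations
    prototypes: (k, d) concept prototypes (e.g., cluster centroids)
    Returns an (n, k) one-hot matrix of concept assignments.
    """
    dists = np.linalg.norm(expl_embs[:, None, :] - prototypes[None, :, :], axis=-1)
    return np.eye(prototypes.shape[0])[dists.argmin(axis=1)]

def formula_fidelity(graph_concepts, formula, gnn_preds):
    """Fraction of graphs on which a Boolean formula over concept
    activations agrees with the GNN's (binary) predictions."""
    out = np.array([bool(formula(c)) for c in graph_concepts])
    return float((out == np.asarray(gnn_preds)).mean())

# Toy usage: three graphs, two concepts; each row says which concepts
# appear among a graph's local explanations.
graph_concepts = np.array([[1, 0], [0, 1], [1, 1]], dtype=bool)
formula = lambda c: c[0] and not c[1]          # candidate explanation
print(formula_fidelity(graph_concepts, formula, [True, False, False]))  # 1.0
```

A fidelity of 1.0 means the formula perfectly mimics the model's predictions on these graphs; in practice this score is what a global explanation should maximize.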

2.1. GRAPH NEURAL NETWORKS

Given a graph G = (V, E) with adjacency matrix A, where A_ij = 1 if there exists an edge between nodes i and j, and a node feature matrix X ∈ R^{|V|×r}, where X_i is the r-dimensional feature vector of node i, a GNN layer aggregates each node's neighborhood information into a d-dimensional refined representation H ∈ R^{|V|×d}. The most common form of aggregation corresponds to the GCN (Kipf & Welling, 2016) architecture, defined by the propagation rule H^(k+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) H^(k) W^(k)), where Ã = A + I, D̃ is the degree matrix of Ã, σ is an activation function, and W^(k) is a layer-wise learned linear transformation. The exact form of this propagation rule is, however, heavily dependent on the architecture, and several variants have been proposed (Kipf & Welling, 2016; Veličković et al., 2017; Gilmer et al., 2017).

2.2. LOCAL EXPLAINABILITY FOR GNNS

Many works have recently proposed Local Explainers to explain the behaviour of a GNN (Yuan et al., 2022). In this work, we broadly refer to all of those whose output can be mapped to a subgraph of the input graph (Ying et al., 2019; Luo et al., 2020; Yuan et al., 2021; Vu & Thai, 2020; Shan et al., 2021; Pope et al., 2019). For the sake of generality, let LEXP(f, G) = Ĝ be the weighted graph obtained by applying the local explainer LEXP to the prediction of the GNN f over the input graph G, where each entry Â_ij of the adjacency matrix of Ĝ is the likelihood of the edge (i, j) being an important edge. By binarizing the output of the local explainer Ĝ with a threshold θ ∈ R, we obtain a set of connected components Ḡ_i such that ∪_i Ḡ_i ⊆ Ĝ. For convenience, we will henceforth refer to each of these Ḡ_i as a local explanation.
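As a concrete, non-normative illustration of the two ingredients above, the sketch below (Python with NumPy) implements a single GCN propagation step and the thresholding of a weighted explanation graph Ĝ into connected components Ḡ_i. Function names and the iterative component extraction are our own choices for this sketch, not part of any specific GNN library or explainer.

```python
import numpy as np

def gcn_layer(A, H, W, activation=np.tanh):
    """One GCN propagation step: H' = sigma(D~^(-1/2) (A+I) D~^(-1/2) H W)."""
    A_tilde = A + np.eye(A.shape[0])                       # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    return activation(d_inv_sqrt @ A_tilde @ d_inv_sqrt @ H @ W)

def local_explanations(A_hat, theta):
    """Binarize a weighted explanation graph at threshold theta and return
    its connected components (each component = one local explanation)."""
    n = A_hat.shape[0]
    A_bin = A_hat >= theta
    seen, components = set(), []
    for start in range(n):
        if start in seen or not A_bin[start].any():
            continue  # skip visited nodes and nodes with no retained edge
        comp, stack = set(), [start]
        while stack:  # iterative traversal of one component
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in range(n) if A_bin[u, v] and v not in comp)
        seen |= comp
        components.append(sorted(comp))
    return components

# Toy usage: only edges (0,1) and (2,3) survive thresholding at 0.5,
# yielding two local explanations.
A_hat = np.zeros((5, 5))
A_hat[0, 1] = A_hat[1, 0] = 0.9
A_hat[2, 3] = A_hat[3, 2] = 0.8
A_hat[1, 2] = A_hat[2, 1] = 0.1
print(local_explanations(A_hat, 0.5))  # [[0, 1], [2, 3]]
```

Note that the choice of θ controls the granularity of the extracted components: a higher threshold fragments Ĝ into smaller, more conservative local explanations.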

3. PROPOSED METHOD

The key contribution of this paper is a novel Global Explainer for GNNs that describes the behaviour of a trained GNN f by providing logic formulas expressed in terms of human-understandable concepts (see Fig. 1). In the process, we use one of the available Local Explainers (Ying et al., 2019; Luo et al., 2020; Yuan et al., 2021; Vu & Thai, 2020; Shan et al., 2021; Pope

