MEGAN: MULTI-EXPLANATION GRAPH ATTENTION NETWORK

Abstract

Explainable artificial intelligence (XAI) methods are expected to improve trust during human-AI interactions, provide tools for model analysis and extend human understanding of complex problems. Explanation-supervised training makes it possible to improve explanation quality by training self-explaining XAI models on ground truth or human-generated explanations. However, existing explanation methods have limited expressiveness and interpretability, since they generate only single explanations in the form of node and edge importances. To that end, we propose the novel multi-explanation graph attention network (MEGAN). Our fully differentiable, attention-based model features multiple explanation channels, which can be chosen independently of the task specifications. We first validate our model on a synthetic graph regression dataset. We show that for the special single-explanation case, our model significantly outperforms existing post-hoc and explanation-supervised baseline methods. Furthermore, we demonstrate significant advantages when using two explanations, both in quantitative explanation measures and in human interpretability. Finally, we demonstrate our model's capabilities on multiple real-world datasets. We find that our model produces sparse, high-fidelity explanations consistent with human intuition about those tasks, while matching state-of-the-art graph neural networks in predictive performance, indicating that explanations and accuracy are not necessarily a trade-off.

1. INTRODUCTION

Explainable AI (XAI) methods aim to provide explanations complementing a model's predictions, making its complex inner workings more transparent to humans, with the intention of improving trust and reliability, providing tools for model analysis, and complying with anti-discrimination laws (Doshi-Velez & Kim, 2017). Many explainability methods have already been proposed for graph neural networks (GNNs), as Yuan et al. (2022) demonstrate in their literature survey. However, the majority of work is focused on post-hoc XAI methods, which aim to provide explanations for already existing models through external analysis procedures. In contrast, we demonstrate significant advantages of methods which Jiménez-Luna et al. (2020) call self-explaining methods. This class of models directly generates explanations alongside each prediction. One inherent advantage of many self-explaining models is their capability for explanation-supervised training. In explanation supervision, the explanations are trained alongside the main prediction task to match known explanation ground truth or human-generated explanations, improving explanation quality in the process. Recently, impressive successes of explanation supervision have been reported in the domains of image processing (Linsley et al., 2019; Qiao et al., 2018; Boyd et al., 2022) and natural language processing (Fernandes et al., 2022; Pruthi et al., 2020; Stacey et al., 2022). In the graph domain, explanation supervision has so far been only sparsely explored (Gao et al., 2021; Magister et al., 2022). Inspired by the explanation-supervision successes demonstrated in other domains, especially by attention-based models, we propose our novel, self-explaining multi-explanation graph attention network (MEGAN) to enable effective explanation-supervised training for graph regression and classification problems.
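The general idea of explanation-supervised training can be illustrated with a minimal sketch (all function names and the weighting coefficient are illustrative assumptions, not the exact procedure used in this work): the model's predicted node-importance mask is penalized for deviating from a ground-truth mask, and this term is added to the main prediction loss.

```python
# Generic sketch of an explanation-supervised training objective.
# All names (prediction_loss, explanation_loss, GAMMA) are illustrative
# placeholders, not taken from the paper.

def prediction_loss(y_pred, y_true):
    """Squared error on the main regression target."""
    return (y_pred - y_true) ** 2

def explanation_loss(mask_pred, mask_true):
    """Mean squared deviation between the predicted and the ground-truth
    node-importance masks (one [0, 1] value per node)."""
    n = len(mask_true)
    return sum((p - t) ** 2 for p, t in zip(mask_pred, mask_true)) / n

GAMMA = 0.1  # hyperparameter weighting the explanation term

def joint_loss(y_pred, y_true, mask_pred, mask_true):
    """Main task loss plus the weighted explanation-supervision term."""
    return prediction_loss(y_pred, y_true) + GAMMA * explanation_loss(mask_pred, mask_true)
```

Minimizing this joint objective trains prediction quality and explanation quality simultaneously, which is only possible for self-explaining models that produce the mask in a differentiable way.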
We specifically want to emphasize our focus on graph regression tasks, which have been ignored by previous work on explanation supervision. We argue that graph regression problems are becoming an especially important topic due to their high relevance in chemistry and material science applications. Typical graph XAI methods and existing work on explanation supervision provide single-channel attributional explanations, which means that each input element (node or edge) is associated with a single [0, 1] value denoting its importance. We argue that explanations merely indicating importance are hardly interpretable for regression tasks, as it remains unknown what such explanations provide evidence for: the exact predicted value, a certain value range, or an especially high/low value? Because of this, we design our model to support an arbitrary number of explanation channels, independent of task specifications. For regression tasks, we choose two channels: a positive channel that indicates evidence for high target values, and a negative channel that contains evidence for low target values. We introduce a special explanation co-training routine to promote channels to behave according to those interpretations. Using synthetic and real-world datasets, we illustrate how multi-channel explanations help to improve interpretability, especially for graph regression problems. We first validate our model on a synthetic graph regression dataset. We show that even in the single-channel case, our model significantly outperforms existing post-hoc and explanation-supervision baseline methods, not only in the accuracy of explanations but also in prediction performance. In general, we find that our proposed model shows surprisingly good prediction performance independent of the aspect of explainability.
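The intended interpretation of the two regression channels can be sketched in a few lines (a toy illustration under assumed names, not MEGAN's actual architecture): each channel assigns every node a [0, 1] importance, the positive channel contributes additively to the output and the negative channel subtractively.

```python
# Toy sketch of two-channel explanations for graph regression.
# Channel "pos" gathers evidence for HIGH target values, channel "neg"
# for LOW values; this mirrors the intended interpretation only, not
# the model's exact computation. All names are hypothetical.

def channel_contribution(node_features, importances):
    """Pool a scalar feature of every node, weighted by that node's
    [0, 1] importance within one explanation channel."""
    return sum(f * w for f, w in zip(node_features, importances))

def predict(node_features, pos_importances, neg_importances):
    """Combine the positive-evidence channel additively and the
    negative-evidence channel subtractively into one regression output."""
    return (channel_contribution(node_features, pos_importances)
            - channel_contribution(node_features, neg_importances))
```

Under this reading, a node highlighted in the positive channel pushes the prediction up and a node highlighted in the negative channel pushes it down, so the explanation directly conveys the *direction* of each node's influence rather than only its magnitude.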
In Appendix D we provide a benchmark comparison with numerous recent GNN architectures, which shows that our model achieves state-of-the-art performance for graph regression tasks. Moving to multi-channel explanations, we show that our model creates explanations that are accurate, sparse, and faithful to predicted values. On three real-world datasets, concerning movie review sentiment analysis and the prediction of solubility and photophysical properties of molecular graphs, we show that our model creates explanations consistent with human intuition and knowledge. Furthermore, we show that our model reproduces known structure-property relationships for the nontrivial singlet-triplet task, supports previously hypothesized explanations, and even produces new hypotheses for explanatory motifs.

2. RELATED WORK

Graph explanations. Yuan et al. (2022) provide an overview of XAI methods that were either adopted or specifically designed for graph neural networks (GNNs). Notable ones include GradCAM (Pope et al., 2019), GraphLIME (Huang et al., 2022) and GNNExplainer (Ying et al., 2019). Jiménez-Luna et al. (2020) present another overview of XAI methods used for the application domain of drug discovery. Sanchez-Lengeling et al. (2020) evaluate many common graph XAI methods for tasks of chemical property prediction. Henderson et al. (2021), for example, introduce regularization terms to improve GradCAM-generated explanations for chemical property prediction. Most of the approaches presented here are classified as post-hoc methods, which aim to explain the decisions of existing models in hindsight. Few prior works explore the class of GNNs which Jiménez-Luna et al. (2020) describe as self-explaining. Notably, Magister et al. (2022) introduce a self-explaining graph-concept network and Zhang et al. (2022) outline a prototype-learning approach for graphs where internal prototypes act as natural explanations.

Explanation supervision. During explanation-supervised training, the explanations generated by the model are trained to match a given dataset of usually human-generated explanations alongside the main prediction task. Linsley et al. (2019), Qiao et al. (2018) and Boyd et al. (2022), for example, have demonstrated promising results for explanation supervision in the image processing domain. Likewise, Fernandes et al. (2022), Pruthi et al. (2022) and Stacey et al. (2022) demonstrate this for the language processing domain. Recently, Gao et al. (2021) introduced GNES, a method to perform GNN explanation supervision using GradCAM-generated explanations. In our work, we show that MEGAN significantly improves explanation-supervision capabilities compared to GNES. Additionally, Magister et al. (2022) emphasize that their method supports explanation supervision with human-generated explanations; however, the concept-based explanations generated by their approach are not empirically comparable to the attributional explanations produced by our model.

