THE EFFECTIVE COALITIONS OF SHAPLEY VALUE FOR INTEGRATED GRADIENTS

Anonymous

Abstract

Many methods explain deep neural networks (DNNs) by attributing a prediction to the input features, such as Integrated Gradients and Deep Shap, both of which face a critical baseline problem. Previous studies pursue a perfect but intractable baseline value, which is hard to find and carries a very high computational cost, limiting the applicability of these baseline methods. In this paper, we instead propose to find a set of baseline values corresponding to Shapley values, which are easier to find and cheaper to compute. To address the computational intractability of the Shapley value, we propose the Effective Shapley value (ES), a proportional sampling method that accurately approximates the ratios between the Shapley values of features, and then propose Shapley Integrated Gradients (SIG), which combines Integrated Gradients with ES to achieve a good balance between efficiency and effectiveness. Experimental results show that ES approximates the ratios between Shapley values accurately and stably, and that SIG performs substantially better than common baseline values at a similar computational cost.

1. INTRODUCTION

Deep Learning (DL) has exhibited significant success in various tasks, such as computer vision and reinforcement learning. Unfortunately, under the curse of the transparency-performance trade-off, it is difficult to understand the intrinsic working logic of DL models. Attributing the prediction of a deep network to its input features is one of the most popular approaches to explaining DL, including DeepLift (Shrikumar et al. (2017)), Integrated Gradients (Sundararajan et al. (2017)) and Deep Shap (Lundberg & Lee (2017)). All these methods share a crucial problem: how to choose a good baseline as a benchmark for the input. As mentioned in Frye et al. (2020), the quality of baselines determines the quality of explanations for DL. Ren et al. (2021) state two key requirements: (i) baseline values should remove all information represented by the original variable values, and (ii) baseline values should not introduce new or abnormal information. Some studies provide empirical baseline values based on practical experience: Ancona et al. (2019) set baseline values to zero, Dabkowski & Gal (2017) set baseline values to the mean over many samples, and practitioners often randomly select samples from the dataset. Other studies try to find a more principled baseline value: Fong & Vedaldi (2017) smooth baseline values by blurring the input image with Gaussian noise, Frye et al. (2020) set the baseline value of a pixel using its surrounding pixels, and Ren et al. (2021) learn baseline values corresponding to a set of features. All of these methods try to approximate the perfect baseline value. However, it is difficult in practice to find a baseline value that perfectly satisfies the two principles across varied inputs. For example, in computer vision the zero baseline is common, viewed as carrying no additional information; but in a facial-expression coding task for Asian faces with black eyes, the zero baseline is no longer suitable, because the black region around the eyes does carry information. The ideal baseline would be a fully transparent image, which cannot be represented in a computer. Consequently, although many methods try to find a perfect baseline value, most practitioners still use empirical baseline values chosen by experience, which leads to unsatisfactory and unstable results. Instead of seeking a single perfect baseline value, we propose to find a set of informative baseline values, which are easier to find and much cheaper to compute. The Shapley value (Shapley (1951)) is computed as a sum of marginal differences over all coalitions, and the Shapley value can
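To make the two ingredients above concrete, the following is a minimal sketch (not the paper's ES or SIG method): a numerical Integrated Gradients that shows how attributions depend on the chosen baseline, and an exact Shapley value that sums marginal differences over all coalitions, whose loop over every subset is what makes exact computation intractable for many features. The toy model `f` and all function names here are illustrative assumptions.

```python
from itertools import combinations
from math import factorial

def f(x):
    # Toy "network": a simple nonlinear function of three features.
    return x[0] * x[1] + 2.0 * x[2]

def integrated_gradients(f, x, baseline, steps=100):
    # Numerical IG along the straight path from `baseline` to `x`,
    # using central-difference gradients so the sketch is self-contained.
    n, eps = len(x), 1e-5
    attributions = [0.0] * n
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        for i in range(n):
            hi = point[:]; hi[i] += eps
            lo = point[:]; lo[i] -= eps
            grad_i = (f(hi) - f(lo)) / (2 * eps)
            attributions[i] += grad_i * (x[i] - baseline[i]) / steps
    return attributions

def shapley_values(f, x, baseline):
    # Exact Shapley values: features outside the coalition S are set to
    # their baseline values. Enumerating every subset of the other
    # features costs O(2^n) model evaluations per feature.
    n = list(range(len(x)))
    phi = [0.0] * len(x)
    for i in n:
        others = [j for j in n if j != i]
        for size in range(len(others) + 1):
            for S in combinations(others, size):
                weight = (factorial(len(S)) * factorial(len(n) - len(S) - 1)
                          / factorial(len(n)))
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in n]
                without_i = [x[j] if j in S else baseline[j] for j in n]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

x = [1.0, 2.0, 3.0]
zero_baseline = [0.0, 0.0, 0.0]
print(integrated_gradients(f, x, zero_baseline))  # attributions w.r.t. the zero baseline
print(shapley_values(f, x, zero_baseline))        # sums to f(x) - f(baseline) = 8
```

Both methods satisfy completeness here (the attributions sum to f(x) - f(baseline)), but swapping in a different baseline changes every individual attribution, which is exactly the baseline problem this paper targets.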

