SCALING SYMBOLIC METHODS USING GRADIENTS FOR NEURAL MODEL EXPLANATION

Abstract

Symbolic techniques based on Satisfiability Modulo Theories (SMT) solvers have been proposed for analyzing and verifying neural network properties, but their usage has been fairly limited owing to their poor scalability with larger networks. In this work, we propose a technique for combining gradient-based methods with symbolic techniques to scale such analyses, and demonstrate its application to model explanation. In particular, we apply this technique to identify minimal regions in an input that are most relevant for a neural network's prediction. Our approach uses gradient information (based on Integrated Gradients) to focus on a subset of neurons in the first layer, which allows our technique to scale to large networks. The corresponding SMT constraints encode the minimal-input-mask discovery problem such that, after masking the input, the activations of the selected neurons remain above a threshold. After solving for the minimal masks, our approach scores the mask regions to generate a relative ordering of the features within the mask. This produces a saliency map that explains "where a model is looking" when making a prediction. We evaluate our technique on three datasets (MNIST, ImageNet, and Beer Reviews) and demonstrate both quantitatively and qualitatively that the regions generated by our approach are sparser and achieve higher saliency scores than those from gradient-based methods alone. Code and examples are at -

1. INTRODUCTION

Satisfiability Modulo Theories (SMT) solvers (Barrett & Tinelli, 2018) are routinely used for symbolic modeling and verifying correctness of software programs (Srivastava et al., 2009), and more recently they have also been used for verifying properties of deep neural networks (Katz et al., 2017). SMT solvers in their current form are difficult to scale to large networks. Model explanation is one such domain where SMT solvers have been used, but they are limited to very small networks (Gopinath et al., 2019; Ignatiev et al., 2019). The goal of our work is to address the scalability of SMT solvers by using gradient information, thus enabling their use for different applications. In this work, we present a new application of SMT solvers for explaining neural network decisions. Model explanation can be viewed as identifying a minimal set of features in a given input that is critical to a model's prediction (Carter et al., 2018; Macdonald et al., 2019). Such a problem formulation lends itself to the use of SMT solvers: we can encode a neural network using real arithmetic (Katz et al., 2017) and use an SMT solver to optimize over the constraints to identify a minimal set of inputs that can explain the prediction. However, there are two key challenges in this approach. First, we cannot generate reliable explanations based on the final model prediction alone, as the minimal input is typically out of distribution. Second, solving such a formulation is challenging for SMT solvers because the decision procedures for these constraints have exponential complexity, which is further exacerbated by the large number of parameters in typical neural network models. Thus, previous approaches for SMT-based analysis of neural networks have been quite limited, scaling only to networks with a few thousand parameters.
To address these challenges, instead of performing minimization over an encoding of the entire network, our approach takes advantage of gradient information, specifically Integrated Gradients (IG) (Sundararajan et al., 2017), in lieu of encoding the deeper layers, and encodes a much simpler set of linear constraints pertaining to the layer closest to the input. We encode the mathematical equations of a neural network as SMT constraints in the theory of Linear Real Arithmetic (LRA), and use the Z3 solver (Bjørner et al., 2015) as it additionally supports optimization objectives such as minimization. The SMT solver then finds a minimal subset of input features by performing minimization over these equations. Thus, our approach, which we refer to as SMUG, is able to scale Symbolic Methods Using Gradient information while still providing a faithful explanation of the neural network's decision. SMUG is built on two properties. First, based on the target prediction, SMUG uses gradient information propagated from the deeper layers to identify the important neurons in the first layer, and encodes only those; for this we use IG (Sundararajan et al., 2017) rather than raw gradients. Second, for the optimization, a set of input pixels is deemed relevant to the prediction if it activates the neurons identified as important and maintains a fraction of the activation they attain on the original (full) input. Empirically, we observe good performance on visual and text classification tasks. We evaluate SMUG on three datasets: MNIST (LeCun et al., 2010), ImageNet (Deng et al., 2009), and Beer Reviews (McAuley et al., 2012). We show that we can fully encode the minimal feature identification problem for a small feedforward network (without gradient-based neuron selection) on MNIST, but this full SMT encoding scales poorly even for intermediate-sized networks.
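The IG attributions that drive the neuron selection can be approximated with a simple Riemann sum along the straight-line path from a baseline to the input. Below is a minimal numpy sketch on a hand-written differentiable function standing in for the network; the function `f`, the input, and the baseline are all illustrative stand-ins, not part of SMUG itself. The sketch also checks IG's completeness axiom: the attributions sum to the difference in function value between the input and the baseline.

```python
import numpy as np

def f(x):
    # Toy "network output": a smooth scalar function of the input.
    return np.sum(x ** 2) + np.prod(x)

def grad_f(x):
    # Analytic gradient of f; d/dx_i prod(x) = prod(x) / x_i for x_i != 0.
    return 2 * x + np.prod(x) / x

def integrated_gradients(x, baseline, steps=200):
    # Midpoint-rule approximation of the path integral over alpha in [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

x = np.array([1.0, 2.0, 3.0])
baseline = np.array([0.5, 0.5, 0.5])   # hypothetical "uninformative" baseline
ig = integrated_gradients(x, baseline)

# Completeness axiom: attributions sum to f(x) - f(baseline).
assert np.isclose(ig.sum(), f(x) - f(baseline), atol=1e-3)
print("IG attributions:", ig)
```

In SMUG, attributions like these would be computed with respect to first-layer activations to decide which neurons to encode symbolically; the masking formulation then sidesteps the sensitivity of IG to the baseline choice discussed in the contributions below.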
On ImageNet, we observe that our method performs better than Integrated Gradients (Sundararajan et al., 2017) and several strong baselines. Additionally, our approach finds significantly sparser masks (on average 17% of the original image size). Finally, we show that our technique is also applicable to text models, where it performs competitively with other methods including SIS (Carter et al., 2018) and Integrated Gradients (Sundararajan et al., 2017).

This paper makes the following key contributions:

• We present a technique (SMUG) to encode the minimal-input-feature discovery problem for neural model explanation using SMT solvers. Our approach, which performs masking on linear equations, also overcomes the issue of handling out-of-distribution samples.

• Our approach uses gradient information to scale SMT-based analysis of neural networks to larger models and input sizes. Further, it overcomes the issue of choosing a "baseline" parameter for Integrated Gradients (Kapishnikov et al., 2019; Sturmfels et al., 2020).

• We empirically evaluate SMUG on image and text datasets, and show that the minimal features it identifies are both quantitatively and qualitatively better than those of several baselines.

• To improve our understanding of saliency map evaluation, we analyze the popular and widely used LSC metric (Dabkowski & Gal, 2017).

While prior symbolic approaches have shown promising results, they only scale to neural networks with around 5,000 nodes. Thus, scaling these approaches to larger neural networks and performing richer analyses based on global input features remains a challenge. We present and demonstrate an approach that works for larger and more complex image and text models. While most existing SMT-based techniques focus on verifying properties of deep networks, our work applies symbolic techniques to the related task of model explanation, i.e., saying where a model is "looking", by solving for the input features responsible for a model's prediction.

