A SAMPLE-BASED METHOD FOR SEMANTIC UNDERSTANDING OF NEURAL NETWORK DECISIONS

Anonymous authors
Paper under double-blind review

Abstract

Limited interpretability is one of the largest obstacles to the wider adoption of deep learning in critical applications. A variety of methods have been introduced to understand and explain decisions made by deep models. One class of these methods highlights which features are most influential to model predictions. These methods have some key weaknesses. First, most of them are applicable only to the atomic elements that make up raw inputs to the model (e.g. pixels or words). Second, they generally do not distinguish between the importance of features individually and their importance due to interactions with other features. As a result, it is difficult to explore high-level questions about how models use features during decision-making. We tackle these issues by proposing Sample-Based Semantic Analysis (SBSA). We use Sobol variance decomposition as our sample-based method, which allows us to quantify the importance of semantic combinations of raw inputs and to highlight the extent to which these features are important individually as opposed to through interactions with other features. We demonstrate the ability of Sobol-SBSA to answer a richer class of questions about the behavior of deep learning models by exploring how CNN models from AlexNet to DenseNet use regions when classifying images. We present three key findings. 1) The architectural improvements from AlexNet to DenseNet manifested themselves in CNN models utilizing greater levels of region interactions for predictions. 2) These same architectural improvements increased the importance that CNN models placed on the background of images. 3) Adversarially robust CNNs reduce the reliance of modern CNNs on both interactions and image background. Our proposed method is generalizable to a wide variety of network and input types and can help provide greater clarity about model decisions.

1. INTRODUCTION

Deep learning models are becoming ubiquitous in various applications. As models are increasingly used for critical applications, such as detecting lung nodules in medicine (Schultheiss et al., 2021) or autonomous driving (Li et al., 2021), it is important either to create interpretable models or to make opaque models human interpretable. This paper focuses on the latter. Existing methods developed over the last decade for doing this can be divided into model-agnostic and model-dependent methods. Model-agnostic methods, such as Shapley values (Kononenko et al., 2013) and Integrated Gradients (Sundararajan et al., 2017), weigh the importance of input features without relying on the structure of the model. In contrast, methods such as GradCam (Selvaraju et al., 2017) and GradCam++ (Chattopadhay et al., 2018) are heavily dependent on model architecture. While these methods yield valuable information about models, they share common gaps. First, they do not distinguish between the features in input space that are individually important and features that are important because of their interaction with other features. Second, the above methods are generally applied to inputs at their most granular level (pixels, words, etc.). The combination of these gaps limits the conclusions that machine learning practitioners can draw about the behavior of models as a whole. We address these limitations in two key ways. First, we introduce a two-part framework called Sample-Based Semantic Analysis (SBSA). The first part of the framework is a function that generates semantic representations of inputs and associates these semantic representations with real numbers. The second part of the framework is a black-box sample-based sensitivity method; in this paper, we use the Sobol method, which reports the importance of individual features and of their interactions.
Second, we demonstrate the ability of Sobol-SBSA to answer a richer set of questions than standard interpretability methods by applying it to CNN models in the context of ImageNet. The key results and contributions of this paper are as follows: 1. We present a general-purpose framework for using sample-based sensitivity methods to analyze the importance of semantic representations of inputs and test it using a variety of black-box methods. 2. We demonstrate that the Sobol method outperforms other popular black-box methods, namely Integrated Gradients, Shapley (Kernel Shap), and LIME, at selecting both the most and least important regions for CNN predictions. 3. We show, through direct measurement, that the main impacts of the evolution of CNN architectures were to increase the extent to which they used region interactions and the extent to which they relied on background information in images. Similarly, we show that adversarially robust versions of CNNs reduce both of these effects for modern CNNs. To our knowledge, Sobol-SBSA is the first method to facilitate the direct measurement of such trends, and to do so within a single pipeline.

2. METHODOLOGY

In this section, we describe the two components of SBSA and specify how we use it to analyze the importance of image regions in ImageNet. In particular, we describe how we associate image regions to quantities that can be analyzed with a sampling-based method, and the specifics of Sobol as a sampling-based sensitivity method.

2.1. SAMPLE-BASED SEMANTIC ANALYSIS (SBSA)

Let us define the following variables: $x \in \mathbb{R}^d$ is an input to a model, $f : x \to y \in \mathbb{R}^s$ is a model that takes $x$ as an argument and produces $y$, $x^{[i]} \in \mathbb{R}^d$ is a sample of $x$, and $N \in \mathbb{Z}$ is a prescribed integer that helps to determine the number of samples $x^{[i]}$ generated. Most sample-based sensitivity methods operate by generating a number of samples that is some function of $N$ and $d$. The model is then evaluated on these samples, and the resulting model outputs are used by sensitivity analysis methods, such as Sobol, to determine the importance of components of $x$ to the model output, $y$. It immediately becomes clear that for deep learning applications with high-dimensional inputs, such as images, videos, and long documents, applying this process naively is prohibitively expensive. This cost can be greatly reduced by turning to semantic representations of inputs instead. In this paper, a semantic representation of an input, $x$, is some combination of the raw components of that input which yields a human-recognizable higher-order feature, such as the colors in an image, image regions, or grammatical parts of sentences. We denote this semantic representation as $\{S_1, \dots, S_l\}$, $S_k \in \mathbb{R}^m$, where $m < d$. Recalling that most sample-based sensitivity methods operate on real numbers, we define three mapping objects.

$$G : x \to \{S_1, \dots, S_l\}, \quad G^{-1} : \{S_1, \dots, S_l\} \to {\approx}\, x, \quad x \in \mathbb{R}^d,\ S_k \in \mathbb{R}^m,\ l < d,\ m < d \tag{1}$$

$$H : \{S_1, \dots, S_l\} \to \{r_1, \dots, r_l\}, \quad S_k \in \mathbb{R}^m,\ r_k \in \mathbb{R} \tag{2}$$

$$R : \{(r_1, S_1), \dots, (r_l, S_l)\} \to \{S^*_1, \dots, S^*_l\}, \quad S^*_k \in \mathbb{R}^m \tag{3}$$

$G$ maps the raw input, $x$, to $l$ semantic representations, $S_k$; $H$ associates the semantic representations with a lower-dimensional vector of real numbers, $r \in \mathbb{R}^l$; and $R$ creates new semantic representations based on $r_k$ and $S_k$. $G$ is invertible. SBSA generates samples of $r$, $[r^{[1]}, \dots, r^{[n]}]$. From these samples, $R$ is used to generate samples of the original semantic representations, $R(r_k^{[i]}, S_k) = S_k^{[i]}$, and $G^{-1}$ uses these samples to create samples of the raw inputs, $x^{[i]}$.

$$\{R(r_1^{[i]}, S_1), \dots, R(r_l^{[i]}, S_l)\} = \{S_1^{[i]}, \dots, S_l^{[i]}\} \tag{4}$$

$$G^{-1}(S_1^{[i]}, \dots, S_l^{[i]}) = x^{[i]} \tag{5}$$

The model is then evaluated on the samples of the raw input, $x^{[i]}$. Since the sampling was done in $r$, the sensitivity analysis reports the importance of the components of the semantic representation of the input, $S_k$, corresponding to the components of $r$, $r_k$. $G$, $H$, and $R$ can be any functions that, respectively, derive $S_k$ from $x$, associate a real number, $r_k$, with $S_k$, and produce a semantic representation, $S^*_k$, as a function of $r_k$ and $S_k$. However, we recommend two properties to get the most out of our approach.

Property 1 (Sensitivity): Any change in $r$ should result in a change in $x$. Sample-based sensitivity methods observe how changing different input components impacts model outputs in order to determine importance. Property 1 ensures that all samples in $r$ contribute information to SBSA.

Property 2 (Approximate Reconstruction): When using $S$ and $r$ produced from the original input, $R$ and $G^{-1}$ should closely approximate $x$. Sample-based sensitivity analysis methods construct samples that are uniformly distributed between 0 and 1, and scale these samples to bounds determined by the original $r$ (Herman & Usher, 2017). Property 2 ensures that the produced samples of $x$, $x^{[i]}$, will increase and decrease different semantic components of $x$ with approximately equal probability.

In the following section we describe the use of SBSA via image regions.
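The flow from $x$ to samples $x^{[i]}$ described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `sbsa_samples`, the toy mappings (each $r_k$ is a unit gain applied to a chunk of a 1-D input), and the sampling bounds $[0, 2 r_k]$ are all hypothetical choices made so that both properties hold.

```python
import numpy as np

def sbsa_samples(x, G, H, R, G_inv, n_samples, seed=0):
    """Sample in the low-dimensional space r and map back to raw inputs."""
    rng = np.random.default_rng(seed)
    S = G(x)                              # semantic representations S_1..S_l
    r = np.asarray(H(S), dtype=float)     # one real number per S_k
    # Uniform draws in [0, 1], scaled to bounds set by the original r.
    # Assumption: bounds are [0, 2 * r_k], so the original r is the midpoint.
    u = rng.random((n_samples, r.size))
    r_samples = 2.0 * r * u
    x_samples = np.stack([G_inv(R(r_i, S)) for r_i in r_samples])
    return r_samples, x_samples

# Hypothetical toy mappings: three chunks of a 1-D input, each with gain r_k.
def G(x):
    return np.split(x, 3)                 # l = 3 semantic features

def H(S):
    return [1.0 for _ in S]               # original gain of every chunk is 1

def R(r, S):
    return [r_k * S_k for r_k, S_k in zip(r, S)]

def G_inv(S):
    return np.concatenate(S)

x = np.arange(1.0, 7.0)
r_samples, x_samples = sbsa_samples(x, G, H, R, G_inv, n_samples=8)
x_rec = G_inv(R(H(G(x)), G(x)))           # Property 2: reconstruct x from original r
```

In this toy setting, a gain below one attenuates a chunk and a gain above one amplifies it, which mirrors the masking and amplification behaviour that Property 2 is meant to balance.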

2.2. SAMPLE-BASED SEMANTIC ANALYSIS (SBSA) APPLIED TO IMAGENET AND REGIONS

In this section, we discuss how we apply our pipeline to the scenario in which the semantic features of interest are regions in an image, and the sample-based sensitivity method is Sobol (Sobol-SBSA). We first describe our mapping functions for regions, then we detail the Sobol method for sensitivity analysis and the relevant measures that it produces. Given an image $x \in \mathbb{R}^d$ where $d = W \times L$, our input-to-semantic-features function, $G$, extracts $l$ regions and normalizes them by the sum of pixels in those regions. The mapping, $H$, associates each region, $S_k$, with the sum of pixels in that region, $r_k$. These values are used to construct the vector $r \in \mathbb{R}^l$.

$$G : x \in \mathbb{R}^d \to \{S_1, \dots, S_l\}, \quad S_k = \frac{x_{\{t,p\} \in S_k}}{\sum_{\{t,p\} \in S_k} x_{tp}} \tag{6}$$

$$H : \{S_1, \dots, S_l\} \to \{r_1, \dots, r_l\}, \quad r_k = \sum_{\{t,p\} \in S_k} x_{tp} \tag{7}$$

The function, $R$, multiplies each semantic feature, $S_k$, by the associated value, $r_k$, and finally $G^{-1}$ stitches the regions together to re-form $x$. Mathematically, this is as follows.

$$R : \{(r_1, S_1), \dots, (r_l, S_l)\} \to \{S^*_1, \dots, S^*_l\}, \quad S^*_k = r_k S_k \tag{8}$$

$$G^{-1} : \{S_1, \dots, S_l\} \to x \tag{9}$$

The above mapping, while simple, satisfies the properties of sensitivity and approximate reconstruction. As a result, the samples of $r$, $r^{[i]}$, produce image samples, $x^{[i]}$, that amplify and mask regions of the image uniformly and with approximately equal probability. The Saltelli method (Saltelli et al., 2010), which chooses optimal points in $[0, 1]$ for the Sobol method, is used to construct these samples, and the model outputs for these image samples are fed to the Sobol method.

We now give a brief overview of the Sobol method. For more details about the implementation of Sobol, and the associated sampling method, see Saltelli et al. (2010) and Herman & Usher (2017). The Sobol method is a variance-based sensitivity method. Given a model, $f : (x_1, \dots, x_d) \in \mathbb{R}^d \to y \in \mathbb{R}^s$, the variance-based first-order effect of a component of the input, $x$, is

$$V_i = V_{x_i}\left(E_{x_{\sim i}}(y_j \mid x_i)\right) \tag{10}$$

where $y_j$ is a component of the model output, $x_i$ is a component of the input, $x$, $x_{\sim i}$ are samples of the input where all components of the input except for $i$ are varied, and $V_{x_i}$ and $E_{x_{\sim i}}$ are the variance and expectation over $x_i$ and $x_{\sim i}$ respectively. The Sobol first-order sensitivity is then the ratio of $V_i$ to the variance of the model output, $y_j$.

$$S_i = \frac{V_i}{V(y_j)} \tag{11}$$

$S_i$ quantifies the individual importance of the $i$-th component of the input, $x$, to the $j$-th component of the output, $y$. In other words, the Sobol first-order index, $S_i$, is the fraction of the variance of the output, $y_j$, that is accounted for by $x_i$ alone. What makes Sobol unique is that it simultaneously calculates $S_i$ and $S_{Ti}$, the total effect index. The total effect index, $S_{Ti}$, is the importance of feature $i$ due to both the feature independently and every higher-order interaction of this feature with other features.

$$S_{Ti} = 1 - \frac{V_{x_{\sim i}}\left(E_{x_i}(y_j \mid x_{\sim i})\right)}{V(y_j)} = S_i + \sum_{k;\, k \neq i} S_{ik} + \sum_{k,j;\, k \neq i,j;\, j \neq i} S_{ikj} + \dots \tag{12}$$

Dividing both sides of equation 12 by $S_{Ti}$ yields

$$1 = \frac{S_i}{S_{Ti}} + \sum_{k;\, k \neq i} \frac{S_{ik}}{S_{Ti}} + \sum_{k,j;\, k \neq i,j;\, j \neq i} \frac{S_{ikj}}{S_{Ti}} + \dots = PIR + \dots \tag{13}$$

The first term in equation 13 reports the extent to which a feature, $x_i$, is important to the model output due to the feature by itself. The larger this term, the greater the importance of the region by itself; conversely, the smaller this term, the larger the importance of the interaction of the feature with other features. We will refer to this term as the Primary Index Ratio (PIR) for the rest of the paper. We note that the number of samples used for the Sobol method is $N(d + 2)$ (Saltelli et al., 2010). For all of the experiments that follow, $N = 50$ and $d = l$, the number of semantic features.
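To make the first-order index, total effect index, and PIR concrete, the following is a minimal sketch of Monte Carlo estimators of the kind described in Saltelli et al. (2010). It is an illustration under stated assumptions rather than the pipeline used in the paper: it draws plain pseudo-random points instead of the quasi-random sequence of the Saltelli sampler, inputs are assumed independent and uniform on $[0, 1]$, and the function name `sobol_indices` and the test function are hypothetical. It uses $N(d + 2)$ model evaluations.

```python
import numpy as np

def sobol_indices(f, d, N, seed=0):
    """Estimate S_i, S_Ti and PIR = S_i / S_Ti with N * (d + 2) evaluations of f."""
    rng = np.random.default_rng(seed)
    A = rng.random((N, d))                 # two independent sample matrices
    B = rng.random((N, d))
    fA, fB = f(A), f(B)                    # 2N evaluations
    var_y = np.var(np.concatenate([fA, fB]))
    S = np.empty(d)
    ST = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]                 # A with column i taken from B
        fAB = f(AB)                        # N evaluations per feature
        S[i] = np.mean(fB * (fAB - fA)) / var_y          # first-order index
        ST[i] = 0.5 * np.mean((fA - fAB) ** 2) / var_y   # total effect index
    return S, ST, S / ST

# Hypothetical test model with a pure interaction: y = x_1 * x_2.
S, ST, PIR = sobol_indices(lambda X: X[:, 0] * X[:, 1], d=2, N=50_000)
```

For this test function the analytic values are $S_1 = 3/7$ and $S_{T1} = 4/7$, so $PIR = 0.75$: a quarter of the importance of $x_1$ comes from its interaction with $x_2$. The estimates should approach these values as $N$ grows.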

2.2.1. CHOICE OF SEMANTIC FEATURES

In order to demonstrate the ability of our method to address high-level questions about CNNs at a semantic level, we apply our pipeline to three types of regions: 1) equally sized image patches, 2) machine-annotated segmentations of the ImageNet training set from Salient ImageNet (Singla & Feizi, 2021), and 3) human-annotated segmentations from the ImageNet-S 919 validation set (Gao et al., 2021). Salient ImageNet segments images into core regions (regions that should be important for predictions) and spurious ones. ImageNet-S is a dataset that was created for evaluating image segmentation tasks. A key difference between ImageNet-S and the other types of regions is that ImageNet-S strictly respects boundaries between objects and background in an image. Region patches and Salient ImageNet do not. Figure 1 shows our pipeline when applied to evenly sized regions of an image and to ImageNet-S 919 segments.
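For the equally sized patch case, the mappings of Section 2.2 can be sketched as below. This is a hedged illustration: the function names, the single-channel image, and the requirement that every patch has a non-zero pixel sum are assumptions made here for simplicity. In this sketch `H` takes the raw image because the normalised patches each sum to one.

```python
import numpy as np

def patches(x, grid):
    """Split a (W, L) image into grid * grid equally sized patches, row by row."""
    ph, pw = x.shape[0] // grid, x.shape[1] // grid
    return [x[a * ph:(a + 1) * ph, b * pw:(b + 1) * pw]
            for a in range(grid) for b in range(grid)]

def G(x, grid):
    # Each region normalised by its pixel sum (assumes non-zero sums).
    return [p / p.sum() for p in patches(x, grid)]

def H(x, grid):
    # r_k: the sum of pixels in region k.
    return np.array([p.sum() for p in patches(x, grid)])

def R(r, S):
    # S*_k = r_k * S_k
    return [r_k * S_k for r_k, S_k in zip(r, S)]

def G_inv(S, grid):
    # Stitch the regions back together in row-major order.
    rows = [np.hstack(S[a * grid:(a + 1) * grid]) for a in range(grid)]
    return np.vstack(rows)

rng = np.random.default_rng(0)
x = rng.random((8, 8)) + 0.1               # toy image with positive pixels
S, r = G(x, 4), H(x, 4)
x_rec = G_inv(R(r, S), 4)                  # original r reconstructs x

r_masked = r.copy()
r_masked[0] = 0.0                          # r_k = 0 masks region k
x_masked = G_inv(R(r_masked, S), 4)
```

Setting a component of `r` to zero masks the corresponding patch, while values above the original pixel sum amplify it, which is how samples of `r` translate into image samples.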

3.1. VALIDATING OUR PIPELINE

We validate our pipeline in two ways. First, we demonstrate its ability to accurately rank the importance of regions of different sizes. Second, we demonstrate that Sobol-SBSA accurately reproduces two trends in how CNNs use regions that were determined indirectly by two separate papers.
• Brendel & Bethge (2019) showed that more modern CNNs use higher levels of interactions between equally sized region patches than older models.
• Singla et al. (2022) showed that robust versions of CNN models utilize "spurious" areas of an image, as determined by machine annotation, less than normal versions of these CNNs.
Having established trust in Sobol-SBSA, we demonstrate how it can be used to further explore how CNNs use regions, without needing to use a different pipeline. First, we demonstrate the impact of semantic representation on the results of Brendel & Bethge (2019) by measuring the interaction between segmented objects and their backgrounds (ImageNet-S). We show that the trend of more modern CNNs utilizing interactions more than older models is only clear when viewed with respect to regions that do not strictly respect object boundaries. Second, we demonstrate that, regardless of whether or not the segmentation respects boundaries, a result of the development of CNN architectures was greater exploitation of background information when making decisions. Finally, we demonstrate that robust models reduce the extent to which models use the interaction of regions that do not respect boundaries. To the authors' knowledge, this is the first time that all of these results have been demonstrated through direct measurement and in a single pipeline. We note that all ImageNet models were pre-trained PyTorch models using the ImageNet1K V1 weights.

3.2. VALIDATING SOBOL-SBSA FOR IMPORTANCE RANKING AND SEMANTIC UNDERSTANDING

Our SBSA framework can be used with a variety of black-box methods. Thus, we compare the following two methods against Sobol-SBSA.
• Shapley (Shap-SBSA) (Kononenko et al., 2013): Used with SBSA in a similar manner as Sobol-SBSA.
• LIME (LIME-SBSA) (Ribeiro et al., 2016): Used with SBSA in a similar manner as Sobol-SBSA.
We also compare against the following:
• Integrated Gradients (Sundararajan et al., 2017): Cannot be used with SBSA since it requires gradients with respect to pixels. Instead, we aggregate pixel importance in regions and rank region importance based on this aggregation.
• Random: A control baseline where regions are randomly selected.
• Ideal: The ideal result based on our metric. This is not experimental.
We use two standard metrics for comparing interpretability methods. First, we measure the change in the ground truth label score of a model when regions, as specified by the sensitivity method, are masked. We mask the top and bottom 20% of regions in an image. Second, we measure sensitivity-n correlation. Sensitivity-n correlation is a quantity proposed by Ancona et al. (2017) for comparing the effectiveness of attribution methods when determining which features are important to a model. Quantitatively, it is the Pearson correlation between $\sum_{i=1}^{n} R_i^c$ and $S_c(x) - S_c(x_{[x_S = 0]})$. Here, $\sum_{i=1}^{n} R_i^c$ is the sum of the attributions associated with the $n$ input features that are masked, and $S_c(x) - S_c(x_{[x_S = 0]})$ is the difference in the score that the model produces when $n$ input features are masked versus when none are. For each value of $n$, $n$ random features are selected to be masked 100 times, and the correlation is averaged over the examined data. A value closer to one means a more effective method. For all methods, we evaluate the importance of regions to ResNet50 on 1000 randomly selected ImageNet validation images. Figures 2 and 3 show the results of the masking and sensitivity-n correlations, respectively, for all evenly sized region patches.
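The sensitivity-n correlation described above can be sketched for a single input as follows. This is a simplified illustration, not the evaluation code used for the figures: the function name `sensitivity_n`, the zero-valued mask, and the toy linear score are assumptions, and each entry of `x` stands in for one region.

```python
import numpy as np

def sensitivity_n(score, x, attributions, n, n_subsets=100, seed=0):
    """Pearson correlation between summed attributions and score drops
    over random subsets of n masked features (single-input version)."""
    rng = np.random.default_rng(seed)
    base = score(x)
    attr_sums, drops = [], []
    for _ in range(n_subsets):
        idx = rng.choice(x.size, size=n, replace=False)
        x_masked = x.copy()
        x_masked[idx] = 0.0                        # mask the chosen features
        attr_sums.append(attributions[idx].sum())  # sum of their attributions
        drops.append(base - score(x_masked))       # resulting score drop
    return np.corrcoef(attr_sums, drops)[0, 1]

# Hypothetical linear score, for which w * x is an exact attribution.
rng = np.random.default_rng(1)
w, x = rng.normal(size=10), rng.random(10) + 0.5
corr = sensitivity_n(lambda v: float(w @ v), x, w * x, n=3)
```

For the linear toy score, masking a subset lowers the score by exactly the sum of its attributions, so the correlation is one up to floating-point error. In the experiments this quantity is additionally averaged over the evaluated images.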
Figure 2 plots the change in the model score when the most important regions are masked versus when the least important regions are masked. A steeper curve is closer to the ideal, since it means that masking the regions ranked most important significantly decreases the model score, while masking the regions ranked least important has little impact. Sobol-SBSA is the most effective method by both the masking and sensitivity-n measures. The key takeaways from the plots are that 1) $S_{Ti}$ performs the best overall in picking the most and least important regions in an image, as well as in more accurately ranking the regions in between (as measured by sensitivity-n), and 2) $S_{Ti}$ is the most robust to changes in region size, followed closely by Integrated Gradients. The gap in performance between $S_{Ti}$ and $S_i$ shows the importance of accounting for region interaction when ranking importance. We now use Sobol-SBSA to quantify how the importance of interactions between evenly sized regions to CNN model outputs evolved over time, and test whether or not this matches the trend found by Brendel & Bethge (2019). Figure 4a plots the average PIR for 10000 validation images when run on 4 × 4, 8 × 8, and 14 × 14 regions. A lower PIR means greater region interaction importance. We see that, for all of these region sizes, PIR decreases (that is, interaction importance increases) from AlexNet and VGG16 to the more modern InceptionV3, ResNet50, and DenseNet161 architectures, matching the trend reported by Brendel & Bethge (2019). Because PyTorch's pre-trained InceptionV3 takes as inputs images of size 299 × 299 rather than 224 × 224, the region grids used for InceptionV3 are 5 × 5, 11 × 11, and 18 × 18. Finally, we apply Sobol-SBSA to 10000 Salient ImageNet images whose regions have been labeled as "core" regions that should be important to model predictions, and "spurious" regions that should not (Singla & Feizi, 2021).
Figure 4b plots the average difference between the total index score of core and spurious regions for the normal and adversarially robust versions of VGG16 BN, ResNet50, and DenseNet161. We use pre-trained robust weights from Salman et al. (2020), where the $l_2$ threat radius was 3. For Salient ImageNet, the difference in core and spurious importance is greater for robust models than for their normal counterparts. This complements the finding of Singla et al. (2022) that robust CNN models relied less on spurious areas of images than their normal counterparts.

Figure 4: PIR trends and the difference in importance between segmented objects and background for CNN models and their robust counterparts. (a) Plots of the average PIR for 10000 images for a series of CNN models when applied to images that are partitioned based on segmentations from ImageNet-S and Salient ImageNet, as well as when applied to the top 20% of region patches. (b) Plots of the average difference in importance, $S_T$, for segmented objects in ImageNet-S and Salient ImageNet when applied to normal and robust models. 10000 images from each dataset were used.

3.3. BEYOND VALIDATION: THE IMPACT OF SEGMENTATION ON REGION IMPORTANCE AND INTERACTION TRENDS

In the previous subsection we validated Sobol-SBSA's ability to 1) correctly rank the importance of regions regardless of size and 2) accurately capture trends in how CNNs use regions that were determined by Singla et al. (2022) and Brendel & Bethge (2019). We now explore the impact of segmentation type on the trends in region interaction and on the differences between robust and normal models. We also measure how the use of background information evolved with models. Figure 4a plots the PIR for Sobol-SBSA applied to foreground and background regions as determined by Salient ImageNet and ImageNet-S. For foreground and background areas of an image as determined by Salient ImageNet, interactions became more important with more modern CNNs. For ImageNet-S regions, however, these interactions only increased for DenseNet161 and InceptionV3. Recalling that ImageNet-S regions are segments that strictly respect object boundaries, we conclude that while modern CNNs use greater interactions between regions that do not respect object boundaries, this is not consistently the case for those that do. InceptionV3 was not used with Salient ImageNet since the masks provided were 224 × 224, which is incompatible with the expected 299 × 299 input to InceptionV3. Figure 4b shows the average difference in the total index score, $S_{Ti}$, between foreground and background areas of images as determined by ImageNet-S and Salient ImageNet. This is done for normal and robust models. We see that the trend of robust models reducing the extent to which CNN architectures use background information holds, regardless of whether or not the foreground strictly respects object boundaries. Examples of this are shown in Figure 6. Figure 5a plots the average PIR between foreground and background objects as determined by ImageNet-S and Salient ImageNet for the robust and normal versions of CNNs. For areas defined by Salient ImageNet, robust models decreased the reliance of architectures on interactions, but for objects and background determined by ImageNet-S this was only clearly seen for DenseNet161. The key takeaway here is that robust models generally decrease models' reliance on interactions between regions that do not respect boundaries, but do not necessarily do so for those that do. Finally, Figure 5b plots the average difference in importance between foreground and background objects when Sobol-SBSA is applied to ImageNet-S and Salient ImageNet. These results are plotted for AlexNet to DenseNet161. We see that one of the effects of architectural changes in CNN development was to increase the extent to which these models use background information.

Figure 5: PIR trends and the difference in importance between segmented objects and background for CNN models.

3.3.1. THE LIMIT OF SOBOL-SBSA

While Sobol-SBSA is a powerful method, it shares a weakness with other sample-based black-box methods: it assumes that the input features that it is analyzing are uncorrelated. When this is not the case, the ability of Sobol to decompose importance into independent subsets of the feature space is weakened (Li et al., 2010). This manifests itself in the Sobol indices, which are supposed to satisfy the property that PIR ≤ 1, having at least some input features for which PIR > 1. The assumption that regions are uncorrelated breaks down as regions become smaller in size. We observe this in two ways. First, we examine Figure 4a and note that PIR increases as the region sizes decrease. This is intuitively an incorrect result, since interactive effects should be greater for smaller regions, as demonstrated by Brendel & Bethge (2019). Second, to confirm that this result is due to a breakdown in the Sobol assumptions, we calculate the average percentage of regions for which PIR > 1 across all models for a given region size. For 4 × 4, 8 × 8, and 14 × 14 regions the percentages are 22%, 33%, and 39% respectively. Figure 8 shows an example of how Sobol-SBSA can break down with smaller region sizes. A key direction for future work is to implement a version of Sobol-SBSA that accounts for correlations in the data.
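The diagnostic used above, the percentage of regions whose PIR exceeds one, is simple to compute once $S_i$ and $S_{Ti}$ are available. The sketch below assumes the indices are given as arrays; the function name and the example values are hypothetical and are not results from the paper.

```python
import numpy as np

def percent_pir_above_one(S, ST):
    """Percentage of features whose PIR = S_i / S_Ti exceeds 1."""
    pir = np.asarray(S, dtype=float) / np.asarray(ST, dtype=float)
    return 100.0 * np.mean(pir > 1.0)

# Hypothetical indices for four regions; only the second has S_i > S_Ti.
pct = percent_pir_above_one([0.2, 0.5, 0.3, 0.1], [0.4, 0.4, 0.3, 0.2])
```

Under the independence assumption $S_i \le S_{Ti}$ always holds, so a non-zero percentage signals either correlated features or estimation noise from a small sample size.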

4. RELATED WORK

In recent years, interest has grown in interpretability methods that can quantify the importance of feature interactions. Janizek et al. (2020) importance, rather than interactions. None of the above works addressed understanding how models used semantic representations, or facilitated quantitatively answering high level questions about models.

5. DISCUSSION AND CONCLUSION

We proposed Sobol-SBSA, a general method for understanding how Deep Learning Models use semantic features when making decisions. We demonstrated the ability of this method to answer in one pipeline a rich set of questions about model behavior by using it to study how CNN models use areas of images during classification. We found 1) that the primary impact of the evolution of CNN models was to make greater use of region interactions and to increase the importance of image background to model predictions and 2) that adversarially robust CNN models are less susceptible to spurious correlations in the data because they force CNN architectures to rely less on region interactions and on image backgrounds. The Sobol-SBSA method has a variety of potential applications beyond the image/region-based analysis that we presented here. Many different types of input partitioning and input modalities can be analyzed using the SBSA method. For images, the partitioning into image regions can be done via external metadata, such as object detection results. For natural language processing (NLP), the use of SBSA is even more straightforward since words/tokens provide a natural way to partition the input sequence. Hence, our model can be utilized to understand how parts of speech are utilized for different iterations of various NLP models, including translation, classification, and generative text models. Beyond unimodal content, SBSA can also be easily extended to multimodal setups. For example, Sobol-SBSA can be used to measure the strength of bias in Image Captioning systems by quantifying the extent to which parts of an image correspond to which parts of the generated caption. In addition to these applications, there are multiple avenues for future work. 
These include exploring the combination of Sobol-SBSA with automatic feature detectors, implementing a version of Sobol-SBSA that accounts for input correlations, and exploring whether PIR can be used as a proxy for the robustness of non-CNN deep learning models. Through SBSA and Sobol-SBSA we have proposed a strong foundation for obtaining a richer understanding of how models use semantic representations of inputs, regardless of whether these representations are generated automatically or by end users. We hope that it can be used both to provide clarity about the mechanisms by which deep learning models make decisions and to influence how we further develop these models.



Figure 1: Sobol-SBSA pipeline for regions and segments (ImageNet-S). 1) Image → Semantic → Vector extracts semantic features from the image and maps these onto real numbers. 2) Vector → Vector Sample produces samples from the vector. 3) Vector Sample → Semantic Sample → Image Sample produces samples of the image that mask and amplify semantic features based on the vector samples. 4) Sample Output → Sobol Method sends outputs from the model evaluated on the image samples to the Sobol method for analysis.

Figure 2: Plots of the change in the ground truth score of ResNet50 when the top 20% of regions are masked vs the bottom 20% of regions. A steep slope is indicative of a more effective method, since it means that the change in ground truth score is significant when the top 20% of regions are masked, but not when the bottom 20% of regions are masked. Our measure, $S_{Ti}$, is closest to the ideal.

Figure 3: Plots of the sensitivity-n correlation for different sensitivity methods as a function of the percentage of regions masked. The sensitivity methods with values closer to one are more effective. Our measure, $S_{Ti}$, is closest to the ideal.

(a) Plots of the average PIR when Sobol-SBSA is applied to segmented and background objects in Salient ImageNet and ImageNet-S for normal and robust models. 10000 images were used from each dataset.
(b) Plots of the average difference in importance, $S_T$ and $S_i$, for segmented objects in ImageNet-S and Salient ImageNet as a function of model. 10000 images from each dataset were used.

Figure 6: A sample result of Sobol-SBSA when applied to segmentations determined by the ImageNet-S dataset for the normal and robust versions of ResNet50. Robust ResNet50 focuses only on the segmented objects, while ResNet50 also uses the background.

To explore the importance of image size, we performed our experiments for 4 × 4, 8 × 8, and 16 × 16 regions. The number of samples is selected as a function of the number of regions, as detailed in section 2.2. For 4 × 4, 8 × 8, and 16 × 16 regions, d = 16, d = 64, and d = 256 respectively.
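As a worked illustration of how the sample count grows with the number of regions, the $N(d + 2)$ rule from section 2.2 with $N = 50$ gives the following model evaluations per image. This is plain arithmetic on the values stated in the text, not a reported measurement.

```latex
N(d+2)\,\big|_{N=50}:\qquad
d = 16 \;\Rightarrow\; 50 \times 18 = 900, \qquad
d = 64 \;\Rightarrow\; 50 \times 66 = 3300, \qquad
d = 256 \;\Rightarrow\; 50 \times 258 = 12900
```

The cost is linear in the number of semantic features, which is why sampling over regions is far cheaper than sampling over individual pixels.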

A APPENDIX

A.1 10K VS 50K EXPERIMENTS

Figure 7: Average PIR for the top 20% of regions when calculated for 10000 and 50000 ImageNet validation images. This is done for 4 × 4 and 8 × 8 regions. The results are identical.

We calculated the average PIR for the top 20% of regions for 10000 and 50000 validation images in ImageNet using 4 × 4 and 8 × 8 regions. We saw that the results were identical, so, for the smaller regions generated when splitting an image into 14 × 14 regions, we calculated the average PIR for 10000 images to save computational cost.

A.2 IMPACT OF REGION SIZE

Figure 8: Images of $S_i$ and $S_{Ti}$ for an Ostrich class example for different grid sizes. We see that 4 × 4 and 8 × 8 regions are consistent, but that this is not the case for 14 × 14 regions.

Figure 8 shows an example of Sobol-SBSA when applied to an Ostrich target class for different region sizes. We see that all of the examples are consistent except for the Robust ResNet50 14 × 14 example. One of the weaknesses of Sobol is that the results can be corrupted when the input features are correlated. As regions get smaller, this is exactly what occurs. Although, overall, Sobol-SBSA was still able to correctly identify the most and least important regions at the 14 × 14 scale (Figure 2), more work must be done to address this weakness so that the Sobol results can be compared accurately across different sizes of regions or, more generally, across features that have different types of correlations. Future work will involve exploring the impact of sample size, and of variations of Sobol that account for input correlation, on this issue.

