QUANTIFYING STATISTICAL SIGNIFICANCE OF NEURAL NETWORK REPRESENTATION-DRIVEN HYPOTHESES BY SELECTIVE INFERENCE

Anonymous

Abstract

In the past few years, various approaches have been developed to explain and interpret deep neural network (DNN) representations, but it has been pointed out that these representations are sometimes unstable and not reproducible. In this paper, we interpret these representations as hypotheses driven by a DNN (called DNN-driven hypotheses) and propose a method to quantify their reliability within a statistical hypothesis testing framework. To this end, we introduce the Selective Inference (SI) framework, which has received considerable attention in recent years as a new statistical inference framework for data-driven hypotheses. The basic idea of SI is to make inferences on selected hypotheses conditional on the event that they were selected. To apply the SI framework to DNN representations, we develop a new SI algorithm based on the homotopy method, which enables us to derive the exact (non-asymptotic) conditional sampling distribution of DNN-driven hypotheses. We demonstrate the proposed method on computer vision tasks as practical examples. Through experiments on both synthetic and real-world datasets, we offer evidence that the proposed method successfully controls the false positive rate, performs well in terms of computational efficiency, and yields good results in practical applications.

1. INTRODUCTION

The remarkable predictive performance of deep neural networks (DNNs) stems from their ability to learn appropriate representations from data. To understand the decision-making process of DNNs, it is thus important to be able to explain and interpret DNN representations. For example, in image classification tasks, knowing the attention region in a DNN representation allows us to understand the reason for the classification. In the past few years, several methods have been developed to explain and interpret DNN representations (Ribeiro et al., 2016; Bach et al., 2015; Doshi-Velez & Kim, 2017; Lundberg & Lee, 2017; Zhou et al., 2016; Selvaraju et al., 2017); however, some of them have turned out to be unstable and not reproducible (Kindermans et al., 2017; Ghorbani et al., 2019; Melis & Jaakkola, 2018; Zhang et al., 2020; Dombrowski et al., 2019; Heo et al., 2019). It is therefore crucially important to develop a method to quantify the reliability of DNN representations.

In this paper, we interpret these representations as hypotheses that are driven by a DNN (called DNN-driven hypotheses) and employ a statistical hypothesis testing framework to quantify their reliability. For example, in an image classification task, the reliability of an attention region can be quantified based on the statistical significance of the difference between the attention region and the rest of the image. Unfortunately, however, traditional statistical tests cannot be applied to this problem because the hypothesis (the attention region in the above example) is itself selected by the data. A traditional statistical test is valid only when the hypothesis is non-random. Roughly speaking, if a hypothesis is selected by the data, the hypothesis will over-fit to the data, and this bias must be corrected when assessing the reliability of the hypothesis.
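To see why naively testing a data-selected hypothesis is invalid, consider the following toy simulation (an illustration of the selection-bias phenomenon, not the method proposed in this paper). Data are generated under the global null, a data-driven "attention region" is chosen as the coordinates with positive values, and a standard two-sample z-test (known unit variance) is applied to the difference between the region and the rest. The test rejects far more often than the nominal 5% level, whereas the same test on a region fixed in advance controls the level.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 1000
z_crit = 1.96  # two-sided 5% critical value for N(0, 1)

def z_stat(y, region):
    """Two-sample z statistic (known unit variance) between region and its complement."""
    a, b = y[region], y[~region]
    return (a.mean() - b.mean()) / np.sqrt(1 / a.size + 1 / b.size)

naive_rej = fixed_rej = 0
for _ in range(trials):
    y = rng.standard_normal(n)           # global null: no real signal anywhere
    data_driven = y > 0                  # "attention region" chosen by looking at the data
    fixed = np.arange(n) < n // 2        # region fixed before seeing the data
    if 0 < data_driven.sum() < n:        # guard against degenerate selections
        naive_rej += abs(z_stat(y, data_driven)) > z_crit
    fixed_rej += abs(z_stat(y, fixed)) > z_crit

print(f"False positive rate, data-driven region: {naive_rej / trials:.3f}")  # far above 0.05
print(f"False positive rate, fixed region:       {fixed_rej / trials:.3f}")  # close to 0.05
```

The selected region over-fits the noise: by construction its mean exceeds that of its complement, so the naive test rejects almost every time even though no signal exists. SI corrects this by deriving the sampling distribution of the test statistic conditional on the selection event.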
Our main contribution in this paper is to introduce a Selective Inference (SI) approach for testing the reliability of DNN representations. The basic idea of SI is to perform statistical inference under the condition that the hypothesis is selected. The SI approach has been demonstrated to be effective

