CORTX: CONTRASTIVE FRAMEWORK FOR REAL-TIME EXPLANATION

Abstract

Recent advances in explainable machine learning provide effective and faithful solutions for interpreting model behaviors. However, many explanation methods encounter efficiency issues, which largely limit their deployment in practical scenarios. Real-time explainer (RTX) frameworks have thus been proposed to accelerate the model explanation process by learning a one-feed-forward explainer. Existing RTX frameworks typically build the explainer under the supervised learning paradigm, which requires large amounts of explanation labels as the ground truth. Considering that accurate explanation labels are usually hard to obtain due to constrained computational resources and limited human effort, effective explainer training remains challenging in practice. In this work, we propose a COntrastive Real-Time eXplanation (CoRTX) framework to learn explanation-oriented representations and relieve the intensive dependence of explainer training on explanation labels. Specifically, we design a synthetic strategy to select positive and negative instances for explanation learning. Theoretical analysis shows that our selection strategy can benefit the contrastive learning process on explanation tasks. Experimental results on three real-world datasets further demonstrate the efficiency and efficacy of our proposed CoRTX framework.

1. INTRODUCTION

The remarkable progress in explainable machine learning (ML) has significantly improved model transparency to human beings (Du et al., 2019). However, applying explainable ML techniques to real-time scenarios remains a challenging task. Real-time systems typically require model explanation to be not only effective but also efficient (Stankovic et al., 1992). Driven by the requirements of both stakeholders and social regulations (Goodman & Flaxman, 2017; Floridi, 2019), efficient model explanation is necessary for real-time ML systems, such as control systems (Steel & Angwin, 2010), online recommender systems (Yang et al., 2018), and healthcare monitoring systems (Gao et al., 2017). Nevertheless, existing non-amortized explanation methods, including LIME (Ribeiro et al., 2016) and KernelSHAP (Lundberg & Lee, 2017), suffer from high explanation latency. These methods rely on either multiple perturbations or backpropagation through deep neural networks (DNNs) to derive explanations (Covert & Lee, 2021; Liu et al., 2021), which is time-consuming and limits their deployment in real-time scenarios. Real-time explainer (RTX) frameworks have thus been proposed to address such efficiency issues and provide effective explanations for real-time systems (Dabkowski & Gal, 2017; Jethani et al., 2021b). Specifically, RTX learns an overall explainer on the training set using ground-truth explanation labels obtained through either exact calculation or approximation, and then provides the explanation for each local instance via a single feed-forward process. Existing efforts on RTX can be categorized into two lines of work. The first line (Schwab & Karlen, 2019; Jethani et al., 2021b; Covert et al., 2022) explicitly learns an explainer to minimize the estimation error with respect to the approximated explanation labels.
The second line (Dabkowski & Gal, 2017; Chen et al., 2018; Kanehira & Harada, 2019) trains a feature mask generator subject to certain constraints on a pre-defined label distribution.

* These authors contributed equally to this work.
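To make the first line of work concrete, the following is a minimal sketch of a supervised amortized explainer: a small network is trained to regress precomputed (approximate) attribution labels, after which a single feed-forward pass yields the explanation for any instance. All names here (ExplainerMLP, the toy data, the stand-in labels) are illustrative assumptions, not the implementation of any cited method.

```python
import torch
import torch.nn as nn

class ExplainerMLP(nn.Module):
    """Maps an input instance to a per-feature attribution vector
    in a single forward pass (amortized explanation)."""
    def __init__(self, num_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_features),  # one score per feature
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

torch.manual_seed(0)
# Toy instances and their precomputed explanation labels; in practice the
# labels would come from an approximate attribution method (e.g. sampling).
X = torch.randn(256, 10)
Y = X * 0.5  # stand-in for approximate attribution labels

explainer = ExplainerMLP(num_features=10)
opt = torch.optim.Adam(explainer.parameters(), lr=1e-2)

# Supervised training: minimize estimation error w.r.t. the approximate labels.
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(explainer(X), Y)
    loss.backward()
    opt.step()

# At inference time, one feed-forward pass produces the explanation,
# avoiding the per-instance perturbation loop of non-amortized methods.
attribution = explainer(X[:1])
print(attribution.shape)
```

The efficiency gain is structural: the expensive label computation happens once at training time, while deployment cost is a single forward pass per instance.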

Availability: https://github.

