WHY DID THIS MODEL FORECAST THIS FUTURE? INFORMATION-THEORETIC TEMPORAL SALIENCY FOR COUNTERFACTUAL EXPLANATIONS OF PROBABILISTIC FORECASTS

Abstract

Probabilistic forecasting of multivariate time series is important in several research domains where multiple futures exist for a single observed sequence. Identifying the observations on which a well-performing model bases its forecasts can enable domain experts to form data-driven hypotheses about the causal relationships between features. Consequently, we begin by revisiting the question: what constitutes a causal explanation? One hurdle in the landscape of explainable artificial intelligence is that what constitutes an explanation is not well-grounded. We build upon Miller's framework of explanations derived from research in multiple social science disciplines, and establish a conceptual link between counterfactual reasoning and saliency-based explanation techniques. However, a complication is the lack of a consistent and principled notion of saliency. Moreover, commonly derived saliency maps may be inconsistent with the data generation process and the underlying model. We therefore leverage a unifying definition of information-theoretic saliency grounded in preattentive human visual cognition and extend it to forecasting settings. In contrast to existing methods that require either explicit training of the saliency mechanism or access to the internal parameters of the underlying model, we obtain a closed-form solution for the resulting saliency map for commonly used density functions in probabilistic forecasting. To empirically evaluate our explainability framework in a principled manner, we construct a synthetic dataset of conversation dynamics and demonstrate that our method recovers the true salient timesteps for a forecast given a well-performing underlying model.

1. INTRODUCTION

The existence of multiple valid futures for a given observed sequence is a crucial attribute of several forecasting tasks, especially surrounding the dynamics of low-level human behavior. These tasks include the forecasting of trajectories of pedestrians (Huang et al., 2019; Mohamed et al., 2020; Rudenko et al., 2020; Salzmann et al., 2021; Zhang et al., 2019), vehicles (Carrasco et al., 2021; Gilles et al., 2022; Zeng et al., 2020; Zhao et al., 2020), and autonomous robots (Ivanovic et al., 2021; Vemula et al., 2017), or other more general nonverbal cues of humans (Adeli et al., 2020; Barquero et al., 2022; Nguyen & Celiktutan, 2022; Raman et al., 2021; Yao et al., 2018) and artificial virtual agents (Ahuja et al., 2019) in group conversation settings. Consequently, rather than making single (i.e. point) predictions, several machine learning methods in these settings have attempted to forecast a distribution over plausible futures (Mohamed et al., 2020; Raman et al., 2021). In this work, we introduce and address a novel research question towards gaining domain-relevant insights into such forecasts: given a reliable underlying model, how can we identify the observed timesteps that are salient for the model's probabilistic forecasts over a particular future window?
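As a concrete (hypothetical) illustration of this question, one simple way to probe temporal saliency is to occlude each observed timestep in turn and measure how much the predictive distribution shifts, e.g. via a KL divergence between the original and perturbed forecasts. The toy recency-weighted Gaussian forecaster below is an assumption purely for illustration, not the method developed in this paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def forecast(obs):
    # Toy probabilistic forecaster: predicts a Gaussian over the next step
    # whose mean depends mostly on the most recent observed timesteps.
    weights = np.exp(np.arange(len(obs)) - len(obs))  # recency-weighted
    mu = float(weights @ obs)
    sigma = 1.0
    return mu, sigma

def kl_gaussian(mu0, s0, mu1, s1):
    # KL( N(mu0, s0^2) || N(mu1, s1^2) ) in closed form
    return np.log(s1 / s0) + (s0**2 + (mu0 - mu1) ** 2) / (2 * s1**2) - 0.5

obs = rng.normal(size=10)          # one observed window of 10 timesteps
mu, sigma = forecast(obs)

# Occlusion saliency: how much does the predictive density shift when
# timestep t is replaced by a baseline value (here, the window mean)?
baseline = obs.mean()
saliency = []
for t in range(len(obs)):
    perturbed = obs.copy()
    perturbed[t] = baseline
    mu_p, s_p = forecast(perturbed)
    saliency.append(kl_gaussian(mu, sigma, mu_p, s_p))

print(int(np.argmax(saliency)))    # index of the most salient timestep
```

For this recency-weighted toy model, occluding recent timesteps tends to shift the predictive density the most, matching the intuition that those observations drive the forecast.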

1.1. NOTIONS OF INTERPRETABILITY IN FORECASTING TASKS & DRAWBACKS

Recently, several works have proposed techniques for making interpretable non-probabilistic predictions for point forecasting tasks (Lim et al., 2020; Oreshkin et al., 2020; Pan et al., 2021). However, what constitutes a good explanation is often subject to the biases, intuition, or visual assessment of the human observer (Adebayo et al., 2018; Miller, 2019); a phenomenon we refer to as the interpretation being in the eye of the beholder. This is especially true when saliency maps have been used as tools for post-hoc explanations: the computed map may not measure the intended saliency, and may even be independent of both the model and the data generating process (Adebayo et al., 2018; Atrey et al., 2020; Lapuschkin et al., 2019). Research on saliency-based methods is further confounded by the lack of a common notion of saliency. As Barredo Arrieta et al. (2020, Sec. 5.3) point out, "there is absolutely no consistency behind what is known as saliency maps, salient masks, heatmaps, neuron activations, attribution, and other approaches alike."
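The model-randomization sanity check of Adebayo et al. (2018) illustrates how a saliency map can fail to depend on the model: compute a map from a trained model, recompute it after randomizing the model's parameters, and compare the two; high similarity indicates the map is insensitive to what the model learned. A minimal sketch with a toy linear forecaster and gradient-times-input attribution (all names and the rank-correlation comparison are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def saliency(weights, x):
    # Gradient x input attribution for a linear point forecaster y = w @ x:
    # the gradient of y w.r.t. x is simply w.
    return np.abs(weights * x)

def rank_correlation(a, b):
    # Spearman rank correlation (no ties): Pearson correlation of the ranks.
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

x = rng.normal(size=20)                  # one observed window of 20 timesteps
w_trained = np.linspace(0.0, 1.0, 20)    # stand-in for learned weights
w_random = rng.normal(size=20)           # "randomized" model parameters

s_trained = saliency(w_trained, x)
s_random = saliency(w_random, x)

# A faithful saliency method should NOT produce near-identical maps for the
# trained and randomized models; a high rank correlation here is a red flag.
print(round(rank_correlation(s_trained, s_random), 3))
```

Note that even this toy attribution shares the |x| factor between the two maps, which is exactly the kind of input dependence that can make a map look plausible while being largely independent of the model.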

1.2. SALIENCY-BASED EXPLANATIONS FOR COUNTERFACTUAL REASONING & DRAWBACKS

Due to the inconsistencies mentioned above, we argue that the premise of what constitutes an explanation needs to be well-grounded within existing frameworks of how people define, generate, and present explanations. Miller (2019) recently turned to the vast body of research on the topic in philosophy, psychology, and cognitive science and highlighted the importance of causality in explanation. Specifically, in Table 1 we reproduce the categorization of explanatory questions he proposed based on Pearl and Mackenzie's Ladder of Causation (Pearl & Mackenzie, 2018). Using an abstract notion of an 'event' that needs explaining, Miller argues that what-questions involve associative reasoning. How-questions require interventionist reasoning to determine the set of causes that, if removed, would prevent the event from happening. Why-questions are the most challenging, as they require counterfactual reasoning to undo events and simulate other events; this also entails associative and interventionist reasoning. To apply Miller's (2019) framework to forecasting, consider a model M that predicts features over a future window t_fut by observing features over a window t_obs. We argue that most existing interpretability approaches, including the two previously discussed categories involving the injection of inductive biases and attention-based mechanisms, are associative in nature. For a fixed t_fut and a single t_obs, they reason about the (unobserved) importance of features over t_obs using model parameters or attention coefficients based on one prediction from M (the 'event'). In contrast, we argue that perturbation-based saliency methods have the potential to support counterfactual reasoning. The saliency masks are commonly learned by perturbing different parts of the input over



¹ Barredo Arrieta et al. (2020) argue for the importance of distinguishing between interpretability and explainability as different concepts, the latter denoting any active action or procedure taken by a model with the intent of clarifying or detailing its internal functions. From this perspective, what the cited works term interpretability is closer to the notion of explainability.



Table 1: Classes of Explanatory Question and the Reasoning Required to Answer. Reproduced from Miller (2019, Table 3).

Question    Reasoning          Description
What?       Associative        Reasoning over the causes of the event
How?        Interventionist    Determining the causes that, if removed, would prevent the event
Why?        Counterfactual     Simulating alternative causes to see whether the event still happens

The general approach has been to train architectures to produce forecasts that are not only accurate but also interpretable. However, in what Lipton (2017) terms The Mythos of Interpretability, the notion of what renders these models interpretable is often not well-grounded and subject to presenting speculation in the guise of explanation (Lipton & Steinhardt, 2018). Examples of these operationalizations of interpretability¹ for forecasting tasks include: (i) injecting a suitable inductive bias into the model through a set of basis functions and identifying how they combine to produce an output (Oreshkin et al., 2020), an approach that has recently been applied to the probabilistic setting as well (Rügamer et al., 2022); (ii) employing a self-attention mechanism to learn temporal patterns while attending to a common set of features (Lim et al., 2020); and (iii) applying the notion of saliency maps from computer vision (Dabkowski & Gal, 2017) to time-series data as a measure of how much each feature contributes to the final forecast (Pan et al., 2021). A broader review of notions of interpretability across domains is in Appendix A. Irrespective of the notion of interpretability, these methodologies are underpinned by two common attributes: (a) the interpretability mechanism needs explicit training as part of the model architecture, and

