MEASURING AXIOMATIC SOUNDNESS OF COUNTERFACTUAL IMAGE MODELS

Abstract

We present a general framework for evaluating image counterfactuals. The power and flexibility of deep generative models make them valuable tools for learning mechanisms in structural causal models; however, this same flexibility makes counterfactual identifiability impossible in the general case. Motivated by these issues, we revisit Pearl's axiomatic definition of counterfactuals to determine the constraints that any counterfactual inference model must satisfy: composition, reversibility, and effectiveness. We frame counterfactuals as functions of an input variable, its parents, and counterfactual parents, and use the axiomatic constraints to restrict the set of functions that could represent the counterfactual, thus deriving distance metrics between the approximate and ideal functions. We demonstrate how these metrics can be used to compare and choose between different approximate counterfactual inference models and to provide insight into a model's shortcomings and trade-offs.

1. INTRODUCTION

Faithfully answering counterfactual queries is a key challenge in representation learning and a cornerstone for aligning machine intelligence with human reasoning. While significant advances have been made in causal representation learning, enabling approximate counterfactual inference, there is surprisingly little methodology available to assess, measure, and quantify the quality of these models. The structural causal model (SCM) is a mathematical tool for describing causal systems and offers a convenient computational framework for operationalising causal and counterfactual inference (Pearl, 2009). An SCM is a set of functional assignments (called mechanisms) that represent the relationship between a variable, its direct causes (called parents), and all other unaccounted sources of variation (called exogenous noise). In SCMs, we assume that the mechanisms are algorithmically independent of each other. Further, in Markovian SCMs, which can be represented by DAGs, we assume that the exogenous noise variables are statistically independent of each other (Peters et al., 2017). Henceforth, by SCM we mean a Markovian SCM. When the functional form of a mechanism is unknown, learning it from data is a prerequisite for answering counterfactual queries (Bareinboim et al., 2022). For high-dimensional observations such as images, the power and flexibility of deep generative models make them indispensable tools for learning the mechanisms of an SCM. However, this same flexibility makes model identifiability impossible in the general case (Locatello et al., 2020a; Khemakhem et al., 2020), which can cause entanglement of causal effects (Pawlowski, 2021) and lead to poor approximations of the causal quantities of interest. Nevertheless, even if the model or counterfactual query is unidentifiable, we can still measure the quality of the counterfactual approximation (Pearl, 2010). Evaluating image-counterfactual models is challenging, however, without access to observed counterfactuals, which are unavailable in most real-world scenarios. In this paper, we focus on what constraints a counterfactual inference model must satisfy and on how we can use them to measure a model's soundness without access to observed counterfactuals or to the SCM that generated the data. We begin by framing mechanisms as functional assignments that directly translate an observation into a counterfactual, given its parents and counterfactual parents. Next, we use Galles & Pearl's (1998) axiomatic definition of counterfactuals to restrict the space of possible functions that can represent a mechanism. From these constraints, we derive a set of metrics for measuring the soundness of any black-box counterfactual inference engine. Lastly, we show how simulated interventions can mitigate estimation issues due to confounding.

2. RELATED WORK

Representation learning aims to capture semantically meaningful, disentangled factors of variation in the data. Arguably, these representations can provide interpretability, reduced sample complexity, and improved generalisation (Bengio et al., 2013). From a causal perspective, these factors should represent the parents of a variable in the SCM (Schölkopf et al., 2021). Although there has been extensive research into unsupervised disentanglement (Higgins et al., 2017; Burgess et al., 2018; Kim & Mnih, 2018; Chen et al., 2018; Kumar et al., 2018; Peebles et al., 2020), recent results (Locatello et al., 2020a) reaffirm the impossibility of this task, since the true causal generative model is not identifiable by observing a variable in isolation (Peters et al., 2017). In contrast, supervised disentanglement, where we observe the variable's parents, and weakly supervised disentanglement, where we observe "real" counterfactuals, can lead to causally identifiable generative models (Locatello et al., 2020a). The integration of causal considerations has led to the emerging field of causal representation learning (Schölkopf et al., 2021).

In the supervised setting, extensive research has been conducted in adapting deep models for individualised treatment effect estimation (Louizos et al., 2017; Yoon et al., 2018; Shi et al., 2019; Madras et al., 2019; Jesson et al., 2020; Yang et al., 2021). Notably, Louizos et al. (2017) use deep latent variable models to estimate individualised and population-level treatment effects. Yang et al. (2021) use deep latent variable models to transform independent exogenous factors into endogenous causes that correspond to causally related concepts in the data.

In the weakly supervised setting, recent work has focused on using observations of "real" counterfactuals instead of a variable's parents to obtain disentangled representations (Hosoya, 2019; Bouchacourt et al., 2018; Shu et al., 2020; Locatello et al., 2020b). Besserve et al. (2020; 2021) show how relaxing identifiability constraints can lead to some degree of identifiability in unsupervised settings. In the context of image counterfactuals, Pawlowski et al. (2020) demonstrate how to jointly model all the functional assignments in an SCM using deep generative models. Despite presenting a general theory for any generative model, the authors implement only VAEs (Kingma & Welling, 2014; Rezende et al., 2014) and normalising flows (Papamakarios et al., 2021), which Dash et al. (2022) complement by using GANs (Goodfellow et al., 2014). Sanchez & Tsaftaris (2022) use diffusion models for counterfactual estimation. Van Looveren & Klaise (2021) use class prototypes to find interpretable counterfactual explanations. Sauer & Geiger (2021) use a deep network to disentangle object shape, object texture, and background in natural images. Parascandolo et al. (2018) retrieve a set of independent mechanisms from a set of transformed data points in an unsupervised manner using multiple competing models. Additionally, many image-to-image translation models can be considered informal counterfactual inference engines (Isola et al., 2017; Zhu et al., 2017; Liu et al., 2017; Choi et al., 2018; Hoffman et al., 2018; Li et al., 2021).

The flexibility of deep models makes them susceptible to learning shortcuts (Geirhos et al., 2020). Consequently, when the data is biased, the effects of the parents can become entangled (Pawlowski, 2021; Rissanen & Marttinen, 2021). These issues create identifiability problems even when accounting for causality in representation learning. Simulating interventions through data augmentation or resampling can be used to debias the data (Ilse et al., 2021; Kügelgen et al., 2021; An et al., 2021); a minimal sketch of this resampling idea is given at the end of this section. In a closely related field, research has focused on learning from biased data (Nam et al., 2020) or on becoming invariant to a protected or spurious attribute (Kim et al., 2019). Janzing & Schölkopf (2010) show how an algorithmic perspective allows causal inference with only one observation.
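To make the SCM terminology above concrete, the following is a minimal sketch of a two-variable Markovian SCM with a toy, invertible mechanism, showing how a counterfactual is obtained by abducting the exogenous noise and then acting on the parent; all names (`mechanism`, `abduct`, `counterfactual`) are illustrative and not taken from any cited work.

```python
import numpy as np

# Toy Markovian SCM: parent a ~ N(0, 1), exogenous noise u ~ N(0, 1),
# observation x := f(a, u). The mechanism is invertible in u so that
# abduction is exact; learned image mechanisms generally are not.
def mechanism(a, u):
    return 2.0 * a + u          # functional assignment x := f(pa(x), u)

def abduct(x, a):
    return x - 2.0 * a          # recover the exogenous noise from (x, a)

def counterfactual(x, a, a_cf):
    u = abduct(x, a)            # abduction
    return mechanism(a_cf, u)   # action on the parent, then prediction

rng = np.random.default_rng(0)
a, u = rng.normal(), rng.normal()
x = mechanism(a, u)
x_cf = counterfactual(x, a, a_cf=1.5)   # "what would x be had a been 1.5?"
```

Note that the last function can be read as a single map from an observation, its parents, and the counterfactual parents to the counterfactual observation, which is the framing adopted in this paper.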
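As a toy illustration of the resampling idea mentioned above, the sketch below reweights a dataset so that two confounded parent attributes become approximately statistically independent, simulating an intervention on their joint distribution; the attribute names (`digit`, `colour`) and the helper `resample_independent` are hypothetical and not drawn from the cited works.

```python
import numpy as np
import pandas as pd

def resample_independent(df, a="digit", b="colour", n=None, seed=0):
    """Importance-resample rows so that parents a and b become approximately
    independent, with weights proportional to p(a)p(b) / p(a, b)."""
    p_a = df[a].map(df[a].value_counts(normalize=True)).to_numpy()
    p_b = df[b].map(df[b].value_counts(normalize=True)).to_numpy()
    joint = (df.groupby([a, b]).size() / len(df)).to_dict()  # {(a, b): p(a, b)}
    p_ab = np.array([joint[key] for key in zip(df[a], df[b])])
    w = p_a * p_b / p_ab
    idx = np.random.default_rng(seed).choice(
        len(df), size=n or len(df), replace=True, p=w / w.sum())
    return df.iloc[idx].reset_index(drop=True)

# Confounded toy data: colour follows digit 90% of the time.
rng = np.random.default_rng(0)
digit = rng.integers(0, 2, size=10_000)
colour = np.where(rng.random(10_000) < 0.9, digit, 1 - digit)
df = pd.DataFrame({"digit": digit, "colour": colour})

balanced = resample_independent(df)
print(pd.crosstab(balanced["digit"], balanced["colour"], normalize=True))
```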

3. METHODS

Generating counterfactuals is commonly performed in three steps. First, we abduct the exogenous noise from the observation and its parents. Second, we act on some of the parents. Finally, we use a generative model to map the exogenous noise and the counterfactual parents back to the observation space. For deep models, true abduction is impossible in the general case, and identifiability issues are ubiquitous (Peters et al., 2017; Locatello et al., 2020a; Khemakhem et al., 2020): multiple models can be capable of generating the data, and the true causal model cannot be identified from data alone. We argue that viewing counterfactual inference engines as black boxes, focusing on what properties the model's output must obey rather than on the model's theoretical shortcomings, leads to a set of actionable and principled model constraints. While a full causal model of the data-generation process is necessary to create new samples from a joint distribution, in many applications it suffices to transform a given observation and its parents into the corresponding counterfactual.
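Below is a minimal sketch of how this black-box view could be operationalised as axiomatic soundness checks, assuming a counterfactual engine exposed as a callable `cf(x, pa, pa_cf)` and an anti-causal parent predictor `predict_parents(x)`; these names, and the use of a mean-squared-error distance, are illustrative assumptions rather than the paper's exact metric definitions.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def composition_score(cf, x, pa, d=mse):
    """Composition (null intervention): setting the parents to their
    observed values should leave the observation unchanged."""
    return d(x, cf(x, pa, pa))

def reversibility_score(cf, x, pa, pa_cf, d=mse):
    """Reversibility: mapping to the counterfactual and back should
    recover the original observation."""
    x_cf = cf(x, pa, pa_cf)
    return d(x, cf(x_cf, pa_cf, pa))

def effectiveness_score(cf, predict_parents, x, pa, pa_cf, d=mse):
    """Effectiveness: the counterfactual should actually realise the
    intervened-upon parent values, as judged by a parent predictor."""
    return d(pa_cf, predict_parents(cf(x, pa, pa_cf)))
```

Lower values indicate closer agreement with the corresponding axiom.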





