NEURAL CAUSAL MODELS FOR COUNTERFACTUAL IDENTIFICATION AND ESTIMATION

Abstract

Evaluating hypothetical statements about how the world would be had a different course of action been taken is arguably one key capability expected from modern AI systems. Counterfactual reasoning underpins discussions in fairness, the determination of blame and responsibility, credit assignment, and regret. In this paper, we study the evaluation of counterfactual statements through neural models. Specifically, we tackle two causal problems required to make such evaluations, i.e., counterfactual identification and estimation from an arbitrary combination of observational and experimental data. First, we show that neural causal models (NCMs) are expressive enough and encode the structural constraints necessary for performing counterfactual reasoning. Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions. We show that this algorithm is sound and complete for deciding counterfactual identification in general settings. Third, considering the practical implications of these results, we introduce a new strategy for modeling NCMs using generative adversarial networks. Simulations corroborate the proposed methodology.

1. INTRODUCTION

Counterfactual reasoning is one of humans' high-level cognitive capabilities, used across a wide range of affairs, including determining how objects interact, assigning responsibility, credit, and blame, and articulating explanations. Counterfactual statements underpin prototypical questions of the form "what if-" and "why-", which inquire about hypothetical worlds that have not necessarily been realized (Pearl & Mackenzie, 2018). If a patient Alice had taken a drug and died, one may wonder, "why did Alice die?"; "was it the drug that killed her?"; "would she be alive had she not taken the drug?". In the context of fairness, why did an applicant, Joe, not get the job offer? Would the outcome have changed had Joe had a Ph.D.? Or been of a different race? These are examples of fundamental questions about attribution and explanation, which evoke hypothetical scenarios that disagree with the current reality and which not even experimental studies can reconstruct. We build on the semantics of counterfactuals based on a generative process called a structural causal model (SCM) (Pearl, 2000). A fully instantiated SCM M* describes a collection of causal mechanisms and a distribution over exogenous conditions. Each M* induces families of qualitatively different distributions related to the activities of seeing (called observational), doing (interventional), and imagining (counterfactual), which together are known as the ladder of causation (Pearl & Mackenzie, 2018; Bareinboim et al., 2022), also called the Pearl Causal Hierarchy (PCH). The PCH is a containment hierarchy in which distributions can be placed in increasingly refined layers: observational content goes into layer 1 (L1), experimental into layer 2 (L2), and counterfactual into layer 3 (L3). It is understood that there are questions about layers 2 and 3 that cannot be answered (i.e.
are underdetermined), even given all the information in the world about layer 1; further, layer-3 questions are still underdetermined given data from layers 1 and 2 (Bareinboim et al., 2022; Ibeling & Icard, 2020). Counterfactuals represent the finest, most detailed type of knowledge encoded in the PCH, so naturally, the ability to evaluate counterfactual distributions is an attractive proposition. In practice, a fully specified model M* is almost never observable, which leads to the question: how can a counterfactual statement, from L3*, be evaluated using a combination of observational and experimental data (from L1* and L2*)? This question embodies the challenge of cross-layer inferences, which entail solving two challenging causal problems in tandem: identification and estimation.
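To make the three layers of the PCH concrete, the following sketch evaluates each layer on a toy binary SCM. The mechanisms f_x and f_y and the Bernoulli exogenous variable U are illustrative assumptions, not a model from this paper: layer 1 samples the mechanisms as-is, layer 2 replaces a mechanism with a constant (the do-operator), and layer 3 follows the standard abduction-action-prediction recipe for counterfactuals.

```python
import random

# Hypothetical toy SCM M*: a single exogenous U ~ Bernoulli(0.5)
# drives both treatment X and outcome Y (a confounded example).
def f_x(u):
    return u                       # mechanism for X

def f_y(x, u):
    return 1 if x == u else 0      # mechanism for Y

def sample_layer1(n=10000, seed=0):
    """Layer 1 (observational): run the mechanisms as nature does."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.randint(0, 1)
        x = f_x(u)
        out.append((x, f_y(x, u)))
    return out

def sample_layer2(x_do, n=10000, seed=0):
    """Layer 2 (interventional): replace f_x with the constant x_do."""
    rng = random.Random(seed)
    return [(x_do, f_y(x_do, rng.randint(0, 1))) for _ in range(n)]

def counterfactual_layer3(x_obs, y_obs, x_cf):
    """Layer 3 (counterfactual), via abduction-action-prediction.
    Abduction: keep the values of U consistent with the observed (X, Y)."""
    us = [u for u in (0, 1) if f_x(u) == x_obs and f_y(x_obs, u) == y_obs]
    # Action + prediction: re-evaluate Y under x_cf with the same U.
    return [f_y(x_cf, u) for u in us]
```

In this toy model the layers genuinely come apart: every observational sample has Y = 1, yet P(Y = 1 | do(X = 1)) is only 0.5, and the counterfactual "had X been 0, given that we saw X = 1 and Y = 1" evaluates to Y = 0, mirroring the drug example above.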

