IDENTIFYING TREATMENT EFFECTS UNDER UNOBSERVED CONFOUNDING BY CAUSAL REPRESENTATION LEARNING

Abstract

We study the estimation of treatment effects in the presence of unobserved confounding, an important problem in causal inference. Representing the confounder as a latent variable, we propose Counterfactual VAE, a new variant of the variational autoencoder, based on recent advances in the identifiability of representation learning. Combining this identifiability with classical identification results of causal inference, we theoretically show that, under mild assumptions on the generative model and with small noise on the outcome, the confounder is identifiable up to an affine transformation, and the treatment effects can then be identified. Experiments on synthetic and semi-synthetic datasets demonstrate that our method matches the state of the art, even under settings that violate our formal assumptions.

1. INTRODUCTION

Causal inference (Imbens & Rubin, 2015; Pearl, 2009), i.e., estimating the causal effects of interventions, is a fundamental problem across many domains. In this work, we focus on the estimation of treatment effects, e.g., the effects of public policies or a new drug, based on a set of observations consisting of binary labels for treatment / control (non-treated), outcomes, and other covariates. The fundamental difficulty of causal inference is that we never observe counterfactual outcomes, i.e., the outcomes that would have been observed had the other decision (treatment or control) been made. While the ideal protocol for causal inference is the randomized controlled trial (RCT), RCTs often raise ethical and practical issues, or are prohibitively expensive. Thus, causal inference from observational data is indispensable, though it introduces other challenges.

Perhaps the most crucial challenge is confounding: there might be variables (called confounders) that causally affect both the treatment and the outcome, producing spurious correlation. Most work in causal inference relies on the unconfoundedness assumption that appropriate covariates are collected, so that confounding can be controlled by conditioning on, or adjusting for, those variables. Even then, estimation remains challenging because the distributions of the covariates differ systematically between the treatment and control groups. One classical way of dealing with this difference is re-weighting (Horvitz & Thompson, 1952). There are semi-parametric methods with better finite-sample performance, e.g., TMLE (Van der Laan & Rose, 2011), and non-parametric, tree-based methods, e.g., Causal Forests (CF) (Wager & Athey, 2018). Notably, there has been a recent rise of interest in representation learning for causal inference, starting from Johansson et al. (2016). A few lines of work tackle the difficult but important problem of causal inference under unobserved confounding.
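As a concrete illustration of the classical re-weighting idea (not the method proposed in this paper), the following minimal sketch contrasts a naive difference in means with a Horvitz-Thompson inverse-probability-weighted estimate. The data-generating process, variable names, and the assumption that the true propensity score is known are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process: a single OBSERVED confounder x
# affects both treatment assignment and outcome.
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-x))                 # true propensity score P(T=1 | x)
t = rng.binomial(1, p)                       # treatment assignment
y = 2.0 * t + 3.0 * x + rng.normal(size=n)   # outcome; true ATE = 2

# Naive difference in means is biased, because x confounds t and y.
naive = y[t == 1].mean() - y[t == 0].mean()

# Horvitz-Thompson re-weighting: each unit is weighted by the inverse
# probability of the treatment it actually received, which balances the
# confounder distribution across the two groups.
ipw = np.mean(t * y / p) - np.mean((1 - t) * y / (1 - p))
```

With this simulation, `ipw` recovers the true average treatment effect of 2 up to sampling noise, while `naive` is substantially biased upward; in practice the propensity score is unknown and must itself be estimated from the covariates.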
Without covariates to adjust for, many of these works assume special structure among the variables, such as instrumental variables (IVs) (Angrist et al., 1996), proxy variables (Miao et al., 2018), network structure (Ogburn, 2018), or multiple causes (Wang & Blei, 2019). Among them, instrumental variables and proxy (or surrogate) variables are the most commonly exploited. Instrumental variables are not affected by unobserved confounders and influence the outcome only through the treatment. Proxy variables, on the other hand, are causally connected to unobserved confounders but do not themselves confound the treatment and outcome. Other methods use restrictive parametric models (Allman et al., 2009) or provide only interval estimates (Manski, 2009; Kallus et al., 2019).
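To make the instrumental-variable idea concrete, the following sketch simulates a hypothetical linear model with a homogeneous treatment effect, an unobserved confounder, and a binary instrument, and applies the classical Wald estimator; the data-generating process and names are illustrative assumptions, not part of this paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical linear model with an UNOBSERVED confounder u and a binary
# instrument z: z shifts the treatment t but affects y only through t.
u = rng.normal(size=n)                            # unobserved confounder
z = rng.binomial(1, 0.5, size=n)                  # instrument
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-(u + z))))  # treatment depends on u and z
y = 2.0 * t + 3.0 * u + rng.normal(size=n)        # true effect of t on y is 2

# Adjusting is impossible (u is unobserved), so the naive comparison is biased.
naive = y[t == 1].mean() - y[t == 0].mean()

# Wald / IV estimator: ratio of the reduced-form effect (z on y) to the
# first-stage effect (z on t). Since u is independent of z, the ratio
# isolates the effect flowing through the treatment.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = t[z == 1].mean() - t[z == 0].mean()
wald = reduced_form / first_stage
```

Under this simulation, `wald` is close to the true effect of 2 despite the hidden confounder, while `naive` is badly biased; with heterogeneous effects the same ratio would instead identify a local average treatment effect.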

