CAUSAL REASONING IN THE PRESENCE OF LATENT CONFOUNDERS VIA NEURAL ADMG LEARNING

Abstract

Latent confounding has been a long-standing obstacle for causal reasoning from observational data. One popular approach is to model the data using acyclic directed mixed graphs (ADMGs), which describe ancestral relations between variables using directed and bidirected edges. However, existing methods using ADMGs are based on either linear functional assumptions or a discrete search that is complicated to use and lacks computational tractability for large datasets. In this work, we extend the existing body of work and develop a novel gradient-based approach to learning an ADMG with non-linear functional relations from observational data. We first show that the presence of latent confounding is identifiable under the assumptions of bow-free ADMGs with non-linear additive noise models. With this insight, we propose a novel neural causal model based on autoregressive flows for ADMG learning. This not only enables us to determine complex causal structural relationships behind the data in the presence of latent confounding, but also to estimate their functional relationships (hence treatment effects) simultaneously. We further validate our approach via experiments on both synthetic and real-world datasets, and demonstrate competitive performance against relevant baselines.

1. INTRODUCTION

Learning causal relationships and estimating treatment effects from observational studies is a fundamental problem in causal machine learning, and has important applications in many areas of the social and natural sciences (Pearl, 2010; Spirtes, 2010). Such methods enable us to answer questions of a causal nature; for example, what is the effect on a patient's expected lifespan if the dose of drug X is increased? However, many existing methods of causal discovery and inference rely overwhelmingly on the assumption that all necessary information is available. This assumption is often untenable in practice. Indeed, an important, yet often overlooked, form of causal relationship is latent confounding; that is, when two variables have an unobserved common cause (Verma & Pearl, 1990). If not properly accounted for, latent confounding can lead to incorrect evaluation of causal quantities of interest (Pearl, 2009). Traditional causal discovery methods that account for latent confounding, such as the fast causal inference (FCI) algorithm (Spirtes et al., 2000) and its extensions (Colombo et al., 2012; Claassen et al., 2013; Chen et al., 2021), rely on uncovering an equivalence class of acyclic directed mixed graphs (ADMGs) that share the same conditional independencies. Without additional assumptions, however, these methods might return uninformative results, as they cannot distinguish between members of the same Markov equivalence class (Bellot & van der Schaar, 2021). More recently, causal discovery methods based on structural causal models (SCMs) (Pearl, 1998) have been developed for latent confounding (Nowzohour et al., 2017; Wang & Drton, 2020; Maeda & Shimizu, 2020; 2021; Bhattacharya et al., 2021). By assuming that the causal effects follow specific functional forms, they have the advantage of being able to distinguish between members of the same Markov equivalence class (Glymour et al., 2019).
Yet existing approaches rely on restrictive linear functional assumptions (Bhattacharya et al., 2021; Maeda & Shimizu, 2020; Bellot & van der Schaar, 2021) and/or on discrete search over the space of causal graphs (Maeda &

