CAUSAL DISCOVERY FROM CONDITIONALLY STATIONARY TIME SERIES

Abstract

Causal discovery, i.e., inferring underlying causal relationships from observational data, has been shown to be highly challenging for AI systems. In the context of time series modeling, traditional causal discovery methods mainly consider constrained scenarios with fully observed variables and/or data from stationary time series. We develop a causal discovery approach to handle a wide class of non-stationary time series that are conditionally stationary, where the non-stationary behaviour is modeled as stationarity conditioned on a set of (possibly hidden) state variables. Named State-Dependent Causal Inference (SDCI), our approach is able to recover the underlying causal dependencies, provably with fully-observed states and empirically with hidden states. The latter is confirmed by experiments on synthetic linear-system and nonlinear particle-interaction data, where SDCI achieves superior performance over baseline causal discovery methods. Improved results over non-causal RNNs on modeling NBA player movements demonstrate the potential of our method and motivate the use of causality-driven methods for forecasting.

1. INTRODUCTION

Deep learning has achieved profound success in vision and language modelling tasks (Brown et al., 2020; Nichol et al., 2021). Still, it remains a grand challenge and a prominent research direction to enable deep neural networks to perform causal discovery and reasoning (Yi et al., 2020; Girdhar & Ramanan, 2020; Sauer & Geiger, 2021), an inherent mechanism in human cognition (Spelke & Kinzler, 2007). For time series data specifically, causal discovery involves identifying the underlying temporal causal structure of the observed sequences. Many existing causal discovery approaches for time series assume stationarity (Granger, 1969; Peters et al., 2017; Löwe et al., 2020; Li et al., 2020; Tank et al., 2021), which is restrictive, as sequence data from real-world scenarios are often non-stationary with potential hidden confounders. Recent works introduce a number of different assumptions to tackle causal discovery for non-stationary time series (Zhang et al., 2017; Ghassami et al., 2018; Huang et al., 2019), but in general, causal discovery on non-stationary time series under mild and realistic assumptions remains an open problem. This work addresses this open challenge by proposing a causal discovery algorithm for conditionally stationary time series, in which the dynamics of the observed system change depending on a set of "state" variables. This assumption holds in many real-world scenarios, e.g., people behave differently and make different decisions depending on underlying factors such as mood, previous experience, and the actions of other agents. The causal discovery task for such conditionally stationary time series poses different challenges depending on the observability of the states, which we classify into four scenario classes: 1. Scenario class 1 concerns the simplest case, where the states are observed and their dynamics are independent of the other observed time series (Figure 1a). 2.
In Scenario class 2, the states are unobserved and directly dependent on observed variables. Figure 1b shows an example where the states of the variables change according to their positions (pink vs. purple regions). Another example is an agent moving in a room, where different behaviors are observed depending on its location. 3. Scenario class 3 is more challenging: the state depends on earlier events and thus cannot be directly inferred from the current observations. In Figure 1c, for example, particles change state upon collision; similarly, in a football game a player acts differently depending on the earlier actions of the others. 4. Finally, many real-world scenarios (e.g., Figure 1d) are governed by underlying states that are not fully identifiable from the observations over time. Here the states can act as unknown confounders to the observed time series, making the causal discovery task ill-defined. Our approach, named State-Dependent Causal Inference (SDCI), discovers summary graphs (Peters et al., 2017) conditioned on states, given observed sequences. It fits a graph neural network based variational auto-encoder (Löwe et al., 2020) to the non-stationary time series data, which enables efficient amortization of causal discovery across multiple observation sequences. We prove identifiability results for cases with fully-observed states; empirically, SDCI also applies to cases with hidden states, which is confirmed by experiments on both synthetic linear datasets and spring data (see Figures 1b & 1c), covering scenario classes 1-3.
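To make the notion of conditional stationarity concrete, the sketch below simulates a scenario-class-1 system: an observed two-state Markov chain (independent of the observations) selects between two linear dynamics, and each state's summary graph is recovered by per-state least-squares regression. This is an illustrative toy construction, not the SDCI model itself; all matrices, thresholds, and sizes are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 3, 2000  # number of observed variables, sequence length

# Hypothetical state-dependent transition matrices: entry A[s][i, j] is the
# linear effect of x_j(t) on x_i(t+1) when the state at time t equals s.
A = {
    0: np.array([[0.5, 0.4, 0.0],
                 [0.0, 0.5, 0.0],
                 [0.0, 0.0, 0.5]]),
    1: np.array([[0.5, 0.0, 0.0],
                 [0.0, 0.5, 0.4],
                 [0.4, 0.0, 0.5]]),
}

# Scenario class 1: the state is observed and its dynamics are independent
# of x (here, a sticky two-state Markov chain).
s = np.zeros(T, dtype=int)
for t in range(1, T):
    s[t] = s[t - 1] if rng.random() < 0.95 else 1 - s[t - 1]

# Generate the conditionally stationary series: the dynamics are stationary
# within each state and switch whenever the state switches.
x = np.zeros((T, N))
for t in range(T - 1):
    x[t + 1] = A[s[t]] @ x[t] + 0.1 * rng.standard_normal(N)

# Recover each state's summary graph by least-squares regression on the
# transitions observed under that state, thresholding small coefficients.
G_hat = {}
for state in (0, 1):
    idx = np.where(s[:-1] == state)[0]
    coef, *_ = np.linalg.lstsq(x[idx], x[idx + 1], rcond=None)
    G_hat[state] = (np.abs(coef.T) > 0.2).astype(int)
```

Note that the per-state regression only works here because the state is fully observed; in scenario classes 2-4 the states must themselves be inferred, which is the harder setting SDCI targets.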
Compared to baselines including a non-causal RNN-based approach, SDCI achieves better accuracy in identifying the underlying causal graph and in forecasting future trajectories from historical observations, on both simulated and real-world data such as particle interactions and player trajectories in NBA games.

2. RELATED WORK

Causal discovery aims to identify causal relationships over a set of variables from observational data (Glymour et al., 2019). Constraint-based methods rely on conditional independence tests to recover the underlying DAG structure of the data. Representative approaches include the PC algorithm (Spirtes et al., 2000) and Fast Causal Inference (FCI) (Spirtes, 2001), as well as their extensions to time series data (Entner & Hoyer, 2010; Runge, 2018). Score-based methods, such as Greedy Equivalence Search (GES) (Chickering, 2002), define and optimize score functions over causal graphs to identify the underlying causal structure. For time series data, these methods are reformulated as learning Dynamic Bayesian Networks (DBNs) from data (Murphy et al., 2002). A recent approach in this line is DYNOTEARS (Pamfil et al., 2020), which estimates both instantaneous and time-lagged relationships between variables in a time series without performing a combinatorial search over the space of all possible graphs. Functional causal model-based methods represent the effect as a function of its direct causes and their independent, immeasurable noise (Shimizu et al., 2006; Zhang & Hyvärinen, 2009; Peters et al., 2014; Glymour et al., 2019). For time series, these approaches fit a dynamic model, often with constrained functional forms and connection sparsity in favor of identifiability (Peters et al., 2013). Our work is concerned with modelling non-stationary time series using state variables as the entities responsible for changing the dynamics along the sequence.



Figure 1: Graphical representations of the data generation processes considered in this work. x_t represents the observations of a time series sequence, and s_t denotes the state variables. The state affects the future observations by changing the causal structure (denoted as f_t) for different state values. The representations are examples of (a) scenario class 1, (b) scenario class 2, (c) scenario class 3, and (d) other scenarios (image adapted from Oh et al. (2011)).

Most relevant to ours is Amortized Causal Discovery (ACD) (Löwe et al., 2020), which assumes stationary time series and amortizes the summary-graph extraction process across samples with different graphs but shared dynamics. Similar ideas are also proposed in Li et al. (2020) for video applications. We extend ACD by allowing the underlying causal structure to vary depending on a set of state variables. Among other works, Huang et al. (2015) extended Gaussian Process regression to identify time-varying functional causal models; Zhang et al. (2017) used kernel embeddings to detect distribution shifts in heterogeneous data, and Ghassami

