RHINO: DEEP CAUSAL TEMPORAL RELATIONSHIP LEARNING WITH HISTORY-DEPENDENT NOISE

Abstract

Discovering causal relationships between different variables from time series data has been a long-standing challenge for many domains such as climate science, finance, and healthcare. Given the complexity of real-world relationships and the nature of observations in discrete time, causal discovery methods need to consider non-linear relations between variables, instantaneous effects and history-dependent noise (the change of noise distribution due to past actions). However, previous works do not offer a solution addressing all these problems together. In this paper, we propose a novel causal relationship learning framework for time-series data, called Rhino, which combines vector auto-regression, deep learning and variational inference to model non-linear relationships with instantaneous effects while allowing the noise distribution to be modulated by historical observations. Theoretically, we prove the structural identifiability of Rhino. Our empirical results from extensive synthetic experiments and two real-world benchmarks demonstrate better discovery performance compared to relevant baselines, with ablation studies revealing its robustness under model misspecification.

1. INTRODUCTION

Time series data is a collection of data points recorded at different timestamps describing a pattern of chronological change. Identifying the causal relations between different variables and their interactions through time (Spirtes et al., 2000; Berzuini et al., 2012; Guo et al., 2020; Peters et al., 2017) is essential for many applications, such as climate science and health care. Randomized controlled trials are the gold standard for discovering such relationships, but may be unavailable due to cost and ethical constraints. Therefore, causal discovery from observational data alone is important and fundamental to many real-world applications (Löwe et al., 2022; Bussmann et al., 2021; Moraffah et al., 2021; Wu et al., 2020; Runge, 2018; Tank et al., 2018; Hyvärinen et al., 2010; Pamfil et al., 2020). The task of temporal causal discovery can be challenging for several reasons: (1) relations between variables can be non-linear in the real world; (2) with a slow sampling interval, everything that happens between two consecutive observations is aggregated into the same timestamp, which appears as an instantaneous effect; (3) the noise may be non-stationary (its distribution depends on past observations), i.e. history-dependent noise. For example, in stock markets, the announcement of a decision by a leading company after the market closes may have complex effects (i.e. non-linearity) on its stock price immediately after the market opens (i.e. slow sampling interval and instantaneous effect), and its price volatility may also change (i.e. history-dependent noise). Similarly, in education, students who recently earned good marks on algebra tests should also score well on an upcoming algebra exam, with little variation (i.e. history-dependent noise). To the best of our knowledge, the performance of existing frameworks suffers in many real-world scenarios because they cannot address these aspects satisfactorily. In particular, history-dependent noise has rarely been considered in the past.
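To make the three challenges concrete, the following is a minimal illustrative sketch of a two-variable process that exhibits all of them. It is not the Rhino model and is not taken from the paper's experiments: the function names (`simulate`, `noise_std_by_history`), the coefficients, and the functional forms (tanh, sine, sigmoid) are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def simulate(n_steps, seed=0):
    """Simulate a toy bivariate series (x, y); all forms are illustrative."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for t in range(1, n_steps):
        # (1) Non-linear lagged effect: x depends on tanh of its own past.
        x_t = math.tanh(x[t - 1]) + 0.5 * rng.gauss(0.0, 1.0)
        # (3) History-dependent noise: the noise scale of y is a sigmoid
        # of the previous value of x, so it lies between 0.1 and 1.0.
        scale = 0.1 + 0.9 / (1.0 + math.exp(-x[t - 1]))
        # (2) Instantaneous effect: y at time t depends on x at the same t.
        y_t = 0.5 * math.sin(y[t - 1]) + 0.8 * x_t + scale * rng.gauss(0.0, 1.0)
        x.append(x_t)
        y.append(y_t)
    return x, y


def noise_std_by_history(x, y):
    """Return the residual std of y, split by the sign of the previous x."""
    low, high = [], []
    for t in range(1, len(x)):
        # Subtract the known deterministic part to recover the noise term.
        residual = y[t] - 0.5 * math.sin(y[t - 1]) - 0.8 * x[t]
        (high if x[t - 1] > 0 else low).append(residual)
    return statistics.pstdev(low), statistics.pstdev(high)


if __name__ == "__main__":
    xs, ys = simulate(5000, seed=0)
    std_low, std_high = noise_std_by_history(xs, ys)
    print(f"noise std when previous x <= 0: {std_low:.3f}")
    print(f"noise std when previous x  > 0: {std_high:.3f}")
```

In this toy process the spread of the noise on y is larger after high values of x than after low ones, which is the kind of dependence on the past that a model with a fixed noise distribution cannot represent.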
A large category of prior work, called Granger causality (Granger, 1969), is based on the fact that cause-effect relationships can never go against time. Despite many recent advances (Wu et al., 2020; Shojaie & Michailidis, 2010; Siggiridou & 

