LEARNING TO DECOUPLE COMPLEX SYSTEMS FOR SEQUENTIAL DATA

Abstract

A complex system with cluttered observations may be a coupled mixture of multiple simple sub-systems corresponding to latent entities. Such sub-systems may exhibit distinct dynamics in the continuous-time domain, while complicated interactions between sub-systems also evolve over time. This setting is fairly common in the real world but has received little attention. In this paper, we propose a sequential learning approach for this setting that decouples a complex system in order to handle irregularly sampled and cluttered sequential observations. Such decoupling yields not only sub-systems describing the dynamics of each latent entity, but also a meta-system capturing the interactions between entities over time. Specifically, we argue that the meta-system of interactions is governed by a smoothed version of projected differential equations. Experimental results on synthetic and real-world datasets show the advantages of our approach over the state-of-the-art when facing complex and cluttered sequential data.

1. INTRODUCTION

Discovering hidden rules from sequential observations has been an essential topic in machine learning, with a large variety of applications such as physics simulation (Sanchez-Gonzalez et al., 2020), autonomous driving (Diehl et al., 2019), ECG analysis (Golany et al., 2021) and event analysis (Chen et al., 2021), to name a few. A standard scheme is to consider the sequential data at each timestamp as holistic and homogeneous under some ideal assumptions (i.e., only the temporal behavior of one entity is involved in a sequence), under which the data/observations are treated as a collection of slices, taken at different times, of a unified system. A series of sequential learning models fall into this category, including variants of recurrent neural networks (RNNs) (Cho et al., 2014; Hochreiter & Schmidhuber, 1997), neural differential equations (DEs) (Chen et al., 2018; Kidger et al., 2020; Rusch & Mishra, 2021; Zhu et al., 2021) and spatial/temporal attention-based approaches (Vaswani et al., 2017; Fan et al., 2019; Song et al., 2017). These variants fit well into scenarios agreeing with the aforementioned assumptions, and have proved effective for relatively simple applications with clean data sources.

In the real world, however, a system may not only describe a single, holistic entity, but may also consist of several distinguishable, interacting yet simple subsystems, where each subsystem corresponds to a physical entity. For example, we can think of the motion of a solar system as a mixture of the distinguishable subsystems of the sun and its surrounding planets, whose interactions over time are governed by the laws of gravity. Centuries ago, physicists and astronomers made enormous efforts to discover the rules of celestial motion from records of individual bodies, and eventually delivered the concise yet elegant differential equations (DEs) depicting the principles of moving bodies and the interactions between them.
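To make this picture concrete, consider the Newtonian N-body equations (a textbook example used here for illustration, not part of the proposed method):

\begin{equation*}
\ddot{\mathbf{x}}_i(t) \;=\; \sum_{j \neq i} G\, m_j\, \frac{\mathbf{x}_j(t) - \mathbf{x}_i(t)}{\left\lVert \mathbf{x}_j(t) - \mathbf{x}_i(t) \right\rVert^{3}}, \qquad i = 1, \dots, N,
\end{equation*}

where $\mathbf{x}_i$ denotes the position of body $i$, $m_j$ the mass of body $j$, and $G$ the gravitational constant. The per-body dynamics play the role of the simple subsystems, while the pairwise sum on the right-hand side is exactly the kind of time-evolving interaction between entities that a meta-system must capture.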
Likewise, nowadays researchers have developed a series of machine learning models for sequential data with distinguishable partitions (Qin et al., 2017). Two widely adopted strategies for learning the interactions between subsystems are graph neural networks (Iakovlev et al., 2021; Ha & Jeong, 2021; Kipf et al., 2018; Yıldız et al., 2022; Xhonneux et al., 2020) and attention mechanisms (Vaswani et al., 2017; Lu et al., 2020; Goyal et al., 2021), where the interactions are typically encoded with "messages" between nodes and pair-wise "attention scores", respectively. It is worth noting an even more difficult scenario, in which the data/observation is so cluttered that it cannot be readily distinguished into separate parts. This can be either due to the way of data

