Learning Stochastic Behaviour from Aggregate Data

Abstract

Learning nonlinear dynamics from aggregate data is a challenging problem because the full trajectory of each individual is not available: an individual observed at one time point may not be observed at the next, or the identities of individuals are unavailable. This is in sharp contrast to learning dynamics from trajectory data, on which the majority of existing methods are based. We propose a novel method that uses the weak form of the Fokker-Planck equation (FPE) to describe the density evolution of data in a sampling form, which is then combined with a Wasserstein generative adversarial network (WGAN) in the training process. In such a sample-based framework we are able to study nonlinear dynamics from aggregate data without solving the partial differential equation (PDE). The model can also handle high-dimensional cases with the help of deep neural networks. We demonstrate our approach on a series of synthetic and real-world data sets.

1. Introduction

In the context of a dynamical system, aggregate data refers to data sets in which the full trajectory of each individual is not available, meaning that there is no known individual-level correspondence between observations at different times. Typical examples include data sets collected for DNA evolution, social gatherings, density in control problems, bird migration (during which it is impossible to identify individual birds), and many more. In those applications, individuals observed at one time point may be unobserved at the next, or individual identities are blocked or unavailable for various technical and ethical reasons. Rather than inferring the exact information for each individual, the main objective of learning dynamics from aggregate data is to recover and predict the evolution of the distribution of all individuals together.

Trajectory data, in contrast, is data in which the information of each individual can be acquired at all times. Some studies have considered the case in which some individual trajectories are partially missing; however, the identities of those individuals, whenever they are observable, are always assumed to be available. Examples include stock prices, weather records, customer behaviors, and most training data sets for computer vision and natural language processing. There are many popular models for learning the dynamics of full-trajectory data. Typical ones include the Hidden Markov Model (HMM) (Alshamaa et al., 2019; Eddy, 1996), the Kalman Filter (KF) (Farahi & Yazdi, 2020; Harvey, 1990; Kalman, 1960) and the Particle Filter (PF) (Santos et al., 2019; Djuric et al., 2003), as well as models built upon HMM, KF and PF (Deriche et al., 2020; Fang et al., 2019; Hefny et al., 2015; Langford et al., 2009). They all require the full trajectory of each individual and therefore may not be applicable to aggregate data.

On the other hand, only a few methods in the recent learning literature focus on aggregate data. Hashimoto et al. (2016) assumed that the hidden dynamics of particles follow a stochastic differential equation (SDE); in particular, they used a recurrent neural network to parameterize the drift term. Furthermore, Wang et al. (2018) improved the traditional HMM by using an SDE to describe the evolution of the hidden states. To the best of our knowledge, there is not yet a method that directly learns the evolution of the density of objects from aggregate data. We propose to learn the dynamics of the density through the weak form of the Fokker-Planck equation (FPE), a parabolic partial differential equation (PDE) governing many dynamical systems subject to random noise perturbations, including the typical SDE models in existing studies. Our learning is accomplished by minimizing the Wasserstein distance between the predicted distribution given by the FPE and the empirical distribution of the data samples. Meanwhile, we utilize neural networks to handle higher-dimensional cases. More importantly, by leveraging the framework of Wasserstein Generative

