SPATIO-TEMPORAL POINT PROCESSES WITH DEEP NON-STATIONARY KERNELS

Abstract

Point process data are becoming ubiquitous in modern applications, such as social networks, health care, and finance. Despite the powerful expressiveness of the popular recurrent neural network (RNN) models for point process data, they may not successfully capture sophisticated non-stationary dependencies in the data due to their recurrent structures. Another popular type of deep model for point process data is based on representing the influence kernel (rather than the intensity function) by neural networks. We take the latter approach and develop a new deep non-stationary influence kernel that can model non-stationary spatio-temporal point processes. The main idea is to approximate the influence kernel with a novel and general low-rank decomposition, which enables an efficient representation through deep neural networks and yields both computational efficiency and better performance. We also take a new approach to maintaining the non-negativity constraint of the conditional intensity by introducing a log-barrier penalty. We demonstrate the good performance and computational efficiency of our proposed method compared with the state-of-the-art on simulated and real data.

1. INTRODUCTION

Point process data, consisting of sequential events with timestamps and associated information such as location or category, are ubiquitous in modern scientific fields and real-world applications. The distribution of events is of great scientific and practical interest, both for predicting new events and for understanding the events' generative dynamics (Reinhart, 2018). To model such discrete events in continuous time and space, spatio-temporal point processes (STPPs) are widely used in a diverse range of domains, including modeling earthquakes (Ogata, 1988; 1998), the spread of infectious diseases (Schoenberg et al., 2019; Dong et al., 2021), and wildfire propagation (Hering et al., 2009). A modeling challenge is to accurately capture the underlying generative model of event occurrence in general STPPs while maintaining model efficiency. Specific parametric forms of the conditional intensity were proposed in the seminal works on the Hawkes process (Hawkes, 1971; Ogata, 1988) to tackle the issue of computational complexity in STPPs, which requires evaluating a complex multivariate integral in the likelihood function. They use an exponentially decaying influence kernel to measure the influence of a past event over time and assume that the influence of all past events is positive and linearly additive. Despite its computational simplicity (since evaluating the integral in the likelihood function is avoided), such a parametric form limits the model's practicality in modern applications. Recent models use neural networks in modeling point processes to capture complicated event occurrences. RNN (Du et al., 2016) and LSTM (Mei and Eisner, 2017) have been used, taking advantage of their representation power and their capability to capture temporal dependencies between events.
However, the recurrent structures of RNN-based models cannot capture long-range dependencies (Bengio et al., 1994), and attention-based structures (Zhang et al., 2020; Zuo et al., 2020) have been introduced to address this limitation of RNNs. Despite much development, existing models still cannot sufficiently capture spatio-temporal non-stationarity, which is common in real-world data (Graham et al., 2013; Dong et al., 2021). Moreover, while RNN-type models may produce strong prediction performance, they consist of general forms of network layers and their modeling power relies on the hidden states, so they are often not easily interpretable.
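To make the classical parametric form mentioned above concrete, the following is a minimal sketch of a purely temporal Hawkes process with an exponentially decaying influence kernel, in which the influence of past events is positive and linearly additive. It is an illustration under stated assumptions, not the model proposed in this paper: the function names and the parameters `mu` (background rate), `alpha` (jump size), and `beta` (decay rate) are hypothetical, and the spatial component is omitted. The sketch also shows why this form is computationally simple: the integral of the intensity in the log-likelihood has a closed form.

```python
import math


def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity at time t given past event times.

    lambda(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i))
    Parameter names and default values are illustrative assumptions.
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )


def hawkes_log_likelihood(history, horizon, mu=0.2, alpha=0.8, beta=1.0):
    """Log-likelihood of event times observed on [0, horizon].

    log L = sum_i log lambda(t_i) - integral_0^horizon lambda(s) ds
    For the exponential kernel the integral is available in closed form,
    so no numerical integration is needed.
    """
    log_term = sum(
        math.log(hawkes_intensity(t_i, history, mu, alpha, beta))
        for t_i in history
    )
    integral = mu * horizon + sum(
        (alpha / beta) * (1.0 - math.exp(-beta * (horizon - t_i)))
        for t_i in history
    )
    return log_term - integral


if __name__ == "__main__":
    events = [0.5, 1.2, 1.3, 4.0]  # made-up event times for illustration
    print(hawkes_intensity(2.0, events))
    print(hawkes_log_likelihood(events, horizon=5.0))
```

Because the kernel here depends only on the elapsed time `t - t_i`, it is stationary; a non-stationary kernel would in addition depend on the absolute time (and location) of the past event.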

