WEIGHTED CLOCK LOGIC POINT PROCESS

Abstract

Datasets involving multivariate event streams are prevalent in numerous applications. We present a novel framework for modeling temporal point processes called clock logic neural networks (CLNN), which learn weighted clock logic (wCL) formulas as interpretable temporal rules by which some events promote or inhibit other events. Specifically, CLNN models temporal relations between events using conditional intensity rates informed by a set of wCL formulas, which are more expressive than those in related prior work. Unlike conventional approaches that search for generative rules through expensive combinatorial optimization, we design smooth activation functions for components of wCL formulas that enable a continuous relaxation of the discrete search space and efficient learning of wCL formulas using gradient-based methods. Experiments on synthetic datasets demonstrate our model's ability to recover the ground-truth rules and to improve computational efficiency. In addition, experiments on real-world datasets show that our models perform competitively when compared with state-of-the-art models.

1. INTRODUCTION AND RELATED WORK

Multivariate event streams are an emerging type of data that involves occurrences of different types of events in continuous time. Event streams are observed in a wide range of applications, including but not limited to finance (Bacry et al., 2015), politics (O'Brien, 2010), system maintenance (Gunawardana et al., 2011), healthcare (Weiss & Page, 2013), and social networks (Farajtabar et al., 2015). As opposed to time series data, which typically comprise continuous-valued variables evolving at regular discrete time stamps, event streams involve events occurring irregularly and asynchronously in continuous time. Modeling the dynamics in event streams is important for a wide range of scientific and industrial processes, such as predicting the occurrence of events of interest or understanding why some deleterious events occur so as to possibly prevent their occurrence. A (multivariate) temporal point process (TPP) provides a formal mathematical framework for representing event streams, where a conditional intensity rate for each event measures its occurrence rate at any time given the historical events in the stream (Daley & Vere-Jones, 2003; Aalen et al., 2008). There has been a proliferation of research around TPPs in recent years, particularly around the use of neural networks for modeling conditional intensity rates as a function of historical occurrences (Du et al., 2016; Mei & Eisner, 2017; Xiao et al., 2017; Xu et al., 2017; Gao et al., 2020; Zhang et al., 2020; Zuo et al., 2020). One stream of research studies graphical event models (GEMs) as a compact and interpretable graphical representation for TPPs, where the conditional intensity rate for any particular event depends only on the history of a subset of the events (Didelez, 2008; Gunawardana & Meek, 2016).
While any TPP can be represented as a GEM, various models make assumptions about the parametric form of conditional intensity rates for the sake of learnability, for instance that rates are piece-wise constant with respect to occurrences within historical windows (Gunawardana et al., 2011; Bhattacharjya et al., 2018). Ordinal GEMs (OGEMs) (Bhattacharjya et al., 2020; 2021) are a recent model from this family where a conditional intensity rate depends on the order in which parent events occur within the most recent historical time period. A temporal logic point process (TLPP) framework was proposed as an alternate way to lend some interpretability to TPPs by modeling intensity rates using temporal logic rules (Li et al., 2020). Although the initial work pre-specified temporal logic rules, recent work has introduced a temporal logic rule learner (TELLER) for automatically discovering rules (Li et al., 2021). Scalability is an issue, however, since TELLER relies on an expensive branch-and-price algorithm to search for temporal logic rules in a discrete space. Another important limitation is that TELLER's rules are not informative enough to explain how the interval length between ordered events impacts the conditional intensity rate. For instance, while predicting the occurrence of diabetes, the rule that "insulin injection happens 20 minutes before eating meal" is more informative and accurate in predicting "blood glucose remains normal" than the rule that "insulin injection happens before eating meal", as the latter rule does not capture the interval between 'insulin injection' and 'eating meal'. To tackle the above limitations, we propose novel atomic predicates that enrich the expressiveness of temporal logic rules, as well as a differentiable framework to learn rules in an end-to-end manner.
This work introduces a differentiable neuro-symbolic framework, clock logic neural network (CLNN), to model TPPs by learning weighted clock logic (wCL) formulas as explanations. First, event streams are converted into continuous-time clock signals representing the time interval between the last occurrence of an event and the current time. Next, we propose a novel wCL to describe the underlying temporal relations with relative interval length, enabling the design of a CLNN to learn the generative mechanisms. Instead of searching for temporal logic rules in some vast discrete space, CLNN associates every neuron with an order representation or a logical operator and assigns weights to edges to reflect the importance of various inputs, which relaxes the search space to be continuous. Moreover, architecture weights are introduced into CLNN to make the formula structure search differentiable. The wCL formula-informed intensity rates are carefully designed so that the parameters appearing in the rules can be learned through maximum likelihood estimation using gradient-based approaches. CLNN is tested on synthetic datasets to show that it can recover the ground-truth rules, as well as on real-world datasets to demonstrate its model-fitting performance.
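The clock signal described above can be illustrated with a minimal sketch. The function name `clock`, the toy stream, and the two boundary conventions (an occurrence exactly at the query time counts as the last occurrence, and a label that has not yet occurred gets an infinite clock value) are assumptions made for illustration only; the clocking function actually used by CLNN may differ.

```python
import math
from typing import List, Tuple

Event = Tuple[str, float]  # (event label, time stamp)


def clock(stream: List[Event], label: str, t: float) -> float:
    """Time elapsed at t since the last occurrence of `label`.

    Assumptions (not taken from the paper): occurrences with time stamp
    <= t count, and math.inf is returned if `label` has not occurred yet.
    """
    past = [ti for (l, ti) in stream if l == label and ti <= t]
    return t - max(past) if past else math.inf


# Hypothetical toy stream, not the one shown in Figure 1(a).
toy_stream = [("A", 1.0), ("B", 2.5), ("A", 4.0)]

for query_time in (3.0, 5.0):
    print(query_time, {l: clock(toy_stream, l, query_time) for l in "ABC"})
```

Each label yields one piecewise-linear signal that resets whenever the label occurs, so comparing two such signals at the same time indicates which label occurred more recently and by how much.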

2.1. NOTATION & BACKGROUND

Let $L$ denote the set of event labels, and $M = |L|$ denote the number of event labels. An event stream is a sequence of time-stamped events, denoted as $D = \{(l_1, t_1), (l_2, t_2), \ldots, (l_N, t_N)\}$, where $t_i \in \mathbb{R}_+$ denotes a time stamp between the beginning time $t_0 = 0$ and the end time $t_{N+1} = T$, and $l_i \in L$ is the event label that occurs at $t_i$. We use 'event label' and 'label' interchangeably. Every event label $l \in L$ has an associated conditional intensity rate describing the occurrence rate of label $l$ at time $t$ given the history up to $t$. In multivariate temporal point processes, conditional intensity rates describe the dynamics of events. Let $H_t = \{(l_i, t_i) : t_i < t\}$ denote the historical events up to time $t$. The conditional intensity rate of event label $l$ is denoted as $\lambda_l(t \mid H_t)$. Specifically, $\lambda_l(t \mid H_t)$ describes the expected number of occurrences of event label $l$ per unit time in an infinitesimal interval $[t, t + \Delta t]$ given the history $H_t$, i.e., $\lambda_l(t \mid H_t) = \lim_{\Delta t \to 0} E[N_l(t + \Delta t) - N_l(t) \mid H_t] / \Delta t$, where $N_l(t)$ denotes the number of occurrences of event label $l$ up to $t$.

Example 1. A running example of an event stream with 11 events of 4 labels is shown in Figure 1(a).
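To make the notation concrete, the following sketch represents an event stream as a list of (label, time stamp) pairs and computes the history H_t and the counting process N_l(t). The helper names, the toy stream, and the constant-rate estimate N_l(T)/T are illustrative assumptions; in particular, the constant-rate estimate ignores the history and is only a crude stand-in for the conditional intensity rate.

```python
from typing import List, Tuple

Event = Tuple[str, float]  # (event label l_i, time stamp t_i)


def history(stream: List[Event], t: float) -> List[Event]:
    """H_t: all events with time stamp strictly before t."""
    return [(l, ti) for (l, ti) in stream if ti < t]


def count(stream: List[Event], label: str, t: float) -> int:
    """N_l(t): number of occurrences of `label` up to and including t."""
    return sum(1 for (l, ti) in stream if l == label and ti <= t)


def average_rate(stream: List[Event], label: str, T: float) -> float:
    """History-independent estimate N_l(T) / T (an assumption for
    illustration, not the conditional intensity model of the paper)."""
    return count(stream, label, T) / T


# Hypothetical toy stream, not the one shown in Figure 1(a).
toy_stream = [("A", 1.0), ("B", 2.5), ("A", 4.0), ("C", 6.0), ("A", 9.0)]

print(history(toy_stream, 4.0))
print(count(toy_stream, "A", 4.0))
print(average_rate(toy_stream, "A", 10.0))
```

Note that `history` uses a strict inequality, matching the definition of H_t, so an event occurring exactly at t is not part of its own history.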

2.2. ORDER REPRESENTATIONS FOR EVENT STREAMS

The overall workflow of the proposed framework is visualized in Figure 1(b). The raw event streams first go through a masking function to generate the masked event streams, which are then transformed into event clocks using a clocking function. The event clocks are given as inputs to the clock logic neural network (CLNN) to learn interpretable wCL formulas and the intensity rate of event occurrences. The following sections provide a detailed explanation of each module in Figure 1(b). We are interested in exploring the effect of temporal ordering between event labels and the occurrences of causal event labels in a historical window on the occurrence rate of a particular event label,



Figure 1: (a) An event stream example with $N = 11$ events of $M = 4$ event labels over $T = 30$ days. (Integer-valued time stamps are used for easy interpretation; the proposed approach also works for $t_i \in \mathbb{R}$.) (b) The overall workflow of the proposed method (POC: paired order cell, SOC: singleton order cell, AC: architecture cell; details are presented in Sections 2.2 to 3.3).

