REINFORCEMENT LOGIC RULE LEARNING FOR TEMPORAL POINT PROCESSES

Abstract

We aim to learn a set of temporal logic rules that explain the occurrence of temporal events. Leveraging the temporal point process modeling and learning framework, the rule content and rule weights are jointly learned by maximizing the likelihood of the observed noisy event sequences. The proposed algorithm alternates between a master problem, where the rule weights are updated, and a subproblem, where a new rule is searched for and added. The formulated master problem is convex and relatively easy to solve, whereas the subproblem requires searching a huge combinatorial space of rule predicates and temporal relations. To tackle this challenge, we propose a neural search policy that learns to generate the content of a new rule as a sequence of actions. The policy parameters are trained end-to-end with reinforcement learning, where the reward signals can be efficiently queried by evaluating the subproblem objective. The trained policy can be used to generate new rules; moreover, well-trained policies can be transferred directly to other tasks to speed up the rule search in the new task. We evaluate our method on both synthetic and real-world datasets, obtaining promising results.
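To make the alternating scheme concrete, the following is a minimal toy sketch, not the paper's algorithm: the predicate alphabet, the fixed two-predicate rule length, the `subproblem_gain` reward, and the proportional weight update are all hypothetical stand-ins (the actual method maximizes a temporal point process likelihood over temporal logic rules and solves a convex master problem). It only illustrates the control flow: a softmax policy proposes a rule as a sequence of predicate choices, REINFORCE updates the policy with the subproblem objective as the reward, and the master step then assigns weights to the collected rules.

```python
import math
import random

random.seed(0)

# Hypothetical alphabet of candidate body predicates for a rule.
PREDICATES = ["A", "B", "C", "D"]

def subproblem_gain(rule):
    """Toy stand-in for the subproblem objective: the likelihood gain from
    adding `rule` to the current rule set. By construction, the predicate
    pair {A, C} is the best rule in this toy problem."""
    return 1.0 if tuple(sorted(rule)) == ("A", "C") else 0.1

class RulePolicy:
    """Softmax policy that generates a rule as a sequence of predicate
    choices, trained with REINFORCE (score-function gradient)."""

    def __init__(self):
        self.logits = {p: 0.0 for p in PREDICATES}

    def probs(self):
        z = sum(math.exp(v) for v in self.logits.values())
        return {p: math.exp(v) / z for p, v in self.logits.items()}

    def sample_rule(self):
        # For simplicity a rule body is exactly two predicates.
        ps = self.probs()
        return tuple(random.choices(PREDICATES,
                                    weights=[ps[p] for p in PREDICATES], k=2))

    def reinforce(self, rule, reward, lr=0.5):
        # Softmax policy gradient: d log pi(p) / d logit_q = 1[p == q] - pi(q),
        # summed over the predicates chosen in this rule.
        ps = self.probs()
        for q in PREDICATES:
            grad = sum((1.0 if q == p else 0.0) - ps[q] for p in rule)
            self.logits[q] += lr * reward * grad

policy = RulePolicy()
for _ in range(300):
    # Subproblem: propose a new rule; the subproblem objective is the reward.
    rule = policy.sample_rule()
    policy.reinforce(rule, subproblem_gain(rule))

# Master step (toy): weight the collected rules by their gains; a real
# implementation would instead solve the convex likelihood maximization.
rules = [policy.sample_rule() for _ in range(5)]
gains = [subproblem_gain(r) for r in rules]
weights = [g / sum(gains) for g in gains]
```

After training, the policy concentrates its probability mass on the predicates of the high-reward rule, illustrating how the subproblem objective steers rule generation without enumerating the combinatorial rule space.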

1. INTRODUCTION

Understanding the generative process of events with irregular timestamps has long been an interesting problem. The temporal point process (TPP) is an elegant probabilistic model for such irregular events in continuous time. Instead of discretizing the time horizon and converting the event data into time-series event counts, TPP models treat the inter-event times directly as random variables and can be used to predict the time-to-event as well as future event types. Recent advances in neural temporal point process models have exhibited superior ability in event prediction (Du et al., 2016; Mei & Eisner, 2017). However, the lack of interpretability of these black-box models hinders their application in high-stakes systems such as healthcare. In healthcare, it is desirable to summarize medical knowledge and clinical experience about disease phenotypes and therapies into a collection of logic rules. The discovered rules can contribute to the sharing of clinical experience and aid in improving treatment strategies. They can also provide explanations for the occurrence of events. For example, the following clinical report, "A 50-year-old patient, with a chronic lung disease since 5 years ago, took the booster vaccine shot on March 1st. The patient got exposed to the COVID-19 virus around May 12th, and within a week afterward began to have a mild cough and nasal congestion. The patient received treatment as soon as the symptoms appeared. After intravenous infusions at a healthcare facility for around 3 consecutive days, the patient recovered...", contains many clinical events with recorded timestamps. It is appealing to distill compact and human-readable temporal logic rules from such noisy event data. In this paper, we propose an efficient reinforcement temporal logic rule learning algorithm to learn these rules automatically from event sequences. See Fig. 1 for an illustration of the types of temporal logic rules we aim to discover, where the rules are in disjunctive normal form (i.e., OR-of-ANDs) with temporal ordering constraints. Our proposed reinforcement rule learning algorithm builds upon the temporal logic point process (TLPP) models (Li et al., 2020), where the intensity functions (i.e., occurrence rates) of events are informed by temporal logic rules. TLPP is intrinsically a probabilistic model that treats the temporal

