QUICKEST CHANGE DETECTION FOR MULTI-TASK PROBLEMS UNDER UNKNOWN PARAMETERS

Abstract

We consider the quickest change detection problem where the parameters of both the pre- and post-change distributions are unknown, which prevents the use of classical simple hypothesis testing. Without additional assumptions, optimal solutions are not tractable as they rely on some minimax and robust variant of the objective. As a consequence, change points might be detected too late for practical applications (in economics, health care or maintenance, for instance). Other approaches solve a relaxed version of the problem through the use of particular probability distributions or the use of domain knowledge. We tackle this problem in the more complex Markovian case and provide a new scalable approximate algorithm with near-optimal performance that runs in O(1).

1. INTRODUCTION

Quickest Change Detection (QCD) problems arise naturally in settings where a latent state controls observable signals (Basseville et al., 1993). In biology, QCD is applied in genomic sequencing (Caron et al., 2012) and in reliable healthcare monitoring (Salem et al., 2014). In industry, it finds application in faulty machinery detection (Lu et al., 2017; Martí et al., 2015) and in leak surveillance (Wang et al., 2014). It also has environmental applications such as the detection of traffic-related pollutants (Carslaw et al., 2006). Any autonomous agent designed to interact with the world and achieve multiple goals must be able to detect relevant changes in the signals it is sensing in order to adapt its behaviour accordingly. This is particularly true for reinforcement learning based agents in multi-task settings, as their policy is conditioned on some task parameter (Gupta et al., 2018; Teh et al., 2017). For the agent to be truly autonomous, it needs to identify the task at hand according to the requirements of the environment. For example, a robot built to assist cooks in the kitchen should be able to recognise the task being executed (chopping vegetables, cutting meat, etc.) without external help in order to assist them efficiently. Otherwise, the agent requires a higher intelligence (one of the cooks, for instance) to control it by stating the task to be executed. In the general case, the current task is unknown and has to be identified sequentially from external sensory signals. The agent must track the changes as quickly as possible to adapt to its environment. However, current solutions for the QCD problem when task parameters are unknown either do not scale or impose restrictive conditions on the setting (i.i.d. observations, exponential family distributions, partial knowledge of the parameters, etc.). In this paper, we construct a scalable algorithm whose performance is similar to that of optimal solutions.
For this purpose, we use the change detection delay under known parameters as a lower bound for the delay in the unknown case. This improves our estimates of the parameters and thus improves our change point detection. We consider the case where the data is generated by Markovian processes, as in reinforcement learning. We assess the performance of our algorithm on synthetic data generated using distributions parameterised with neural networks in order to match the complexity level of real-life applications. We also evaluate our algorithm on standard reinforcement learning environments.
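To make the known-parameter baseline concrete, the following is a minimal sketch of the classical CUSUM procedure applied to a Markovian sequence. It is not the algorithm proposed in this paper. It assumes a scalar Gaussian AR(1) model, X_t = theta * X_{t-1} + noise, chosen purely for illustration, and the names simulate_ar1 and cusum_known_params are hypothetical.

```python
import numpy as np


def simulate_ar1(n, change_point, theta_pre, theta_post, sigma=1.0, seed=0):
    """Simulate X_t = theta_t * X_{t-1} + sigma * eps_t (illustrative model).

    theta_t equals theta_pre for t < change_point and theta_post afterwards.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        theta = theta_pre if t < change_point else theta_post
        x[t] = theta * x[t - 1] + sigma * rng.normal()
    return x


def cusum_known_params(x, theta_pre, theta_post, threshold, sigma=1.0):
    """Return the first index t at which the CUSUM statistic reaches
    `threshold`, or None if it never does.

    Both parameters are assumed known here, which is exactly what is
    unavailable in the setting studied in the paper.
    """
    stat = 0.0
    for t in range(1, len(x)):
        # Residuals of the one-step prediction under each hypothesis.
        r_pre = x[t] - theta_pre * x[t - 1]
        r_post = x[t] - theta_post * x[t - 1]
        # Gaussian log-likelihood ratio of post-change vs. pre-change.
        llr = (r_pre ** 2 - r_post ** 2) / (2.0 * sigma ** 2)
        # CUSUM recursion: accumulate evidence, reset at zero.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None


if __name__ == "__main__":
    data = simulate_ar1(n=2000, change_point=1000, theta_pre=0.0, theta_post=0.9)
    print(cusum_known_params(data, theta_pre=0.0, theta_post=0.9, threshold=10.0))
```

Each update costs a constant amount of work per observation. The threshold trades off false alarms against detection delay; the value used above is arbitrary.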

2. QUICKEST CHANGE DETECTION PROBLEMS

Formally, consider a sequence of random observations $(X_t)$ where each $X_t$ belongs to some observation space $\mathcal{X}$ (say, a Euclidean space for simplicity) and is drawn from $f_{\theta_t}(\cdot \mid X_{t-1})$, where 1

