TIME SERIES ANOMALY DETECTION VIA HYPOTHESIS TESTING FOR DYNAMICAL SYSTEMS

Abstract

Real-world systems, such as robots, weather, energy systems, and stock markets, are complicated and high-dimensional. Hence, without prior knowledge of the system dynamics, detecting or forecasting abnormal events from sequential observations of the system is challenging. In this work, we address the problem caused by high dimensionality by viewing time series anomaly detection as hypothesis testing on dynamical systems. This perspective prevents the dimension of the problem from growing linearly with the time horizon, and naturally leads to a novel anomaly detection model, termed DyAD (Dynamical system Anomaly Detection). Furthermore, as existing time-series anomaly detection algorithms are usually evaluated on relatively small datasets, we release a large-scale dataset for detecting battery failures in electric vehicles. We benchmark several popular algorithms on both public datasets and our newly released dataset. Our experiments demonstrate that the proposed model achieves state-of-the-art results.

1. INTRODUCTION

Hypothesis testing aims to decide whether the observed data supports or rejects a default belief known as the null hypothesis. Applications are abundant. In this work, we view anomaly detection as an application of hypothesis testing. This perspective is nothing profound: samples from the null hypothesis can be viewed as in-distribution, and rejection can be viewed as detecting anomalies. Despite being rather straightforward, this view has not been carefully investigated in large-scale anomaly detection tasks, because most classical hypothesis testing methods suffer from the curse of dimensionality. In this work, we address the problem incurred by high dimensionality by focusing on time series data collected from unknown dynamical systems. We exploit the structure of dynamical systems and show that although the time series data can be high-dimensional due to the long time horizon, the problem remains tractable. More specifically, the concentration that yields statistical confidence comes not from independent variables but from martingales; we thus turn the high dimensionality caused by the long time horizon to our favor. Furthermore, our analysis leads to a detection procedure in which anomalies in the system (e.g., errors and attacks) can be isolated from the rarity of the system input (e.g., control commands), which reduces misclassification rates. By combining the above analysis with autoencoder-based probabilistic models, we develop a new model termed DyAD (Dynamical system Anomaly Detection). We show that the theory-motivated DyAD model achieves state-of-the-art performance on public datasets including MSL (Mars Science Laboratory rover) (Hundman et al., 2018) and SMAP (Soil Moisture Active Passive satellite) (O'Neill et al., 2010). To further validate our findings, we then release a much larger dataset (roughly 50 times larger in terms of data points) on which we benchmark several popular baselines.
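To make the hypothesis-testing view concrete, the following sketch (our illustration, not the paper's implementation) scores a trajectory by its negative log-likelihood under a model of normal dynamics. For a Markovian system the joint density factorizes into one-step conditionals, so the score is a sum over time steps rather than a density over the full high-dimensional trajectory; the `log_cond_density` callable is a hypothetical stand-in for a learned conditional model.

```python
import numpy as np

def anomaly_score(x, log_cond_density):
    """Average negative log-likelihood of a trajectory under a model of
    normal dynamics.  For a Markovian system the joint density factorizes
    as p(x_1) * prod_t p(x_t | x_{t-1}), so the score is a sum of
    one-step terms; concentration of this sum (a martingale argument,
    not independence) is what gives statistical confidence."""
    T = len(x)
    score = 0.0
    for t in range(1, T):
        score -= log_cond_density(x[t], x[t - 1])
    return score / (T - 1)

def detect(x, log_cond_density, threshold):
    """Reject the null hypothesis ('trajectory is normal') when the
    average negative log-likelihood exceeds the threshold."""
    return anomaly_score(x, log_cond_density) > threshold
```

As a toy usage example, taking the normal dynamics to be a unit-variance Gaussian random walk, `log_cond_density = lambda xt, xp: -0.5 * ((xt - xp) ** 2 + np.log(2 * np.pi))`, trajectories with much larger step sizes receive markedly higher scores and are rejected.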
Our released dataset focuses on the battery safety problem in electric vehicles. In recent years, electric vehicle (EV) adoption rates have increased exponentially due to their environmental friendliness, improved cruise range, and the reduced costs brought by onboard lithium batteries (Schmuch et al., 2018; Mauler et al., 2021). Yet, large-scale battery deployment can lead to unexpected fire incidents and product recalls (Deng et al., 2018). Hence, accurately evaluating the health status of EV batteries is crucial to the safety of drivers and passengers. To promote research in this field, we release a dataset collected from 301 electric vehicles, recorded over periods ranging from 3 months to 3 years. Only battery-related data recorded at charging stations is released for anonymity purposes. 50 of the 301 vehicles eventually suffered from battery failure. Experiments on the EV battery dataset confirm that our proposed model achieves better performance for system anomaly detection. In summary, our contributions are:

• We formulate hypothesis testing based on data observed from dynamical systems and derive a generalized likelihood ratio test that exploits the Markovian structure of observations from dynamical systems.

• We show that the above formulation leads to a novel model, termed DyAD, for anomaly detection on dynamical systems.

• We release a large dataset collected from 301 electric vehicles, out of which 50 suffered from battery failure. In addition to benchmarking anomaly detection algorithms, the dataset may be of independent interest for machine learning tasks in nonlinear systems.

2. RELATED WORK

2.1. ANOMALY DETECTION AND OUT-OF-DISTRIBUTION DETECTION

The difference between out-of-distribution (OOD) detection and anomaly detection (AD) is subtle. To the authors' knowledge, anomaly detection, compared to OOD detection, refers to identifying samples that differ more drastically, but more rarely, from the in-distribution samples. However, the mathematical formulations of the two problems are the same, and hence we use the two terms interchangeably in this manuscript. The core idea of AD is to develop a metric that differs drastically between normal and abnormal samples. Some previous works find that, in image tasks, the model output probability is higher for normal samples (Hendrycks & Gimpel, 2016; Golan & El-Yaniv, 2018). Others detect anomalies in the feature space by forcing or assuming that the features of normal samples concentrate (Schölkopf et al., 1999; Lee et al., 2018). Some enhance the representation power of networks through contrastive learning (Winkens et al., 2020; Tack et al., 2020) and data transformation (Golan & El-Yaniv, 2018). Ren et al. (2019) partition the input into semantic and background parts and define the log-likelihood difference between a normal model and a background model as a likelihood ratio to distinguish anomalies. More recently, Ristea et al. (2022) propose a self-supervised neural network composed of masked convolutional layers and channel attention modules for vision tasks, which predicts a masked region within the convolutional receptive field. Roth et al. (2022) use a memory bank learned from nominal samples together with nearest neighbor search to detect anomalies in industrial images.

2.2. TIME SERIES ANOMALY DETECTION

Since a battery system is a complex system that produces multivariate time series data, the most relevant deep learning research topic is multivariate time series anomaly detection. We briefly introduce recent progress and the common datasets used in this area.

Several recent works focus on multivariate time series anomaly detection. Malhotra et al. (2016) propose to model reconstruction probabilities of the time series with an LSTM-based encoder-decoder network and use the reconstruction errors to detect anomalies. Hundman et al. (2018) leverage the prediction errors of an LSTM model to detect anomalous telemetry data. Su et al. (2019) propose OmniAnomaly, which finds normal patterns through a stochastic recurrent neural network and uses reconstruction probabilities to determine anomalies. Zhao et al. (2020) capture multivariate correlations by treating each univariate series as an individual feature and by including two graph attention layers to learn the dependencies of multivariate series in both the temporal and feature dimensions. Deng & Hooi (2021) adopt graph neural networks to learn inter-variable interactions.

There are several public time series datasets for anomaly detection. The SMAP (Soil Moisture Active Passive satellite) dataset is collected by a NASA Earth observation satellite (O'Neill et al., 2010). The MSL (Mars Science Laboratory rover) dataset contains sequences collected to determine whether Mars was ever able to support microbial life (Hundman et al., 2018). The water
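Several of the methods above share a common recipe: reconstruct windows of the series with a model trained on normal data, then flag windows whose reconstruction error is unusually large. The following is a minimal sketch of that recipe (our illustration, not any specific paper's code); the `autoencoder` callable is a hypothetical placeholder for a trained encoder-decoder network, and the k-sigma threshold is one simple, commonly used choice.

```python
import numpy as np

def reconstruction_scores(x, window, autoencoder):
    """Slide a window over a multivariate series of shape (T, n_features)
    and score each window by its mean squared reconstruction error under
    a model of normal data.  `autoencoder` maps a (window, n_features)
    array to its reconstruction."""
    scores = []
    for t in range(len(x) - window + 1):
        seg = x[t : t + window]
        scores.append(float(np.mean((seg - autoencoder(seg)) ** 2)))
    return np.array(scores)

def flag_anomalies(scores, k=3.0):
    """Flag windows whose score exceeds the mean score by more than
    k standard deviations."""
    mu, sigma = scores.mean(), scores.std()
    return scores > mu + k * sigma
```

For instance, on a series that is flat except for one spike, a trivial "model" that always reconstructs the normal level flags exactly the windows covering the spike.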

