ANOMALY DETECTION IN DYNAMICAL SYSTEMS FROM MEASURED TIME SERIES

Abstract

This paper addresses the problem of detecting abnormalities in nonlinear processes represented by measured time series. Anomaly detection is usually formulated as finding data points that are outliers relative to typical signals, such as unexpected spikes, drops, or trend changes. In nonlinear dynamical systems, however, a time series may contain no statistical outliers even though the process corresponds to an abnormal configuration of the dynamical system. Since the polynomial neural architecture has a strong connection with the theory of differential equations, we use it to extract features that describe the dynamical system itself. The paper considers both simulations and a practical example with real measurements. The applicability of the proposed approach and its benchmarking against existing methods are discussed.

1. INTRODUCTION

Most work on anomaly detection in time series data concerns the detection of an "observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism" (see Hawkins (1980)). The anomaly detection problem is usually formulated as finding data points that are outliers relative to typical signals, such as unexpected spikes, drops, trend changes, and level shifts. Blázquez-García et al. (2020) provide a literature review that deals exclusively with time series data and propose a taxonomy for classifying outlier detection techniques according to their main characteristics. In contrast to these methods, we address the problem of anomaly detection in dynamical systems from measured time series. In this case, we are interested in detecting anomalies that do not deviate much from other observations but were nevertheless generated by a different configuration of the dynamical system.

To better explain this issue, let us consider a dynamical system in the form of the ordinary differential equation (ODE)

$$\frac{d}{dt} X = F(t, X, a_1, a_2), \tag{1}$$

where $X = (x_1, x_2, \ldots, x_n)$ is a state vector and $F$ is a nonlinear function depending on two scalar parameters $a_1$ and $a_2$. The dynamical system (1) generates a trajectory in the form of a multivariate time series as a particular solution for a given initial condition $X_0$. Let us also assume for simplicity that the initial condition $X(0) = X_0$ is the same for all trajectories and that only the parameters vary, drawn from a normal distribution. Since the system (1) is nonlinear, the distribution of the trajectories is unknown in advance and may differ from the normal one. As a result, trajectories corresponding to abnormal system parameters may lie among the other trajectories. Fig. 1 demonstrates that parameters of the system (1) taken from the tail of the normal distribution can correspond to centered trajectories in the time-space.

Figure 1: The abnormal parameters of the dynamical system from the tail of the normal distribution may correspond to centered trajectories in the time-space. The normal distribution is not preserved due to the nonlinear differential transformation that describes the dynamical system behavior.

This example formulates the problem of detecting anomalies in dynamical systems that are represented only by measured trajectories. We are interested in unsupervised methods for recovering representative features of the dynamical system from the time series. Such a method should calculate features that are correlated with the dynamical system itself, not just with a time series generated by that system. In addition, collecting massive training sets in industrial applications is often not feasible. This makes it difficult to apply existing approaches that are based either on the extraction of statistical features or on state-of-the-art neural networks for modeling temporal dependencies. To address this lack of data, we rely on polynomial neural networks, which have recently gained the attention of researchers in the field of learning dynamical systems from measured data. The most relevant to our research is the paper by López et al. (1993), where the connection between systems of ODEs and polynomial neural networks (PNN) is introduced. PNN architectures have since been widely discussed in the literature. For example, Zjavka (2011) proposes a polynomial neural architecture that approximates differential equations. The Legendre polynomial is chosen as a basis by Yang et al. (2018).
In all these papers, the authors suggest either integrating the parametrized polynomial ODEs or training a PNN as a black-box model in order to identify dynamical systems from measured time series and predict new trajectories. In contrast to these works, Ivanov et al. (2020) suggest a deterministic algorithm that translates an arbitrary nonlinear ODE to a PNN without training. If the dynamics of the system approximately follow a given ODE, the Taylor mapping approach allows the weights of the neural network (TM-PNN) to be calculated directly from the equations. This TM-PNN reproduces the initial ODEs to the necessary level of accuracy and can be fine-tuned to recover the true dynamics from real measurements without numerical integration of the ODEs. Further, Ivanov & Agapov (2020) applied the TM-PNN architecture in a practical application to control one of the largest X-ray sources. A deep TM-PNN is initialized from the ODEs that describe the charged particle motion and consists of more than 1500 hidden polynomial layers with unique weights that are fine-tuned with only one measured trajectory of the real system. For the introduced problem of anomaly detection in dynamical systems represented by time series, we utilize both the PNN and TM-PNN architectures. Similar to López et al. (1993), we train a PNN from scratch with measured trajectories. But instead of using the resulting model to predict dynamics, we focus on the feature extraction problem and consider the weights of the PNN as features describing the dynamical system itself. Since this approach does not require any a priori knowledge about the dynamical system in the form of ODEs and relies only on data, we compare it with statistical feature extraction and LSTM-based anomaly detection for time series. For benchmarking against a simple ODE-based parameter estimation, we utilize the TM-PNN architecture and follow the idea proposed in Ivanov & Agapov (2020).
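The idea of treating PNN weights as features of the system can be illustrated with a minimal sketch. It is not the training procedure of the cited works: it assumes a single second-order polynomial layer mapping the state at one time step to the state at the next, a two-dimensional state, and an ordinary least-squares fit in place of gradient-based training. The toy dynamics `toy_map`, its parameter `a`, and all function names are hypothetical and chosen only for illustration.

```python
def poly_features(x, y):
    """Second-order monomials of a 2-D state: 1, x, y, x^2, x*y, y^2."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def fit_polynomial_layer(trajectory):
    """Least-squares fit of X[k+1] = W @ features(X[k]); returns W flattened."""
    rows = [poly_features(x, y) for x, y in trajectory[:-1]]
    nf = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(nf)] for i in range(nf)]
    weights = []
    for dim in range(2):  # one weight row per output coordinate
        targets = [state[dim] for state in trajectory[1:]]
        rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(nf)]
        weights.extend(solve_linear(gram, rhs))
    return weights  # 12 numbers used as the feature vector of the system


def toy_map(x, y, a):
    """Hypothetical polynomial dynamics with a single parameter `a`."""
    return 0.9 * x + 0.2 * y, -0.2 * x + 0.9 * y + a * x * x


def simulate(a, steps=200, x0=1.0, y0=0.0):
    """Generate one trajectory of the toy system from a fixed initial state."""
    traj = [(x0, y0)]
    for _ in range(steps):
        traj.append(toy_map(*traj[-1], a))
    return traj


features = fit_polynomial_layer(simulate(a=0.05))
```

In this toy setting the fitted weight at index 9 (the coefficient of `x * x` in the second output) approximates the parameter `a`, so two trajectories that look similar in time can still be separated by their weight vectors.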
Given the system of ODEs that defines the dynamical system, with an initial assumption of zero values for all free parameters, we translate the equations to a PNN using the Taylor mapping technique. Then we compare the direct tuning of the ODE parameters with the fine-tuning of the weights of the pre-initialized TM-PNN. Although the Anonymous Company from the automotive industry provides real measured data related to the introduced problem, the dataset is small: it contains only 50 time series with only one true anomaly. Moreover, the anomaly scores of the normal data are unknown. For benchmarking purposes, we therefore use simulated datasets of the same size based on the Van der Pol oscillator. Then we apply the PNN to the analysis of real-world datasets that were generated in production settings.
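A simulated dataset of this kind could be generated roughly as follows. This is a sketch under stated assumptions, not the configuration used in the paper: it takes the standard Van der Pol form x'' - mu (1 - x^2) x' + x = 0 with a single parameter `mu`, draws `mu` from a normal distribution with hypothetical mean 1.0 and standard deviation 0.1, keeps the initial condition fixed, and integrates with a classical fourth-order Runge-Kutta scheme. The step size, trajectory length, and function names are likewise illustrative.

```python
import random


def van_der_pol(state, mu):
    """Right-hand side of x'' - mu*(1 - x^2)*x' + x = 0 as a first-order system."""
    x, y = state
    return (y, mu * (1.0 - x * x) * y - x)


def rk4_step(state, mu, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return (state[0] + h * k[0], state[1] + h * k[1])

    k1 = van_der_pol(state, mu)
    k2 = van_der_pol(shifted(k1, dt / 2), mu)
    k3 = van_der_pol(shifted(k2, dt / 2), mu)
    k4 = van_der_pol(shifted(k3, dt), mu)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(mu, x0=(1.0, 0.0), dt=0.01, steps=1000):
    """Integrate one trajectory from the fixed initial condition x0."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], mu, dt))
    return trajectory


def make_dataset(n_series=50, mu_mean=1.0, mu_std=0.1, seed=0):
    """Sample one parameter per series (assumed values) and simulate each."""
    rng = random.Random(seed)
    mus = [rng.gauss(mu_mean, mu_std) for _ in range(n_series)]
    return mus, [simulate(mu) for mu in mus]


mus, dataset = make_dataset()
```

Keeping the sampled values `mus` alongside the trajectories makes it possible to label a series as anomalous by how far its parameter lies in the tail of the distribution, independently of how the trajectory itself looks.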

