TIMEAUTOML: AUTONOMOUS REPRESENTATION LEARNING FOR MULTIVARIATE IRREGULARLY SAMPLED TIME SERIES

Abstract

Multivariate time series (MTS) data are becoming increasingly ubiquitous in diverse domains, e.g., IoT systems, health informatics, and 5G networks. To obtain an effective representation of MTS data, it is essential not only to account for the unpredictable dynamics and highly variable lengths of these data but also to address the irregularities in their sampling rates. Existing parametric approaches rely on manual hyperparameter tuning and can require a substantial amount of labor. It is therefore desirable to learn the representation automatically and efficiently. To this end, we propose an autonomous representation learning approach for multivariate time series (TimeAutoML) with irregular sampling rates and variable lengths. As opposed to previous works, we first present a representation learning pipeline in which the configuration and hyperparameter optimization are fully automatic and can be tailored to various tasks, e.g., anomaly detection, clustering, etc. Next, a negative sample generation approach and an auxiliary classification task are developed and integrated within TimeAutoML to enhance its representation capability. Extensive empirical studies on real-world datasets demonstrate that the proposed TimeAutoML outperforms competing approaches on various tasks by a large margin. In fact, it achieves the best anomaly detection performance among all comparison algorithms on 78 out of 85 UCR datasets, achieving up to a 20% improvement in AUC score.

1. INTRODUCTION

The past decade has witnessed a rising proliferation of Multivariate Time Series (MTS) data, along with a plethora of applications in domains as diverse as IoT data analysis, medical informatics, and network security. Given the huge amount of MTS data, it is crucial to learn their representations effectively so as to facilitate downstream applications such as clustering and anomaly detection. For this purpose, different types of methods have been developed to represent time series data. Traditional time series representation techniques, e.g., the Discrete Fourier Transform (DFT) (Faloutsos et al., 1994), the Discrete Wavelet Transform (DWT) (Chan & Fu, 1999), and Piecewise Aggregate Approximation (PAA) (Keogh et al., 2001), represent raw time series based on specific domain knowledge or data properties and hence can be suboptimal for subsequent tasks, since their objectives and feature extraction are decoupled. More recent approaches, e.g., Deep Temporal Clustering Representation (DTCR) (Ma et al., 2019) and the Self-Organizing Map based Variational Auto-Encoder (SOM-VAE) (Fortuin et al., 2018), optimize the representation and the underlying task, such as clustering, in an end-to-end manner. These methods usually assume that the time series under investigation are uniformly sampled at a fixed interval. This assumption, however, does not always hold in practice. For example, within a multimodal IoT system, the sampling rates can vary across different types of sensors.

Unsupervised representation learning for irregularly sampled multivariate time series is a challenging task, and several major hurdles prevent us from building effective models: i) designing the neural network architecture often involves a trial-and-error procedure that is time-consuming and can cost a substantial amount of labor; ii) the irregularity in the sampling rates constitutes a major challenge to learning time series representations effectively and renders most existing methods not directly applicable; iii) traditional unsupervised time series representation learning approaches do not consider contrastive loss functions and consequently can achieve only suboptimal performance.

To tackle these challenges, we propose TimeAutoML, an autonomous unsupervised representation learning approach for irregularly sampled multivariate time series. TimeAutoML differs from traditional time series representation approaches in three aspects. First, the representation learning pipeline configuration and hyperparameter optimization are carried out automatically. Second, a negative sample generation approach is proposed to produce negative samples for contrastive learning. Finally, an auxiliary classification task is developed to distinguish normal time series from these negative samples (a minimal sketch of these two components follows at the end of this section). In this way, the representation capability of TimeAutoML is greatly enhanced.

We conduct extensive experiments on the UCR time series datasets and the UEA multivariate time series datasets. Our experiments demonstrate that the proposed TimeAutoML outperforms comparison algorithms on both clustering and anomaly detection tasks by a large margin, especially when the time series data are irregularly sampled.
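To illustrate how negative sample generation and the auxiliary classification task can interact, the following PyTorch sketch pairs a simple negative generator with a binary head trained to separate real series from negatives. The permutation-based generation rule, the `encoder` interface (mapping a batch of series to fixed-size embeddings), and all names here are illustrative assumptions, not the exact TimeAutoML design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_negatives(x: torch.Tensor) -> torch.Tensor:
    # x: (batch, time, channels). Randomly permute the time axis of each
    # series, destroying temporal structure while preserving the marginal
    # value distribution. ASSUMPTION: the paper's actual generation scheme
    # may differ; this permutation is only one plausible choice.
    idx = torch.argsort(torch.rand(x.size(0), x.size(1), device=x.device), dim=1)
    return torch.gather(x, 1, idx.unsqueeze(-1).expand_as(x))

class AuxiliaryClassifier(nn.Module):
    # Binary head that separates embeddings of real series from negatives.
    def __init__(self, d_embed: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(d_embed, d_embed // 2),
            nn.ReLU(),
            nn.Linear(d_embed // 2, 1),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.head(z).squeeze(-1)  # one logit per series

def auxiliary_loss(encoder: nn.Module, clf: AuxiliaryClassifier,
                   x: torch.Tensor) -> torch.Tensor:
    # Real series are labeled 1, generated negatives 0. The encoder is
    # assumed to map (batch, time, channels) to (batch, d_embed).
    x_neg = make_negatives(x)
    logits = torch.cat([clf(encoder(x)), clf(encoder(x_neg))])
    labels = torch.cat([torch.ones(x.size(0), device=x.device),
                        torch.zeros(x.size(0), device=x.device)])
    return F.binary_cross_entropy_with_logits(logits, labels)
```

In such a setup, the auxiliary loss would be added to the main representation learning objective, so that the encoder is pushed to produce embeddings in which genuine temporal structure is linearly separable from its absence.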

2. RELATED WORK

Unsupervised Time Series Representation Learning. Time series representation learning plays an essential role in a multitude of downstream analyses such as classification, clustering, and anomaly detection. There is growing interest in unsupervised time series representation learning, partially because no labels are required in the learning process, which suits many practical applications well. Unsupervised time series representation learning can be broadly divided into two categories: 1) multi-stage methods and 2) end-to-end methods. Multi-stage methods first learn a distance metric from a set of time series, or extract features from the time series, and then perform downstream machine learning tasks based on the learned metric or the extracted features. Euclidean distance (ED) and Dynamic Time Warping (DTW) are the most commonly used traditional time series distance metrics. Although ED is competitive, it is very sensitive to outliers in the time series; the main drawback of DTW is its heavy computational burden (a minimal DTW implementation is sketched at the end of this section). Traditional time series feature extraction methods include Singular Value Decomposition (SVD), Symbolic Aggregate Approximation (SAX), the Discrete Wavelet Transform (DWT) (Chan & Fu, 1999), Piecewise Aggregate Approximation (PAA) (Keogh et al., 2001), etc. Nevertheless, most of these traditional methods are designed for regularly sampled time series and may not perform well on irregularly sampled ones. In recent years, many new feature extraction methods and distance metrics have been proposed to overcome these drawbacks. For instance, Paparrizos & Gravano (2015) and Petitjean et al. (2011) combine their proposed distance metrics with the K-Means algorithm to achieve clustering. Lei et al. (2017) first extract sparse features of time series, which are insensitive to outliers and irregular sampling rates, and then carry out K-Means clustering. In contrast, end-to-end approaches learn the representation of the time series without explicit feature extraction or distance learning (Fortuin et al., 2018; Ma et al., 2019). However, the aforementioned methods require manually designing the network architecture based on human experience, which is time-consuming and costly. Instead, we propose in this paper a representation learning method that configures the pipeline and optimizes its hyperparameters in a fully autonomous manner. Furthermore, we incorporate negative sampling and contrastive learning into the proposed framework to effectively enhance the representation ability of the resulting neural network architecture.

Irregularly Sampled Time Series Learning. There exist two main groups of work on machine learning for irregularly sampled time series data. The first imputes the missing values before conducting the subsequent machine learning tasks (Shukla & Marlin, 2019; Luo et al., 2018; 2019; Kim & Chi, 2018). The second learns directly from the irregularly sampled time series. For instance, Che et al. (2018) and Cao et al. (2018) propose a memory decay mechanism which, when no value is sampled at a given timestamp, replaces the RNN memory cell with the memory of the previous timestamp multiplied by a learnable decay coefficient. Rubanova et al. (2019) combine RNNs with ordinary differential equations to model the dynamics of irregularly sampled time series.
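To make the ED versus DTW trade-off mentioned above concrete, here is a minimal dynamic-programming DTW in NumPy. It illustrates why DTW tolerates local time shifts and unequal lengths but costs O(nm) per pair, whereas ED is O(n) and requires equal lengths; the function name and interface are ours, for illustration only.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Classic O(len(a) * len(b)) dynamic-programming DTW between two
    # univariate series. D[i, j] holds the minimal cumulative cost of
    # aligning a[:i] with b[:j].
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```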
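Likewise, the memory decay mechanism of Che et al. (2018) described above can be sketched as a GRU cell whose hidden state is decayed according to the time elapsed since the last observation. This is a simplified illustration under our own naming: the full GRU-D model also decays inputs toward empirical means, which is omitted here.

```python
import torch
import torch.nn as nn

class DecayGRUCell(nn.Module):
    # GRU cell with a learnable memory decay, in the spirit of Che et al.
    # (2018): instead of carrying the hidden state over unchanged across a
    # gap with no observations, it is shrunk toward zero as a function of
    # the elapsed time.
    def __init__(self, d_in: int, d_hidden: int):
        super().__init__()
        self.cell = nn.GRUCell(d_in, d_hidden)
        self.decay = nn.Linear(1, d_hidden)  # maps elapsed time to decay rates

    def forward(self, x_t: torch.Tensor, h: torch.Tensor,
                delta_t: torch.Tensor) -> torch.Tensor:
        # x_t: (batch, d_in) current observation; h: (batch, d_hidden)
        # previous hidden state; delta_t: (batch, 1) time since the last
        # observation. gamma lies in (0, 1]: long gaps decay memory more.
        gamma = torch.exp(-torch.relu(self.decay(delta_t)))
        return self.cell(x_t, gamma * h)  # decay memory before the update
```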
Different from the previous works, TimeAutoML makes use of the special characteristics of RNNs (Abid & Zou, 2018) and automatically configures a representation learning pipeline to model the temporal dynamics of time series. It is worth mentioning that there are many types of

