SELF-SUPERVISED TIME SERIES REPRESENTATION LEARNING BY INTER-INTRA RELATIONAL REASONING

Abstract

Self-supervised learning achieves superior performance in many domains by extracting useful representations from unlabeled data. However, most traditional self-supervised methods focus on exploring the inter-sample structure, while less effort has been devoted to the underlying intra-temporal structure, which is important for time series data. In this paper, we present SelfTime: a general self-supervised time series representation learning framework that explores the inter-sample relation and the intra-temporal relation of time series to learn the underlying structural features of unlabeled time series. Specifically, we first generate inter-sample relations by sampling positive and negative samples of a given anchor sample, and intra-temporal relations by sampling time pieces from this anchor. Then, based on the sampled relations, a shared feature extraction backbone combined with two separate relation reasoning heads is employed to quantify the relationships of the sample pairs for inter-sample relation reasoning, and of the time piece pairs for intra-temporal relation reasoning, respectively. Finally, useful representations of time series are extracted from the backbone under the supervision of the relation reasoning heads. Experimental results on multiple real-world time series datasets for the time series classification task demonstrate the effectiveness of the proposed method. Code and data are publicly available.¹
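The relation-sampling step described in the abstract can be sketched as follows. This is an illustrative minimal sketch only: the function names, the distance-binning scheme used to label intra-temporal pairs, and the Gaussian-jitter augmentation for positive pairs are our own simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_time_pieces(series, piece_len, n_pieces):
    """Sample contiguous windows ("time pieces") and their start indices from one series."""
    starts = rng.integers(0, len(series) - piece_len + 1, size=n_pieces)
    pieces = np.stack([series[s:s + piece_len] for s in starts])
    return pieces, starts

def intra_temporal_pairs(series, piece_len=16, n_pieces=4, n_classes=3):
    """Build (piece_i, piece_j, relation_label) samples for intra-temporal reasoning.

    The relation label here simply bins the temporal distance between the two
    pieces' start points into `n_classes` buckets (an assumed labeling scheme).
    """
    pieces, starts = sample_time_pieces(series, piece_len, n_pieces)
    max_dist = len(series) - piece_len
    pairs, labels = [], []
    for i in range(n_pieces):
        for j in range(i + 1, n_pieces):
            dist = abs(int(starts[i]) - int(starts[j]))
            labels.append(min(n_classes - 1, dist * n_classes // (max_dist + 1)))
            pairs.append((pieces[i], pieces[j]))
    return pairs, labels

def inter_sample_pairs(anchor, others,
                       augment=lambda x: x + rng.normal(0.0, 0.1, len(x))):
    """Build (series_a, series_b, label) samples for inter-sample reasoning.

    Positive pair (label 1): anchor vs. an augmented view of itself.
    Negative pairs (label 0): anchor vs. each other series in the batch.
    """
    pairs = [(anchor, augment(anchor))]
    labels = [1]
    for other in others:
        pairs.append((anchor, other))
        labels.append(0)
    return pairs, labels
```

In a full pipeline, each pair would be encoded by the shared backbone and the concatenated pair features fed to the corresponding relation reasoning head, which is trained to predict these self-generated labels.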

1. INTRODUCTION

Time series data is ubiquitous, and there has been significant progress in time series analysis (Das, 1994) in machine learning, signal processing, and other related areas, with many real-world applications such as healthcare (Stevner et al., 2019), industrial diagnosis (Kang et al., 2015), and financial forecasting (Sen et al., 2019). Deep learning models have emerged as successful tools for time series analysis (Hochreiter & Schmidhuber, 1997; Graves et al., 2013; Shukla & Marlin, 2019; Fortuin et al., 2019; Oreshkin et al., 2020). Despite their fair share of success, existing deep supervised models are not suitable for high-dimensional time series data with a limited amount of training samples, as these data-driven approaches rely on ground-truth labels for supervision, and data labeling is a labor-intensive, time-consuming, and sometimes impossible process for time series data. One solution is to learn useful representations from unlabeled data, which can substantially reduce the dependence on costly manual annotation.

Self-supervised learning aims to capture the most informative properties from the underlying structure of unlabeled data through self-generated supervisory signals, in order to learn generalized representations. Recently, self-supervised learning has attracted increasing attention in computer vision through the design of different pretext tasks, on image data such as solving jigsaw puzzles (Noroozi & Favaro, 2016), inpainting (Pathak et al., 2016), rotation prediction (Gidaris et al., 2018), and contrastive learning of visual representations (Chen et al., 2020), and on video data such as object tracking (Wang & Gupta, 2015) and pace prediction (Wang et al., 2020). Although some video-based ap-

¹ Anonymous repository link.

