LEAVES: LEARNING VIEWS FOR TIME-SERIES DATA IN CONTRASTIVE LEARNING

Abstract

Contrastive learning, a self-supervised learning method that learns representations from unlabeled data, has shown promising progress. Many contrastive learning methods depend on data augmentation techniques, which generate different views from the original signal. However, tuning policies and hyper-parameters for effective data augmentation in contrastive learning is often time- and resource-consuming. Researchers have designed approaches that automatically generate new views for input signals, especially for image data, but view-learning methods remain underdeveloped for time-series data. In this work, we propose a simple but effective module that automates view generation for time-series data in contrastive learning, named LEAVES (learning views for time-series data). The proposed module learns the hyper-parameters of augmentations through adversarial training within contrastive learning. We validate the effectiveness of the proposed method on multiple time-series datasets. The experiments demonstrate that the proposed method finds more reasonable views and performs better on downstream tasks than the baselines, including manually tuned augmentation-based contrastive learning methods and state-of-the-art (SOTA) methods.

1. INTRODUCTION

Contrastive learning has been widely applied to improve model robustness on various downstream tasks involving images (Chen et al., 2020; Grill et al., 2020; Wang & Qi, 2022) and time-series data (Mohsenvand et al., 2020; Mehari & Strodthoff, 2022). Among the developed contrastive learning methods, data augmentation plays an essential role in generating corrupted transformations of the original input, used as views in the pretext task. For example, Chen et al. (2020) proposed SimCLR, which pre-trains the model by maximizing the agreement between augmented views of the same sample and significantly outperformed the previous state-of-the-art in image classification with far less labeled data. However, the selection of data augmentation methods is usually empirical, and tuning an optimized set of augmentations can cost thousands of GPU hours even with automated search algorithms (Cubuk et al., 2019). Therefore, how to effectively generate views for a new dataset remains an open question. Instead of using hand-crafted views, researchers have trained deep learning methods to generate optimized views for input samples (Tamkin et al., 2020; Rusak et al., 2020). These methods generate reasonably corrupted views for image datasets and achieve satisfactory results. For example, Tamkin et al. (2020) proposed the ViewMaker, an adversarially trained convolutional module in contrastive learning, to generate augmentations for images. Nevertheless, such methods might not transfer well when applied directly to time-series data. The main challenge is that, for a time-series signal, we need not only to disturb the magnitudes (spatial domain) but also to distort the temporal dimension (Um et al., 2017; Mehari & Strodthoff, 2022), whereas the image-based methods only disturb the spatial domain by adding reasonable noise to the input data.
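The agreement-maximization objective that SimCLR optimizes is the NT-Xent (normalized temperature-scaled cross-entropy) loss: each view's positive is the other view of the same sample, and all remaining views in the batch act as negatives. A minimal numpy sketch of this loss (illustrative only; the published implementations operate on GPU tensors):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss over two batches of view embeddings z1, z2 (N, d).

    For each of the 2N embeddings, the positive is the other view of the
    same sample; the remaining 2N - 2 embeddings serve as negatives."""
    z = np.concatenate([z1, z2], axis=0)              # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    sim = z @ z.T / tau                               # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                    # mask self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive index
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

The loss is small when the two views of each sample agree and large when they disagree, which is what makes it a useful adversarial target for a view generator.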
In this work, we propose LEAVES, a lightweight module for learning views on time-series data in contrastive learning. LEAVES is optimized adversarially against the contrastive loss to generate challenging views for the encoder to learn representations from. In addition, to introduce smooth temporal perturbations into the generated views, we propose a differentiable data augmentation technique for time-series data, named TimeDistort. Figure 1 shows examples of views of an electrocardiogram (ECG) generated by the ViewMaker (Tamkin et al., 2020) and by our method. We can find that no temporal location is perturbed in Fig. 1(a), and the flat region (the T-P interval, an ECG fiducial) of the original ECG signal is completely distorted. Compared to the ViewMaker, the proposed LEAVES distorts the original signal in both the spatial and temporal domains and, more importantly, reduces the risk of losing intact information through excessive perturbation in time-series data. Our experiments and analysis show that the proposed LEAVES (1) outperforms the baselines, including SimCLR and the SOTA methods, and (2) generates more reasonable views than the SOTA methods on time-series data.
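To illustrate the kind of smooth temporal perturbation described above, the following numpy sketch warps the time axis of a signal with a random monotonic mapping and resamples it onto a uniform grid. This is only a hypothetical illustration consistent with the description; the paper's TimeDistort is differentiable (e.g., via differentiable interpolation), and its exact form is not reproduced here.

```python
import numpy as np

def time_distort(x, strength=0.2, n_knots=4, rng=None):
    """Smoothly warp the time axis of a 1-D signal.

    Random positive rates at a few knots are interpolated and cumulatively
    summed into a monotonic mapping old_time -> new_time; the signal is then
    resampled along it. This perturbs *when* things happen while leaving
    magnitudes intact. (Non-differentiable numpy sketch for illustration.)"""
    rng = np.random.default_rng(rng)
    t = len(x)
    knots = 1.0 + strength * rng.standard_normal(n_knots)
    knots = np.clip(knots, 0.1, None)        # keep the warp strictly monotonic
    rate = np.interp(np.linspace(0, n_knots - 1, t), np.arange(n_knots), knots)
    warped = np.cumsum(rate)
    warped = (warped - warped[0]) / (warped[-1] - warped[0]) * (t - 1)
    return np.interp(np.arange(t), warped, x)  # resample onto a uniform grid
```

Because the warp only remaps time, sample magnitudes stay within the original signal's range, which matches the goal of avoiding excessive perturbation of flat regions such as the T-P interval.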

2. RELATED WORK

2.1. AUGMENTATION-BASED CONTRASTIVE LEARNING

Among the contrastive learning algorithms proposed in various areas, data augmentation methods usually play an essential role in generating views from the original input to form contrastive pairs. Many contrastive learning frameworks have recently been developed based on image transformations in computer vision (He et al., 2020; Chen et al., 2020; Grill et al., 2020; Chen & He, 2021; Tamkin et al., 2020; Zbontar et al., 2021; Wang & Qi, 2022; Zhang & Ma, 2022). For example, Chen et al. (2020) proposed the SimCLR framework, which maximizes the agreement between two views transformed from the same image. BYOL (Grill et al., 2020) encourages two networks, a target network and an online network, to interact and learn from each other based on two augmented views of an image. Zbontar et al. (2021) proposed the Barlow Twins framework, which applies a redundancy-reduction objective to two corrupted views of an image to avoid trivial constant solutions in contrastive learning. Beyond computer vision, contrastive learning algorithms have also been applied to time-series data (Gopal et al., 2021; Mehari & Strodthoff, 2022; Wickstrøm et al., 2022). The aforementioned research has achieved promising results by leveraging unlabeled data; however, the empirically augmented views might not be optimal, especially for datasets that are relatively new or less popular, as exploring appropriate sets of augmentations is expensive.
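The redundancy-reduction idea behind Barlow Twins can be made concrete: the objective drives the cross-correlation matrix of the two views' standardized embeddings toward the identity, so diagonal terms enforce invariance across views while off-diagonal terms decorrelate embedding dimensions. A simplified numpy sketch of that objective (illustrative, not the authors' implementation):

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins-style objective on two batches of embeddings (N, d).

    Standardize each embedding dimension, form the cross-correlation
    matrix, and penalize its deviation from the identity: diagonal terms
    enforce invariance, off-diagonal terms reduce redundancy."""
    n = z1.shape[0]
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-8)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-8)
    c = z1.T @ z2 / n                          # cross-correlation (d, d)
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    return on_diag + lam * off_diag
```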

2.2. AUTOMATIC AUGMENTATION

Rather than setting augmentation methods empirically, researchers have proposed multiple methods for optimizing augmentation strategies (Cubuk et al., 2019; Ho et al., 2019; Lim et al., 2019; Li et al., 2020; Cubuk et al., 2020; Liu et al., 2021). For example, AutoAugment (Cubuk et al., 2019) is a reinforcement learning-based algorithm that searches for augmentation policies, including the probability and order of applying different augmentation methods. DADA (Li et al., 2020) is a gradient-based optimization strategy that selects the augmentation policy with the highest probabilities after training, which significantly reduces the search time compared to search-based methods such as AutoAugment.
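The core trick behind gradient-based policy search is relaxing the discrete choice of augmentation into a differentiable one, for instance by mixing every candidate's output with learnable softmax weights. The sketch below illustrates this relaxation in numpy; the operation list and parameters are hypothetical, and DADA's actual formulation (a Gumbel-based relaxation) is not reproduced here.

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def apply_policy(x, weights, rng=None):
    """Soft mixture over candidate augmentations of a 1-D signal.

    Instead of sampling one discrete operation, output the softmax-weighted
    sum of all candidates' outputs, so in an autodiff framework the policy
    weights would receive gradients. (Illustrative ops only.)"""
    rng = np.random.default_rng(rng)
    ops = [
        lambda s: s + 0.1 * rng.standard_normal(len(s)),  # jitter
        lambda s: 1.1 * s,                                # scaling
        lambda s: s[::-1],                                # time flip
    ]
    p = softmax(weights)
    return sum(pi * op(x) for pi, op in zip(p, ops))
```

Pushing one weight far above the others recovers the corresponding discrete augmentation, which is how a trained policy is read out after search.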



Figure 1: Visualization of the views learned using (a) the ViewMaker and (b) the proposed LEAVES. Compared to the ViewMaker, LEAVES introduces temporal distortion and the augmented view is more faithful.

For example, Gopal et al. (2021) proposed a clinical domain-knowledge-based augmentation for ECG data and generated views of ECG signals for contrastive learning. Mehari & Strodthoff (2022) applied well-evaluated methods such as SimCLR, BYOL, and CPC (Oord et al., 2018) to time-series ECG data for clinical downstream tasks. Wickstrøm et al. (2022) generated contrastive views by applying the MixUp augmentation (Zhang et al., 2017) to time-series data.
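MixUp applied to time series is simply a convex combination of two signals with a Beta-distributed mixing coefficient. A minimal sketch of the augmentation (the specific `alpha` value is illustrative):

```python
import numpy as np

def mixup(x1, x2, alpha=0.2, rng=None):
    """MixUp for time series: a convex combination of two signals.

    lam ~ Beta(alpha, alpha); the same lam can weight the corresponding
    label pair when labels are available."""
    rng = np.random.default_rng(rng)
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam
```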

