CROSS-WINDOW SELF-TRAINING VIA CONTEXT VARIATIONS FROM SPARSELY-LABELED TIME SERIES

Abstract

A real-world time series is often sparsely labeled because annotation is expensive. Recently, self-training methods have been applied to datasets with few labels to infer the labels of unlabeled augmented instances. Extending this trend to time-series data by fully exploiting its sequential nature, we propose a novel data augmentation approach, called context-additive augmentation, which allows a target instance to be augmented simply by adding preceding and succeeding instances to form an augmented instance. Unlike existing augmentation techniques, which may alter the target instance by directly perturbing its features, it preserves the target instance as is and still yields various augmented instances through varying contexts. Additionally, we propose a cross-window self-training framework based on context-additive augmentation. The framework first augments target instances by applying context-varying windows over a given time series. It then derives reliability-based cross-window labels and uses them to maintain consistency among the augmented instances across windows. Extensive experiments on real datasets show that the framework outperforms existing state-of-the-art self-training methods.

1. INTRODUCTION

A time series is a collection of consecutive data points, often annotated with temporally coherent timestamp labels, and this work deals with a model aiming to classify every timestamp in a time series correctly. However, due to the length and complexity of a time series, labeling every timestamp requires prohibitively high cost; therefore, in practice, many time series are only sparsely labeled (Moltisanti et al., 2019; Ma et al., 2020; Deldari et al., 2021; Shin et al., 2022). In this regard, self-training is a promising way to train a model from sparse labels, leveraging the model's own output to infer new labels for unlabeled data points (Laine & Aila, 2017; Rizve et al., 2021). Recent state-of-the-art self-training methods, mostly developed for image data, necessitate domain-specific data augmentation (Sohn et al., 2020; Zhang et al., 2021; Kim & Lee, 2022). Such conventional data augmentation generates multiple different instances from a target instance (i.e., an instance for pseudo-labeling) by way of data perturbation. If data instances are independent of one another, as in image data, there is no other way than to perturb the target instance itself. In contrast, exploiting the sequential nature of a time series, where data instances (segments or data points) are temporally correlated, it is feasible to generate multiple different instances from a target instance without perturbing it, by adding its surrounding sequence (i.e., context). As shown in Figure 1(a), given a target instance sampled from a time series, contexts of varying lengths are added to the preceding and succeeding positions of the target instance to generate different pairs of "augmented" instances. We call this type of data augmentation context-additive augmentation. Its key property is achieving the effect of data augmentation without perturbing the target instance.
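The window-extraction step described above can be sketched in a few lines. The helper name, the symmetric context lengths, and the boundary-clipping behavior below are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def context_additive_augment(series, target_start, target_len, context_lens):
    """Generate augmented instances by adding preceding and succeeding
    context of varying lengths around an unperturbed target segment.

    Each element of `context_lens` yields one augmented instance; the
    target segment itself is copied verbatim into every view.
    """
    augmented = []
    for c in context_lens:
        lo = max(0, target_start - c)                       # clip at series start
        hi = min(len(series), target_start + target_len + c)  # clip at series end
        augmented.append(series[lo:hi])
    return augmented

# Toy univariate time series of 100 points.
series = np.arange(100.0)
views = context_additive_augment(series, target_start=40, target_len=10,
                                 context_lens=[0, 5, 20])
# The target segment series[40:50] appears unperturbed in every view;
# only the surrounding context differs.
```

Because the target segment is never modified, no inverse transform or label correction is needed when comparing model outputs across views, which is what makes the later consistency step straightforward.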
Being free of data perturbation brings several benefits. First, consistency between augmented instances can be enforced more reliably because the target instance itself is exactly the same among its augmented instances. Second, a sufficient number of augmented instances can be easily obtained through context variations. Third, it is computationally inexpensive, requiring only the retrieval of a sub-sequence from a time series. Moreover, context-additive augmentation can be used together with conventional data augmentation such as jittering and scaling. Thus, the novel concept of context-additive augmentation opens a new direction of data augmentation for sequential data, i.e., time series. Despite its time-series-oriented concept and substantial benefits, applying context-additive augmentation to self-training is challenging. First, it requires determining the proper number and range of context variations based on the trade-off between the expected performance improvement and the training cost. Note that varying the context length in augmented instances incurs different complexity for a downstream task; intuitively, in terms of the intensity of perturbation in conventional data augmentation, a short context gives weak augmentation and a long context gives strong augmentation. Second, a new consistency regularization method is needed to fully exploit the benefit of context-additive augmentation, which, unlike conventional data augmentation, does not perturb a target instance. To address these challenges, we propose a novel self-training approach, called CrossMatch, for time series.
In existing self-training methods such as MixMatch (Berthelot et al., 2019) and FixMatch (Sohn et al., 2020), an artificial label is created in the form of a hard label; because the model's outputs for augmented instances could be biased by the perturbation of the target instance, the most confident label is chosen, by averaging and sharpening in MixMatch and by weak augmentation and thresholding in FixMatch. In CrossMatch, on the other hand, owing to its target-preserving property, the model's outputs with contexts of the same length are considered equally meaningful; therefore, the model's output from one augmented instance (i.e., window) is crossed to the other augmented instances in the form of a soft label. As shown in Figure 1(b), a pair of augmented instances generated from a target instance is fed to a model to obtain the two softmax outputs of the target instance. Then, a single set of cross-window soft labels is derived and enforced on each output for consistency regularization. The same procedure repeats over diverse pairs of augmented instances. In summary, for time-series self-training, CrossMatch conducts context-additive augmentation with varying contexts and consistency regularization among the augmented instances using cross-window soft labels. Regarding the first challenge above, we empirically analyze the impact of context variations on classification accuracy in Section 4.2. Through extensive evaluation on three sparsely-labeled time-series datasets, CrossMatch, despite its simplicity, achieves higher classification accuracy than the existing state-of-the-art methods, outperforming FixMatch-style methods with jittering and scaling by up to 23%p.
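The soft-label consistency step can be sketched as follows. This is a minimal sketch under stated assumptions: the plain average of softmax outputs and the cross-entropy consistency loss are simplifying choices for illustration; the paper's reliability-based derivation of cross-window labels is not reproduced here:

```python
import numpy as np

def cross_window_soft_labels(probs_per_window):
    """Derive one set of cross-window soft labels for a target instance
    from the model's softmax outputs across its windows.

    Since the target is unperturbed, outputs from windows with contexts
    of the same length are treated as equally meaningful, so they are
    averaged rather than sharpened or thresholded into a hard label.
    """
    return np.mean(probs_per_window, axis=0)

def consistency_loss(probs_per_window, soft_labels, eps=1e-12):
    """Cross-entropy of each window's output against the shared soft
    label; enforcing this keeps the windows' predictions consistent."""
    losses = [-(soft_labels * np.log(p + eps)).sum() for p in probs_per_window]
    return float(np.mean(losses))

# Softmax outputs for one target instance seen through two windows.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.5, 0.3, 0.2]])
soft = cross_window_soft_labels(probs)   # [0.6, 0.25, 0.15]
loss = consistency_loss(probs, soft)     # enforced on each window's output
```

Contrast with FixMatch-style pseudo-labeling: no argmax or confidence threshold is applied, so low-confidence but mutually consistent predictions still contribute to the regularization signal.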

2. RELATED WORK

2.1 DATA AUGMENTATION

Data augmentation perturbs given data instances to generate diverse and sufficient data instances to prevent overfitting (Shorten & Khoshgoftaar, 2019). The techniques used in data augmentation


Figure 1: Illustration of CrossMatch.

