OUT-OF-DISTRIBUTION REPRESENTATION LEARNING FOR TIME SERIES CLASSIFICATION

Abstract

Time series classification is an important real-world problem. Due to the non-stationary property of time series, i.e., the distribution changes over time, it remains challenging to build models that generalize to unseen distributions. In this paper, we propose to view time series classification from the distribution perspective. We argue that the temporal complexity of a time series dataset can be attributed to unknown latent distributions that need to be characterized. To this end, we propose DIVERSIFY for out-of-distribution (OOD) representation learning on the dynamic distributions of time series. DIVERSIFY takes an iterative process: it first obtains the 'worst-case' latent distribution scenario via adversarial training, then reduces the gap between these latent distributions. We further show that the algorithm is theoretically supported. Extensive experiments are conducted on seven datasets with different OOD settings across gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition. Qualitative and quantitative results demonstrate that DIVERSIFY significantly outperforms other baselines and effectively characterizes the latent distributions. Code is available at https://github.com/microsoft/robustlearn.

1. INTRODUCTION

Time series classification is one of the most challenging problems in the machine learning and statistics community (Fawaz et al., 2019; Du et al., 2021). One important property of time series is non-stationarity, meaning that its statistical features change over time. For years, there have been tremendous efforts on time series classification, such as hidden Markov models (Fulcher & Jones, 2014), RNN-based methods (Hüsken & Stagge, 2003), and Transformer-based approaches (Li et al., 2019; Drouin et al., 2022). We propose to model time series from the distribution perspective to handle its dynamically changing distributions; more precisely, to learn out-of-distribution (OOD) representations for time series that generalize to unseen distributions. The general OOD/domain generalization problem has been extensively studied (Wang et al., 2022; Lu et al., 2022; Krueger et al., 2021; Rame et al., 2022), where the key is to bridge the gap between known and unknown distributions. Despite existing efforts, OOD in time series remains less studied and more challenging. Compared to image classification, the distribution of time series data keeps changing over time, containing diverse distribution information that should be harnessed for better generalization. Figure 1 shows an illustrative example. OOD generalization in image classification often involves several domains whose domain labels are static and known (subfigure (a)), which can be employed to build OOD models. However, Figure 1 (b) shows that in EMG time series data (Lobov et al., 2018), the distribution changes dynamically over time and its domain information is unavailable. If no attention is paid to exploring its latent distributions (i.e., sub-domains), predictions may fail in the face of diverse sub-domain distributions (subfigure (c)). This dramatically impedes existing OOD algorithms due to their reliance on domain information.
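The notion of latent sub-domains can be made concrete with a toy sketch (this is an illustration only, not the DIVERSIFY algorithm, which learns sub-domain labels adversarially): a non-stationary series is segmented into windows, and a simple clustering of window-level statistics recovers the unlabeled distribution shift. All names and choices below (window width, k-means on mean/std features) are our own assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic non-stationary series: the mean shifts halfway through,
# but no "domain label" marks the change point.
series = np.concatenate([rng.normal(0.0, 1.0, 500),
                         rng.normal(5.0, 1.0, 500)])

def window_features(x, width=50):
    # Summarize each non-overlapping window by (mean, std).
    wins = x[: len(x) // width * width].reshape(-1, width)
    return np.stack([wins.mean(axis=1), wins.std(axis=1)], axis=1)

def kmeans(feats, k=2, iters=20):
    # Tiny k-means, standing in for inferring latent sub-domain labels.
    # Initialize with the first and last windows to span the series.
    centers = feats[[0, -1]].astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels

labels = kmeans(window_features(series))
```

Here the two recovered clusters coincide with the two underlying regimes, illustrating that sub-domain structure exists in the data even when no domain annotation is given; DIVERSIFY's contribution is to characterize such latent distributions in a learned representation space rather than on hand-crafted statistics.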
In this work, we propose DIVERSIFY, an OOD representation learning algorithm for time series classification by characterizing the latent distributions inside the data. Concretely speaking, DI-

