SIMPER: SIMPLE SELF-SUPERVISED LEARNING OF PERIODIC TARGETS

Abstract

From human physiology to environmental evolution, important processes in nature often exhibit meaningful and strong periodic or quasi-periodic changes. Due to their inherent label scarcity, learning useful representations for periodic tasks with limited or no supervision is of great benefit. Yet, existing self-supervised learning (SSL) methods overlook the intrinsic periodicity in data, and fail to learn representations that capture periodic or frequency attributes. In this paper, we present SimPer, a simple contrastive SSL regime for learning periodic information in data. To exploit the periodic inductive bias, SimPer introduces customized augmentations, feature similarity measures, and a generalized contrastive loss for learning efficient and robust periodic representations. Extensive experiments on common real-world tasks in human behavior analysis, environmental sensing, and healthcare domains verify the superior performance of SimPer compared to state-of-the-art SSL methods, highlighting its intriguing properties including better data efficiency, robustness to spurious correlations, and generalization to distribution shifts.

Figure 1: Learned representations of different methods (Supervised; Self-Supervised: SimCLR, CVRL, and SimPer (Ours)) on a periodic learning dataset, RotatingDigits (details in Section 4). Existing self-supervised learning schemes fail to capture the underlying periodic or frequency information in data. In contrast, SimPer learns robust periodic representations with high frequency resolution.

1. INTRODUCTION

Practical and important applications of machine learning in the real world, from monitoring the earth from space using satellite imagery (Espeholt et al., 2021) to detecting physiological vital signs in a human being (Luo et al., 2019), often involve recovering periodic changes. In the health domain, learning from video measurements has been shown to extract (quasi-)periodic vital signs including atrial fibrillation (Yan et al., 2018), sleep apnea episodes (Amelard et al., 2018), and blood pressure (Luo et al., 2019). In the environmental remote sensing domain, periodic learning is often needed to enable nowcasting of environmental changes such as precipitation patterns or land surface temperature (Sønderby et al., 2020). In the human behavior analysis domain, recovering the frequency of changes or the underlying temporal morphology in human motions (e.g., gait or hand motions) is crucial for those rehabilitating from surgery (Gu et al., 2019), or for detecting the onset or progression of neurological conditions such as Parkinson's disease (Liu et al., 2022; Yang et al., 2022b).
While learning periodic targets is important, labeling such data is typically challenging and resource intensive. For example, when designing a method to measure heart rate, collecting videos with highly synchronized gold-standard signals from a medical sensor is time consuming and labor intensive, and it requires storing privacy-sensitive biometric data. Fortunately, unlabeled data is abundant, which makes self-supervised learning that captures the underlying periodicity in data a promising alternative.
Yet, despite the great success of self-supervised learning (SSL) schemes on solving discrete classification or segmentation tasks, such as image classification (Chen et al., 2020; He et al., 2020), object detection (Xiao et al., 2021), action recognition (Qian et al., 2021), or semantic labeling (Hu et al., 2021), less attention has been paid to designing algorithms that capture periodic or quasi-periodic temporal dynamics from data. We highlight that existing SSL methods overlook the intrinsic periodicity in data: Fig. 1 shows the UMAP (McInnes et al., 2018) visualization of learned representations on RotatingDigits, a toy periodic learning dataset that aims to recover the underlying rotation frequency of different digits (details in Section 4). As the figure shows, state-of-the-art (SOTA) SSL schemes fail to capture the underlying periodic or frequency information in the data. Such observations persist across tasks and domains, as we show later in Section 4.
To fill the gap, we present SimPer, a simple self-supervised regime for learning periodic information in data. Specifically, to leverage the temporal properties of periodic targets, SimPer first introduces a temporal self-contrastive learning framework, where positive and negative samples are obtained through periodicity-invariant and periodicity-variant augmentations from the same input instance. Further, we identify the problem of using conventional feature similarity measures (e.g., cos(·)) for periodic representations, and propose periodic feature similarity to explicitly define how to measure similarity in the context of periodic learning. Finally, to harness the intrinsic continuity of augmented samples in the frequency domain, we design a generalized contrastive loss that extends the classic InfoNCE loss to a soft regression variant that enables contrasting over continuous labels (frequency).
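The three ingredients named above (periodicity-variant augmentation, a periodic feature similarity, and a soft variant of InfoNCE over continuous labels) can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names (`change_speed`, `max_xcorr`, `soft_info_nce`), the use of speed changes as the periodicity-variant augmentation, the choice of peak circular cross-correlation as the similarity, and the softmax over negative speed differences as the soft target are all hypothetical choices made here for concreteness. Raw 1-D signals stand in for encoder features.

```python
import numpy as np


def change_speed(x, speed):
    """Assumed periodicity-variant augmentation: play a 1-D signal `speed` times faster."""
    n = len(x)
    src = np.arange(n) * speed   # positions to read from the original signal
    src = src[src <= n - 1]      # drop positions past the end of the signal
    return np.interp(src, np.arange(n), x)


def max_xcorr(a, b):
    """Hypothetical periodic similarity: peak of the circular cross-correlation.

    Unlike cosine similarity, the peak is unchanged by a circular shift of either input.
    """
    n = min(len(a), len(b))
    a = np.asarray(a[:n], dtype=float)
    b = np.asarray(b[:n], dtype=float)
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    xc = np.fft.irfft(np.fft.rfft(a) * np.conj(np.fft.rfft(b)), n=n) / n
    return float(xc.max())


def log_softmax(z):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def soft_info_nce(sim, speeds, temperature=0.1, label_scale=0.1):
    """Cross-entropy between soft label targets and softmax-normalized similarities.

    sim:    (M, M) similarity between view i of one set and view j of another.
    speeds: (M,) speed factor of each view, used as a continuous pseudo-label.
    """
    sim = np.asarray(sim, dtype=float)
    speeds = np.asarray(speeds, dtype=float)
    # Assumed label similarity: views with closer speeds get larger target weight.
    label_sim = -np.abs(speeds[:, None] - speeds[None, :]) / label_scale
    targets = np.exp(log_softmax(label_sim))
    log_probs = log_softmax(sim / temperature)
    return float(-(targets * log_probs).sum(axis=1).mean())


# Usage: two sets of speed-changed views of the same periodic input.
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20)
speeds = np.array([0.5, 1.0, 1.5, 2.0])
views_a = [change_speed(x, s) for s in speeds]
views_b = [change_speed(np.roll(x, 3), s) for s in speeds]  # shifted copy of x
sim = np.array([[max_xcorr(a, b) for b in views_b] for a in views_a])
loss = soft_info_nce(sim, speeds)
```

In this sketch, when the speeds are distinct and `label_scale` is made very small, the soft targets approach one-hot vectors and the loss reduces to the usual InfoNCE form with a single positive per row. Larger values spread the target mass over views with nearby speeds, which is one way to contrast over continuous labels.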
To support practical evaluation of SSL of periodic targets, we benchmark SimPer against SOTA SSL schemes on six diverse periodic learning datasets for common real-world tasks in human behavior analysis, environmental remote sensing, and healthcare. Rigorous experiments verify the robustness and efficiency of SimPer on learning periodic information in data. Our contributions are as follows:
• We identify the limitation of current SSL methods on periodic learning tasks, and uncover intrinsic properties of learning periodic dynamics with self-supervision over other mainstream tasks.
• We design SimPer, a simple and effective SSL framework that learns periodic information in data.
• We conduct extensive experiments on six diverse periodic learning datasets in different domains: human behavior analysis, environmental sensing, and healthcare. Rigorous evaluations verify the superior performance of SimPer against SOTA SSL schemes.
• Further analyses reveal intriguing properties of SimPer on its data efficiency, robustness to spurious correlations and reduced training data, and generalization to unseen targets.

(Qian et al., 2021). However, current SSL methods have limitations in learning periodic information, as the periodic inductive bias is often overlooked in method design. Our work extends existing SSL frameworks to periodic tasks, and introduces new techniques suitable for learning periodic targets.



Periodic Tasks in Machine Learning. Learning or recovering periodic signals from high-dimensional data is prevalent in real-world applications. Examples of periodic learning include recovering and magnifying physiological signals (e.g., heart rate or breathing) (Wu et al., 2012), predicting weather and environmental changes (e.g., nowcasting of precipitation or land surface temperatures) (Sønderby et al., 2020; Espeholt et al., 2021), counting repetitive motions (e.g., exercises or therapies) (Dwibedi et al., 2020; Ali et al., 2020), and analyzing human behavior (e.g., gait) (Liu et al., 2022). To date, much prior work has focused on designing customized neural architectures (Liu et al., 2020; Dwibedi et al., 2020) and loss functions (Starke et al., 2022), and on leveraging relevant learning paradigms including transfer learning (Lu et al., 2018) and meta-learning (Liu et al., 2021), for periodic learning in a supervised manner with high-quality labels available. In contrast to this prior work, we aim to learn robust and efficient periodic representations in a self-supervised manner.
He et al., 2020) shows great success in self-supervised representations, where similar embeddings are learned for different views of the same training example (positives), and dissimilar embeddings for different training examples (negatives). Successful extensions have been made to temporal learning domains including video understanding (Jenni et al., 2020) or action classification

Availability

Code and data are available at: https://github.com/YyzHarry/SimPer.

