NOISE + 2NOISE: CO-TAUGHT DENOISING AUTOENCODERS FOR TIME-SERIES DATA

Abstract

We consider the task of learning to recover clean signals given only access to noisy data. Recent work in computer vision has addressed this problem in the context of images using denoising autoencoders (DAEs). However, to date, DAEs for learning from noisy data have not been explored in the context of time-series data. DAEs for denoising images often rely on assumptions unlikely to hold in the context of time-series, e.g., multiple noisy samples of the same example. Here, we adapt DAEs to cleaning time-series data with noisy samples only. To recover the clean target signal when only given access to noisy target data, we leverage a noise-free auxiliary time-series signal that is related to the target signal. In addition to leveraging the relationship between the target signal and auxiliary signal, we iteratively filter and learn from clean samples using an approach based on co-teaching. Applied to the task of recovering carbohydrate values for blood glucose management, our approach reduces noise (MSE) in patient-reported carbohydrates from 72g² (95% CI: 54, 93) to 18g² (13, 25), outperforming the best baseline (MSE 33g² (27, 43)). We demonstrate strong time-series denoising performance, extending the applicability of DAEs to a previously under-explored setting.

Denoising autoencoders (DAEs) (Vincent et al., 2008) have been used to accurately denoise various signals, including medical images (Gondara, 2016), ECG signals (Xiong et al., 2016), and power system measurements (Lin et al., 2019). With respect to time-series data, DAEs have been used for forecasting (Romeu et al., 2015), classification (Zheng et al., 2022), and imputation (Zhang & Yin, 2019), but these methods generally require access to clean samples at training time and do not provide denoised outputs. In many real-world settings, clean samples are unavailable at training. Work in computer vision has addressed this problem through extensions that either require paired samples (Lehtinen et al., 2018) or rely on patch-based analysis (Krull et al., 2018; Laine et al., 2019; Xie et al., 2020; Batson & Royer, 2019). Such approaches do not extend to time-series data, where paired samples rarely exist and patch-based techniques do not apply. Beyond approaches that rely on paired samples or patch-based analyses, researchers have recently proposed techniques that utilize knowledge of the noise distribution to recover the clean signal. These approaches either use the properties of the distribution to recover the clean signal after training on noisy data (Kim & Ye, 2021; Moran et al., 2019), or rely on the noise having low expectation and variance compared to the signal, in which case a model trained on noisy data can approximate one trained on clean data (Xu et al., 2020). While these approaches may be considered in a time-series setting (and are treated as baselines here), their applicability is limited, as noise in time-series settings is rarely weak or known.

Our Contribution. In light of this gap, we adapt denoising autoencoders for time-series data. Our approach, 'Noise + 2Noise', learns to map a noisy target signal to a clean signal given only noisy samples and an auxiliary clean signal. Inspired by work in image denoising (Lehtinen et al., 2018; Xu et al., 2020), we add additional noise to the noisy target signal during training and attempt to recover the original noisy signal. Provided that the noise has low expectation and variance, a network trained in this manner can learn to recover the true signal, because the noise will minimally impact the expected value of the output of the network (Xu et al., 2020). The auxiliary signal is input along with the target signal into a denoising autoencoder, which allows our network to leverage the relationship between the auxiliary and target signals. To address the fact that the signal-to-noise ratio might not be weak, we adapt a co-teaching approach to train two DAEs (Jiang et al., 2018;
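The zero-mean-noise argument above can be illustrated numerically: when observation noise has expectation zero, the pointwise average over independent noisy observations converges to the clean signal, which is why a network trained with squared error against noisy (or doubly-noised) targets can approximate one trained on clean targets. A minimal sketch of this intuition, not of the paper's actual model — the Gaussian noise, the sinusoidal `clean` signal, and `sigma` are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(t)          # true signal (never observed during training)
sigma = 0.3                # noise scale, assumed small relative to the signal

# With zero-mean noise, E[noisy] = clean, so the regression target of an
# MSE-trained network averages out to the clean signal over many samples.
noisy_draws = clean + rng.normal(0.0, sigma, size=(5000, t.size))
empirical_mean = noisy_draws.mean(axis=0)

mse_single = float(np.mean((noisy_draws[0] - clean) ** 2))  # roughly sigma**2
mse_mean = float(np.mean((empirical_mean - clean) ** 2))    # shrinks with draws
print(mse_single, mse_mean)
```

The gap between `mse_single` and `mse_mean` is the leverage the method exploits: individual observations are noisy, but their expectation is clean, so adding a second noise draw and regressing back onto the singly-noised observation still pulls the network's expected output toward the true signal.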

