LEARNING FROM ASYMMETRICALLY-CORRUPTED DATA IN REGRESSION FOR SENSOR MAGNITUDE

Anonymous

Abstract

This paper addresses a regression problem in which output label values represent the results of sensing the magnitude of a phenomenon. A low label value can mean either that the actual magnitude of the phenomenon was low or that the sensor made an incomplete observation. Because labels for incomplete observations are recorded as lower than those for typical observations, even when both monitor similar phenomena, this leads to a bias toward lower values in the labels and, consequently, in the learned regression function. Moreover, because an incomplete observation provides no tag indicating its incompleteness, we can neither eliminate nor impute such observations. To address this issue, we propose a learning algorithm that explicitly models incomplete observations as corrupted by an asymmetric noise that always takes negative values. We show that our algorithm is unbiased with respect to a regression function learned from uncorrupted data that contain no incomplete observations. We demonstrate the advantages of our algorithm through numerical experiments.

1. INTRODUCTION

This paper addresses a regression problem for predicting the magnitude of a phenomenon when the observed magnitude involves a particular kind of measurement error. The magnitude typically represents how large a phenomenon is or how strong its nature is. Examples of predicting such magnitudes are found in several application areas, including pressure, vibration, and temperature (Vandal et al., 2017; Shi et al., 2017; Wilby et al., 2004; Tanaka et al., 2019). In medicine and healthcare, the magnitude may represent pulsation, respiration, or body movements (Inan et al., 2009; Nukaya et al., 2010; Lee et al., 2016; Alaziz et al., 2016; 2017; Carlson et al., 2018). More specifically, we learn a regression function that predicts the label representing the magnitude of a phenomenon from explanatory variables. The training data consist of pairs of the label and the explanatory variables, but note that the label in the data is observed with a sensor and does not necessarily agree with the actual magnitude of the phenomenon. Note that we use the term "label" even though we address a regression problem; throughout this paper, it refers to a real-valued label. In the example of predicting the magnitude of body movements, the label in the data is measured with an intrusive sensor attached to the chest or the wrist, and the explanatory variables are the values measured with non-intrusive bed sensors (Mullaney et al., 1980; Webster et al., 1982; Cole et al., 1992; Tryon, 2013). A regression function for this example would make it possible to replace intrusive sensors with non-intrusive ones, which in turn would reduce the burden on patients. Although the sensors that measure the label generally have high accuracy, they often make incomplete observations, and such incomplete observations are recorded as low values instead of missing values.
This leads to the particular challenge where a low label value can mean either that the actual magnitude of the phenomenon was low or that the sensor made an incomplete observation, and there are no clues that allow us to tell which is the case. We illustrate this challenge in Fig. 1-(a). Such incomplete observations are prevalent in measuring the magnitude of a phenomenon. For example, the phenomenon may be outside the coverage of a sensor, or the sensing system may experience temporary mechanical failures. In the example of body movements, the sensor may be temporarily detached from the chest or wrist. In all cases, the sensor keeps recording low values while the actual magnitude may be high, and no tag indicating incompleteness can be provided. Incomplete observations are particularly severe for the sensor measuring the label, since it is single-source and has narrower data coverage. This stems from the fact that the sensor is usually intrusive or costly in order to produce highly accurate observations for measuring the label. Examples can be seen in chest or wrist sensors, which capture the movements of a local body part with high accuracy but often miss movements outside their coverage, such as those of parts located far from where the sensor is attached. Moreover, at most a single intrusive sensor can be attached to a patient to avoid burdening them. In contrast, the sensors measuring the explanatory variables are usually multi-source and provide broader data coverage. For example, multiple sensors can be attached to various places on a bed to globally monitor the movements of all body parts on the bed, albeit with lower accuracy. One cannot simply ignore the possibility that the observations of labels are incomplete, because regression functions estimated from data with incomplete observations are severely biased toward lower values regardless of the amount of available training data.
This bias comes from the fact that incomplete observations always have lower values than the actual magnitude of the phenomenon and that they occur predominantly in the label sensor, while the explanatory variables are usually observed completely. Moreover, incomplete observations can be much more frequent than expected. Unfortunately, since we cannot identify which observations are incomplete, we cannot eliminate or impute them by using existing methods that require such identification. These methods include thresholding, missing-value detection (Pearson, 2006; Qahtan et al., 2018), imputation (Enders, 2010; Smieja et al., 2018; Ma & Chen, 2019; Sportisse et al., 2020), and semi-supervised regression (Zhou & Li, 2005; Zhu & Goldberg, 2009; Jean et al., 2018; Zhou et al., 2019). The issue of incomplete observations also cannot be solved with robust regression (Huber et al., 1964; Narula & Wellington, 1982; Draper & Smith, 1998; Wilcox, 1997), which takes into account the possibility that the observed labels contain outliers. While robust regression is an established, state-of-the-art approach to corrupted labels in regression, it assumes symmetric label corruption; that is, the noise is assumed not to be biased either positively or negatively. Since incomplete observations induce noise that is severely biased toward lower values, robust regression methods still produce regression functions that are biased toward lower values relative to the one that would be learned from data without incomplete observations. In this paper, to mitigate this bias toward lower values, we explicitly assume the existence of noise from incomplete observations, which always takes negative values, in addition to the ordinary symmetric noise. That is, we consider our training data to be asymmetrically-corrupted data. We then formulate a regression problem from such asymmetrically-corrupted data and design a principled learning algorithm for it.
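To make this corruption model concrete, the following sketch (our own illustration, not an experiment from this paper; the linear ground truth, noise scales, and corruption rate are all assumed purely for demonstration) simulates labels corrupted by ordinary symmetric noise plus an always-negative incompleteness noise, and shows that ordinary least squares fitted to such data is biased toward lower values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical asymmetrically-corrupted data: a fraction of the labels
# is additionally hit by a noise term that is always negative.
n = 10_000
x = rng.uniform(0.0, 1.0, size=n)
y_true = 2.0 * x + 1.0                       # actual magnitude
y = y_true + rng.normal(0.0, 0.1, size=n)    # ordinary symmetric noise

incomplete = rng.random(n) < 0.3             # 30% incomplete observations
# Incomplete observations are recorded lower than the actual magnitude.
y[incomplete] -= rng.uniform(0.5, 2.0, size=incomplete.sum())

# Ordinary least squares on the corrupted labels.
slope, intercept = np.polyfit(x, y, deg=1)

# The fitted intercept falls noticeably below the clean value of 1.0,
# no matter how large n is.
print(slope, intercept)
```

Because the negative noise shifts only the corrupted labels downward, the fit systematically underestimates the clean relationship, illustrating a bias that more data cannot remove.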
By explicitly modeling incomplete observations, we derive a learning algorithm that has a rather drastic feature: it ignores the labels that have relatively low values (lower-side labeled data). In other words, our algorithm uses the data whose labels have relatively high values (upper-side labeled data) and the data whose labels are ignored (unlabeled data). Hence, we refer to our algorithm
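As a minimal illustration of this data split (the threshold `tau` and the function below are hypothetical conveniences for exposition, not the paper's actual procedure for deciding which labels to ignore), one could partition the training data as:

```python
import numpy as np

def split_upper_unlabeled(X, y, tau):
    """Keep observations whose labels exceed tau as upper-side labeled
    data; treat the remaining observations as unlabeled (labels ignored)."""
    upper = y > tau
    X_labeled, y_labeled = X[upper], y[upper]   # upper-side labeled data
    X_unlabeled = X[~upper]                     # lower-side labels dropped
    return X_labeled, y_labeled, X_unlabeled

# Toy data: 100 samples with 3 explanatory variables each.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

X_l, y_l, X_u = split_upper_unlabeled(X, y, tau=0.0)
print(len(X_l), len(X_u))  # every sample lands in exactly one group
```

The unlabeled portion still contributes its explanatory variables to learning; only its (potentially corrupted) labels are discarded.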



Figure 1: (a) A low sensor value can mean either an actual low magnitude or an incomplete observation. (b) Labels for incomplete observations, depicted as triangles, become lower than those of typical observations, depicted as circles or squares.

