LEARNING LABEL ENCODINGS FOR DEEP REGRESSION

Abstract

Deep regression networks are widely used to tackle the problem of predicting a continuous value for a given input. Task-specialized approaches for training regression networks have shown significant improvement over generic approaches, such as direct regression. More recently, a generic approach based on regression by binary classification using binary-encoded labels has shown significant improvement over direct regression. The space of label encodings for regression is large, and until now there have been no automated approaches to find a good label encoding for a given application. This paper introduces Regularized Label Encoding Learning (RLEL) for end-to-end training of an entire network and its label encoding. RLEL provides a generic approach for tackling regression. Underlying RLEL is our observation that the search space of label encodings can be constrained and efficiently explored by using a continuous search space of real-valued label encodings combined with a regularization function designed to encourage encodings with certain properties. These properties balance the probability of classification error in individual bits against error correction capability. Label encodings found by RLEL yield errors lower than or comparable to those of manually designed label encodings. Applying RLEL results in 10.9% and 12.4% improvement in Mean Absolute Error (MAE) over direct regression and multiclass classification, respectively. Our evaluation demonstrates that RLEL can be combined with off-the-shelf feature extractors and is suitable across different architectures, datasets, and tasks. Code is available at https://github.com/ubc-aamodt-group/RLEL_regression.

1. INTRODUCTION

Deep regression is an important problem with applications in several fields, including robotics and autonomous vehicles. Recently, neural radiance fields (NeRF) regression networks have shown promising results in novel view synthesis, 3D reconstruction, and scene representation (Liu et al., 2020; Yu et al., 2021). However, a typical generic approach to direct regression, in which the network is trained by minimizing the mean squared or absolute error between targets and predictions, performs poorly compared to task-specialized approaches (Yang et al., 2018; Ruiz et al., 2018; Niu et al., 2016; Fu et al., 2018). Recently, generic approaches based on regression by binary classification have shown significant improvement over direct regression using custom-designed label encodings (Shah et al., 2022). In this approach, a real-valued label is quantized and converted to an M-bit binary code, and these binary-encoded labels are used to train M binary classifiers. In the prediction phase, the output code of the classifiers is converted to a real-valued prediction using a decoding function. Binary-encoded labels have also been proposed for ordinal regression (Li & Lin, 2006; Niu et al., 2016) and multiclass classification (Allwein et al., 2001; Cissé et al., 2012). The use of binary-encoded labels for regression has multiple advantages. First, predicting a set of values (e.g., the classifiers' outputs) instead of one value (direct regression) introduces ensemble diversity, which improves accuracy (Song et al., 2021). Second, encoded labels introduce redundancy in the label representation, which improves error-correcting capability and accuracy (Dietterich & Bakiri, 1995). Finding a suitable label encoding for a given problem is challenging due to the vast design space. Related work on ordinal regression has primarily leveraged unary codes (Li & Lin, 2006; Niu et al., 2016; Fu et al., 2018).
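The quantize-encode-decode pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a standard binary code over uniformly quantized labels and a simple threshold-based decoder; the helper names and the label range are hypothetical.

```python
import numpy as np

def encode_label(y, y_min, y_max, num_bits):
    """Quantize a real-valued label y in [y_min, y_max] into one of
    2**num_bits levels and return its standard binary code as a 0/1
    vector (most significant bit first)."""
    levels = 2 ** num_bits
    frac = (y - y_min) / (y_max - y_min)
    q = int(np.clip(frac * (levels - 1), 0, levels - 1))
    return np.array([(q >> i) & 1 for i in reversed(range(num_bits))])

def decode_output(bits, y_min, y_max):
    """Decode M binary classifier outputs (probabilities in [0, 1])
    back to a real value: threshold each bit at 0.5, then invert the
    quantization."""
    num_bits = len(bits)
    levels = 2 ** num_bits
    hard = (np.asarray(bits) > 0.5).astype(int)
    q = sum(int(b) << i for i, b in enumerate(reversed(hard)))
    return y_min + q / (levels - 1) * (y_max - y_min)
```

During training, each of the M bits produced by `encode_label` would serve as the target of one binary classifier (e.g., trained with a per-bit binary cross-entropy loss); at inference, `decode_output` maps the classifiers' outputs back to a real-valued prediction. Unary and error-correcting codes differ only in the encoding/decoding functions used here.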
Different approaches for label encoding design, including autoencoders, random search, and simulated annealing, have been proposed to design suitable encodings for multiclass classification (Cissé et al., 2012; Dietterich & Bakiri, 1995; Song et al., 2021). However, these encodings

