AUTO SEG-LOSS: SEARCHING METRIC SURROGATES FOR SEMANTIC SEGMENTATION

Abstract

Designing proper loss functions is essential in training deep networks. Especially in the field of semantic segmentation, various evaluation metrics have been proposed for diverse scenarios. Despite the success of the widely adopted cross-entropy loss and its variants, the misalignment between the loss functions and evaluation metrics degrades the network performance. Meanwhile, manually designing loss functions for each specific metric requires expertise and significant manpower. In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric. We substitute the non-differentiable operations in the metrics with parameterized functions, and conduct parameter search to optimize the shape of loss surfaces. Two constraints are introduced to regularize the search space and make the search efficient. Extensive experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses consistently outperform the manually designed loss functions. The searched losses also generalize well to other datasets and networks.

1. INTRODUCTION

Loss functions are indispensable components in training deep networks, as they drive the feature learning process for various applications with specific evaluation metrics. However, most metrics, like the commonly used 0-1 classification error, are non-differentiable in their original forms and cannot be directly optimized via gradient-based methods. Empirically, the cross-entropy loss serves well as an effective surrogate objective function for a variety of tasks concerning categorization. This phenomenon is especially prevalent in image semantic segmentation, where various evaluation metrics have been designed to address diverse tasks focusing on different scenarios. Some metrics measure the accuracy on the whole image, while others focus more on the segmentation boundaries. Although cross-entropy and its variants work well for many metrics, the misalignment between network training and evaluation still exists and inevitably leads to performance degradation. Typically, there are two ways of designing metric-specific loss functions in semantic segmentation. The first is to modify the standard cross-entropy loss to meet the target metric (Ronneberger et al., 2015; Wu et al., 2016). The other is to design other clever surrogate losses for specific evaluation metrics (Rahman & Wang, 2016; Milletari et al., 2016). Despite the improvements, these handcrafted losses need expertise and are non-trivial to extend to other evaluation metrics. In contrast to designing loss functions manually, an alternative approach is to find a framework that can design proper loss functions for different evaluation metrics in an automated manner, motivated by recent progress in AutoML (Zoph & Le, 2017; Pham et al., 2018; Liu et al., 2018; Li et al., 2019). Although automating the design process for loss functions is attractive, it is non-trivial to apply an AutoML framework to loss functions.
Typical AutoML algorithms require a proper search space, in which some search algorithms are conducted. Previous search spaces are either unsuitable for loss design, or too general to be searched efficiently. Recently, Li et al. (2019) and Wang et al. (2020) proposed search spaces based on existing handcrafted loss functions, in which the algorithm searches for the best combination. However, these search spaces are still limited to the variants of cross-entropy loss, and thus do not address the misalignment problem well. In this paper, we propose a general framework for searching surrogate losses for mainstream non-differentiable segmentation metrics. The key idea is that we can build the search space according to the form of evaluation metrics. In this way, the training criteria and evaluation metrics are unified. Meanwhile, the search space is compact enough for efficient search. Specifically, the metrics are first relaxed to the continuous domain by substituting the one-hot prediction and logical operations, which are the non-differentiable parts in most metrics, with their differentiable approximations. Parameterized functions are introduced to approximate the logical operations, ensuring that the loss surfaces are smooth while remaining effective for training. The parameterized functions can be drawn from arbitrary families defined on [0, 1]. Parameter search is further conducted on the chosen family so as to optimize the network performance on the validation set with the given evaluation metric. Two essential constraints are introduced to regularize the parameter search space. We find that the searched surrogate losses can effectively generalize to different networks and datasets. Extensive experiments on PASCAL VOC (Everingham et al., 2015) and Cityscapes (Cordts et al., 2016) show that our approach delivers accuracy superior to that of the existing losses specifically designed for individual segmentation metrics, with a mild computational overhead.
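To make the relaxation described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the per-class IoU is written as a sum of logical ANDs divided by a sum of logical ORs between one-hot labels and one-hot predictions, replaces the one-hot prediction with softmax probabilities, and replaces AND/OR with a one-parameter family built from `p ** theta` on [0, 1]. The power-function family, all function names, and the parameter `theta` are illustrative assumptions; the parameterization and the two constraints actually used in this work are not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_and(y, p, theta):
    """Hypothetical relaxed AND of a binary label y and a probability p.

    g(p) = p ** theta maps [0, 1] to [0, 1] with g(0) = 0 and g(1) = 1
    (for theta > 0), so the result matches logical AND when p is 0 or 1.
    """
    return y * (p ** theta)


def soft_or(y, p, theta):
    """Hypothetical relaxed OR, built from g(p) by inclusion-exclusion."""
    g = p ** theta
    return y + g - y * g


def hard_miou(labels, logits, num_classes):
    """Original metric: argmax plus logical AND/OR (non-differentiable)."""
    preds = [max(range(num_classes), key=lambda c: z[c]) for z in logits]
    ious = []
    for c in range(num_classes):
        inter = sum(1 for y, q in zip(labels, preds) if y == c and q == c)
        union = sum(1 for y, q in zip(labels, preds) if y == c or q == c)
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(inter / union)
    return sum(ious) / len(ious)


def soft_miou_loss(labels, logits, num_classes, theta=1.0, eps=1e-6):
    """Relaxed surrogate: 1 - mean soft IoU, with shape controlled by theta."""
    probs = [softmax(z) for z in logits]
    ious = []
    for c in range(num_classes):
        inter = sum(soft_and(float(y == c), p[c], theta)
                    for y, p in zip(labels, probs))
        union = sum(soft_or(float(y == c), p[c], theta)
                    for y, p in zip(labels, probs))
        ious.append(inter / (union + eps))
    return 1.0 - sum(ious) / num_classes


if __name__ == "__main__":
    labels = [0, 1, 1, 0]                      # flattened pixel labels
    logits = [[2.0, -1.0], [0.5, 1.5], [-1.0, 0.0], [0.2, 0.1]]
    print("hard mIoU:", hard_miou(labels, logits, 2))
    for theta in (0.5, 1.0, 2.0):
        print("theta =", theta, "loss =", soft_miou_loss(labels, logits, 2, theta))
```

Under these assumptions, `theta` reshapes the loss surface for uncertain predictions while leaving its values unchanged when every probability is exactly 0 or 1, which is the kind of degree of freedom a parameter search could tune against the validation metric.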
Our contributions can be summarized as follows: 1) Our approach is the first general framework of surrogate loss search for mainstream segmentation metrics. 2) We propose an effective parameter regularization and parameter search algorithm, which can find loss surrogates optimizing the target metric performance with mild computational overhead. 3) The surrogate losses obtained via the proposed search framework advance our understanding of loss function design and are novel contributions in themselves, because they differ from existing loss functions specifically designed for individual metrics and are transferable across different datasets and networks.

2. RELATED WORK

Loss function design is an active topic in deep network training (Ma, 2020). In the area of image semantic segmentation, cross-entropy loss is widely used (Ronneberger et al., 2015; Chen et al., 2018). But the cross-entropy loss is designed for optimizing the global accuracy measure (Rahman & Wang, 2016; Patel et al., 2020), which is not aligned with many other metrics. Numerous studies have been conducted to design proper loss functions for the prevalent evaluation metrics. For the mIoU metric, many works (Ronneberger et al., 2015; Wu et al., 2016) incorporate class frequency to mitigate the class imbalance problem. For the boundary F1 score, the losses at boundary regions are up-weighted (Caliva et al., 2019; Qin et al., 2019), so as to deliver more accurate boundaries. These works carefully analyze the properties of specific evaluation metrics, and design the loss functions in a fully handcrafted way, which requires expertise. By contrast, we propose a unified framework for deriving parameterized surrogate losses for various evaluation metrics, in which the parameters are searched automatically by reinforcement learning. The networks trained with the searched surrogate losses deliver accuracy on par with, or even superior to, that of networks trained with the best handcrafted losses.

Direct loss optimization for non-differentiable evaluation metrics has long been studied for structural SVM models (Joachims, 2005; Yue et al., 2007; Ranjbar et al., 2012). However, the gradients w.r.t. features cannot be derived from these approaches. Therefore, they cannot drive the training of deep networks through back-propagation. Hazan et al. (2010) propose to optimize structural SVM with gradient descent, where loss-augmented inference is applied to get the gradients of the expectation of evaluation metrics. Song et al. (2016) further extend this approach to non-linear models (e.g., deep neural networks). However, the computational complexity is very high during each step in gradient descent.
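As an illustration of the handcrafted, class-frequency-based losses mentioned in this section, the following is a minimal sketch of a cross-entropy loss with inverse-frequency class weights. Inverse-frequency weighting is one common choice and is used here as an assumption; it is not claimed to be the exact scheme of the cited works, and the function names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting: total / (num_classes * count), so rare classes weigh more.

    Classes that do not appear in `labels` receive weight 0.
    """
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) if counts[c] > 0 else 0.0
            for c in range(num_classes)]


def weighted_cross_entropy(labels, logits, weights):
    """Weighted mean of per-pixel negative log-likelihoods."""
    total_loss = 0.0
    total_weight = 0.0
    for y, z in zip(labels, logits):
        m = max(z)
        # log-sum-exp of the logits, computed stably
        log_norm = m + math.log(sum(math.exp(v - m) for v in z))
        nll = log_norm - z[y]
        total_loss += weights[y] * nll
        total_weight += weights[y]
    return total_loss / total_weight


if __name__ == "__main__":
    labels = [0, 0, 0, 1]                      # class 1 is rare
    logits = [[5.0, -5.0]] * 4                 # network always predicts class 0
    weights = inverse_frequency_weights(labels, 2)
    print("weights:", weights)
    print("loss:", weighted_cross_entropy(labels, logits, weights))
```

With these weights, an error on a pixel of the rare class contributes more to the loss than an error on a pixel of the frequent class, which is the intuition behind using class frequency to counter class imbalance.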
Although Song et al. (2016) and Mohapatra et al. (2018) have designed efficient algorithms for the Average Precision (AP) metric, other metrics still need specially designed efficient algorithms. Our method, by contrast, is general for the mainstream segmentation metrics. Thanks to the good generalizability, our method only needs to perform the search process

availability

://github.com/fundamentalvision/ Auto

