LOSS FUNCTION DISCOVERY FOR OBJECT DETECTION VIA CONVERGENCE-SIMULATION DRIVEN SEARCH

Abstract

Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models. For object detection, the well-established classification and regression loss functions have been carefully designed by considering diverse learning challenges (e.g. class imbalance, hard negative samples, and scale variances). Inspired by the recent progress in network architecture search, it is interesting to explore the possibility of discovering new loss function formulations by directly searching over combinations of primitive operations, so that the learned losses not only fit diverse object detection challenges with little human effort, but also align better with the evaluation metric and possess good mathematical convergence properties. Beyond previous auto-loss works on face recognition and image classification, our work makes the first attempt to discover new loss functions for the challenging task of object detection at the level of primitive operations, and finds the searched losses to be insightful. We propose an effective convergence-simulation driven evolutionary search algorithm, called CSE-Autoloss, which speeds up the search by regularizing the mathematical rationality of loss candidates via two progressive convergence-simulation modules: convergence property verification and model optimization simulation. CSE-Autoloss involves a search space (i.e. 21 mathematical operators, 3 constant-type inputs, and 3 variable-type inputs) that covers a wide range of possible variants of existing losses, and discovers the best-performing loss function combinations within a short time (around 1.5 wall-clock days, a 20x speedup compared to the vanilla evolutionary algorithm). We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of the searched losses across diverse architectures and various datasets.
Our experiments show that the best-discovered loss function combinations outperform the default combinations (Cross-entropy/Focal loss for classification and L1 loss for regression) by 1.1% and 0.8% mAP for two-stage and one-stage detectors on COCO, respectively. Our searched losses are available at https://github.com/PerdonLiu/CSE-Autoloss.

1. INTRODUCTION

The computer vision community has witnessed substantial progress in object detection in recent years. Advances in architecture design, e.g. two-stage detectors (Ren et al., 2015; Cai & Vasconcelos, 2018) and one-stage detectors (Lin et al., 2017b; Tian et al., 2019), have remarkably pushed forward the state of the art. This success cannot be separated from the sophisticated design of the training objective, i.e. the loss function. Traditionally, two-stage detectors adopt the combination of Cross-entropy loss (CE) and L1 loss/Smooth L1 loss (Girshick, 2015) for bounding box classification and regression respectively. In contrast, one-stage detectors, which suffer from severe positive-negative sample imbalance due to the dense sampling of possible object locations, introduce Focal loss (FL) (Lin et al., 2017b) to alleviate the imbalance issue. However, optimizing object detectors with traditional hand-crafted loss functions may lead to sub-optimal solutions due to the limited connection with the evaluation metric (e.g. AP). Recently proposed IoU-based losses instead optimize the IoU between the predicted and target boxes directly. These works manifest the necessity of developing effective loss functions with better alignment to the evaluation metric for object detection, yet they heavily rely on careful design and expert experience. In this work, we aim to discover novel loss functions for object detection automatically to reduce the human burden, inspired by recent progress in network architecture search (NAS) and automated machine learning (AutoML) (Cai et al., 2019; Liu et al., 2020a). Different from Wang et al. (2020) and Li et al. (2019b), which only search for particular hyper-parameters within a fixed loss formula, we steer towards finding new forms of the loss function.
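For concreteness, the hand-crafted baselines mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the detectors' actual implementation; the formulas follow the cited papers (CE on the positive-class probability, Focal loss with its standard alpha/gamma weighting, and Smooth L1 on the regression residual), and the default values of gamma, alpha, and beta below are the commonly used ones.

```python
import numpy as np

def cross_entropy(p, y):
    """Binary CE: -log(p_t), where p_t = p for positives (y=1), 1-p for negatives."""
    p_t = np.where(y == 1, p, 1.0 - p)
    return -np.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss (Lin et al., 2017b): CE down-weighted by (1 - p_t)^gamma,
    so well-classified (easy) examples contribute little to the total loss."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def smooth_l1(x, beta=1.0):
    """Smooth L1 (Girshick, 2015) on the residual x = prediction - target:
    quadratic near zero, linear for |x| >= beta (robust to outliers)."""
    ax = np.abs(x)
    return np.where(ax < beta, 0.5 * ax ** 2 / beta, ax - 0.5 * beta)
```

For an easy positive example (e.g. p = 0.9, y = 1), the focal term (1 - p_t)^gamma shrinks the loss by two orders of magnitude relative to plain CE, which is exactly the mechanism that counters positive-negative imbalance in one-stage detectors.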
Notably, AutoML-Zero (Real et al., 2020) proposes a framework to construct ML algorithms from simple mathematical operations, which motivates us to design loss functions from primitive mathematical operations with an evolutionary algorithm. However, this approach encounters a severe issue: a slight variation in operations can lead to a huge performance drop, which is attributed to the sparse action space. Therefore, we propose a novel Convergence-Simulation driven Evolutionary search algorithm, named CSE-Autoloss, to alleviate the sparsity issue. Benefiting from the flexibility and effectiveness of the search space, as Figure 1 shows, CSE-Autoloss discovers distinct loss formulas with performance comparable to the Cross-entropy loss, such as (b) and (c) in the figure. Moreover, the best-searched loss function (d), named CSE-Autoloss-A_cls, outperforms CE loss by a large margin. Specifically, to obtain preferable loss functions, CSE-Autoloss contains a well-designed search space, including 20 primitive mathematical operations, 3 constant-type inputs, and 3 variable-type inputs, which can cover a wide range of existing popular hand-crafted loss functions. Besides, to tackle the sparsity issue, CSE-Autoloss puts forward progressive convergence-simulation modules, which verify the mathematical rationality of the evolved loss candidates via convergence property verification and model optimization simulation.
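To make the idea of searching over primitive operations concrete, a candidate loss can be represented as an expression tree whose leaves are inputs (variables or constants) and whose internal nodes are primitive operators, and cheap checks can filter out mathematically irrational candidates before any training. The sketch below is hypothetical: the operator names, tree encoding, and the `looks_convergent` filter are illustrative stand-ins, not the paper's actual 20-operator search space or its two convergence-simulation modules.

```python
import math

# Hypothetical primitive operator set (the paper's actual operator set differs).
UNARY = {"neg": lambda a: -a,
         "log": lambda a: math.log(max(a, 1e-12)),  # clamped for numerical safety
         "abs": abs}
BINARY = {"add": lambda a, b: a + b,
          "mul": lambda a, b: a * b}

def evaluate(node, env):
    """Recursively evaluate an expression tree.
    A node is either ('op', child, ...), a variable name (str), or a constant."""
    if isinstance(node, tuple):
        op, *children = node
        args = [evaluate(c, env) for c in children]
        return (UNARY[op] if op in UNARY else BINARY[op])(*args)
    return env[node] if isinstance(node, str) else node

# Cross-entropy for a positive sample, -log(p), as a two-node tree
# (in the spirit of Figure 1a).
ce_tree = ("neg", ("log", "p"))

def looks_convergent(tree):
    """Toy stand-in for convergence verification: a classification loss should
    evaluate to finite values and decrease as the predicted probability p -> 1."""
    try:
        vals = [evaluate(tree, {"p": p}) for p in (0.1, 0.5, 0.9)]
    except (ValueError, OverflowError, KeyError):
        return False
    return all(math.isfinite(v) for v in vals) and vals[0] > vals[-1]
```

Such a filter is what makes the sparsity of the action space tractable: most randomly mutated trees fail these cheap checks, so expensive detector training is spent only on candidates that behave like plausible losses.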



Figure 1: Computation graphs of loss function examples: (a) Cross-entropy loss; (b, c) Searched loss candidates with performance comparable to Cross-entropy loss; (d) The best-performing searched loss, named CSE-Autoloss-A_cls.

https://github.com/PerdonLiu/CSE-Autoloss

