CURRICULUM-INSPIRED TRAINING FOR SELECTIVE NEURAL NETWORKS

Abstract

We consider the problem of training neural network models for selective classification, where the models have a reject option that allows them to abstain from predicting on certain examples as needed. Recent advances in curriculum learning have demonstrated the benefit of leveraging example difficulty scores when training deep neural networks in typical classification settings. Example difficulty scores are even more important in selective classification, as a lower prediction error rate can be achieved by rejecting hard examples and accepting easy ones. In this paper, we propose a curriculum-inspired method to train selective neural network models by leveraging example difficulty scores. Our method tailors the curriculum idea to selective neural network training by calibrating the ratio of easy and hard examples in each mini-batch, and by exploiting difficulty ordering at the mini-batch level. Our experimental results demonstrate that our method outperforms both the state-of-the-art and alternative methods that use vanilla curriculum techniques for training selective neural network models.

1. INTRODUCTION

In selective classification, the goal is to design a predictive model that is allowed to abstain from making a prediction whenever it is not sufficiently confident. A model with this reject option is called a selective model. In other words, a selective model rejects certain examples as appropriate, and provides predictions only for the accepted examples. In many real-life scenarios¹, such as medical diagnosis, robotics, and self-driving cars (Kompa et al., 2021), selective models are used to minimize the risk of wrong predictions on hard examples by abstaining from predicting and possibly deferring to human intervention.

In this paper, we focus on selective neural network models, which are essentially neural network models with a reject option. These models have been shown to achieve impressive results (Geifman & El-Yaniv, 2017; 2019; Liu et al., 2019). Specifically, Geifman & El-Yaniv (2019) proposed SELECTIVENET, a neural network architecture that allows end-to-end optimization of selective models. SELECTIVENET contains a main body block followed by three heads: one for minimizing the error rate among the accepted examples, one for selecting examples for acceptance or rejection, and one for the auxiliary task of minimizing the error rate on all examples. These three heads are illustrated later in Figure 1. The final goal of this model is to minimize the error rate among the accepted examples while satisfying a coverage constraint that specifies the minimum percentage of examples that must be accepted. The coverage constraint is imposed to avoid the trivial solution of rejecting all examples to achieve a 0% error rate. Ideally, the model should reject hard examples and accept easy ones to lower its overall error rate. While difficulty scores are clearly helpful, they are typically unknown in most settings.
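To make the coverage-constrained objective concrete, the following is a minimal sketch of a SELECTIVENET-style training loss: the per-batch selective risk (loss averaged over softly accepted examples) plus a quadratic penalty whenever empirical coverage falls below the target. The quadratic penalty form follows Geifman & El-Yaniv (2019), but the function name, argument names, and the default value of `lam` are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np

def selective_loss(per_example_loss, select_probs, target_coverage, lam=32.0):
    """Coverage-constrained selective objective (sketch).

    per_example_loss : array of losses from the prediction head, one per example
    select_probs     : array of selection-head outputs g(x) in [0, 1]
    target_coverage  : minimum fraction of examples the model should accept
    lam              : penalty weight (illustrative default, not the paper's)
    """
    coverage = select_probs.mean()  # empirical (soft) coverage of the batch
    # Selective risk: loss averaged over the (softly) accepted examples.
    risk = (select_probs * per_example_loss).sum() / select_probs.sum()
    # Quadratic penalty when empirical coverage falls below the target.
    penalty = lam * max(0.0, target_coverage - coverage) ** 2
    return risk + penalty
```

For example, if the selection head accepts only the first of two examples (`select_probs = [1, 0]`) while the target coverage is 0.8, the penalty term pushes the model back toward accepting more examples; the auxiliary head of SELECTIVENET (a standard loss over all examples) would be added on top of this term during training.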
Therefore, to leverage difficulty scores we must overcome two challenges: (1) how to obtain the difficulty scores as accurately as possible and (2) how to best utilize them in a selective neural network model. Recent curriculum learning techniques have investigated how to use example difficulty scores to improve neural network models' performance (Hacohen & Weinshall, 2019; Wu et al., 2020) . To the best of our knowledge, these techniques only consider the typical classification setting where the error rate on all examples should be minimized. Curriculum learning techniques often use a



¹ These are further elaborated in Section A.1 in the appendix.

