ESTIMATING EXAMPLE DIFFICULTY USING VARIANCE OF GRADIENTS

Anonymous

Abstract

In machine learning, a question of great interest is understanding which examples are challenging for a model to classify. Identifying atypical examples helps inform safe deployment of models, isolates examples that require further human inspection, and provides interpretability into model behavior. In this work, we propose Variance of Gradients (VoG) as a valuable and efficient proxy metric for detecting outliers in the data distribution. We provide quantitative and qualitative support that VoG is a meaningful way to rank data by difficulty and to surface a tractable subset of the most challenging examples for human-in-the-loop auditing. Data points with high VoG scores are far more difficult for the model to learn and over-index on corrupted or memorized examples.

1. INTRODUCTION

Reasoning about model behavior is often easier when presented with a subset of data points that are relatively more difficult for a trained model to learn. This not only aids interpretability through case-based reasoning (Kim et al., 2016; Caruana, 2000; Hooker et al., 2019), but can also be used as a mechanism to surface a tractable subset of atypical examples for further human auditing (Leibig et al., 2017; Zhang, 1992; Hooker et al., 2019), for active learning to inform model improvements, or to abstain from classifying certain examples when the model is uncertain (Bartlett & Wegkamp, 2008; Cortes et al., 2016).

One of the biggest bottlenecks for human auditing is the sheer size of modern datasets and the cost of annotating each feature (Veale & Binns, 2017). Methods that automatically surface a subset of relatively more challenging examples for human inspection help prioritize limited annotation and auditing time. Despite the urgency of this use case, ranking examples by difficulty has received limited treatment in the context of deep neural networks due to the computational cost of ranking a high-dimensional feature space. Recent work in this direction has either been limited to small-scale datasets or carries a computational cost that is infeasible for most practitioners (Hooker et al., 2019; Carlini et al., 2019; Koh & Liang, 2017).

In this work, we start with a simple hypothesis: examples that a model has difficulty learning will exhibit higher variance in their gradient updates over the course of training. Conversely, we expect the backpropagated gradients of samples that are relatively easier to learn to have lower variance, because performance on those examples does not consistently dominate the loss over the course of training. The gradient updates for the relatively easier examples are expected to stabilize early in training and converge to a narrow range of values.
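The hypothesis above can be sketched numerically: record each example's input gradient at a handful of training checkpoints, then score the example by the variance of those snapshots. The sketch below is only illustrative, assuming a logistic-regression model, input gradients as the tracked quantity, and a simple per-class normalization; the exact model, checkpoint schedule, and normalization used in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train logistic regression with gradient descent, snapshotting the
# per-example input gradients at regular checkpoints.
w = np.zeros(d)
lr = 0.1
checkpoints = []  # one (n, d) array of input gradients per snapshot
for step in range(300):
    p = sigmoid(X @ w)
    grad_w = X.T @ (p - y) / n
    w -= lr * grad_w
    if step % 30 == 0:
        # For logistic regression, d(loss_i)/d(x_i) = (p_i - y_i) * w.
        checkpoints.append((p - y)[:, None] * w[None, :])

G = np.stack(checkpoints)          # shape (K, n, d)
# Per-example score: variance over checkpoints, averaged over dimensions.
vog = G.var(axis=0).mean(axis=1)   # shape (n,)

# Class-normalize the scores (subtract per-class mean, divide by per-class
# std), an assumed reading of "class normalized ranking" in the text.
vog_norm = vog.copy()
for c in (0.0, 1.0):
    m = y == c
    vog_norm[m] = (vog[m] - vog[m].mean()) / (vog[m].std() + 1e-12)

hardest = np.argsort(-vog_norm)[:10]  # ten highest-VoG examples
```

Examples with the largest normalized scores are the candidates one would surface for human inspection.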
We term this class-normalized ranking mechanism Variance of Gradients (VoG), and demonstrate across a variety of large-scale datasets that it efficiently ranks the difficulty of both training and test examples. VoG can be computed using either the predicted or the true label, making it a valuable unsupervised auditing tool at test time when the true label is unknown.

Validating the behavior of VoG on artificial data. To begin, we illustrate the principle and effectiveness of VoG in a toy setting. The data was generated from two isotropic Gaussian clusters with a total of 500 data points. In such a simple low-dimensional problem, the most challenging examples for the model to classify are those closest to the decision boundary. In Fig. 1a we visualize the trained decision boundary of a multilayer perceptron (MLP) with a single hidden layer trained for 15 epochs. VoG is computed at regular intervals for each training data point.
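The toy setting can be reproduced in a few lines. For brevity, the sketch below substitutes a linear classifier for the single-hidden-layer MLP (its decision boundary is still a line, which is all the argument needs); the cluster means, learning rate, and checkpoint interval are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two isotropic Gaussian clusters, 250 points each (500 total, as in the text).
n_per = 250
X0 = rng.normal(loc=[-1.5, 0.0], scale=1.0, size=(n_per, 2))
X1 = rng.normal(loc=[+1.5, 0.0], scale=1.0, size=(n_per, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n_per), np.ones(n_per)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train a logistic-regression stand-in for the MLP, snapshotting
# per-example input gradients every 10 steps.
w, b, lr = np.zeros(2), 0.0, 0.5
snapshots = []
for step in range(150):
    p = sigmoid(X @ w + b)
    err = p - y
    w -= lr * (X.T @ err) / len(y)
    b -= lr * err.mean()
    if step % 10 == 0:
        snapshots.append(err[:, None] * w[None, :])  # input gradients

G = np.stack(snapshots)
vog = G.var(axis=0).mean(axis=1)

# Distance of each point to the learned boundary w.x + b = 0.
dist = np.abs(X @ w + b) / np.linalg.norm(w)

# High-VoG points should concentrate near the boundary: the top-10%
# VoG points should be closer to it, on average, than a typical point.
k = len(y) // 10
top = np.argsort(-vog)[:k]
near_boundary = dist[top].mean() < dist.mean()
```

In this low-dimensional setting the ranking is easy to check visually as well: plotting the points colored by VoG reproduces the qualitative pattern of Fig. 1a, with the highest scores hugging the decision boundary.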

