DBT: A DETECTION BOOSTER TRAINING METHOD FOR IMPROVING THE ACCURACY OF CLASSIFIERS

Abstract

Deep learning models owe their success, in large part, to the availability of a large amount of annotated data. They try to extract features from the data that contain useful information needed to improve their performance on target applications. Most works focus on directly optimizing the target loss functions to improve the accuracy by allowing the model to implicitly learn representations from the data. There has not been much work on using background/noise data to estimate the statistics of in-domain data to improve the feature representation of deep neural networks. In this paper, we probe this direction by deriving a relationship between the estimation of unknown parameters of the probability density function (pdf) of input data and classification accuracy. Using this relationship, we show that having a better estimate of the unknown parameters using background and in-domain data provides better features, which leads to better accuracy. Based on this result, we introduce a simple but effective detection booster training (DBT) method that applies a detection loss function on the early layers of a neural network to discriminate in-domain data points from noise/background data, to improve the classifier accuracy. The background/noise data comes from the same family of pdfs as the input data but with different parameter sets (e.g., mean, variance). In addition, we show that our proposed DBT method improves accuracy even with limited labeled in-domain training samples, as compared to normal training. We conduct experiments on face recognition, image classification, and speaker classification problems and show that our method achieves superior performance over strong baselines across various datasets and model architectures.

Introduction

Deep learning systems achieve outstanding accuracies on a vast domain of challenging computer vision, natural language, and speech recognition benchmarks (Russakovsky et al. (2015); Lin et al. (2014); Everingham et al. (2015); Panayotov et al. (2015)). The success of deep learning approaches relies on the availability of a large amount of annotated data and on extracting useful features from it for different applications. Learning rich feature representations from the available data is a challenging problem in deep learning. Related lines of work include learning deep latent space embeddings through deep generative models (Kingma & Welling (2014); Goodfellow et al. (2014); Berthelot et al. (2019)), self-supervised learning methods (Noroozi & Favaro (2016); Gidaris et al. (2018); Zhang et al. (2016b)), and transfer learning approaches (Yosinski et al. (2014); Oquab et al. (2014); Razavian et al. (2014)).

In this paper, we propose a different approach to improve the feature representations of deep neural nets, and eventually improve their accuracy, by estimating the unknown parameters of the probability density function (pdf) of the input data. Parameter estimation, or point estimation, methods are well studied in the field of statistical inference (Lehmann & Casella (1998)). The insights from the theory of point estimation can help us develop better deep model architectures for improving model performance. We make use of this theory to derive a relationship between the estimation of unknown pdf parameters and classifier outputs. However, directly estimating the unknown pdf parameters for practical problems such as image classification is not feasible, since they can amount to millions of parameters. To overcome this bottleneck, we assume that the input data points are sampled from a family of pdfs instead of a single pdf, and propose a detection-based training approach to better estimate the unknowns using in-domain and background/noise data.
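To make the training objective concrete, the sketch below shows one plausible reading of such a detection-boosted loss: a shared early layer feeds both a classifier head and a binary detection head that separates in-domain samples from background samples drawn from the same pdf family with a shifted mean. This is a minimal illustrative sketch, not the paper's implementation; the names (`W_early`, `w_det`, `dbt_loss`) and the weighting parameter `lam` are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "early layer": a single linear map shared by both heads.
d_in, d_feat, n_cls = 8, 4, 3
W_early = rng.normal(size=(d_in, d_feat))  # early-layer weights (shared)
W_cls = rng.normal(size=(d_feat, n_cls))   # classification head
w_det = rng.normal(size=d_feat)            # detection (booster) head

x_in = rng.normal(loc=1.0, size=(16, d_in))  # in-domain batch
x_bg = rng.normal(loc=0.0, size=(16, d_in))  # background: same pdf family, different mean
y = rng.integers(0, n_cls, size=16)          # class labels (in-domain only)

def dbt_loss(lam=0.5):
    """Classification cross-entropy plus lam * detection BCE on early features."""
    f_in = x_in @ W_early  # early-layer features, in-domain
    f_bg = x_bg @ W_early  # early-layer features, background
    # Standard classification loss on in-domain data.
    p = softmax(f_in @ W_cls)
    ce = -np.log(p[np.arange(len(y)), y] + 1e-12).mean()
    # Detection loss: binary cross-entropy, in-domain = 1, background = 0.
    s_in, s_bg = sigmoid(f_in @ w_det), sigmoid(f_bg @ w_det)
    det = -(np.log(s_in + 1e-12).mean() + np.log(1.0 - s_bg + 1e-12).mean()) / 2.0
    return ce + lam * det

print(dbt_loss())
```

Because the detection term is computed on the early layers, its gradients shape those shared features toward the in-domain statistics while the classifier head is trained as usual; setting `lam = 0` recovers plain classification training.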

