HOW SAMPLING AFFECTS TRAINING: AN EFFECTIVE SAMPLING THEORY STUDY FOR LONG-TAILED IMAGE CLASSIFICATION

Abstract

Long-tailed image classification has remained a challenging problem for a long time. Suffering from the imbalanced distribution of categories, many deep vision classification methods perform well on the head classes but poorly on the tail ones. This paper proposes an effective sampling theory that attempts to provide a theoretical explanation for decoupling the representation and the classifier in long-tailed image classification. To apply this sampling theory in practice, a general jitter sampling strategy is proposed. Experiments show that a variety of long-tailed distribution algorithms exhibit better performance under the effective sampling theory. The code will be released soon.

1. INTRODUCTION

Image classification is a fundamental task in computer vision, and many deep-learning-based methods have so far achieved gratifying results on artificially constructed datasets. However, due to the large discrepancy between the sample distributions of different classes, a classification model usually performs very well on the head categories while giving inaccurate predictions for the tail ones. This phenomenon occurs not only in image classification, but also in other common vision tasks such as semantic segmentation Zhou et al. (2021); Wang et al. (2020a), object detection Ouyang et al. (2016); Li et al. (2020), and so on.

Kang et al. (2019); He et al. (2020) mention that the mainstream methods for long-tailed distributions require two-stage learning: in the first stage, sampling is conducted within the original distribution to learn the representation, yet without an ample theoretical explanation for this choice. Inspired by Cui et al. (2019), we realised that the number of effective samples and the actual number of samples do not grow synchronously in the first training stage, where the effective-sample growth formula is given by Cui et al. (2019). Based on the concept of effective samples, we propose an expanded effective sampling theory and report two important findings: the total number of effective samples is the primary factor affecting training on a long-tailed distribution, and the effective sample utilization is the second. Accuracy on a long-tailed distribution can therefore be improved by maximizing the total number of effective samples and balancing the effective sample utilization among categories. The main contributions of this paper are as follows:

Research on the long-tailed classification problem mainly focuses on the following perspectives: loss-function re-weighting Cao et al. (2019), training-data re-sampling Mahajan et al. (2018), and transfer-learning strategies at the embedding level Liu et al. (2020). The main idea for solving the imbalanced classification problem is to enhance the training proportion of the tail categories so as to alleviate overfitting on the head ones. Kang et al. (2019) points out the strong dependence between representation learning for the backbone network and classifier learning for the last fully connected layer, and concludes that the optimal gradients for training the backbone and the classifier are obtained from the original sampling distribution and from a re-sampled distribution such as class-balanced sampling, respectively; from this, the two-stage optimization strategy has gradually been accepted by more researchers. Xiang et al. (2020) further alleviates the strong dependence of the single-expert model on a specific training distribution, leading to an improvement in classification accuracy for both head and tail categories.
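As a concrete illustration of the re-sampling schemes discussed above, Kang et al. (2019) parameterize the per-class sampling probability as p_j ∝ n_j^q, where q = 1 gives instance-balanced (original-distribution) sampling and q = 0 gives class-balanced sampling. The snippet below is a minimal sketch with made-up class counts, not the authors' implementation:

```python
def sampling_probs(class_counts, q=1.0):
    """Per-class sampling probability p_j proportional to n_j ** q.

    q = 1 -> instance-balanced sampling (original distribution),
    q = 0 -> class-balanced sampling (every class equally likely).
    """
    weights = [n ** q for n in class_counts]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical long-tailed counts: one head class, two tail classes.
counts = [900, 90, 10]
print(sampling_probs(counts, q=1.0))  # follows the original class frequencies
print(sampling_probs(counts, q=0.0))  # uniform over classes
```

In a two-stage pipeline, the representation stage would draw batches with q = 1 and the classifier stage with q = 0, matching the decoupled strategy described above.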


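The effective-number growth formula from Cui et al. (2019), E_n = (1 − β^n) / (1 − β), on which the effective sampling theory above builds, can be sketched as follows; the class sizes in the example are hypothetical:

```python
def effective_num(n, beta=0.999):
    """Effective number of samples E_n = (1 - beta**n) / (1 - beta).

    E_n grows almost linearly for small n but saturates toward the
    asymptote 1 / (1 - beta) as n grows, which is why head classes gain
    few additional effective samples from their many real samples.
    """
    return (1.0 - beta ** n) / (1.0 - beta)

# With beta = 0.999 the asymptote is 1 / (1 - 0.999) = 1000 effective
# samples, regardless of how many real samples a class contains.
for n in (10, 100, 1000, 10000):
    print(n, round(effective_num(n, beta=0.999), 1))
```

This saturation is the gap between the actual number of samples and the number of effective samples that the first training stage exploits.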