ROBUST LOSS FUNCTIONS FOR COMPLEMENTARY LABELS LEARNING

Abstract

In ordinary-label learning, the correct label is given for each training sample. Similarly, in complementary-label learning, a complementary label is provided for each training sample. A complementary label indicates a class that the example does not belong to. Robust learning of classifiers has been investigated from many viewpoints under label noise, but little attention has been paid to complementary-label learning. In this paper, we present a new algorithm for complementary-label learning built on the robustness of the loss function. We also provide two sufficient conditions on a loss function so that the minimizer of the risk for complementary labels is theoretically guaranteed to be consistent with the minimizer of the risk for ordinary labels. Finally, empirical results validate our method's superiority over current state-of-the-art techniques. In particular, on CIFAR-10 our algorithm achieves a much higher test accuracy than the gradient ascent algorithm, while our model has less than half the parameters of the ResNet-34 they used.

Introduction

Deep neural networks have exhibited excellent performance in many real applications. Yet, their superior performance rests on correctly labeled large-scale training sets, and labeling such a large-scale dataset is time-consuming and expensive. For example, crowd-workers need to select the correct label for a sample from 100 candidate labels for CIFAR-100. To mitigate this problem, researchers have proposed many ways to learn from weak supervision: noisy-label learning Li et al. (2017); Hu et al. (2019); Lee et al. (2018); Xia et al. (2019), semi-supervised learning Zhai et al. (2019); Berthelot et al. (2019); Rasmus et al. (2015); Miyato et al. (2019); Sakai et al. (2017), similar-unlabeled learning Tanha (2019); Bao et al. (2018); Zelikovitz & Hirsh (2000), unlabeled-unlabeled learning Lu et al. (2018); Chen et al. (2020a;b), positive-unlabeled learning Elkan & Noto (2008); du Plessis et al. (2014); Kiryo et al. (2017), contrastive learning Chen et al. (2020a;b), partial-label learning Cour et al. (2011); Feng & An (2018); Wu & Zhang (2018), and others.

We investigate complementary-label learning Ishida et al. (2017) in this paper. A complementary label only indicates that a given class label of a sample is incorrect. From the viewpoint of label noise, complementary labels can also be viewed as noisy labels, but without any true labels in the training set. Our task is to learn a classifier from the given complementary labels that predicts the correct label for a given sample. Collecting complementary labels is much easier and more efficient than precisely choosing the true class from many candidate classes. For example, if a labeling system chooses a label for a sample uniformly at random, the chosen label is an ordinary label with probability 1/k but a complementary label with probability (k-1)/k. Moreover, another potential application of complementary labels is data privacy: for some privacy-sensitive questions, it is much easier to collect complementary labels than ordinary labels.

Robust learning of classifiers has been investigated from many viewpoints in the presence of label noise Ghosh et al. (2017), but little attention has been paid to complementary-label learning. We call a loss function robust if the minimizer of the risk under that loss function with complementary labels is the same as that with ordinary labels. The robustness of risk minimization relies on the loss function used in training.

This paper presents a general risk formulation under which the categorical cross-entropy loss (CCE) can be used to learn with complementary labels and achieve robustness. We then offer some innovative analytical results on robust loss functions under complementary labels. Having robustness of risk minimization
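The uniform labeling process described in the introduction can be sketched as follows; the helper name `complementary_label` and the class count `k` are our own illustrative choices, not code from the paper:

```python
import random

def complementary_label(true_label: int, k: int) -> int:
    """Draw a complementary label uniformly from the k-1 incorrect classes."""
    candidates = [c for c in range(k) if c != true_label]
    return random.choice(candidates)

# If instead a label is drawn uniformly from all k classes, it coincides
# with the true label (ordinary label) with probability 1/k, and is a
# complementary label with probability (k-1)/k.
k = 10
true_label = 3
draws = [random.randrange(k) for _ in range(100_000)]
frac_ordinary = sum(d == true_label for d in draws) / len(draws)
print(f"fraction ordinary: {frac_ordinary:.2f}")  # close to 1/k = 0.10
```

This also illustrates why complementary labels are cheap to collect: a uniformly chosen "not this class" annotation is correct with probability (k-1)/k, so almost any random guess yields a usable complementary label when k is large.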

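The robustness notion used above can be stated formally. The symbols below (ordinary risk R, complementary risk R̄, losses ℓ and ℓ̄) are our notation for the sketch, not necessarily the paper's:

```latex
% Ordinary risk over pairs (x, y) and complementary risk over pairs (x, \bar{y}):
R(f) = \mathbb{E}_{(x,y)}\big[\ell(f(x), y)\big], \qquad
\bar{R}(f) = \mathbb{E}_{(x,\bar{y})}\big[\bar{\ell}(f(x), \bar{y})\big].
% A loss is called robust when both risks share the same minimizer:
\operatorname*{arg\,min}_{f} \bar{R}(f) \;=\; \operatorname*{arg\,min}_{f} R(f).
```

The two sufficient conditions mentioned in the abstract are conditions on ℓ̄ under which this equality of minimizers is guaranteed.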

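As one concrete instantiation of learning from complementary labels, an empirical risk can penalize the probability mass the model places on the class a sample is known not to belong to. The loss -log(1 - p_ȳ) below is a common illustrative choice and an assumption on our part, not necessarily the CCE-based formulation proposed in this paper:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def complementary_risk(logits: np.ndarray, comp_labels: np.ndarray) -> float:
    """Empirical risk over complementary labels: penalize the probability
    mass placed on the class each sample is known NOT to belong to,
    via the illustrative loss -log(1 - p_{ybar})."""
    p = softmax(logits)
    p_bar = p[np.arange(len(comp_labels)), comp_labels]
    return float(np.mean(-np.log(1.0 - p_bar + 1e-12)))

# A model that puts almost no mass on the complementary class incurs a
# near-zero risk; putting mass on the complementary class is penalized.
logits = np.array([[10.0, 0.0, 0.0]])
print(complementary_risk(logits, np.array([1])))  # near 0
print(complementary_risk(logits, np.array([0])))  # large
```

Whether minimizing such a complementary risk recovers the minimizer of the ordinary risk is exactly the robustness question the paper's two sufficient conditions address.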