PROGRESSIVE PURIFICATION FOR INSTANCE-DEPENDENT PARTIAL LABEL LEARNING

Abstract

Partial label learning (PLL) aims to train multi-class classifiers from instances with partial labels (PLs): a PL for an instance is a set of candidate labels in which a fixed but unknown candidate is the true label. In the last few years, the instance-independent generation process of PLs has been extensively studied, and many practical and theoretical advances in PLL have been built on it, whereas relatively little attention has been paid to the practical setting of instance-dependent PLs, namely, where the PL depends not only on the true label but also on the instance itself. In this paper, we propose a theoretically grounded and practically effective approach called PrOgressive Purification (POP) for instance-dependent PLL: in each epoch, POP updates the learning model while purifying each PL for the next epoch of model training by progressively moving out false candidate labels. Theoretically, we prove that POP enlarges, at an appropriate rate, the region where the model is reliable, and eventually approximates the Bayes optimal classifier under mild assumptions; technically, POP is flexible with arbitrary losses and compatible with deep networks, so that previous advanced PLL losses can be embedded in it, often with significantly improved performance.
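The alternating update-and-purify loop described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not POP's actual procedure: the toy model, the fixed purification threshold, and all variable names (`candidate_mask`, `model_probs`, `threshold`) are assumptions made here for illustration, whereas POP uses a deep network and a theoretically derived purification schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy PLL data: n instances, k classes.
# candidate_mask[i, y] is True iff label y is currently a candidate for instance i.
n, k = 8, 4
true_labels = rng.integers(0, k, size=n)
candidate_mask = np.zeros((n, k), dtype=bool)
candidate_mask[np.arange(n), true_labels] = True
# Add one (possibly coinciding) random extra candidate per instance.
candidate_mask[np.arange(n), rng.integers(0, k, size=n)] = True

def model_probs(epoch):
    # Stand-in for the learner's softmax outputs; in POP these come from
    # a deep network trained on the current (purified) candidate sets.
    logits = rng.normal(size=(n, k))
    logits[np.arange(n), true_labels] += 0.5 * epoch  # model improves over epochs
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

threshold = 0.05  # illustrative fixed threshold; POP's schedule differs
for epoch in range(1, 6):
    probs = model_probs(epoch)  # 1) update the model on current candidates
    # 2) purify: progressively move out candidates the model deems unlikely,
    #    but never empty a candidate set (keep the best current candidate).
    keep = probs >= threshold
    best = np.where(candidate_mask, probs, -1.0).argmax(axis=1)
    keep[np.arange(n), best] = True
    candidate_mask &= keep
```

The key design point carried over from the abstract is that purification is monotone (candidates are only removed, never re-added) and gradual, so the candidate sets used in the next epoch are strictly no noisier than those in the current one.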



We observe that these existing theoretical works have focused on the instance-independent setting, where the generation process of partial labels is homogeneous across training examples. With an explicit formulation of the generation process, the asymptotic consistency Mohri et al. (2018) of the methods, namely, whether the classifier learned from partial labels approximates the Bayes optimal classifier, can be analyzed. However, the instance-independent process does not model the real world well, since data labeling is prone to different levels of error in tasks of varying difficulty. Intuitively, instance-dependent (ID) partial labels are quite realistic, as some poor-quality or ambiguous instances are more difficult to label with an exact true label. Although the instance-independent setting has been extensively studied, on the basis of which many practical and theoretical advances have been made



Deep neural networks owe their popularity in large part to their ability to (nearly) perfectly memorize large numbers of training examples, and such memorization is known to decrease the generalization error Feldman (2020). On the other hand, scaling up the acquisition of examples for training neural networks inevitably introduces non-fully supervised data annotation, a typical example of which is the partial label Nguyen & Caruana (2008); Cour et al. (2011); Zhang et al. (2016; 2017b); Feng & An (2018); Xu et al. (2019); Yao et al. (2020b); Lv et al. (2020); Feng et al. (2020b); Wen et al. (2021): a partial label for an instance is a set of candidate labels in which a fixed but unknown candidate is the true label. Partial label learning (PLL) trains multi-class classifiers from instances that are associated with partial labels. It is therefore apparent that some techniques should be applied to prevent memorizing the false candidate labels when PLL resorts to deep learning; unfortunately, empirical evidence has shown that general-purpose regularization cannot achieve this goal Lv et al. (2021). A large number of deep PLL algorithms have recently emerged that aim to design regularizers Yao et al. (2020a;b); Lyu et al. (2022) or network architectures Wang et al. (2022a) for PLL data. Further, some PLL works have provided theoretical guarantees while making their methods compatible with deep networks Lv et al. (2020); Feng et al. (2020b); Wen et al. (2021); Wu & Sugiyama

