DECOMPOSITIONAL GENERATION PROCESS FOR INSTANCE-DEPENDENT PARTIAL LABEL LEARNING

Abstract

Partial label learning (PLL) is a typical weakly supervised learning problem in which each training example is associated with a set of candidate labels, among which only one is true. Most existing PLL approaches assume that the incorrect labels in each training example are picked at random as candidate labels, and accordingly model the generation process of the candidate labels in a simple way. However, these approaches usually do not perform as well as expected, because the generation process of the candidate labels is typically instance-dependent and thus deserves to be modeled in a refined way. In this paper, we consider instance-dependent PLL and assume that the generation process of the candidate labels can be decomposed into two sequential parts: the correct label emerges first in the mind of the annotator, and then, owing to the uncertainty of labeling, incorrect labels related to the features are selected alongside the correct label as candidates. Motivated by this consideration, we propose a novel PLL method that performs maximum a posteriori (MAP) inference based on an explicitly modeled generation process of candidate labels via decomposed probability distribution models. Extensive experiments on manually corrupted benchmark datasets and real-world datasets validate the effectiveness of the proposed method.
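The two-stage decomposition described above can be written schematically as follows. This is an illustrative sketch, not the paper's exact model: the instance-dependent flipping probabilities $\zeta_k(x)$ and the conditional-independence product form are assumptions introduced here for exposition.

```latex
% Stage (1): the correct label y emerges from p(y | x).
% Stage (2): each incorrect label k is added to the candidate set S
%            with an instance-dependent flipping probability zeta_k(x).
\[
  p(S \mid x) \;=\; \sum_{y \in S} p(y \mid x)\, p(S \mid y, x),
  \qquad
  p(S \mid y, x) \;=\; \prod_{k \in S \setminus \{y\}} \zeta_k(x)
                       \prod_{k \notin S} \bigl(1 - \zeta_k(x)\bigr),
\]
```

where the sum ranges only over labels in $S$ because $p(S \mid y, x) = 0$ whenever the correct label $y$ is not contained in the candidate set.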



Because it is challenging to avoid overfitting the candidate labels, especially when the candidate labels depend on the instance, previous methods assume that the candidate labels are instance-independent. In practice, however, the incorrect labels related to the features of an instance are more likely to be picked as its candidate labels. Recent work by Xu et al. (2021) has also shown that instance-dependent PLL imposes additional challenges but is more realistic than the instance-independent case.
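The contrast between the two generation regimes can be made concrete with a small simulation. This is an illustrative sketch only: the uniform `flip_prob` and the use of a classifier-style confusability vector as the instance-dependent flipping probability are assumptions for exposition, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def instance_independent_candidates(y, num_classes, flip_prob=0.3):
    """Uniform flipping: every incorrect label enters the candidate
    set with the same fixed probability, regardless of the instance."""
    mask = rng.random(num_classes) < flip_prob
    mask[y] = True  # the correct label is always a candidate
    return np.flatnonzero(mask)

def instance_dependent_candidates(y, confusability):
    """Feature-aware flipping: an incorrect label enters the candidate
    set with probability proportional to how confusable it is with this
    particular instance (e.g. a classifier's softmax score)."""
    probs = confusability.astype(float).copy()
    probs[y] = 0.0
    probs /= probs.max() + 1e-12  # rescale flipping probabilities to [0, 1]
    mask = rng.random(len(probs)) < probs
    mask[y] = True  # the correct label is always a candidate
    return np.flatnonzero(mask)

# Toy example: an instance whose features resemble classes 1 and 2,
# so those labels are far more likely to enter its candidate set.
confusability = np.array([0.05, 0.40, 0.35, 0.05, 0.15])
print(instance_dependent_candidates(4, confusability))
```

Under the instance-dependent regime, the candidate sets concentrate on the feature-related incorrect labels, which is exactly what makes them harder to disambiguate than uniformly corrupted sets.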



Partial label learning (PLL) aims to deal with the problem in which each instance is provided with a set of candidate labels, only one of which is correct. The problem of learning from partial label examples naturally arises in a number of real-world scenarios such as web data mining Luo & Orabona (2010), multimedia content analysis Zeng et al. (2013); Chen et al. (2017), and ecoinformatics Liu & Dietterich (2012); Tang & Zhang (2017). A number of methods have been proposed to improve the practical performance of PLL. Identification-based approaches Jin & Ghahramani (2002); Nguyen & Caruana (2008); Liu & Dietterich (2012); Chen et al. (2014); Yu & Zhang (2016) regard the correct label as a latent variable and try to identify it. Average-based approaches Hüllermeier & Beringer (2006); Cour et al. (2011); Zhang & Yu (2015) treat all candidate labels equally and average the modeling outputs as the prediction. In addition, risk-consistent methods Feng et al. (2020); Wen et al. (2021) and classifier-consistent methods Lv et al. (2020); Feng et al. (2020) have been proposed for deep models. Furthermore, also aimed at deep models, Wang et al. (2022) investigate contrastive representation learning, Zhang et al. (2021) adapt the class activation map, and Wu et al. (2022) revisit consistency regularization in PLL.

