MUTUAL PARTIAL LABEL LEARNING WITH COMPETITIVE LABEL NOISE

Abstract

Partial label learning (PLL) is an important weakly supervised learning problem in which each training instance is associated with a set of candidate labels that contains the true label together with additional noisy labels. Most existing PLL methods assume the noisy candidate labels are chosen at random, which hardly holds in real-world learning scenarios. In this paper, we consider a more realistic PLL setting with competitive label noise, which is more difficult to distinguish from the true label than random label noise. We propose a novel mutual-learning-based PLL approach, named ML-PLL, to address this challenging problem. ML-PLL cooperatively trains a prediction-network-based classifier and a class-prototype-based classifier through interactive mutual learning and label correction. Moreover, we use a transformation network to model the association between the true label and the candidate labels, and learn it together with the prediction network so as to match the observed candidate labels in the training data and enhance label correction. Extensive experiments on several benchmark PLL datasets show that the proposed ML-PLL approach achieves state-of-the-art performance for partial label learning.

1. INTRODUCTION

As it is costly and difficult to annotate each instance with a precise label, weakly supervised learning (WSL) has been widely studied in recent years (Zhou, 2018), which includes, but is not limited to, semi-supervised learning (Van Engelen & Hoos, 2020; Ouali et al., 2020), noisy label learning (Natarajan et al., 2013; Feng et al., 2021), positive-unlabeled learning (Kiryo et al., 2017; Shu et al., 2020), and partial multi-label learning (Xie & Huang, 2018; Yan & Guo, 2021). Partial label learning (PLL) is a typical WSL problem that aims to learn a model from training samples with overcomplete label annotations; that is, each training sample is associated with a set of candidate labels that contains both the true label and additional noisy labels. PLL has been widely applied in many real-world learning scenarios, including automatic face naming (Hüllermeier & Beringer, 2006; Zeng et al., 2013), web mining (Luo & Orabona, 2010), and multimedia content analysis (Zeng et al., 2013).

Since the ground-truth label is hidden in the candidate label set and not available to the learning algorithm, the main challenge of PLL lies in candidate label disambiguation. Two main disambiguation strategies have been proposed to address this challenge: the average-based strategy and the identification-based strategy. Average-based disambiguation treats each candidate label equally during model training and averages the modeling outputs over the candidate labels during testing (Cour et al., 2011; Hüllermeier & Beringer, 2006; Zhang & Yu, 2015). Although this strategy is simple and intuitive, the true label can be overwhelmed by the noisy labels without sufficient differentiation, leading to poor prediction performance.
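To make the averaging strategy concrete, the following is a minimal sketch in the spirit of the instance-based averaging approach (as in PL-kNN of Hüllermeier & Beringer, 2006): each nearest neighbor distributes one unit of vote equally over its candidate labels, and the prediction averages these votes. The function name `pl_knn_predict` and the 0/1 candidate-matrix layout are illustrative assumptions, not the implementation of any cited method.

```python
import numpy as np

def pl_knn_predict(X_train, candidate_sets, x_test, k=3):
    """Illustrative averaging-style prediction (hypothetical sketch):
    every candidate label of a neighbor is treated equally, and the
    test-time output averages the votes from the k nearest neighbors."""
    dists = np.linalg.norm(X_train - x_test, axis=1)  # distances to all training points
    neighbors = np.argsort(dists)[:k]                 # indices of the k nearest neighbors
    votes = np.zeros(candidate_sets.shape[1])
    for i in neighbors:
        s = candidate_sets[i]       # 0/1 vector marking this neighbor's candidate labels
        votes += s / s.sum()        # each candidate label counted equally (no discrimination)
    return int(np.argmax(votes))    # label with the highest averaged vote
```

Note how nothing in this scheme distinguishes the true label from the noisy candidates; a frequently co-occurring noisy label can accumulate as many votes as the ground truth, which is exactly the failure mode described above.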
The identification-based disambiguation strategy instead treats the ground-truth label as a latent variable and tries to identify the true label by maintaining confidence scores over the candidate labels (Feng & An, 2018; 2019; Yao et al., 2020b; Yu & Zhang, 2016; Xu et al., 2021). By handling the candidate labels with discrimination, identification-based approaches typically achieve better prediction performance than average-based approaches, but they can still accumulate label identification errors that severely disrupt subsequent model training. In addition, these existing methods are usually restricted to standard

