INVERSE LEARNING WITH EXTREMELY SPARSE FEEDBACK FOR RECOMMENDATION

Abstract

Negative sampling is widely used in modern recommender systems, where negative instances are randomly drawn from the whole item pool. However, such a strategy often introduces false-negative noise. Existing approaches to denoising recommendation mainly focus on positive instances while ignoring the noise in the large amount of sampled negative feedback. In this paper, we propose a meta-learning method that annotates unlabeled data from both loss and gradient perspectives, accounting for noise in both positive and negative instances. Specifically, we first propose the inverse dual loss (IDL), which promotes learning from true labels and suppresses learning from false labels, based on the losses of unlabeled data with respect to the true and false labels during training. To achieve more robust sampling of hard instances, we further propose the inverse gradient (IG), which explores the correct update direction and adjusts the update via meta learning. We conduct extensive experiments on a benchmark dataset and an industrially collected dataset, where our proposed method significantly improves AUC by 9.25% over state-of-the-art methods. Further analysis verifies that the proposed inverse learning is model-agnostic and annotates labels well when combined with different recommendation backbones. The source code, along with the best hyper-parameter settings, is available at this link:

1. INTRODUCTION

As one of the most successful machine learning applications in industry, recommender systems are essential for promoting user experience and improving user engagement (Ricci et al., 2011; Xue et al., 2017; Liu et al., 2010b), and are widely adopted in online services such as E-commerce and Micro-video platforms. Aiming to capture users' preferences for items based on their historical behaviors, existing recommenders generally rely on explicit or implicit feedback. Specifically, explicit feedback (Liang et al., 2021) refers to rating data that represents user preference directly. However, collecting sufficient explicit data for recommendation is difficult because it requires users to actively provide ratings (Jannach et al., 2018). In contrast, implicit feedback, such as user clicks, purchases, and views (Liang et al., 2016), is much richer (Liu et al., 2010b) and is frequently used in modern recommender systems (Chen et al., 2020). In particular, in feed recommendation on online platforms such as Micro-video services, users passively receive recommended items without any active clicking or rating action. That is, we have a large amount of unlabeled feedback alongside extremely sparse labeled data, which becomes a key challenge for recommendation. To handle the unlabeled feedback, most works (He et al., 2017; Chen et al., 2019) randomly sample unlabeled data and treat it as negative feedback, which introduces unavoidable noise. Specifically, collected user click data is treated as positive feedback, and unclicked data is sampled as negative feedback (He et al., 2017; Chen et al., 2019). However, some of the unlabeled data sampled by this negative sampling strategy may actually be positive, so these instances become false negatives. There are also works on hard negative sampling, which decrease false-positive but increase false-negative instances (Zhang et al., 2013; Ding et al., 2019; 2020).
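The random negative-sampling strategy described above can be sketched in a few lines. This is a minimal illustration under our own naming (the function and variables are hypothetical, not from any specific recommender implementation):

```python
import random

def sample_negatives(clicked_items, item_pool, num_negatives):
    """Randomly sample unclicked items and treat them as negative feedback.

    Note: some sampled items may actually be relevant to the user
    (false negatives) -- exactly the noise this paper targets.
    """
    candidates = [i for i in item_pool if i not in clicked_items]
    return random.sample(candidates, min(num_negatives, len(candidates)))

# Hypothetical example: a user clicked items 1 and 3 out of a pool of 10.
clicked = {1, 3}
pool = range(10)
negs = sample_negatives(clicked, pool, 4)
assert len(negs) == 4 and clicked.isdisjoint(negs)
```

The sampled items are labeled 0 without any evidence that the user dislikes them, which is why such pseudo-labels are noisy.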
These hard negative methods tend to perform poorly when tested on both true positive and true negative data, rather than on true positive but sampled negative data (as shown in Appendix 5). A recent work, DenoisingRec (Wang et al., 2021), denoises positive feedback by truncating or reweighting the loss of false-positive instances, without further consideration of noisy negative feedback. In general, existing works focus on either the positive or the negative perspective alone. To resolve the data-noise problem from both perspectives simultaneously, we propose a novel Inverse Learning approach that automatically labels the unlabeled data via inverse dual loss and inverse gradient, assuming the unlabeled data contains both positive and negative feedback. First, based on the property that the loss of a false positive/negative instance is larger than that of a true positive/negative instance (Wang et al., 2021), we assign both positive and negative labels to each unlabeled instance, with weights calculated by the inverse dual loss. Specifically, we assign a larger (smaller) weight to the label with the smaller (larger) loss value. In this way, we take full advantage of true positive/negative instances and eliminate the noise of false positive/negative instances. To achieve more robust sampling of hard instances, we further propose an inverse gradient method. We build a meta-learning process (Finn et al., 2017; Li et al., 2017) and split the training data into training-train and training-test sets. We first use the training-train data to pre-train the model, and then use the training-test data to validate the correctness of the labels assigned to sampled instances. Specifically, we calculate the gradient of the inverse dual loss on sampled instances, as well as the additive inverse of that gradient. The model is optimized with either the direct gradient or its additive inverse, as determined on the split training-test data.
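The weighting intuition behind the inverse dual loss can be sketched as follows. This is a minimal illustration under our own assumptions (the reciprocal-of-loss weighting and all names are hypothetical, not the paper's exact formulation): each unlabeled instance is scored against both candidate labels, and the label with the smaller loss receives the larger weight.

```python
import math

def bce(pred, label, eps=1e-7):
    # Binary cross-entropy for a single prediction, clipped for stability.
    pred = min(max(pred, eps), 1 - eps)
    return -(label * math.log(pred) + (1 - label) * math.log(1 - pred))

def inverse_dual_weights(pred):
    """Assign weights to the positive and negative labels of an unlabeled
    instance, inversely proportional to the loss under each label
    (a hypothetical weighting scheme for illustration)."""
    loss_pos = bce(pred, 1.0)  # loss if the instance were labeled positive
    loss_neg = bce(pred, 0.0)  # loss if the instance were labeled negative
    w_pos, w_neg = 1.0 / loss_pos, 1.0 / loss_neg
    total = w_pos + w_neg
    return w_pos / total, w_neg / total

# If the model is already confident the instance is positive (pred = 0.9),
# the positive label gets the larger weight.
w_pos, w_neg = inverse_dual_weights(0.9)
assert w_pos > w_neg
```

In this way a likely false negative (high predicted score but sampled label 0) contributes mostly through its positive label, rather than reinforcing the wrong label.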
Experimental results illustrate that the inverse gradient indeed improves upon the inverse dual loss. In summary, the main contributions of this paper are as follows:

• To the best of our knowledge, we are the first to sample both positive and negative data in recommendation, which is far more challenging than existing works that sample only negative data.

• We propose the inverse dual loss to learn labels for sampled instances, and further exploit the inverse gradient to correct false labels for hard instances.

• We experiment on two real-world datasets, verifying the superiority of our method over state-of-the-art approaches. Further studies confirm the effectiveness of the proposed method in label annotation and gradient descent.

2. PROBLEM FORMULATION AND OUR APPROACH

In this section, we first formulate the problem and analyze existing solutions and their limitations. We then propose the inverse dual loss to address these limitations for easy samples. Finally, we propose the inverse gradient to address the remaining limitation of the inverse dual loss, making the approach capable of handling not only easy samples but also hard ones.

2.1. PROBLEM DEFINITION

The recommendation task aims to model the relevance score $\hat{y}^{\theta}_{ui} = f(u, i \mid \theta)$ of user $u$ towards item $i$ under parameters $\theta$. The LogLoss function (Zhou et al., 2018; 2019) used to learn the ideal parameters $\theta^*$ is

$$\mathcal{L}_{D^*}(\theta) = \frac{1}{|D^*|} \sum_{(u,i,y^*_{ui}) \in D^*} \ell\left(\hat{y}^{\theta}_{ui}, y^*_{ui}\right), \quad \ell\left(\hat{y}^{\theta}_{ui}, y^*_{ui}\right) = -\left[ y^*_{ui} \log \hat{y}^{\theta}_{ui} + \left(1 - y^*_{ui}\right) \log\left(1 - \hat{y}^{\theta}_{ui}\right) \right],$$

where $y^*_{ui} \in \{0, 1\}$ is the feedback of user $u$ towards item $i$, and $D^* = \{(u, i, y^*_{ui}) \mid u \in U, i \in I\}$ is the reliable interaction data over all user-item pairs. In practice, due to the limited collected feedback, model training is actually formalized as

$$\hat{\theta} = \arg\min_{\theta} \; \mathcal{L}_{D^l}(\theta) + \mathcal{L}_{D^u}(\theta),$$

where $D^l \sim D^*$ is the collected labeled data, and $D^u = \{(u, i, \bar{y}_{ui}) \mid u \in U, i \in I\}$ is the sampled unlabeled data, for which $\bar{y}_{ui} = 0$ is often assumed by existing recommenders for negative sampling. However, such a strategy inevitably introduces noise, because some of the sampled unlabeled instances are actually positive. As a consequence, a model (i.e., $\hat{\theta}$) trained with noisy data tends to exhibit suboptimal performance. Thus, our goal is to construct a denoising recommender approximating the ideal recommender $\theta^*$ as

$$\theta^* = \arg\min_{\theta} \; \mathcal{L}_{D^l}(\theta) + \mathcal{L}^{\text{denoise}}_{D^u}(\theta),$$

where $\mathcal{L}^{\text{denoise}}_{D^u}(\theta)$ denotes the loss on unlabeled data with all samples annotated correctly, i.e., denoising sampling.
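The training objective above can be illustrated with a small numeric sketch (plain Python; the data points are hypothetical): the model is fit on the labeled data $D^l$ together with sampled unlabeled data $D^u$ under the assumption $\bar{y}_{ui} = 0$.

```python
import math

def log_loss(pred, label, eps=1e-7):
    # l(y_hat, y) = -[ y * log(y_hat) + (1 - y) * log(1 - y_hat) ]
    pred = min(max(pred, eps), 1 - eps)
    return -(label * math.log(pred) + (1 - label) * math.log(1 - pred))

def dataset_loss(preds_labels):
    # L_D(theta) = (1/|D|) * sum of l(y_hat_ui, y_ui) over D
    return sum(log_loss(p, y) for p, y in preds_labels) / len(preds_labels)

# Hypothetical predictions: labeled clicks D^l and sampled "negatives" D^u.
D_l = [(0.8, 1), (0.6, 1)]   # collected positive feedback
D_u = [(0.3, 0), (0.7, 0)]   # unlabeled data, assumed negative
total = dataset_loss(D_l) + dataset_loss(D_u)
# The second D^u instance (prediction 0.7 forced to label 0) dominates the
# loss: it may well be a false negative, the noise the denoising loss targets.
```

The denoising objective $\mathcal{L}^{\text{denoise}}_{D^u}$ would replace the forced zero labels in `D_u` with correctly annotated ones.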


https://anonymous.4open.science/r

