GRAPH SIGNAL SAMPLING FOR INDUCTIVE ONE-BIT MATRIX COMPLETION: A CLOSED-FORM SOLUTION

Abstract

Inductive one-bit matrix completion is motivated by modern applications such as recommender systems, where new users appear at test time with ratings consisting only of ones and no zeros. We propose a unified graph signal sampling framework that enjoys the benefits of graph signal analysis and processing. The key idea is to transform each user's ratings on the items into a function (graph signal) on the vertices of an item-item graph, then learn structural graph properties to recover the function from its values on certain vertices, which is the problem of graph signal sampling. We propose a class of regularization functionals that take into account discrete random label noise in the graph vertex domain, then develop the GS-IMC approach, which biases the reconstruction towards functions that vary little between adjacent vertices for noise reduction. Theoretical results show that accurate reconstructions can be achieved under mild conditions. For the online setting, we develop a Bayesian extension, BGS-IMC, which considers continuous random Gaussian noise in the graph Fourier domain and builds upon a prediction-correction update algorithm to obtain the unbiased and minimum-variance reconstruction. Both GS-IMC and BGS-IMC have closed-form solutions and are thus highly scalable on large data, as verified on public benchmarks.

1. INTRODUCTION

In domains such as recommender systems and social networks, only "likes" (i.e., ones) are observed in the system, and service providers (e.g., Netflix) are interested in discovering potential "likes" for existing users to stimulate demand. This motivates the problem of 1-bit matrix completion (OBMC), whose goal is to recover missing values in an n-by-m item-user matrix $R \in \{0, 1\}^{n \times m}$. We note that $R_{i,j} = 1$ means that item i is rated by user j, whereas $R_{i,j} = 0$ is essentially unlabeled or unknown: it is a mixture of unobserved positive examples and true negative examples. However, in the real world, new users who are not exposed to the model during training may appear at the testing stage. This fact stimulates the development of inductive 1-bit matrix completion, which aims to recover an unseen vector $y \in \{0, 1\}^{n}$ from its partial positive entries $\Omega_+ \subseteq \{j \mid y_j = 1\}$ at test time. Fig. 1(a) emphasizes the difference between conventional and inductive approaches. More formally, let $M \in \{0, 1\}^{n \times (m+1)}$ denote the underlying matrix, where only a subset of positive examples $\Psi$ is randomly sampled from $\{(i, j) \mid M_{i,j} = 1,\ i \le n,\ j \le m\}$ such that $R_{i,j} = 1$ for $(i, j) \in \Psi$ and $R_{i,j} = 0$ otherwise. Consider the (m+1)-th column y, which lies outside matrix R; we likewise denote its observations by $s_i = 1$ for $i \in \Omega_+$ and $s_i = 0$ otherwise. We note that the sampling process here assumes that there exists a random label noise $\xi$ which flips a 1 to 0 with probability $\rho$, or equivalently

$$s = y + \xi, \quad \text{where } \xi_i = -1 \text{ for } i \in \{j \mid y_j = 1\} \setminus \Omega_+, \text{ and } \xi_i = 0 \text{ otherwise.} \tag{1}$$

Fig. 1(a) presents an example of $s$, $y$, $\xi$ to better illustrate their relationships. Fundamentally, the reconstruction of the true y from the corrupted s bears a resemblance to graph signal sampling. Fig. 1(b) shows that the item-user rating matrix R can be used to define a homogeneous

funding

* Junchi Yan is the corresponding author, who is also with Shanghai AI Laboratory. The work was supported in part by NSFC (62222607) and STCSM (22511105100).

