COLLABORATIVE FILTERING WITH SMOOTH RECONSTRUCTION OF THE PREFERENCE FUNCTION

Anonymous

Abstract

The problem of predicting the ratings of a set of users for a set of items in a recommender system, based on partial knowledge of the ratings, is widely known as collaborative filtering. In this paper, we consider a mapping of the items into a vector space and study the prediction problem by assuming an underlying smooth preference function for each user; quantizing this function at a given item vector yields the associated rating. To estimate the preference functions, we implicitly cluster users with similar ratings to form dominant types. Next, we associate each dominant type with a smooth preference function; i.e., the function values for items with nearby vectors shall be close to each other. The latter is accomplished by a rich representation learning in a so-called frequency domain. In this framework, we propose two approaches for learning user and item representations. First, we use an alternating optimization method in the spirit of k-means to cluster users and map items. We further make this approach less prone to overfitting via a boosting technique. Second, we present a feed-forward neural network architecture consisting of interpretable layers which implicitly clusters the users. The performance of the method is evaluated on two benchmark datasets (ML-100k and ML-1M). Although the method benefits from simplicity, it shows remarkable performance and opens an avenue for future research. All code is publicly available on GitLab.

1. INTRODUCTION

Nowadays, recommender systems (RS) are among the most effective ways for large companies to attract more customers. A few statistics suffice to highlight the importance of RS: 80 percent of watched movies on Netflix and 60 percent of video clicks on YouTube are linked with recommendations (Gomez-Uribe & Hunt, 2015; Davidson et al., 2010). However, the world of RS is not limited to the video industry. In general, recommender systems can be categorized into three groups (Zhang et al., 2019), depending on the type of data used: collaborative filtering (CF), content-based RS, and hybrid RS. In this paper, we focus on CF, which uses historical interactions to make recommendations. There might be some auxiliary information available to the CF algorithm (like the users' personal information); however, a general CF method does not take such side information into account (Zhang & Chen, 2019). This is the case for our approach in this paper. Recently, deep learning has found its way to RS and specifically to CF methods. Deep networks are able to learn non-linear representations with powerful optimization tools, and their efficient implementations have made them promising CF approaches. However, a quick look at some pervasive deep networks in RS (e.g., He et al. (2017) and Wu et al. (2016)) shows that the utilization of deep architectures is limited to shallow networks. Still, it is unclear why networks have not gone deeper in RS, in contrast to other fields like computer vision (Zhang et al., 2019). We suppose that the fundamental reason limiting the application of deeper structures is the absence of interpretability (see Seo et al. (2017), for example). Here, interpretability can be defined in two ways (Zhang et al., 2019): first, users should be aware of the purpose behind a recommendation, and second, the system operator should know how manipulation of the system will affect the predictions (Zhang et al., 2018).
This paper addresses both issues by formulating the recommendation task as a smooth reconstruction of user preferences. In particular, our contributions are:

• The CF problem is formulated as the reconstruction of user preference functions under minimal assumptions.
• An alternating optimization method is proposed that effectively optimizes a non-convex loss function and extracts user and item representations. In this regard, effective clustering methods are proposed and tested.
• A feed-forward shallow architecture is introduced, which has interpretable layers and performs well in practice.
• Despite the simplicity and interpretability of the methods, their performance on benchmark datasets is remarkable.

MF-based models. Neural extensions of matrix factorization (MF) use the outputs of two networks as the user and the item representations; the prediction is made by the inner product of the two representations. Although our work has some similarity to these methods, we model users by functions and represent these functions in a so-called frequency domain. Thus, user and item representations are not in the same space. AutoEncoder-based models. AutoRec (Sedhain et al., 2015) and CFN (Strub et al., 2016) are well-known autoencoder (AE) structures that transform partial observations (user-based or item-based) into full row or column data. Our method differs from AE structures in that our network uses item (user) representations and predicts user (item) ratings.
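The alternating, k-means-style optimization mentioned above can be sketched as follows. This is our own toy construction, not the paper's actual algorithm: the linear per-type preference model, the dimensions, and the farthest-point initialization are all illustrative assumptions. The sketch alternates between assigning users to dominant types and refitting each type's preference on the pooled observed ratings of its members:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only): items live in R^d, each of K dominant user
# types has a linear preference w_k, and users rate item i as w_k . x_i;
# only a random subset of the ratings is observed (mask M).
n_users, n_items, K, d = 60, 50, 3, 4
X = rng.normal(size=(n_items, d))                 # item representations
W_true = rng.normal(size=(K, d))                  # true type preferences
z_true = rng.integers(0, K, size=n_users)         # true user types
R = W_true[z_true] @ X.T + 0.05 * rng.normal(size=(n_users, n_items))
M = rng.random((n_users, n_items)) < 0.3          # observation mask

# Per-user least-squares fits, used only to pick K well-separated
# initial type preferences (farthest-point initialization).
W_users = np.stack([
    np.linalg.lstsq(X[M[u]], R[u, M[u]], rcond=None)[0] for u in range(n_users)
])
idx = [0]
for _ in range(K - 1):
    dists = ((W_users[:, None, :] - W_users[idx]) ** 2).sum(-1).min(axis=1)
    idx.append(int(dists.argmax()))
W = W_users[idx].copy()

# Alternating optimization in the spirit of k-means:
#   (1) assign each user to the type whose preference best explains
#       that user's observed ratings;
#   (2) refit each type's preference by least squares on the pooled
#       observed ratings of its members.
for _ in range(10):
    errs = np.stack([
        (np.where(M, R - W[k] @ X.T, 0.0) ** 2).sum(axis=1) for k in range(K)
    ])
    z = errs.argmin(axis=0)                       # assignment step
    for k in range(K):                            # update step
        users = np.where(z == k)[0]
        if users.size:
            rows, cols = np.nonzero(M[users])
            W[k] = np.linalg.lstsq(X[cols], R[users][rows, cols], rcond=None)[0]

loss = (np.where(M, R - W[z] @ X.T, 0.0) ** 2).sum() / M.sum()
print(f"mean squared error on observed ratings: {loss:.4f}")
```

In the paper's framework the per-type preferences are further represented in a frequency domain rather than as linear functions; the linear model here only keeps the sketch short.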

2. SMOOTH RECONSTRUCTION FROM NON-UNIFORM SAMPLES

Rating as the output of the preference function. Most of the time, a finite set of features can characterize the users and items that constitute the recommendation problem. Although no two users or items are exactly the same, the number of characterizing features can be considerably small without losing much information. Assume that item i is characterized by the vector x_i ∈ X ⊂ R^d. We further assume that all users observe the same features of an item and that user u's ratings are determined by a preference function f_u : X → [c_min, c_max]. The recovery of a general preference function might need an indefinite number of samples, i.e., observed ratings. However, we do not expect user attitudes to change much with small changes in an item's features. For example, if price is the determinative factor in a user's preference, small changes in the price should not change the preference for this item significantly (see Figure 1).



Figure 1: The preference function is expected to vary smoothly over the space of items.
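To make the smoothness assumption concrete, the sketch below reconstructs one user's preference function from a few non-uniformly sampled ratings. This is our own illustration, not the paper's exact model: the dictionary of sinusoids, the dimensions, and the plain least-squares solver are all assumptions. A preference built from a small number of low-frequency components gives nearby items similar ratings, so it can be recovered from few samples:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustration of the smoothness assumption (our own construction): a
# user's preference f_u over item vectors x in R^d is built from a small
# number of low-frequency sinusoids, so nearby items get similar ratings.
d, n_freq, n_obs = 2, 8, 50
Omega = rng.normal(size=(n_freq, d))               # assumed low frequencies

def phi(X):
    """Frequency-domain features of item vectors."""
    Z = X @ Omega.T
    return np.concatenate([np.cos(Z), np.sin(Z)], axis=1)

c_true = rng.normal(size=2 * n_freq)               # spectrum of f_u
f = lambda X: phi(X) @ c_true                      # smooth preference

X_obs = rng.uniform(-1, 1, size=(n_obs, d))        # items the user rated
y_obs = f(X_obs)                                   # noiseless ratings

# Recover the spectrum from the non-uniform samples by least squares
# (a ridge term could be added when ratings are noisy).
c_hat = np.linalg.lstsq(phi(X_obs), y_obs, rcond=None)[0]

X_new = rng.uniform(-1, 1, size=(200, d))          # unseen items
err = np.abs(phi(X_new) @ c_hat - f(X_new)).mean()
print(f"mean absolute error on unseen items: {err:.2e}")
```

A full method in this spirit would fit one such spectrum per dominant user type; a single user is shown here for brevity.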

1.1 RELATED WORKS

The methods applied in CF are diverse and too numerous to list exhaustively. Below, we describe a number of well-known methods that are most related to our work.

