RS-FAIRFRS: COMMUNICATION EFFICIENT FAIR FEDERATED RECOMMENDER SYSTEM

Abstract

Federated Recommender Systems (FRSs) aim to provide recommendations to clients in a distributed manner with privacy preservation. FRSs suffer from high communication costs due to the communication between the server and many clients. Some past literature on federated supervised learning shows that sampling clients randomly improves communication efficiency without jeopardizing accuracy. However, in an FRS each user is considered a separate client, and clients communicate only item gradients. Thus, incorporating random sampling and determining the number of clients to be sampled in each communication round while retaining the model's accuracy becomes challenging. This paper provides sample complexity bounds on the number of clients that must be sampled in an FRS to preserve accuracy. Next, we consider the issue of demographic bias in FRS, quantified as the difference in the average error rates across different groups. Supervised learning algorithms mitigate group bias by adding a fairness constraint to the training loss, which requires sharing protected attributes with the server. This is prohibited in a federated setting to ensure clients' privacy. We design RS-FAIRFRS, a Random Sampling based Fair Federated Recommender System, which trains a fair global model and additionally trains local clients toward that fair global model, reducing demographic bias at the client level without the need to share their protected attributes. We empirically demonstrate on two of the most popular real-world datasets (ML1M, ML100K) and different sensitive features (age and gender) that RS-FAIRFRS reduces communication cost and demographic bias with improved model accuracy.

1. INTRODUCTION

Recommender systems (RSs) have a wide variety of applications in online platforms like e-business, e-commerce, e-learning, e-tourism, and music and movie recommendation engines (Lu et al. (2015)). Traditional RSs require gathering clients' private information at a central server, leading to serious privacy and security risks. With the increased storage and processing power of edge devices, ML models can now train locally. This has led to Federated Learning (FL) (McMahan et al. (2017)), which allows clients to share their updates with the server without any data transfer. The server proposes a common model which is communicated to all clients. Using their data and the global model, clients train locally and communicate the updated model to the server. FL has found many applications in the past few years, e.g., Google keyboard query suggestion (Yang et al. (2018)), smartphone voice assistants, mobile edge computing, and visual object detection (Aledhari et al. (2020)). These applications face numerous challenges, including communication efficiency (Smith et al. (2018)), statistical heterogeneity (Smith et al. (2017)), systems heterogeneity (Bonawitz et al. (2019)), privacy, personalization, fairness (Kairouz et al. (2021)), and many more.

This paper focuses on two primary issues: communication efficiency and demographic bias in FRSs. Unlike other applications of FL, where one client holds the data of many users, in an FRS each user acts as one client, constituting a single user's profile. FedRec (Lin et al. (2021)), an FRS, expects all clients to train in parallel using matrix factorization (MF). In each communication round, the server aggregates the model updates from a huge number of local clients to obtain a global model, and this global model is then sent back to all the clients. This whole procedure increases the communication cost. We show that random sampling of clients in each communication round reduces the communication cost even when only item gradients of sampled users are communicated. Theoretically, we provide
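The FedRec-style round described above, with random client sampling added, can be sketched as follows. This is a minimal illustration on synthetic data with made-up hyperparameters, not the paper's implementation: each user (client) keeps its own embedding on-device, trains one local MF step, and communicates only item-embedding gradients, which the server averages over the randomly sampled clients.

```python
import numpy as np

# All sizes and hyperparameters below are illustrative assumptions.
rng = np.random.default_rng(0)
n_users, n_items, d = 100, 50, 8      # each user is one client
lr, rounds, sample_size = 0.05, 30, 10

# Synthetic ratings: user -> {item: rating in 1..5}
ratings = {u: {int(i): float(rng.integers(1, 6))
               for i in rng.choice(n_items, size=5, replace=False)}
           for u in range(n_users)}

V = rng.normal(scale=0.3, size=(n_items, d))   # global item embeddings (shared)
U = rng.normal(scale=0.3, size=(n_users, d))   # user embeddings (never leave the device)

def mse(U, V):
    """Reconstruction error over all observed ratings (for monitoring only)."""
    errs = [(U[u] @ V[i] - r) ** 2 for u, rs in ratings.items() for i, r in rs.items()]
    return float(np.mean(errs))

def local_step(u, V):
    """One client's local MF step: update its own user vector, return item gradients."""
    grads = {}
    for i, r in ratings[u].items():
        err = U[u] @ V[i] - r
        grads[i] = err * U[u]          # gradient w.r.t. item embedding (communicated)
        U[u] -= lr * err * V[i]        # user-embedding update stays on-device
    return grads

loss_before = mse(U, V)
for _ in range(rounds):
    # Random sampling: only a subset of clients participates this round.
    sampled = rng.choice(n_users, size=sample_size, replace=False)
    agg = {}
    for u in sampled:
        for i, g in local_step(u, V).items():
            agg.setdefault(i, []).append(g)
    for i, gs in agg.items():          # server aggregates only item gradients
        V[i] -= lr * np.mean(gs, axis=0)
loss_after = mse(U, V)
```

Per round, the communication is proportional to `sample_size` rather than `n_users`, which is the source of the cost reduction; the sample complexity bounds discussed in this paper govern how small `sample_size` can be made without hurting accuracy.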

