PFEDKT: PERSONALIZED FEDERATED LEARNING VIA KNOWLEDGE TRANSFER

Abstract

Federated learning (FL) has been widely studied as a new paradigm for multi-party collaborative modelling on decentralized data with privacy protection. Unfortunately, traditional horizontal FL suffers from Non-IID data distributions, under which clients' private models after FL can be even inferior to models trained standalone. To tackle this challenge, most existing approaches focus on personalized federated learning (PFL) to improve personalized private models, but they deliver limited accuracy improvements. To this end, we design pFedKT, a novel personalized federated learning framework with private and global knowledge transfer, to boost the performance of personalized private models on Non-IID data. It involves two types of knowledge transfer: a) transferring historical private knowledge to new private models via local hypernetworks; b) transferring the global model's knowledge to private models through contrastive learning. After absorbing the historical private knowledge and the latest global knowledge, both the personalization and the generalization of private models are enhanced. Besides, we analyze pFedKT's generalization and prove its convergence theoretically. Extensive experiments verify that pFedKT achieves 1.38%-1.62% accuracy improvements for private models compared with the state-of-the-art baseline.

1. INTRODUCTION

With frequent privacy leakages, directly collecting data and modelling on it would violate privacy-protection regulations such as GDPR (Kairouz & et al., 2021). To enable collaborative modelling while protecting data privacy, horizontal federated learning (FL) came into being (McMahan & et al, 2017). As shown in Fig. 1(a), FL consists of a central server and multiple clients. In each communication round, the server broadcasts the global model (abbr. GM) to the selected clients; the clients then train it locally on their local datasets and upload the trained private models (abbr. PMs) to the server; finally, the server aggregates the received private models to update the global model. The whole procedure is repeated until the global model converges. In short, FL fulfils collaborative modelling by letting clients communicate only model updates with the server, while data is always stored locally. However, FL still faces several challenges, such as communication efficiency, robustness to attacks, and model accuracy, the last of which we focus on in this work.

The motivation for clients to participate in FL is to improve the quality of their local models. However, the decentralized data held by clients are often not independent and identically distributed (Non-IID) (Kairouz & et al., 2021), and the global model aggregated by the typical FL algorithm FedAvg (McMahan & et al, 2017) on Non-IID data may perform worse than clients' solely trained models. Zhao & et al (2018) have verified this fact experimentally and argued that the global model aggregated from skewed local models trained on Non-IID data deviates from the optimum (the model trained on all local data). To alleviate the accuracy degradation caused by Non-IID data, personalized FL (PFL) methods (Shamsian & et al, 2021) have been widely studied to improve the quality of clients' personalized models. Existing research implements PFL via fine-tuning (Mansour & et al, 2020; Wang & et al, 2019), model mixup (Arivazhagan & et al, 2019; Collins & et al, 2021), etc., but these methods offer only limited improvements in the accuracy of private models.

To further improve personalized private models on Non-IID data, we propose a novel personalized FL framework named pFedKT with two types of transferred knowledge: 1) private knowledge: we deploy a local hypernetwork on each client to transfer the knowledge of historical PMs to new PMs; 2) global knowledge: we exploit contrastive learning to enable PMs to absorb the GM's knowledge. We analyze pFedKT's generalization and prove its convergence theoretically. We also conduct extensive experiments to verify that pFedKT achieves state-of-the-art PM accuracy.

Contributions.

Our main contributions are summarized as follows: a) We devise two types of knowledge transfer to simultaneously enhance the generalization and personalization of private models. b) We analyze pFedKT's generalization and convergence in theory. c) Extensive experiments verify the superiority of pFedKT in the accuracy of personalized private models.
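The communication round described above (broadcast GM, train locally, upload PMs, aggregate) can be sketched as follows. This is a minimal illustrative FedAvg-style loop, not pFedKT itself; the model representation (a dict of parameters), the `n_samples` field, the sampling fraction, and the sample-size weighting are assumptions for the sketch.

```python
# Illustrative sketch of one FL communication round (FedAvg-style aggregation).
# Models are represented as dicts of named parameters; `local_train` is the
# client-side training routine supplied by the caller. All names are
# hypothetical, not pFedKT's actual implementation.
import copy
import random

def fedavg_round(global_model, clients, local_train, frac=1.0):
    """One round: broadcast GM to selected clients, train locally, aggregate PMs."""
    selected = random.sample(clients, max(1, int(frac * len(clients))))
    private_models, sizes = [], []
    for client in selected:
        # Each client starts from a copy of the broadcast global model.
        pm = local_train(copy.deepcopy(global_model), client)
        private_models.append(pm)
        sizes.append(client["n_samples"])
    total = sum(sizes)
    # Aggregate: weighted average of parameters, weighted by data share.
    return {
        k: sum((s / total) * pm[k] for s, pm in zip(sizes, private_models))
        for k in global_model
    }
```

Repeating `fedavg_round` until the global model converges reproduces the procedure in Fig. 1(a); pFedKT modifies the client-side `local_train` step rather than this server-side loop.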

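The global-knowledge transfer (type b) can be made concrete with a model-contrastive local objective in the style of MOON: the private model's representation of each input is pulled toward the global model's representation (positive pair) and pushed away from the previous-round private model's representation (negative pair). The exact loss used by pFedKT is not given in this section, so the formulation below, along with the names `pm`/`gm`/`pm_prev`, the weight `mu`, and the temperature `tau`, is an illustrative assumption.

```python
# Hypothetical sketch of contrastive global-knowledge transfer (MOON-style),
# assumed for illustration; not necessarily pFedKT's exact loss.
import torch
import torch.nn.functional as F

def contrastive_transfer_loss(z, z_glob, z_prev, tau=0.5):
    """Pull z toward the global model's representation, push it from the
    previous private model's: cross-entropy over the two similarities,
    with the global pair as the positive (label 0)."""
    sim_glob = F.cosine_similarity(z, z_glob, dim=-1) / tau  # positive pair
    sim_prev = F.cosine_similarity(z, z_prev, dim=-1) / tau  # negative pair
    logits = torch.stack([sim_glob, sim_prev], dim=-1)       # (batch, 2)
    labels = torch.zeros(z.size(0), dtype=torch.long)        # positive index = 0
    return F.cross_entropy(logits, labels)

def local_update(pm, gm, pm_prev, loader, mu=1.0, lr=0.01, epochs=1):
    """Client-side training: task loss plus a weighted contrastive transfer
    term. Models are assumed to expose .features(x) and .classifier(h)."""
    opt = torch.optim.SGD(pm.parameters(), lr=lr)
    gm.eval(); pm_prev.eval()
    for _ in range(epochs):
        for x, y in loader:
            z = pm.features(x)
            with torch.no_grad():  # GM and old PM act as fixed teachers
                z_g, z_p = gm.features(x), pm_prev.features(x)
            loss = F.cross_entropy(pm.classifier(z), y) \
                 + mu * contrastive_transfer_loss(z, z_g, z_p)
            opt.zero_grad(); loss.backward(); opt.step()
    return pm
```

Under this sketch, `mu` trades off personalization (the task loss on local data) against generalization (agreement with the global model's representations), which mirrors the paper's stated goal of enhancing both.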
