PERSONALIZED FEDERATED COMPOSITE LEARNING WITH FORWARD-BACKWARD ENVELOPES

Abstract

Federated composite optimization (FCO) is an optimization problem in federated learning whose loss function contains a non-smooth regularizer. It arises naturally in applications of federated learning (FL) that involve requirements such as sparsity, low-rankness, and monotonicity. In this study, we propose a personalization method for FCO, called pFedFBE, which uses the forward-backward envelope (FBE) as each client's loss function. With the FBE, we not only decouple the personalized models from the global model, but also make the personalized models smooth and easy to optimize. Despite the nonsmoothness of FCO, pFedFBE attains the same convergence complexity as FedAvg for FL with unconstrained smooth objectives. Numerical experiments demonstrate the effectiveness of the proposed method.

1. INTRODUCTION

Federated learning (FL) was originally proposed by McMahan et al. (2016) to solve learning tasks with decentralized data arising in various applications. For example, data may be generated at medical institutions that cannot share their data with each other due to confidentiality or legal constraints. Instead of accessing all the data sets directly, the institutions, or clients, operate under the coordination of a central server, and the central server aggregates the local information to train a global model. Similar methodologies have been investigated in the literature on decentralized optimization (Colorni et al., 1991; Boyd et al., 2011; Yang et al., 2019). For further introduction and open problems in the field of federated optimization, we refer to the review articles (Kairouz et al., 2021; Wang et al., 2021). The local loss functions of FL can be nonsmooth. In particular, each may be the sum of a smooth function and a nonsmooth regularizer, where the regularizer promotes certain structure in the optimal parameters such as sparsity, low-rankness, total variation, or additional constraints on the parameters. This has motivated the recent study of the federated setting of composite optimization (Yuan et al., 2021). The mathematical formulation of FCO is
$$\min_{w \in \mathbb{R}^d} \ f(w) := \frac{1}{N} \sum_{i=1}^{N} \left( f_i(w) + h(w) \right), \qquad (1)$$
where $f_i(w) = \mathbb{E}_{\xi_i} \tilde f_i(w, \xi_i)$, or its empirical version $f_i(w) = \frac{1}{|D_i|} \sum_{\xi_i \in D_i} \tilde f_i(w, \xi_i)$, is a smooth function with local dataset $D_i$, and $h : \mathbb{R}^d \to \mathbb{R}$ is a nonsmooth but convex regularizer. In addition, we assume that the proximal operator of $h$,
$$\mathrm{prox}_h(w) := \operatorname*{arg\,min}_{u \in \mathbb{R}^d} \ h(u) + \frac{1}{2} \|u - w\|^2,$$
has a closed-form expression and is easy to compute. The difference from the centralized setting is that $D_i$ is the local data of client $i$, and the data distributions of the clients may not be identical.
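As a concrete illustration of the closed-form proximal operator assumed above, consider the sparsity-promoting regularizer $h(u) = \lambda \|u\|_1$, whose prox is the well-known soft-thresholding map. The sketch below is an illustrative example, not part of the paper's method; the function name and the choice of $h$ are ours.

```python
import numpy as np

def prox_l1(w, lam):
    """Proximal operator of h(u) = lam * ||u||_1 (soft-thresholding).

    Solves argmin_u  lam * ||u||_1 + 0.5 * ||u - w||^2  in closed form:
    each coordinate is shrunk toward zero by lam, and entries with
    |w_i| <= lam are set exactly to zero, which promotes sparsity.
    """
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

# Example: small entries are zeroed out, large ones are shrunk by lam.
w = np.array([0.3, -1.5, 0.05, 2.0])
print(prox_l1(w, 0.5))  # -> [ 0.  -1.   0.   1.5]
```

Other regularizers used in FCO, such as the nuclear norm (low-rankness) or an indicator function of a convex set (constraints), admit similarly cheap closed-form or spectral proximal maps.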
Optimizing FL with unconstrained smooth objectives, i.e., problem (1) with h ≡ 0, has been extensively studied in the literature, e.g., FedAvg (McMahan et al., 2016), FedProx (Li et al., 2020), SCAFFOLD (Karimireddy et al., 2020b), and MIME (Karimireddy et al., 2020a), to name a few. When h ̸≡ 0, FedDual (Yuan et al., 2021), FedDR (Tran Dinh et al., 2021), and FedADMM (Wang et al., 2022) have been developed. One of the challenges for these algorithms is the heterogeneity of the local datasets D_i, whose distributions are non-identical. The model parameter w learned by minimizing f(w) may perform poorly for individual clients. Conversely, if each client learns its parameters from its own data alone, the local models may generalize poorly due to insufficient data. For the case h ≡ 0, in order to learn the global parameters and the local parameters jointly, the concept of personalized FL has been

