FEDFA: FEDERATED FEATURE AUGMENTATION

Abstract

Federated learning is a distributed paradigm that allows multiple parties to collaboratively train deep models without exchanging raw data. However, the data distribution among clients is naturally non-i.i.d., which leads to severe degradation of the learned model. The primary goal of this paper is to develop a robust federated learning algorithm that addresses feature shift in clients' samples, which can be caused by various factors, e.g., acquisition differences in medical imaging. To reach this goal, we propose FEDFA, which tackles federated learning from the distinct perspective of federated feature augmentation. FEDFA is based on a major insight: each client's data distribution can be characterized by the statistics (i.e., mean and standard deviation) of its latent features, and these local statistics can be manipulated globally, i.e., based on information from the entire federation, to give clients a better sense of the underlying distribution and thereby alleviate local data bias. Based on this insight, we propose to augment each local feature statistic probabilistically via a normal distribution, whose mean is the original statistic and whose variance quantifies the augmentation scope. Key to our approach is the determination of a meaningful Gaussian variance, which is accomplished by taking into account not only the biased data of each individual client, but also the underlying feature statistics characterized by all participating clients. We offer both theoretical and empirical justifications to verify the effectiveness of FEDFA. Our code is available at https://github.com/tfzhou/FedFA.
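The augmentation described above can be illustrated with a minimal NumPy sketch: channel-wise feature statistics are replaced by samples from Gaussians centered at the originals, and the features are re-normalized with the sampled statistics. The function name and the variance inputs `sigma_mu` and `sigma_std` are illustrative assumptions; in FEDFA these variances are estimated from both local and federation-wide statistics, which is omitted here.

```python
import numpy as np

def augment_feature_statistics(x, sigma_mu, sigma_std, rng=None):
    """Probabilistically perturb per-channel feature statistics (sketch).

    x: feature maps of shape (batch, channels, height, width).
    sigma_mu, sigma_std: per-channel standard deviations, shape (channels,),
        quantifying the augmentation scope (assumed given here).
    """
    rng = np.random.default_rng() if rng is None else rng
    b, c = x.shape[:2]

    # Original channel-wise statistics of each sample.
    mu = x.mean(axis=(2, 3), keepdims=True)          # (b, c, 1, 1)
    std = x.std(axis=(2, 3), keepdims=True) + 1e-6   # avoid divide-by-zero

    # Sample new statistics from Gaussians centered at the originals.
    new_mu = mu + rng.standard_normal((b, c, 1, 1)) * sigma_mu.reshape(1, c, 1, 1)
    new_std = std + rng.standard_normal((b, c, 1, 1)) * sigma_std.reshape(1, c, 1, 1)

    # Normalize with original statistics, re-scale with sampled ones.
    return (x - mu) / std * new_std + new_mu
```

Note that when both variance inputs are zero, the sampled statistics coincide with the originals and the features pass through unchanged, so the augmentation strength is controlled entirely by the Gaussian variances.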

1. INTRODUCTION

Federated learning (FL) (Konečnỳ et al., 2016) is an emerging collaborative training framework that enables training on decentralized data residing on devices like mobile phones. It comes with the promise of training centralized models using local data points such that the privacy of participating devices is preserved, and has attracted significant attention in critical fields like healthcare and finance. Since data come from different users, it is inevitable that each user's data follow a different underlying distribution, incurring large heterogeneity (non-i.i.d.-ness) among users' data. In this work, we focus on feature shift (Li et al., 2020b), which is common in many real-world cases, such as medical data acquired from different medical devices or natural images collected in diverse environments. While the problem of feature shift has been studied in classical centralized learning tasks like domain generalization, little is understood about how to tackle it in federated learning; rare exceptions include (Li et al., 2020b; Reisizadeh et al., 2020; Jiang et al., 2022; Liu et al., 2020a). FEDROBUST (Reisizadeh et al., 2020) and FEDBN (Li et al., 2020b) address the problem through client-dependent learning, either fitting the shift with a client-specific affine distribution or learning unique BN parameters for each client. However, these algorithms may still suffer from significant local dataset bias. Other works (Qu et al., 2022; Jiang et al., 2022; Caldarola et al., 2022) learn robust models by adopting Sharpness Aware Minimization (SAM) (Foret et al., 2021) as the local optimizer, which, however, doubles the computational cost compared to SGD or Adam. In addition to model optimization, FEDHARMO (Jiang et al., 2022) has investigated specialized image normalization techniques to mitigate feature shift in medical domains.
Despite this progress, an alternative avenue, data augmentation, remains largely unexplored in federated learning, even though it has been extensively studied in the centralized setting to impose regularization and improve generalizability (Zhou et al., 2021; Zhang et al., 2018). While seemingly straightforward, it is non-trivial to perform effective data augmentation in federated learning because users have no direct access to the external data of other users. Simply applying conventional augmentation techniques to each client is sub-optimal since, without injecting global

