DATA-FREE ONE-SHOT FEDERATED LEARNING UNDER VERY HIGH STATISTICAL HETEROGENEITY

Abstract

Federated learning (FL) is an emerging distributed learning framework that collaboratively trains a shared model without transferring the local clients' data to a centralized server. Motivated by concerns stemming from extended communication and potential attacks, one-shot FL limits communication to a single round while attempting to retain performance. However, one-shot FL methods often degrade under high statistical heterogeneity, fail to promote pipeline security, or require an auxiliary public dataset. To address these limitations, we propose two novel data-free one-shot FL methods: FEDCVAE-ENS and its extension FEDCVAE-KD. Both approaches reframe the local learning task using a conditional variational autoencoder (CVAE) to address high statistical heterogeneity. Furthermore, FEDCVAE-KD leverages knowledge distillation to compress the ensemble of client decoders into a single decoder. We propose a method that shifts the center of the CVAE prior distribution and experimentally demonstrate that this promotes security, and show how either method can incorporate heterogeneous local models. We confirm the efficacy of the proposed methods over baselines under high statistical heterogeneity using multiple benchmark datasets. In particular, at the highest levels of statistical heterogeneity, both FEDCVAE-ENS and FEDCVAE-KD typically more than double the accuracy of the baselines.
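The generation step described above can be illustrated with a minimal, self-contained sketch. Everything here is a toy stand-in: the "decoders" are random linear maps rather than trained CVAE decoders, the prior shift value is arbitrary, and averaging the decoder outputs is only one plausible way to aggregate the ensemble (the actual FEDCVAE-ENS aggregation may differ, e.g., by weighting decoders according to clients' label distributions).

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM, LABEL_DIM, DATA_DIM = 4, 3, 8
N_CLIENTS = 5
# Shifted center of the CVAE prior (hypothetical value); sampling from
# N(PRIOR_SHIFT, I) instead of N(0, I) is the security-motivated shift.
PRIOR_SHIFT = 5.0 * np.ones(LATENT_DIM)

def make_toy_decoder():
    """Stand-in for a trained client CVAE decoder: a fixed random linear map
    from (latent vector, one-hot label) to a synthetic data point."""
    W = rng.normal(size=(LATENT_DIM + LABEL_DIM, DATA_DIM))
    return lambda z, y: np.tanh(np.concatenate([z, y]) @ W)

client_decoders = [make_toy_decoder() for _ in range(N_CLIENTS)]

def sample_from_ensemble(label, n=1):
    """Ensemble-style generation: draw z from the shifted prior, condition
    on a one-hot label, and average the client decoders' outputs."""
    y = np.eye(LABEL_DIM)[label]
    samples = []
    for _ in range(n):
        z = rng.normal(loc=PRIOR_SHIFT, scale=1.0)  # shifted prior sample
        samples.append(np.mean([dec(z, y) for dec in client_decoders], axis=0))
    return np.stack(samples)

samples = sample_from_ensemble(label=1, n=2)
print(samples.shape)  # (2, 8)
```

In the FEDCVAE-KD extension, synthetic labeled samples generated this way would serve as the transfer set for distilling the decoder ensemble into a single server-side decoder.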

1. INTRODUCTION

Traditional federated learning (FL) achieves privacy protection by sharing learned model parameters with a central server, circumventing the need for a centralized dataset and thus allowing potentially sensitive data to remain local to client devices (McMahan et al., 2017). FL has shown promise in several practical application domains with privacy concerns, such as health care, mobile phones, and industrial engineering (Li et al., 2020a). However, most existing FL methods depend on substantial iterative communication (Guha et al., 2019; Li et al., 2020b), introducing a vulnerability to eavesdropping attacks, among other privacy and security concerns (Mothukuri et al., 2021). One-shot FL has emerged to address issues associated with communication and security in standard FL (Guha et al., 2019). One-shot FL limits communication to a single round, which is more practical in scenarios like model markets, where models trained to convergence are sold with no possibility for iterative communication during local client training (Li et al., 2021b). In high-impact settings like health care, data can be highly heterogeneous and computation capabilities varied; for example, health care institutions may have different prevalence rates of particular diseases (or no data on a disease at all) and substantially different computing abilities depending on funding (Li et al., 2020a). Furthermore, fewer communication rounds mean fewer opportunities for eavesdropping attacks. While results in one-shot FL are promising, existing methods struggle under high statistical heterogeneity, i.e., non-independently-and-identically-distributed (non-IID) data (Zhou et al. (2020), Zhang et al. (2021)), or do not fully consider statistical heterogeneity (Guha et al. (2019), Shin et al. (2020), Li et al. (2021b)). Additionally, most do not consider pipeline security (Shin et al. (2020), Li et al. (2021b), Zhang et al. (2021)). Furthermore, an auxiliary public dataset is often required to achieve satisfactory performance in one-shot FL (Guha et al. (2019), Li et al. (2021b)), which may be difficult to obtain in practice (Zhu et al., 2021).

