LABEL-DISTRIBUTION-AGNOSTIC ENSEMBLE LEARNING ON FEDERATED LONG-TAILED DATA

Abstract

Federated Learning (FL) is a distributed machine learning paradigm that enables devices to collaboratively train a shared model. However, the naturally long-tailed data distribution deteriorates the performance of the global model, and this issue is difficult to address due to data heterogeneity, e.g., local clients may exhibit diverse imbalanced class distributions. Moreover, existing re-balance strategies generally utilize the label distribution as the class prior, which may conflict with the privacy requirements of FL. To this end, we propose a Label-Distribution-Agnostic Ensemble (LDAE) learning framework that integrates heterogeneous data distributions using multiple experts and aims to optimize a balanced global objective under privacy protection. In particular, we derive a privacy-preserving proxy from the model updates of clients to guide the grouping and updating of multiple experts. Knowledge from clients can be aggregated via implicit interactions among different expert groups. We theoretically and experimentally demonstrate that (1) there is a global objective gap between the global and local re-balance strategies¹ and (2) while protecting data privacy, the proxy can be used as an alternative to the label distribution in existing class-prior-based re-balance strategies. Extensive experiments on long-tailed decentralized datasets demonstrate the effectiveness of our method, showing superior performance over state-of-the-art methods.

1. INTRODUCTION

Federated Learning (FL) aims to collaboratively learn from data held by a number of remote clients and to produce a highly accurate global model on the server with aggregated knowledge. The most important issues in practical FL applications involve data heterogeneity and privacy protection during the collaboration of disparate data sources. These issues are even more pronounced when the data distribution is long-tailed, as in many real-world scenarios (Cui et al., 2019; Liu et al., 2019) such as medical applications (Li et al., 2019; Malekzadeh et al., 2021) and autonomous vehicles (Samarakoon et al., 2019; Pokhrel & Choi, 2020). Under a long-tailed global data distribution, it is extremely challenging to learn an effective global model by leveraging knowledge from local clients. From the local perspective, there can be a large divergence among the imbalanced label distributions of different clients, resulting in heterogeneous imbalance as shown in Figure 1(a), i.e., local datasets on different clients may have different imbalance ratios or minority classes. From the global perspective, one should handle the imbalance issue while preserving privacy (Li et al., 2021a), i.e., the server should not require clients to upload their label distributions for re-balance strategies. Several techniques have been proposed to tackle the class imbalance problem in FL, such as loss re-weighting (Wang et al., 2021; Shen et al., 2021), client clustering (Duan et al., 2020), and client selection schemes (Yang et al., 2021). Most of them focus on datasets with only a few classes (e.g., ten or twenty), and suffer significant performance drops on large-scale imbalanced datasets with more classes (Liu et al., 2019; Zhang et al., 2021b). Simultaneously, existing solutions generally assume that some sensitive information is accessible to the global server, e.g., a balanced



¹ The local re-balance strategy means that each client applies re-balance methods based on its local label distribution, while the global re-balance strategy applies re-balance methods using the global label distribution as the class-wise prior.
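As a concrete illustration of these two strategies (a minimal sketch, not the specific loss used in this work), the snippet below instantiates both with logit adjustment, a representative class-prior-based re-balance method. Each simulated client has its own imbalance ratio and its own minority classes (heterogeneous imbalance); the local strategy uses the client's own label counts as the prior, while the global strategy uses the aggregated counts, which the FL privacy constraint prevents the server from collecting directly. All names and numbers here are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

def long_tailed_counts(num_classes, head_size, imbalance_ratio, rng):
    """Exponentially decaying per-class counts, shuffled so that each
    client ends up with its own minority classes."""
    decay = np.exp(np.linspace(0.0, -np.log(imbalance_ratio), num_classes))
    counts = np.round(head_size * decay).astype(int)
    rng.shuffle(counts)
    return counts

def prior_adjusted_loss(logits, targets, class_counts, tau=1.0):
    """Logit-adjusted cross entropy: shift logits by the log class prior so
    that rare classes receive larger effective margins."""
    prior = class_counts.float() / class_counts.sum()
    adjusted = logits + tau * torch.log(prior + 1e-12)
    return F.cross_entropy(adjusted, targets)

rng = np.random.default_rng(0)
num_classes, num_clients = 10, 4

# Heterogeneous imbalance: each client draws its own imbalance ratio
# and minority classes.
local_counts = [
    torch.tensor(long_tailed_counts(num_classes, 500,
                                    rng.choice([10, 50, 100]), rng))
    for _ in range(num_clients)
]
# Aggregated (global) label distribution, which privacy constraints
# prevent the server from collecting directly.
global_counts = torch.stack(local_counts).sum(dim=0)

logits = torch.randn(8, num_classes)               # dummy mini-batch on client 0
targets = torch.randint(0, num_classes, (8,))

loss_local = prior_adjusted_loss(logits, targets, local_counts[0])   # local strategy
loss_global = prior_adjusted_loss(logits, targets, global_counts)    # global strategy
print(loss_local.item(), loss_global.item())
```

Because the per-client priors differ from the aggregated one, the two losses optimize different objectives, which is the source of the global objective gap discussed above.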

