UNDERSTANDING ADVERSARIAL TRANSFERABILITY IN FEDERATED LEARNING

Abstract

With the promises Federated Learning (FL) delivers, various topics regarding its robustness and security have been widely studied in recent years, such as the possibility of conducting adversarial attacks (or transferable adversarial attacks) in a white-box setting with full knowledge of the model (or the entire data), or the possibility of conducting poisoning/backdoor attacks during training as a malicious client. In this paper, we investigate robustness and security from a different, simpler, but practical setting: a group of malicious clients has impacted the model during training by disguising their identities and acting as benign clients, revealing their adversarial position only after training to conduct transferable adversarial attacks with their own data, which is usually a subset of the data the FL system is trained with. Our aim is to offer a full understanding of the challenges the FL system faces in this setting across a spectrum of configurations. We find that such an attack is possible, but that the federated model is more robust than its centralized counterpart when the accuracy on clean images is comparable. Through our study, we hypothesize that this robustness stems from two factors: decentralized training on distributed data and the averaging operation. Our work has implications for understanding the robustness of federated learning systems and poses a practical question for federated learning applications.

1. INTRODUCTION

The ever-growing usage of mobile devices such as smartphones and tablets leads to an explosive amount of distributed data collected at the user end. Such private and sensitive data, if fully utilized, would greatly improve the power of intelligent systems. Federated learning (FL) provides a solution for decentralized learning by training quality models through local updates and parameter aggregation McMahan et al. (2017). An FL system maintains a loose federation of participating clients and a central server that holds no data but the aggregated model. At each round, the central server distributes the global model to a random subset of participants, who update the model locally with their private data and submit the updated model back to the server for aggregation (e.g., averaging). By design, the system has no visibility into the local data, allowing it to benefit from a wide range of private data while maintaining participant privacy, and the averaging provides an efficient way to leverage the updated parameters compared with distributed SGD.

Despite the fact that FL protects privacy, its loose organization and invisibility into local data make it more vulnerable to various attacks, including data poisoning Huang et al. (2011), model poisoning (backdoor attacks) Bhagoji et al. (2019); Bagdasaryan et al. (2020), free-rider attacks Lin et al. (2019), and various reconstruction attacks that leak the data and privacy of individual participants Geiping et al. (2020); Zhu et al. (2019). Various anomaly-detection-based methods have been proposed to prevent possible poisoning attacks, such as Byzantine-tolerant aggregation Yin et al. (2018), clustering-based selection Shen et al. (2016), and anomaly detection in the spectral domain Li et al. (2020). Reputation mechanisms have been introduced to prevent free-riders Xu & Lyu (2020), and differential-privacy techniques are leveraged to preserve privacy against GAN-based reconstruction attacks Augenstein et al. (2019); Hao et al. (2019). Another line of FL security research focuses on attacks during inference, i.e., adversarial attacks Biggio et al. (2013); Szegedy et al. (2013). As in any other deep learning application, federated systems have also been found vulnerable to adversarial examples carefully crafted to deceive the model Zizzo et al. (2020). FAT first discusses the possibility
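The round structure described above (distribute, locally update, average) can be sketched as follows. This is a minimal illustration of FedAvg-style aggregation, not the paper's implementation; `local_update` is a hypothetical callback standing in for the clients' local training, and all names here are illustrative.

```python
import numpy as np

def fedavg_round(global_weights, client_datasets, local_update,
                 client_fraction=0.1, rng=None):
    """One round of federated averaging (FedAvg, McMahan et al. 2017).

    `local_update(weights, data)` is a placeholder for local SGD on a
    client's private data; it returns the locally updated weight vector.
    """
    rng = rng or np.random.default_rng()
    n_clients = len(client_datasets)
    m = max(1, int(client_fraction * n_clients))
    # Server samples a random subset of participants each round.
    selected = rng.choice(n_clients, size=m, replace=False)

    updates, sizes = [], []
    for k in selected:
        updates.append(local_update(global_weights, client_datasets[k]))
        sizes.append(len(client_datasets[k]))

    # Aggregate: weighted average of client models,
    # proportional to local dataset size.
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))
```

The server never touches client data, only the submitted parameters, which is the "invisibility" property discussed above.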
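To make the inference-time threat concrete, the sketch below crafts an adversarial example with the fast gradient sign method (FGSM) on a logistic-regression surrogate; in the transfer setting, such examples crafted on one model are then evaluated against another (e.g., the global federated model). This is a hedged toy example, not the attack studied in this paper: the surrogate model and all parameter names are assumptions for illustration.

```python
import numpy as np

def fgsm_perturb(x, y, w, b, eps=0.1):
    """One FGSM step on a logistic-regression surrogate.

    For binary cross-entropy, the input gradient is
    (sigmoid(w.x + b) - y) * w; FGSM steps in its sign direction
    to push the prediction away from the true label y.
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted probability
    grad_x = (p - y) * w                      # dLoss/dx
    return x + eps * np.sign(grad_x)
```

Whether such a perturbation fools a *different* model (the transferability question) is exactly what the setting in this paper probes, with the adversary holding only a subset of the training data.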

