A SIMULATION-BASED FRAMEWORK FOR ROBUST FEDERATED LEARNING TO TRAINING-TIME ATTACKS

Anonymous

Abstract

Well-known robust aggregation schemes in federated learning (FL) have been shown to be vulnerable to an informed adversary who can tailor training-time attacks (Fang et al., 2020; Xie et al., 2020). We frame the robust distributed learning problem as a game between a server and an adversary that is able to optimize strong training-time attacks. We introduce RobustTailor, a simulation-based framework that prevents the adversary from being omniscient. The simulated game we propose enjoys theoretical guarantees through a regret analysis. RobustTailor improves robustness to training-time attacks significantly while preserving almost the same privacy guarantees as standard robust aggregation schemes in FL. Empirical results under challenging attacks show that RobustTailor performs similarly to an upper bound with perfect knowledge of honest clients.

1. INTRODUCTION

In federated learning (FL), a global/personalized model is learnt from data distributed on multiple clients without sharing data (McMahan et al., 2017; Kairouz et al., 2021). Clients compute their (stochastic) gradients using their own local data and send them to a central server for aggregating and updating a model. While FL offers improvements in terms of privacy, it creates additional challenges in terms of robustness. Clients are often prone to bias in their stochastic gradient updates, which comes not only from poor sampling or data noise but also from malicious attacks of Byzantine clients who may send arbitrary messages to the server instead of correct gradients (Guerraoui et al., 2018). Therefore, in FL, it is essential to guarantee some level of robustness to Byzantine clients that might be compromised by an adversary. Compromised clients are vulnerable to data/model poisoning and tailored attacks (Fang et al., 2020). Byzantine resilience is typically achieved by robust gradient aggregation schemes, e.g., Krum (Blanchard et al., 2017), Comed (Yin et al., 2018), and trimmed-mean (Yin et al., 2018). These aggregators are resilient against attacks that are designed in advance. However, such robustness is insufficient in practice, since a powerful adversary could learn the aggregation rule and tailor its training-time attack. It has been shown that well-known Byzantine-resilient gradient aggregation schemes are susceptible to an informed adversary that can tailor its attacks (Fang et al., 2020). Specifically, Fang et al. (2020) and Xie et al. (2020) proposed efficient and nearly optimal training-time attacks that circumvent Krum, Comed, and trimmed-mean. A tailored attack is designed with prior knowledge of the robust aggregation rule used by the server, such that the attacker has a provable way to corrupt the training process.
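To make the aggregation rules mentioned above concrete, the following is a minimal sketch (not the cited authors' implementations) of two coordinate-wise robust aggregators, Comed and trimmed-mean; the function names and the trimming parameter `b` are illustrative choices, not notation from this paper:

```python
import numpy as np

def coordinate_median(grads):
    """Comed-style aggregation: take the median of the received
    gradients independently in each coordinate, which limits the
    influence of a minority of arbitrarily corrupted updates."""
    return np.median(np.stack(grads), axis=0)

def trimmed_mean(grads, b):
    """Trimmed-mean aggregation: in each coordinate, discard the b
    largest and b smallest values, then average the remainder."""
    g = np.sort(np.stack(grads), axis=0)
    return g[b:len(grads) - b].mean(axis=0)

# Three honest clients near (1, 1) and one Byzantine client sending
# an extreme update; both aggregators stay close to the honest mean.
grads = [np.array([1.0, 1.0]), np.array([1.1, 0.9]),
         np.array([0.9, 1.1]), np.array([100.0, -100.0])]
print(coordinate_median(grads))   # close to [1.05, 0.95]
print(trimmed_mean(grads, b=1))   # close to [1.05, 0.95]
```

A plain mean over the same inputs would be dragged to roughly (25.75, -24.25) by the single Byzantine update, which is what motivates these robust alternatives.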
Given the information leverage of the adversary, it is a significant challenge to establish successful defense mechanisms against such tailored attacks. In this paper, we formulate the robust distributed learning problem against training-time attacks as a game between a server and an adversary. To prevent the adversary from being omniscient, we propose to follow a mixed strategy over the existing robust aggregation rules. In real-world settings, both the server and the adversary have a number of aggregation rules and attack programs at their disposal; how to utilize these aggregators efficiently while guaranteeing robustness is a challenging task. We address scenarios where the server does not know the specific attack method in advance and the adversary does not know the exact aggregation rule used in each iteration, while the adversary and the server know the set of the server's aggregation rules and the set of attack programs, respectively.
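The mixed-strategy idea above can be sketched as follows. This is a simplified illustration, not the RobustTailor algorithm itself: the sampling distribution `probs` is a placeholder (the paper derives it from a simulated game with regret guarantees), and the aggregator set is hypothetical:

```python
import numpy as np

def mixed_strategy_aggregate(grads, aggregators, probs, rng):
    """At each training round, the server samples one aggregation rule
    from a probability distribution over its available rules. Since the
    adversary does not observe which rule is active this round, it
    cannot tailor its attack to a single known aggregator."""
    idx = rng.choice(len(aggregators), p=probs)
    return aggregators[idx](grads), idx

# Hypothetical aggregator set: plain mean and coordinate-wise median.
aggregators = [
    lambda g: np.mean(np.stack(g), axis=0),
    lambda g: np.median(np.stack(g), axis=0),
]

rng = np.random.default_rng(0)
grads = [np.array([1.0]), np.array([2.0]), np.array([9.0])]
update, idx = mixed_strategy_aggregate(grads, aggregators, [0.5, 0.5], rng)
print(idx, update)  # which rule was drawn this round, and its output
```

The key design point is that randomization is over *which* robust rule is applied, not over the gradients themselves, so each individual round still uses a standard aggregator with its own resilience properties.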



Footnote: While this assumption is essential to frame our game, we provide experimental results on challenging settings where the server does not know the set of attack programs in Section 5.

