PERSONALIZED FEDERATED HYPERNETWORKS FOR PRIVACY PRESERVATION IN MULTI-TASK REINFORCEMENT LEARNING

Abstract

Multi-Agent Reinforcement Learning currently focuses on implementations where all data and training can be centralized on one machine. But what if local agents are split across multiple tasks and need to keep their data private from one another? We develop the first application of Personalized Federated Hypernetworks (PFH) to Reinforcement Learning (RL). We then present a novel application of PFH to few-shot transfer, and demonstrate significant initial increases in learning. PFH has never been demonstrated beyond supervised learning benchmarks, so we apply it to an important domain: RL price-setting for energy demand response. We consider a general case where agents are split across multiple microgrids, wherein energy consumption data must be kept private within each microgrid. Together, our work explores how the fields of personalized federated learning and RL can come together to make learning efficient across multiple tasks while keeping data secure.

1. INTRODUCTION

As Reinforcement Learning (RL) is brought to bear on pressing societal issues such as the green energy transition, the types of environments that RL must perform well in may display characteristics exotic to classical RL environments. Real applications at scale may require privacy guarantees that are not provided by modern multi-agent RL algorithms, as those algorithms may train on privileged or corporate data (Lowe et al., 2017; Sunehag et al., 2017; Rashid et al., 2018); any app that personalizes an RL agent to individual users must take care to protect their privacy by not storing all their data on a central server. Real-world applications will also likely feature heterogeneous tasks; every user, robot, and energy system will have different traits that cannot be accounted for by "one size fits all" algorithms. As previous work in privacy-preserving RL (Qi et al., 2021; Wang et al., 2020c; Ren et al., 2019; Anwar & Raychowdhury, 2021) does not extend to personalized models, the competing goals of privacy and personalization must each be accomplished at the other's expense. One approach toward privacy preservation by decentralizing data servers within supervised learning is federated learning (Shokri & Shmatikov, 2015). Federated learning algorithms train a global model from gradient updates sent by individual clients training on their own data, which is never sent to the central server. An extension of this technique is personalized federated learning using hypernetworks (PFH, Shamsian et al. (2021)), which allows for behavior tailored to individual heterogeneous tasks by splitting the model into a global common component (i.e. the hypernetwork) and a local individual component (a local network generated by the hypernetwork), which is tailored to each client. This task specialization allows for learning common features together in the global component while learning client-specific knowledge in the local component.
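The global/local split described above can be sketched in a few lines: a shared hypernetwork maps a learned, client-specific embedding to the full weight vector of a small local network, so only hypernetwork gradients ever need to be communicated. This is a minimal illustrative sketch in PyTorch; the layer sizes, class names, and the choice of a two-layer local policy network are our assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    """Global component shared across clients. Given a client's learned
    embedding, it generates the weights of that client's local network.
    (Illustrative sketch; dimensions are assumed, not taken from the paper.)"""

    def __init__(self, n_clients, embed_dim=8, obs_dim=4, hidden=16, n_actions=2):
        super().__init__()
        self.obs_dim, self.hidden, self.n_actions = obs_dim, hidden, n_actions
        # One trainable embedding per client: the only per-client state
        # stored on the server, not raw client data.
        self.embeddings = nn.Embedding(n_clients, embed_dim)
        # Total parameter count of the local two-layer network.
        n_params = (obs_dim * hidden + hidden) + (hidden * n_actions + n_actions)
        self.generator = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, n_params))

    def forward(self, client_id, obs):
        # Generate this client's local-network weights from its embedding.
        theta = self.generator(self.embeddings(client_id))
        i = 0
        w1 = theta[i:i + self.obs_dim * self.hidden].view(self.hidden, self.obs_dim)
        i += self.obs_dim * self.hidden
        b1 = theta[i:i + self.hidden]
        i += self.hidden
        w2 = theta[i:i + self.hidden * self.n_actions].view(self.n_actions, self.hidden)
        i += self.hidden * self.n_actions
        b2 = theta[i:]
        # Run the generated local network on the client's observation batch.
        h = torch.relu(obs @ w1.T + b1)
        return h @ w2.T + b2  # action logits, shape (batch, n_actions)

hn = HyperNetwork(n_clients=3)
logits = hn(torch.tensor(0), torch.randn(5, 4))
print(logits.shape)  # one row of action logits per observation
```

Because the local weights are a differentiable function of the embedding, each client can backpropagate through its generated network and send only gradients with respect to the shared hypernetwork, which is what keeps raw data local.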
We present a novel application of PFH to RL in a realistic power systems setting that requires both privacy and heterogeneity in agents to accommodate diverse, sensitive environments. An RL controller setting hourly transactive energy prices has been shown to optimize energy usage (Li & Hong, 2014; Spangher, 2021; Vázquez-Canteli et al., 2019; Agwan et al., 2021) by incentivizing consumers, at the scale of groups of buildings (microgrids) or office workers within buildings, to shift demand to different times of day. By guiding consumers to defer energy demands to hours

