SMART MULTI-TENANT FEDERATED LEARNING

Abstract

Federated learning (FL) is an emerging distributed machine learning method that empowers in-situ model training on decentralized edge devices. However, multiple simultaneous training activities could overload resource-constrained devices. In this work, we propose a smart multi-tenant FL system, MuFL, to effectively coordinate and execute simultaneous training activities. We first formalize the problem of multi-tenant FL, define multi-tenant FL scenarios, and introduce a vanilla multi-tenant FL system that trains activities sequentially to form baselines. We then propose two approaches to optimize multi-tenant FL: 1) activity consolidation merges training activities into one activity with a multi-task architecture; 2) after training the consolidated activity for a number of rounds, activity splitting divides it into groups based on affinities among activities, such that activities within a group have better synergy. Extensive experiments demonstrate that MuFL outperforms other methods while consuming 40% less energy. We hope this work inspires the community to further study and optimize multi-tenant FL.

1. INTRODUCTION

Federated learning (FL) (McMahan et al., 2017) has attracted considerable attention as it enables privacy-preserving distributed model training among decentralized devices. It empowers a growing number of applications in both academia and industry, such as Google Keyboard (Hard et al., 2018), medical imaging analysis (Li et al., 2019; Sheller et al., 2018), and autonomous vehicles (Zhang et al., 2021a; Posner et al., 2021). Among them, some applications contain multiple training activities for different tasks. For example, Google Keyboard includes query suggestion (Yang et al., 2018), emoji prediction (Ramaswamy et al., 2019), and next-word prediction (Hard et al., 2018); autonomous driving involves multiple computer vision (CV) tasks, including lane detection, object detection, and semantic segmentation (Janai et al., 2020). However, multiple simultaneous training activities could overload edge devices (Bonawitz et al., 2019). Edge devices have tight resource constraints, whereas training deep neural networks for the aforementioned applications is resource-intensive. As a result, the majority of edge devices can only support one training activity at a time (Liu et al., 2019); multiple simultaneous federated learning activities on the same device could overwhelm its memory, computation, and power capacities. It is thus important to explore solutions that coordinate these training activities well.

A plethora of research on FL considers only one training activity in an application. Many studies address challenges including statistical heterogeneity (Li et al., 2020; Wang et al., 2020a), system heterogeneity (Chai et al., 2020; Yang et al., 2021), communication efficiency (Karimireddy et al., 2020; Zhu et al., 2021), and privacy issues (Bagdasaryan et al., 2020; Huang et al., 2021).
A common limitation is that these studies focus on a single training activity, but applications like Google Keyboard and autonomous vehicles require multiple training activities for different targets (Yang et al., 2018; Ramaswamy et al., 2019). Multi-tenancy of an FL system is designed by Bonawitz et al. (2019) to prevent simultaneous training activities from overloading devices; however, it mainly considers differences among training activities, neglecting their potential synergies. In this work, we propose a smart multi-tenant federated learning system, MuFL, to efficiently coordinate and execute simultaneous training activities under resource constraints by considering both synergies and differences among training activities. We first formalize the problem of multi-tenant FL and define four multi-tenant FL scenarios based on two variations in Section 3: 1) whether all training activities are the same type of application, e.g., CV applications; 2) whether all clients
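The two optimization ideas above can be illustrated with a minimal sketch. This is not the authors' implementation: the data structures, the pairwise `affinity` scores, the `threshold` parameter, and the greedy grouping rule are all simplifying assumptions introduced here for illustration.

```python
def consolidate(activities):
    """Activity consolidation (sketch): merge several training activities
    into a single multi-task activity, modeled here as a shared backbone
    with one task head per original activity."""
    return {"backbone": "shared", "heads": list(activities)}


def split_by_affinity(activities, affinity, threshold=0.5):
    """Activity splitting (sketch): greedily place each activity into the
    first group where its affinity with every member exceeds `threshold`
    (higher affinity = better synergy). `affinity` maps frozenset pairs
    of activity names to scores; the greedy rule is a hypothetical stand-in
    for the paper's affinity-based grouping."""
    groups = []
    for a in activities:
        for g in groups:
            if all(affinity[frozenset((a, b))] >= threshold for b in g):
                g.append(a)
                break
        else:
            groups.append([a])  # no compatible group found; start a new one
    return groups
```

For example, with three CV activities where lane and object detection have high mutual affinity but segmentation does not, `split_by_affinity` keeps the first two together and isolates the third, so each group can then be trained as its own consolidated multi-task activity.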

