MULTIGRAPH TOPOLOGY DESIGN FOR CROSS-SILO FEDERATED LEARNING

Anonymous

Abstract

Cross-silo federated learning utilizes a few hundred reliable data silos with high-speed access links to jointly train a model. While this approach has become a popular setting in federated learning, designing a robust topology to reduce the training time is still an open problem. In this paper, we present a new multigraph topology for cross-silo federated learning. We first construct the multigraph using the overlay graph. We then parse this multigraph into different simple graphs with isolated nodes. The existence of isolated nodes allows us to perform model aggregation without waiting for other nodes, hence reducing the training time. We further propose a new distributed learning algorithm to use with our multigraph topology. Intensive experiments on public datasets show that our proposed method significantly reduces the training time compared with recent state-of-the-art topologies while ensuring convergence and maintaining model accuracy.



1. INTRODUCTION

Federated learning entails training models via remote devices or siloed data centers while keeping data local to respect the user's privacy policy (Li et al., 2020a). According to Kairouz et al. (2019), there are two popular training scenarios: the cross-device scenario, which encompasses a large number (millions or even billions) of unreliable edge devices with limited computational capacity and slow connection speeds; and the cross-silo scenario, which involves only a few hundred reliable data silos with powerful computing resources and high-speed access links. Recently, the cross-silo scenario has become popular in different federated learning applications such as healthcare (Xu et al., 2021), robotics (Nguyen et al., 2021; Zhang et al., 2021c), medical imaging (Courtiol et al., 2019; Liu et al., 2021), and finance (Shingi, 2020). In practice, federated learning is a promising research direction where we can utilize the effectiveness of machine learning methods while respecting the user's privacy.

Key challenges in federated learning include model convergence, communication congestion, and the imbalance of data distributions across silos (Kairouz et al., 2019). A popular federated training method is to set up a central node that orchestrates the training process and aggregates the contributions of all clients. The main limitation of this client-server approach is that the server node potentially represents a communication congestion point in the system, especially when the number of clients is large. To overcome this limitation, recent research has investigated the decentralized (or peer-to-peer) federated learning approach, in which communication is done via a peer-to-peer topology without the need for a central node.
However, the main challenge of decentralized federated learning is to achieve fast training time while assuring model convergence and maintaining model accuracy. In federated learning, the communication topology plays an important role: a more efficient topology leads to quicker convergence and reduces the training time, as quantified by the worst-case convergence bounds in the topology design (Jiang et al., 2017; Nedić et al., 2018; Wang & Joshi, 2018).
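To make the congestion point of the client-server baseline concrete, the following minimal NumPy sketch shows the server-side step that a central orchestrator performs each round: a sample-size-weighted average of the client models (FedAvg-style). The function name `server_aggregate` and the toy parameters are illustrative, not from the paper.

```python
import numpy as np

def server_aggregate(client_params, client_sizes):
    """Weighted average of client model parameters (FedAvg-style).

    client_params: list of 1-D numpy arrays, one flattened model per silo.
    client_sizes: number of local training samples per silo (the weights).
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_params)          # shape: (num_silos, num_params)
    return (weights[:, None] * stacked).sum(axis=0)

# Every silo uploads its parameters to the single server: with N silos the
# server's access link carries N models per round, which is the congestion
# point that decentralized topologies avoid.
params = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 10, 20]
print(server_aggregate(params, sizes))  # → [3.5 4.5], weighted toward the larger silo
```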

2. LITERATURE REVIEW

Federated Learning. Federated learning has been regarded as a system capable of safeguarding data privacy (Konečnỳ et al., 2016; Gong et al., 2021; Zhang et al., 2021b; Li et al., 2021b). Different methods (Ma et al., 2022; Zhang et al., 2022; Liu et al., 2022; Elgabli et al., 2022) have been introduced to address the convergence and non-IID (non-identically and independently distributed) data problem. Despite its simplicity, the client-server approach suffers from communication and computational bottlenecks at the central node, especially when the number of clients is large (He et al., 2019; Qu et al., 2022).

Decentralized Federated Learning. Decentralized (or peer-to-peer) federated learning allows each data silo to interact with its neighbors directly without a central node (He et al., 2019). By its nature, decentralized federated learning does not suffer from communication congestion at a central node; however, optimizing a fully peer-to-peer network is a challenging task (Nedić & Olshevsky, 2014; Lian et al., 2017; He et al., 2018; Lian et al., 2018; Wang et al., 2019; Marfoq et al., 2020; 2021; Li et al., 2021a). Notably, decentralized periodic averaging stochastic gradient descent (Wang & Joshi, 2018) is proved to converge at a rate comparable to the centralized algorithm while allowing large-scale model training (Wu et al., 2017; Shen et al., 2018; Odeyomi & Zaruba, 2021). Recently, systematic analyses of decentralized federated learning have been presented by Li et al. (2018b); Ghosh et al. (2020); Koloskova et al. (2020).

Communication Topology. The topology has a direct impact on the complexity and convergence of federated learning (Chen et al., 2020). Many works have been introduced to improve the effectiveness of the topology design.
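The decentralized periodic averaging scheme cited above can be sketched in a few lines of NumPy: each silo runs several local update steps, then averages its parameters with its neighbors via a doubly stochastic mixing matrix. This is a simplified illustration, not the paper's algorithm; `ring_mixing_matrix`, `tau`, and the dummy local update are our own illustrative choices.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n silos:
    each node averages itself with its two neighbours (weight 1/3 each)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def periodic_averaging_round(params, W, local_update, tau=5):
    """One communication round of decentralized periodic averaging SGD:
    tau local update steps per node, then one mixing step params <- W @ params."""
    for _ in range(tau):
        params = np.stack([local_update(p) for p in params])
    return W @ params  # each row mixes only with its ring neighbours

n = 5
W = ring_mixing_matrix(n)
params = np.random.randn(n, 3)            # one 3-parameter "model" per silo
# a dummy local step standing in for local SGD on each silo's private data
params = periodic_averaging_round(params, W, lambda p: 0.9 * p, tau=2)
```

Because W is doubly stochastic, the mixing step preserves the average model across silos while gradually driving them toward consensus, which is the intuition behind the convergence guarantee.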



Figure 1: Comparison between different topologies on the FEMNIST dataset and the Exodus network (Miller et al., 2010). The accuracy and total wall-clock training time (or overhead time) are reported after 6,400 communication rounds.

Furthermore, topology design is directly related to other problems during the training process such as network congestion, the overall accuracy of the trained model, or energy usage (Yang et al., 2021; Nguyen et al., 2021; Kang et al., 2019). Designing a robust topology that can reduce the training time while maintaining model accuracy is still an open problem in federated learning (Kairouz et al., 2019). This paper aims to design a new topology for cross-silo federated learning, which is one of the most common training scenarios in practice.

Recently, different topologies have been proposed for cross-silo federated learning. In (Brandes, 2008), the STAR topology is designed, where the orchestrator averages all models throughout each communication round. Wang et al. (2019) propose MATCHA to decompose the set of possible communications into pairs of clients; at each communication round, they randomly select some pairs and allow them to transmit models. Marfoq et al. (2020) introduce the RING topology with the largest throughput using max-plus linear systems. While some progress has been made in the field, challenging problems remain, such as congestion at access links (Wang et al., 2019; Yang et al., 2021), the straggler effect (Neglia et al., 2019; Park et al., 2021), and the use of an identical topology in all communication rounds (Jiang et al., 2017; Marfoq et al., 2020).

In this paper, we propose a new multigraph topology based on the recent RING topology (Marfoq et al., 2020) to reduce the training time for cross-silo federated learning. Our method first constructs the multigraph based on the overlay of the RING topology. We then parse this multigraph into simple graphs (i.e., graphs with at most one edge between any two nodes). We call each simple graph a state of the multigraph. Each state may have isolated nodes, and these nodes can perform model aggregation without waiting for other nodes.
This strategy significantly reduces the cycle time in each communication round. To ensure model convergence, we also adapt and propose a new distributed learning algorithm. Intensive experiments show that our proposed topology significantly reduces the training time in cross-silo federated learning (see Figure 1).
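The parse-into-states idea can be sketched as follows. We represent the multigraph by the multiplicity of each undirected edge and let state k keep every edge whose multiplicity exceeds k, so each state is a simple graph and any node untouched by a state's edges is isolated there. This is a simplified illustration under our own conventions (the paper derives multiplicities from the RING overlay); `parse_multigraph` and the toy edge multiplicities are hypothetical.

```python
def parse_multigraph(nodes, multi_edges):
    """Split a multigraph into simple-graph 'states'.

    multi_edges maps an undirected edge (u, v) to its multiplicity in the
    multigraph. State k keeps every edge whose multiplicity exceeds k, so
    each state has at most one edge per node pair (a simple graph). A node
    with no edge in a state is isolated there: it can aggregate its model
    immediately instead of waiting for a transmission.
    """
    num_states = max(multi_edges.values())
    states = []
    for k in range(num_states):
        edges = [e for e, m in multi_edges.items() if m > k]
        touched = {u for e in edges for u in e}
        isolated = [v for v in nodes if v not in touched]
        states.append({"edges": edges, "isolated": isolated})
    return states

nodes = [0, 1, 2, 3]
multi_edges = {(0, 1): 1, (1, 2): 2, (2, 3): 1}   # edge (1, 2) appears twice
for state in parse_multigraph(nodes, multi_edges):
    # state 1 keeps only the repeated edge; nodes 0 and 3 are isolated there
    print(state)
```

The isolated nodes are exactly what shortens the cycle time: in a state where a node has no incident edge, it proceeds without blocking on any neighbor's transmission.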

