PROMETHEUS: ENDOWING LOW SAMPLE AND COMMUNICATION COMPLEXITIES TO CONSTRAINED DECENTRALIZED STOCHASTIC BILEVEL LEARNING

Abstract

In recent years, constrained decentralized stochastic bilevel optimization has become increasingly important due to its versatility in modeling a wide range of multi-agent learning problems, such as multi-agent reinforcement learning and multi-agent meta-learning with safety constraints. However, one fundamental and under-explored challenge in constrained decentralized stochastic bilevel optimization is how to achieve low sample and communication complexities. If not addressed appropriately, this challenge could affect the long-term prospects of many emerging multi-agent learning paradigms that use decentralized bilevel optimization as a bedrock. In this paper, we investigate a class of constrained decentralized bilevel optimization problems, where multiple agents collectively solve a nonconvex-strongly-convex bilevel problem with constraints on the upper-level variables. Such problems arise naturally in many multi-agent reinforcement learning and meta-learning problems. We propose an algorithm called Prometheus (proximal tracked stochastic recursive estimator) that achieves the first $\mathcal{O}(\epsilon^{-1})$ results in both sample and communication complexities for constrained decentralized bilevel optimization, where $\epsilon > 0$ is a desired stationarity error. Collectively, the results in this work contribute to a theoretical foundation for constrained decentralized bilevel learning with low sample and communication complexities.

1. INTRODUCTION

In recent years, the problem of constrained decentralized bilevel optimization has attracted increasing attention due to its foundational role in many emerging multi-agent learning paradigms with safety or regularization constraints. Such applications include, but are not limited to, safety-constrained multi-agent reinforcement learning for autonomous driving (Bennajeh et al., 2019), sparsity-regularized multi-agent meta-learning (Poon & Peyré, 2021), and rank-constrained decentralized matrix completion for recommender systems (Pochmann & Von Zuben, 2022). As its name suggests, a defining feature of constrained decentralized bilevel optimization is that it is decentralized: the problem must be solved over a network without any coordination from a centralized server. As a result, all agents must rely on communications to reach a consensus on an optimal solution. Due to potentially unreliable network connections and the limited computation capability at each agent, such network-consensus approaches for constrained decentralized bilevel optimization typically call for low sample and communication complexities. To date, however, none of the existing works on sample- and communication-efficient decentralized bilevel optimization in the literature has considered domain constraints (see, e.g., Gao et al. (2022); Yang et al. (2022); Lu et al. (2022); Chen et al. (2022b) and Section 2 for detailed discussions). In light of the growing importance of constrained decentralized bilevel optimization, our goal in this paper is to fill this gap by developing sample- and communication-efficient consensus-based algorithms that can effectively handle domain constraints.

Specifically, this paper focuses on a class of constrained decentralized multi-task bilevel optimization problems, where we aim to solve a decentralized nonconvex-strongly-convex bilevel optimization problem with i) multiple lower-level problems and ii) consensus and domain constraints on the upper level. Such problems naturally arise in security-constrained bilevel models for integrated natural gas and electricity systems (Li et al., 2017), multi-agent actor-critic reinforcement learning (Zhang et al., 2020), and constrained meta-learning (Liu et al., 2019). In the optimization literature, a natural

