LEARNING CONTEXT-AWARE ADAPTIVE SOLVERS TO ACCELERATE CONVEX QUADRATIC PROGRAMMING

Abstract

Convex quadratic programming (QP) is an important subfield of mathematical optimization. The alternating direction method of multipliers (ADMM) is a successful method for solving QPs. Although ADMM shows promising results on various types of QPs, its convergence speed is known to depend heavily on the step-size parameter ρ. In the absence of a general rule for setting ρ, it is often tuned manually or heuristically. In this paper, we propose CA-ADMM (Context-aware Adaptive ADMM), which learns to adaptively adjust ρ to accelerate ADMM. CA-ADMM extracts a spatio-temporal context that captures the dependency between the primal and dual variables of a QP and their temporal evolution over the ADMM iterations, and chooses ρ based on the extracted context. Through extensive numerical experiments, we validate that CA-ADMM generalizes effectively to unseen QP problems of different sizes and classes (i.e., with different QP parameter structures). Furthermore, we verify that CA-ADMM dynamically adjusts ρ according to the stage of the optimization process, further accelerating convergence.

1. INTRODUCTION

Among optimization problem classes, the quadratic program (QP) is widely used thanks to its mathematical tractability (e.g., convexity) in fields such as portfolio optimization (Boyd et al., 2017; Cornuéjols et al., 2018; Boyd et al., 2014; Markowitz, 1952), machine learning (Kecman et al., 2001; Sha et al., 2002), control (Buijs et al., 2002; Krupa et al., 2022; Bartlett et al., 2002), and communication applications (Luo & Yu, 2006; Hons, 2001). As the need to solve large optimization problems grows, ensuring the scalability of QP solvers, so that large problems can be solved accurately and quickly, is becoming increasingly important. Among solution methods for QP, first-order methods (Frank & Wolfe, 1956) owe their popularity to their efficiency advantage over alternatives such as active-set (Wolfe, 1959) and interior-point methods (Nesterov & Nemirovskii, 1994). The alternating direction method of multipliers (ADMM) (Gabay & Mercier, 1976; Mathematique et al., 2004) is commonly used because it returns high-quality solutions at relatively small computational expense (Stellato et al., 2020b). Although ADMM shows satisfactory results in various applications, its convergence speed depends strongly on both the QP parameters and the user-given step size ρ. To address this issue, numerous studies have proposed heuristic (Boyd et al., 2011; He et al., 2000; Stellato et al., 2020a) or theory-driven (Ghadimi et al., 2014) rules for choosing ρ, but a strategy for selecting the best-performing ρ has yet to be found (Stellato et al., 2020b); in practice, ρ is usually tuned in a case-dependent manner (Boyd et al., 2011; Stellato et al., 2020a; Ghadimi et al., 2014). Instead of relying on hand-tuning ρ, a recent study (Ichnowski et al., 2021) uses reinforcement learning (RL) to learn a policy that adaptively adjusts ρ to accelerate the convergence of ADMM.
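To make the role of ρ concrete, the following is a minimal numpy sketch of an ADMM iteration for the QP min ½xᵀPx + qᵀx subject to l ≤ Ax ≤ u, in the splitting style popularized by OSQP (Stellato et al., 2020a). It is an illustrative simplification, not the OSQP implementation: the function name, the fixed iteration budget, and the simple doubling/halving rule are all assumptions. The doubling/halving step at the end is one example of the residual-balancing heuristics (Boyd et al., 2011; He et al., 2000) that this paper seeks to replace with a learned policy.

```python
import numpy as np

def admm_qp(P, q, A, l, u, rho=1.0, sigma=1e-6, iters=1000, tol=1e-4):
    """Illustrative ADMM for: min 0.5 x'Px + q'x  s.t.  l <= Ax <= u."""
    n, m = P.shape[0], A.shape[0]
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(iters):
        # x-update: solve the regularized linear system
        # (P + sigma*I + rho*A'A) x = sigma*x_prev - q + A'(rho*z - y)
        K = P + sigma * np.eye(n) + rho * A.T @ A
        x = np.linalg.solve(K, sigma * x - q + A.T @ (rho * z - y))
        # z-update: project onto the constraint box [l, u]
        z = np.clip(A @ x + y / rho, l, u)
        # dual (y) update
        y = y + rho * (A @ x - z)
        # primal and dual residuals
        r_prim = np.linalg.norm(A @ x - z, np.inf)
        r_dual = np.linalg.norm(P @ x + q + A.T @ y, np.inf)
        if r_prim < tol and r_dual < tol:
            break
        # heuristic residual-balancing rho update (illustrative rule):
        # grow rho when the primal residual dominates, shrink otherwise
        if r_prim > 10 * r_dual:
            rho *= 2.0
        elif r_dual > 10 * r_prim:
            rho /= 2.0
    return x, rho

# Tiny example QP; the optimum is x* = (0.3, 0.7).
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])
x, rho_final = admm_qp(P, q, A, l, u)
```

Even on this two-variable problem, the number of iterations to reach a given tolerance changes noticeably with the initial ρ and the update rule, which is the sensitivity that motivates learning ρ adaptively.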
They model the iterative procedure of ADMM as a Markov decision process (MDP) and apply a generic RL method to train a policy that maps the current ADMM solution state to a scalar value of ρ. This approach is more effective than heuristic methods (e.g., OSQP (Stellato et al., 2020a)), but it has several limitations. First, it uses a single scalar ρ and therefore cannot adjust the step size differently for individual constraints. Second, it determines ρ from the current state alone, without its history, and therefore cannot capture the non-stationary aspects of ADMM iterations. This method inspired us to model the iterations of ADMM as a non-stationary networked system. We

