LEARNING CONTEXT-AWARE ADAPTIVE SOLVERS TO ACCELERATE CONVEX QUADRATIC PROGRAMMING

Abstract

Convex quadratic programming (QP) is an important sub-field of mathematical optimization. The alternating direction method of multipliers (ADMM) is a successful method for solving QPs. Even though ADMM shows promising results on various types of QPs, its convergence speed is known to be highly dependent on the step-size parameter ρ. Due to the absence of a general rule for setting ρ, it is often tuned manually or heuristically. In this paper, we propose CA-ADMM (Context-aware Adaptive ADMM), which learns to adaptively adjust ρ to accelerate ADMM. CA-ADMM extracts the spatio-temporal context, which captures the dependency of the primal and dual variables of the QP and their temporal evolution during the ADMM iterations, and chooses ρ based on the extracted context. Through extensive numerical experiments, we validate that CA-ADMM effectively generalizes to unseen QP problems of different sizes and classes (i.e., having different QP parameter structures). Furthermore, we verify that CA-ADMM dynamically adjusts ρ according to the stage of the optimization process to further accelerate convergence.

1. INTRODUCTION

Among optimization problem classes, the quadratic program (QP) is widely used due to its mathematical tractability (e.g., convexity) in various fields such as portfolio optimization (Boyd et al., 2017; Cornuéjols et al., 2018; Boyd et al., 2014; Markowitz, 1952), machine learning (Kecman et al., 2001; Sha et al., 2002), control (Buijs et al., 2002; Krupa et al., 2022; Bartlett et al., 2002), and communication applications (Luo & Yu, 2006; Hons, 2001). As the need to solve large optimization problems increases, it is becoming increasingly important to ensure the scalability of QP methods so that large problems can be solved accurately and quickly. Among solution methods for QP, first-order methods (Frank & Wolfe, 1956) owe their popularity to their efficiency advantage over other solution methods, such as active-set (Wolfe, 1959) and interior-point methods (Nesterov & Nemirovskii, 1994). The alternating direction method of multipliers (ADMM) (Gabay & Mercier, 1976; Mathematique et al., 2004) is commonly used because it returns high-quality solutions at relatively small computational expense (Stellato et al., 2020b). Even though ADMM shows satisfactory results in various applications, its convergence speed is highly dependent on both the QP parameters and the user-given step size ρ. In an attempt to resolve this issue, numerous studies have proposed heuristic (Boyd et al., 2011; He et al., 2000; Stellato et al., 2020a) or theory-driven (Ghadimi et al., 2014) methods for deciding ρ, but a strategy for selecting the best-performing ρ has yet to be found (Stellato et al., 2020b). Usually, ρ is tuned in a case-dependent manner (Boyd et al., 2011; Stellato et al., 2020a; Ghadimi et al., 2014). Instead of relying on hand-tuning ρ, a recent study (Ichnowski et al., 2021) utilizes reinforcement learning (RL) to learn a policy that adaptively adjusts ρ to accelerate the convergence of ADMM.
They model the iterative procedure of ADMM as a Markov decision process (MDP) and apply a generic RL method to train a policy that maps the current ADMM solution state to a scalar value of ρ. This approach is more effective than heuristic methods (e.g., OSQP (Stellato et al., 2020a)), but it has several limitations. It uses a scalar value of ρ and thus cannot adjust ρ differently for individual constraints. Moreover, it considers only the current state, without its history, when determining ρ, and therefore cannot capture the non-stationary aspects of ADMM iterations. This method inspired us to model the iterations of ADMM as a non-stationary networked system. In this study, we propose Context-aware Adaptive ADMM (CA-ADMM), an RL-based adaptive ADMM algorithm, to increase the convergence speed of ADMM. To overcome the aforementioned limitations, we model the iterative solution-finding process of ADMM as an MDP whose context is determined by the QP structure (or parameters). We then utilize a graph recurrent neural network (GRNN) to extract (1) the relationships among the primal and dual variables of the QP problem, i.e., its spatial context, and (2) the temporal evolution of the primal and dual variables, i.e., its temporal context. The policy network then utilizes the extracted spatio-temporal context to adjust ρ. Extensive numerical experiments verify that CA-ADMM adaptively adjusts ρ in consideration of the QP structure and the stage of the ADMM iterations to further accelerate convergence. We evaluated CA-ADMM on various QP benchmark datasets and found it to be significantly more efficient than heuristic and learning-based baselines in the number of iterations until convergence. CA-ADMM also generalizes remarkably well to changes in problem size and, more importantly, to different benchmark datasets.
Through ablation studies, we also confirmed that both the spatial and temporal context extraction schemes are crucial to learning a generalizable ρ policy. The contributions of the proposed method are summarized below:

• Spatial relationships: We propose a heterogeneous graph representation of the QP and ADMM state that captures spatial context, and we verify its effect on the generalization of the learned policy.

• Temporal relationships: We propose a temporal context extraction scheme that captures the relationship of ADMM states across iterations, and we verify its effect on the generalization of the learned policy.

• Performance/Generalization: CA-ADMM outperforms state-of-the-art heuristics (i.e., OSQP) and learning-based baselines on the training QP problems and, more importantly, on out-of-training QP problems, which include large problems from a variety of domains.

2. RELATED WORKS

Methods for selecting ρ of ADMM. In the ADMM algorithm, the step size ρ plays a vital role in determining the convergence speed and accuracy. For some special cases of QP, there is a method to compute the optimal ρ (Ghadimi et al., 2014). However, this method requires linear independence of the constraints (e.g., a full-rank constraint matrix A), which is difficult to guarantee in general QP problems. Thus, various heuristics have been proposed to choose ρ (Boyd et al., 2011; He et al., 2000; Stellato et al., 2020a). Typically, adaptive methods that utilize a state-dependent step size ρ_t show faster convergence than non-adaptive methods. He et al. (2000) and Boyd et al. (2011) suggest a rule for adjusting ρ_t depending on the ratio of the residuals. OSQP (Stellato et al., 2020a) extends this heuristic rule by adjusting ρ using the values of the primal and dual optimization variables. Even though OSQP shows improved performance, designing such adaptive rules requires tremendous effort. Furthermore, a rule designed for a specific QP problem class is hard to generalize to other QP classes with different sizes, objectives, and constraints. Recently, Ichnowski et al. (2021) employed RL to learn a policy for adaptively adjusting ρ depending on the states of the ADMM iterations. This method outperforms other baselines, showing that an effective rule for adjusting ρ can be learned from data without problem-specific knowledge. However, it still does not sufficiently reflect the structural characteristics of the QP or the temporal evolution of the ADMM iterations. Both limitations make it challenging to capture the proper problem context, limiting generalization to unseen problems of different sizes and with alternative objectives and constraints.

Graph neural networks for optimization problems. An optimization problem comprises an objective function, decision variables, and constraints. When the optimization variable is a vector, there are typically interactions among the components of the decision vector with respect to the objective or constraints. Thus, many studies have proposed graph representations to model such interactions in optimization problems. Gasse et al. (2019) express mixed integer programming (MIP) using a bipartite graph consisting of two node types, decision variable
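The residual-balancing rule of He et al. (2000) and Boyd et al. (2011) mentioned above fits in a few lines. The thresholds below (μ = 10, τ = 2) are the commonly cited defaults, used here as illustrative assumptions rather than values prescribed by this paper:

```python
def balance_rho(rho, r_prim, r_dual, mu=10.0, tau=2.0):
    """Residual-balancing step-size update (He et al., 2000; Boyd et al., 2011):
    increase rho when the primal residual dominates, decrease it when the
    dual residual dominates, otherwise leave rho unchanged."""
    if r_prim > mu * r_dual:
        return rho * tau   # push harder on primal feasibility
    if r_dual > mu * r_prim:
        return rho / tau   # relax rho to reduce the dual residual
    return rho

# The rule keeps r_prim and r_dual within roughly a factor of mu of each other.
assert balance_rho(1.0, r_prim=100.0, r_dual=1.0) == 2.0
assert balance_rho(1.0, r_prim=1.0, r_dual=100.0) == 0.5
assert balance_rho(1.0, r_prim=3.0, r_dual=1.0) == 1.0
```

OSQP's extension follows the same balancing idea but, among other refinements, scales ρ by the square root of the ratio of the (normalized) residuals rather than by a fixed factor τ.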


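As a concrete illustration of the bipartite representation attributed to Gasse et al. (2019), the sketch below builds variable nodes, constraint nodes, and coefficient-weighted edges from QP data of the form min ½xᵀPx + qᵀx subject to l ≤ Ax ≤ u. The specific node-feature choices here are assumptions made for this example, not the features used by Gasse et al. or by this paper:

```python
import numpy as np

def qp_to_bipartite(P, q, A, l, u):
    """Build a bipartite graph for the QP: min 0.5*x'Px + q'x, l <= Ax <= u.
    Variable node i carries (q[i], P[i, i]); constraint node j carries
    (l[j], u[j]); an edge (j, i) exists wherever A[j, i] != 0 and carries
    the coefficient A[j, i] as its feature. Feature choices are illustrative."""
    var_feats = np.stack([q, np.diag(P)], axis=1)   # shape (n_vars, 2)
    con_feats = np.stack([l, u], axis=1)            # shape (n_cons, 2)
    rows, cols = np.nonzero(A)                      # constraint idx, variable idx
    edges = [(int(j), int(i), float(A[j, i])) for j, i in zip(rows, cols)]
    return var_feats, con_feats, edges

# Small example: 2 variables, 2 constraints, 3 nonzeros in A.
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 0.0], [1.0, 1.0]])
l = np.array([0.0, 0.0])
u = np.array([1.0, 2.0])
var_feats, con_feats, edges = qp_to_bipartite(P, q, A, l, u)
print(len(edges))  # one edge per nonzero entry of A
```

A message-passing network over such a graph can then aggregate information across variable and constraint nodes, which is the kind of structural (spatial) context the paper's heterogeneous graph representation is designed to capture.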