DC3: A LEARNING METHOD FOR OPTIMIZATION WITH HARD CONSTRAINTS

Abstract

Large optimization problems with hard constraints arise in many settings, yet classical solvers are often prohibitively slow, motivating the use of deep networks as cheap "approximate solvers." Unfortunately, naive deep learning approaches typically cannot enforce the hard constraints of such problems, leading to infeasible solutions. In this work, we present Deep Constraint Completion and Correction (DC3), an algorithm to address this challenge. Specifically, this method enforces feasibility via a differentiable procedure, which implicitly completes partial solutions to satisfy equality constraints and unrolls gradient-based corrections to satisfy inequality constraints. We demonstrate the effectiveness of DC3 in both synthetic optimization tasks and the real-world setting of AC optimal power flow, where hard constraints encode the physics of the electrical grid. In both cases, DC3 achieves near-optimal objective values while preserving feasibility.

1. INTRODUCTION

Traditional approaches to constrained optimization are often expensive to run for large problems, necessitating the use of function approximators. Neural networks are highly expressive and fast to evaluate, making them natural candidates for this role. However, while deep learning has proven its power in unconstrained settings, it has struggled to perform well in domains where it is necessary to satisfy hard constraints at test time. For example, in power systems, weather and climate models, materials science, and many other areas, data follows well-known physical laws, and violating these laws can yield answers that are unhelpful or even nonsensical. There is thus a need for fast neural network approximators that can operate in settings where traditional optimizers are slow (such as non-convex optimization), yet where strict feasibility criteria must be satisfied.

In this work, we introduce Deep Constraint Completion and Correction (DC3), a framework for applying deep learning to optimization problems with hard constraints. Our approach embeds differentiable operations into the training of the neural network to ensure feasibility. Specifically, the network outputs a partial set of variables with codimension equal to the number of equality constraints, and "completes" this partial set into a full solution. This completion process guarantees feasibility with respect to the equality constraints and is differentiable (either explicitly, or via the implicit function theorem). We then fix any violations of the inequality constraints via a differentiable correction procedure based on gradient descent. Together, completion and correction enforce feasibility with respect to all constraints, and the combined process is fully differentiable and can be incorporated into standard deep learning methods.

Our key contributions are:
• Framework for incorporating hard constraints. We describe a general framework, DC3, for incorporating (potentially non-convex) equality and inequality constraints into deep-learning-based optimization algorithms.
• Practical demonstration of feasibility. We implement the DC3 algorithm in both convex and non-convex optimization settings, and demonstrate that it produces approximate solutions with significantly better feasibility than other deep learning approaches, while maintaining near-optimality of the solution.
• AC optimal power flow. We show how the general DC3 framework can be used to optimize power flows on the electrical grid. This difficult non-convex optimization task must be solved at scale and is especially critical for renewable energy adoption. Our results greatly improve upon the performance of general-purpose deep learning methods on this task.
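The completion and correction steps described above can be illustrated on a toy linearly constrained problem. The sketch below is ours, not the paper's implementation: the function names, the toy problem (x1 + x2 + x3 = 1, x ≥ 0), and the hand-derived chain rule through the completion step are all illustrative assumptions; the actual DC3 method wraps these steps around a trained neural network.

```python
import numpy as np

def complete(z, A_z, A_y_inv, b):
    """Complete partial variables z so the equality constraints A x = b hold exactly."""
    y = A_y_inv @ (b - A_z @ z)
    return np.concatenate([z, y])

def correct(z, A_z, A_y_inv, b, G, h, lr=0.1, steps=50):
    """Gradient-based correction: take steps on the squared inequality violation of
    the completed solution, differentiating through completion (chain rule by hand)."""
    m = len(z)
    for _ in range(steps):
        x = complete(z, A_z, A_y_inv, b)
        viol = np.maximum(G @ x - h, 0.0)        # active inequality violations
        grad_x = G.T @ viol                      # gradient of 0.5 * ||viol||^2 w.r.t. x
        # pull the gradient back to z; completion gives dy/dz = -A_y_inv @ A_z
        dz = grad_x[:m] - (A_y_inv @ A_z).T @ grad_x[m:]
        z = z - lr * dz
    return z

# Toy problem: x1 + x2 + x3 = 1 (equality), x >= 0 (inequalities).
A = np.array([[1.0, 1.0, 1.0]])
A_z, A_y_inv = A[:, :2], np.linalg.inv(A[:, 2:])
b = np.array([1.0])
G, h = -np.eye(3), np.zeros(3)

z = correct(np.array([0.6, 0.6]), A_z, A_y_inv, b, G, h)  # initial completion gives x3 = -0.2
x = complete(z, A_z, A_y_inv, b)  # equality holds by construction; correction fixes x >= 0
```

Note that the equality constraints are satisfied exactly at every iterate, while the inequality violations shrink under the unrolled gradient steps; in DC3 these same steps are unrolled inside the training loss so the network learns to output partial solutions that are easy to correct.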

2. RELATED WORK

Our approach is situated within the broader literature on fast optimization methods, and draws inspiration from literature on implicit layers and on incorporating constraints into neural networks. We briefly describe each of these areas and their relationship to the present work.

Fast optimization methods. Many classical optimization methods have been proposed to improve the practical efficiency of solving optimization problems. These include general techniques such as constraint and variable elimination (i.e., the removal of non-active constraints or redundant variables, respectively), as well as problem-specific techniques (e.g., KKT factorization techniques in the case of convex quadratic programs) (Nocedal & Wright, 2006). Our present work builds upon aspects of this literature, applying concepts from variable elimination to reduce the number of degrees of freedom associated with the optimization problems we wish to solve. In addition to the classical optimization literature, there has been a large body of literature in deep learning that has sought to approximate or speed up optimization models. As described in reviews on topics such as combinatorial optimization (Bengio et al., 2020), these approaches range from training networks to output solutions directly to using learning to guide or accelerate classical solvers. We view our work as part of the former set of approaches, but drawing important inspiration from the latter: that employing structural knowledge about the optimization model is paramount to achieving both feasibility and optimality.

Constraints in neural networks. While deep learning is often thought of as wholly unconstrained, in reality, it is quite common to incorporate (simple) constraints within deep learning procedures. For instance, softmax layers encode simplex constraints, sigmoids instantiate upper and lower bounds, ReLUs encode projections onto the positive orthant, and convolutional layers enforce translational equivariance (an idea taken further in general group-equivariant networks (Cohen & Welling, 2016)).
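These simple cases are easy to verify directly. A minimal NumPy illustration (the example values are ours) confirms that each of these standard layers enforces its stated constraint by construction:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())           # shift for numerical stability
    return e / e.sum()

v = np.array([2.0, -1.0, 0.5])
p = softmax(v)                        # on the probability simplex: p >= 0, sum(p) = 1
s = 1.0 / (1.0 + np.exp(-v))          # sigmoid: each entry strictly in (0, 1)
r = np.maximum(v, 0.0)                # ReLU: projection onto the nonnegative orthant
```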
Recent work has also focused on embedding specialized kinds of constraints into neural networks, such as conservation of energy (see, e.g., Greydanus et al. (2019) and Beucler et al. (2019)) and homogeneous linear inequality constraints (Frerix et al., 2020). However, while these represent common "special cases," there has to date been little work on building more general hard constraints into deep learning models.

Implicit layers. In recent years, there has been a great deal of interest in creating structured neural network layers that define implicit relationships between their inputs and outputs. For instance, such layers have been created for SAT solving (Wang et al., 2019) and for various classes of optimization problems (Amos & Kolter, 2017; Donti et al., 2017; Djolonga & Krause, 2017; Tschiatschek et al., 2018; Wilder et al., 2018; Gould et al., 2019). (Interestingly, softmax, sigmoid, and ReLU layers can also be viewed as implicit layers (Amos, 2019), though in practice it is more efficient to use their explicit form.) In principle, such approaches could be used to directly enforce constraints within neural network settings, e.g., by projecting neural network outputs onto a constraint set using quadratic programming layers (Amos & Kolter, 2017) in the case of linear constraints, or convex optimization layers (Agrawal et al., 2019) in the case of general convex constraints. However, given the computational expense of such optimization layers, these projection-based approaches are likely to be inefficient. Instead, our approach leverages insights from this line of work by using implicit differentiation to backpropagate through the "completion" of equality constraints in cases where these constraints cannot be solved explicitly (such as in AC optimal power flow).
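Implicit differentiation through a completion step can be illustrated on a toy scalar example. The constraint below (points on the unit circle) and the function names are our own illustrative choices, not the paper's AC power flow equations; the point is only that the gradient of the completed variable follows from the implicit function theorem, without differentiating through a solver:

```python
import numpy as np

# A nonlinear equality constraint h(z, y) = 0 implicitly defines the completion y = phi(z).
def h(z, y):
    return z**2 + y**2 - 1.0          # toy constraint: points on the unit circle

def complete(z):
    return np.sqrt(1.0 - z**2)        # solves h(z, y) = 0 on the branch with y > 0

def implicit_grad(z, y):
    # Implicit function theorem: dphi/dz = -(dh/dy)^{-1} (dh/dz)
    return -(2.0 * z) / (2.0 * y)

z = 0.6
y = complete(z)                       # h(z, y) = 0 by construction
g = implicit_grad(z, y)               # equals -z / sqrt(1 - z^2) = -0.75 here
```

In DC3, the same formula (in its multivariate form, with a Jacobian solve in place of the scalar division) lets gradients flow from the completed variables back to the network outputs even when, as in AC optimal power flow, the completion itself must be computed numerically.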

