ON REPRESENTING LINEAR PROGRAMS BY GRAPH NEURAL NETWORKS

Abstract

Learning to optimize is a rapidly growing area that aims to solve optimization problems or improve existing optimization algorithms using machine learning (ML). In particular, the graph neural network (GNN) is considered a suitable ML model for optimization problems whose variables and constraints are permutation-invariant, for example, the linear program (LP). While the literature has reported encouraging numerical results, this paper establishes the theoretical foundation for applying GNNs to solving LPs. Given any size limit on LPs, we construct a GNN that maps different LPs to different outputs. We show that properly built GNNs can reliably predict feasibility, boundedness, and an optimal solution for each LP in a broad class. Our proofs are based upon the recently discovered connections between the Weisfeiler-Lehman isomorphism test and the GNN. To validate our results, we train a simple GNN and present its accuracy in mapping LPs to their feasibility and solutions.

1. INTRODUCTION

Applying machine learning (ML) techniques to accelerate optimization, also known as Learning to Optimize (L2O), is attracting increasing attention. It has been reported that L2O shows great potential on both continuous optimization (Monga et al., 2021; Chen et al., 2021; Amos, 2022) and combinatorial optimization (Bengio et al., 2021; Mazyavkina et al., 2021). Many L2O works train a parameterized model that takes an optimization problem as input and outputs information useful to classic algorithms, such as a good initial solution or branching decisions (Nair et al., 2020), and some even directly generate an approximate optimal solution (Gregor & LeCun, 2010). In these works, one builds an ML model to approximate the mapping from an explicit optimization instance either to its key properties or directly to its solution. The ability to achieve accurate approximation is called the representation power or expressive power of the model. When the approximation is accurate, the model can solve the problem or provide useful information to guide an optimization algorithm.

This paper addresses a fundamental but open theoretical problem for linear programming (LP):

Which neural network can represent LP and predict its key properties and solution? (P0)

To clarify, by solution we mean the optimal solution. Let us also remark that this question is not only of theoretical interest. Although current neural network models may not be powerful enough to replace mathematically grounded LP solvers and obtain an exact LP solution, they are still useful in helping LP solvers from several perspectives, including warm-starting and configuration. This requires that neural networks have sufficient power to recognize key characteristics of LPs.
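As a concrete point of reference for what such a model must predict, the three possible outcomes of an LP (infeasible, unbounded, or feasible and bounded) can be obtained from a classical solver. The sketch below uses SciPy's `linprog` purely as an illustrative oracle; neither the function `lp_outcome` nor the toy instances are part of this work.

```python
from scipy.optimize import linprog

def lp_outcome(c, A_ub, b_ub, bounds):
    """Classify an LP  min c^T x  s.t.  A_ub x <= b_ub,  bounds on x,
    into one of the three possible cases."""
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if res.status == 2:            # HiGHS status code: infeasible
        return "infeasible", None
    if res.status == 3:            # HiGHS status code: unbounded
        return "unbounded", None
    return "feasible and bounded", res.x

# Feasible and bounded: min -x1 - x2  s.t.  x1 + x2 <= 1,  x >= 0
print(lp_outcome([-1, -1], [[1, 1]], [1], [(0, None), (0, None)])[0])
# Infeasible: x1 <= -1 contradicts x1 >= 0
print(lp_outcome([1], [[1]], [-1], [(0, None)])[0])
# Unbounded: min -x1 with x1 >= 0 and no constraint limiting x1 from above
print(lp_outcome([-1], [[0]], [0], [(0, None)])[0])
```

A learned model that approximates this mapping well could then warm-start the solver instead of merely reproducing it.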
Some very recent papers (Deka & Misra, 2019; Pan et al., 2020; Chen et al., 2022) on DC optimal power flow (DC-OPF), an important type of LP, experimentally show the possibility of quickly approximating LP solutions with deep neural networks. Practitioners may initialize an LP solver with those approximated solutions. We hope the answer to (P0) paves the way toward answering this question for other optimization types.

Linear Programming (LP). LP is an important type of optimization problem with a wide range of applications, such as scheduling (Hanssmann & Hess, 1960), signal processing (Candes & Tao, 2005), machine learning (Dedieu et al., 2022), etc. A general LP problem is defined as

min_{x ∈ R^n} c^⊤ x,  s.t. Ax ◦ b, l ≤ x ≤ u,  (1.1)

where A ∈ R^{m×n}, c ∈ R^n, b ∈ R^m, l ∈ (R ∪ {-∞})^n, u ∈ (R ∪ {+∞})^n, and ◦ ∈ {≤, =, ≥}^m. Any LP problem must fall into one of the following three cases (Bertsimas & Tsitsiklis):

• Infeasible. The feasible set X_F := {x ∈ R^n : Ax ◦ b, l ≤ x ≤ u} is empty. In other words, there is no point in R^n that satisfies the constraints of LP (1.1).
• Unbounded. The feasible set is non-empty, but the objective value can be arbitrarily good, i.e., unbounded from below: for any R > 0, there exists an x ∈ X_F such that c^⊤ x < -R.
• Feasible and bounded. There exists x* ∈ X_F such that c^⊤ x* ≤ c^⊤ x for all x ∈ X_F. Such an x* is called an optimal solution, and c^⊤ x* is the optimal objective value.

Thus, considering (P0), an ideal ML model is expected to predict the three key characteristics of an LP: feasibility, boundedness, and one of its optimal solutions (if one exists), taking the LP features (A, b, c, l, u, ◦) as input. Such input has a strong mathematical structure. If we swap the i-th and j-th variables in (1.1), the corresponding entries of c, l, u and the columns of A are reordered; similarly, swapping two constraints reorders the entries of b, ◦ and the rows of A. The reordered features (Â, b̂, ĉ, l̂, û, ◦̂) represent an LP problem exactly equivalent to the original one (A, b, c, l, u, ◦). This property is called permutation invariance.

If we do not explicitly restrict ML models to a permutation-invariant structure, the models may overfit to the variable/constraint orders of the instances in the training set. Motivated by this point, we adopt Graph Neural Networks (GNNs), which are naturally permutation invariant.

GNN in L2O. GNN is a type of neural network defined on graphs and widely applied in many areas, for example, recommender systems, traffic, chemistry, etc. (Wu et al., 2020; Zhou et al., 2020). Accelerating optimization solvers with GNNs has attracted rising interest recently (Peng et al., 2021; Cappart et al., 2021). Many graph-related optimization problems, such as minimum vertex cover, the traveling salesman problem, and vehicle routing, can be represented and solved approximately with GNNs due to their problem structures (Khalil et al., 2017; Kool et al., 2019; Joshi et al., 2019; Drori et al., 2020). Besides that, one may solve a general LP or mixed-integer linear program (MILP) with the help of GNNs. Gasse et al. (2019) proposed to represent an MILP with a bipartite graph and to apply a GNN on this graph to guide an MILP solver. Ding et al. (2020) proposed a tripartite graph to represent MILP. Since then, many approaches have been proposed to guide MILP or LP solvers with GNNs (Nair et al., 2020; Gupta et al., 2020; 2022; Shen et al., 2021; Khalil et al., 2022; Liu et al., 2022; Paulus et al., 2022; Qu et al., 2022; Li et al., 2022). Although encouraging empirical results have been observed, theoretical foundations for this approach are still lacking. Specifying (P0), we ask:

Are there GNNs that can predict the feasibility, boundedness and an optimal solution of LP? (P1)

Related works and contributions. To answer (P1), one needs the theory of separation power and representation power. Separation power of a neural network (NN) means its ability to distinguish two different inputs. In our setting, a NN with strong separation power outputs different results when applied to any two different LPs. Representation power of a NN means its ability to approximate functions of interest. The theory of representation power is established upon the separation power. Only functions with strong enough separation power may possess strong

* A major part of the work of Z. Chen was completed during his internship at Alibaba US DAMO Academy. † Corresponding author.
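To make the bipartite encoding of Gasse et al. (2019) concrete, the sketch below builds such a graph from the LP data (A, b, c, l, u, ◦): one node per constraint, one node per variable, and an edge for every nonzero coefficient of A. The specific node and edge features chosen here (and the helper name `lp_to_bipartite`) are illustrative assumptions, not the exact design of that paper.

```python
import numpy as np

def lp_to_bipartite(A, b, c, l, u, rel):
    """Encode the LP data of (1.1) as a bipartite graph: constraint nodes
    carry (b_i, relation), variable nodes carry (c_j, l_j, u_j), and an
    edge (i, j) with weight A[i, j] exists for each nonzero coefficient."""
    m, n = A.shape
    cons_feats = [(b[i], rel[i]) for i in range(m)]       # constraint side
    var_feats = [(c[j], l[j], u[j]) for j in range(n)]    # variable side
    edges = {(i, j): A[i, j]
             for i in range(m) for j in range(n) if A[i, j] != 0}
    return cons_feats, var_feats, edges

A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
cons, var, edges = lp_to_bipartite(
    A, b=[1.0, 4.0], c=[-1.0, -1.0],
    l=[0.0, 0.0], u=[np.inf, np.inf], rel=["<=", "<="])
print(sorted(edges))   # zero entries of A produce no edge: [(0, 0), (0, 1), (1, 1)]
```

Because edges exist only for nonzero coefficients, the graph inherits the sparsity of A, which is what makes message passing on large LPs affordable.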


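The permutation invariance discussed above can also be checked numerically: permuting the variables of an LP permutes the entries of c, l, u and the columns of A, but leaves the optimal objective value unchanged. A minimal sketch, where SciPy's `linprog` serves only as a reference solver and the random instance is an illustrative construction:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 5, 8
A = rng.normal(size=(m, n))
b = np.abs(rng.normal(size=m)) + 0.1   # b > 0, so x = 0 is feasible
c = rng.normal(size=n)
u = rng.uniform(0.5, 2.0, size=n)      # finite box keeps the LP bounded
bounds = [(0.0, u[j]) for j in range(n)]

perm = rng.permutation(n)              # relabel the variables
res = linprog(c, A_ub=A, b_ub=b, bounds=bounds, method="highs")
res_p = linprog(c[perm], A_ub=A[:, perm], b_ub=b,
                bounds=[bounds[j] for j in perm], method="highs")

# The two problems are the same LP up to variable relabeling,
# so their optimal objective values coincide.
print(np.isclose(res.fun, res_p.fun))   # True
```

A model without a built-in permutation-invariant structure would have to learn this equivalence from data; a GNN gets it for free.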