ON REPRESENTING MIXED-INTEGER LINEAR PROGRAMS BY GRAPH NEURAL NETWORKS

Abstract

While mixed-integer linear programming (MILP) is NP-hard in general, practical MILP solvers have achieved roughly a 100-fold speedup over the past twenty years. Still, many classes of MILPs quickly become unsolvable as their sizes grow, motivating researchers to seek new acceleration techniques for MILPs. With deep learning, they have obtained strong empirical results, many of which were achieved by applying graph neural networks (GNNs) to decisions at various stages of the MILP solution process. This work uncovers a fundamental limitation: there exist feasible and infeasible MILPs that all GNNs treat identically, indicating that GNNs lack the power to express general MILPs. We then show that, by restricting the MILPs to unfoldable ones or by adding random features, there exist GNNs that can reliably predict MILP feasibility, optimal objective values, and optimal solutions up to prescribed precision. We conducted small-scale numerical experiments to validate our theoretical findings.

1. INTRODUCTION

Mixed-integer linear programming (MILP) is a class of optimization problems that minimize a linear objective function subject to linear constraints, where some or all variables must take integer values. MILP has a wide range of applications, such as transportation (Schouwenaars et al., 2001), control (Richards & How, 2005), and scheduling (Floudas & Lin, 2005). Branch and Bound (B&B) (Land & Doig, 1960), an algorithm widely adopted in modern solvers that solves general MILPs exactly to global optimality, unfortunately has exponential time complexity in the worst case. To make MILP more practical, researchers analyze the features of each instance of interest based on their domain knowledge, and use such features to adaptively warm-start B&B or to design the heuristics in B&B. To automate this laborious process, researchers have turned to machine learning (ML) techniques in recent years (Bengio et al., 2021). The literature reports encouraging findings that a properly chosen ML model is able to learn useful knowledge of MILP from data and generalize well to similar but unseen instances. For example, one can learn fast approximations of Strong Branching, an effective but time-consuming branching strategy usually used in B&B (Alvarez et al., 2014; Khalil et al., 2016; Zarpellon et al., 2021; Lin et al., 2022). One may also learn cutting strategies (Tang et al., 2020; Berthold et al., 2022; Huang et al., 2022), node selection/pruning strategies (He et al., 2014; Yilmaz & Yorke-Smith, 2020), or decomposition strategies (Song et al., 2020) with ML models. The role of ML models in these approaches can be summarized as approximating useful mappings or parameterizing key strategies in MILP solvers; these mappings/strategies usually take an MILP instance as input and output its key properties.
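Concretely, the problems described above can be written in a standard form such as the following (the symbols A, b, c and the integer index set I are introduced here purely for illustration; formulations in this line of work may also carry variable bounds and equality or ≥ constraints):

```latex
\min_{x \in \mathbb{R}^n} \; c^{\top} x
\quad \text{s.t.} \quad A x \le b, \qquad
x_j \in \mathbb{Z} \ \ \text{for all } j \in I \subseteq \{1, \dots, n\},
```

where A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n, and I indexes the integer-constrained variables; the problem is a pure LP when I is empty.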
The graph neural network (GNN), thanks to properties such as permutation invariance, is considered a suitable model to represent such mappings/strategies for MILP. More specifically, permuting the variables or constraints of an MILP does not essentially change the problem itself; reliable ML models such as GNNs should satisfy the same invariance, otherwise the model may overfit to the variable/constraint orders in the training data. Gasse et al. (2019) proposed that an MILP can be encoded into a bipartite graph on which one can use a GNN to approximate Strong Branching. Ding et al. (2020) proposed to represent an MILP with a tripartite graph. Since then, GNNs have been adopted to represent mappings/strategies for MILP, for example, approximating Strong Branching (Gupta et al., 2020; Nair et al., 2020; Shen et al., 2021; Gupta et al., 2022), approximating optimal solutions (Nair et al., 2020; Khalil et al., 2022), parameterizing cutting strategies (Paulus et al., 2022), and parameterizing branching strategies (Qu et al., 2022; Scavuzzo et al., 2022). However, the theoretical foundations of this direction remain unclear. A key problem is the ability of GNNs to approximate important mappings related to MILP. We ask the following questions:

(Q1) Can a GNN predict whether an MILP is feasible?
(Q2) Can a GNN approximate the optimal objective value of an MILP?
(Q3) Can a GNN approximate an optimal solution of an MILP?

To answer questions (Q1 - Q3), one needs theories of the separation power and representation power of GNNs. The separation power of a GNN is measured by whether it can distinguish two non-isomorphic graphs. Given two graphs G_1 and G_2, we say a mapping F (e.g., a GNN) has strong separation power if F(G_1) ≠ F(G_2) whenever G_1 and G_2 are not isomorphic. In our setting, since MILPs are represented by graphs, separation power indicates the ability of a GNN to distinguish two different MILP instances.
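To make the bipartite encoding concrete, the sketch below builds such a graph from the data (A, b, c) of an MILP in the spirit of Gasse et al. (2019): one node per constraint, one node per variable, and an edge for each nonzero coefficient. The specific node features chosen here (right-hand side, objective coefficient, integrality flag) are illustrative only, not the exact feature set used in that work.

```python
import numpy as np

def milp_to_bipartite(A, b, c, is_integer):
    """Encode an MILP  min c^T x  s.t.  Ax <= b  as a bipartite graph.

    Constraint node i carries b[i]; variable node j carries
    (c[j], integrality flag); edge (i, j) exists iff A[i, j] != 0
    and carries the coefficient A[i, j] as its feature.
    Feature choices here are illustrative, not canonical.
    """
    m, n = A.shape
    con_feats = [(float(b[i]),) for i in range(m)]
    var_feats = [(float(c[j]), int(is_integer[j])) for j in range(n)]
    edges = {(i, j): float(A[i, j])
             for i in range(m) for j in range(n) if A[i, j] != 0}
    return con_feats, var_feats, edges

# Tiny example:  min x1 - x2  s.t.  x1 + 2 x2 <= 4,  3 x2 <= 5,  x1 integer.
A = np.array([[1.0, 2.0], [0.0, 3.0]])
b = np.array([4.0, 5.0])
c = np.array([1.0, -1.0])
cons, vars_, edges = milp_to_bipartite(A, b, c, is_integer=[1, 0])
```

A GNN then alternates message passing between the two node sides; because the graph is unordered, permuting variables or constraints permutes nodes without changing the GNN's output, which is exactly the invariance discussed above.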
The representation power of a GNN refers to how well it can approximate mappings with permutation-invariant properties. In our setting, we study whether a GNN can map an MILP to its feasibility, optimal objective value, and an optimal solution. The separation power and representation power of GNNs are closely related to the Weisfeiler-Lehman (WL) test (Weisfeiler & Leman, 1968), a classical algorithm for identifying whether two graphs are isomorphic. It has been shown that GNNs have the same separation power as the WL test (Xu et al., 2019), and, based on this result, that GNNs can universally approximate continuous graph-input mappings whose separation power is no stronger than the WL test (Azizian & Lelarge, 2021; Geerts & Reutter, 2022).

Our contributions With the above tools in hand, one still cannot directly answer questions (Q1 - Q3), since the relationship between characteristics of general MILPs and properties of graphs is not yet clear. Although some works study the representation power of GNNs on graph-related optimization problems (Sato et al., 2019; Loukas, 2020) and linear programming (Chen et al., 2022), representing general MILPs with GNNs has, to the best of our knowledge, not been theoretically studied. Our contributions are listed below:

• (Limitation of GNNs for MILP) We show with an example that GNNs do not have strong enough separation power to distinguish any two different MILP instances. There exist two MILPs such that one is feasible while the other is not, but, unfortunately, all GNNs treat them identically, failing to detect the essential difference between them. In fact, there are infinitely many pairs of MILP instances that can puzzle GNNs.

• (Foldable and unfoldable MILPs) We provide a precise mathematical description of the type of MILPs that makes GNNs fail. These hard MILP instances are named foldable MILPs.
We prove that, for unfoldable MILPs, GNNs have strong enough separation power and representation power to approximate the feasibility, the optimal objective value, and an optimal solution.

• (MILP with random features) To handle foldable MILPs, we propose appending random features to the MILP-induced graphs. We prove that, with this random-feature technique, the answers to questions (Q1 - Q3) are affirmative.
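The random-feature idea above can be sketched as follows: each node of the MILP-induced graph receives an extra i.i.d. random coordinate, which breaks the symmetry that makes foldable instances indistinguishable to GNNs. The function name, the uniform distribution, and the feature layout are assumptions for illustration, not the construction as specified in the paper.

```python
import numpy as np

def append_random_features(var_feats, con_feats, seed=None, dim=1):
    """Append `dim` i.i.d. Uniform[0,1) features to every node.

    var_feats: array of shape (n, d_v) for variable nodes;
    con_feats: array of shape (m, d_c) for constraint nodes.
    The extra random coordinates give symmetric ("foldable") nodes
    distinct features, so message passing can tell them apart.
    """
    rng = np.random.default_rng(seed)
    n, m = len(var_feats), len(con_feats)
    var_aug = np.hstack([var_feats, rng.uniform(size=(n, dim))])
    con_aug = np.hstack([con_feats, rng.uniform(size=(m, dim))])
    return var_aug, con_aug

# Usage: augment the node features of a 3-variable, 2-constraint instance.
v_aug, c_aug = append_random_features(np.ones((3, 2)), np.zeros((2, 1)), seed=0)
```

The output of the GNN then becomes a random variable; the paper's claim is that, with such features, feasibility and optimal values can be predicted reliably (with high probability), rather than deterministically.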



* A major part of the work of Z. Chen was completed during his internship at Alibaba US DAMO Academy. † Corresponding author.

