LEARNING COMBINATORIAL NODE LABELING ALGORITHMS

Abstract

We present the combinatorial node labeling framework, which generalizes many prior approaches to solving hard graph optimization problems by supporting problems where solutions consist of arbitrarily many node labels, such as graph coloring. We then introduce a neural network architecture to implement this framework. Our architecture builds on a graph attention network with several inductive biases to improve solution quality and is trained using policy gradient reinforcement learning. We demonstrate our approach on both graph coloring and minimum vertex cover. Our learned heuristics match or outperform classical hand-crafted greedy heuristics and machine learning approaches while taking only seconds on large graphs. We conduct a detailed analysis of the learned heuristics and architecture choices and show that they successfully adapt to different graph structures.

1. INTRODUCTION

Graph problems have numerous real-world applications, ranging from scheduling problems (Marx, 2004) and register allocation (Chaitin, 1982; Smith et al., 2004) to computational biology (Abukhzam et al., 2004). However, many useful graph optimization problems are NP-hard to solve (Karp, 1972). This has spurred a variety of approaches, from greedy heuristics (Brélaz, 1979; Papadimitriou & Steiglitz, 1982; Matula & Beck, 1983; Avis & Imamura, 2007; Delbot & Laforest, 2008) to integer linear programming (Graver, 1975). More recently, machine learning approaches have shown increasing promise (Dai et al., 2017; Kool et al., 2019; Li et al., 2018; Karalias & Loukas, 2020).

From a structural point of view, many graph problems fall into one of three classes depending on the type of their solution: problems that ask for (1) subsets of vertices, (2) permutations of vertices, or (3) partitions of vertices into two or more sets. Most work has focused on either the first two (Dai et al., 2017) or just one of the three (Bello et al., 2017; Li et al., 2018; Kool et al., 2019; Karalias & Loukas, 2020; Manchanda et al., 2020; Cappart et al., 2020; Drori et al., 2020; Ma et al., 2020). Existing machine learning methods for the first two types of problems, such as S2V-DQN (Dai et al., 2017), do not easily generalize to cases where the number of labels is not known in advance. Many important and challenging problems, such as graph coloring (Marx, 2004; Myszkowski, 2008; Bandh et al., 2009), require that vertices be partitioned into an unknown number of sets.

To address this, we present the combinatorial node labeling framework (§2), which generalizes prior approaches (Fig. 1) and supports many problems, including minimum vertex cover (Onak et al., 2012; Bhattacharya et al., 2017; Ghaffari et al., 2020), traveling salesman (Dantzig et al., 1954; Garey & Johnson, 1990), maximum cut (Karp, 1972), and list coloring (Jensen et al., 1995).
These problems, and many others (§D), can all be framed as iteratively assigning a label to each node, in some order. We then introduce a neural architecture, GAT-CNL, to learn greedy-inspired heuristics for such problems (§3). We use policy gradient reinforcement learning (Sutton & Barto, 2018; Kool et al., 2019) to learn a node ordering, and combine it with a fixed label rule that labels each node according to that ordering. We show that for the chosen label rules, there still exists an order that guarantees an optimal solution. By using policy gradients, we can construct both a deterministic greedy policy and a probabilistic policy where sampling boosts the solution quality. To improve performance, we incorporate two inductive biases: spatial locality, where labeling a node only impacts the weights of its neighbors; and temporal locality, where node selection is conditioned only on the previously labeled node, a summary of prior labelings, and a global graph context (Figs. 2 and 3).
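
To make the split between a node ordering and a fixed label rule concrete, the following is a minimal sketch for graph coloring. It assumes a first-fit label rule (each node receives the smallest color not used by an already-labeled neighbor), which is one plausible choice and may differ from the rule defined later in the paper. The names `first_fit_label`, `label_in_order`, and `num_labels` are illustrative and do not come from the paper, and the node orders are written by hand where GAT-CNL would learn them.

```python
from typing import Callable, Dict, Hashable, List

# Adjacency-list representation of an undirected graph (illustrative type alias).
Graph = Dict[Hashable, List[Hashable]]
Labels = Dict[Hashable, int]


def first_fit_label(graph: Graph, labels: Labels, node: Hashable) -> int:
    """Assumed label rule: smallest color unused by already-labeled neighbors."""
    used = {labels[n] for n in graph[node] if n in labels}
    color = 0
    while color in used:
        color += 1
    return color


def label_in_order(
    graph: Graph,
    order: List[Hashable],
    label_rule: Callable[[Graph, Labels, Hashable], int] = first_fit_label,
) -> Labels:
    """Visit nodes in the given order and label each one with the fixed rule."""
    labels: Labels = {}
    for node in order:
        labels[node] = label_rule(graph, labels, node)
    return labels


def num_labels(labels: Labels) -> int:
    """Cost of a coloring: the number of distinct labels used."""
    return len(set(labels.values()))


# Path graph a - b - c - d, which is 2-colorable.
PATH: Graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

good = label_in_order(PATH, ["a", "b", "c", "d"])  # walks along the path
bad = label_in_order(PATH, ["a", "d", "b", "c"])   # labels both endpoints first

print(num_labels(good), num_labels(bad))
```

With the label rule held fixed, solution quality in this sketch depends only on the order: walking along the path uses two colors, while labeling both endpoints first forces a third. This is why learning the ordering alone can be enough to improve on a hand-crafted heuristic.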

