TRANSFORMERS SATISFY

Abstract

The Propositional Satisfiability Problem (SAT) and, more generally, the Constraint Satisfaction Problem (CSP) are mathematical questions defined as finding an assignment to a set of variables such that all constraints are satisfied. A modern trend is to solve CSP through neural-symbolic methods. Most recent works are based on sequential models and adopt neural embeddings, e.g., reinforcement learning with graph neural networks, or graph recurrent neural networks. In this work, we propose the Heterogeneous Graph Transformer (HGT), a one-shot model derived from the eminent Transformer architecture and adapted to the factor graph structure in order to solve CSP. We define a heterogeneous attention mechanism: self-attention between literals based on meta-paths, and cross-attention between literals and clauses based on the links of the bipartite factor graph. Exploiting high-level parallelism, our model is able to achieve exceptional speed and accuracy on the factor graphs of CSPs of arbitrary size. The experimental results demonstrate the competitive performance and generality of HGT compared to the most recent baseline approaches.

1. INTRODUCTION

The Constraint Satisfaction Problem (CSP) is of central importance in several areas of computer science, including theoretical computer science, complexity theory, algorithmics, cryptography, and artificial intelligence. CSP aims at finding a consistent assignment of values to variables such that all constraints, which are typically defined over a finite domain, are satisfied. In particular, an assortment of problems arising from artificial intelligence and circuit design can be reduced to CSP subtypes, including map coloring, vertex cover, independent set, dominating set, and clique detection. Solving a CSP over a finite domain is often NP-complete with respect to the domain size. Conventional CSP solvers rely on handcrafted heuristics that guide the search for satisfying assignments; these algorithms solve CSP via backtracking or local search, so the resulting model is bounded by the greedy strategy, which is generally sub-optimal. With the advent of Graph Neural Networks (Scarselli et al. (2009)), geometric deep learning (Bronstein et al. (2017)) for non-Euclidean data has become one of the most rapidly emerging fields of machine learning. In particular, it brought deep learning solutions to one of the most prominent combinatorial optimization problems, the Constraint Satisfaction Problem (Khalil et al. (2017)). Works including NeuroSAT (Selsam et al. (2018)) and Circuit-SAT (Amizadeh et al. (2018)) commenced the study of neural methods targeted at CSP, and later works, such as Yolcu & Póczos (2019) and You et al. (2019), attempted to solve CSP through different deep learning approaches. However, most pioneering works, such as neural approaches utilizing RNNs or Reinforcement Learning, are still restricted to sequential algorithms, even though clauses are parallelizable despite being strongly correlated through shared variables. In this work, we propose a hybrid model of the Transformer architecture (Vaswani et al. (2017)) and the Graph Neural Network for solving combinatorial problems, especially CSP. Our main contributions are: (a) We derive meta-paths, adopted from Sun et al. (2011), to formulate the message passing mechanism between homogeneous nodes (i.e., variable to variable, or clause to clause), which enables us to perform self-attention and pass messages through either variables sharing the same clauses, or clauses that include the same variables. We apply the cross-attention mechanism to optimize message exchanges between heterogeneous nodes (i.e., clause to variable, or variable to clause). (b) With the combination of homogeneous and heterogeneous attention mechanisms on the bipartite graph structure, we combine the Transformer with neuro-symbolic methods to resolve combinatorial optimization on graphs. (c) We propose the Heterogeneous Graph Transformer (HGT), a general framework for graphs with heterogeneous nodes. In this work, we train the HGT framework to approximate the solutions of CSP (though it is not limited to CSP). Our model achieves competitive accuracy, parallelism, and generality on CSP problems of arbitrary size.

2. RELATED WORK

The machine learning community has seen an increasing interest in applications and optimizations related to constraint satisfaction problem solving. Various frameworks utilizing diverse methodologies have been proposed, offering new insights into developing CSP solvers and classifiers. For example, Bello et al. (2016) adopt Reinforcement Learning in their Neural Combinatorial Optimization, with an approach based on policy gradients. On the other hand, works such as Evans et al. (2018) and Arabshahi et al. (2018) have demonstrated the effectiveness of recursive neural networks in modeling symbolic expressions. Meanwhile, Prates et al. (2019) proposed an embedding-based message-passing algorithm for solving the Traveling Salesman Problem (TSP), a highly relevant CSP problem. NeuroSAT (Selsam et al. (2018)) is a graph neural network model that aims at solving the Boolean Satisfiability Problem (SAT) without leveraging the greedy search paradigm. It approaches SAT as a binary classification problem during training and recovers a satisfying assignment from the latent representations during inference. NeuroSAT is able to search for solutions to problems of various difficulties despite training for a relatively small number of iterations. As an extension of this line of work, Selsam & Bjørner (2019) propose a neural network that facilitates variable branching decisions within high-performance SAT solvers on real problems. PDP (Amizadeh et al. (2019)) is a generic neural framework for learning CSP solvers based on the idea of Propagation, Decimation, and Prediction. PDP provides a completely unsupervised training mechanism for solving SAT via energy minimization, and can be seen as learning an optimal message passing strategy on probabilistic graphical models. G2SAT (You et al. (2019)) is a deep generative framework that learns to generate SAT formulas from a given set of input formulas while preserving their graph statistics. Even though G2SAT lacks the ability to derive solutions, it provides synthetic formulas for hyperparameter optimization. RLSAT (Yolcu & Póczos (2019)) learns SAT solvers through deep reinforcement learning and iterative refinement. It incorporates a graph neural network into a Stochastic Local Search (SLS) algorithm to act as the variable selection heuristic during training. However, since SLS begins with randomly initialized parameters, and a non-zero terminal reward is given only when a satisfying assignment is found, RLSAT requires a curriculum learning process for performance improvement. As a result, RLSAT becomes inefficient when its learning process starts with large complex graphs, in which satisfying assignments are hard to obtain.

In our HGT, each possible state of assignment corresponds to a likelihood, which can be minimized to train the model. In particular, HGT maintains its efficiency and accuracy regardless of input graph size.

3.1. CONSTRAINT SATISFACTION PROBLEMS

The Constraint Satisfaction Problem (CSP) (Kumar (1992)) is a fundamental problem in the study of logic that constitutes a cornerstone of combinatorial optimization. It provides feasible models for real-world applications and is intensively involved in the design of artificial intelligence. An instance of a CSP, CSP(V, U), consists of two main components: a set of N variables V = {v_i ∈ D : i ∈ 1...N}, defined over a discrete domain D; and a set of M constraint functions or factors, U = {u_j(q_j) : j ∈ 1...M}, where q_j is the subset of V subject to u_j. For each u_j ∈ U, u_j : D^{|q_j|} → {0, 1} outputs 1 if the input q_j satisfies constraint u_j, and 0 otherwise. A CSP can be formulated in Conjunctive Normal Form (CNF) (Pfahringer (2010)) with the goal of finding an assignment of variables that satisfies all constraints. For a given assignment to V, the measure of
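To make the CSP components concrete, the following minimal sketch treats each clause of a CNF formula as a constraint function u_j that outputs 1 or 0, and searches for a satisfying assignment by exhaustive enumeration. The literal encoding (positive/negative integers for a variable and its negation, as in the DIMACS convention) and the helper names are our illustrative choices, not part of the paper.

```python
from itertools import product

def clause_satisfied(clause, assignment):
    """A clause is a list of literals; literal +i (resp. -i) means v_i (resp. its negation)."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def evaluate(cnf, assignment):
    """Acts as the conjunction of constraints u_j: returns 1 iff every clause is satisfied."""
    return int(all(clause_satisfied(c, assignment) for c in cnf))

def brute_force_sat(cnf, n_vars):
    """Exhaustive search over all 2^N assignments (tractable only for tiny instances)."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if evaluate(cnf, assignment):
            return assignment
    return None

# (v1 ∨ ¬v2) ∧ (v2 ∨ v3) ∧ (¬v1 ∨ ¬v3), over the Boolean domain D = {0, 1}
cnf = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(cnf, 3)
```

The exponential cost of this enumeration is exactly what motivates both heuristic solvers and learned approaches such as HGT.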

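The meta-path construction described in the contributions can be illustrated on the clause-variable incidence matrix of the factor graph: composing the bipartite links yields the variable-variable and clause-clause neighborhoods that gate self-attention, while the links themselves gate cross-attention. This is an illustrative NumPy sketch under our own naming (`B`, the mask variables, and the attention helper are assumptions, not the paper's implementation):

```python
import numpy as np

# Incidence matrix B of the factor graph: B[j, i] = 1 iff clause j contains variable v_i.
# Formula: (v1 ∨ v2) ∧ (v2 ∨ v3)
B = np.array([[1, 1, 0],
              [0, 1, 1]])

# Meta-path variable→clause→variable: variables co-occurring in at least one clause.
var_mask = (B.T @ B) > 0      # gates variable-variable self-attention
# Meta-path clause→variable→clause: clauses sharing at least one variable.
clause_mask = (B @ B.T) > 0   # gates clause-clause self-attention
# The bipartite links themselves gate cross-attention between clauses and variables.
cross_mask = B > 0

def masked_attention(Q, K, V, mask):
    """Scaled dot-product attention restricted to meta-path neighbors."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores = np.where(mask, scores, -1e9)  # block non-neighbor pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

In this toy formula, v1 and v3 never share a clause, so `var_mask` correctly blocks direct attention between them; their influence on each other must flow through v2 or through the clause nodes.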

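The text notes that each state of assignment corresponds to a likelihood that can be minimized during training, in the spirit of PDP's unsupervised energy minimization. One plausible instantiation of such a differentiable objective (a sketch of the general idea, not the paper's actual loss) relaxes each variable to a probability of being true and penalizes the probability that each clause is falsified:

```python
def sat_loss(p, clauses):
    """Smoothed unsatisfiability of a CNF under a soft assignment.

    p[i] is the (relaxed) probability that v_{i+1} is True; each clause is
    falsified only when every one of its literals fails, so we sum, over
    clauses, the product of the literal-failure probabilities.
    """
    loss = 0.0
    for clause in clauses:
        unsat = 1.0
        for lit in clause:
            prob_true = p[abs(lit) - 1]
            lit_sat = prob_true if lit > 0 else 1.0 - prob_true
            unsat *= (1.0 - lit_sat)  # probability this literal is falsified
        loss += unsat                 # expected falsification of the clause
    return loss
```

The loss is zero exactly when every clause is satisfied with probability one, so gradient descent on `p` (or on network parameters producing `p`) pushes the soft assignment toward a satisfying one without any labeled supervision.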