LEARNING TO SOLVE CONSTRAINT SATISFACTION PROBLEMS WITH RECURRENT TRANSFORMER

Abstract

Constraint satisfaction problems (CSPs) are about finding values of variables that satisfy given constraints. We show that a Transformer extended with recurrence is a viable approach to learning to solve CSPs in an end-to-end manner, having clear advantages over state-of-the-art methods such as Graph Neural Networks, SATNet, and some neuro-symbolic models. With the Transformer's ability to handle visual input, the proposed Recurrent Transformer can straightforwardly be applied to visual constraint reasoning problems while successfully addressing the symbol grounding problem. We also show how to leverage deductive knowledge of discrete constraints in the Transformer's inductive learning to achieve sample-efficient learning and semi-supervised learning for CSPs.

1. INTRODUCTION

Constraint Satisfaction Problems (CSPs) are about finding values of variables that satisfy given constraints. They have been widely studied in symbolic AI, with an emphasis on designing efficient algorithms that deductively find solutions for explicitly stated constraints. In the recent deep learning-based approach, the focus is on inductively learning the constraints and solving them in an end-to-end manner. For example, the Recurrent Relational Network (RRN) (Palm et al., 2018) uses message passing over graph structures to learn logical constraints, achieving high accuracy on textual Sudoku. On the other hand, it relies on hand-coded information about Sudoku constraints, namely, which variables are allowed to interact. Moreover, it is limited to textual input.

SATNet (Wang et al., 2019) is a differentiable MAXSAT solver that can infer logical rules and can be integrated into DNNs. SATNet was shown to solve even visual Sudoku, where the input is a hand-written Sudoku board. The problem is harder because a model has to learn how to map visual inputs to symbolic digits without explicit supervision. However, Chang et al. (2020) observed a label leakage issue with the experiment; under proper evaluation, the performance of SATNet on visual Sudoku dropped to 0%. Moreover, the SATNet evaluation is limited to easy puzzles, and SATNet does not perform well on hard puzzles that RRN can solve.

In another direction, although these models can learn complicated constraints purely from data, in many cases (part of) the constraints are already known, and exploiting such deductive knowledge in inductive learning can enable sample-efficient and robust learning. The problem is challenging, especially when the knowledge is in the form of discrete constraints, whereas standard deep learning is mainly about optimizing continuous and differentiable parameters. This paper provides a viable solution to the limitations of the above models based on the Transformer architecture.
Transformer-based models have not been shown to be effective for CSPs despite their widespread applications in language (Vaswani et al., 2017; Zhang et al., 2020; Helwe et al., 2021; Li et al., 2020) and vision (Dosovitskiy et al., 2020; Gabeur et al., 2020). Creswell et al. (2022) asserted that Transformer-based large language models (LLMs) tend to perform poorly on multi-step logical reasoning problems. In the case of Sudoku, typical solving requires about 20 to 60 steps of reasoning. Despite various ideas for prompting GPT-3, GPT-3 is not able to solve Sudoku. Nye et al. (2021) note that LLMs work well for System 1 intuitive thinking but not for System 2 logical thinking. Given the superiority of other models on CSPs, one might conclude that Transformers are unsuitable for CSPs.

We find that the Transformer can be successfully applied to CSPs by incorporating recurrence, which encourages the model to apply multi-step reasoning similar to RRNs. Interestingly, this simple change already yields better results than the other models above and brings several other advantages. The learning is more robust than SATNet's. By inspecting the learned attention matrices, we can interpret what the Transformer has learned; intuitively, multi-head attention extracts distinct information about the problem structure. Adding more attention blocks and recurrences tends to make the model learn better. Analogously to the Vision Transformer (Dosovitskiy et al., 2020), our model can be easily extended to process visual input. Moreover, the model avoids the symbol grounding problem encountered by SATNet. In addition, we present a way to inject discrete constraints into Recurrent Transformer training, borrowing an idea from Yang et al. (2022), which encodes logical constraints as a loss function and uses Straight-Through Estimators (STE) (Courbariaux et al., 2015) to make discrete constraints meaningfully differentiable for gradient descent.
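As a minimal numpy sketch of the STE trick described above (the array names and the exactly-one example are ours, for illustration only; they are not the paper's implementation): the forward pass binarizes soft predictions so a discrete constraint can be evaluated exactly, while the backward pass treats the binarization as the identity, so gradients still reach the underlying probabilities.

```python
import numpy as np

# Soft predictions for three atoms (illustrative values).
p = np.array([0.9, 0.6, 0.2])

# Forward: hard 0/1 values via a non-differentiable threshold.
b = (p > 0.5).astype(float)

# Discrete constraint "exactly one atom is true", written as a loss on b.
loss = (b.sum() - 1.0) ** 2

# Backward: d loss / d b is well defined on the binarized values ...
grad_b = np.full_like(b, 2.0 * (b.sum() - 1.0))
# ... and STE copies it straight through to p, as if b were p.
grad_p = grad_b
```

Here two atoms were thresholded to 1, so the exactly-one loss is nonzero and every prediction receives a gradient pushing the total count toward 1.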
We apply this idea to the Recurrent Transformer with some modifications. We find that adding an explicit constraint loss at every recurrent step helps the Transformer learn more effectively. We also add a constraint loss on the attention matrix so that constraints can help the model learn better attention patterns. Including these constraint losses in training improves accuracy and lets the Transformer learn from fewer labeled examples (semi-supervised learning). In summary, the paper makes the following contributions.

Recurrent Transformer for Constraint Reasoning. We show that the Recurrent Transformer is a viable approach to learning to solve CSPs, with clear advantages over state-of-the-art methods such as RRN and SATNet.

Symbol Grounding with Recurrent Transformer. With the ability of Transformers to handle vision problems well, we demonstrate that our model can straightforwardly be applied to visual constraint reasoning problems while successfully addressing the symbol grounding problem. It achieves 93.5% test accuracy on SATNet's visual Sudoku test set, for which even the enhanced SATNet from Topan et al. (2021) achieves only 64.8% accuracy.

Injecting Logical Constraints into Transformers. We show how to inject discrete logical constraints into Recurrent Transformer training to achieve sample-efficient learning and semi-supervised learning for CSPs.
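The recurrence with per-step constraint losses can be sketched as follows (a toy illustration with assumed helper names, not the released code): one weight-shared block is applied repeatedly, the constraint loss is attached at every recurrent step, and the label loss supervises only the final output, so unlabeled examples can still contribute through the constraint terms.

```python
def train_loss(block, x, num_steps, constraint_loss, label_loss):
    """Sum a constraint penalty over every recurrent step of one block."""
    total = 0.0
    for _ in range(num_steps):
        x = block(x)                  # same (weight-shared) block each step
        total += constraint_loss(x)   # deductive knowledge at each step
    return total + label_loss(x)      # supervision only on the final output

# Toy stand-ins for illustration only.
toy_block = lambda v: [min(a + 0.5, 1.0) for a in v]   # fake reasoning step
exactly_one = lambda v: (sum(v) - 1.0) ** 2            # soft cardinality loss
no_labels = lambda v: 0.0                              # an unlabeled example
total = train_loss(toy_block, [0.0, 0.0], 2, exactly_one, no_labels)
```

With `label_loss` returning zero, the gradient signal comes entirely from the constraint terms, which is the mechanism behind the semi-supervised setting.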

2.1. CONSTRAINT SATISFACTION PROBLEMS

A constraint satisfaction problem is defined as a triple ⟨X, D, C⟩ where X = {X_1, ..., X_t} is a set of t logical variables; D = {D_1, ..., D_t}, where each D_i is a finite set of domain values for the logical variable X_i; and C is a set of constraints. An atom (i.e., value assignment) is of the form X_i = v where v ∈ D_i. A constraint on a sequence X_i, ..., X_j of variables is a mapping D_i × ··· × D_j → {TRUE, FALSE} that specifies the sets of atoms that can or cannot hold at the same time. A (complete) evaluation is a set of t atoms {X_i = v | i ∈ {1, ..., t}, v ∈ D_i}, one per variable. An evaluation is a solution if it does not violate any constraint in C, i.e., it makes all constraints TRUE.

One of the commonly used constraints is the cardinality constraint

    l ≤ |{X_i = v_i, ..., X_j = v_j}| ≤ u    (1)

where l and u are nonnegative integers denoting bounds, and for k ∈ {i, ..., j}, X_k ∈ X and v_k ∈ D_k. Cardinality constraint (1) is TRUE iff the number of atoms in it that are true is between l and u. If l = u, constraint (1) can be simplified to

    |{X_i = v_i, ..., X_j = v_j}| = l    (2)

which is TRUE iff exactly l of the atoms in the given set are true. If i = j and l = 1, constraint (2) can be further simplified to X_i = v_i.

Example 1 (CSP for Sudoku) A CSP for a Sudoku puzzle is such that X = {cell_1, ..., cell_81} denotes all 81 cells on a Sudoku board; D = {D_1, ..., D_81} and D_i = {1, ..., 9} (i = 1, ..., 81)
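As a concrete illustration of Example 1, the following Python sketch (the cell indexing and helper names are ours, not part of the paper) encodes the row constraints of Sudoku as cardinality constraints of form (2) and checks a complete evaluation against them:

```python
from itertools import product

def exactly(atoms, assignment, l):
    """Cardinality constraint of form (2): TRUE iff exactly l of the
    given atoms (variable, value) hold under the assignment."""
    return sum(assignment[var] == val for var, val in atoms) == l

def sudoku_row_constraints():
    """One 'appears exactly once' constraint per (row, digit) pair;
    column and 3x3-box constraints would be built the same way."""
    return [([(9 * r + c, d) for c in range(9)], 1)
            for r, d in product(range(9), range(1, 10))]

# A complete evaluation assigns one domain value to every cell 0..80;
# this particular assignment cycles the digits 1..9 along each row.
assignment = {9 * r + c: (r + c) % 9 + 1 for r in range(9) for c in range(9)}
ok = all(exactly(atoms, assignment, l) for atoms, l in sudoku_row_constraints())
```

An evaluation is a solution only when every such constraint returns TRUE; duplicating a value within a row makes the corresponding exactly-once constraints fail.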



The code is available at https://github.com/azreasoners/recurrent_transformer.

