LEARNING WHERE AND WHEN TO REASON IN NEURO-SYMBOLIC INFERENCE

Abstract

The integration of hard constraints on neural network outputs is a highly desirable capability: it makes it possible to instill trust in AI by guaranteeing the sanity of neural network predictions with respect to domain knowledge. Recently, this topic has received a lot of attention. However, existing methods typically either impose the constraints in a "weak" form at training time, with no guarantees at inference, or fail to provide a general framework that supports different tasks and constraint types. We tackle this open problem from a neuro-symbolic perspective. Our pipeline enhances a conventional neural predictor with (1) a symbolic reasoning module capable of correcting structured prediction errors, and (2) a neural attention module that learns to direct the reasoning effort towards potential prediction errors, while keeping other outputs unchanged. This framework provides an appealing trade-off between the efficiency of constraint-free neural inference and the prohibitive cost of exhaustive reasoning at inference time. We show that our method outperforms the state of the art on visual-Sudoku, and can also benefit visual scene graph prediction. Furthermore, it can improve the performance of existing neuro-symbolic systems that lack explicit reasoning during inference.

1. INTRODUCTION

Despite the rapid advance of machine learning (ML), it is still difficult for deep learning architectures to solve certain classes of problems, especially those that require non-trivial symbolic reasoning (e.g., automated theorem proving or scientific discovery). A very practical example of this limitation, even in applications that are typical deep learning territory such as image processing, is the difficulty of imposing hard symbolic constraints on model outputs. This is relevant whenever learning systems produce outputs to which domain knowledge constraints apply (e.g., Figure 2). The common situation today, in which ML systems regularly violate such constraints, is both a missed opportunity to improve performance and, more importantly, a source of reduced public trust in AI. This issue has motivated a growing body of work on neuro-symbolic methods that aim to exploit domain knowledge constraints and reasoning to improve performance. Most of these methods address neuro-symbolic learning, where constraints are applied in the loss function (e.g., Xu et al. (2018); Xie et al. (2019); Li et al. (2019); Wang & Pan (2020)) and predictions that violate those constraints are penalised. In this way, during learning, the model is "encouraged" to move towards a solution that satisfies the constraints/rules. However, high-capacity deep networks usually fit their training sets in any case, and thus violate no constraints on the output labels during supervised learning. The central issue of whether constraints are also met at inference time, during deployment, is unaddressed by these methods and is under-studied more generally (Giunchiglia et al., 2022b; Dash et al., 2022; von Rueden et al., 2021). A minority of studies have worked towards exploiting constraints during inference.
Since reasoning that guarantees constraints are met is in general expensive, some methods resort to soft relaxations (Daniele & Serafini, 2019; Li & Srikumar, 2019; Wang et al., 2019), which is unhelpful for trust and guarantees. The few methods that manage to impose exact constraints are either restricted to simple rules that are not expressive enough (Yu et al., 2017; Giunchiglia et al., 2022b), or involve invoking a full reasoning engine during inference (Manhaeve et al., 2018; Yang et al., 2020), which is prohibitively costly in general. In this work we explore a new neuro-symbolic integration approach to manage the trade-off between the cost, expressivity, and exactness of reasoning during inference. Our Neural Attention for Symbolic Reasoning (NASR) pipeline leverages the best of the neural and symbolic worlds. Rather than performing inexact, inexpressive, or intractable reasoning, we first execute an efficient neural solver on the task, and then delegate to a symbolic solver the correction of any mistakes the neural solver makes. By reasoning over only a subset of the neural fact predictions we maintain efficiency. So long as the selected facts contain no false positives, we guarantee that constraints are met during inference. Thus we enjoy most of the benefit of symbolic reasoning about the solution at a cost similar to that of neural inference. This dual approach is aligned with the 'two systems' perspective of Sloman (1996) and Kahneman (2011) that has recently been applied in AI (e.g., Booch et al. (2021) and LeCun (2022)).
More specifically, our NASR framework is built upon three components: a neural network (the Neuro-Solver) trained to solve the given task directly; a symbolic engine that reasons over the output of the Neuro-Solver and can revise its predictions in accordance with domain knowledge; and a hard-attention neural network (the Mask-Predictor) that decides which subset of the Neuro-Solver's predictions should be eligible for revision by the reasoning engine. The Mask-Predictor essentially learns when and where to reason in order to achieve high prediction accuracy and constraint satisfaction at low computational cost. Since the reasoning engine is not in general differentiable, we train our framework with reinforcement learning (RL). The contributions of our work can be summarized as follows: (1) we provide a novel neuro-symbolic integration pipeline with a novel neural attention module (the Mask-Predictor) that works with any type of constraints/rules; (2) we apply this architecture to the visual-Sudoku task (given an image of an incomplete Sudoku board, the goal is to provide a complete symbolic solution), considerably improving on the state of the art; (3) finally, we show that when our framework wraps an existing state-of-the-art model (replacing the Neuro-Solver), it significantly improves that model's performance. The code is available at: https://github.com/corneliocristina/NASR.
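The inference-time interaction of the three components can be summarized in a minimal sketch. All names below (`nasr_inference`, the solver and mask callables, the list-based board encoding) are illustrative assumptions, not the actual NASR API: the Neuro-Solver proposes a complete solution, the Mask-Predictor flags which predictions to keep, and the symbolic solver re-derives only the unkept cells subject to the hard constraints.

```python
# Illustrative sketch of the NASR inference pipeline (names and encoding are
# hypothetical, not the authors' implementation). A cell value of 0 denotes
# an empty cell to be filled in.

from typing import Callable, List

def nasr_inference(
    board: List[int],                                   # puzzle input, 0 = empty
    neuro_solver: Callable[[List[int]], List[int]],     # fast neural pass
    mask_predictor: Callable[[List[int]], List[bool]],  # True = trust this cell
    symbolic_solver: Callable[[List[int]], List[int]],  # exact reasoner
) -> List[int]:
    # Step 1: the Neuro-Solver produces a full (possibly flawed) solution.
    solution = neuro_solver(board)
    # Step 2: the Mask-Predictor selects which predictions to keep.
    mask = mask_predictor(solution)
    # Step 3: blank out the cells flagged as unreliable ...
    partial = [v if keep else 0 for v, keep in zip(solution, mask)]
    # ... and let the symbolic solver fill them in, enforcing the constraints.
    return symbolic_solver(partial)
```

If the kept cells contain no false positives, the symbolic solver completes the board consistently with the domain constraints, and the reasoning cost scales with the (typically small) number of masked cells rather than the whole board.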

1.1. RELATED WORK

There has been a lot of recent research on the imposition of constraints in neural networks. It can be roughly divided into the following categories:

1) Modification of the loss function: Xu et al. (2018) add a component to the loss function quantifying the level of disagreement with the constraints; similar ideas can be found in Xie et al. (2019) and Li et al. (2019). Wang & Pan (2020) instead exploit a parallel neuro-reasoning engine that produces the same output as the neural process and add the distance between the two outcomes to the loss. 2) Adversarial training: Ashok et al. (2021) integrate a neural network with a violation function that generates new data; a similar idea can be found in Minervini & Riedel (2018). 3) Adding an ad-hoc constraint output layer: Giunchiglia & Lukasiewicz (2021) add a layer to the network that manipulates the output to enforce the constraints; Ahmed et al. (2022) add a compiled logic-circuit layer to the network that enforces the constraints; Yang et al. (2020) and Manhaeve et al. (2018) instead draw a parallel between logic predicates and their neural counterparts. 4) Logic relaxations: Daniele & Serafini (2019) use a differentiable relaxation of logic to extend a neural network with an output logic layer that increases the probability of compliant outcomes; Li & Srikumar (2019) take a similar approach on the internal layers, augmenting the likelihood of a neuron when a rule is satisfied; Wang et al. (2019) introduce a differentiable MAXSAT solver that can be integrated into neural networks; similar ideas using different types of logic relaxations can be found in Gan et al. (2021), Sachan et al. (2018), Donadello & Serafini (2019), and Marra et al. (2019). 5) Neuro-symbolic integrations: the alternation of purely symbolic components with neural ones can be found in Agarwal et al. (2021), where the authors build an encoder/decoder with a standard reasoner in the middle. In another approach, Yang et al. (2020) combine answer set programs with the output of a neural network; similar methods are those of Tsamoura et al. (2021) and Manhaeve et al. (2018). Other neuro-symbolic integration methods worth mentioning (which consider the specific showcase of the visual-Sudoku task) are: Brouard et al. (2020), who extract preferences from data and push them into cost function networks; Mulamba et al. (2020), who combine a neural perception module with a purely symbolic reasoner; and Yiwei et al. (2021), who use a curriculum-learning-with-restarts framework to boost the performance of Deep Reasoning Nets.

