WEAKLY SUPERVISED EXPLAINABLE PHRASAL REASONING WITH NEURAL FUZZY LOGIC

Abstract

Natural language inference (NLI) aims to determine the logical relationship between two sentences; the target labels are Entailment, Contradiction, and Neutral. In recent years, deep learning models have become a prevailing approach to NLI, but they lack interpretability and explainability. In this work, we address the explainability of NLI by weakly supervised logical reasoning, and propose an Explainable Phrasal Reasoning (EPR) approach. Our model first detects phrases as semantic units and aligns corresponding phrases in the two sentences. Then, the model predicts the NLI label for each pair of aligned phrases and induces the sentence-level label by fuzzy logic formulas. EPR is almost everywhere differentiable, so the system can be trained end to end. In this way, we provide explicit explanations of phrasal logical relationships in a weakly supervised manner. We further show that such reasoning results help textual explanation generation.¹
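The label-induction step described above can be sketched in code. This is a minimal illustration, not the paper's exact formulation: it assumes product t-norm semantics for fuzzy AND and probabilistic sum for fuzzy OR, and the `induce_sentence_label` helper, the crude Neutral normalization, and the phrase-level probabilities are all hypothetical.

```python
# Sketch of fuzzy-logic label induction from phrase-level NLI probabilities.
# Assumption: product t-norm for AND, probabilistic sum for OR; the paper's
# actual operators and normalization may differ. Each aligned phrase pair i
# carries probabilities (E_i, C_i, N_i) for Entailment/Contradiction/Neutral.

def fuzzy_and(values):
    """n-ary product t-norm: AND(a, b) = a * b."""
    out = 1.0
    for v in values:
        out *= v
    return out

def fuzzy_or(values):
    """n-ary probabilistic sum: OR(a, b) = 1 - (1 - a)(1 - b)."""
    out = 1.0
    for v in values:
        out *= 1.0 - v
    return 1.0 - out

def induce_sentence_label(phrase_probs):
    """phrase_probs: list of (E, C, N) tuples, one per aligned phrase pair."""
    # Entailment: every phrase pair must be entailed.
    e = fuzzy_and([p[0] for p in phrase_probs])
    # Contradiction: at least one phrase pair contradicts.
    c = fuzzy_or([p[1] for p in phrase_probs])
    # Neutral: neither full entailment nor contradiction
    # (a crude residual normalization, for illustration only).
    n = max(0.0, 1.0 - e - c)
    return {"Entailment": e, "Contradiction": c, "Neutral": n}
```

Because the aggregation uses only products and sums, it is differentiable almost everywhere, which is what allows end-to-end training from sentence-level labels alone.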

1. INTRODUCTION

Natural language inference (NLI) aims to determine the logical relationship between two sentences (called a premise and a hypothesis); the target labels are Entailment, Contradiction, and Neutral (Bowman et al., 2015; MacCartney & Manning, 2008). Figure 1 gives an example, where the hypothesis contradicts the premise. NLI is important to natural language processing because it involves logical reasoning, a key problem in artificial intelligence. Previous work shows that NLI benefits various downstream tasks, such as information retrieval (Karpukhin et al., 2020) and text summarization (Liu & Lapata, 2019).

In recent years, deep learning has become a prevailing approach to NLI (Bowman et al., 2015; Mou et al., 2016; Wang & Jiang, 2016; Yoon et al., 2018). In particular, pretrained language models based on the Transformer architecture (Vaswani et al., 2017) achieve state-of-the-art performance on the NLI task (Radford et al., 2018; Zhang et al., 2020). However, such deep learning models are black boxes and lack interpretability, and in real applications it is important to understand how these models make decisions (Rudin, 2019).

Several studies have addressed the explainability of NLI models. Camburu et al. (2018) generate a textual explanation by sequence-to-sequence supervised learning, in addition to NLI classification; such an approach is multi-task learning of text classification and generation, and does not perform reasoning itself. MacCartney et al. (2008) propose a scoring model to align related phrases; Parikh et al. (2016) and Jiang et al. (2021) obtain alignments by attention mechanisms. However, these methods only provide correlation information, rather than logical reasoning.
Other work incorporates upward and downward monotonicity entailment reasoning into NLI (Hu et al., 2020; Chen et al., 2021), but these approaches rely on hand-crafted rules (e.g., monotonicity properties of quantifiers such as every and some) and are restricted to Entailment; they cannot handle Contradiction or Neutral.



¹ Code and resources available at https://github.com/MANGA-UOFA/EPR

