WEAKLY SUPERVISED NEURO-SYMBOLIC MODULE NETWORKS FOR NUMERICAL REASONING

Abstract

Neural Module Networks (NMNs) have been quite successful in incorporating explicit reasoning as learnable modules in various question answering tasks, including the most generic form of numerical reasoning over text in Machine Reading Comprehension (MRC). However, to achieve this, contemporary NMNs need strong supervision for executing the query as a specialized program over reasoning modules, and they fail to generalize to more open-ended settings without such supervision. Hence, we propose the Weakly Supervised Neuro-Symbolic Module Network (WNSMN), trained with answers as the sole supervision for numerical-reasoning-based MRC. It learns to execute a noisy heuristic program, obtained from the dependency parse of the query, as discrete actions over both neural and symbolic reasoning modules, and is trained end-to-end in a reinforcement learning framework with a discrete reward from answer matching. On the numerical-answer subset of DROP, WNSMN outperforms NMN by 32% and the reasoning-free language model GenBERT by 8% in exact-match accuracy when trained under comparable weakly supervised settings. This showcases the effectiveness and generalizability of modular networks that can handle explicit discrete reasoning over noisy programs in an end-to-end manner.

1. INTRODUCTION

End-to-end neural models have proven to be powerful tools for an expansive set of language and vision problems by effectively emulating input-output behavior. However, many real problems like Question Answering (QA) or Dialog need more interpretable models that can incorporate explicit reasoning during inference. In this work, we focus on the most generic form of numerical reasoning over text, encompassed by the reasoning-based MRC framework. A particularly challenging setting for this task is one where the answers are numerical in nature, as in the popular MRC dataset DROP (Dua et al., 2019). Figure 1 shows the intricacies involved in the task: (i) passage and query language understanding, (ii) contextual understanding of the dates and numbers in the passage, and (iii) application of quantitative reasoning (e.g., max, not) over dates and numbers to reach the final numerical answer. Three broad classes of models have proven successful on the DROP numerical reasoning task. First, large-scale pretrained language models like GenBERT (Geva et al., 2020) use a monolithic Transformer architecture and decode numerical answers digit by digit. Though they deliver mediocre performance when trained only on the target data, their competence derives from pretraining on massive synthetic data augmented with explicit supervision of the gold numerical reasoning. The second class comprises reasoning-free hybrid models like NumNet (Ran et al., 2019), NAQANet (Dua et al., 2019), NABERT+ (Kinley & Lin, 2019), MTMSN (Hu et al., 2019), and NeRd (Chen et al., 2020). They explicitly incorporate numerical computations in the standard extractive QA pipeline by learning a multi-type answer predictor over different reasoning types (e.g., max/min, diff/sum, count, negate) and directly predicting the corresponding numerical expression, instead of learning to reason.
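To make the discrete reasoning types concrete, the following is a minimal, hypothetical sketch (names and operation inventory are our illustration, not any model's actual code) of how such operations could be applied to numbers extracted from a passage; hybrid models precompute the outcomes of exactly this kind of symbolic execution.

```python
# Hypothetical illustration of the discrete reasoning types (max/min,
# diff/sum, count, negate) that reasoning-free hybrid models precompute
# over numbers extracted from the passage. Names are ours, for exposition.

def execute(op: str, operands: list) -> float:
    """Apply one discrete reasoning operation to extracted numbers."""
    if op == "max":
        return float(max(operands))
    if op == "min":
        return float(min(operands))
    if op == "sum":
        return float(sum(operands))
    if op == "diff":
        a, b = operands          # binary subtraction
        return float(a - b)
    if op == "count":
        return float(len(operands))
    if op == "negate":
        (a,) = operands          # unary negation
        return float(-a)
    raise ValueError(f"unknown op: {op}")

# e.g., numbers mentioned in a passage about touchdown yardages
passage_numbers = [13, 26, 7]
print(execute("max", passage_numbers))    # 26.0
print(execute("diff", [26, 7]))           # 19.0
print(execute("count", passage_numbers))  # 3.0
```

A multi-type answer predictor then only has to choose the operation and its operands; the arithmetic itself is symbolic, which is why such models predict expressions rather than learning to compute.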
These hybrid models are facilitated by exhaustively precomputing all possible outcomes of the discrete operations and augmenting the training data with the reasoning-type supervision and the numerical expressions that lead to the correct answer. Lastly, the class of models most relevant to this work is the modular networks for reasoning. Neural Module Networks (NMN) (Gupta et al., 2020) is the first explicit-reasoning-based QA model; it parses the query into a specialized program and executes it step by step over learnable reasoning modules. However, to do so, apart from the exhaustive precomputation of all discrete operations, it also needs more fine-grained supervision of the gold

