INVERSELY ELICITING NUMERICAL REASONING IN LANGUAGE MODELS VIA SOLVING LINEAR SYSTEMS

Abstract

Numerical reasoning over natural language has been a long-standing goal for the research community. However, while recent language models have shown proficiency in reasoning over common and simple numbers, they struggle to generalize reliably to a broad range of numbers. In this paper, we propose a novel method to elicit and exploit the numerical reasoning knowledge hidden in pre-trained language models using simple anchor numbers. Concretely, we first leverage simple numbers as anchors to probe the implicitly inferred arithmetic expressions from language models, and then explicitly apply these expressions to complex numbers to obtain the corresponding answers. To inversely elicit arithmetic expressions, we transform and formulate the task as an analytically solvable linear system. Experimental results on several numerical reasoning benchmarks demonstrate that our approach is highly effective. More importantly, our approach works at inference time without extra model training, making it highly portable and yielding significant and consistent performance gains across a variety of language models in zero-shot, few-shot, and fine-tuning scenarios.

1. INTRODUCTION

Language Models (LMs) have demonstrated great success on a wide range of natural language tasks (Devlin et al., 2018; Brown et al., 2020b; Chowdhery et al., 2022), and recent works even explore using LMs as a general-purpose interface for diverse modalities (Hao et al., 2022; Xie et al., 2022; He et al., 2022). But when it comes to reasoning about numbers, a crucial component of text, tables, and knowledge bases, the performance of LMs slumps. A key challenge in numerical reasoning is number calculation. Even rational numbers, a small subset of real numbers, constitute an infinite space that cannot be completely covered by pre-training corpora, posing a significant obstacle to LMs. Recent works have shown strong context understanding capabilities of LMs on numerical reasoning datasets (Dua et al., 2019; Cobbe et al., 2021), but LMs are still far from robust at implicit numerical calculation: as numbers grow bigger and more complex, LMs are more likely to fail, e.g., 8,534.5 + 17.85; and even for small-number additions, e.g., 512 + 128 and 513 + 129, LMs are not stable enough to produce the correct answer consistently. Similar observations are reported by Razeghi et al. (2022), showing that end-to-end LMs easily fail to calculate numbers that rarely appear in pre-training corpora. Fortunately, reversing this observation yields a positive perspective: given the exact same context, LMs are significantly more accurate and stable on simple numbers -- typically small integers that appear frequently in pre-training corpora -- than on complex numbers, indicating that after pre-training LMs have a strong capability of performing arithmetic on simple numbers. This motivates us to leverage simple numbers as "anchors" to probe the implicitly inferred arithmetic expressions from language models and then explicitly apply these expressions to complex numbers.
Specifically, as Figure 1 illustrates, when we detect complex numbers (10,477 and 7,459) that are challenging for LMs, we first replace them with anchor numbers (10 and 7, etc.) and let the LM output answers (3, etc.) that are much more accurate than those for complex numbers; we then inversely elicit the hidden arithmetic relationship (x1 - x2) implicitly inferred by the LM from the anchor inputs/outputs (10, 7, 3, etc.); and finally we explicitly perform the arithmetic on the initial complex numbers (10,477 and 7,459) to produce the precise answer (3,018). In this way, our method combines the strengths of LMs in understanding complex context and in handling simple numbers for reliable numerical reasoning.
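The pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `mock_lm` function is a hypothetical stand-in for querying a language model with anchor-substituted inputs, and we assume the hidden expression is linear in the operands so that a handful of anchor probes determines it via a solvable linear system.

```python
import numpy as np

def mock_lm(x1, x2):
    """Hypothetical stand-in for an LM that answers reliably
    on simple anchor numbers (here it implicitly computes x1 - x2)."""
    return x1 - x2

# Step 1: replace the complex operands with simple anchor pairs
# and collect the LM's answers on them.
anchors = [(10, 7), (9, 4), (8, 2)]
answers = [mock_lm(a, b) for a, b in anchors]

# Step 2: assume the hidden expression is linear, y = w1*x1 + w2*x2 + w0.
# Each anchor probe contributes one equation; stack them as A @ w = y
# and solve the linear system for the coefficients.
A = np.array([[a, b, 1.0] for a, b in anchors])
y = np.array(answers, dtype=float)
w = np.linalg.lstsq(A, y, rcond=None)[0]  # elicits w ~ [1, -1, 0], i.e. x1 - x2

# Step 3: apply the elicited expression to the original complex numbers.
x1, x2 = 10477, 7459
result = round(float(w @ np.array([x1, x2, 1.0])))
print(result)  # 3018
```

Three anchor probes suffice here because the assumed expression has three unknown coefficients; the anchor matrix `A` is non-singular, so the least-squares solve recovers the coefficients exactly.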

