Learning advanced mathematical computations from examples

Abstract

Using transformers over large generated datasets, we train models to learn mathematical properties of differential systems, such as local stability, behavior at infinity, and controllability. We achieve near-perfect prediction of qualitative characteristics and good approximations of numerical features of the system. This demonstrates that neural networks can learn to perform complex computations, grounded in advanced theory, from examples, without built-in mathematical knowledge.

1. Introduction

Scientists solve mathematical problems by applying rules and computational methods to the data at hand. These rules are derived from theory; they are taught in schools or implemented in software libraries, and they guarantee that a correct solution will be found. Over time, mathematicians have developed a rich set of computational tools that can be applied to many problems and that have been called "unreasonably effective" (Wigner, 1960).

Deep learning, on the other hand, learns from examples and solves problems by improving a random initial solution, without relying on domain-specific theory or computational rules. Deep networks have proven extremely efficient on a large number of tasks, but they struggle with relatively simple, rule-driven arithmetic problems (Saxton et al., 2019; Trask et al., 2018; Zaremba and Sutskever, 2014). Yet recent studies show that deep learning models can learn complex rules from examples. In natural language processing, models learn to output grammatically correct sentences without prior knowledge of grammar and syntax (Radford et al., 2019), or to automatically map one language into another (Bahdanau et al., 2014; Sutskever et al., 2014). In mathematics, deep learning models have been trained to perform logical inference (Evans et al., 2018), SAT solving (Selsam et al., 2018), or basic arithmetic (Kaiser and Sutskever, 2015). Lample and Charton (2020) showed that transformers can be trained from generated data to perform symbol manipulation tasks, such as function integration and finding formal solutions of ordinary differential equations.

In this paper, we investigate the use of deep learning models for complex mathematical tasks involving both symbolic and numerical computations. We show that models can predict the qualitative and quantitative properties of mathematical objects without built-in mathematical knowledge.
We consider three advanced problems of mathematics: the local stability and controllability of differential systems, and the existence and behavior at infinity of solutions of partial differential equations. All three problems have been widely researched and have many applications outside pure mathematics. They have known solutions that rely on advanced symbolic and computational techniques, from formal differentiation, Fourier transforms, and algebraic full-rank conditions to function evaluation, matrix inversion, and the computation of complex eigenvalues. We find that neural networks can solve these problems with very high accuracy, simply by looking at instances of problems and their solutions, while being entirely unaware of the underlying theory. In one of the quantitative problems
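To make the classical eigenvalue-based solution concrete, the following is a minimal sketch (not the paper's learned model) of how local stability of a differential system is conventionally decided: linearize at an equilibrium and check whether all eigenvalues of the Jacobian have negative real part. NumPy is assumed, and the example Jacobian (a damped oscillator) is illustrative.

```python
import numpy as np

def is_locally_stable(jacobian):
    """Return True if the equilibrium is asymptotically stable,
    i.e. every eigenvalue of the Jacobian has negative real part."""
    eigenvalues = np.linalg.eigvals(jacobian)
    return bool(np.all(eigenvalues.real < 0))

# Linearization of the damped oscillator x'' + x' + x = 0,
# written as a first-order system in (x, x').
A = np.array([[0.0, 1.0],
              [-1.0, -1.0]])
print(is_locally_stable(A))  # eigenvalues (-1 ± i*sqrt(3))/2, so True
```

This is exactly the kind of computation (complex eigenvalues of a matrix derived from the system) that the trained models must implicitly approximate from examples.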

