D-CIPHER: DISCOVERY OF CLOSED-FORM PARTIAL DIFFERENTIAL EQUATIONS

Abstract

Closed-form differential equations, including partial differential equations and higher-order ordinary differential equations, are among the most important tools used by scientists to model and better understand natural phenomena. Discovering these equations directly from data is challenging because it requires modeling relationships between various derivatives that are not observed in the data (equation-data mismatch) and involves searching across a huge space of possible equations. Current approaches make strong assumptions about the form of the equation and thus fail to discover the equations governing many well-known phenomena. Moreover, many of them resolve the equation-data mismatch by estimating the derivatives, which makes them inadequate for noisy and infrequent observations. To this end, we propose D-CIPHER, which is robust to measurement artifacts and can uncover a new and very general class of differential equations. We further design a novel optimization procedure, CoLLie, to help D-CIPHER search through this class efficiently. Finally, we demonstrate empirically that D-CIPHER can discover many well-known equations that are beyond the capabilities of current methods.

1. INTRODUCTION

Scientists have used mathematical equations to describe the world for centuries. In particular, closed-form differential equations have proven to be one of the best tools for modeling physical phenomena. A differential equation describes a relationship between a quantity and its derivatives (rates of change); it is called closed-form if this relationship is described by a mathematical expression consisting of a finite number of variables, constants, arithmetic operations, and some well-known functions (e.g., exponential, logarithm, trigonometric functions).¹ Closed-form differential equations provide a general description of reality in a concise representation that is amenable to closer inspection by scientists. This renders them transparent and interpretable to human experts. Historically, discovering these equations has required a thorough knowledge of the theory, strong mathematical skills, substantial creativity, and good intuition. The goal of this work is to discover closed-form differential equations directly from data, thus accelerating the process of scientific discovery.
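
To make the definition concrete, consider two standard textbook examples, chosen here purely for illustration (the symbols u, y, α, m, c, and k are assumptions of this sketch and are not taken from the paper's experiments): the one-dimensional heat equation, a closed-form PDE, and the damped harmonic oscillator, a closed-form second-order ODE.

```latex
\begin{align}
\frac{\partial u}{\partial t} &= \alpha \, \frac{\partial^2 u}{\partial x^2}, \\
m \, \frac{d^2 y}{dt^2} + c \, \frac{dy}{dt} + k \, y &= 0.
\end{align}
```

Each equation relates a quantity to its derivatives using only a finite number of variables, constants, and arithmetic operations, so both satisfy the definition of closed-form given above.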

Challenges in discovering differential equations from data

• Partial and higher-order derivatives. Many algorithms (Brunton et al., 2016; Qian et al., 2022) can only identify Ordinary Differential Equations (ODEs) which evolve only with respect to one variable (usually time). In contrast, many natural phenomena are described by equations involving many variables (e.g., spatial coordinates) called Partial Differential Equations (PDEs). Many equations also involve higher-order derivatives.
• Derivatives not observed. Discovering differential equations from data is challenging because the derivatives are usually not observed in the dataset (equation-data mismatch; Qian et al., 2022). This makes verifying a candidate equation a non-trivial task. Most of the methods proposed in the literature try to resolve this issue by estimating the derivatives (Brunton et al., 2016; Rudy et al., 2017). However, estimating the derivative is difficult, especially when the data is sampled infrequently or with high noise (Qian et al., 2022; Messenger & Bortz, 2021a).
• Strong assumptions and constrained search space. The majority of algorithms for identifying differential equations make many assumptions about the form of the equation. In particular, they

¹ Detailed discussion in Appendix A.2.

