METAP: HOW TO TRANSFER YOUR KNOWLEDGE ON LEARNING HIDDEN PHYSICS

Abstract

Gradient-based meta-learning methods have primarily focused on classical machine learning tasks such as image classification and function regression, where they perform well by recovering an underlying common representation among a set of given tasks. Recently, PDE-solving deep learning methods, such as neural operators, have started to make an important impact on learning and predicting the responses of complex physical systems directly from observational data. Since data acquisition in this context is commonly challenging and costly, the need to utilize and transfer existing knowledge to new and unseen physical systems is especially acute. Herein, we propose a novel meta-learning approach for transferring knowledge between neural operators, which can be seen as transferring the knowledge of solution operators between governing (unknown) PDEs with varying parameter fields. Based on the key theoretical observation that the underlying parameter field can be captured in the first layer of the neural operator model, in contrast to the typical final-layer transfer in existing meta-learning methods, our approach yields a provably universal solution operator for multiple PDE-solving tasks. As applications, we demonstrate the efficacy of the proposed approach on PDE-based datasets and a real-world material modeling problem, showing that our method can handle complex and nonlinear physical response learning tasks while greatly improving sampling efficiency on new and unseen tasks.

1. INTRODUCTION

Few-shot learning is an important problem in machine learning, where new tasks are learned with a very limited number of labelled datapoints (Wang et al., 2020). In recent years, significant progress has been made on few-shot learning using meta-learning approaches (Koch et al., 2015; Vinyals et al., 2016; Snell et al., 2017; Finn et al., 2017; Santoro et al., 2016; Antoniou et al., 2018; Ravi & Larochelle, 2016; Nichol & Schulman, 2018; Raghu et al., 2019; Tripuraneni et al., 2021; Collins et al., 2022). Broadly speaking, given a family of tasks, some used for training and others for testing, meta-learning approaches aim to learn a shared multi-task representation that generalizes across the training tasks and enables fast adaptation to new and unseen testing tasks. Although most meta-learning developments focus on conventional machine learning problems such as image classification, function regression, and reinforcement learning, studies on few-shot learning approaches for complex physical system modeling have been limited. The need for such approaches is just as acute, yet the understanding of how multi-task learning should be applied in this scenario is still nascent. As a motivating example, we consider new material discovery in the lab environment, where a material model is built from experimental measurements of a specimen's responses to different loadings. Since the physical properties (such as the mechanical and structural parameters) vary across material specimens, a model learnt from experimental measurements on one specimen would incur a large generalization error on future specimens. As a result, the data-driven model has to be trained repeatedly on a large number of material specimens, which makes the learning process inefficient.
Further, acquiring experimental measurements of these specimens is often challenging and expensive, and in some problems a large number of measurements is not even feasible. For example, in the design and testing of biosynthetic tissues, performing repeated loadings can induce cross-linking and permanent-set phenomena, which are known to alter tissue durability (Zhang & Sacks, 2017). As a result, it is critical to learn the physical response model of a new specimen with a sample size as small as possible. Furthermore, since many characterization methods for obtaining the underlying material mechanistic and structural properties are destructive (Misfeld & Sievers, 2007; Rieppo et al., 2008), in practice many physical properties are not measured and can only be treated as hidden and unknown variables. We typically have only limited access to measurements of the complex system responses induced by changes in these physical properties. Supervised operator learning methods are commonly used to address this class of problems: they take a number of observations of the loading field as input and predict the corresponding physical system response field as output, with each underlying PDE corresponding to one task. Herein, we consider the meta-learning of multiple complex physical systems (as tasks), such that all these tasks are governed by a common PDE with different (hidden) physical property or parameter fields. Formally, assume that we have a distribution p(T) over tasks, where each task T^η ∼ p(T) corresponds to a hidden physical property field b^η(x) ∈ B(R^{d_b}) that contains the task-specific mechanistic and structural information in our material modeling example. On task T^η, we have a number of observations of the loading field g_i^η(x) ∈ A(R^{d_g}) and the corresponding physical system response field u_i^η(x) ∈ U(R^{d_u}) determined by the hidden parameter field b^η(x).
Here, i is the sample index, and B, A and U are Banach spaces of functions taking values in R^{d_b}, R^{d_g} and R^{d_u}, respectively. For task T^η, our modeling goal is to learn the solution operator G^η : A → U, such that the learnt model can predict the corresponding physical response field u(x) for any loading field g(x). Without transfer learning, one would need to learn a surrogate solution operator for each task based only on the data pairs of that task, and repeat the training for every task; this procedure requires a relatively large number of observation pairs and a long training time per task. This physics-based modeling scenario therefore raises a key question: Given knowledge of a number of parametric PDE-solving tasks with different unknown parameters, how can one efficiently learn the best surrogate solution operator for a new and unknown parameter, with only a small set of training data pairs¹? To address this question, we introduce MetaP, a novel meta-learning approach for transferring knowledge between neural operators, which can be seen as transferring the knowledge of solution operators between governing (unknown) PDEs with varying hidden parameter fields. Our main contributions are:
• MetaP is the first neural-operator-based approach for multi-task learning. It not only preserves the generalizability to different resolutions and input functions inherited from its integral neural operator architecture, but also improves sampling efficiency on new tasks: for comparable accuracy, MetaP reduces the required number of measurements by ∼90%.
• Through rigorous operator approximation analysis, we make the key observation that the hidden parameter field can be captured by adapting the first layer of the neural operator model, in contrast to the typical final-layer transfer in existing meta-learning methods. By construction, MetaP serves as a provably universal solution operator for multiple PDE-solving tasks.
• On synthetic, benchmark, and real-world biological tissue datasets, the proposed method consistently outperforms existing gradient-based meta-learning baselines.
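The first-layer adaptation idea can be sketched in code. The following is a minimal, hypothetical PyTorch illustration, not the actual MetaP implementation: a plain residual MLP stands in for the integral (e.g., Fourier) neural operator layers, and all class and function names are ours. The point it demonstrates is that, at meta-test time, only the first (lifting) layer is trained on the few labelled context pairs of a new task, while the shared layers stay frozen.

```python
import torch
import torch.nn as nn

class TinyNeuralOperator(nn.Module):
    """Simplified stand-in for an integral neural operator: a lifting
    layer, shared hidden layers, and a projection back to the output."""
    def __init__(self, width=32, n_layers=3):
        super().__init__()
        # First (lifting) layer: the only part adapted per task,
        # since it can absorb the hidden parameter field b(x).
        self.lift = nn.Linear(2, width)  # input: (g(x), x) per grid point
        self.shared = nn.ModuleList(
            [nn.Linear(width, width) for _ in range(n_layers)])
        self.proj = nn.Linear(width, 1)  # output: u(x)

    def forward(self, g, x):
        h = self.lift(torch.stack([g, x], dim=-1))
        for layer in self.shared:
            h = torch.relu(h + layer(h))  # stand-in for an integral layer
        return self.proj(h).squeeze(-1)

def adapt_to_new_task(model, g_ctx, x_ctx, u_ctx, steps=100, lr=1e-2):
    """Meta-test adaptation: freeze the shared layers and train only the
    lifting layer on the new task's small labelled context set."""
    for p in model.parameters():
        p.requires_grad_(False)
    for p in model.lift.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(model.lift.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((model(g_ctx, x_ctx) - u_ctx) ** 2)
        loss.backward()
        opt.step()
    return model
```

In this sketch the shared layers carry the task-common solution-operator knowledge, while the per-task lifting layer plays the role of the hidden parameter field; a full implementation would use Fourier or graph kernel integral layers in place of the residual MLP blocks.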

2. BACKGROUND AND RELATED WORK

In this section we review the relevant background on hidden physics learning, neural operators, and gradient-based meta-learning methods, which will later inform the definition of our method.

2.1. HIDDEN PHYSICS LEARNING AND NEURAL OPERATORS

For many decades, physics-based PDEs have been commonly employed for predicting and monitoring complex system responses, and traditional numerical methods were employed to solve these



¹In some meta-learning literature (e.g., (Xu et al., 2020)), these small sets of labelled data pairs on a new task (or any task) are also called the context, and the learnt model is evaluated on an additional set of unlabelled data pairs, i.e., the target.
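The context/target protocol amounts to a simple per-task data split. A minimal Python sketch, with an illustrative function name of our own choosing:

```python
import random

def split_context_target(pairs, n_context, seed=0):
    """Split one task's labelled (loading, response) pairs into a small
    'context' set used for adaptation and a held-out 'target' set used
    for evaluation."""
    rng = random.Random(seed)          # deterministic shuffle per task
    pairs = list(pairs)
    rng.shuffle(pairs)
    return pairs[:n_context], pairs[n_context:]
```

In the few-shot setting of this paper, n_context would be small (a handful of loading/response measurements on the new specimen), and model quality is reported on the target set only.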

