GENERALIZING TO NEW DYNAMICAL SYSTEMS THROUGH FIRST-ORDER CONTEXT-BASED ADAPTATION

Anonymous authors
Paper under double-blind review

Abstract

In this paper, we propose FOCA (First-Order Context-based Adaptation), a learning framework for modeling sets of systems that are governed by common but unknown laws and that differ from one another only in their context. Inspired by classical modeling-and-identification approaches, FOCA learns to represent the common law through shared parameters and relies on online optimization to compute the system-specific context. Because the context is inferred by online optimization, training FOCA involves a bi-level optimization problem. To train FOCA efficiently, we use an exponential moving average (EMA)-based method that allows for fast training using only first-order derivatives. We test FOCA on polynomial regression and on time-series prediction tasks composed of three ODEs and one PDE, and find empirically that it outperforms the baselines.

1. INTRODUCTION

Scientists and engineers have made tremendous progress in modeling the behavior of natural and engineered systems and in optimizing model parameters to best describe the target system (Ljung, 2010). This modeling and system identification paradigm has enabled remarkable advances in modern science and engineering (Schrödinger, 1926; Black & Scholes, 1973; Hawking, 1975). However, applying this paradigm to complex systems is difficult because the mathematical modeling of systems requires a considerable degree of domain expertise, and finding the best system parameters requires massive experimentation.

The availability of large datasets and advances in deep learning tools have made it possible to model a target system without specific mathematical models, relying instead on flexible model classes (Brunton et al., 2016; Gupta et al., 2020; Menda et al., 2020; Jumper et al., 2021; Kochkov et al., 2021; Degrave et al., 2022). However, when the characteristics of target systems change (e.g., system parameters, boundary conditions), the flexibility of data-driven models makes them difficult to adapt. Deep learning approaches typically handle such a contextual change by collecting data from the new behavioral mode and re-training the model on the new dataset. This approach can be impractical, especially when the system is complex and the context changes frequently.

We are interested in developing a framework that learns a common shared model of the systems and infers the context that best describes the target system in order to predict its response. Our study considers a target system whose input x and response y can be described by y = f(x, c), where f denotes the function class shared by the target systems and c denotes the system-specific context. One possible approach for modeling such target systems is meta-learning (Hospedales et al., 2021), which learns how to adapt to new systems.
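To make the setting y = f(x, c) concrete, the following is a minimal sketch of context inference by first-order optimization. It is only an illustration under stated assumptions, not the FOCA method: the shared law f is taken to be a known quadratic rather than a learned model, the context c is its three coefficients, and no EMA-based training is involved. The names `f`, `infer_context`, `true_c`, and the step size and iteration count are all hypothetical choices made for this example.

```python
# Toy illustration of y = f(x, c): one shared law, one context per system.
# All names and hyperparameters here are assumptions, not the paper's code.

def f(x, c):
    """Shared law (assumed known here): a quadratic with context c = (c0, c1, c2)."""
    return c[0] + c[1] * x + c[2] * x * x


def infer_context(xs, ys, steps=2000, lr=0.1):
    """Estimate c with plain gradient descent on the mean squared error."""
    c = [0.0, 0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, y in zip(xs, ys):
            err = f(x, c) - y
            # The gradient of err**2 with respect to c_k is 2 * err * x**k.
            grad[0] += 2.0 * err / n
            grad[1] += 2.0 * err * x / n
            grad[2] += 2.0 * err * x * x / n
        c = [ck - lr * gk for ck, gk in zip(c, grad)]
    return c


# Adaptation data observed from one system with a hidden context.
true_c = (1.0, -2.0, 0.5)
xs = [i / 10.0 - 1.0 for i in range(21)]  # 21 evenly spaced inputs in [-1, 1]
ys = [f(x, true_c) for x in xs]

# Infer the context from the data, then predict the response at a new input.
c_hat = infer_context(xs, ys)
prediction = f(0.3, c_hat)
```

In this sketch only the context is optimized at adaptation time while the functional form stays fixed, which mirrors the split between a shared law and a system-specific context described above.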
Meta-learning typically combines an adaptation mechanism with a training procedure for that adaptation. One common meta-learning approach is to use an encoder that takes the adaptation data and returns the learned context (Santoro et al., 2016; Mishra et al., 2018; Garnelo et al., 2018; Kim et al., 2019). Although encoder-based adaptation schemes require only constant memory, their parameterized encoders limit the adaptation capability. Other approaches (pre-)train the parameters on a dataset collected from the various modes and update all parameters using gradient descent (Finn et al., 2017; Nagabandi et al., 2018; Rajeswaran et al., 2019). Despite their effective adaptability, those approaches are often prone to (meta-)overfitting (Antoniou et al., 2019), especially when the adaptation target is complex and adaptation data is scarce. Instead of updating all parameters, Raghu et al. (2019) and Zintgraf et al. (2019) update a subset

