PHYSICS-AWARE SPATIOTEMPORAL MODULES WITH AUXILIARY TASKS FOR META-LEARNING

Abstract

Modeling the dynamics of real-world physical systems is critical for spatiotemporal prediction tasks, but it is challenging when data is limited. The scarcity of real-world data and the difficulty of reproducing the data distribution hinder the direct application of meta-learning techniques. Although knowledge of the partial differential equations (PDEs) governing the data can help a model adapt quickly to a few observations, it is usually infeasible to identify the exact equations for observations from real-world physical systems. In this work, we propose a framework, physics-aware meta-learning with auxiliary tasks, in which spatial modules incorporate PDE-independent knowledge and temporal modules use the generalized features from the spatial modules to adapt to the limited data. The framework is inspired by a local conservation law, expressed mathematically as a continuity equation, and does not require the exact form of the governing equation to model spatiotemporal observations. The proposed method mitigates the need for a large number of real-world tasks for meta-learning by leveraging spatial information in simulated data to meta-initialize the spatial modules. We apply the proposed framework to both synthetic and real-world spatiotemporal prediction tasks and demonstrate its superior performance with limited observations.

1. INTRODUCTION

Deep learning has recently shown promise in devising new solutions for applications involving natural phenomena, such as climate change (Manepalli et al., 2019; Drgona et al., 2019), ocean dynamics (Cosne et al., 2019), and air quality (Soh et al., 2018; Du et al., 2018; Lin et al., 2018). Deep learning techniques inherently require a large amount of data for effective representation learning, so their performance degrades significantly when only a limited number of observations are available. However, in many real-world tasks involving physical systems, we have access to only a limited amount of data. One example is air quality monitoring (Berman, 2017), in which sensors are irregularly distributed over space: many sensors are located in urban areas, whereas far fewer are located in vast rural areas. Another example is extreme weather modeling and forecasting, which involves temporally short events (e.g., tropical cyclones (Racah et al., 2017b)) without sufficient observations over time. Moreover, inevitable missing values from sensors (Cao et al., 2018; Tang et al., 2019) further reduce the number of operating sensors and shorten the length of fully observed sequences. Thus, achieving robust performance from a few spatiotemporal observations in physical systems remains an essential but challenging problem.

Learning from a limited amount of data from physical systems can be considered a few-shot learning problem. Although many meta-learning techniques (Schmidhuber, 1987; Andrychowicz et al., 2016; Ravi & Larochelle, 2017; Santoro et al., 2016; Snell et al., 2017; Finn et al., 2017) have recently been developed to address this few-shot learning setting, challenges remain in applying existing meta-learning methods to modeling natural phenomena. First, it is not easy to find a set of similar meta-tasks that provide the shareable latent representations needed to understand targeted observations.
For instance, while image-related tasks (e.g., object detection (He et al., 2017) or visual question answering (Andreas et al., 2016; Fukui et al., 2016)) can take advantage of an image-feature extractor pre-trained on a large set of images (Deng et al., 2009) with a well-designed architecture (Simonyan & Zisserman, 2014; He et al., 2016; Sandler et al., 2018), there is no such large data corpus that is widely applicable for understanding natural phenomena. Second, unlike computer vision or natural language processing tasks where a common object (images or words) is clearly de-

