HYPERDYNAMICS: META-LEARNING OBJECT AND AGENT DYNAMICS WITH HYPERNETWORKS

Abstract

We propose HyperDynamics, a dynamics meta-learning framework that conditions on an agent's interactions with the environment and optionally its visual observations, and generates the parameters of neural dynamics models based on inferred properties of the dynamical system. Physical and visual properties of the environment that are not part of the low-dimensional state yet affect its temporal dynamics are inferred from the interaction history and visual observations, and are implicitly captured in the generated parameters. We test HyperDynamics on a set of object pushing and locomotion tasks. It outperforms existing dynamics models in the literature that adapt to environment variations by learning dynamics over high-dimensional visual observations, capturing the interactions of the agent in recurrent state representations, or using gradient-based meta-optimization. We also show that our method matches the performance of an ensemble of separately trained experts, while also generalizing well to unseen environment variations at test time. We attribute its good performance to the multiplicative interactions between the inferred system properties, captured in the generated parameters, and the low-dimensional state representation of the dynamical system.

1. INTRODUCTION

Humans learn dynamics models that predict the results of their interactions with the environment, and use such predictions to select actions that achieve intended goals (Miall & Wolpert, 1996; Haruno et al., 1999). These models capture intuitive physics and mechanics of the world and are remarkably versatile: they are expressive and can be applied to all kinds of environments that we encounter in our daily lives, with varying dynamics and diverse visual and physical properties. In addition, humans do not consider these models fixed over the course of interaction; we observe how the environment behaves in response to our actions and quickly adapt our model to the situation at hand based on new observations. Consider the scenario of moving an object on the ground. We can infer how heavy the object is by simply looking at it, and we can then decide how hard to push. If it does not move as much as expected, we might realize it is heavier than we thought and increase the force we apply (Hamrick et al., 2011). Motivated by this, we propose HyperDynamics, a dynamics meta-learning framework that generates parameters for dynamics models (experts) dedicated to the situation at hand, based on observations of how the environment behaves. HyperDynamics has three main modules: i) an encoding module that encodes a few agent-environment interactions and the agent's visual observations into a latent feature code, which captures the properties of the dynamical system, ii) a hypernetwork (Ha et al., 2016) that conditions on the latent feature code and generates the parameters of a dynamics model dedicated to the observed system, and iii) a target dynamics model, constructed from the generated parameters, that takes as input the current low-dimensional system state and the agent action, and predicts the next system state, as shown in Figure 1.
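The three modules above can be sketched in a toy low-dimensional setting. This is a minimal illustration, not the paper's architecture: the layer sizes, the use of plain NumPy MLPs, and a single generated linear layer as the "expert" are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE, ACTION, FEAT, HIDDEN = 4, 2, 8, 16  # toy dimensions (assumptions)

def mlp(sizes):
    """Random-weight MLP represented as a list of (W, b) layers."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.tanh(x)
    return x

# i) encoding module: maps a (flattened) short history of interaction
#    tuples (s, a, s') to a latent feature code z of system properties.
encoder = mlp([(STATE + ACTION + STATE) * 3, HIDDEN, FEAT])

# ii) hypernetwork: maps z to the parameters of the target dynamics
#     model, here a single linear map from (state, action) to next state.
n_params = (STATE + ACTION) * STATE + STATE
hyper = mlp([FEAT, HIDDEN, n_params])

def generated_expert(z):
    theta = forward(hyper, z)
    W = theta[:(STATE + ACTION) * STATE].reshape(STATE + ACTION, STATE)
    b = theta[(STATE + ACTION) * STATE:]
    # iii) target dynamics model ("expert") built from generated params.
    def expert(s, a):
        return np.concatenate([s, a]) @ W + b
    return expert

# Usage: encode 3 interaction tuples, generate an expert, predict a state.
history = rng.standard_normal((STATE + ACTION + STATE) * 3)
z = forward(encoder, history)
expert = generated_expert(z)
s_next = expert(rng.standard_normal(STATE), rng.standard_normal(ACTION))
```

In the actual framework all modules are trained jointly by backpropagating the expert's state prediction error through the hypernetwork and encoder; the random weights here only stand in for learned ones.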
We refer to this target dynamics model as an expert, as it specializes in encoding the dynamics of a particular scene at a particular point in time. HyperDynamics conditions on real-time observations and generates dedicated expert models on the fly. It can be trained in an end-to-end differentiable manner to minimize the state prediction error of the generated experts in each task.

Figure 1: HyperDynamics encodes the visual observations and a set of agent-environment interactions, and uses a hypernetwork to generate the parameters of a dynamics model dedicated to the current environment and timestep. HyperDynamics for pushing follows the general formulation, with a visual encoder that detects the object in 3D and encodes its shape, and an interaction encoder that encodes a short history of the agent's interactions with the environment.

Many contemporary dynamics learning approaches assume a fixed system without considering potential changes in the underlying dynamics, and train a fixed dynamics model (Watter et al., 2015; Banijamali et al., 2018; Fragkiadaki et al., 2015; Zhang et al., 2019). Such expert models tend to fail when the system behavior changes. A natural solution is to train a separate expert for each dynamical system; however, this scales poorly and does not transfer to systems with novel properties. Motivated by this need, HyperDynamics aims to infer a system's properties simply by observing how it behaves, and to automatically generate an expert model dedicated to the observed system. To address system variations and obtain generalizable dynamics models, many other prior works condition on visual inputs and encode a history of interactions to capture the physical properties of the system (Finn & Levine, 2016; Ebert et al., 2018; Xu et al., 2019b; Pathak et al., 2017; Xu et al., 2019a; Li et al., 2018; Sanchez-Gonzalez et al., 2018b; Hafner et al., 2019).
These methods attempt to infer system properties; yet they optimize a single, fixed global dynamics model that takes as input both static system properties and fast-changing states, in the hope that such a model can handle system variations and generalize. We argue that a representation encoding system information varies across system variations, and that each instance of this representation should ideally correspond to a different dynamics model used in the corresponding setting. These models differ, but they also share a lot of information, as similar systems have similar dynamics functions. HyperDynamics makes explicit assumptions about the relationships between these systems, and attempts to exploit this regularity and learn such commonalities across different system variations. There also exists a family of approaches that attempt online model adaptation via meta-learning (Finn et al., 2017; Nagabandi et al., 2019; Clavera et al., 2018; Nagabandi et al., 2018a). These methods adapt the parameters of the dynamics model online through gradient descent. In this work, we empirically show that our approach adapts better than such methods to unseen system variations. We evaluate HyperDynamics on single- and multi-step state prediction, as well as on downstream model-based control tasks. Specifically, we apply it to a series of object pushing and locomotion tasks. Our experiments show that HyperDynamics generates performant dynamics models that match the performance of separately and directly trained experts, while also generalizing effectively to systems with novel properties in a few-shot manner. We attribute its good performance to the multiplicative combination of the explicitly factorized, encoded system features with the low-dimensional state representation of the system.
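The multiplicative interaction argued for above can be contrasted with the additive, concatenation-based conditioning of a single global model. In this toy linear sketch (the linear forms and sizes are illustrative assumptions, not the paper's models), a hypernetwork route makes the prediction bilinear in the system code z and the state s, so z gates every state dimension, whereas a fixed model merely appends z as extra inputs.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(3)   # inferred system-property code
s = rng.standard_normal(4)   # low-dimensional system state

# Hypernetwork route: z generates the map applied to s (multiplicative).
G = rng.standard_normal((3, 4 * 4)) * 0.1   # toy linear hypernetwork
W_z = (G.T @ z).reshape(4, 4)               # generated parameters
pred_multiplicative = W_z @ s

# Global-model route: z only enters additively via concatenation.
W_fixed = rng.standard_normal((4, 7)) * 0.1
pred_concat = W_fixed @ np.concatenate([z, s])
```

Because the generated map is linear in z, doubling the system code doubles the multiplicative prediction, while in the concatenation route changing z only shifts the output by a z-dependent offset independent of s.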
In summary, our contributions are as follows:
• We propose HyperDynamics, a general dynamics learning framework that generates expert dynamics models on the fly, conditioned on system properties.
• We apply our method in the contexts of object pushing and locomotion, and demonstrate that it matches the performance of separately trained system-specific experts.
• We show that our method generalizes well to systems with novel properties, outperforming contemporary methods that either optimize a single global model or attempt online model adaptation via meta-learning.

