LEARNING RIGID DYNAMICS WITH FACE INTERACTION GRAPH NETWORKS

Abstract

Simulating rigid collisions among arbitrary shapes is notoriously difficult due to complex geometry and the strong non-linearity of the interactions. While graph neural network (GNN)-based models are effective at learning to simulate complex physical dynamics, such as fluids, cloth and articulated bodies, they have been less effective and efficient on rigid-body physics, except with very simple shapes. Existing methods that model collisions through the meshes' nodes are often inaccurate because they struggle when collisions occur on faces far from nodes. Alternative approaches that represent the geometry densely with many particles are prohibitively expensive for complex shapes. Here we introduce the "Face Interaction Graph Network" (FIGNet), which extends beyond GNN-based methods and computes interactions between mesh faces, rather than nodes. Compared to learned node- and particle-based methods, FIGNet is around 4x more accurate in simulating complex shape interactions, while also being 8x more computationally efficient on sparse, rigid meshes. Moreover, FIGNet can learn frictional dynamics directly from real-world data, and can be more accurate than analytical solvers given modest amounts of training data. FIGNet represents a key step forward in one of the few remaining physical domains that have seen little competition from learned simulators, and offers allied fields such as robotics, graphics and mechanical design a new tool for simulation and model-based planning.

1. INTRODUCTION

* These authors contributed equally. Correspondence to {krallen,rubanova}@deepmind.com

Simulating rigid bodies accurately is vital in a wide variety of disciplines, from robotics to graphics to mechanical design. While popular general-purpose tools like Bullet (Coumans, 2015), MuJoCo (Todorov et al., 2012) and Drake (Tedrake, 2019) can generate plausible predictions, producing predictions that accurately match real-world observations is notoriously difficult (Wieber et al., 2016; Anitescu & Potra, 1997; Stewart & Trinkle, 1996; Fazeli et al., 2017; Lan et al., 2022). Numerical approximations necessary for efficiency are often inaccurate and unstable. Collision, contact and friction are challenging to model accurately, and their parameters are hard to estimate. The dynamics are non-smooth and nearly discontinuous (Pfrommer et al., 2020; Parmar et al., 2021), and influenced heavily by the fine-grained structure of colliding objects' surfaces (Bauza & Rodriguez, 2017). Slight errors in the physical model or state estimates can thus lead to large errors in objects' predicted trajectories. This underpins the well-known sim-to-real gap between results from analytical solvers and real-world experiments.

Learned simulators can potentially close the sim-to-real gap. They can be trained to correct imperfect state estimation, and can learn physical dynamics directly from observations, potentially producing more accurate predictions than analytical solvers (Allen et al., 2022; Kloss et al., 2022). Graph neural network (GNN)-based models, in particular, are effective at simulating liquids, sand, soft materials and simple rigids (Sanchez-Gonzalez et al., 2020; Mrowca et al., 2018; Li et al., 2019b; Pfaff et al., 2021; Li et al., 2019a). Many GNN-based models are node-based: they detect and resolve potential collisions based on whether two mesh nodes or particles are within a local neighborhood. However, collisions between objects do not only happen at nodes. For example, two cubes may

