PLASTICINELAB: A SOFT-BODY MANIPULATION BENCHMARK WITH DIFFERENTIABLE PHYSICS

Abstract

Simulated virtual environments serve as one of the main driving forces behind developing and evaluating skill learning algorithms. However, existing environments typically simulate only rigid-body physics. Additionally, the simulation process usually does not provide gradients that might be useful for planning and control optimization. We introduce a new differentiable physics benchmark called PlasticineLab, which includes a diverse collection of soft-body manipulation tasks. In each task, the agent uses manipulators to deform the plasticine into a desired configuration. The underlying physics engine supports differentiable elastic and plastic deformation using the DiffTaichi system, posing many underexplored challenges to robotic agents. We evaluate several existing reinforcement learning (RL) methods and gradient-based methods on this benchmark. Experimental results suggest that 1) RL-based approaches struggle to solve most of the tasks efficiently; 2) gradient-based approaches, by optimizing open-loop control sequences with the built-in differentiable physics engine, can rapidly find a solution within tens of iterations, but still fall short on multi-stage tasks that require long-term planning. We expect that PlasticineLab will encourage the development of novel algorithms that combine differentiable physics and RL for more complex physics-based skill learning tasks. PlasticineLab is publicly available.¹

1. INTRODUCTION

Virtual environments, such as the Arcade Learning Environment (ALE) (Bellemare et al., 2013), MuJoCo (Todorov et al., 2012), and OpenAI Gym (Brockman et al., 2016), have significantly benefited the development and evaluation of learning algorithms for intelligent agent control and planning. However, existing virtual environments for skill learning typically involve rigid-body dynamics only. Research on establishing standard soft-body environments and benchmarks is sparse, despite the wide range of applications of soft bodies in multiple research fields, e.g., simulating virtual surgery in healthcare, modeling humanoid characters in computer graphics, developing biomimetic actuators in robotics, and analyzing fracture and tearing in material science. Compared to its rigid-body counterpart, soft-body dynamics is much more intricate to simulate, control, and analyze. One of the biggest challenges comes from its infinite degrees of freedom (DoFs) and the corresponding high-dimensional governing equations. The intrinsic complexity of soft-body dynamics invalidates the direct application of many successful robotics algorithms designed for rigid bodies only and inhibits the development of a simulation benchmark for evaluating novel algorithms tackling soft-body tasks.

In this work, we aim to address this problem by proposing PlasticineLab, a novel benchmark for running and evaluating 10 soft-body manipulation tasks with 50 configurations in total. These tasks require complex operations, including pinching, rolling, chopping, molding, and carving. A key feature of our benchmark is its use of differentiable physics in the simulation environment, which provides, for the first time, analytical gradient information in a soft-body benchmark and makes it possible to conduct supervised learning with gradient-based optimization. In terms of the soft-body model, we choose to study plasticine (Fig. 1, left), a versatile elastoplastic material for sculpting.
Plasticine deforms elastically under small deformation, and plastically under large deformation. Compared to regular elastic soft bodies, plasticine exhibits more diverse and realistic behaviors and brings challenges unexplored in previous research, making it a representative medium to test soft-body manipulation algorithms (Fig. 1, right).

We implement PlasticineLab, its gradient support, and its elastoplastic material model using Taichi (Hu et al., 2019a), whose CUDA backend leverages massive parallelism on GPUs to simulate a diverse collection of 3D soft bodies in real time. We model the elastoplastic material using the Moving Least Squares Material Point Method (Hu et al., 2018) and the von Mises yield criterion. We use Taichi's two-scale reverse-mode differentiation system (Hu et al., 2020) to automatically compute gradients, including the numerically challenging SVD gradients brought by the plastic material model.

With full gradients at hand, we evaluated gradient-based planning algorithms on all soft-body manipulation tasks in PlasticineLab and compared their efficiency to that of RL-based methods. Our experiments revealed that gradient-based planning algorithms could find a more precise solution within tens of iterations with the extra knowledge of the physical model. At the same time, RL methods may fail even after 10K episodes. However, gradient-based methods lack enough momentum to resolve long-term planning, especially on multi-stage tasks. These findings have deepened our understanding of RL and gradient-based planning algorithms. Additionally, they suggest a promising direction of combining the benefits of both families of methods to advance complex planning tasks involving soft-body dynamics.

In summary, we contribute in this work the following:
• We introduce, to the best of our knowledge, the first skill learning benchmark involving elastic and plastic soft bodies.
• We develop a fully featured differentiable physics engine, which supports elastic and plastic deformation, soft-rigid material interaction, and a tailored contact model for differentiability.
• The broad task coverage in the benchmark enables a systematic evaluation and analysis of representative RL and gradient-based planning algorithms. We hope such a benchmark can inspire future research to combine differentiable physics with imitation learning and RL.
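To make the idea of gradient-based planning with a differentiable simulator concrete, the following is a minimal sketch under strong simplifying assumptions. It is not PlasticineLab's engine: the "simulator" is a hypothetical damped 1D point mass instead of an MLS-MPM soft body, and the gradient is a hand-written reverse pass instead of Taichi's automatic differentiation. The function names (`rollout`, `loss_and_grad`, `optimize`) and all parameter values are illustrative choices. What it shares with the setting described above is the structure: an open-loop action sequence is rolled out, a loss on the final state is differentiated with respect to every action, and plain gradient descent updates the sequence for a few tens of iterations.

```python
# Toy illustration only: a damped 1D point mass stands in for the soft-body
# simulator, and the reverse pass is written by hand (assumption: no autodiff).

def rollout(actions, dt=0.1, damping=0.9):
    """Simulate the point mass under an open-loop force sequence; return final position."""
    x, v = 0.0, 0.0
    for a in actions:
        v = damping * v + dt * a   # velocity update with damping
        x = x + dt * v             # position update
    return x


def loss_and_grad(actions, target, dt=0.1, damping=0.9):
    """Return the squared final-position error and its gradient w.r.t. each action."""
    x_final = rollout(actions, dt, damping)
    loss = (x_final - target) ** 2

    # Reverse pass through the time steps (adjoint of the updates in rollout).
    grad_x = 2.0 * (x_final - target)  # dL/dx is the same at every step (x passes through)
    grad_v = 0.0                       # dL/dv flowing back from later steps
    grads = [0.0] * len(actions)
    for t in reversed(range(len(actions))):
        grad_v += dt * grad_x          # from x_{t+1} = x_t + dt * v_{t+1}
        grads[t] = dt * grad_v         # from v_{t+1} = damping * v_t + dt * a_t
        grad_v = damping * grad_v      # propagate to v_t
    return loss, grads


def optimize(target=1.0, horizon=20, iters=30, lr=5.0):
    """Gradient descent on the open-loop action sequence."""
    actions = [0.0] * horizon
    for _ in range(iters):
        _, grads = loss_and_grad(actions, target)
        actions = [a - lr * g for a, g in zip(actions, grads)]
    final_loss, _ = loss_and_grad(actions, target)
    return actions, final_loss


if __name__ == "__main__":
    actions, final_loss = optimize()
    print("final position:", rollout(actions))
    print("final loss:", final_loss)
```

Because this toy loss is a smooth quadratic in the actions, gradient descent converges quickly. It does not reproduce the difficulties reported for multi-stage tasks, where a single local descent direction may not be enough.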



Figure 1: Left: A child deforming a piece of plasticine into a thin pie using a rolling pin. Right: The challenging RollingPin scene in PlasticineLab. The agent needs to flatten the material by rolling the pin back and forth, so that the plasticine deforms into the target shape.

