ABSTRACT-TO-EXECUTABLE TRAJECTORY TRANSLATION FOR ONE-SHOT TASK GENERALIZATION

Abstract

Training long-horizon robotic policies in complex physical environments is essential for many applications, such as robotic manipulation. However, learning a policy that can generalize to unseen tasks is challenging. In this work, we propose to achieve one-shot task generalization by decoupling plan generation and plan execution. Specifically, our method solves complex long-horizon tasks in three steps: build a paired abstract environment by simplifying geometry and physics, generate abstract trajectories, and solve the original task with an abstract-to-executable trajectory translator. In the abstract environment, complex dynamics such as physical manipulation are removed, making abstract trajectories easier to generate. However, this introduces a large domain gap between abstract trajectories and the actual executed trajectories, as abstract trajectories lack low-level details and are not aligned frame-to-frame with the executed trajectory. In a manner reminiscent of language translation, our approach leverages a seq-to-seq model to bridge this domain gap, enabling the low-level policy to follow the abstract trajectory. Experimental results on various unseen long-horizon tasks with different robot embodiments demonstrate the practicality of our method for one-shot task generalization. Videos and more details can be found in the supplementary materials and on the project page.

1. INTRODUCTION

Training long-horizon robotic policies in complex physical environments is important for robot learning. However, directly learning a policy that can generalize to unseen tasks is challenging for Reinforcement Learning (RL) based approaches (Yu et al., 2020; Savva et al., 2019; Shen et al., 2021; Mu et al., 2021). The state/action spaces are usually high-dimensional, requiring many samples to learn policies for various tasks. One promising idea is to decouple plan generation and plan execution. In classical robotics, a high-level planner generates an abstract trajectory using symbolic planning over simpler state/action spaces than the original problem, while a low-level agent executes the plan in the full physical environment (Kaelbling & Lozano-Pérez, 2013; Garrett et al., 2020b). In our work, we promote the philosophy of abstract-to-executable via a learning-based approach. Given an abstract trajectory, a robot can aim for one-shot task generalization. Instead of memorizing high-dimensional policies for every task, the robot can leverage the power of planning in the low-dimensional abstract space and focus on learning low-level executors. This two-level framework works well for classical robotics tasks like motion control for robot arms, where a motion planner generates a kinematic motion plan at a high level and a PID controller executes the plan step by step. However, such a decomposition and abstraction is not always trivial for more complex tasks. In general domains, it either requires expert knowledge (e.g., PDDL (Garrett et al., 2020b; a)) to design this abstraction manually, or enormous samples to distill suitable abstractions automatically (e.g., HRL (Bacon et al., 2017; Vezhnevets et al., 2017)). We refer to Abel (2022) for an in-depth investigation into this topic.
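The two-level framework above can be illustrated with a minimal sketch: a high-level "planner" produces an abstract kinematic trajectory by interpolating in a low-dimensional state space while ignoring physics, and a low-level proportional controller tracks each waypoint. All names, gains, and the toy dynamics here are illustrative assumptions, not the method proposed in this paper.

```python
import numpy as np

def abstract_plan(start, goal, n_waypoints=10):
    """High-level planner (sketch): linearly interpolate in the abstract,
    low-dimensional state space, ignoring dynamics entirely."""
    return [start + (goal - start) * t
            for t in np.linspace(0.0, 1.0, n_waypoints)]

def low_level_execute(state, waypoints, kp=0.5, steps_per_wp=50, tol=1e-2):
    """Low-level executor (sketch): a proportional controller that chases
    each abstract waypoint under toy single-integrator dynamics."""
    trajectory = [state.copy()]
    for wp in waypoints:
        for _ in range(steps_per_wp):
            action = kp * (wp - state)   # proportional feedback toward waypoint
            state = state + action       # toy dynamics: direct integration
            trajectory.append(state.copy())
            if np.linalg.norm(wp - state) < tol:
                break                    # close enough; move to next waypoint
    return trajectory

start, goal = np.zeros(2), np.array([1.0, 2.0])
plan = abstract_plan(start, goal)
executed = low_level_execute(start.copy(), plan)
final_error = np.linalg.norm(executed[-1] - goal)  # small: plan was tracked
```

In this toy setting the abstract and physical spaces coincide, so tracking is trivial; the difficulty addressed in this work arises precisely when they do not align and no frame-to-frame correspondence exists.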
On the other hand, designing imperfect high-level agents whose state space does not precisely align with the low-level executor's can be much easier and more flexible. High-level agents can be planners with abstract models and simplified dynamics in the simulator (by discarding some physical features, e.g., enabling a "magic" gripper (Savva et al., 2019; Torabi et al., 2018)), or existing "expert" agents such as humans or agents pre-trained on different manipulators. Though imperfect,

