SIMPLE EMERGENT ACTION REPRESENTATIONS FROM MULTI-TASK POLICY TRAINING

Abstract

The low-level sensory and motor signals in deep reinforcement learning, which exist in high-dimensional spaces such as image observations or motor torques, are inherently challenging to understand or to utilize directly for downstream tasks. While sensory representations have been extensively studied, representations of motor actions remain an area of active exploration. Our work reveals that a space of meaningful action representations emerges when a multi-task policy network takes both states and task embeddings as inputs. We add moderate constraints to improve the representational quality of this space. Within it, interpolated or composed embeddings can serve as a high-level interface, providing instructions to the agent for executing meaningful action sequences. Empirical results demonstrate that the proposed action representations are effective for intra-action interpolation and inter-action composition with limited or no additional learning. Furthermore, our approach exhibits superior task-adaptation ability compared to strong baselines on MuJoCo locomotion tasks. Our work sheds light on the promising direction of learning action representations for efficient, adaptable, and composable RL, forming a basis for abstract action planning and for understanding the space of motor signals. Project page: https://sites.google.com/view/

1. INTRODUCTION

Deep reinforcement learning (RL) has shown great success in learning near-optimal policies for performing low-level actions with pre-defined reward functions. However, reusing this learned knowledge to efficiently accomplish new tasks remains challenging. In contrast, humans naturally summarize low-level muscle movements into high-level action representations, such as "pick up" or "turn left", which can be reused in novel tasks with slight modifications. As a result, we carry out the most complicated movements without thinking about the detailed joint motions or muscle contractions, relying instead on high-level action representations (Kandel et al., 2021). By analogy with such human abilities, we ask: can RL agents have action representations of low-level motor controls that can be reused, modified, or composed to perform new tasks? As pointed out in Kandel et al. (2021), "the task of the motor systems is the reverse of the task of the sensory systems. Sensory processing generates an internal representation in the brain of the outside world or of the state of the body. Motor processing begins with an internal representation: the desired purpose of movement."

In the past decade, representation learning has made significant progress in representing high-dimensional sensory signals, such as images and audio, to reveal the geometric and semantic structures hidden in raw signals (Bengio et al., 2013; Chen et al., 2018; Kornblith et al., 2019; Chen et al., 2020; Baevski et al., 2020; Radford et al., 2021; Bardes et al., 2021; Bommasani et al., 2021; He et al., 2022; Chen et al., 2022). With the generalization ability of sensory representation learning, downstream control tasks can be accomplished efficiently, as shown by recent studies (Nair et al., 2022; Xiao et al., 2022; Yuan et al., 2022). While there have been significant advances in sensory representation learning, action representation learning remains largely unexplored.
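To make the core idea concrete, the following is a minimal sketch of a multi-task policy conditioned on a task embedding, and of embedding interpolation as a high-level instruction. All names (`policy`, `e_walk`, `e_turn`) and dimensions are illustrative assumptions, and the weights are random placeholders rather than a trained model; this is not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, EMBED_DIM, ACTION_DIM, HIDDEN = 8, 4, 2, 32

# Placeholder weights of a (hypothetically trained) multi-task policy MLP.
W1 = rng.normal(size=(STATE_DIM + EMBED_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, ACTION_DIM)) * 0.1
b2 = np.zeros(ACTION_DIM)

def policy(state, task_embedding):
    """Multi-task policy: the motor command depends on both the
    current state and a task embedding identifying the skill."""
    x = np.concatenate([state, task_embedding])
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2)  # low-level motor command

# Two task embeddings, e.g. learned for "walk" and "turn" (names illustrative).
e_walk = rng.normal(size=EMBED_DIM)
e_turn = rng.normal(size=EMBED_DIM)
state = rng.normal(size=STATE_DIM)

# Intra-action interpolation: a convex combination of two embeddings acts as
# a high-level instruction in the emergent representation space, steering the
# policy toward behavior "between" the two source skills.
alpha = 0.5
a_interp = policy(state, alpha * e_walk + (1 - alpha) * e_turn)
print(a_interp.shape)
```

The key design point is that the embedding enters the network alongside the state, so the space of embeddings, not the network weights, becomes the interface for selecting and blending behaviors.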
To address this gap, we investigate this question and seek generalizable action representations that can be reused or efficiently adapted to perform new tasks. An important concept


