PLANSFORMER: GENERATING SYMBOLIC PLANS USING TRANSFORMERS

Abstract

Large Language Models (LLMs) have been the subject of active research, significantly advancing the field of Natural Language Processing (NLP). From BERT to BLOOM, LLMs have surpassed state-of-the-art results on various natural language tasks such as question answering, summarization, and text generation. Many ongoing efforts focus on understanding LLMs' capabilities, including their knowledge of the world, syntax, and semantics. However, extending the textual prowess of LLMs to symbolic reasoning has been slow and predominantly focused on mathematical problems. In this paper, we explore the use of LLMs for automated planning, a branch of AI concerned with realizing action sequences (plans) to achieve a goal, typically executed by intelligent agents, autonomous robots, and unmanned vehicles. We introduce Plansformer1, an LLM fine-tuned on planning problems that generates plans with favorable behavior in terms of correctness and length while requiring reduced knowledge-engineering effort. We also demonstrate the adaptability of Plansformer to different planning domains of varying complexity, owing to the transfer-learning abilities of LLMs. For one configuration of Plansformer, we achieve 97% valid plans, of which 95% are optimal, for Towers of Hanoi, a puzzle-solving domain.

1. INTRODUCTION

Large Language Models (LLMs), based on the transformer (neural) architecture (Vaswani et al., 2017; Devlin et al., 2018; Brown et al., 2020; Chowdhery et al., 2022), have significantly advanced the field of Natural Language Processing (NLP). Their use has grown dramatically in recent years (Li, 2022) as researchers develop newer and larger LLMs. From BERT to the more recent BLOOM, language models have surpassed state-of-the-art results on various natural language tasks. For example, PaLM (Chowdhery et al., 2022) achieved breakthrough performance on a plethora of natural language tasks such as inference, question answering, and commonsense reasoning, and outperformed average human performance on the BIG-bench benchmark. Despite the textual prowess of LLMs, their impact has been limited in domains that involve symbols. For example, benchmarks in symbolic domains such as mathematics (Hendrycks et al., 2021b; Cobbe et al., 2021) and coding (Hendrycks et al., 2021a; Chen et al., 2021) document the failures of LLMs at handling symbols. In automated planning, Valmeekam et al. (2022) suggest that even state-of-the-art LLMs cannot reason with symbolic data, and offer a new suite of benchmarks to test their reasoning capabilities. Recently, there has been considerable interest in LLMs for code generation, for example, CodeT5 (Wang et al., 2021), CodeBERT (Feng et al., 2020), and Codex (Chen et al., 2021). In this paper, we propose to take LLMs trained to generate code and repurpose them to generate valid plans. To advance research in LLM-based automated planning, we create training and test datasets for several planning domains. We use CodeT5 (base), a transformer-based code generation model that achieves state-of-the-art results on CodeXGLUE, as the pre-trained LLM.
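Fine-tuning a code LLM on planning tasks requires flattening each problem into an input/target text pair. The exact serialization Plansformer uses is not specified in this excerpt; the sketch below is a hypothetical encoding, with the `<GOAL>`/`<INIT>`/`<ACTION>` tags and predicate strings chosen purely for illustration:

```python
def planning_task_to_prompt(goal, init, actions):
    """Serialize a planning task into a flat prompt string for a
    seq2seq LLM such as CodeT5. Hypothetical format: the tags and
    layout are illustrative, not Plansformer's actual encoding."""
    parts = ["<GOAL>", " ".join(goal), "<INIT>", " ".join(init)]
    for name, params in actions:
        # One tag per available operator, with its parameter list.
        parts.append("<ACTION> " + name + " " + " ".join(params))
    return " ".join(parts)


def plan_to_target(plan_steps):
    """Serialize a ground plan (the training target) as a flat
    sequence of actions, one per step."""
    return " ".join("(" + " ".join(step) + ")" for step in plan_steps)
```

A fine-tuned model would then map the prompt string to the target string, e.g. `planning_task_to_prompt(["(on a b)"], ["(clear a)", "(ontable b)"], [("pick-up", ["?x"])])` paired with `plan_to_target([("pick-up", "a")])`.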
We select CodeT5 for its ability to generate goal-directed, sequential instructions and semantically meaningful program code under syntactic and structural constraints. We then present Plansformer, an LLM trained to generate symbolic plans of high quality in terms of correctness and length. Our experimental results indicate that the syntactic/symbolic knowledge the CodeT5 model learned from different programming languages can benefit the PDDL-based automated planning task. For example, in one configuration of Plansformer tested on a puzzle-solving domain, Towers of Hanoi (hanoi), our model generated 97% valid plans, of which 95% were optimal.
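The notions of validity and optimality behind these percentages can be made concrete for Towers of Hanoi: a plan is valid if every move takes the top disk of one peg and places it on an empty peg or a larger disk, ending with all disks on the goal peg, and a valid plan for n disks is optimal exactly when it uses 2^n - 1 moves. A minimal checker along these lines (the peg names and move encoding are our own, not Plansformer's output format):

```python
def validate_hanoi_plan(n_disks, moves):
    """Simulate a Towers of Hanoi plan given as (disk, src, dst)
    triples over pegs 'a', 'b', 'c'. Returns True iff every move is
    legal and the final state has all disks on peg 'c'."""
    # Pegs store disks bottom-to-top; disk n_disks is the largest.
    pegs = {"a": list(range(n_disks, 0, -1)), "b": [], "c": []}
    for disk, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disk:
            return False  # disk is not on top of the source peg
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # cannot place a disk on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["c"] == list(range(n_disks, 0, -1))


def is_optimal_hanoi_plan(n_disks, moves):
    """A valid Hanoi plan is optimal iff it has 2^n - 1 moves."""
    return validate_hanoi_plan(n_disks, moves) and len(moves) == 2 ** n_disks - 1
```

For two disks, the plan `[(1, "a", "b"), (2, "a", "c"), (1, "b", "c")]` is both valid and optimal; running generated plans through such a simulator is one straightforward way to compute validity and optimality rates.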



1 The dataset, code, and all fine-tuned model checkpoints will be released for public use.

