BUSTLE: BOTTOM-UP PROGRAM SYNTHESIS THROUGH LEARNING-GUIDED EXPLORATION

Abstract

Program synthesis is challenging largely because of the difficulty of search in a large space of programs. Human programmers routinely tackle the task of writing complex programs by writing sub-programs and then analyzing their intermediate results to compose them in appropriate ways. Motivated by this intuition, we present a new synthesis approach that leverages learning to guide a bottom-up search over programs. In particular, we train a model to prioritize compositions of intermediate values during search conditioned on a given set of input-output examples. This is a powerful combination because of several emergent properties. First, in bottom-up search, intermediate programs can be executed, providing semantic information to the neural network. Second, given the concrete values from those executions, we can exploit rich features based on recent work on property signatures. Finally, bottom-up search gives the system substantial flexibility in the order in which it generates the solution, allowing the synthesizer to build up a program from multiple smaller sub-programs. Overall, our empirical evaluation finds that the combination of learning and bottom-up search is remarkably effective, even with simple supervised learning approaches. We demonstrate the effectiveness of our technique on two datasets, one from the SyGuS competition and one of our own creation.

1. INTRODUCTION

Program synthesis is a longstanding goal of artificial intelligence research (Manna & Waldinger, 1971; Summers, 1977), but it remains difficult in part because of the challenges of search (Alur et al., 2013; Gulwani et al., 2017). The objective in program synthesis is to automatically write a program given a specification of its intended behavior, and current state-of-the-art methods typically perform some form of search over a space of possible programs. Many different search methods have been explored in the literature, both with and without learning. These include search within a version-space algebra (Gulwani, 2011), bottom-up enumerative search (Udupa et al., 2013), stochastic search (Schkufza et al., 2013), genetic programming (Koza, 1994), reducing the synthesis problem to logical satisfiability (Solar-Lezama et al., 2006), beam search with a sequence-to-sequence neural network (Devlin et al., 2017), learning to perform premise selection to guide search (Balog et al., 2017), learning to prioritize grammar rules within top-down search (Lee et al., 2018), and learned search based on partial executions (Ellis et al., 2019; Zohar & Wolf, 2018; Chen et al., 2019). While these approaches have yielded significant progress, none of them completely captures the following important intuition: human programmers routinely write complex programs by first writing sub-programs and then analyzing their intermediate results to compose them in appropriate ways. We propose a new learning-guided system for synthesis, called BUSTLE,¹ which follows this intuition in a straightforward manner. Given a specification of a program's intended behavior (in this paper given by input-output examples), BUSTLE performs bottom-up enumerative search for a satisfying program, following Udupa et al. (2013).
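To make the search procedure concrete, the following is a minimal sketch of plain bottom-up enumerative search from input-output examples, with no learning. It is an illustration under stated assumptions rather than the system described in this paper: the three-operation string DSL (`OPS`), the size measure (one unit per operation, input, or constant), and the names `bottom_up_search` and `compositions` are all hypothetical. Expressions are enumerated in order of increasing size, each one is executed on every example input, and an expression is kept only if its tuple of results has not been produced before.

```python
from itertools import product

# Hypothetical toy DSL (an assumption for illustration): (name, arity, function).
OPS = [
    ("Concat", 2, lambda a, b: a + b),
    ("Upper", 1, lambda a: a.upper()),
    ("First", 1, lambda a: a[:1]),
]


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def bottom_up_search(inputs, outputs, constants=("", " "), max_size=6):
    """Return an expression (as a string) mapping each input to its output.

    Returns None if no expression of size <= max_size fits all examples.
    """
    target = tuple(outputs)
    values_by_size = {}  # size -> list of (expression, results on all examples)
    seen = set()         # result tuples already reached by a smaller expression

    def add(size, expr, value):
        # Keep only the first expression that produces a given result tuple.
        if value in seen:
            return False
        seen.add(value)
        values_by_size.setdefault(size, []).append((expr, value))
        return True

    # Size 1: the input variable and the constants.
    base = [("x", tuple(inputs))]
    base += [(repr(c), tuple(c for _ in inputs)) for c in constants]
    for expr, value in base:
        if add(1, expr, value) and value == target:
            return expr

    # Larger sizes: apply each operation to previously stored values.
    for size in range(2, max_size + 1):
        for name, arity, fn in OPS:
            # The operation costs 1; the arguments share the remaining size.
            for arg_sizes in compositions(size - 1, arity):
                pools = [values_by_size.get(s, []) for s in arg_sizes]
                for args in product(*pools):
                    # Execute the new expression on every example at once.
                    value = tuple(fn(*vals) for vals in zip(*(a[1] for a in args)))
                    expr = f"{name}({', '.join(a[0] for a in args)})"
                    if add(size, expr, value) and value == target:
                        return expr
    return None


if __name__ == "__main__":
    print(bottom_up_search(["alan turing", "grace hopper"], ["A", "G"]))
```

Because every stored expression carries its concrete results on the examples, those results are available for inspection at each step of the search. In this sketch they are used only to discard duplicates and to test for a solution.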
Each program explored during the bottom-up search is an expression that can be executed on the inputs, so we apply a machine learning model to the resulting value to guide the search. The model is simply a classifier trained to predict whether the intermediate value produced by a partial program is part of an eventual solution. This combination of learning and

