OPTIMAL NEURAL PROGRAM SYNTHESIS FROM MULTIMODAL SPECIFICATIONS

Abstract

Multimodal program synthesis, which leverages different types of user input to synthesize a desired program, is an attractive way to scale program synthesis to challenging settings; however, it requires integrating noisy signals from the user (like natural language) with hard constraints on the program's behavior. This paper proposes an optimal neural synthesis approach where the goal is to find a program that satisfies user-provided constraints while also maximizing the program's score with respect to a neural model. Specifically, we focus on multimodal synthesis tasks in which the user intent is expressed using a combination of natural language (NL) and input-output examples. At the core of our method is a top-down recurrent neural model that places distributions over abstract syntax trees conditioned on the NL input. This model not only allows for efficient search over the space of syntactically valid programs, but also allows us to leverage automated program analysis techniques for pruning the search space based on infeasibility of partial programs with respect to the user's constraints. The experimental results on a multimodal synthesis dataset (STRUCTUREDREGEX) show that our method substantially outperforms prior state-of-the-art techniques in terms of accuracy while exploring fewer states during search.

1. INTRODUCTION

In recent years, there has been a revolution in machine learning-based program synthesis techniques for automatically generating programs from high-level expressions of user intent, such as input-output examples (Balog et al., 2017; Chen et al., 2019a; Devlin et al., 2017; Ellis et al., 2019; Kalyan et al., 2018; Shin et al., 2018) and natural language (Yaghmazadeh et al., 2017; Dong & Lapata, 2016; Rabinovich et al., 2017; Yin & Neubig, 2017; Desai et al., 2016; Wang et al., 2018). Many of these techniques use deep neural networks to consume specifications and then perform model-guided search to find a program that satisfies the user. However, because the user's specification can be inherently ambiguous (Devlin et al., 2017; Yin et al., 2018), a recent thread of work on multimodal synthesis combines different types of cues, such as natural language and examples, to allow program synthesis to scale effectively to more complex problems. Critically, this setting introduces a new challenge: how do we efficiently synthesize programs from a combination of hard and soft constraints originating from distinct sources?

In this paper, we formulate multimodal synthesis as an optimal synthesis task and propose an optimal synthesis algorithm to solve it. The goal of optimal synthesis is to generate a program that satisfies any hard constraints provided by the user while also maximizing its score under a learned neural network model that captures noisy information, like that from natural language. In practice, many programs satisfy the hard constraints, so this maximization is crucial to finding the program that actually meets the user's expectations: if our neural model is well-calibrated, a program that maximizes the score under the neural model is more likely to be the user's intended program. Our optimal neural synthesis algorithm takes as input multimodal user guidance.
In our setting, we train a neural model to take natural language input that can be used to guide the search for a program consistent with some user-provided examples. Because our search procedure enumerates programs according to their score, the first enumerated program satisfying the examples is guaranteed to be optimal according to the model. A central feature of our approach is the use of a tree-structured neural model, namely the abstract syntax network (ASN) (Rabinovich et al., 2017), for constructing syntactically valid programs in a top-down manner.

Figure 1: Example grammar for a simple language: S0 → V1; V1 → <0> | <1> | cat(V1, V1).
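The score-ordered enumeration behind this optimality guarantee can be sketched concretely. The following is a minimal sketch, not the paper's implementation: programs are nested tuples over a toy cat/<0>/<1> grammar, and LOGP is a fixed table of production log-probabilities standing in for the ASN's conditional distribution. Since every expansion adds a non-positive log-probability, a partial program's score upper-bounds all of its completions, so the first complete program popped from the frontier that passes the check is optimal under the model.

```python
import heapq
from itertools import count

# Stand-in for the neural model: a fixed log-probability per production.
LOGP = {"<0>": -1.0, "<1>": -1.5, "cat": -2.0}
HOLE = "V1"  # an unexpanded nonterminal leaf

def expansions(prog):
    """Yield (program, score delta) for each expansion of the leftmost hole."""
    if prog == HOLE:
        yield ("<0>", ()), LOGP["<0>"]
        yield ("<1>", ()), LOGP["<1>"]
        yield ("cat", (HOLE, HOLE)), LOGP["cat"]
        return
    op, kids = prog
    for i, kid in enumerate(kids):
        subs = list(expansions(kid))
        if subs:  # this subtree contains the leftmost hole
            for new_kid, delta in subs:
                yield (op, kids[:i] + (new_kid,) + kids[i + 1:]), delta
            return

def is_complete(prog):
    return prog != HOLE and all(is_complete(k) for k in prog[1])

def run(prog):
    """Evaluate a complete program of the toy language to a bit-string."""
    op, kids = prog
    return run(kids[0]) + run(kids[1]) if op == "cat" else ("0" if op == "<0>" else "1")

def optimal_search(check):
    """Return the highest-scoring complete program accepted by `check`, with its score."""
    tie = count()
    frontier = [(0.0, next(tie), HOLE)]  # (negated score, tiebreak, partial program)
    while frontier:
        neg_score, _, prog = heapq.heappop(frontier)
        if is_complete(prog):
            if check(prog):
                return prog, -neg_score
            continue
        for new_prog, delta in expansions(prog):
            heapq.heappush(frontier, (neg_score - delta, next(tie), new_prog))
    return None
```

For instance, searching for a program whose output is "01" returns cat(<0>, <1>), the smallest (and thus, under this score table, highest-scoring) consistent program, ahead of any larger program with the same output.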

2. PROBLEM FORMULATION

Context-free grammar. In this work, we assume that the syntax of the target programming language L is specified as a context-free grammar G = (V, Σ, R, S0), where V is a set of non-terminals, Σ is the set of terminal symbols, R is a set of productions, and S0 is the start symbol. We use the notation s to denote any symbol in V ∪ Σ. The grammar in Figure 1 has two nonterminals (S0 and V1) and three terminals (cat, <0>, and <1>). To simplify presentation in the rest of the paper, we assume that each grammar production is of the form v → f(s1, ..., sn), where f is a language construct (e.g., a constant like 0, or a built-in function/operator like cat, +, etc.). We represent programs in terms of their abstract syntax trees (ASTs). We assume that every node n in an abstract syntax tree is labeled with a grammar symbol s (denoted S(n)) and, where applicable, with a production r ∈ R (denoted R(n)) indicating which CFG production was used to expand that node. Figure 2 shows the AST representation of the program cat(cat(<0>, <1>), <0>) generated using the simple grammar shown in Figure 1.

Partial programs. For the purposes of this paper, a partial program is an AST in which some of the leaf nodes are labeled with non-terminal symbols in the grammar (see Figure 3); in a complete program, all leaves are labeled with terminal symbols. We use the notation EXPAND(P, l, r) to denote replacing leaf l with production r, which adds nodes s1, ..., sn to the tree, corresponding to the yield of r.

Consistency with examples. In this paper, we focus on multimodal synthesis problems where the user provides a logical specification φ in addition to a natural language description. More concretely, we focus on logical specifications in the form of positive and negative examples of the program's behavior.
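The definitions above can be made concrete with a short sketch. The encoding below is a hypothetical one (not from the paper): each production v → f(s1, ..., sn) is a (head, constructor, child-symbols) triple for the Figure 1 grammar, a Node carries its symbol S(n) and optional production R(n), and expand implements EXPAND(P, l, r).

```python
from dataclasses import dataclass, field

# Hypothetical encoding of the Figure 1 grammar: S0 -> V1; V1 -> <0> | <1> | cat(V1, V1).
# Each production v -> f(s1, ..., sn) is stored as (head, constructor, child_symbols).
PRODUCTIONS = {
    "S0": [("S0", "root", ["V1"])],
    "V1": [
        ("V1", "<0>", []),
        ("V1", "<1>", []),
        ("V1", "cat", ["V1", "V1"]),
    ],
}

@dataclass
class Node:
    symbol: str                # grammar symbol S(n)
    production: tuple = None   # production R(n); None while the node is an unexpanded leaf
    children: list = field(default_factory=list)

def open_leaves(root):
    """Unexpanded nonterminal leaves of a partial program, left to right."""
    if root.production is None:
        return [root]
    return [l for c in root.children for l in open_leaves(c)]

def expand(leaf, production):
    """EXPAND(P, l, r): replace leaf l with production r, adding one child per symbol in r's yield."""
    head, _ctor, child_syms = production
    assert leaf.symbol == head and leaf.production is None
    leaf.production = production
    leaf.children = [Node(s) for s in child_syms]
```

Under this encoding, a partial program is simply a tree with at least one open leaf, and a program is complete exactly when open_leaves returns an empty list.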
Each example is a pair (x, y) such that, for a positive example, we have P(x) = y for the target program P, and, for a negative example, we have P(x) ≠ y. Given a set of examples
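This consistency condition can be sketched directly. The snippet below is a minimal illustration, assuming the toy language of Figure 1: because its programs take no input, each example degenerates to an expected (positive) or forbidden (negative) output string; `run` is an assumed interpreter over nested-tuple ASTs, not part of the paper.

```python
# Programs are nested tuples (constructor, children), e.g. ("<0>", ()).
def run(prog):
    """Evaluate a complete AST of the toy language to a bit-string."""
    op, kids = prog
    if op == "cat":
        return run(kids[0]) + run(kids[1])
    return "0" if op == "<0>" else "1"

def consistent(prog, positives, negatives):
    """phi holds iff P(x) = y for every positive example and P(x) != y for every negative one."""
    out = run(prog)
    return all(out == y for y in positives) and all(out != y for y in negatives)
```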



Figure 2: Example of an AST derivation of cat(cat(<0>,<1>),<0>). Blue boxes represent symbols and yellow boxes represent productions.

Figure 3: Example of a partial program. n4 is a leaf node with nonterminal symbol V1.

The structure of the ASN model restricts search to programs that are syntactically correct, thereby avoiding the need to deal with program syntax errors (Kulal et al., 2019), and it allows us to search over programs in a flexible way, without imposing the left-to-right generation order of seq2seq models. More importantly, the use of top-down search allows us to more effectively leverage automated program analysis techniques for proving infeasibility of partial ASTs. As a result, our synthesizer can prune the search space more aggressively than prior work and significantly speed up search. While our network structure and pruning technique are adapted from prior work, we combine and generalize them in a new way for this optimal neural synthesis setting, and we show that our general approach leads to substantial improvements over previous synthesis approaches.
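To illustrate the kind of infeasibility pruning meant here, consider the toy cat/<0>/<1> language (an assumption made for illustration; the paper's analyses target a richer regex language). Every completion of a hole must emit at least one character, so a partial program whose minimum possible output length already exceeds some positive example's output can be discarded without ever expanding it:

```python
HOLE = "V1"  # an unexpanded nonterminal leaf; programs are nested (constructor, children) tuples

def min_output_len(prog):
    """Lower bound on the output length of any completion of `prog`."""
    if prog == HOLE:
        return 1  # the cheapest completion of V1 is a single literal
    op, kids = prog
    return sum(min_output_len(k) for k in kids) if op == "cat" else 1

def feasible(prog, positive_outputs):
    """False means no completion of `prog` can satisfy every positive example."""
    return all(min_output_len(prog) <= len(y) for y in positive_outputs)
```

Pruning a partial program this way eliminates its entire subtree of completions from the search frontier, which is what makes top-down enumeration with analysis-based pruning so much cheaper than enumerating complete programs.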

