PROMPTBOOSTING: BLACK-BOX TEXT CLASSIFICATION WITH TEN FORWARD PASSES

Abstract

We describe PROMPTBOOSTING, a query-efficient procedure for building a text classifier from a neural language model (LM) without access to the LM's parameters, gradients, or hidden representations. This form of "black-box" classifier training has become increasingly important as the cost of training and inference in large-scale LMs has grown. But existing black-box LM classifier learning approaches are themselves computationally inefficient, typically specializing LMs to the target task by searching in a large space of (discrete or continuous) prompts using zeroth-order optimization methods. Instead of directly optimizing in prompt space, PROMPTBOOSTING obtains a small pool of prompts via a gradient-free approach, and then constructs a large pool of weak learners by pairing these prompts with different elements of the LM's output distribution. These weak learners are then ensembled using the ADABOOST algorithm. The entire learning process requires only a small number of forward passes per batch and no backward pass. Experiments show that PROMPTBOOSTING achieves state-of-the-art performance in multiple black-box few-shot classification tasks, and matches or outperforms full fine-tuning in both few-shot and standard learning paradigms, while training 10x faster than existing black-box methods.

1. INTRODUCTION

Prompt-based learning has emerged as an effective method for adapting pretrained language models (LMs) to downstream natural language processing (NLP) tasks. A typical prompt-learning paradigm appends a specially designed sequence, called a prompt, to the input of a pretrained LM, thereby repurposing the model for a given downstream task. Compared to standard fine-tuning, prompt-based learning is far more parameter-efficient. Most prompt-based learning methods search for the optimal prompt for the downstream task. When gradient information of the pretrained LM is available, this optimization can easily be performed with standard gradient-based methods (Liu et al., 2021; Li & Liang, 2021; Lester et al., 2021; Zhang et al., 2021; Liu et al., 2022). However, in many real-world scenarios the parameters, gradients, and hidden representations of the LM are not accessible, a setting known as black-box tuning, which makes gradient-based prompt learning very challenging (Sun et al., 2022). To tackle this challenge, the most common existing black-box solution is to search for the optimal prompt with gradient-free optimization techniques, such as zeroth-order gradient approximation (Sun et al., 2022; Diao et al., 2022) or reinforcement-learning-guided optimization (Deng et al., 2022). However, these methods require a large number of queries to the LM, which, given the ever-growing size and computational cost of pretrained LMs, is highly inefficient and can lead to large approximation errors.

In this paper, we propose PROMPTBOOSTING, a novel black-box prompt learning approach that does not rely on searching for an optimal prompt and can thus drastically improve computational efficiency over existing methods. Figure 1 illustrates the pipeline of PROMPTBOOSTING. Specifically, rather than optimizing over prompts, PROMPTBOOSTING constructs a small pool of prompts via a gradient-free approach. These prompts are sub-optimal because they are not optimized for any downstream task. PROMPTBOOSTING then creates a large pool of weak learners by pairing each prompt with different elements of the LM's output distribution, which is commonly
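To make the ensembling step concrete, the sketch below shows how a fixed pool of weak learners can be combined with multiclass AdaBoost (the SAMME variant). This is an illustrative simplification, not the paper's implementation: in PROMPTBOOSTING each weak learner would be the LM queried with one prompt and one pairing of output-distribution elements to labels, whereas here the weak learners are simply precomputed prediction vectors on toy data.

```python
import numpy as np

def adaboost_weak_learners(predictions, y, n_rounds=10):
    """Multiclass AdaBoost (SAMME) over a fixed pool of weak learners.

    predictions: (n_learners, n_samples) int array giving each weak
    learner's predicted class for every training example. Returns the
    indices of the selected learners and their ensemble weights.
    """
    n = len(y)
    K = len(np.unique(y))
    w = np.full(n, 1.0 / n)                   # example weights
    picked, alphas = [], []
    for _ in range(n_rounds):
        # Weighted error of every learner under the current weights.
        errs = np.array([(w * (p != y)).sum() for p in predictions])
        j = int(errs.argmin())                # best learner this round
        err = max(errs[j], 1e-10)
        if err >= 1 - 1.0 / K:                # no better than chance
            break
        alpha = np.log((1 - err) / err) + np.log(K - 1)
        picked.append(j)
        alphas.append(alpha)
        w *= np.exp(alpha * (predictions[j] != y))  # upweight mistakes
        w /= w.sum()
    return picked, alphas

def ensemble_predict(predictions, picked, alphas, K):
    """Weighted vote of the selected weak learners."""
    n = predictions.shape[1]
    votes = np.zeros((n, K))
    for j, a in zip(picked, alphas):
        votes[np.arange(n), predictions[j]] += a
    return votes.argmax(axis=1)

# Toy demo (synthetic weak learners, not a real LM): three learners,
# each wrong on a different third of the data, so no single learner is
# accurate but their boosted combination is.
y = np.array([0, 1, 0, 1, 0, 1])
preds = np.array([
    [1, 0, 0, 1, 0, 1],   # wrong on examples 0, 1
    [0, 1, 1, 0, 0, 1],   # wrong on examples 2, 3
    [0, 1, 0, 1, 1, 0],   # wrong on examples 4, 5
])
picked, alphas = adaboost_weak_learners(preds, y)
acc = (ensemble_predict(preds, picked, alphas, K=2) == y).mean()
```

Because each round only needs the pool's predictions on the (reweighted) training set, the LM is queried once per prompt rather than once per optimization step, which is the source of the query efficiency described above.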

