kNN PROMPTING: BEYOND-CONTEXT LEARNING WITH CALIBRATION-FREE NEAREST NEIGHBOR INFERENCE

Abstract

In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing way of utilizing LLMs. In this paper, we first disclose an actual predicament for this typical usage: it cannot scale up with training data due to the context length restriction. Besides, existing works have shown that ICL also suffers from various biases and requires delicate calibration treatment. To address both challenges, we advocate a simple and effective solution, kNN Prompting, which first queries the LLM with training data for distributed representations, then predicts test instances by simply referring to their nearest neighbors. We conduct comprehensive experiments to demonstrate its two-fold superiority: 1) Calibration-Free: kNN Prompting does not directly align the LLM output distribution with the task-specific label space; instead, it leverages such distribution to align test and training instances. It significantly outperforms state-of-the-art calibration-based methods under comparable few-shot scenarios. 2) Beyond-Context: kNN Prompting can further scale up effectively with as many training data as are available, continually bringing substantial improvements. The scaling trend holds across 10 orders of magnitude ranging from 2 shots to 1024 shots, as well as different LLM scales ranging from 0.8B to 30B. It successfully bridges data scaling into model scaling, and brings new potentials for the gradient-free paradigm of LLM deployment. Code is publicly available at https://github.com/BenfengXu/KNNPrompting.

1. INTRODUCTION

Large language models (LLMs), when scaled up to billions of parameters, have demonstrated remarkable capabilities in a wide range of NLP tasks (Radford et al., 2019; Brown et al., 2020). However, such models are prohibitively expensive to train on most research- or consumer-level devices, though some of them are already publicly available (Zhang et al., 2022). As a result, it is now an emerging paradigm that LLMs are hosted in a remote data center while accessed by end users or applications via simple API requests (e.g., https://openai.com/api/, https://gpt3demo.com/). The typical usage of an LLM under such a paradigm is In-Context Learning, where the LLM reads and completes a prompt sequence just as it was pretrained to do on massive text corpora. The



Figure 1: kNN Prompting brings substantial improvements over standard ICL, and can continually scale up beyond the context window with as many data as are available. Conducted with GPT XL.
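As a rough illustration of the inference procedure described in the abstract (query the LLM once per training instance for its output distribution, then classify a test instance by its nearest neighbors among those stored distributions), here is a minimal sketch in Python. The function name, the toy three-token distributions, and the use of KL divergence as the distance measure are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def knn_prompting_predict(train_dists, train_labels, test_dist, k=3):
    """Predict a label for a test instance by nearest-neighbor search over
    LLM next-token distributions (hypothetical helper; distance here is
    KL divergence from the test distribution to each stored one)."""
    eps = 1e-12  # avoid log(0)
    # KL(test || train) against every stored training distribution
    dists = [np.sum(test_dist * (np.log(test_dist + eps) - np.log(d + eps)))
             for d in train_dists]
    nearest = np.argsort(dists)[:k]          # indices of the k closest anchors
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)  # majority vote among neighbors

# Toy example: 3-token "vocabulary", two classes of training anchors
train_dists = [np.array([0.8, 0.1, 0.1]),   # class 0 anchors
               np.array([0.7, 0.2, 0.1]),
               np.array([0.1, 0.8, 0.1]),   # class 1 anchors
               np.array([0.2, 0.7, 0.1])]
train_labels = [0, 0, 1, 1]
test_dist = np.array([0.75, 0.15, 0.10])
print(knn_prompting_predict(train_dists, train_labels, test_dist, k=3))  # -> 0
```

Because the prediction only compares distributions with one another rather than mapping the LLM's output onto label words, no per-task calibration of that output is needed, and the number of stored anchors can grow far beyond what fits in a single prompt.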

