AGING WITH GRACE: LIFELONG MODEL EDITING WITH DISCRETE KEY-VALUE ADAPTORS

Abstract

Large pre-trained models often err during deployment as input distributions shift, user requirements change, or crucial knowledge gaps are discovered. Recently, model editors have been proposed to modify a model's behavior by adjusting its weights during deployment. However, when editing the same model multiple times, these approaches quickly degrade the model's performance on upstream data and forget how to fix previous errors. We propose and study a novel Lifelong Model Editing setting, where streaming errors are identified for a deployed model and we update the model to correct its predictions without influencing unrelated inputs. We introduce General Retrieval Adaptors for Continual Editing, or GRACE, which learns to cache a chosen layer's activations in a codebook as edits stream in, while the original model weights remain frozen. GRACE successfully edits pre-trained models thousands of times in a row using only streaming errors, while minimally influencing unrelated inputs. Experimentally, we show that GRACE substantially improves over recent model editors while generalizing to unseen inputs.

1. INTRODUCTION

Modern machine learning systems perform extremely well on challenging, real-world tasks. Many successes stem from large models trained on massive amounts of data, achieving state-of-the-art performance on challenging tasks in natural language processing (Rae et al., 2021; Brown et al., 2020) and computer vision (Ramesh et al., 2022; Dosovitskiy et al., 2020). However, despite high performance, large models still make critical mistakes during deployment (Sinitsin et al., 2019; Balachandran et al., 2022; Wang et al., 2021). Further, when models are deployed over long periods of time without updates, their error rates increase as data distributions shift, as labels shift, or as ground-truth information about the world simply changes (Lazaridou et al., 2021). For example, a language model trained in 2016 that correctly identifies Barack Obama as president of the United States would be incorrect after 2017, as illustrated in Figure 1. Furthermore, multiple errors often occur sequentially, requiring many fixes to the same model over time.

To handle this case, we introduce Lifelong Model Editing to continuously correct large models' mistakes over long sequences of edits. By editing a model, we avoid costly retraining while maintaining its performance on unrelated inputs (Mitchell et al., 2022a). One approach to lifelong editing is to directly finetune a model on errors as they arrive. However, finetuning on errors is prone to overfitting, even with customized regularization (Lin et al., 2022). Further, finetuned models quickly forget original training data, devaluing upstream pretraining (Lee et al., 2020). Worse still, finetuned models can easily forget previously-fixed errors, counteracting the objective of editing in the first place (Sinitsin et al., 2019). Another candidate for lifelong editing is standard model editing, which outperforms finetuning by updating model behavior with minimal influence on unrelated inputs (Mitchell et al., 2022a).
While a promising direction, these methods require large amounts of data in order to make edits. For example, some recent works use training sets filled with pre-collected errors to train hypernetworks (von Oswald et al., 2020) that edit a model's behavior by predicting new weights (Mitchell et al., 2022a) or offload predictions to a new classifier (Mitchell et al., 2022b). However, large pools of representative training edits are rarely available before deployment. Alternatively, regularized finetuning approaches (Meng et al., 2022a; Sinitsin et al., 2019; De Cao et al., 2021) rely on sets of semantically-equivalent inputs to preserve upstream model performance. Additionally, prior model editors (Mitchell et al., 2022a; Sinitsin et al., 2019; De Cao et al., 2021; Mitchell et al., 2022b) fail to make multiple edits sequentially. Consequently, they fail as lifelong editors, as we demonstrate.

Figure 1: GRACE edits pretrained models during deployment by maintaining layer-specific codebooks that cache learned activations for selected layers. When inputs similar to previous edits arrive, the corresponding activations are passed to the next layer to correct the model's predictions. GRACE continuously edits a deployed model using only streaming errors, without accessing upstream data.

In this paper, we introduce Lifelong Model Editing. Assume we have a model $f_0$ that was pretrained on a set of upstream instances $\mathcal{U}$. During deployment, we observe a stream of errors $\{(X^e_t, y^e_t)\}_{t=1}^{T}$ such that $f_{t-1}(X^e_t) \neq y^e_t \;\forall t$. Here, $f_{t-1}$ is the model being used for inference at step $t$, and $y^e_t$ is the correct label for input $X^e_t$. At each step $t$, we receive an input-label pair $(X^e_t, y^e_t)$ for an edit, and our aim is to produce an edited model $f_t$ that meets three criteria:

1. $f_t(X^e_t) = y^e_t$: the error at step $t$ is corrected.
2. $f_t(X^e_i) = y^e_i$ for $i \leq t$: the model remembers the corrections for previous errors.
3. $f_t(X) = y \;\forall (X, y) \in \mathcal{U}$: the model's upstream test performance is maintained.
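The three criteria above can be made concrete with a minimal sketch of the lifelong-editing protocol. All names here (`lifelong_edit`, `edit_fn`, `edit_retention`) are illustrative, not from the paper's implementation; `edit_fn` stands in for any editor that maps the current model and one error pair to an edited model.

```python
# Hypothetical sketch of the Lifelong Model Editing loop.
# `model` is any callable; `edit_fn(model, x, y)` returns the edited model f_t.

def lifelong_edit(model, error_stream, edit_fn):
    """Apply streaming edits; return the final model and the edits seen."""
    seen_edits = []
    for x_e, y_e in error_stream:
        if model(x_e) != y_e:          # criterion 1: only edit on actual errors
            model = edit_fn(model, x_e, y_e)
        seen_edits.append((x_e, y_e))
    return model, seen_edits

def edit_retention(model, seen_edits):
    """Criterion 2: fraction of past edits the model still gets right."""
    correct = sum(model(x) == y for x, y in seen_edits)
    return correct / max(len(seen_edits), 1)
```

Criterion 3 (upstream performance) would be measured the same way, by evaluating the edited model on held-out upstream data rather than on past edits.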
To address this challenging setting, we propose General Retrieval Adaptors for Continual Editing, or GRACE. GRACE edits individual layers of a frozen, pretrained model $f_0$, treating it as an encoder. GRACE introduces a codebook memory to a chosen layer, using encodings from the previous layer as queries to find the nearest key, which is associated with a value. Each value can replace the model's predicted activation and be decoded by subsequent layers, ultimately leading to a prediction. Additionally, GRACE learns one $\epsilon$-ball per key. During inference, if an encoding does not land within any key's $\epsilon$-ball, the frozen pretrained model is used directly, avoiding interference with unrelated inputs. By adding and updating key-value pairs and their $\epsilon$ values as edits stream in, GRACE fixes model mistakes without degrading performance on upstream training data. For example, in Figure 1, GRACE corrects $f$'s predictions about the United States President without impacting its knowledge of India's Prime Minister. During training, GRACE adapts to changing distributions of edits by shrinking and expanding each $\epsilon$-ball, yielding coarse- or fine-grained influence over the edited layer's representation space. Further, we can initialize $\epsilon$ to manage the trade-off between remembering upstream data and successfully fixing long sequences of errors.

Our contributions in this work are:

1. We cast model editing in a new and realistic lifelong streaming setting. To our knowledge, this setting is unstudied, yet it is crucial to successfully deploying large language models.
2. We present GRACE, a novel key-value model editor which learns to cache and retrieve activations for selected layers using only errors observed during deployment. GRACE applies directly to any existing transformer-based model, with straightforward alterations for other architectures.
3. Our experiments show that GRACE is a state-of-the-art model editor, outperforming alternatives like MEND (Mitchell et al., 2022a) on domain-free question answering and a classification task with shifting label distributions. We also find that GRACE successfully generalizes edits to previously-unseen inputs.
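The retrieval mechanism described above can be sketched in a few lines. This is a simplified, hypothetical rendering, not the paper's implementation: the class and method names are invented, values here are set directly rather than learned by gradient descent, and the $\epsilon$ radii are fixed at initialization rather than adapted as edits stream in.

```python
import math

class GraceLayer:
    """Sketch of a GRACE-style codebook wrapped around one frozen layer."""

    def __init__(self, frozen_layer, init_eps=1.0):
        self.frozen_layer = frozen_layer   # original pretrained layer, never updated
        self.keys, self.values, self.eps = [], [], []
        self.init_eps = init_eps

    @staticmethod
    def _dist(a, b):
        # Euclidean distance between two activation vectors (lists of floats)
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def forward(self, h):
        """h: activation from the previous layer, used as the query."""
        for k, v, e in zip(self.keys, self.values, self.eps):
            if self._dist(h, k) <= e:      # query lands in a key's epsilon-ball
                return v                   # retrieve the cached value
        return self.frozen_layer(h)        # otherwise defer to the frozen model

    def edit(self, h, target_value):
        """Cache a key-value pair for the activation of an erroneous input."""
        self.keys.append(list(h))
        self.values.append(target_value)
        self.eps.append(self.init_eps)
```

Because every query outside all $\epsilon$-balls falls through to the frozen layer, unrelated inputs are untouched by construction; the codebook only fires for inputs whose activations resemble a previous edit.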




