AGING WITH GRACE: LIFELONG MODEL EDITING WITH DISCRETE KEY-VALUE ADAPTORS

Abstract

Large pre-trained models often err during deployment as input distributions shift, user requirements change, or crucial knowledge gaps are discovered. Recently, model editors have been proposed to modify a model's behavior by adjusting its weights during deployment. However, when editing the same model multiple times, these approaches quickly degrade the model's performance on upstream data and forget how to fix previous errors. We propose and study a novel Lifelong Model Editing setting, where streaming errors are identified for a deployed model and we update the model to correct its predictions without influencing unrelated inputs. We introduce General Retrieval Adaptors for Continual Editing, or GRACE, which caches a chosen layer's activations in a codebook as edits stream in, while the original model weights remain frozen. GRACE can edit pre-trained models thousands of times in a row using only streaming errors, while minimally influencing unrelated inputs. Experimentally, we show that GRACE substantially improves over recent model editors while generalizing to unseen inputs.
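To make the codebook idea concrete, the following is a minimal sketch of a discrete key-value adaptor wrapped around a frozen layer. The class name, the fixed deferral radius `epsilon`, and the nearest-key retrieval policy are illustrative assumptions for exposition, not the paper's exact algorithm:

```python
import numpy as np

class GraceAdaptor:
    """Illustrative sketch of a GRACE-style discrete key-value adaptor.

    The wrapped layer's weights stay frozen; edits are cached as
    (key, value) codebook entries. A key is the layer's input activation
    for an edited example; the value is a replacement activation that
    corrects the model's output. The deferral-radius policy below is an
    assumption for this sketch, not the paper's exact procedure.
    """

    def __init__(self, frozen_layer, epsilon=1.0):
        self.frozen_layer = frozen_layer  # original layer, never updated
        self.epsilon = epsilon            # deferral radius per codebook entry
        self.keys, self.values, self.radii = [], [], []

    def add_edit(self, key, value):
        """Cache a new edit: activations near `key` will map to `value`."""
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(np.asarray(value, dtype=float))
        self.radii.append(self.epsilon)

    def __call__(self, h):
        """Return a cached value if `h` falls inside the nearest entry's
        radius; otherwise defer to the frozen layer, leaving unrelated
        inputs untouched."""
        h = np.asarray(h, dtype=float)
        if self.keys:
            dists = [np.linalg.norm(h - k) for k in self.keys]
            i = int(np.argmin(dists))
            if dists[i] <= self.radii[i]:
                return self.values[i]
        return self.frozen_layer(h)
```

For example, after caching one edit, activations close to the cached key retrieve the corrected value, while distant (unrelated) activations pass through the frozen layer unchanged; this locality is what keeps edits from interfering with upstream behavior.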

1. INTRODUCTION

Modern machine learning systems perform extremely well on challenging, real-world tasks. Many successes stem from large models trained on massive amounts of data, achieving state-of-the-art performance on challenging tasks in natural language processing (Rae et al., 2021; Brown et al., 2020) and computer vision (Ramesh et al., 2022; Dosovitskiy et al., 2020). However, despite high performance, large models still make critical mistakes during deployment (Sinitsin et al., 2019; Balachandran et al., 2022; Wang et al., 2021). Further, when models are deployed over long periods of time without updates, their error rates increase as data distributions shift, as labels shift, or as ground-truth information about the world simply changes (Lazaridou et al., 2021). For example, a language model trained in 2016 that correctly identifies Barack Obama as president of the United States would be incorrect after 2017, as illustrated in Figure 1. Furthermore, multiple errors often occur sequentially, requiring many fixes to the same model over time.

To handle this case, we introduce Lifelong Model Editing to continuously correct large models' mistakes over long sequences of edits. By editing a model, we avoid costly retraining while maintaining its performance on unrelated inputs (Mitchell et al., 2022a).

One approach to lifelong editing is to directly finetune a model on errors as they arrive. However, finetuning on errors is prone to overfitting, even with customized regularization (Lin et al., 2022). Further, finetuned models quickly forget their original training data, devaluing upstream pretraining (Lee et al., 2020). Worse still, finetuned models can easily forget previously-fixed errors, counteracting the objective of editing in the first place (Sinitsin et al., 2019). Another candidate for lifelong editing is standard model editing, which outperforms finetuning by updating model behavior with minimal influence on unrelated inputs (Mitchell et al., 2022a).
While promising, these methods require large amounts of data in order to make edits. For example, some recent works use training sets filled with pre-collected errors to train hypernetworks (von Oswald et al., 2020) that edit a model's behavior by predicting new weights (Mitchell et al., 2022a) or offload predictions to a new classifier (Mitchell et al., 2022b). However, large pools of representative training edits are rarely available before deployment. Alternatively, regularized finetuning approaches (Meng et al., 2022a; Sinitsin et al., 2019; De Cao et al., 2021) rely on sets of semantically-equivalent inputs to preserve upstream model performance. Additionally, prior model

