MODIFYING MEMORIES IN TRANSFORMER MODELS

Abstract

Large Transformer models have achieved impressive performance on many natural language tasks. In particular, Transformer-based language models have been shown to encode substantial factual knowledge in their vast number of parameters. While improving the memorization and generalization of Transformers has been widely studied, it is not well understood how to make Transformers forget specific old facts and memorize new ones. In this paper, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring that model performance does not degrade on the unmodified facts. This task is useful in many scenarios, such as updating stale knowledge, protecting privacy, and eliminating unintended biases stored in the models. We benchmark several approaches that provide natural baseline performance on this task. This leads to the discovery of key components of a Transformer model that are especially effective for knowledge modification. The work also provides insights into the roles that different training phases (such as pretraining and fine-tuning) play in memorization and knowledge modification.

1. INTRODUCTION

Large-scale Transformer-based language models (Vaswani et al., 2017; Devlin et al., 2018; Radford et al., 2019; Raffel et al., 2019; Brown et al., 2020) have not only pushed the state of the art on standard natural language processing (NLP) benchmarks such as GLUE and SQuAD, but have also been crucial for improving various real-world systems (see, e.g., Nayak, 2019; Rao et al., 2019). Given that these models are pretrained on large corpora of text such as Wikipedia and BookCorpus (Zhu et al., 2015), it is quite conceivable that they implicitly memorize factual knowledge in their large number of parameters. Recent works (Petroni et al., 2019; Roberts et al., 2020) have verified this hypothesis by evaluating pretrained language models on tasks that probe factual knowledge. This line of work shows that large pretrained Transformer-based language models achieve non-trivial performance on various open-domain question answering (QA) tasks that probe the factual knowledge stored in the model parameters.

The aforementioned memorization capability of Transformers opens up many exciting opportunities. In addition to improving generalization with better language understanding, Transformers may also replace or assist traditional knowledge bases (KBs) that are either manually curated or require a significant amount of supervision (Roth & Yih, 2002; Kambhatla, 2004; Surdeanu & Ji, 2014). Unlike conventional KBs, which explicitly memorize factual knowledge, Transformers memorize knowledge implicitly in their model parameters. As a result, Transformers lack one key advantage of conventional databases: efficiently modifying the factual knowledge stored in the model. In conventional databases such as SQL and NoSQL, which explicitly store knowledge in the form of structured tables, key-value pairs, wide columns, graphs, or documents, updating knowledge is straightforward.
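To make the contrast concrete, a minimal sketch of such an explicit update in a relational store is shown below (the table schema and the example fact are illustrative, not from the paper): deprecating a stale fact and recording the new one is a single statement, whereas no analogous operation exists for facts stored implicitly in a Transformer's parameters.

```python
# Illustrative only: updating a fact in an explicit knowledge store.
# Schema and fact chosen for illustration; they are not part of the paper.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, relation TEXT, object TEXT)")
conn.execute(
    "INSERT INTO facts VALUES "
    "('United Kingdom', 'head_of_government', 'Theresa May')"
)

# A single UPDATE replaces the stale fact with the new one,
# leaving every other stored fact untouched.
conn.execute(
    "UPDATE facts SET object = ? WHERE subject = ? AND relation = ?",
    ("Boris Johnson", "United Kingdom", "head_of_government"),
)

row = conn.execute(
    "SELECT object FROM facts WHERE subject = 'United Kingdom' "
    "AND relation = 'head_of_government'"
).fetchone()
print(row[0])  # → Boris Johnson
```

The key property this sketch highlights is locality: the update touches exactly one stored record, with a guarantee that unrelated facts are unaffected. The task proposed in this paper asks for an analogous guarantee when the "records" are distributed across model parameters.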
Knowledge-augmented Transformers, which leverage factual knowledge bases to improve their feature representations, cannot effectively modify their predictions by updating only the symbolic knowledge, as this conflicts with the implicit memorization in their parameters (Verga et al., 2020). This raises a natural question: can Transformers cope with an ever-changing world, where knowledge is continuously being added, updated, and deprecated? To answer this question, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring that model performance does not degrade on the unaltered facts. This task is useful in many scenarios. For example, the factual knowledge stored by the model can become stale over time, which

