PEER: A COLLABORATIVE LANGUAGE MODEL

Abstract

Textual content is often the output of a collaborative writing process: We start with an initial draft, ask for suggestions, and repeatedly make changes. Agnostic of this process, today's language models are trained to generate only the final result. As a consequence, they lack several abilities crucial for collaborative writing: They are unable to update existing texts, difficult to control and incapable of verbally planning or explaining their actions. To address these shortcomings, we introduce PEER, a collaborative language model that is trained to imitate the entire writing process itself. PEER can write drafts, add suggestions, propose edits and provide explanations for its actions. Crucially, we train multiple instances of PEER able to infill various parts of the writing process, enabling the use of self-training techniques for increasing the quality, amount and diversity of training data. This unlocks PEER's full potential by making it applicable in domains for which no edit histories are available and improving its ability to follow instructions, to write useful comments, and to explain its actions. We show that PEER achieves strong performance across various domains and editing tasks.

1. INTRODUCTION

Large neural networks show impressive text generation capabilities when pretrained with a language modeling objective (Radford et al., 2019; Raffel et al., 2020; Brown et al., 2020; Rae et al., 2021; Zhang et al., 2022; Chowdhery et al., 2022, i.a.). However, the way these models operate, producing outputs in a single pass from left to right, differs strongly from the iterative process by which humans typically write texts. This limits their utility for collaborative writing in various respects; for example, they are not able to retroactively modify or refine their own outputs. Beyond that, they are hard to control (Korbak et al., 2022) and verifying their outputs is challenging as they often hallucinate content (Maynez et al., 2020; Shuster et al., 2021; Nakano et al., 2021) and lack the ability to explain their intentions. All of this makes it very difficult for humans to collaborate with such models for writing coherent, factual texts. To address these shortcomings of existing LMs, we propose PEER (Plan, Edit, Explain, Repeat), a collaborative language model trained on edit histories to cover the entire writing process. As illustrated in Figure 1, PEER operates in several steps that aim to mirror the human writing process: For a given text, either a user or the model itself can plan an action to be applied, for example by means of a natural language instruction. This plan is then realized by an edit, which the model can explain both in the form of a textual comment and by pointing to references used; this is enabled by augmenting each input text with retrieved passages containing potentially relevant background information. We repeat these steps until the text is in a satisfactory state that does not require any further updates.
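The cycle described above can be summarized in a minimal sketch. Note that this is an illustration only: the component names (`model.plan`, `model.edit`, `model.explain`, `retriever`) are hypothetical and do not correspond to an API described in the paper.

```python
def collaborative_writing_loop(text, model, retriever, max_steps=10):
    """Sketch of the Plan-Edit-Explain-Repeat cycle (hypothetical interface).

    The retriever supplies background passages; the model proposes a plan,
    realizes it as an edit, and explains the edit. A human user could
    intervene at any step by supplying their own plan or edit instead.
    """
    for _ in range(max_steps):
        docs = retriever(text)           # retrieve potentially relevant passages
        plan = model.plan(text, docs)    # plan an action (or take one from the user)
        if plan is None:                 # text is satisfactory; no further updates
            break
        text = model.edit(text, plan, docs)         # realize the plan as an edit
        comment = model.explain(text, plan, docs)   # explain the edit in words
    return text
```

In an interactive setting, the `plan` step is exactly where a user-written instruction would be substituted for the model's own proposal.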
Figure 1: Illustration of the steps performed by PEER: First, either the user or the model specifies a plan describing the action they want to be performed (in the example, "Fix incorrect information."); this action is then realized by means of an edit (correcting the founding date of Golden Guinea Breweries, with the explanation "Corrected founding date" and a pointer to reference [1], africanfinancials.com). The model can explain the edit both in natural language and by pointing to relevant sources. We can repeat this process until the generated text requires no further updates.

This iterative approach not only enables the model to decompose the complex task of writing a consistent, factual text into multiple easier subtasks, it also allows humans to intervene at any time and steer the model in the right direction, either by providing it with their own plans and comments or by making edits themselves. Similar to recent approaches for iterative editing (Faltings et al., 2021; Reid & Neubig, 2022), we use Wikipedia as our main source of edits and associated comments, which we use as proxies for plans and explanations. In contrast to this prior work, however, our goal is to obtain a collaborative model that is useful beyond just Wikipedia: It should be capable of following human-written instructions for updating texts in any domain. To achieve this goal, we train PEER not only to perform the writing process illustrated in Figure 1 in sequential order, but also to infill various parts; for example, given an edited text and a set of relevant documents, we teach it to produce the original version of this text before it was edited. This enables us to use self-training techniques (e.g., Yarowsky, 1995; Sennrich et al., 2016; He et al., 2020a; Schick & Schütze, 2021a) for training PEER with synthetic plans, edits, explanations and documents. We show that this substantially improves PEER along several axes, including its ability to edit texts in any domain, to understand human-written instructions, and to explain its actions.
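The infilling idea described above can be sketched as follows. This is a conceptual illustration under assumed interfaces: the `infill_model` methods and the structure of the synthetic examples are hypothetical, not the paper's actual training format.

```python
def synthesize_edit_data(plain_texts, infill_model, retriever):
    """Sketch of generating synthetic edit data via infilling (hypothetical API).

    For a domain with no edit histories, we only observe final texts. An
    infilling model that was trained to reconstruct missing parts of the
    writing process can then imagine the earlier draft and the plan that
    would have produced the observed text, yielding synthetic
    (draft, plan, edited text) training examples.
    """
    examples = []
    for final_text in plain_texts:
        docs = retriever(final_text)
        # Infill the missing parts of the writing process:
        draft = infill_model.previous_version(final_text, docs)
        plan = infill_model.plan_for(draft, final_text, docs)
        examples.append({"input": draft, "plan": plan, "target": final_text})
    return examples
```

The resulting synthetic examples can then be mixed into training in the same format as real Wikipedia edits, which is what allows the approach to extend beyond domains with recorded edit histories.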

2. RELATED WORK

Text Editing. Similar to our work, Faltings et al. (2021) train an editing model to follow plans on Wikipedia data. However, they only consider single-sentence edits, evaluate on Wikipedia data only, and do not explore approaches for improving data quality and coverage. Reid & Neubig (2022) also train models on Wikipedia's edit history, but do not consider plans, explanations or reference documents. Several editing models are trained to solve specific tasks, such as updating information (Logan IV et al., 2021), fixing grammar errors (Napoles et al., 2017; Awasthi et al., 2019) or improving citations (Petroni et al., 2022). Various approaches teach models to iteratively improve texts in an unsupervised fashion (e.g., Shen et al., 2020; Donahue et al., 2020; Li et al., 2022) and explore more efficient ways of representing edits (Mallinson et al., 2020). Concurrent work of Dwivedi-Yu et al. (2022) proposes EDITEVAL, a benchmark for evaluating editing models.

Instruction Tuning and Planning. Explicitly teaching models to follow plans is related to recent work that finetunes models on human-written instructions (Wei et al., 2022a; Sanh et al., 2022; Bach et al., 2022; Ouyang et al., 2022; Wang et al., 2022). The idea of having a separate planning stage has also been explored for other text generation tasks, including summarization (Narayan et al., 2021), data-to-text generation (Moryossef et al., 2019) and story writing (Yao et al., 2019). Our approach of writing text by iteratively performing small updates has some similarity with recent approaches like chain-of-thought prompting (Wei et al., 2022b; Dohan et al., 2022) and document sketching (Wu et al., 2021), which also break down a complex task into multiple smaller steps.

Collaborative Writing. Du et al. (2022a;b) investigate human-machine interactions for iteratively improving documents; however, they focus mostly on syntactic edits that improve the fluency, coherence or style of a document. Lee et al. (2022) investigate using GPT3 (Brown et al., 2020) as a writing assistant for creative and argumentative writing. In their setup, however, the model provides suggestions for continuations without being controllable by means of natural language instructions.

Self-Training. Our approach of using models to infill missing data closely resembles other self-training and bootstrapping approaches used, e.g., in word sense disambiguation (Yarowsky, 1995), machine translation (Sennrich et al., 2016; Hoang et al., 2018), sequence generation (He et al., 2020a), and few-shot learning (Schick & Schütze, 2021a;b). Similar to how we use models to turn plain texts into sequences of edits, Dai et al. (2022) turn documents into dialogue sequences.

