TEXTSETTR: LABEL-FREE TEXT STYLE EXTRACTION AND TUNABLE TARGETED RESTYLING

Abstract

We present a novel approach to the problem of text style transfer. Unlike previous approaches that use parallel or non-parallel labeled data, our technique removes the need for labels entirely, relying instead on the implicit connection in style between adjacent sentences in unlabeled text. We show that T5 (Raffel et al., 2020), a strong pretrained text-to-text model, can be adapted to extract a style vector from arbitrary text and use this vector to condition the decoder to perform style transfer. As the resulting learned style vector space encodes many facets of textual style, we recast transfers as "targeted restyling" vector operations that adjust specific attributes of the input text while preserving others. When trained over unlabeled Amazon reviews data, our resulting TextSETTR model is competitive on sentiment transfer, even when given only four exemplars of each class. Furthermore, we demonstrate that a single model trained on unlabeled Common Crawl data is capable of transferring along multiple dimensions including dialect, emotiveness, formality, politeness, and sentiment.

1. INTRODUCTION

There has been a recent surge of interest in text style transfer, with the aim of training models able to modify specific attributes of input text (e.g., sentiment or formality) while preserving the remaining content. For example, a sentiment transfer model might transform the input "best book ever!" into "worst book ever!", while a formality transfer model might change the same input into "This is the best book I have ever read." Work in this area falls into three categories. Supervised approaches like Jhamtani et al. (2017) transfer between pre-selected styles, and rely on aligned parallel training data to teach the model the desired input/output correspondence; this method is limited by the availability of parallel corpora. So-called "unsupervised" approaches like Li et al. (2018) and Lample et al. (2019) remove the requirement for parallel data, but still require labeled training examples of each style, and are limited to transfer between a pre-specified set of styles. Label-free approaches like the recent Xu et al. (2020) remove the need for any training labels. While the most technically challenging, this setting offers the potential for transferring between arbitrary styles at inference time and has significant value, as curated datasets are not available for many style attributes.

In this work, we explore the hypothesis that large pretrained text-to-text models like T5 (Raffel et al., 2020) already contain a strong representation of textual style, which can be extracted and used to condition the decoder of a style transfer model through a relatively lightweight fine-tuning procedure. To isolate style information in the absence of labels, we rely on the observation that style is a "slow-moving" feature, which tends to be consistent over large spans of text. Specifically, given two adjacent sentences from an unlabeled corpus, we train our model to extract a "style vector" from the first and use that vector to perform denoising and other reconstruction tasks on the second. This technique extends the unsupervised approach of Lample et al. (2019) to the label-free setting, and allows us to reformulate the style transfer operation as a directional operation in style vector space using the difference between target and source style vectors; we call this "targeted restyling". When combined with a novel "tunable inference" technique for controlling token add/delete rates, this gives our final model: Text Style Extraction and Tunable Targeted Restyling (TextSETTR).
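The label-free training setup, in which a style vector extracted from one sentence conditions the reconstruction of its corrupted neighbor, can be sketched as follows. This is an illustrative simplification, not the paper's released code: the token-dropout corruption is a simple stand-in for the paper's noising procedure, and all function and field names here are our own.

```python
import random

def make_training_example(sent_a, sent_b, drop_rate=0.2):
    """Build one label-free training example from two adjacent sentences.

    The style vector is extracted from the context sentence (sent_a), and
    the model must reconstruct sent_b from a corrupted copy of itself,
    conditioned on that style vector. Adjacency is the only supervision:
    no style labels are required.
    """
    tokens = sent_b.split()
    # Corrupt the reconstruction target by randomly dropping tokens
    # (a minimal stand-in for denoising-style corruption).
    kept = [t for t in tokens if random.random() > drop_rate]
    corrupted = " ".join(kept) if kept else tokens[0]
    return {
        "style_context": sent_a,       # input to the style extractor
        "corrupted_input": corrupted,  # input to the reconstruction encoder
        "target": sent_b,              # decoder target
    }
```

Because adjacent sentences tend to share style, the model can only use the extracted vector for slow-moving stylistic information, not for the content of the sentence being reconstructed.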


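The "targeted restyling" operation, a directional move in style-vector space, reduces to simple vector arithmetic over a handful of exemplars. A minimal sketch, assuming style vectors have already been extracted as NumPy arrays; the `lam` scaling knob and all names are our illustrative assumptions, not an interface from the paper:

```python
import numpy as np

def targeted_restyle_vector(input_style, src_exemplar_styles,
                            tgt_exemplar_styles, lam=1.0):
    """Compute a conditioning vector for targeted restyling.

    We add the difference between the mean target-exemplar style and the
    mean source-exemplar style to the input's own style vector, shifting
    only the attribute that differs between the two exemplar sets while
    preserving the input's other stylistic facets. `lam` scales the
    strength of the shift.
    """
    direction = (np.mean(tgt_exemplar_styles, axis=0)
                 - np.mean(src_exemplar_styles, axis=0))
    return input_style + lam * direction
```

For example, with a few negative-review exemplars as the source set and a few positive-review exemplars as the target set, the resulting vector conditions the decoder to flip sentiment while leaving dimensions shared by both sets (e.g., formality) largely untouched.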