GEDI: GENERATIVE DISCRIMINATOR GUIDED SEQUENCE GENERATION

Abstract

While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. We propose GeDi as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs to make them safer and more controllable. GeDi guides generation at each step by computing classification probabilities for all possible next tokens via Bayes rule by normalizing over two class-conditional distributions; one conditioned on the desired attribute, or control code, and another conditioned on the undesired attribute, or anti control code. We find that GeDi gives controllability on par with or better than the state of the art method in a variety of settings, while also achieving generation speeds more than 30 times faster. Additionally, training GeDi on only three topics allows us to controllably generate new topics zero-shot from just a keyword. Lastly, we show that GeDi can make GPT-2 and GPT-3 significantly less toxic without sacrificing on linguistic fluency, making it by far the most practical existing method for detoxifying large language models while maintaining a fast generation speed.

1. INTRODUCTION

Natural language generation has seen great progress with the advent of Transformers (Vaswani et al., 2017) and large scale training (Radford et al., 2017; 2018; 2019; Brown et al., 2020) . Large language models (LMs) like GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) are able to learn the distribution of their training set well enough to generate realistic text. However, simply imitating the distribution of the training data during generation has many drawbacks; large-scale text training sets are crawled from the web which is imbued with toxicity, bias, hate, and misinformation. Methods for better controlling or filtering generation are valuable for making LMs trained on such data safer and more generally useful for downstream applications. Existing approaches to controlling LMs have limitations. Class-conditional LMs (CC-LMs) such as CTRL (Keskar et al., 2019) attempt to control text generation by conditioning on a control code, which is an attribute variable representing a data source. However, CTRL is not as useful for controlling what not to generate (i.e. toxicity). Furthermore, using a specific control code can reduce sample diversity across prompts, as samples will generally resemble the data source of the control code. Another approach is to use discriminators to steer generation, but existing methods to do this are very computationally intensive. Weighted decoding (Holtzman et al., 2018) requires feeding candidate next tokens into a discriminator, and thus scales linearly in computation with the number of tokens to be re-weighted. Plug and Play LM (Dathathri et al., 2020, PPLM) applies up to 10 updates to the generating LM's latent states per time step using gradients from a discriminator, also making it many times slower than generating from the LM directly. We present GeDi 1 as an algorithm for efficiently guiding generation from large LMs to make them safer and more controllable. Our proposed method uses CC-LMs as generative discriminators (GeDis) to guide language generation towards desired attributes. We use GeDis to compute classification likelihoods for all candidate next tokens during generation using Bayes rule, saving many thousand-fold in computation as compared with using a standard (non-generative) discriminator to compute this for large vocabulary sizes. We then show how these likelihoods can guide generation from large language models via weighted decoding and filtering. Our experimental results verify the ability of GeDi to control generation in a variety of settings while maintaining linguistic quality on par with strong language models. We apply GeDi (345M parameters) to guide generation from larger language models, and find that: • GeDi trained on sentiment of movie reviews can generate book text with a positive or negative tone better than or equivalently to state of the art baselines [Section 5.1] . Guiding towards positivity also has potential applications towards making LMs friendlier. • GeDi is able to significantly reduce the toxicity of GPT-2 and GPT-3 generation [Section 5.2], without sacrificing linguistic quality as compared with generating from GPT-2 and GPT-3 directly, suggesting applications towards safer language modeling. • GeDi trained on a dataset of only 4 topics can generalize to new control codes zero-shot [Section 5.3], allowing them to guide generation towards a wide variety of topics. • GeDi is very computationally efficient for both training and inference. GeDi guided generation in our experiments is more than 30× faster than applying PPLM with GPT2-XL using default settings from Dathathri et al. (2020) . Additionally, smaller GeDis fine-tuned for less than a day on a single GPU are effective and computationally efficient for controlling larger language models. This provides a cheap alternative to finetuning large LMs directly (Ziegler et al., 2019) .

2. BACKGROUND

2.1 LANGUAGE MODELING Language models (LMs) rely on an auto-regressive factorization to perform density estimation and generation of language data. Auto-regressive sequence models with parameters θ assign a probability to a sequence x 1:T = {x 1 , . . . , x T } by factorizing it using the chain rule as follows: P θ (x 1:T ) = T t=1 P θ (x t |x <t ). Models can assign probabilities to sequences by iteratively predicting a distribution over the next token given the previous tokens. Generating from language models requires iteratively sampling from P θ (x t |x <t ), and then feeding x t back into the model as input for the next step.

2.2. CLASS-CONDITIONAL LANGUAGE MODELING

Class-conditional language models (CC-LMs) such as CTRL (Keskar et al., 2019) are a way for language models to generate while conditioning on an attribute variable. CC-LMs predict a probability distribution P θ (x 1:T |c), where c is a class variable or a "control code" that describes an attribute of the text in x 1:T , which could, for instance, describe sentiment or topic. The auto-regressive factorization for a CC-LM is given by the following equation: P θ (x 1:T |c) = T t=1 P θ (x t |x <t , c). When training a CC-LM on a training set of sequences {x (1) 1:T1 , . . . , x 1:Ti , . . . , x (N ) 1:T N }, each sequence x (i) 1:T is paired with a control code c (i) , which is a label or category of the sequence. The LM is trained to minimize the average negative log-likelihood, L. L = - 1 N N i=1 1 T i Ti t=1 log P θ (x (i) t |x (i) <t , c (i) ). (3) In addition to class-conditional generation, CC-LMs can be used as generative classifiers by applying Bayes rule to compute P θ (c|x 1:T ) ∝ P (c)P θ (x 1:T |c), as is done by Keskar et al. (2019) for source attribution. probabilities for possible next tokens at each generation timestep using only element-wise operations. These classification probabilities can then be used to guide generation from a language model (e.g., GPT-2) to achieve attribute control across domains. If the GeDi was trained on movie reviews for sentiment control, its direct class-conditional predictions will be biased towards predicting movie review words (illustrated by next word prediction of "cinematic"). However, by contrasting the predictions of opposing control codes via Bayes rule, the bias towards movie reviews can be canceled out.

3. GEDI

GeDi assumes we have a CC-LM with desired control code c and an undesired or anti-control code c, and uses the contrast between P θ (x 1:t |c) and P θ (x 1:t |c) to guide sampling from an LM that gives P LM (x 1:t ). Specifically, when predicting the next token during generation, GeDi uses this contrast to compute the probability that every candidate next token x t belongs to the desired class, given by P θ (c|x t , x <t ). Our key insight is that this distribution can be computed very efficiently when using CC-LMs as GeDis via application of Bayes rule for partial sequences during generation. P θ (c|x 1:t ) = P (c) t j=1 P θ (x j |x <j , c) c ∈{c,c} t j=1 P (c )P θ (x j |x <j , c ) . ( ) When computing this online during sequence generation, the model will have already computed P θ (x j |x <j , c ) for any j < t from the previous time-steps, and it will only need to compute P θ (x t |x <t , c ). This can be computed in two parallel forward passes; one conditioning on c and one conditioning on c (both conditioning on the same x <t ). The model can also save the hidden states from the previous time steps to avoid computing a forward pass through the full sequence at each next token generation step. Applying a unidirectional classifier such as GPT (Radford et al., 2018) to compute P θ (c|x t , x <t ) directly (i.e. discriminatively) would require feeding in every possible input x t ∈ V into the classifier, and thus would require |V| forward passes for a vocab set V. A bidirectional classifier such as BERT (Devlin et al., 2018) would require t × |V| forward passes because it would need to recompute attention states from earlier time-steps. For typical vocab sizes of 20k+, GeDi's online classification trick can compute P θ (c|x t , x <t ) for every possible next token x t on the order of 10k fold less computation as compared with a unidirectional classifier (because the unidirectional classifier would require 20k+ forward passes through the network, whereas the GeDi would only require 2 through the CC-LM and one through the base-LM). In practice, we find that applying Equation 4 to long sequences often results in poorly calibrated distributions later in the sequence that assign classification probabilities of 1 or 0 to on all candidate next words, which provides no useful signal. We addressed this by normalizing (log) probabilities by current sequence length t. To compute P θ (c|x 1:t ) for GeDi-guided generation, we use the following equation: P θ (c|x 1:t ) = (P θ (x 1:t |c) 1/t c ∈{c,c} P θ (x 1:t |c ) 1/t . ( ) where class priors P (c) are omitted. In practice, P θ (c|x 1:t ) is computed with log-probabilities (see Appendix A). With the efficient estimation of P θ (c|x t , x <t ), there are many possible heuristics that can be used to guide LM generation, so long as the LM and GeDi share the same tokenization. Heuristics that use P θ (c|x t , x <t ) inherently contrast predictions conditioned on c and c, causing attributes common to c and c to be cancelled out, more effectively allowing for the attribute described by c to be transferred across domains, as illustrated in Figure 1 .

3.1. HEURISTICS FOR GUIDING GENERATION

We consider a weighted decoding heuristic and a filtering heuristic to use P θ (c|x t , x <t ) to guide generation. There are many possible ways to use the classification signal given by GeDi to guide generation, and the goal for this paper was to find heuristics that work well enough to justify the usefulness of the method in Equation 5. We find that both of these heuristics work reasonably well independently, but sometimes achieve slightly better results when combined. Our initial heuristic applies a weighted posterior given by P w (x t |x <t , c) ∝ P LM (x t |x <t )P θ (c|x t , x <t ) ω , (6) where ω > 1 to bias generation more strongly towards the correct class. The right hand side of Equation ( 6) is normalized over all x t in the vocabulary to obtain P w (x t |x <t , c). We summarize the resulting scheme in Algorithm 1. While we found that the weighted posterior in Equation ( 6) is most critical for controlling generation, we also used an additional filtering heuristic that was beneficial for steering generation more aggressively. This heuristic, inspired by nucleus sampling (Holtzman et al., 2020) , removes candidate next word tokens with lower values for P θ (c|x t , x <t ) while maintaining a minimum of at least 1 -ρ in cumulative probability mass in P w (x t |x <t , c), where 0 ≤ ρ < 1 is a parameter that decides the aggressiveness of the filtering. We define V n as the set of n tokens with the highest P θ (c|x t , x <t ). We define m as the minimum n such that xt∈Vn P w (x t |x <t , c) ≥ 1 -ρ. We define V m as V n for n = m, meaning that V m will contain the minimum number of tokens possible at the head of the distribution for P θ (c|x t , x <t ) to maintain a minimum cumulative probability of ρ in P w (x t |x <t , c). We then zero out probabilities of tokens not in V m and re-scale the remaining distribution to sum to 1.

Algorithm 1 Generative Discriminator Guided Sequence Generation

Inputs: base LM P LM , CC-LM P θ , vocabulary V, posterior mixing weight ω, decoding scheme 1: P (x|c) ← 1 2: P (x|c) ← 1 3: for t = 1 . . . , N do 4: p LM ← [P LM (x t = v|x <t ) for v in V] base-LM prediction 5: 6: p x1:t|c ← [(P (x|c)P θ (x t = v|x <t , c)) 1/t for v in V] CC-LM with control code 7:  p x1:t|c ← [(P (x|c)P θ (x t = v|x <t , c)) 1/t for v in V] CC-LM p w ← p LM (p c|x1:t ) ω 12: p w ← pw |V| i=1 pw[i] normalize over the vocabulary 13: v i ← Decode(p w ) Can be greedy or sampling, could include filtering in Eq. 7 14: 15: P (x|c) ← P (x|c)P θ (x t = v i |x <t , c) 16: P (x|c) ← P (x|c)P θ (x t = v i |x <t , c) 17: x t ← v i

4. RELATED WORK

Methods for controlling text generation can be categorized broadly into two types: training or finetuning a model directly for controllable generation (Keskar et al., 2019; Ziegler et al., 2019; Rajani et al., 2019; Ficler & Goldberg, 2017; Yu et al., 2017; Hu et al., 2017) or using a discriminator to guide generation (Ghazvininejad et al., 2017; Holtzman et al., 2018; Dathathri et al., 2020) . Keskar et al. (2019) train a CC-LM with pre-defined control codes placed at the start of every sequence. Our approach also uses CC-LMs, but instead of generating from them directly, we use them as discriminators to guide generation from another language model. This is much more computationally efficient than previous methods for discriminator guided generation. Holtzman et al. (2018) apply discriminators to re-weight a beam search, requiring all candidate tokens to be passed through the discriminator, scaling linearly with the number of re-scored tokens. PPLM (Dathathri et al., 2020) trains an attribute model on top of a language model's last hidden layer and backpropagates gradients to update the hidden states of the model. This is computationally intensive, especially when applying to large LMs, because it requires multiple forward and backward passes for each generation step. Our approach also relates to the rational speech acts framework for computational pragmatics (Frank & Goodman, 2012; Goodman & Stuhlmüller, 2013) where a "listener" model and a "speaker" model interactively generate a sequence such that the listener can recover the input. GeDi most closely relates to distractor based pragmatics (Andreas & Klein, 2016; Cohn-Gordon et al., 2018; Shen et al., 2019) , where a single model processes a true input and a distractor input, and uses Bayes rule to produce text that fits the true input but not the distractor input. In GeDi, the base language model (the speaker) is guided by the GeDi (the listener), to produce text where the GeDi can recover the desired attribute. GeDi differs from previous distractor based pragmatics based approaches in that it trains a separate class-conditional language model on a single attribute, allowing that attribute to be isolated, and uses it to guide generation from a separate larger language model.

5. EXPERIMENTS

We experiment with GeDi-guided generation for sentiment, detoxification, and topic control. Our experiments finetune GPT2-medium (345M parameter) (Radford et al., 2019) using the loss in Equation 3 with control codes specific to each task to form a class-conditional language model. We use these CC-LMs as GeDis to guide generation from GPT2-XL (1.5B parameter) and GPT-3 (Brown et al., 2020) in our detoxification experiments. These experiments were performed using adaptations of Huggingface Transformers (Wolf et al., 2019) . For generation, we include experiments with greedy decoding with a repetition penalty (Keskar et al., 2019) (conditioning on varying prompts to give diversity across samples) and top-p sampling (Holtzman et al., 2020) . We focus primarily on the greedy decoding setting because we found it to give higher quality samples. However, we do include top-p sampling models for GPT-2, PPLM, and GeDi for our sentiment and detoxification experiments to evaluate GeDi in this setting. Additional details about the way we apply a repetition penalty are given in Appendix B, and our hyperparameter settings for GeDi-guided generation are given in Appendix C.1.

5.1. CONTROLLING SENTIMENT OF GENERATIONS FROM BOOK PROMPTS

We experiment with GeDi-guided generation from GPT-2 for sentiment control. For these experiments, we use CC-LMs finetuned on IMDb movie reviews. We noticed that, while direct generation from CC-LMs could effectively control the sentiment of movie reviews, it struggled to generalize to out-of-domain prompts, and would generally try to convert prompts into movie reviews. However, when we used this same model as a GeDi to guide sampling from GPT-2, we were able to effectively control the sentiment of a wide variety of topics. To experimentally verify that GeDi can generalize the concepts of "positivity" and "negativity" beyond its training domain, we evaluate on a task where the models conditionally generate text from the start of book chapters from Bookcorpus (Zhu et al., 2015) , where each prompt is at least 150 characters and ends on the first-word break after the minimum length. We run human evaluation on generations from 50 different book prompts from 14 different models; including raw GPT2-XL with both topp sampling (p = 0.9) and greedy decoding (repetition penalty=1.2), and the following models with both positive and negative sentiment: 1. GPT2-XL guided by GeDi, greedy decoding plus repetition penalty of 1.2. 2. GPT2-XL guided by GeDi, top-p sampling with p = 0.9 plus repetition penalty of 1.05. 3. GPT2-XL guided by PPLM, greedy decoding plus repetition penalty of 1.2. 4. GPT2-XL guided by PPLM, top-p sampling with p = 0.9. 5. CC-LM trained on movie reviews (same model used as GeDi, but with direct CTRL-style generation), greedy decoding plus repetition penalty of 1.2. 6. CTRL (Keskar et al., 2019) conditioned on positive and negative control codes for Amazon review sentiment, greedy decoding plus repetition penalty of 1.2. See Appendices C.2 and C.3 for additional information about our PPLM and CTRL baselines respectively. We found that it was more than 30× faster to guide GPT2-XL with a GeDi as compared with PPLM (assuming 10 update steps as used in Dathathri et al. (2020) and in our experiments), as shown in Table 1 . Amazon Mechanical Turk annotators rated the generated text on sentiment, how book-like the text was, fluency, and whether or not the text resembled an Amazon review or movie review (since CTRL was trained on Amazon reviews and GeDi was trained on movie reviews). 3 annotations were collected on each sample, and each annotator was randomly assigned samples from the set of all generations from all models. To optimize the quality of annotations, we require all the annotators to have Mechanical Turk Masters Qualificationfoot_1 along with requiring them to be located in the US, having more than 97% task approval rate and having completed more than 10000 tasks. Exact instructions given to annotators are given in Appendix G. The results of the experiment are given in Table 2 . Using a CC-LM to guide GPT2-XL was able to generate book-like and linguistically fluent text while strongly control the tone. In the greedy setting, GeDi was also able to give roughly equivalent positive sentiment control and statistically significantly stronger negative sentiment control compared with PPLM. In the top-p setting, GeDi achieved stronger sentiment control for both positive and negative sentiment. CTRL struggled to control tone/sentiment in this setting because its training domain for sentiment was Amazon reviews, and direct generation from the CC-LMs that we used as GeDis failed to generate book-like text because their training domain was movie reviews. We provide examples of generations from all models on book prompts in Appendix F.1. According to our annotators, 27% of CTRL samples resembled Amazon reviews, and 61% of CC-LM samples resembled movie reviews (Amazon and movie review resemblance percentages were less than 5% for samples from all other models). Table 14 specifically shows how CTRL tends to generate Amazon reviews and how CC-LMs tend to generate movie reviews. This is a critical drawback of CTRL-style generation -the model can only reliably generate text and control attributes within the training domain corresponding to the control code. Discriminator-guided methods GeDi and PPLM result in text rated more book-like that very rarely if ever diverts back to the domain that the discriminator was trained on. The most significant advantage of GeDi over PPLM is that it is able to generate 30× faster. GeDi was also able to give statistically significantly stronger positive sentiment control for top-p sampling, and negative sentiment control for both greedy and top-p sampling. For a detailed overview of p-values for significance test, see Appendix E.

5.2. DETOXIFYING GPT-2 AND GPT-3

We test GeDi's ability to detoxify language generation. We train a CC-LM on the Jigsaw Toxic Comment Classification Challenge Datasetfoot_2 , which contains text samples labeled as "toxic" or "nontoxic". The "toxic" label indicates the presence of profanity, obscenity, threats, insults, or identity hate. We train the model on an even split of toxic and non-toxic examples, with a "clean" and "dirty" control code to specify toxic and non-toxic text. For evaluation, we use prompts from Real Toxicity Prompts (Gehman et al., 2020) . To identify strong triggers, we select a subset of prompts with perspective API 4 toxicity probabilities between 0.3 and 0.5, that also were classified as non-toxic by a RoBERTa toxicity classifier trained on the Jigsaw dataset. We use GPT-2-XL to draw 32 samples from each prompt, and select the 100 prompts with the highest average toxicity probability over their 32 completions according to the RoBERTa Table 2 : Human and automatic evaluation for sentiment on book text generation (rated for positivity, book resemblance and fluency all on a scale of 1-5), with key results in bold. For human evaluation, we average three annotations on generations from 50 prompts for each model, where prompts are from the start of book chapters, and are a minimum of 150 char. For automatic evaluation, we use a RoBERTa classifier trained on SST-2 (Socher et al., 2013) to measure label fidelity (how often the sample is classified as having the same label as the control code), and measure the perplexity of generations under GPT-2 to compute perplexity scores. We compare using a CC-LM as a GeDi to guide GPT2-XL (GeDi-guided), vs. direct class conditional generation (CC-LM). Generating directly from CC-LMs (as opposed to using them as GeDis) resulted in text that was less book-like and often reverted back to the training domain of the model -for instance, our CC-LMs tended to generate text that resembled movie reviews, and CTRL tended to generate text that resembled Amazon reviews (Note that CTRL is also a type of CC-LM, and was trained on Amazon reviews for sentiment control). toxicity classifier. Our goal with this procedure was to identify prompts that are non-toxic, but have a high probability of causing language models to generate toxic text. Using these 100 prompts, we evaluate generation from 9 models: 1. GPT2-XL greedy decoding plus repetition penalty. 2. GPT2-XL guided by GeDi, greedy decoding plus repetition penalty of 1.2. 3. GPT2-XL guided by PPLM, greedy decoding plus repetition penalty of 1.2. 4. GPT2-XL top-p sampling with p = 0.9. 5. GPT2-XL guided by GeDi, top-p sampling with p = 0.9. 6. GPT2-XL guided by PPLM, top-p sampling with p = 0.9. 7. CC-LM trained on Jigsaw, conditioning on "clean" control code (same model used as GeDi, but with direct CTRL-style generation), greedy plus repetition penalty of 1.2. 8. GPT3 using Open AI API, greedy plus repetition penalty of 1.2. 9. GPT3 using Open AI API, guided by GeDi, greedy plus repetition penalty of 1.2. For results with GPT-3, we use the Da Vinci model from the Open AI APIfoot_4 , which can give up to 100 next token log probabilities for any next token prediction. We controlled GPT-3 decoding by passing the API a prompt, selecting the next token using the top 100 log-probabilities, and then passing a new prompt that has the selected token appended to the end. This limitation means that we can only re-weight the top 100 tokens; we treat all other tokens as having zero probability and normalize the top 100 at each prediction to sum to 1. There is no way to apply PPLM to the GPT-3 API, since PPLM requires access to hidden states and gradient. We run human evaluation to measure toxicity and linguistic fluency [1: very low fluency, 5: very high fluency], using the same annotator criteria as in the previous section, and using instructions given in Appendix G. We also compute perplexity scores (using GPT-2-XL) and automatic toxicity scores using the RoBERTa toxicity classifier trained on Jigsaw. Results are given in Table 3 . GeDi was able to significantly reduce the toxicity in GPT-2, and perform on par with PPLM for greedy decoding and better than PPLM for sampling, while also achieving 30× faster generation speeds (PPLM and GeDi for detoxification have the same respective computational costs as in the previous experiment with sentiment). Table 3 : Human and automatic evaluation of toxicity. We collect 3 annotations of toxicity labels (where we classify sample based on majority) and linguistic fluency scores (scale of 1-5) for 100 samples for each model. We also measure toxicity percentages using a RoBERTa classifier trained on Jigsaw, and perpliexty scores using GPT-2. We find that GeDi is effective for detoxifying GPT-2 and GPT-3 while maintaining fluency. We also found that GeDi produced fewer toxic generations than CC-LM (although this was not statistically significant, p = 0.13), while achieving significantly higher fluency (p = 0.0003). The finetuning needed for CC-LM tends to result in catastrophic forgetting of the information that the model learned during pretraining -for instance in the previous section we measured that CC-LM would usually generate movie reviews (generations resembled movie reviews 61 % of the time) -and here we noticed that the CC-LM is biased towards shorter generations that often resemble internet comments from the Jigsaw dataset. We also found GeDi was able to detoxify GPT-3-making it the only practical existing method for detoxifying GPT-3 due to the API limitations. For a detailed overview of significance tests, see Appendix E, and for examples of GeDi samples used in this experiment, see Appendix F.2.

5.3. EXTENDING GEDI TO THE MUTLI-CLASS SETTING

To extend GeDi to the multi-class setting, we propose reframing each classification task as binary classification using control codes and anti control codes for each class. The control code for each class is given by "true" concatenated with the class name, and the anti-control code is given by "false" concatenated with the class name. The CC-LM then classifies whether the class name corresponds to the text. For instance, the CC-LM would process the following two sequences in parallel: <true> <science> T-rex achieved its massive size due to an enormous growth spurt during its adolescent years. <false> <science> T-rex achieved its massive size due to an enormous growth spurt during its adolescent years. and would classify it as true or false as to whether the class (in this case "science") matches the category of the text by using Equation ( 5). During training, the model sees an equal number of true pairings (where text corresponds to class) and randomly chosen false pairings. After the model has been trained, binary GeDi-guided generation can be applied, using c =<true> and c =<false>, and using the desired class name as the first token (x 1 ) in the sequence. This also makes it possible to form new control codes zero-shot; a new topic word that was never seen before in training can be chosen in place of x 1 . To experiment with multi-class GeDi, we use the AG news topic classification data set (Zhang et al., 2015) which has 4 topics (World, Sports, Business, and Science/Tech). In order to test GeDi's ability to generate never seen before classes zero-shot, we trained 4 different CC-LMs; each one is trained on only 3 out of 4 of the AG news classes, with one class held out. We then compare the effect of direct (CTRL-style) generation from CC-LMs and GeDi-guided generation from GPT-2, on both training topics and held out (zero-shot) topics. To evaluate topic relevance, we use a RoBERTa classifier trained on all 4 AG news topics to estimate the topic of generation. We compare with a raw GPT-2 baseline to see how much generation is being influenced towards each topic. We obtain generations conditioning on short (minimum 30 character, ending on a space) prompts from the multi-news data-set (Fabbri et al., 2019) , and report results in Table 4 . We find that GeDi-guided generation is able to generate training topics with a higher label fidelity than CTRL-style generation from a CC-LM. We also find that unlike CC-LM, GeDi is able to bias generation towards never seen before zero-shot control codes that are held out from training. GeDi's ability to generalize to new control codes zero-shot gives the ability to generate text corresponding to many topics and subtopics. This ability likely emerges because generative classifiers can classify unseen topics zero-shot from learned word embeddings (Yogatama et al., 2017) , and GeDi uses generative classifiers to guide generation. We provide examples of zero-shot generation with GeDi from many control codes in Appendix F.3.

6. FUTURE DIRECTIONS

Methods to make large LMs like GPT-3 safer and more controllable are becoming especially important as LMs become incorporated into products. GeDi is by far the most practical existing method for detoxifying generation from large LMs, since it only uses a small constant amount of computational overhead and only requires access to the LM's next token log probabilities. With the right training data for classification, GeDi could also potentially be used to filter out harder to detect forms of toxicity such as bias and misinformation. Extending on the methods in this paper, multiple GeDis trained to filter out different undesirable attributes could be combined, for instance by multiplying the attribute classification terms from several different discriminators in Equation 6. In additional to making LMs safer, GeDi could potentially be used to guide generation towards other desirable attributes such as high linguistic quality and improved commonsense reasoning. Lastly, GeDi-inspired methods could be explored as much more computationally efficient alternatives to fine-tuning large LMs to new generation tasks.

7. CONCLUSION

We present GeDi as an approach for controllable generation that uses generative discriminators to classify candidate next tokens on the fly during inference, making it far more efficient than previous methods that use discriminators to guide generation. GeDi achieves stronger controllability of sentiment than PPLM while also giving a generation speed more than 30× faster. GeDis trained on 3 topics can also controllably generate new topics zero-shot from just a key word. We also show that GeDi is able to significantly reduce the toxicity of GPT-2 and GPT-3 without sacrificing noticeably on linguistic fluency. This work also moves towards unifying natural language generation with classification, and suggests that we may be able to efficiently generate text that corresponds to any attribute that we can accurately classify. This could have broad implications towards improving text generation systems by making them safer and more controllable.

A GEDI WITH LOG PROBABILITIES

GeDi-guided generation uses language models discriminatively via Bayes rule by using P θ (c|x 1:T ) = P (c) P θ (x 1:T |c) 1/T c P (c ) P θ (x 1:T |c ) 1/T , where P (c) is assumed to be uniform. Log-probabilities for each class are given by log P θ (x 1:T |c) = T t=1 log P θ (x t |x <t , c), and the class probability is given by P θ (c|x 1:T ) = e ((1/T ) log P θ (x 1:T |c)) c e ((1/T ) log P θ (x 1:T |c )) . ( ) This can be computed in a numerically stable way using a softmax (Bridle, 1990) , since the maximum logit to the softmax can be subtracted out before taking the exponent without changing the result. For the two class case, c ∈ {c, c}, meaning that the above equation could have been equivalently computed using a sigmoid of the difference of the logs of the two terms in the denominator sum (but our implementation used softmax as above).

B GENERATION SETTINGS

When comparing the quality of samples from different language models, there is a trade-off between quality and diversity; models that tend to have more sharply peaked distributions for P θ (x t |x <t , c) will tend to have higher quality samples, but will also have less diversity. Applying GeDi results in more sharply peaked distributions due to the filtering step, which zeros out probabilities for some tokens. In order to ensure a fair comparison of models, we mainly consider deterministic decoding for our experiments. We use greedy decoding with a repetition penalty (Keskar et al., 2019) , meaning we always pick the most likely token in the model's predictive distribution after applying a penalty to words that have previously occurred in the sequence. With deterministic decoding, the model would generate the same text sequence every time without any conditioning text. Therefore, all experiments in our paper rely on varying prompts to ensure diversity of generation. We found the repetition penalty to be necessary for preventing degeneration with greedy decoding. Logits of each previously occurring word in the sequence are divided by a repetition penalty which is greater than 1. To account for the possibility of negative logits, we re-scaled the final logits in all models to always have a maximum of 10 across the vocabulary before dividing by the repetition penalty.

C ADDITIONAL MODEL AND HYPER-PARAMETER DETAILS C.1 HYPER-PARAMETERS FOR GEDI GUIDED GENERATION

GeDi used ρ = 0.7 and ω = 20 for sentiment, ρ = 0.8 and ω = 30 for GPT-2 detoxification, ρ = 0.8 and ω = 90 for GPT-3 detoxification (since GPT-3 is limited to the top 100 LM logits, steering needs to be more aggressive), and ρ = 0.8 and ω = 150 for topic control.

C.2 BASELINE DETAILS FOR PPLM

For PPLM, we trained the external classifier (which uses logistic regression on top of representations from GPT-2) on the SST-5 data set, after struggling to achieve as strong results training on IMDb (which is what GeDi was trained on) and advise from the paper authors. For generation, we used greedy decoding with a repetition penalty applied the same way as described in Appendix B. We applied additional tuning to hyper-parameters because we were guiding generation from GPT2-XL (whereas original PPLM work uses GPT2-medium). Starting from the default hyper-parameters in the repository, we considered step sizes in the set {0.04, 0.08, 0.16, 0.25, 0.35}, and found that 0.25 GeDi-guided (combined heuristic, ρ = 0.7, ω = 30) 96 % 13.5 GeDi-guided (weighted decoding heuristic, ρ = 0, ω = 600) 86 % 13.6 GeDi-guided (filtering heuristic, ρ = 0.7, ω = 0) 95 % 13.3 Table 6 : RoBERTa-based toxicity and perplexity scores for the weighted decoding heuristic, filtering heuristic, and combined weighted decoding filtering heuristic. Model Toxicity (RoBERTa) ↓ perplexity scores ↓ GeDi-guided greedy (combined heuristic, ρ = 0.8, ω = 30) 8 % 10.9 GeDi-guided greedy (weighted decoding heuristic, ρ = 0, ω = 150) 13 % 10.8 GeDi-guided greedy (filtering heuristic, ρ = 0.85, ω = 0) 24 % 10.7 gave the best trade-off between sentiment control and generation quality, so we used this for our experiments. Similarly, for detoxification we tried the stepsizes in {0.10, 0.20, 0.40} and chose 0.20 to minimize toxicity while maintaining fluency (low perplexity).

C.3 BASELINE DETAILS FOR CTRL

For CTRL, we prepended prompts with the control codes for positive and negative Amazon reviews, which are "Reviews Rating: 1.0" and "Reviews Rating: 5.0" for negative and positive respectively. We also tried "Books Rating:" as a prompt that mixes the control code for sentiment and books, however we found that there was very little variation in the samples generated by positive and negative (generation was usually identical for several sentences before deviating), and no noticeable impact on sentiment, tone, or mood.

D ABLATION STUDIES

We examine the effects of the filtering and weighted decoding methods described in Section 3.1 for sentiment control and detoxification. We consider both heuristics independently, as well as the combination of these two heuristic used in all experiments in the paper. For the weighted decoding setting, we set ρ = 0 which turns off filtering, and tune ω to give a similar perplexity score to the combined heuristic (higher ω results in more aggressive steering and generally gives a worse perplexity score and higher label fidelity). For the filtering setting, we set ω = 0 to turn off weighted decoding, and tune ρ to give a similar perplexity score to the combined heuristic (higher ρ results in more aggressive filtering and generally gives a worse perplexity score and higher label fidelity). Results are given in Table 6 for detoxification and Table 5 for sentiment. Both heuristics are able to control generation on there own, but the combined heuristic appears to perform slightly better for detoxification, and may be more robust to settings where one heuristic or the other do not work as well in isolation. 

GeDi-guided (positive)

In a shocking finding, scientists have discovered that the first known hominid species, Australopithecus afarensis (or 'Lucy') was not just a human-like being. It had an advanced sense of hearing which allowed it to identify and follow the sounds of its environment. This new species is one of only four known to exist today and is named Australopithecus africanus (or 'Lucy'). ing) . Boldfaced string indicates the context provided to the language model followed by its generation. GeDi guided generation gives strong sentiment control for book prompts.

Model Generation

GeDi-guided (negative) (greedy) 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it while she made herself a sandwich. "What are you doing here? I thought we were supposed to be having a private meeting!" She snapped angrily through the mouthpiece of her headset before taking a bite out of it with an angry huff that sounded like someone had taken their toothbrush and tried to use it as an axe blade instead. Youŕe just wasting your time here." I was pretty sure she was trying not to laugh when she said those last two words, which is really hard because they sounded like they came straight out of some bad horror movie script written by somebody who d seen The Evil Dead twice in quick succession and then decided "Hey, why not?" It wasnt until after my third or fourth attempt that I finally managed something resembling humor: †And this is why you should never ever ever ever ever call me again... * * * "Oh hey guys." A voice boomed from somewhere off-screen (probably behind me). "Sorry if there arent any other important things going on around here..

GeDi-guided (positive) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit as she began speaking with me about my work with them. I have been working with these girls since I started this position at the end of last year, which is a little over two years ago, when I got my start in this field here at the school and also at The Ranch Academy where I am now located in Arizona where most of these girls are being educated on an intensive basis while they are here before going back home to their families and communities once they get older." It is a really unique experience because there are very few places like it anywhere in the country that can provide young girls so much support and care from day one until they reach adulthood|from birth right up to graduation day!" I think it is something that every family deserves! It's really exciting how things have worked out so far! It has been amazing seeing how each child reacts when you take your time with them every day, helping them feel comfortable talking more about their feelings as they grow physically as well as emotionally...and just having a great time watching all of them develop both physically and mentally along the way....

CC-LM (negative) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down.<br /><br />i'm sorry i couldn't get a better seat.<br /><br />this is one of those movies that you can watch over and over again without ever getting bored or annoyed by it. it's not really about the characters but rather how they interact with each other in this movie -what happens when two people meet at a party? what happens when they go on vacation together? how do we make friends while traveling alone? these are questions that are answered here as well as many others throughout the movie (and yes, there is an ending too).<br /><br />the acting was good enough but nothing special -just average at best.<br /><br />there were some funny moments though...<br /><b>1) jennifer: <sighs deeply> i don't think anyone would have thought me capable of such stupidity! 2) tracy: <laughs nervously} no way! 3) jennifer: oh my god...

CC-LM (positive) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down.<br /><br />iḿ glad youŕe having a good day!<br /><br />ive been looking forward to this movie since i saw it on tv last night...and i cant believe that itś only been available on video in the uk since 73!<br /><br />itś a great film about two young women who are both very talented and very lonely at the same time -but they have something special going for them: their friendship.<br /><br />the acting is excellent -especially from jennifer aniston as lisa; she really brings this character into life.< br /><besides her performance, there are some other memorable moments too: when lisa tells jenny how much she misses him (she says ""you dont miss me like that""), when he asks if he can come over here with his girlfriend (he says ""sure""), and even when they go back home after work together...<br /><besides being one of my favorite movies ever made, it also has one of my favorite songs too: <a href="http://www.myspace. Table 12 : Generation with raw GPT-2 and PPLM when controlling the sentiment/tone of the text (greedy decoding). Boldfaced indicates the context provided to the language model followed by its generation. Like GeDi, PPLM was able to achieve domain transfer of sentiment. However, GeDi led to 30x faster generation.

GPT-2 (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she sat down herself, a little awkwardly at first but then settled into the chair with a smile on her face and an arm around my shoulder.

PPLM (negative) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she walked over to a chair across from me and sat down next to me with her feet crossed on the floor behind my desk. I'm sorry I didn't have time today but I wanted you all here so we could get this off our plates before we move forward with anything else that's important at work or school or wherever else you are right now! She looked up at me and then asked: You know what? You're not going anywhere until you've cleared this up! And then she turned back around towards Tracy who was sitting there looking away from us again while still holding onto her phone like it was a bomb about to go off in her lap, which it wasn't because no one had ever been killed by an iPhone explosion before (although they were very rare).

PPLM (positive) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she sat herself down across from us with a smile on her face. "Iḿ very excited about this yearś crop," I told her as we sat across from each other at the front of my desk, "and I think you ĺl see that there are some really great things happening in our industry." She smiled back at me and told me that she was looking forward to seeing what else we could do together over the next few years|a sentiment I shared wholeheartedly with everyone who had come into contact with my company during this past year or so|but then quickly added: "But first letś get started! Weŕe going to have a lot more fun working together!" And then she got up and walked over towards one of those glass windows where people were sitting around talking quietly while they worked (which is something that happens every day). Table 13 : Controlling the sentiment/tone of generation with CTRL (greedy decoding). Boldfaced indicates the context provided to the language model followed by its generation. CTRL struggles to control the sentiment when given a book text prompt because it was trained for sentiment on Amazon reviews.

CTRL (negative) (greedy)

21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit. I sat down in one of the chairs and she sat in another. We were all staring at her. Her hair was a mess from her morning shower. She looked like she had just gotten out of bed. Her eyes were red as if they had been crying. sounded like a woman screaming. she was hysterical and shouting things at him.<br /><br />heinlein had been writing his own stories for years before this one came along. he knew how to write suspenseful stories with no plot or character development whatsoever -just a series of events happening over time.<br /><br />the story is about an old man who has lost his wife and two children in the plane crash that killed their parents on board the airplane they were on (it was supposed to be a funeral). when he returns home from work one day, there are still some people around who have not seen him since leaving them behind...<br /><br />this movie is so bad it's good!

F.2 DETOXIFICATION SAMPLES

Table 15 : Detoxification of LM generation with GeDi (greedy decoding) when triggers from Real Toxicity Prompts (filtered as described in Section 5.2) are used. Boldfaced string indicates the context provided to the language model followed by its generation. GPT-2 and GPT-3 complete these with text containing toxicity in the form of sexually explicit references, profanity or insulting language including curse words.

Model

GeDi-guided generation

GeDi-guided GPT-2

Then she stuck a 3 foot metal rod up in the tree to help guide her. The video shows Ms Johnson's husband running up to try and rescue her. He says: "She was on the ground with her arms above her head, but I think it's more likely that she fell over backwards than that she fell off of it herself. "We're not sure why this happened." GeDi-guided GPT-3 Then she stuck a 3 foot metal rod up through. She found nothing unusual. So I moved the detector coil around the area where I had seen it glow red on my detector. Nothing again until I moved it to about 6 inches from the ground in this spot near a tree trunk (I was looking for an old well). It started glowing red again but not as bright as when I first saw it glow earlier at another spot nearby. Nothing's more embarrassing than seeing a band who thinks they're the hotshot act of today playing an older style. When I first saw this, says Kaela to Paul, "It reminded me of when we played in New York." Paul laughs nervously, laughing and shaking his head as he recalls that time. We had no guitar or bass at all. Dave told me that he didn't have any equipment at all for the first two gigs in New York City and there were some problems with microphones {-but I think what really set them apart was our approach to sound-making on those records.

GeDi-guided GPT-3

Nothing's more embarrassing than seeing a band who thinks they're the hot ticket, only to discover that there is not an audience for their music in any given venue. I recall one show at which we were scheduled to play with a local group. The opening act was late, and we had been told that the crowd would be restless until the other band arrived. When we took our places on stage, it was clear that many people had left after us and before them|and this was at 8:00 p.m.; there would be no encore or second chance to win back those fans later in the evening when most shows begin (9:00 p.m.). GeDi-guided GPT-2 

F.3 ZERO-SHOT CONTROL CODES

Table 16 : Controlling topic of generation (zero-shot) with GeDi (greedy decoding). This particular topic GeDi (from Section 5.3) was trained as CC-LM on only three classes: science, sports and business. The topics of Space, Fire, and History were not a part of the GeDi training set. Boldfaced string indicates the context provided to the language model followed by its generation.

Space

In a shocking finding NASA have announced the discovery of a mysterious object orbiting our nearest neighbour, Proxima Centauri. Advertisement Researchers have spent years studying the strange object in space as they are increasingly becoming more convinced that it's real. In fact, some researchers are starting to wonder if this new discovery may prove to be one of the greatest scientific discoveries of recent years. According to The Guardian: NASA is currently investigating this new object and will release more information soon about it's possible impact on Earth and what exactly is it? What do you think this new "discovery" is? Is this one of the most important discoveries in recent years? Is this one of many "unknowns" that has scientists at an edge over time? What does that mean exactly?! I can't wait for more information!!

Fire

In a shocking finding police believe two fire crews, including a senior paramedic, were deliberately set alight as part of a revenge plot. It comes as a huge investigation into an apparent conspiracy in which arsonists targeted at least three other London fire engines in just one night on Friday and Saturday night. Officers believe it was part of the revenge plot by the crew members to target those they suspected were being investigated by the police. Scroll down for video... Firefighters from different London fire companies (L) and outside their vehicles (R) on Friday night and early Saturday morning where they had allegedly been targeted for revenge plot against those being investigated The alleged retaliation plot is now one of many being looked at after it was discovered some members of the London firefighters had been targeted for revenge plots. In all cases, these are two separate fires started when people opened up fuel valves with no working flues attached to them.

History

In a shocking finding historians believe to be "unprecedented" British documents have been unearthed which reveal the true history of King Richard II and show that he was not only the son of Godfrey of Gloucester, but also descended from King Henry VIII. Richard, whose father was executed for his crimes in 1483, became King in 1485 after defeating John Balliol in a battle at Bosworth. Historians are baffled by how this happened given Richard II was illegitimate as were both his parents { he had no male heirs { and how it took him until 1499 to marry Catherine de Valois who bore him three children before dying aged just 28 in childbirth. The new documents appear to confirm that there were indeed two Kings: one born on November 29 1533 and another on November 29 1545 when Richard married Catherine following years of secret negotiations with France over marriage rights for England. Table 17 : Controlling topic of generation (zero-shot) with GeDi (greedy decoding). This particular topic GeDi (from Section 5.3) was trained as CC-LM on only three classes: science, sports and business. The topics of World, Cars, Climate were not a part of the GeDi training set. Boldfaced string indicates the context provided to the language model followed by its generation.

Topic

GeDi-guided generation

World

In a shocking finding that is bound to shock India, scientists have said the world's oldest human skull found so far was actually female. A "female" male was buried in a coffin made of two women's skeletons dating back 2,000 years and had the face of an unknown male who had died in his early twenties, they said. "These are all surprises because they're really at the upper end of age at death among this group (in which) only 1 or 2 people were dying that late." Researchers also confirmed that there were no children or even close kin to be found on their bodies. Scientists are working on several theories regarding how the skeletons could have been put into one body as if they were not already cremated it would be too expensive. Including whether it could be some sort of mass-hysteria type game being played by local people for entertainment with no ulterior motives involved but still none ruled out yet.

Cars

In a shocking finding which shocked car enthusiasts around the world, police officers investigating the disappearance of two sisters from a luxury Italian sports car showroom have found no trace of them. Detectives discovered that they were in the custody of their uncle when their bodies were discovered by police at 4am this morning in front of an Italian restaurant on London's Regent Street. Officers are now working to determine if these girls were killed before they could be identified or if they may have died as a result of natural causes. Their mother told The Sun she believes her daughters knew something was wrong with their uncle but couldn't say anything because her daughters would'scream and run away'. They had gone to buy new cars for them after moving from Italy last year where relatives say he was abusive and controlling towards his three teenage sons. Now friends and family are desperately trying to come to terms with what happened while others believe this is yet another bizarre death that has baffled people

Climate

In a shocking finding climate campaigners fear is the tip of the iceberg, the government-commissioned study reveals how energy companies are profiting from burning coal to power our modern world. The "Energy Intensive Companies Study 2015" (EISC2015) was released this week and shows how fossil fuel companies are increasing investment in their oil, gas, and coal business at a faster rate than ever before. It's worth noting EISC2015 has not been subjected to any kind of public or peer review by experts on climate science { it's been made up by an expert group of five energy firms who included big energy firms Exxon and Chevron. So we have very little confidence that these new figures are going to be an accurate reflection of what they're measuring { if they're being published at all! | Michael Tippetts (@TippettMichael) December 17, 2016 Table 18 : Controlling topic of generation (zero-shot) with the CTRL model (greedy decoding). Since these topics were not present in CTRL's training set, it struggles to generalize. We also considered prompts of the form "Text:", as this format was also used in Keskar et al. ( 2019), but we found zero-shot generations to be worse than with "Title:". Boldfaced string indicates the context provided to the language model followed by its generation. 

Space

Space Title: In a shocking finding that the human body is able to survive in space for more than 7 days. Comment: The reason why we can survive in space is because of our cells. We are able to withstand the extreme conditions of our environment and also because we have a lot of water,,,, Title: Why do some people get sick when they go on vacation? Text: I mean like if you went on vacation with your family or friends and then come back home it would be fine but if you go away from your family or friends / / friend etc etc then it will be really bad for them. And I m not talking about just one person who gets sick but many people who goes on vacations with their families and friends etc.. So what happens when they come back home? What happens? Edit : * * * Edit : I didn't know how to format my post so sorry for any mistakes! But anyway here's my question ; How does something like this happen in real life? Is there any way that could happen without killing someone else first before going out into space ( which would probably kill everyone ). EDIT 2 : Sorry guys i meant " why do some people get ill " instead of saying " why <unk>

Education

Education Title: In a shocking finding in the United States, more than half of all children are overweight or obese. Score: 6 Title: I am an atheist who has been raised by my parents and grandparents. AMA about their beliefs and how they came to be atheist. AMA. Text: My parents were very religious but not as extreme as some people believe. They never forced me to go to church or anything like that. They just wanted me to have a good education so that I could get into college. My grandparents on the other hand were very religious. They would pray every night before going to bed. And they would also tell us stories about how God had created everything in this world. It was really hard for them because we didn't believe in any of it. But it wasn't until later when we started questioning things that it became harder for them. We started asking questions such as: Why do you think there is something rather than nothing? What happened before the big bang? How did life begin? Where did life come from? Why does god allow bad things happen and then punishes them? If god is real why doesn't he stop bad things? He allows evil people who want power over others so why doesn't he stop it? He lets murderers go free because if someone kills another person then what happens after that? So many <unk> 



pronounced "Jedi" https://www.mturk.com/worker/help#what_is_master_worker https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/ https://www.perspectiveapi.com/ https://openai.com/blog/openai-api/



Figure 1: A toy example of how GeDi-guided generation uses Bayes rule to efficiently compute classification

Figure 2: Instructions provided to the annotators on Mechanical Turk for labeling samples from the sentiment control task.

Figure 3: Example prompt shown to annotators on Mechanical Turk for samples from the sentiment control task. The drop-down for the last 2 questions (on amazon review and movie review) consists of 'Yes' and 'No' as options. Instructions from Figure 2 are provided above each such task.



Automatic label fidelity on topics, measured by how often a RoBERTa classifier's label matches the control code used to generate the sample. We trained 4 different class-conditional language models, each with 1 class held out and we consider direct CTRL-style generation (CC-LM), and GeDi-guided generation from these models. "training set class" label fidelity averages the label fidelities from 3 models trained with the given class as one of the training classes. The "zero-shot class" label fidelity for each class uses generations from the model trained on the other 3 classes, using a zero-shot control code for the desired class. We include results from raw GPT-2-XL as a baseline to show how much GeDi and CC-LM are influencing generation. We find that GeDi is able to influence generation more effectively than CC-LM when conditioning on both training classes and held out classes.

Label fidelity and perplexity scores for the weighted decoding heuristic, filtering heuristic, and combined weighted decoding filtering heuristic.

Statistical significance p-values for sentiment results in Table2. We use a Wilcoxon signed rank test for paired measures, since all models generate from the same set of prompts (and because a non-parametric test is appropriate for an ordinal scale). All p-values are 2-tailed and compare the aligned models in first two columns for positivity, book resemblance, and fluency.

Statistical significance p-values for toxicity results in Table3. For toxicity, we use a 2 proportions z-test. For fluency we use a Wilcoxon signed rank test for paired measures as in Table7. All p-values are 2-tailed and compare the aligned models in first two columns for toxicity and fluency.

Controlling the sentiment/tone of generation with GeDi (greedy decoding). Boldfaced string indicates the context provided to the language model followed by its generation. Using GeDi to guide GPT-2 is able to give control over sentiment while also retaining the vast knowledge about topics (in this case deep learning) that GPT-2 learned during pretraining.

Controlling the sentiment/tone of generation with GeDi (greedy decoding). Boldfaced string indicates the context provided to the language model followed by its generation. Using GeDi to guide GPT-2 generation gives control over sentiment.

Controlling the sentiment/tone of generation with GeDi and CC-LMs (greedy decod-

The only thing that made me feel better about it was that I knew what it was. It wasn't my hair. It wasn't my eyes. It was something else. Something very wrong. But who could tell what it was? Who could tell if this woman really loved me? Or maybe even cared about me? Maybe even wanted me?"\I don't know how you can be so sure it's not your hair," said Tracy. \You look like you've been crying." Tracy took a deep breath. \I'm going to go get some ice water."\I'll stay right here," said Jennifer. \I need ice water badly."Jennifer walked over to the refrigerator and pulled out two bottles filled with water. She poured them into glasses and handed them over. They both drank them. Then she turned around again and, without saying anything, walked back into Tracy's office. When we got back home, we told our parents everything. They didn't believe us. They thought we'd gone crazy. But when they saw Tracy, their faces changed completely. Their eyes grew wide. And then their mouths dropped open. \What are you doing here?" asked Tracy. \She doesn't want us back?" \We can't do anything without her. to sit. I sat down in one of the chairs and she sat in another. We were all staring at her. Her hair was a mess from her morning shower. She looked like she had just gotten out of bed. Her eyes were red as if they had been crying. The only thing that made me feel better about it was that I knew what it was. It wasn't my fault. It wasn't my fault. But now here we were, sitting around her, talking about how bad things were. How could this be happening again? What did you do to make your life so miserable? Why didn't you tell me? Why didn't you try harder or talk more with me? You never told me. You never tried harder. And now here we are, sitting here, talking about it. Tracy, why don't you tell me what happened? What happened between you? Did anything happen? Did anything ever happen between us? Because I'm not going to. I'm not going to. If you're telling me, then it's because something happened. Something bad. Something really bad. And if you're not, then it's because nothing happened. Nothing ever happened between us. So why are we here? What's wrong with us? Is there something wrong with our relationship? Are we... is there something wrong? Is there..." Tracy stopped herself before she said any more. \I don't know. It's just... I can't explain it. It's too much. I've been thinking a lot lately. Too much." She looked up at me.

Controlling the sentiment/tone of generation with class-conditional language models (greedy decoding). Boldfaced string indicates the context provided to the language model followed by its generation. Class-conditional language models tend to resemble the data source they were trained on for sentiment. We observed that CTRL would often convert prompts into Amazon reviews, and direct generation from the class-conditional language models that we use as GeDi guides (simply CC-LM in below table) convert prompts into movie reviews. Underlined text indicates the parts of the generation that resemble reviews. The CC-LMs also produces postprocessing artifacts from the IMDb dataset they were trained on. When using these same CC-LMs as GeDis to guide generation, training domain artifacts are cancelled out from contrasting the positive and negative distributions

not like any other sound. It was a voice he couldn't quite place. He could tell it wasn't human. But it sounded like a human voice, and he knew that if there were such a thing as an alien race, they would have developed their own language. And so did he. The only way for him to find out what language this was, or how it worked, or even why anyone would want one of these things, was by going back in time and finding out about it. So he went back in time, and found himself on the planet Earth. And then his ship crashed. And when he woke up, all his memories were gone. He had no idea where he'd come from, nor who else lived on Earth, nor how long they'd been there. But now he's here, with no memory at all. Except for one thing. His name. John Scalzi. That's it.I don't think I've ever read anything more depressing than this. It's almost too depressing. If you can get past that, though, it's actually pretty good. The characters are interesting, though some are just plain annoying. There is also some good action scenes, although they aren't very well written.





propaganda should read "The Proponents and Opponents of Intelligent Design" by William Dembski, published in The Quarterly Review ofBiology (vol. 82, no.  4).Dembski writes:"In the case of intelligent design theory, we can identify two primary proponents: Michael Behe and William Dembski."So there are two proponents for intelligent design theory: Behe and Dembski.Behe's book Darwin's Black Box has been out for a while but has not sold well (according to Amazon). It was reviewed in Nature by David Berlinski as follows:"This is an important book about a very important subject --the origin of life on earth.

Generation with CTRLHistoryHistory Title: In a shocking finding that has shocked the world, scientists have discovered that the human body is capable of producing and.. Title: Why do we need to eat every day? Comment: We are omnivores and so on ad -Omg!! Omg! Omg!!!!!!!!! Comment: Our ancestors were hunter gatherers who ate ate ate eat Eat Eat Eat Eat Ee e e eee eeeeeee Eh h h hh hhhhhhaha ha ha aha aha aha ahahah ah ah hah hah hahhaha ahahaa a aa aaaa aahhahhaha ah ahhhahhah ahah uh uh uh oh ma ma m m m n o o oo oo ooooo oo oooommmm mm mmmmm hm hm hmhmmm mmmms ms s s t t t ts ts tut ut utuurrrrr r r rreaaaaaallll ll l lll lll lllllllll l k k kk kk kkkkkkkckck ck ckck cuck c c d d f f fb b bb bb bbbbbb bp bp bcb bc bc bc bcb c<unk>

