CONTRASTIVE NOVELTY LEARNING: ANTICIPATING OUTLIERS WITH LARGE LANGUAGE MODELS

Abstract

In many task settings, text classification models are likely to encounter examples from novel classes on which they cannot predict correctly. Selective prediction, in which models abstain on low-confidence examples, provides a possible solution, but existing models are often overly confident on OOD examples. To remedy this overconfidence, we introduce Contrastive Novelty Learning (CNL), a two-step method that generates OOD examples representative of novel classes, then trains to decrease confidence on them. First, we generate OOD examples by prompting a large language model twice: we prompt it to enumerate novel classes relevant to the label set, then to generate examples from each novel class matching the task format. Second, we train our classifier with a novel contrastive objective that encourages lower confidence on generated OOD examples than on training examples. When trained with CNL, classifiers improve their ability to detect and abstain on OOD examples, outperforming prior methods by an average of 2.3% AUAC and 5.5% AUROC across 4 NLP datasets, with no cost to in-distribution accuracy.¹

1. INTRODUCTION

Recent progress in NLP has led to text classification models that are accurate not only in-distribution, but also on some out-of-domain data (Arora et al., 2021). Nonetheless, some categories of real-world distribution shift still pose serious challenges. For instance, in open-set label shift, the test data includes examples from novel classes not present in the training data, making it impossible for a standard classifier to predict correctly (Scheirer et al., 2013). Moreover, novel class examples can be difficult to detect with conventional OOD detection methods, as they typically bear a strong surface resemblance to training examples (Țifrea et al., 2021). In this paper, we frame open-set label shift as a selective prediction problem (El-Yaniv & Wiener, 2010; Geifman & El-Yaniv, 2017) that we call open-set selective classification (OSSC). OSSC requires text classifiers to predict correctly on closed-set examples while abstaining on novel class examples.

To perform well on OSSC, a classifier must have lower confidence on novel class examples than on closed-set examples, which requires learning features that differentiate novel classes from closed-set classes (Perera et al., 2020). In order to supervise this representation learning, it is useful to identify what examples from novel classes might look like. Prior work has explored automatically generating OOD images by adding random perturbations to ID examples (Setlur et al., 2022). Text inputs, however, are composed of discrete tokens, and modifying even a single token can unpredictably alter the meaning of a sentence. We seek an automatic generation method that addresses these limitations, leveraging the generative ability of large language models (LLMs) like GPT-3 (Brown et al., 2020). LLMs are a desirable source of novelty, as their generation is informed by a broad corpus of examples seen during pretraining, allowing them to reliably generate from classes outside a dataset.

We present Contrastive Novelty Learning (CNL), a method to improve the OSSC ability of a classifier by automatically generating OOD examples and then training to abstain on them. First, we generate OOD examples with Novelty Prompting: we prompt an LLM to propose novel classes relevant to the closed-set label set, then to generate examples from each novel class, to form a large set of probable novel examples. Finally, we propose a contrastive confidence loss (CCL) for training, which encourages both high accuracy on the ID training set and lower relative confidence on the generated novel examples. We show that CCL outperforms stricter losses like Outlier Exposure (Hendrycks et al., 2019), which can adversely affect ID performance. Our full pipeline is shown in Figure 1.

Our method can be viewed as a form of "partial" knowledge distillation: we leverage an LLM "teacher model" to improve novelty detection performance without altering the student model's strong ID classification ability. We evaluate CNL against state-of-the-art baselines across 14 splits of 4 datasets (AGNews (Zhang et al., 2015), TREC-10 (Li & Roth, 2002), TACRED (Zhang et al., 2017), Emotion (Saravia et al., 2018)), finding that it improves both OOD detection and OSSC, by an average of 5.5% AUROC and 2.3% AUAC over the best prior method. These improvements come at no cost to ID accuracy, demonstrating that it is possible to distill novelty detection alone without affecting predictive power. Finally, we analyze the settings in which CNL can improve OSSC performance. In the data dimension, scale is often optional: with as few as 1000 generated examples, we find an improvement over vanilla training on all 4 datasets. The same is only partially true for LLM size, as on some datasets only a sufficiently large model can generate useful examples.

Figure 1: Contrastive Novelty Learning pipeline. We Novelty Prompt a generator model to produce a novel set D_OOD, then train with a contrastive confidence loss (CCL) on our original training set D_ID and D_OOD, ensuring that our classifier is less confident on generated novel examples than on closed-set examples. Finally, we abstain based on model logits.

¹ Code and data have been uploaded and will be released.

2.1. OPEN-SET SELECTIVE CLASSIFICATION

In standard classification, an optimal model f should predict the ground-truth label y of an input example x from a closed set of known labels Y_ID. However, under a more realistic open-set setting, some test examples are drawn from unknown novel classes Y_OOD. Without a priori knowledge of Y_OOD, a standard discriminative classifier will never correctly classify a novel example. Instead, an optimal open-set selective classifier f should predict y when y ∈ Y_ID, and abstain otherwise.

For a probabilistic model p_θ(y | x) and an associated confidence metric, the corresponding prediction is given by f(x) = (ŷ, c), where ŷ = arg max_{y ∈ Y_ID} p_θ(y | x) and c denotes the model's confidence. When used as a selective classifier with threshold γ, f predicts ŷ when c > γ and abstains otherwise (Geifman & El-Yaniv, 2017). This differs from OOD detection (Hendrycks & Gimpel, 2017) in that f must abstain both on novel examples and on its own errors, and must attain high ID accuracy.

2.2. EVALUATION PROTOCOL

We holistically measure selective classification performance with the area under the accuracy-coverage curve (AUAC). The accuracy-coverage curve plots accuracy as a function of the fraction of examples on which the model predicts (i.e., coverage) as the confidence threshold γ varies. For accuracy computation, we treat predictions on all novel class examples as incorrect. AUAC thus measures a model's combined ability in ID classification accuracy, ID calibration, and OOD detection.
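As a concrete illustration, the selective prediction rule and the AUAC metric described above can be sketched in a few lines of numpy. The function names `selective_predict` and `auac` are illustrative, and the max softmax probability stands in for the confidence metric (one common choice; the paper may use a different metric):

```python
import numpy as np

def selective_predict(probs, gamma):
    """Return (predictions, confidences); abstentions are marked with -1.

    probs: (n, k) array of softmax probabilities over the k ID classes.
    Confidence here is the max softmax probability.
    """
    conf = probs.max(axis=1)
    preds = np.where(conf > gamma, probs.argmax(axis=1), -1)
    return preds, conf

def auac(conf, correct):
    """Area under the accuracy-coverage curve.

    correct: 1 if the model's argmax prediction is right, 0 otherwise;
    predictions on novel-class examples must be marked 0 beforehand.
    """
    order = np.argsort(-conf)                      # most confident first
    hits = np.asarray(correct, dtype=float)[order]
    n = len(hits)
    coverage = np.arange(1, n + 1) / n             # fraction predicted
    accuracy = np.cumsum(hits) / np.arange(1, n + 1)
    # trapezoidal integration of accuracy over coverage
    return float(np.sum((accuracy[1:] + accuracy[:-1]) / 2 * np.diff(coverage)))
```

A model that ranks its correct closed-set predictions above its errors and above novel-class examples yields higher AUAC, since accuracy stays high as coverage grows.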

To generate a diverse set of OOD examples that anticipate different potential test-time shifts, we introduce Novelty Prompting, a method that augments a source dataset with novel class examples generated by an LLM. We first perform label generation, prompting our LLM to extend the closed-set labels with novel labels. We then prompt the LLM to generate new examples conditioned on each novel label, forming a large set of probable novel examples.
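As a rough sketch, the two prompting stages can be expressed as simple prompt builders. The label-generation wording mirrors the AGNews example shown in Figure 1; the example-generation template (category/article demonstrations) is a hypothetical format, not necessarily the paper's exact prompt:

```python
def label_generation_prompt(id_labels):
    """Stage 1: ask the LLM to extend the closed-set label list.

    The LLM's completion (e.g. "crime, travel, auto, ...") supplies
    candidate novel class names.
    """
    return "Generate a diverse list of news categories: " + ", ".join(id_labels) + ","

def example_generation_prompt(novel_label, demonstrations):
    """Stage 2: generate an example conditioned on a novel label.

    `demonstrations` are (label, text) pairs from the ID training set,
    shown so the completion matches the task format. Hypothetical template.
    """
    blocks = [f"Category: {label}\nArticle: {text}" for label, text in demonstrations]
    blocks.append(f"Category: {novel_label}\nArticle:")
    return "\n\n".join(blocks)
```

Sampling many completions per novel label from these prompts would produce the generated OOD set D_OOD used for training.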

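The exact form of the contrastive confidence loss is not given in this excerpt; as a purely hypothetical sketch of the stated idea (high ID accuracy, plus lower relative confidence on generated OOD examples than on ID examples), one could combine cross-entropy with a margin penalty on max softmax probabilities:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ccl_sketch(id_logits, id_labels, ood_logits, margin=0.1):
    """Illustrative loss value: cross-entropy on ID examples plus a hinge
    penalty if average OOD confidence is not below average ID confidence
    by at least `margin`. A hypothetical sketch, not the paper's exact loss.
    """
    p_id = softmax(id_logits)
    ce = -np.mean(np.log(p_id[np.arange(len(id_labels)), id_labels]))
    conf_id = p_id.max(axis=-1).mean()
    conf_ood = softmax(ood_logits).max(axis=-1).mean()
    return ce + max(0.0, conf_ood - conf_id + margin)
```

A real implementation would compute this on a differentiable framework's tensors so the margin term propagates gradients; the numpy version above only illustrates how contrastive confidence differs from a strict uniform-confidence penalty like Outlier Exposure.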
