RAINPROOF: AN UMBRELLA TO SHIELD TEXT GENERATORS FROM OUT-OF-DISTRIBUTION DATA

Abstract

As more and more conversational and translation systems are deployed in production, it is essential to develop effective control mechanisms that guarantee their proper functioning and security. An essential component of safe system behavior is out-of-distribution (OOD) detection, which aims to detect whether an input sample is statistically far from the training distribution. Although OOD detection is a widely covered topic in classification tasks, it has received much less attention in text generation. This paper addresses the problem of OOD detection for machine translation and dialog generation from an operational perspective. Our contributions include: (i) RAINPROOF, a Relative informAItioN Projection OOD detection framework; and (ii) a more operational evaluation setting for OOD detection. Surprisingly, we find that OOD detection is not necessarily aligned with task-specific measures: an OOD detector may filter out samples that are well processed by the model while keeping samples that are not, leading to weaker performance. Our results show that RAINPROOF breaks this curse, achieving good OOD detection results while improving task performance.

1. INTRODUCTION

Significant progress has been made in Natural Language Generation (NLG) in recent years with the development of powerful generic (e.g., GPT (Radford et al., 2018; 2019; Brown et al., 2020)) and task-specific (e.g., Grover (Zellers et al., 2019), Pegasus (Zhang et al., 2020) and DialoGPT (Zhang et al., 2019)) text generators. Text generators power machine translation systems and chatbots that are by definition exposed to the public and whose reliability is therefore a prerequisite for adoption. Text generators are trained in a so-called closed world (Antonucci et al., 2021; Fei & Liu, 2016), where training and test data are assumed to be drawn i.i.d. from a single distribution, known as the in-distribution. However, when deployed, these models operate in an open world (Parmar et al., 2021; Zhou, 2022) where the i.i.d. assumption is often violated. This change in data distribution is detrimental and induces a drop in performance, as illustrated in Tab. 3 and Tab. 4. Thus, to ensure their trustworthiness and adoption, it is necessary to develop tools that protect text generators from harmful distribution shifts. For example, a trained translation model cannot be expected to be reliable when presented with another language (e.g., a Spanish model exposed to Catalan, or a Dutch model exposed to Afrikaans) or with unexpected technical language (e.g., a colloquial translation model exposed to rare technical terms from the medical field). Most existing research aiming to protect models from Out-Of-Distribution (OOD) data focuses on classification. Despite its importance, (conditional) text generation has received much less attention, even though it is among the most exposed applications. Existing solutions fall into two categories. The first, called training-aware methods (Zhu et al., 2022; Vernekar et al., 2019a; b), modifies classifier training by exposing the neural network to OOD samples during training.
The second, called plug-in methods, aims at distinguishing regular in-distribution (IN) samples from OOD samples based on the behavior of the model on a new input. Plug-in methods include the Maximum Softmax Probability (MSP) (Hendrycks & Gimpel, 2016), Energy (Lee et al., 2018a), and feature-based anomaly detectors that compute a per-class anomaly score (Ming et al., 2022; Ryu et al., 2017; Huang et al., 2020; Ren et al., 2021a). Although plug-in methods seem attractive, their adaptation to text generation may not be straightforward: the sheer number of words in the vocabulary prevents them from being used directly within the classification framework.
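To illustrate the kind of plug-in scores discussed above, the following minimal sketch computes an MSP score and an energy score from a vector of classifier logits. The function names and toy logits are our own illustrative choices, not part of the paper; under the usual conventions, higher values of both scores indicate inputs that look more in-distribution.

```python
import math

def msp_score(logits):
    """Maximum Softmax Probability: the largest class probability
    after applying softmax to the logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return max(e / total for e in exps)

def energy_score(logits, temperature=1.0):
    """Negative free energy: T * logsumexp(logits / T).
    Confident (peaked) logits yield a higher score."""
    return temperature * math.log(
        sum(math.exp(z / temperature) for z in logits)
    )

confident = [4.0, 0.5, -1.0]  # peaked prediction: looks in-distribution
flat = [0.2, 0.1, 0.0]        # near-uniform prediction: looks suspicious
```

For a classifier with a handful of labels these scores are cheap to compute, but for a text generator the "classes" are the entire vocabulary at every decoding step, which is precisely why such scores do not transfer directly.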

