CLOSED BOUNDARY LEARNING FOR NLP CLASSIFICATION TASKS WITH THE UNIVERSUM CLASS

Abstract

The Universum class, often known as the other class or the miscellaneous class, is defined as a collection of samples that do not belong to any class of interest. Such a class appears in many classification-based tasks in natural language processing (NLP), such as relation extraction, named entity recognition, and sentiment analysis. During data labeling, a significant number of samples are annotated as Universum, because a dataset always contains samples that do not belong to the preset target classes and are not of interest in the task. The Universum class exhibits very different properties, namely heterogeneity and a lack of representativeness in the training data; however, existing methods often treat the Universum class on an equal footing with the classes of interest. Although the Universum class contains only samples that are not of interest, improper treatment of it leads to the misclassification of samples that are of interest. In this work, we propose a closed boundary learning method that treats the Universum class and the classes of interest differently. We apply closed decision boundaries to the classes of interest and designate the area outside all closed boundaries in the feature space as the space of the Universum class. Specifically, we formulate the closed boundaries as arbitrary shapes, propose a strategy that estimates the probability of the Universum class according to its unique properties rather than the within-class sample distribution, and propose a boundary learning loss that learns decision boundaries based on the balance of misclassified samples inside and outside each boundary. By conforming to the natural properties of the Universum class, our method improves both the accuracy and the robustness of classification models. We evaluate our method on 6 state-of-the-art models across 3 different tasks, and the F1 score/accuracy of all 6 models is improved. Experimental results also indicate that our method significantly enhances the robustness of the model, with the largest absolute F1 score improvement of over 8% on the robustness evaluation dataset. Our code will be released on GitHub.



1 INTRODUCTION

In classification-based tasks of NLP, we quite often encounter a class named the other class, the miscellaneous class, the neutral class, or the outside (O) class. Such a class is a collection of samples that do not belong to any class of interest, such as samples of the no relation class in the relation extraction task. We adopt the terminology of Weston et al. (2006) and designate all such classes as the Universum class (U). The Universum class exists in various classification-based problems in NLP, such as relation extraction (RE) (Zhang et al., 2017), named entity recognition (NER) (Tjong Kim Sang & De Meulder, 2003), sentiment analysis (SA) (Tjong Kim Sang & De Meulder, 2003), and natural language inference (NLI) (Bowman et al., 2015). To distinguish the Universum class from the rest of the classes, we call the classes of interest target classes (T). The set of all classes (A) in the training and testing data can be expressed as A = U ∪ T.

• Universum class: a collection of samples that do not belong to any class of interest.
• Target class: a class of interest in the task, i.e., one of the classes other than the Universum class.

The sample compositions of the Universum class and the target classes are usually very different. Figure 1(a) provides some samples of a target class (entity-destination) and the Universum class (other) in
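The decision rule implied by A = U ∪ T with closed boundaries can be illustrated with a minimal sketch. This is not the paper's implementation: the spherical boundaries (a center and a radius per target class) stand in for the arbitrary-shape boundaries learned by the proposed method, and all class names, centers, and radii below are illustrative assumptions. The key property it shows is that the Universum class has no boundary of its own; it is simply the region outside every target-class boundary.

```python
import numpy as np

def classify(x, centers, radii, universum_label=-1):
    """Assign x to the nearest target class whose closed boundary
    (here, a hypersphere of radius r_k around center c_k) contains it;
    if x lies outside all boundaries, assign the Universum class."""
    dists = np.linalg.norm(centers - x, axis=1)  # distance to each class center
    inside = dists < radii                       # which boundaries contain x
    if not inside.any():
        return universum_label                   # outside all closed boundaries
    candidates = np.where(inside)[0]
    return int(candidates[np.argmin(dists[candidates])])

# Two hypothetical target classes in a 2-D feature space.
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
radii = np.array([1.0, 1.5])

print(classify(np.array([0.2, 0.1]), centers, radii))  # inside class 0 -> 0
print(classify(np.array([3.0, 3.0]), centers, radii))  # outside all -> -1
```

Note that, unlike a softmax over target classes plus one extra Universum logit, this rule never models the Universum's within-class distribution, which matches the heterogeneity argument above.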

