A HIERARCHICAL HYPER-RECTANGLE MASS MODEL FOR FINE-GRAINED ENTITY TYPING

Anonymous

Abstract

Fine-grained entity typing is the task of detecting the types of entity mentions in a given text. Entity typing models typically embed entities as vectors in high-dimensional Euclidean space or hyperbolic space, or incorporate additional context information. However, such spaces and feature transformations are ill-suited to modeling the inter-dependencies of types across diverse scenarios. We study a hierarchical hyper-rectangle mass model (hRMM), which represents mentions and types as hyper-rectangle masses (hRMs) and thereby casts the relationships of an ontology in a geometric mass view. Natural language contexts are fed into an encoder and then projected to hyper-rectangle mass embeddings (hRMEs). We find that hRMs capture the features of mentions and types well. With the further addition of a hypervolume indicator and adaptive thresholds, performance improves again. Experiments show that our approach performs strongly on several entity typing benchmarks and attains state-of-the-art results on two of them.

1. INTRODUCTION

Entity typing is the task of assigning types to named entities in text. It has been shown to be useful in tasks such as entity linking (Dai et al., 2019), knowledge base learning (Hao et al., 2019), and sentence classification, and in recent years it has become a major focus of NLP research. Many classification systems and hierarchical models have been proposed and have achieved promising results on entity typing tasks. Geometric embedding and multi-class classification learning are the two main techniques of recent years. Rather than representing objects with vectors, geometric representation models are thought to be better suited to expressing relationships in a domain. Geometric embeddings such as box embeddings (Onoe et al., 2021) use a box space to represent mentions and types (Vilnis et al., 2018). Box embeddings are an interesting view that has been employed for knowledge graph reasoning (Ren et al., 2020), knowledge graph completion (Abboud et al., 2020), and joint hierarchical representation (Patel et al., 2020). However, they have a few drawbacks: (1) the box representation is too simple to capture latent features of mentions and types; (2) box embedding treats entity typing as a multi-class or multi-label classification task, which means it cannot learn hierarchical knowledge. Apart from box embeddings, approaches that learn hierarchical knowledge in Euclidean space (Chen et al., 2020; Yogatama et al., 2015) or in hyperbolic space (López & Strube, 2020) remain insufficient for representing entities that appear in extremely diverse scenarios. In addition, these methods cannot illustrate the hierarchical relationships between mention, supertype, and subtype.

To overcome the aforementioned drawbacks, we build an entity typing model named hRMM for the entity typing task. As illustrated in Figure 2, our model builds a hierarchical hRM architecture that represents entity types and mentions as hRMEs (Figure 1).
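To make the geometric view concrete, the core operation behind box-style embeddings is computing intersection volumes of hyper-rectangles. The sketch below is our own illustration under assumed conventions (min/max corner representation, a volume-ratio containment score); it is not the paper's actual formulation, which additionally attaches density and size parameters to each rectangle.

```python
# Toy sketch: hyper-rectangles as (min-corner, max-corner) pairs, with
# containment measured by intersection volume. Function names and the
# scoring rule are illustrative assumptions, not the hRMM implementation.

def volume(lo, hi):
    """Volume of a hyper-rectangle; zero if any side is degenerate."""
    v = 1.0
    for l, h in zip(lo, hi):
        v *= max(h - l, 0.0)
    return v

def intersection(lo_a, hi_a, lo_b, hi_b):
    """Corner-wise intersection of two hyper-rectangles."""
    lo = [max(la, lb) for la, lb in zip(lo_a, lo_b)]
    hi = [min(ha, hb) for ha, hb in zip(hi_a, hi_b)]
    return lo, hi

def containment_score(lo_m, hi_m, lo_t, hi_t):
    """Score P(type | mention) as vol(mention ∩ type) / vol(mention)."""
    lo_i, hi_i = intersection(lo_m, hi_m, lo_t, hi_t)
    vm = volume(lo_m, hi_m)
    return volume(lo_i, hi_i) / vm if vm > 0 else 0.0

# A broad "person" box fully containing a smaller "artist" box:
person = ([0.0, 0.0], [4.0, 4.0])
artist = ([1.0, 1.0], [2.0, 2.0])
print(containment_score(*artist, *person))  # artist lies inside person -> 1.0
```

In this view, a subtype box nested inside a supertype box directly encodes the hierarchical relation that vector embeddings struggle to express.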
Compared with the baseline box embedding model (Onoe et al., 2021), we add density and size-scale parameters that determine the mass and size of types in the geometric view. These parameters allow the model to capture more latent features of the language context. Further, we develop a hypervolume indicator and adaptive thresholds, which bring additional improvements. The hypervolume indicator minimizes a combination of losses to achieve better results, and the adaptive threshold, a greedy thresholding method, outperforms a fixed threshold. Without any hand-crafted features or data preprocessing, experiments on benchmark datasets including FIGER, UFET, and OntoNotes demonstrate that our approach outperforms baseline models and prior work. The results also indicate that our approach captures the latent hierarchical structure and language features in entity typing tasks. We will publish all source code and datasets of this work on GitHub for further exploration.

Figure 1: hRMM architecture for predicting the types of the entity mention "London" in a given sentence

2. RELATED WORKS

Many efforts have been invested in entity typing over the years. In addition to the related work discussed in the previous section, a few works focus on correcting noisy labels to improve metrics. Because large-scale annotation data is built with weak and distant supervision, it may contain abundant noise, which severely hinders entity typing performance. Co-teaching (Han et al., 2018) simultaneously trains two collaborating networks that filter out potentially noisy labels according to their loss. DivideMix (Li et al., 2020) leverages a one-dimensional, two-component Gaussian Mixture Model (GMM) to model the loss distributions of clean and noisy labels. SELF (Nguyen et al., 2019) selects clean labels according to the agreement between the annotated labels and the network's predictions.
However, we found that the loss does not form a bimodal distribution in entity typing tasks, and thus it is hard to distinguish clean from noisy labels.
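The bimodality assumption behind such loss-based label cleaning can be sketched as follows. This is a toy EM fit of a two-component 1-D GMM on synthetic losses, our own illustration rather than DivideMix's actual implementation; when the per-example losses are bimodal the two recovered means separate cleanly, which is precisely what fails to happen on entity typing losses.

```python
# Toy sketch: fit a two-component 1-D Gaussian mixture to per-example
# losses via EM, as DivideMix-style methods do to split "clean" (low-loss)
# from "noisy" (high-loss) examples. Illustrative, not the paper's code.
import math
import random

def fit_gmm_1d(xs, iters=50):
    """EM for a two-component 1-D GMM; returns (means, stds, weights)."""
    mu = [min(xs), max(xs)]          # anchor components at the extremes
    sigma = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in xs:
            p = [pi[k] / (sigma[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                 for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate means, stds, and mixture weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)
            pi[k] = nk / len(xs)
    return mu, sigma, pi

# Synthetic bimodal losses: a clean cluster near 0.1, a noisy one near 2.0
random.seed(0)
losses = [random.gauss(0.1, 0.05) for _ in range(200)] + \
         [random.gauss(2.0, 0.3) for _ in range(200)]
mu, sigma, pi = fit_gmm_1d(losses)
print(sorted(mu))  # the two component means recover the two loss modes
```

On a unimodal loss distribution, by contrast, the two fitted components overlap heavily and the clean/noisy split becomes arbitrary, which motivates looking beyond loss-based filtering for entity typing.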



Figure 2: Illustration of hRM generation and intersection computations. (Object colors represent density values, mapped linearly to the RGB color model (Ibraheem et al., 2012).)

