LATENT TOPOLOGY INDUCTION FOR UNDERSTANDING CONTEXTUALIZED REPRESENTATIONS

Abstract

Recently, there has been considerable interest in understanding pretrained language models. This work studies the hidden geometry of the representation space of language models from a unique topological perspective. We hypothesize that there exists a network of latent anchor states summarizing the topology (neighbors and connectivity) of the representation space. We infer this latent network in a fully unsupervised way using a structured variational autoencoder. We show that such a network exists in pretrained representations, but not in baseline random or positional embeddings. We connect the discovered topological structure to its linguistic interpretations: in this latent network, leaf nodes can be grounded to word surface forms, anchor states can be grounded to linguistic categories, and connections between nodes and states can be grounded to phrase constructions and syntactic templates. We further show how this network evolves as the embeddings become more contextualized, with observational and statistical evidence demonstrating how contextualization helps words "receive meaning" from their topological neighbors via the anchor states. We demonstrate these insights with extensive experiments and visualizations.

1. INTRODUCTION

Recently, there has been considerable interest in analyzing pretrained language models (PLMs) (Rogers et al., 2020; Hewitt & Manning, 2019; Hewitt & Liang, 2019; Chen et al., 2021; Chi et al., 2020; Liu et al., 2019) owing to their huge success. This work investigates the topological properties, i.e., the neighbors and connections of embeddings, of contextualized representations. Informally, we ask what the "shape" of the representation manifold "looks like", and what it means from a linguistic perspective. Formally, we hypothesize that there exists a spectrum of latent anchor embeddings that serve as local topological centers within the manifold. As a quick first impression, Fig. 1 shows the latent states that we will discover in the following sections. Since such structure cannot be straightforwardly observed, we use unsupervised methods to infer the topology as latent variables. Our unique topological perspective, combined with unsupervised latent variable induction techniques, offers a systematically different methodology from mainstream probing work. Most existing approaches define a supervised linear classifier as the probe (Hewitt & Manning, 2019; Hewitt & Liang, 2019; Hewitt et al., 2021; Liu et al., 2019), targeting pre-defined properties using pre-annotated data. Such a priori approaches make maximal pre-assumptions; consequently, it is hard to make new discoveries other than those already assumed from the very beginning. Our work takes an a posteriori approach, which makes minimal pre-assumptions without using any annotation for supervision. Consequently, we obtain results that are systematically different from (yet complementary to) those in the supervised probing literature.
For example, while claims made by supervised probing are strictly aspect-specific (e.g., how specific properties like syntax can be extracted from other properties), our discoveries are more holistic and integrated (e.g., in Fig. 1, we visualize all local topological centers as latent states, ground their meaning to lexical, syntactic, and semantic interpretations, and show how these properties are mixed with each other). We use a structured variational autoencoder (VAE) (Kingma & Welling, 2013) to infer the latent topology, as VAEs are common and intuitive models for learning latent variables. We focus on the manifold in which contextualized embeddings lie (e.g., the last-layer outputs of a fixed, not fine-tuned, BERT; Devlin et al., 2019). We hypothesize there exists a wide spectrum of static latent states within

