TEXTTN: PROBABILISTIC ENCODING OF LANGUAGE ON TENSOR NETWORK

Abstract

As a novel model that bridges machine learning and quantum theory, the tensor network (TN) has recently gained increasing attention and been successfully applied to processing natural images. However, for natural language, it remains unclear how to design a probabilistic encoding architecture that can efficiently and accurately learn and classify texts based on TNs. This paper proposes a general two-step text classification scheme based on tensor networks, named TextTN. TextTN first encodes the word vectors in a probabilistic space with a generative TN (word-GTN), and then classifies a text sentence with a discriminative TN (sentence-DTN). Moreover, in the sentence-DTN, the hyper-parameter (i.e., the bond dimension) can be analyzed and selected according to the theoretical expressive power of TextTN. In experiments, TextTN achieves the state-of-the-art result on the SST-5 sentiment classification task.

1. INTRODUCTION

Machine learning incorporating quantum mechanics forms a novel interdisciplinary field known as quantum machine learning (Huggins et al., 2019; Ran et al., 2020). The tensor network (TN) has become a prominent model in this field (Biamonte et al., 2017). On the one hand, tensor networks can be used as a mathematical tool to enhance the theoretical understanding of existing neural network methods (Levine et al., 2018; 2019). On the other hand, new machine learning algorithms have been built on tensor networks, e.g., the discriminative TN (DTN) (Stoudenmire & Schwab, 2016) for supervised tasks and the generative TN (GTN) (Han et al., 2018) for unsupervised scenarios. Based on the natural analogy between quantum concepts (e.g., the quantum many-body system (Levine et al., 2018)) and image representations, many studies and applications have been conducted for processing and learning natural images (Stoudenmire & Schwab, 2016; Sun et al., 2020; Liu et al., 2019). However, for natural language, it remains unclear how to design an efficient and effective TN approach that can accurately learn and classify texts. In the field of natural language processing (NLP), researchers have recognized the analogy between the quantum many-body wave function and word interactions (via the tensor product) in a sentence, and developed a quantum-inspired language representation (Zhang et al., 2018). Based on quantum many-body physics and tensor decomposition techniques, Zhang et al. (2018) provided a mathematical understanding of existing convolutional neural network (CNN) based text classification methods. Similarly, a tensor space language model (TSLM) has been built on the tensor network formulation (Zhang et al., 2019); this work shows that TSLM is a more general language model than n-gram and recurrent neural network (RNN) based language models.
In its implementation, however, TSLM did not provide a tensor network algorithm. The challenge lies in the high dimensionality of each word vector, which is much higher than that of a pixel representation in the image setting. After taking the tensor product of a number of word vectors, the resulting high-order tensors become computationally intractable. More recently, a tensor network algorithm, namely the uniform matrix product state (u-MPS) model, has been proposed for probabilistic modeling of text sequences (Miller et al., 2020). u-MPS was evaluated on a context-free language task using a synthetic dataset. However, u-MPS has not been applied to a real-world NLP task, e.g., a typical language modeling or text classification task. In addition, the expressive power of u-MPS has not been investigated. The expressive power of tensor
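The intractability of the full tensor product can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only; the dimensions d, n, and r are assumed values for the example, not those used by TextTN or u-MPS. It contrasts the entry count of the full order-n tensor with the parameter count of a matrix product state (MPS) factorization:

```python
# Illustrative sketch (assumed dimensions, not the paper's settings):
# the tensor product of n word vectors of dimension d yields an
# order-n tensor with d**n entries, which is intractable to store.
d = 50   # word-vector dimension (assumed)
n = 8    # number of words in the sentence (assumed)
full_entries = d ** n
print(full_entries)  # 39062500000000 entries for the full tensor

# An MPS with bond dimension r factorizes the same order-n tensor
# into n cores of size at most r * d * r, so the parameter count
# grows linearly in n instead of exponentially.
r = 20   # bond dimension (the hyper-parameter discussed above)
mps_entries = n * r * d * r
print(mps_entries)   # 160000 parameters for the MPS factorization
```

This linear-versus-exponential gap is why MPS-style factorizations such as u-MPS, and tensor networks generally, are the natural route to making such models computable.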

