MANY-BODY APPROXIMATION FOR NON-NEGATIVE TENSORS

Abstract

We propose a non-negative tensor decomposition that focuses on the relationships between the modes of tensors. Traditional decomposition methods assume low-rankness in the representation, which leads to difficulties in global optimization and target rank selection. To address these problems, we present an alternative way to decompose tensors, a many-body approximation for tensors, based on an information-geometric formulation. A tensor is treated via an energy-based model, in which the tensor and its modes correspond to a probability distribution and random variables, respectively, and the many-body approximation is performed by taking the interactions between variables into account. Our model can be globally optimized in polynomial time in terms of KL divergence minimization, and is empirically faster than low-rank approximations while keeping a comparable reconstruction error. Furthermore, we visualize interactions between modes as tensor networks and reveal a nontrivial relationship between many-body approximation and low-rank approximation.

1. INTRODUCTION

Tensors are generalizations of vectors and matrices. Data in various fields such as neuroscience (Erol & Hunyadi, 2022), bioinformatics (Luo et al., 2017), signal processing (Cichocki et al., 2015), and computer vision (Panagakis et al., 2021) are often stored in the form of tensors, and features are extracted from them. Tensor decomposition and its non-negative version (Shashua & Hazan, 2005) are popular methods that extract features by approximating a tensor by a sum of products of smaller tensors. These smaller tensors are often called factors. Such methods usually minimize the difference between the tensor reconstructed from the obtained factors and the original tensor, called the reconstruction error. Most tensor decomposition approaches assume a low-rank structure, where a given tensor is approximated by a linear combination of a small number of bases. Such a decomposition requires two pieces of information. First, it requires the structure, which specifies the type of decomposition, such as CP decomposition (Hitchcock, 1927) and Tucker decomposition (Tucker, 1966). In recent years, tensor networks (Cichocki et al., 2016) have been introduced, which can intuitively and flexibly describe decomposition structures including tensor train decomposition (Oseledets, 2011), tensor ring decomposition (Zhao et al., 2016), and tensor tree decomposition (Murg et al., 2010). Second, it requires the rank, the number of bases used in the decomposition. Since a larger rank increases the capability of the model while also increasing the computational cost, the user is required to find an appropriate rank within this tradeoff.
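As a concrete illustration of the low-rank structure described above (this sketch is not part of the paper itself), a rank-R CP model represents a third-order tensor as a sum of R outer products of factor columns; the factor matrices A, B, C and the rank R below are illustrative choices:

```python
import numpy as np

# Hypothetical rank-R CP reconstruction of a 3rd-order tensor:
# T ≈ sum_r a_r ∘ b_r ∘ c_r, an outer product for each of the R bases.
rng = np.random.default_rng(2)
I, J, K, R = 4, 5, 6, 3
A = rng.random((I, R))  # factor matrix for mode 1
B = rng.random((J, R))  # factor matrix for mode 2
C = rng.random((K, R))  # factor matrix for mode 3

# Contract the shared rank index r to form the (at most) rank-R tensor.
T = np.einsum('ir,jr,kr->ijk', A, B, C)
```

Choosing R is exactly the rank-selection problem the paper aims to avoid: a larger R fits better but costs more, and the fit is optimized non-convexly.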
Since the above tensor decomposition via minimization of the reconstruction error is non-convex, which causes initial-value dependence (Kolda & Bader, 2009, Chapter 3), the problem of finding an appropriate low-rank structure is highly nontrivial in practice, as it is hard to locate the cause when the decomposition does not perform well. As a result, to find a proper structure and rank, the user often needs to perform the decomposition multiple times with various settings, which is time- and memory-consuming. Instead of the low-rank structure that has been the focus of attention in the past, in this paper we propose a novel formulation of tensor decomposition, called many-body approximation, that focuses on the relationships among the modes of tensors. We determine the structure of the decomposition based on the existence of interactions between modes. The proposed method requires only the decomposition structure, which is naturally determined by the interactions between the modes, and does not require a rank value, which traditional decomposition methods require and which is often difficult to determine. To describe interactions between modes, we follow the standard strategy in statistical mechanics that uses an energy function H(•) to treat interactions and considers the corresponding distribution exp(H(•)). This model is known as an energy-based model in machine learning, and it has been used in Legendre decomposition (Sugiyama et al., 2018; 2016), which decomposes tensors via convex optimization. Technically, it finds factors of a tensor by treating the tensor as a probability distribution and enforcing some of its natural parameters to be zero. We point out that interactions in the energy function H(•) can be represented using natural parameters of the distribution, so we can formulate many-body approximation as a special case of Legendre decomposition by setting some of the natural parameters to zero.
The advantage of this formulation is that many-body approximation can also be achieved by convex optimization that minimizes the Kullback-Leibler (KL) divergence (Kullback & Leibler, 1951). Our approach, describing interactions between modes using energy functions, differs from existing methods that focus on interactions between mode matrices (Vasilescu & Terzopoulos, 2002; Vasilescu, 2011) or block tensors (Vasilescu et al., 2021). Furthermore, we introduce a way of representing tensor interactions that visualizes the presence or absence of interactions between modes. We discuss the correspondence between our representation and tensor networks, and point out that an operation called coarse-grained transformation (Levin & Nave, 2007), in which multiple tensors are viewed as a single new tensor, reveals an unexpected relationship between the proposed method and existing methods such as tensor ring and tensor tree decomposition. We summarize our contributions as follows:
• By focusing on the interactions between modes of tensors, we introduce an alternative rank-free tensor decomposition, many-body approximation. This decomposition is realized by convex optimization.
• We present a way of describing tensor many-body approximation, the interaction representation, a diagram that shows interactions within a tensor. This diagram can be transformed into tensor networks, which reveals the relationship between many-body approximation and existing low-rank approximations.
• We empirically show that many-body approximation is faster than low-rank approximation with competitive reconstruction errors.

2. TENSOR MANY-BODY APPROXIMATION

Our proposal, tensor many-body approximation, is based on the formulation of Legendre decomposition for tensors. We first review Legendre decomposition and its optimization in Section 2.1. We then introduce interactions between modes and their visual representation to prepare for many-body approximation in Section 2.2. Using interactions between modes, we define many-body approximation in Section 2.3. Finally, we transform the interaction representation into a tensor network and point out the connection between many-body approximation and existing low-rank decomposition methods in Section 2.4. In the following discussion, we consider $D$-order non-negative tensors of size $(I_1, \dots, I_D)$. We assume that the sum of all elements in $\mathcal{P}$ is 1 for simplicity; this assumption can be eliminated using the general property of the Kullback-Leibler (KL) divergence, $\lambda D_{\mathrm{KL}}(\mathcal{P}, \mathcal{Q}) = D_{\mathrm{KL}}(\lambda\mathcal{P}, \lambda\mathcal{Q})$, for any positive real number $\lambda$.
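The scaling property of the KL divergence used above can be checked numerically; the following sketch (illustrative, not from the paper) verifies it on a small random tensor:

```python
import numpy as np

def kl(p, q):
    """KL divergence between two non-negative arrays of the same shape."""
    return np.sum(p * np.log(p / q))

rng = np.random.default_rng(1)
P = rng.random((2, 3))
Q = rng.random((2, 3))
P /= P.sum()  # normalize so each sums to 1, as assumed in the text
Q /= Q.sum()

lam = 5.0
# Scaling both arguments by lam scales the divergence by lam:
# lam * D_KL(P, Q) == D_KL(lam * P, lam * Q)
lhs = lam * kl(P, Q)
rhs = kl(lam * P, lam * Q)
```

This holds because the ratio p/q inside the logarithm is unchanged by a common scaling, while the outer factor p picks up exactly one factor of λ.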

2.1. A REMINDER OF LEGENDRE DECOMPOSITION AND ITS OPTIMIZATION

Legendre decomposition is a method to decompose a non-negative tensor by regarding the tensor as a discrete distribution and representing it with a limited number of parameters. We describe a non-negative tensor $\mathcal{P}$ using natural parameters $\theta = (\theta_{1,\dots,1}, \dots, \theta_{I_1,\dots,I_D})$ and its energy function $H$ as
$$\mathcal{P}_{i_1,\dots,i_D} = \exp\left(H_{i_1,\dots,i_D}\right), \qquad H_{i_1,\dots,i_D} = \sum_{i'_1=1}^{i_1} \cdots \sum_{i'_D=1}^{i_D} \theta_{i'_1,\dots,i'_D},$$
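The parametrization above can be sketched in a few lines of NumPy (an illustration under the stated definitions, not the paper's implementation): the energy H is the cumulative sum of the natural parameters θ along every mode, and the tensor is its elementwise exponential, so the result is strictly positive by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
# One natural parameter theta per tensor entry, for a (3, 4, 5) tensor.
theta = rng.normal(size=(3, 4, 5))

# H[i1, ..., iD] = sum of theta[i'1, ..., i'D] over all i'd <= id,
# computed as a cumulative sum along each mode in turn.
H = theta.copy()
for axis in range(H.ndim):
    H = np.cumsum(H, axis=axis)

P = np.exp(H)  # strictly positive tensor parametrized by theta
```

Legendre decomposition (and the many-body approximation built on it) then restricts this model by forcing a chosen subset of the θ entries to zero.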

