RANDOM LAPLACIAN FEATURES FOR LEARNING WITH HYPERBOLIC SPACE

Abstract

Due to its geometric properties, hyperbolic space can support high-fidelity embeddings of tree- and graph-structured data, upon which various hyperbolic networks have been developed. Existing hyperbolic networks encode geometric priors not only for the input, but also at every layer of the network. This approach involves repeatedly mapping to and from hyperbolic space, which makes these networks complicated to implement, computationally expensive to scale, and numerically unstable to train. In this paper, we propose a simpler approach: learn a hyperbolic embedding of the input, map it once to Euclidean space using a mapping that encodes geometric priors by respecting the isometries of hyperbolic space, and finish with a standard Euclidean network. The key insight is to use a random feature mapping via the eigenfunctions of the Laplace operator, which we show can approximate any isometry-invariant kernel on hyperbolic space. Our method can be used together with any graph neural network: even a linear graph model yields significant improvements in both efficiency and performance over hyperbolic baselines on both transductive and inductive tasks.

1. INTRODUCTION

Real-world data contains various structures that resemble non-Euclidean spaces: for example, data with tree- or graph-structure such as citation networks (Sen et al., 2008), social networks (Hoff et al., 2002), biological networks (Rossi & Ahmed, 2015), and natural language (e.g., taxonomies and lexical entailment), where latent hierarchies exist (Nickel & Kiela, 2017). Graph-structured data features in a range of problems, including node classification, link prediction, relation extraction, and text classification. It has been shown both theoretically and empirically (Bowditch, 2006; Nickel & Kiela, 2017; 2018; Chien et al., 2022) that hyperbolic space, the geometry with constant negative curvature, is naturally suited for representing (i.e., embedding) such data and capturing implicit hierarchies, outperforming Euclidean baselines. For example, Sala et al. (2018) show that hyperbolic space can embed trees without loss of information (with arbitrarily low distortion), which cannot be achieved by Euclidean space of any dimension (Chen et al., 2013; Ravasz & Barabási, 2003). Presently, most well-known and well-established deep neural networks are built in Euclidean space. The standard approach is to pass the input to a Euclidean network and hope the model can learn the features and embeddings. But this flat-space approach can encode the wrong prior in tasks for which we know the underlying data has a different geometric structure, such as the hyperbolic-space structure implicit in tree-like graphs. Motivated by this, there is an active line of research on developing ML models in hyperbolic space H^n. Starting from the hyperbolic neural network (HNN) of Ganea et al. (2018), a variety of hyperbolic networks have been developed; these networks adopt hyperbolic geometry at every layer of the model.
Since hyperbolic space is not a vector space, operations such as addition and multiplication are not well-defined; neither are matrix-vector multiplication and convolution, which are key components of a deep model that uses hyperbolic geometry at every layer. A common solution is to treat hyperbolic space as a gyrovector space by equipping it with a non-commutative, non-associative addition and multiplication, allowing hyperbolic points to be processed as features in a neural network's forward pass. However, this complicates the use of hyperbolic geometry in neural networks because it imposes extra structure on hyperbolic space beyond its manifold properties, making the approach somewhat non-geometric. A second problem with using hyperbolic points as intermediate features is that these points can stray far from the origin (just as Euclidean DNNs require high dynamic range (Kalamkar et al., 2019)), especially for deeper networks. This can cause significant numerical issues when the space is represented with ordinary floating-point numbers: the representation error is unbounded and grows exponentially with the distance from the origin. Much careful hyperparameter tuning is required to avoid this "NaN problem" (Sala et al., 2018; Yu & De Sa, 2019; 2021). These issues call for a simpler and more principled way of using hyperbolic geometry in DNNs.

In this paper, we propose such a simple approach for learning with hyperbolic space. The insight is to (1) encode the hyperbolic geometric priors only at the input via an embedding into hyperbolic space, which is then (2) mapped once into Euclidean space by a random feature mapping ϕ : H^n → R^d that (3) respects the geometry of hyperbolic space in that its induced kernel k(x, y) = E[⟨ϕ(x), ϕ(y)⟩] is isometry-invariant, i.e. k(x, y) depends only on the hyperbolic distance between x and y, followed by (4) passing these Euclidean features through some downstream Euclidean network.
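To make step (2) concrete, here is a minimal NumPy sketch of such a random feature map on the Poincaré ball, built from randomized eigenfunctions of the hyperbolic Laplacian (each feature combines a horocycle coordinate with a random boundary direction, frequency, and phase). The function name, the Gaussian sampling of the frequencies, and the scaling are illustrative assumptions for this sketch; the precise construction and sampling distribution are given in Section 4.

```python
import numpy as np

def random_laplacian_features(X, d_out, scale=1.0, rng=None):
    """Map points in the Poincare ball (rows of X, norms < 1) to R^{d_out}
    using random-eigenfunction features (hedged sketch, not the exact HyLa recipe)."""
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    # random ideal points on the boundary sphere of the ball
    W = rng.normal(size=(d_out, n))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    lam = rng.normal(scale=scale, size=d_out)     # random frequencies (assumed Gaussian here)
    b = rng.uniform(0.0, 2 * np.pi, size=d_out)   # random phases
    # horocycle coordinate <x, w> = log((1 - |x|^2) / |x - w|^2), with |w| = 1
    sq = np.sum(X ** 2, axis=1, keepdims=True)
    dist2 = sq - 2.0 * X @ W.T + 1.0              # |x - w|^2 since |w| = 1
    B = np.log((1.0 - sq) / dist2)
    return np.sqrt(2.0 / d_out) * np.exp(((n - 1) / 2.0) * B) * np.cos(lam * B + b)
```

Averaging inner products of these features over the random draws yields an isometry-invariant kernel, since the horocycle coordinate transforms equivariantly under hyperbolic isometries.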
This approach both avoids the numerical issues common in previous approaches (since hyperbolic space is only used once, early in the network, numerical errors do not compound) and eschews the need to augment hyperbolic space with any additional non-geometric structure (since we base the mapping only on geometric distances in hyperbolic space). Our contributions are as follows:

• In Section 4, we propose a random feature map called HyLa which can be sampled to be an unbiased estimator of any isometry-invariant kernel on hyperbolic space. This generalizes the classic method of random Fourier features proposed for Euclidean space by Rahimi et al. (2007).

• In Section 5, we show how to adopt HyLa in an end-to-end graph learning architecture that simultaneously learns the embedding of the initial objects and the Euclidean graph learning model.

• In Section 6, we evaluate our approach empirically. Our HyLa-networks demonstrate better performance, scalability, and computation speed than existing hyperbolic networks: HyLa-networks consistently outperform HGCN, even on a tree dataset, with a 12.3% improvement while being 4.4× faster. Meanwhile, we argue that our method is an important hyperbolic baseline to compare against due to its simple implementation and compatibility with any graph learning model.
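As a hedged illustration of how the extracted Euclidean features can feed even a linear graph model (the simplest downstream network mentioned above), the following sketch propagates node features a few hops over a normalized adjacency matrix and applies one linear layer. The function name and the two-hop linear design are our illustrative assumptions, not the exact architecture of Section 5.

```python
import numpy as np

def linear_graph_model(A, X, W, hops=2):
    """Propagate node features over the graph, then classify linearly.

    A: (N, N) adjacency matrix; X: (N, d) node features (e.g., random
    Laplacian features of learned embeddings); W: (d, c) classifier weights.
    Returns (N, c) per-node logits."""
    A_hat = A + np.eye(A.shape[0])                           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    S = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]    # D^{-1/2}(A+I)D^{-1/2}
    H = X
    for _ in range(hops):
        H = S @ H                                            # neighborhood averaging
    return H @ W                                             # linear read-out
```

In the full pipeline, X would be the feature map applied to learned hyperbolic embeddings, and both the embeddings and W would be trained end-to-end.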

2. RELATED WORK

Hyperbolic space. n-dimensional hyperbolic space H^n is usually defined and used via a model, a representation of H^n within Euclidean space. Common choices include the Poincaré ball (Nickel & Kiela, 2017) and the Lorentz hyperboloid model (Nickel & Kiela, 2018). We develop our approach using the Poincaré ball model, but our methodology is model-independent and can be applied to other models. The Poincaré ball model is the Riemannian manifold (B^n, g_p), with B^n = {x ∈ R^n : ∥x∥ < 1} being the open unit ball, and with Riemannian metric g_p and metric distance d_p given by

g_p(x) = 4(1 − ∥x∥²)⁻² g_e  and  d_p(x, y) = arcosh(1 + 2∥x − y∥² / ((1 − ∥x∥²)(1 − ∥y∥²))),

where g_e is the Euclidean metric. To encode geometric priors into neural networks, many versions of hyperbolic neural networks have been proposed. But while (matrix) addition and multiplication are essential to develop a DNN, hyperbolic space is not a vector space with well-defined addition and multiplication. Several approaches have been proposed in the literature to handle this issue.

Gyrovector space. Many hyperbolic networks, including HNN (Ganea et al., 2018), HNN++ (Shimizu et al., 2020), HVAE (Mathieu et al., 2019), HGAT (Zhang et al., 2021a), and GIL (Zhu et al., 2020), adopt the framework of gyrovector spaces as an algebraic formalism for hyperbolic geometry, by equipping hyperbolic space with a non-associative addition and multiplication: Möbius addition ⊕ and Möbius scalar multiplication ⊗, defined for x, y ∈ B^n and a scalar r ∈ R as

x ⊕ y := ((1 + 2⟨x, y⟩ + ∥y∥²)x + (1 − ∥x∥²)y) / (1 + 2⟨x, y⟩ + ∥x∥²∥y∥²),    r ⊗ x := tanh(r · tanh⁻¹(∥x∥)) · x/∥x∥.

However, Möbius addition and multiplication are complicated, with a high computational cost; higher-level operations such as Möbius matrix-vector multiplication are even more complicated and numerically unstable.
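For reference, the Poincaré distance and Möbius addition above translate directly into code; a minimal NumPy sketch (the function names are ours):

```python
import numpy as np

def poincare_distance(x, y):
    """d_p(x, y) = arcosh(1 + 2|x-y|^2 / ((1-|x|^2)(1-|y|^2))) on the Poincare ball."""
    num = 2.0 * np.sum((x - y) ** 2)
    den = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + num / den)

def mobius_add(x, y):
    """Mobius addition x (+) y on the Poincare ball (gyrovector-space sum)."""
    xy = np.dot(x, y)
    nx2, ny2 = np.sum(x ** 2), np.sum(y ** 2)
    num = (1.0 + 2.0 * xy + ny2) * x + (1.0 - nx2) * y
    den = 1.0 + 2.0 * xy + nx2 * ny2
    return num / den
```

Note that ⊕ is neither commutative nor associative, which is part of what makes gyrovector-based networks cumbersome; the distance d_p, by contrast, is a purely geometric quantity and is all our feature map relies on.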

Following HNN (Ganea et al., 2018), a variety of hyperbolic networks were proposed for different applications, including HNN++ (Shimizu et al., 2020), hyperbolic variational auto-encoders (HVAE; Mathieu et al., 2019), hyperbolic attention networks (HATN; Gulcehre et al., 2018), hyperbolic graph convolutional networks (HGCN; Chami et al., 2019), hyperbolic graph neural networks (HGNN; Liu et al., 2019), and hyperbolic graph attention networks (HGAT; Zhang et al., 2021a). The strong empirical results of HGCN and HGNN, in particular on node classification, link prediction, and molecular- and chemical-property prediction, show the power of hyperbolic geometry for graph learning.

