SRBGCN: TANGENT SPACE-FREE LORENTZ TRANSFORMATIONS FOR GRAPH FEATURE LEARNING

Abstract

Hyperbolic graph convolutional networks have been successfully applied to represent complex graph data structures. However, optimization on Riemannian manifolds is nontrivial, so most existing hyperbolic networks build their operations on the tangent space of the manifold, a local Euclidean approximation. This distorts the learnt features, limits the representation capacity of the network, and makes the network hard to optimize. In this work, we introduce a fully hyperbolic graph convolutional network (GCN), referred to as SRBGCN, which performs neural computations such as feature transformation and aggregation directly on the manifold, using manifold-preserving Lorentz transformations that comprise spatial rotation (SR) and boost (B) operations. Experiments on static graph datasets for node classification and link prediction tasks validate the performance of the proposed method.

1. INTRODUCTION

Graph convolutional networks (GCNs) were proposed to exploit the graph topology and model the spatial relationships between graph nodes, thereby generalizing the convolution operation to graph data (Kipf & Welling, 2017; Defferrard et al., 2016). The initial models were built in Euclidean space (Hamilton et al., 2017; Zhang et al., 2018; Velickovic et al., 2019), which is not the natural space for embedding graph data and produces distorted feature representations (Nickel & Kiela, 2018; Chami et al., 2019). Hyperbolic spaces are better suited to representing graph data: their volume grows exponentially with the radius, which matches tree-like structures whose size also grows exponentially with depth, whereas Euclidean volume grows only polynomially. Motivated by this, recent works built GCNs in hyperbolic space to take advantage of the properties of hyperbolic geometry (Chami et al., 2019; Liu et al., 2019). These hyperbolic graph convolutional networks (HGCNs) achieved better performance than their Euclidean counterparts, which shows the effectiveness of hyperbolic space for modelling hierarchical data structures and graph data. However, these works perform the network operations in the tangent space of the manifold, a local Euclidean approximation of the manifold at a point. Euclidean operations such as feature transformation and feature aggregation are not manifold-preserving and cannot be applied directly on the manifold, which is why these methods resort to the tangent space. Using a tangent space, however, may limit the representation capacity of hyperbolic networks due to distortion, especially since most of these works use the tangent space at the origin.

In this work, we propose fully manifold-preserving Lorentz feature transformations, composed of boost and spatial rotation operations, and use them to build SRBGCN entirely in hyperbolic space without resorting to the tangent space. Experiments conducted on node classification and link prediction tasks on static graph datasets show the effectiveness of our proposed method. SRBGCN has a clear physical interpretation and can be used to build deep networks with greater representation capacity and less distorted features.
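To make the idea concrete, the following minimal NumPy sketch (ours, not the paper's code) composes a spatial rotation with a boost into a Lorentz transformation on the unit hyperboloid and checks that it keeps a point on the manifold. The function names and the random orthogonal parametrization are illustrative assumptions, not the learned transformations used in SRBGCN.

```python
import numpy as np

def minkowski_inner(u, v):
    # Lorentzian inner product <u, v>_L = -u_0 v_0 + sum_i u_i v_i
    return -u[0] * v[0] + u[1:] @ v[1:]

def spatial_rotation(d, rng):
    # Lorentz rotation: fixes the time axis, rotates the d spatial axes.
    # A random orthogonal matrix stands in for a learned rotation.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    R = np.eye(d + 1)
    R[1:, 1:] = Q
    return R

def boost(d, rapidity, axis=1):
    # Lorentz boost: mixes the time axis with one spatial axis.
    B = np.eye(d + 1)
    c, s = np.cosh(rapidity), np.sinh(rapidity)
    B[0, 0], B[0, axis], B[axis, 0], B[axis, axis] = c, s, s, c
    return B

rng = np.random.default_rng(0)
d = 4
x_space = rng.standard_normal(d)
# Lift onto the hyperboloid <x, x>_L = -1: x_0 = sqrt(1 + ||x_space||^2).
x = np.concatenate(([np.sqrt(1.0 + x_space @ x_space)], x_space))

L = boost(d, rapidity=0.7) @ spatial_rotation(d, rng)
y = L @ x
print(minkowski_inner(x, x), minkowski_inner(y, y))  # both ~ -1: manifold-preserving
```

Because both factors satisfy L^T diag(-1, 1, ..., 1) L = diag(-1, 1, ..., 1), their composition preserves the Lorentzian inner product, so no tangent-space detour or normalization is needed.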

3.1. GRAPH CONVOLUTIONAL NETWORKS

Consider a static graph G = {V, E}, where V = {v_1, v_2, . . . , v_n} is the set of n graph nodes and E is the set of graph edges. The edge set E can be encoded in an adjacency matrix A ∈ R^{n×n}, where A_{i,j} ∈ (0, 1] if there is a link between v_i and v_j and A_{i,j} = 0 otherwise. Each node v_i has a feature vector x_i ∈ R^d in the Euclidean space, and X denotes the set of features of all n nodes in the graph. The feature transformation step in GCNs can be formulated as

$$Y^{l} = X^{l} W^{l} + B^{l} \qquad (1)$$

where W^l is the weight matrix applied to the input X^l at layer l and B^l is the bias translation matrix. The weight matrix acts as a linear transformation, while the optional bias matrix makes the transformation affine. The subsequent step, feature aggregation from neighboring nodes with a nonlinear activation applied, can be formulated as

$$X^{l+1} = \sigma\left( D^{-1/2} (A + I) D^{-1/2} \, Y^{l} \right) \qquad (2)$$

where σ is an activation function and D^{-1/2}(A + I)D^{-1/2} is the normalized adjacency matrix, which normalizes the weights of the nodes in each neighborhood. D is a diagonal matrix with D_{ii} = 1 + Σ_j A_{ij}, and I is the identity matrix, which preserves each node's own features. X^{l+1} is the output of layer l and serves as the input to the next layer l + 1. A GCN is built by stacking a number of such layers.

Clearly, the linear transformation matrix W cannot be used in hyperbolic networks: an unconstrained transformation matrix will not keep points on the manifold, i.e., it is not manifold-preserving. The same applies to the aggregation step, since the Euclidean mean operation is not manifold-preserving.
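For reference, here is a minimal NumPy sketch of one Euclidean GCN layer implementing Eqs. (1) and (2). The feature dimensions and the tanh activation are illustrative choices, not prescribed by the text.

```python
import numpy as np

def gcn_layer(X, A, W, B, sigma=np.tanh):
    # Eq. (1): affine feature transformation Y = XW + B.
    Y = X @ W + B
    # Eq. (2): symmetrically normalized aggregation with self-loops.
    A_hat = A + np.eye(A.shape[0])          # A + I keeps each node's own features
    d_inv_sqrt = A_hat.sum(axis=1) ** -0.5  # from D_ii = 1 + sum_j A_ij
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return sigma(A_norm @ Y)

# Toy usage: a 3-node path graph with 2-d features.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = rng.standard_normal((3, 2))
W = rng.standard_normal((2, 2))
out = gcn_layer(X, A, W, B=np.zeros((3, 2)))  # the next layer's input X^{l+1}
print(out.shape)  # (3, 2)
```

As the surrounding text notes, neither the unconstrained matrix W nor the weighted mean in Eq. (2) keeps points on a hyperbolic manifold, which is the limitation SRBGCN addresses.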



Chami et al. (2019) proposed HGCNs, in which the network operations are performed in the tangent space of the manifold; they achieved better performance than the Euclidean analogues on node classification and link prediction tasks. Concurrently, Liu et al. (2019) proposed hyperbolic graph neural networks (HGNNs), which performed well on graph classification tasks. Several models have since been proposed using different hyperbolic models, especially the Lorentz and Poincaré models, for a variety of other tasks such as image segmentation (Gulcehre et al., 2018), word embeddings (Tifrea et al., 2018), human action recognition (Peng et al., 2020), text classification (Zhu et al., 2020), machine translation (Shimizu et al., 2020; Gulcehre et al., 2018), and knowledge graph embeddings (Chami et al., 2020). Gu et al. (2018) built a two-stream network for Euclidean and hyperbolic features (with operations on the tangent space) and used an interaction module to enhance the learnt feature representations in the two geometries. Peng et al. (2021) presented a comprehensive survey of hyperbolic networks. Zhang et al. (2021b) rebuilt the network operations of HGCNs to guarantee that the learnt features follow hyperbolic geometry and used the Lorentz centroid (Ratcliffe et al., 1994; Law et al., 2019) for aggregating the features. Zhang et al. (2021a) used attention modules in the hyperbolic space to build hyperbolic networks. Dai et al. (2021) built a hyperbolic network by imposing an orthogonal constraint on a sub-matrix of the transformation matrix (a subspace transformation). They used the same number of learnable parameters for the feature transformation step as the networks built on the tangent space, but the orthogonal constraint ensured that the transformation is manifold-preserving, so the parameters did not need to be learnt in the tangent space. For the feature aggregation step, they used the Einstein midpoint, defined in the Klein model (Ungar, 2005). Chen et al. (2022) used a normalization procedure to keep the points on the manifold and also applied normalization in the feature aggregation step; the idea is similar to learning a general transformation matrix, or performing aggregation, for spherical embeddings and then normalizing the resulting features to the norm of the sphere radius. In this work, we introduce a full-space manifold-preserving transformation matrix in SRBGCN that keeps the points on the manifold without any normalization.
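As a concrete illustration of the centroid-style aggregation mentioned above, here is a minimal NumPy sketch of the Lorentz centroid on the unit hyperboloid, in the closed form commonly attributed to Law et al. (2019). Unit negative curvature is assumed, and this is our illustration rather than the code of any of the cited works.

```python
import numpy as np

def minkowski_inner(u, v):
    # Lorentzian inner product <u, v>_L = -u_0 v_0 + sum_i u_i v_i
    return -u[0] * v[0] + u[1:] @ v[1:]

def lorentz_centroid(X, w):
    # The weighted sum of hyperboloid points is a time-like vector; rescaling
    # it by its Minkowski norm projects the aggregate back onto <x, x>_L = -1.
    m = w @ X
    return m / np.sqrt(-minkowski_inner(m, m))

def lift(x_space):
    # Place spatial coordinates on the hyperboloid: x_0 = sqrt(1 + ||x||^2).
    return np.concatenate(([np.sqrt(1.0 + x_space @ x_space)], x_space))

# Toy usage: aggregate two hyperboloid points with equal weights.
X = np.stack([lift(np.array([0.3, -0.1])),
              lift(np.array([-0.5, 0.2]))])
mu = lorentz_centroid(X, w=np.array([0.5, 0.5]))
print(minkowski_inner(mu, mu))  # ~ -1: the aggregate stays on the manifold
```

Unlike the tangent-space mean, this aggregation never leaves the manifold, which is the property the manifold-preserving approaches above share.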

