AUTOENCODING HYPERBOLIC REPRESENTATION FOR ADVERSARIAL GENERATION

Abstract

With the recent advance of geometric deep learning, neural networks have been extensively used for data in non-Euclidean domains. In particular, hyperbolic neural networks have proved successful in processing the hierarchical information of data. However, many hyperbolic neural networks are numerically unstable during training, which precludes using complex architectures. This crucial problem makes it difficult to build hyperbolic generative models for real and complex data. In this work, we propose a hyperbolic generative network in which we design a novel architecture and novel layers to improve training stability. Our proposed network contains three parts: first, a hyperbolic autoencoder (AE) that produces hyperbolic embeddings for input data; second, a hyperbolic generative adversarial network (GAN) for generating the hyperbolic latent embedding of the AE from simple noise; third, a generator that inherits the decoder from the AE and the generator from the GAN. We call this network the hyperbolic AE-GAN, or HAEGAN for short. The architecture of HAEGAN fosters expressive representation in the hyperbolic space, and the specific design of its layers ensures numerical stability. Experiments show that HAEGAN is able to generate complex data with state-of-the-art structure-related performance.

1. INTRODUCTION

High-dimensional data often exhibit an underlying geometric structure, which cannot be easily captured by neural networks designed for Euclidean spaces. Recently, there has been intense interest in learning good representations for hierarchical data, for which the most natural underlying geometry is hyperbolic. A hyperbolic space is a Riemannian manifold with a constant negative curvature (Anderson, 2006). The exponential growth of volume with radius provides the hyperbolic space with high capacity, which makes it particularly suitable for modeling tree-like hierarchical structures. Hyperbolic representations have been successfully applied to, for instance, social network data in product recommendation (Wang et al., 2019), molecular data in drug discovery (Yu et al., 2020; Wu et al., 2021), and skeletal data in action recognition (Peng et al., 2020). Many recent works (Ganea et al., 2018; Shimizu et al., 2021; Chen et al., 2021) have successfully designed hyperbolic neural operations. These operations have been used in generative models for generating samples in the hyperbolic space. For instance, several recent works (Nagano et al., 2019; Mathieu et al., 2019; Dai et al., 2021b) have built hyperbolic variational autoencoders (VAE) (Kingma & Welling, 2014). On the other hand, Lazcano et al. (2021) have generalized generative adversarial networks (GAN) (Goodfellow et al., 2014; Arjovsky et al., 2017) to the hyperbolic space. However, the above hyperbolic generative models are known to suffer from gradient explosion when the networks are deep. In order to build hyperbolic networks that can generate real data, a framework is needed that offers both representational power and numerical stability. To this end, we design a novel hybrid model which learns complex structures and hyperbolic embeddings from data, and then generates examples by sampling random noise in the hyperbolic space.
Altogether, our model contains three parts: first, we use a hyperbolic autoencoder (AE) to learn the embedding of the training data in a latent hyperbolic space; second, we use a hyperbolic GAN to learn to generate the latent hyperbolic distribution by passing wrapped normal noise through its generator; third, we generate samples by applying sequentially the generator of the GAN and the decoder of the AE. We name our model the Hyperbolic AE-GAN, or HAEGAN for short. The advantage of this architecture is twofold: first, it enjoys expressivity, since the noise passes through the layers of both the generator and the decoder; second, it allows flexible design of the AE according to the type of input data, which does not affect the sampling power of the GAN. In addition, HAEGAN avoids the complicated form of the ELBO in hyperbolic VAEs, which is one source of numerical instability. We highlight the main contributions of this paper as follows:

• HAEGAN is a novel hybrid AE-GAN framework for learning hyperbolic distributions that aims for both expressivity and numerical stability.
• We validate the Wasserstein GAN formulation in HAEGAN, in particular the way of sampling from the geodesic connecting a real sample and a generated sample.
• We design a novel concatenation layer in the hyperbolic space. We extensively investigate its numerical stability via theoretical and experimental comparisons.
• In the experiments, we illustrate that HAEGAN is not only able to faithfully generate synthetic hyperbolic data, but also able to generate real data with sound quality. In particular, we consider the molecular generation task and show that HAEGAN achieves state-of-the-art performance, especially in metrics related to structural properties.
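The three-part pipeline above can be sketched schematically. The snippet below is a minimal illustration with toy callables standing in for the trained hyperbolic networks (the names `encoder`, `decoder`, `generator`, and `sample` are hypothetical, not from the paper's code); it only shows the generator-then-decoder sampling order, not the actual hyperbolic layers or training procedures.

```python
import numpy as np

# Toy stand-ins for the three trained components of HAEGAN.
# In the paper each is a hyperbolic network; here they are placeholder
# maps (hypothetical) that only illustrate how the pieces compose.
def encoder(x):      # part 1: data -> latent hyperbolic embedding
    return x.mean(axis=-1, keepdims=True)

def decoder(z):      # part 1: latent embedding -> reconstructed data
    return np.repeat(z, 4, axis=-1)

def generator(eps):  # part 2: wrapped-normal-style noise -> latent embedding
    return 2.0 * eps

def sample(n_samples, latent_dim=1, seed=0):
    """Part 3 of HAEGAN: push noise through the GAN generator,
    then through the AE decoder, to produce new samples."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_samples, latent_dim))
    return decoder(generator(eps))
```

Note that `encoder` is used only during training (the AE reconstruction stage and producing latent targets for the GAN); sampling never calls it, which is what decouples the AE design from the GAN.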

2. BACKGROUND IN HYPERBOLIC NEURAL NETWORKS

2.1. HYPERBOLIC GEOMETRY

Hyperbolic geometry is a special kind of Riemannian geometry with a constant negative curvature (Cannon et al., 1997; Anderson, 2006). To extract hyperbolic representations, it is necessary to choose a "model", or coordinate system, for the hyperbolic space. Popular choices include the Poincaré ball model and the Lorentz model, of which the latter is found to be numerically more stable (Nickel & Kiela, 2018). We work with the Lorentz model L^n_K = (L, g) with constant negative curvature K, which is an n-dimensional manifold L embedded in the (n+1)-dimensional Minkowski space, together with the Riemannian metric tensor g = diag([-1, 1_n^⊤]), where 1_n denotes the n-dimensional vector whose entries are all 1's. Every point in L^n_K is represented by x = [x_t, x_s^⊤]^⊤ with x_t > 0 and x_s ∈ R^n, and satisfies ⟨x, x⟩_L = 1/K, where ⟨·, ·⟩_L is the Lorentz inner product induced by g: ⟨x, y⟩_L := x^⊤ g y = -x_t y_t + x_s^⊤ y_s for x, y ∈ L^n_K. In the rest of the paper, we will refer to x_t as the "time component" and x_s as the "spatial component". In the following, we describe some notation; extensive details are provided in Appendix A.

Notation. We use d_L(x, y) to denote the length of a geodesic ("distance" along the manifold) connecting x, y ∈ L^n_K. For each point x ∈ L^n_K, the tangent space at x is denoted by T_x L^n_K, and the norm of v ∈ T_x L^n_K is ∥v∥_L = √⟨v, v⟩_L. For x ∈ L^n_K and v ∈ T_x L^n_K, we use exp^K_x(v) to denote the exponential map of v at x; conversely, we use log^K_x : L^n_K → T_x L^n_K to denote the logarithmic map, which satisfies log^K_x(exp^K_x(v)) = v. For two points x, y ∈ L^n_K, we use PT^K_{x→y} to denote the parallel transport map, which "transports" a vector from T_x L^n_K to T_y L^n_K along the geodesic from x to y.
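For concreteness, the basic Lorentz-model operations above can be written out numerically. The following is a small NumPy sketch (the function names are our own, not from the paper) of the Lorentz inner product, the geodesic distance, and the exponential and logarithmic maps for curvature K < 0; a production implementation would need more careful clipping for numerical stability.

```python
import numpy as np

K = -1.0  # constant negative curvature of L^n_K (K < 0)

def lorentz_inner(x, y):
    """Lorentz inner product <x, y>_L = -x_t y_t + x_s^T y_s."""
    return -x[0] * y[0] + x[1:] @ y[1:]

def origin(n):
    """The "north pole" [1/sqrt(-K), 0, ..., 0] of L^n_K."""
    o = np.zeros(n + 1)
    o[0] = 1.0 / np.sqrt(-K)
    return o

def dist(x, y):
    """Geodesic distance d_L(x, y) = arccosh(K <x, y>_L) / sqrt(-K)."""
    # The clip guards against values slightly below 1 from rounding error.
    return np.arccosh(np.clip(K * lorentz_inner(x, y), 1.0, None)) / np.sqrt(-K)

def expmap(x, v):
    """Exponential map exp_x^K(v) of a tangent vector v in T_x L^n_K."""
    vn = np.sqrt(max(lorentz_inner(v, v), 0.0))  # ||v||_L
    if vn < 1e-12:
        return x
    a = np.sqrt(-K) * vn
    return np.cosh(a) * x + np.sinh(a) * v / a

def logmap(x, y):
    """Logarithmic map log_x^K(y), the inverse of expmap at x."""
    u = y - K * lorentz_inner(x, y) * x          # project y onto T_x L^n_K
    un = np.sqrt(max(lorentz_inner(u, u), 0.0))
    return dist(x, y) * u / un if un > 1e-12 else np.zeros_like(x)
```

For example, with v = [0, 0.5, 0] tangent at the origin of L^2, expmap lands on a point y with ⟨y, y⟩_L = 1/K, d_L(origin, y) = 0.5, and logmap recovers v exactly.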

2.2. FULLY HYPERBOLIC LAYERS

One way to define hyperbolic neural operations is to work in the tangent space, which is Euclidean. However, working with the tangent space requires taking exponential and logarithmic maps, which causes numerical instability. Moreover, tangent spaces are only local approximations of the hyperbolic space, but neural network operations are usually not local. Since generative networks have complex structures, we want to avoid using the tangent space whenever possible. The following hyperbolic layers take a "fully hyperbolic" approach and perform operations directly in hyperbolic spaces. The most fundamental hyperbolic neural layer, the hyperbolic linear layer (Chen et al., 2021), is a trainable "linear transformation" that maps from L^n_K to L^m_K. We remark that "linear" is by analogy with the Euclidean counterpart; the layer in fact also incorporates activation, bias and normalization. In general, for an input x ∈ L^n_K, the hyperbolic linear layer outputs

y = HLinear_{n,m}(x) = [ √(∥h(Wx, v)∥₂² − 1/K), h(Wx, v)^⊤ ]^⊤, (1)

where W and v are trainable parameters, h(Wx, v) gives the spatial component of the output, and the time component is chosen so that ⟨y, y⟩_L = 1/K, i.e., y ∈ L^m_K.
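As a sanity check on Eq. (1), the following sketch reduces h(Wx, v) to plain Wx for illustration (whereas the full layer of Chen et al. (2021) also folds in activation, bias and normalization) and shows how recomputing the time component places the output back on the manifold.

```python
import numpy as np

K = -1.0  # constant negative curvature

def lorentz_inner(x, y):
    """Lorentz inner product <x, y>_L = -x_t y_t + x_s^T y_s."""
    return -x[0] * y[0] + x[1:] @ y[1:]

def hlinear(x, W):
    """Simplified hyperbolic linear layer in the spirit of Eq. (1).
    Here h(Wx, v) is reduced to W @ x (an illustrative assumption);
    the time component sqrt(||h||_2^2 - 1/K) enforces <y, y>_L = 1/K,
    so the output lies on L^m_K by construction."""
    h = W @ x                        # candidate spatial component in R^m
    t = np.sqrt(h @ h - 1.0 / K)     # time component restoring the constraint
    return np.concatenate(([t], h))
```

Since 1/K < 0, the quantity under the square root is always positive, so the output satisfies the manifold constraint for any W; this is why the layer needs no separate projection step.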

