PARAMETRIZING PRODUCT SHAPE MANIFOLDS BY COMPOSITE NETWORKS

Abstract

Parametrizations of data manifolds in shape spaces can be computed using the rich toolbox of Riemannian geometry. This, however, often comes with high computational costs, which raises the question of whether one can learn an efficient neural network approximation. We show that this is indeed possible for shape spaces with a special product structure, namely those smoothly approximable by a direct sum of low-dimensional manifolds. Our proposed architecture leverages this structure by separately learning approximations for the low-dimensional factors and a subsequent combination. After developing the approach as a general framework, we apply it to a shape space of triangular surfaces. Here, typical examples of data manifolds are given through datasets of articulated models and can be factorized, for example, by a Sparse Principal Geodesic Analysis (SPGA). We demonstrate the effectiveness of our proposed approach with experiments on synthetic data as well as manifolds extracted from data via SPGA.

1. INTRODUCTION

Modeling collections of shapes as data on Riemannian manifolds has enabled the usage of a rich set of mathematical tools in areas such as computer graphics and vision, medical imaging, computational biology, and computational anatomy. For example, Principal Geodesic Analysis, a generalization of Principal Component Analysis, can be used to parametrize submanifolds approximating given data points while preserving structure of the data such as its invariance to rigid motion. The evaluation of such a parametrization, however, typically comes at a high computational cost as the Riemannian exponential, mapping infinitesimal shape variations to shapes, has to be evaluated. This motivates trying to learn an efficient approximation for these parametrizations. Direct application of deep neural networks (NNs), however, proves ineffective for high-dimensional spaces with strongly nonlinear variations. Therefore, we consider more structured shape manifolds, namely, we assume that they can be approximated by an affine sum of low-dimensional submanifolds. In computer graphics, typical examples of data manifolds are given through datasets of articulated models, e.g., human bodies, faces, or hands. Then, the desired structure of an affine sum of factor manifolds can be produced, for example, by a Sparse Principal Geodesic Analysis (SPGA). Motivated by this, we exploit the data manifolds' approximability with such affine sums: We separately approximate the exponential map on the factor manifolds by fully connected NNs and the subsequent combination of factors by a convolutional NN to yield our approximate parametrization. In formulas, based on a judiciously chosen decomposition v = v_1 + ... + v_J, our aim is to approximate the Riemannian exponential exp_z(v) by Ψ_ζ(ψ_{ζ_1}(v_1), ..., ψ_{ζ_J}(v_J)), where Ψ_ζ is a NN and the ψ_{ζ_j} are further NNs approximating the Riemannian exponential exp_z on the low-dimensional factor manifolds.
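The composite evaluation Ψ_ζ(ψ_{ζ_1}(v_1), ..., ψ_{ζ_J}(v_J)) can be sketched in a few lines of numpy. This is a minimal illustration only: the weights are random and untrained, the dimensions (J, d_factor, d_code, n_verts) are hypothetical, and the combination network Ψ, which in the paper is convolutional, is replaced here by a second small MLP for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(dims):
    """Random small MLP with tanh activations (stand-in for a trained network)."""
    weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(dims[:-1], dims[1:])]
    def forward(x):
        for W in weights[:-1]:
            x = np.tanh(x @ W)
        return x @ weights[-1]
    return forward

# Hypothetical sizes: J = 3 factors, each factor tangent vector is 4-dimensional,
# each factor network psi_j outputs a code of size 16, and the combination
# network Psi maps the stacked codes to a mesh with n_verts * 3 coordinates.
J, d_factor, d_code, n_verts = 3, 4, 16, 100

psi = [mlp([d_factor, 32, d_code]) for _ in range(J)]  # factor networks
Psi = mlp([J * d_code, 64, n_verts * 3])               # combination network

def approx_exp(v_factors):
    """Approximate exp_z(v_1 + ... + v_J) by Psi(psi_1(v_1), ..., psi_J(v_J))."""
    codes = [psi_j(v_j) for psi_j, v_j in zip(psi, v_factors)]
    return Psi(np.concatenate(codes)).reshape(n_verts, 3)

vertices = approx_exp([rng.standard_normal(d_factor) for _ in range(J)])
```

The key design point carried over from the paper is that each ψ_j only sees its own low-dimensional factor v_j; only the final combination network couples the factors.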
We develop our approach focusing on the shape space of discrete shells, where shapes are given by triangle meshes and the manifold is equipped with an elasticity-based metric. In principle, our approach is also applicable to other shape spaces, such as manifolds of images, and we include remarks on how we propose this could work. We evaluate our approach with experiments on data manifolds of triangle meshes, both synthetic ones and ones extracted from data via SPGA, and we demonstrate that the proposed composite network architecture outperforms both a monolithic fully connected network architecture and an approach based on the affine combination of the factors. We see this work as a first step toward using NNs to accelerate the complex computations of shape manifold parametrizations. Therefore, we think that our approach has great potential to stimulate further research in this direction, which could in turn advance the applications of Riemannian shape spaces.

Contributions In summary, the contributions of this paper are
• combining the Riemannian exponential map on shape spaces and neural network methodology for the efficient parametrization of shape space data manifolds,
• demonstrating the applicability of such an approach for data manifolds which can be smoothly approximated via direct sums of low-dimensional submanifolds,
• using a combination of fully connected neural networks for the factorwise Riemannian exponential maps and a convolutional network to couple them,
• verifying that such a setup works well with existing methods to construct product manifolds, such as Sparse Principal Geodesic Analysis, and
• showing that the composite network architecture outperforms alternative approaches.

2. RELATED WORK

Shape Spaces Shape spaces are manifolds in which each point is a shape, e.g., a triangle mesh or an image. A Riemannian metric on such a space provides means to define distances between shapes, to interpolate between shapes by computing shortest geodesic paths, and to explore the space by constructing the geodesic curve starting from a point into a given direction. Shape spaces have proven useful for applications in areas such as computer graphics (Kilian et al., 2007; Heeren et al., 2012; Wang et al., 2018) and vision (Heeren et al., 2018; Xie et al., 2014), medical imaging (Kurtek et al., 2011b; Samir et al., 2014; Kurtek et al., 2016; Bharath et al., 2018), computational biology (Laga et al., 2014), and computational anatomy (Miller et al., 2006; Pennec, 2009; Kurtek et al., 2011a). For an introduction to the topic, we refer to the textbook of Younes (2010).

Shape Space of Meshes

Triangle meshes are widely used to represent shapes in computer graphics and vision. Riemannian metrics on shape spaces of triangle meshes can be defined geometrically, using norms on function spaces on the meshes (Kilian et al., 2007), or physics-based, considering the meshes as thin shells and measuring the dissipation required to deform the shells (Heeren et al., 2012; 2014). The computation of geodesic curves in these spaces requires numerically solving high-dimensional nonlinear variational problems, which can be costly. For shape interpolation problems, model reduction methods can be used to efficiently find approximate solutions (Brandt et al., 2016; von Radziewsky et al., 2016).

Statistics in Shape Spaces Data in a Riemannian shape space can be analyzed using Principal Geodesic Analysis (PGA) (Fletcher et al., 2004; Pennec, 2006). Analogous to principal component analysis (PCA) for data in Euclidean spaces, PGA can construct low-dimensional latent representations that preserve much of the variability in the data. This is achieved by mapping the data with a nonlinear mapping, the Riemannian logarithmic map, from the manifold to a linear space, the tangent space at the mean shape, and computing a PCA there. Latent variables of the PCA are then mapped with the inverse mapping, the Riemannian exponential map, onto the manifold, so that the latent space describes a submanifold of the shape space. A PGA in shape spaces of meshes was introduced in (Heeren et al., 2018) and used to obtain a low-dimensional, nonlinear, rigid body motion invariant description of shape variation from data.

Sparse PGA While PCA modes involve all variables of the data, Sparse Principal Component Analysis (Zou et al., 2006) constructs modes that involve just a few variables. This is achieved by adding a sparsity-encouraging term to the objective that defines the modes. Based on this idea, Neumann et al. (2013) proposed a scheme for extracting Sparse Localized Deformation Components (SPLOCS) from a dataset of non-rigid shapes. Since SPLOCS are linear modes, they are well-suited to accurately describe small deformations such as face motions. To increase the range of deformations and to compensate for linearization artifacts, Huang et al. (2014) integrated SPLOCS with gradient-domain techniques, and Wang et al. (2017; 2021) with edge lengths and dihedral angles. In (Sassen et al., 2020b), a Sparse Principal Geodesic Analysis (SPGA) was introduced. Similar to PGA, the SPGA modes are nonlinear and rigid motion invariant. On top of that, however, the SPGA modes describe localized deformations. Due to the localization, many pairs of SPGA modes have disjoint support and are therefore independent of each other. We want to take advantage of this property to effectively learn the reconstruction of points on the manifold from their latent representation through an adapted network structure.
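The linear core of the PGA construction described above (log map to the tangent space, PCA there, exp map back) can be sketched as follows. This is a minimal numpy sketch with synthetic tangent vectors and hypothetical dimensions (N data points, ambient dimension D, k modes); in the actual shape-space setting the tangent vectors would come from the expensive Riemannian logarithmic map, and the reconstruction would be pushed back to the manifold with the exponential map rather than read off in the tangent space.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: logarithmic maps of N data shapes at the mean shape,
# flattened into tangent vectors of dimension D.
N, D, k = 50, 30, 5
tangent_vectors = rng.standard_normal((N, D))

# PCA in the tangent space at the mean: center the data and take the
# leading k right singular vectors as (linearized) principal modes.
mean = tangent_vectors.mean(axis=0)
centered = tangent_vectors - mean
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
modes = Vt[:k]                        # (k, D), orthonormal rows

# Latent representation of each shape and its tangent-space reconstruction;
# on the manifold, `reconstructed` would be mapped back with exp.
latent = centered @ modes.T           # (N, k)
reconstructed = mean + latent @ modes # (N, D)
```

Sparse variants such as SPGA modify the objective defining `modes` with a sparsity-encouraging term, so that each mode has localized support, but the encode/decode structure stays the same.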

