GAUGE EQUIVARIANT MESH CNNS: ANISOTROPIC CONVOLUTIONS ON GEOMETRIC GRAPHS

Abstract

A common approach to define convolutions on meshes is to interpret them as a graph and apply graph convolutional networks (GCNs). Such GCNs utilize isotropic kernels and are therefore insensitive to the relative orientation of vertices and thus to the geometry of the mesh as a whole. We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels. Since the resulting features carry orientation information, we introduce a geometric message passing scheme defined by parallel transporting features over mesh edges. Our experiments validate the significantly improved expressivity of the proposed model over conventional GCNs and other methods.

1. INTRODUCTION

Convolutional neural networks (CNNs) have been established as the default method for many machine learning tasks like speech recognition or planar and volumetric image classification and segmentation. Most CNNs are restricted to flat or spherical geometries, where convolutions are easily defined and optimized implementations are available. The empirical success of CNNs on such spaces has generated interest in generalizing convolutions to more general spaces like graphs or Riemannian manifolds, creating a field now known as geometric deep learning (Bronstein et al., 2017). A case of specific interest is convolution on meshes, the discrete analog of 2-dimensional embedded Riemannian manifolds. Mesh CNNs can be applied to tasks such as detecting shapes, registering different poses of the same shape, and shape segmentation.

If we forget the positions of vertices and which vertices form faces, a mesh M can be represented by a graph G. This allows for the application of graph convolutional networks (GCNs) to process signals on meshes. However, when representing a mesh by a graph, we lose important geometrical information. In particular, in a graph there is no notion of angle between, or ordering of, two of a node's incident edges (see figure 1). Hence, a GCN's output at a node p is designed to be independent of relative angles and invariant to any permutation of its neighbours q_i ∈ N(p). A graph convolution on a mesh graph therefore corresponds to applying an isotropic convolution kernel.
Isotropic filters are insensitive to the orientation of input patterns, so their features are strictly less expressive than those of orientation-aware anisotropic filters. To address this limitation of graph networks we propose Gauge Equivariant Mesh CNNs (GEM-CNNs), which minimally modify GCNs such that they are able to use anisotropic filters while sharing weights across different positions and respecting the local geometry. One obstacle in sharing anisotropic kernels, which are functions of the angle θ_pq of neighbour q with respect to vertex p, over multiple vertices of a mesh is that there is no unique way of selecting a reference neighbour q_0, which has the direction θ_pq_0 = 0. The reference neighbour, and hence the orientation of the neighbours, needs to be chosen arbitrarily. In order to guarantee the equivalence of the features resulting from different choices of orientations, we adapt Gauge Equivariant CNNs (Cohen et al., 2019b) to general meshes. The kernels of our model are thus designed to be equivariant under gauge transformations, that is, to guarantee that the responses for different kernel orientations are related by a prespecified transformation law. Such features are identified as geometric objects like scalars, vectors, tensors, etc., depending on the specific choice of transformation law. In order to compare such geometric features at neighbouring vertices, they need to be parallel transported along the connecting edge.

In our implementation we first specify the transformation laws of the feature spaces and compute a space of gauge equivariant kernels. Then we pick arbitrary reference orientations at each node, relative to which we compute neighbour orientations, and compute the corresponding edge transporters. Given these quantities, we define the forward pass as a message passing step via edge transporters followed by a contraction with the equivariant kernels evaluated at the neighbour orientations.
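The forward pass described above can be sketched for the simplest nontrivial feature type, tangent vectors, on which a gauge transformation acts by a 2D rotation. The sketch below is a minimal illustration under our own assumptions, not the paper's implementation: the neighbour angles and edge transporters are assumed precomputed and passed in as inputs, all names are ours, and the kernel K_neigh is left as an arbitrary function of the angle rather than being constrained to the full space of gauge equivariant kernels.

```python
import numpy as np

def rotation(a):
    """2x2 rotation matrix; for tangent-vector features, edge
    transporters and gauge transformations both take this form."""
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def gem_conv_step(f, neighbours, theta, transport, K_self, K_neigh):
    """One message passing step with anisotropic kernels and transport.

    f          : (V, 2) tangent-vector feature per vertex, expressed in
                 that vertex's arbitrarily chosen reference gauge.
    neighbours : dict p -> list of neighbour indices N_p.
    theta      : theta[p][q], angle of neighbour q in p's gauge.
    transport  : transport[p][q], 2x2 rotation carrying a feature from
                 q's gauge into p's gauge.
    K_self     : (2, 2) self-interaction matrix.
    K_neigh    : callable angle -> (2, 2); anisotropy means the kernel
                 genuinely depends on the neighbour direction theta_pq.
    """
    out = np.zeros_like(f)
    for p, nbrs in neighbours.items():
        out[p] = K_self @ f[p]
        for q in nbrs:
            # transport the neighbour's feature into p's gauge, then
            # apply the direction-dependent kernel
            out[p] += K_neigh(theta[p][q]) @ (transport[p][q] @ f[q])
    return out
```

With K_neigh a constant function and all transporters set to the identity, this reduces to an isotropic graph convolution, matching the claim that conventional GCNs are a special case.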
Algorithmically, Gauge Equivariant Mesh CNNs are therefore just GCNs with anisotropic, gauge equivariant kernels and message passing via parallel transporters. Conventional GCNs are recovered in this framework by the specific choice of isotropic kernels and trivial edge transporters, given by identity maps. In Sec. 2, we give an outline of our method, deferring details to Secs. 3 and 4. In Sec. 3.2, we describe how to compute the general geometric quantities, not specific to our method, that are needed to evaluate the convolution. In our experiments in Sec. 6.1, we find that the enhanced expressiveness of Gauge Equivariant Mesh CNNs enables them to outperform conventional GCNs and other prior work in a shape correspondence task.

2. CONVOLUTIONS ON GRAPHS WITH GEOMETRY

We consider the problem of processing signals on discrete 2-dimensional manifolds, or meshes M. Such meshes are described by a set V of vertices in R^3 together with a set F of tuples, each consisting of the vertices at the corners of a face. For a mesh to describe a proper manifold, each edge needs to be connected to two faces, and the neighbourhood of each vertex needs to be homeomorphic to a disk. A mesh M induces a graph G by forgetting the coordinates of the vertices while preserving the edges.

A conventional graph convolution between kernel K and signal f, evaluated at a vertex p, can be defined by (K ⋆ f)_p = K_self f_p + Σ_{q ∈ N_p} K_neigh f_q, where N_p is the set of neighbours of p in G, and K_self ∈ R^{C_out × C_in} and K_neigh ∈ R^{C_out × C_in} are two linear maps which model a self interaction and the neighbour contribution, respectively. Importantly, graph convolution does not distinguish different neighbours, because each feature vector f_q is multiplied by the same matrix K_neigh and then summed. For this reason we say the kernel is isotropic.

Consider the example in figure 1, where on the left and right, the neighbourhood of one vertex p, containing neighbours q ∈ N_p, is visualized. An isotropic kernel would propagate the signal from the neighbours to p in exactly the same way in both neighbourhoods, even though the neighbourhoods are geometrically distinct. For this reason, our method uses direction sensitive (anisotropic) kernels instead of isotropic kernels. Anisotropic kernels are inherently more expressive than isotropic ones, which is why they are used universally in conventional planar CNNs.
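As a concrete illustration, the isotropic graph convolution defined above can be written in a few lines of NumPy. This is a minimal sketch with names of our own choosing, not code from the paper:

```python
import numpy as np

def isotropic_graph_conv(f, neighbours, K_self, K_neigh):
    """Isotropic graph convolution:
    (K * f)_p = K_self f_p + sum_{q in N_p} K_neigh f_q.

    f          : (V, C_in) array, one feature vector per vertex.
    neighbours : dict p -> list of neighbour indices N_p.
    K_self     : (C_out, C_in) self-interaction matrix.
    K_neigh    : (C_out, C_in) matrix shared by all neighbours.
    """
    V, C_in = f.shape
    out = np.empty((V, K_self.shape[0]))
    for p in range(V):
        msg = np.zeros(C_in)
        for q in neighbours[p]:   # every neighbour is treated identically
            msg += f[q]
        out[p] = K_self @ f[p] + K_neigh @ msg
    return out
```

Because K_neigh is applied uniformly before summation, permuting a vertex's neighbour list leaves the output unchanged; this is exactly the isotropy, and hence the loss of orientation information, discussed above.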



Figure 1: Two local neighbourhoods around vertices p and their representations in the tangent planes T_pM. The distinct geometry of the neighbourhoods is reflected in the different angles θ_pq_i of the incident edges from neighbours q_i. Graph convolutional networks apply isotropic kernels and can therefore not distinguish the two neighbourhoods. Gauge Equivariant Mesh CNNs apply anisotropic kernels and are therefore sensitive to orientations. The arbitrariness of the reference orientations, determined by a choice of reference neighbour q_0, is accounted for by the gauge equivariance of the model.

