ISOMETRIC TRANSFORMATION INVARIANT AND EQUIVARIANT GRAPH CONVOLUTIONAL NETWORKS

Abstract

Graphs are one of the most important data structures for representing pairwise relations between objects. In particular, a graph embedded in a Euclidean space is essential for solving real-world problems such as physical simulations. A crucial requirement for applying graphs in Euclidean spaces to physical simulations is learning and inferring isometric transformation invariant and equivariant features in a computationally efficient manner. In this paper, we propose a set of transformation invariant and equivariant models based on graph convolutional networks, called IsoGCNs. We demonstrate that the proposed model achieves performance competitive with state-of-the-art methods on tasks related to geometrical and physical simulation data. Moreover, the proposed model can scale up to graphs with 1M vertices and perform inference faster than a conventional finite element analysis, which existing equivariant models cannot achieve.

1. INTRODUCTION

Graph-structured data embedded in Euclidean spaces can be utilized in many different fields, such as object detection, structural chemistry analysis, and physical simulations. Graph neural networks (GNNs) have been introduced to deal with such data. The crucial properties of GNNs include permutation invariance and equivariance. Besides permutations, isometric transformation invariance and equivariance must be addressed when considering graphs in Euclidean spaces because many properties of objects in a Euclidean space do not change under translation and rotation. Owing to such invariance and equivariance, 1) the interpretation of the model is facilitated; 2) the output of the model is stabilized and predictable; and 3) training is rendered efficient by eliminating the need for data augmentation, as discussed in the literature (Thomas et al., 2018; Weiler et al., 2018; Fuchs et al., 2020). Isometric transformation invariance and equivariance are indispensable, especially for physical simulations, because every physical quantity and physical law is either invariant or equivariant to such transformations. Another essential requirement for such applications is computational efficiency because the primary objective of learning a physical simulation is to replace a computationally expensive simulation method with a faster machine learning model. In the present paper, we propose IsoGCNs, a set of simple yet powerful models that provide computationally efficient isometric transformation invariance and equivariance based on graph convolutional networks (GCNs) (Kipf & Welling, 2017). Specifically, by simply tweaking the definition of an adjacency matrix, the proposed model can realize isometric transformation invariance. Because the proposed approach relies on graphs, it can deal with the complex shapes that are usually represented using mesh or point cloud data structures.
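To make the invariance and equivariance properties concrete, the following toy numpy check aggregates relative position vectors at each vertex of a fully connected point cloud. The aggregation function here is our illustrative choice, not the paper's exact IsoGCN operator: it merely demonstrates that a feature built from relative positions x_j − x_i is translation invariant and rotation equivariant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy point cloud: 5 vertices embedded in 3-D, fully connected.
x = rng.normal(size=(5, 3))

def moment_aggregate(x):
    """Aggregate relative position vectors at each vertex:
    m_i = sum_j (x_j - x_i) = sum_j x_j - n * x_i.
    This is a rank-1 (vector) feature per vertex."""
    return x.sum(axis=0) - len(x) * x

# Random proper rotation Q (via QR decomposition) and translation t.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0  # ensure det(Q) = +1 (a rotation, not a reflection)
t = rng.normal(size=3)

m = moment_aggregate(x)
m_transformed = moment_aggregate(x @ Q.T + t)

# Translation invariance and rotation equivariance: m'_i = Q m_i.
assert np.allclose(m_transformed, m @ Q.T)
```

Because the translation t cancels in every difference x_j − x_i, the feature is unchanged under translation, while the rotation Q factors out of the sum, giving exact equivariance without any data augmentation.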
Moreover, a specific form of the IsoGCN layer can be regarded as a spatial differential operator, which is essential for describing physical laws. In addition, we show that the proposed approach is computationally efficient, processing graphs with up to 1M vertices, which often appear in real physical simulations. The proposed model also exhibits faster inference than a conventional finite element analysis approach at the same level of accuracy. Therefore, an IsoGCN can suitably replace physical simulations given its ability to express physical laws and its fast, scalable computation. The corresponding implementation and the dataset are available online¹. The main contributions of the present paper can be summarized as follows:
• We construct isometric transformation invariant and equivariant GCNs, called IsoGCNs, for the specified input and output tensor ranks.
• We demonstrate that an IsoGCN model enjoys competitive performance against state-of-the-art baseline models on the considered tasks related to physical simulations.
• We confirm that IsoGCNs are scalable to graphs with 1M vertices and achieve inference considerably faster than conventional finite element analysis.
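Since IsoGCNs build on the GCN of Kipf & Welling (2017), a minimal sketch of one GCN layer, H_out = σ(Â H_in W), may help fix notation before the related work below. This is a toy numpy implementation on a 4-vertex graph, not the paper's code; the final assertion illustrates the permutation equivariance mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, H, W):
    """One GCN layer H_out = ReLU(A_hat @ H @ W), where A_hat is the
    renormalized adjacency matrix with self-loops (Kipf & Welling, 2017):
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)  # ReLU as the nonlinearity

# Small undirected graph (4 vertices), random features and weights.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))

out = gcn_layer(A, H, W)

# Permutation equivariance: relabeling the vertices (P A P^T, P H)
# simply permutes the rows of the output.
P = np.eye(4)[[2, 0, 3, 1]]
out_perm = gcn_layer(P @ A @ P.T, P @ H, W)
assert np.allclose(out_perm, P @ out)
```

The linear message passing scheme (a single sparse matrix product per layer) is what makes this layer, and by extension the IsoGCN built on it, cheap enough to scale to large graphs.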

2. RELATED WORK

Graph neural networks. The concept of a GNN was first proposed by Baskin et al. (1997) and Sperduti & Starita (1997) and then improved by Gori et al. (2005) and Scarselli et al. (2008). Although many variants of GNNs have been proposed, these models have been unified under the concept of message passing neural networks (Gilmer et al., 2017). Generally, message passing is computed with nonlinear neural networks, which can incur a tremendous computational cost. In contrast, the GCN developed by Kipf & Welling (2017) is a considerable simplification of a GNN that uses a linear message passing scheme expressed as H_out = σ(Â H_in W), where H_in (H_out) is the input (output) feature of the layer, Â is a renormalized adjacency matrix with self-loops, and W is a trainable weight. Among the variants of GNNs, the GCN is essential to the present study because the proposed model builds on GCNs for computational efficiency.

Invariant and equivariant neural networks. A function f : X → Y is said to be equivariant to a group G when f(g • x) = g • f(x) for all g ∈ G and x ∈ X, assuming that the group G acts on both X and Y. In particular, when f(g • x) = f(x), f is said to be invariant to the group G. Group equivariant convolutional neural networks were first proposed by Cohen & Welling (2016) for discrete groups. Subsequent studies have extended such networks to continuous groups (Cohen et al., 2018), three-dimensional data (Weiler et al., 2018), and general manifolds (Cohen et al., 2019). These methods are based on CNNs; thus, they cannot handle mesh or point cloud data structures as is. Specifically, 3D steerable CNNs (Weiler et al., 2018) use voxels (regular grids), which, though relatively easy to handle, are not efficient because they represent both the occupied and non-occupied parts of an object (Ahmed et al., 2018).
In addition, a voxelized object tends to lose the smoothness of its shape, which can lead to drastically different behavior in a physical simulation, as typically observed in structural analysis and computational fluid dynamics.

¹ https://github.com/yellowshippo/isogcn-iclr2021

Thomas et al. (2018) and Kondor (2018) discussed how to provide rotation equivariance to point clouds. Specifically, the tensor field network (TFN) (Thomas et al., 2018) is a point-cloud-based rotation and translation equivariant neural network whose layer can be written as

H^{(l)}_{\mathrm{out},i} = w^{ll} H^{(l)}_{\mathrm{in},i} + \sum_{k \geq 0} \sum_{j \neq i} W^{lk}(x_j - x_i) H^{(k)}_{\mathrm{in},j},

W^{lk}(x) = \sum_{J=|k-l|}^{k+l} \phi^{lk}_J(\|x\|) \sum_{m=-J}^{J} Y_{Jm}(x / \|x\|) Q^{lk}_{Jm},

where H^{(l)}_{\mathrm{in},i} (H^{(l)}_{\mathrm{out},i}) is a type-l input (output) feature at the ith vertex, \phi^{lk}_J : \mathbb{R}_{\geq 0} \to \mathbb{R} is a trainable function, Y_{Jm} is the mth component of the Jth spherical harmonic, and Q^{lk}_{Jm} is the Clebsch–Gordan coefficient. The SE(3)-Transformer (Fuchs et al., 2020) is a variant of the TFN with self-attention. These models achieve high expressibility based on spherical harmonics and message passing with nonlinear neural networks. However, for this reason, considerable computational resources

