PARAMETERIZED PSEUDO-DIFFERENTIAL OPERATORS FOR GRAPH CONVOLUTIONAL NEURAL NETWORKS

Abstract

We present a novel graph convolutional layer that is fast, conceptually simple, and provides high accuracy with reduced overfitting. Based on pseudo-differential operators, our layer operates on graphs with relative position information available for each pair of connected nodes. We evaluate our method on a variety of supervised learning tasks, including superpixel image classification using the MNIST, CIFAR10, and CIFAR100 superpixel datasets, node correspondence using the FAUST dataset, and shape classification using the ModelNet10 dataset. The new layer outperforms multiple recent architectures on superpixel image classification tasks using the MNIST and CIFAR100 superpixel datasets and performs comparably with recent results on the CIFAR10 superpixel dataset. We measure test accuracy without bias to the test set by selecting the model with the best training accuracy. The new layer achieves a test error rate of 0.80% on the MNIST superpixel dataset, improving on the closest reported rate of 0.95% by more than 15% in relative terms. After dropping roughly 70% of the edge connections from the input by performing a Delaunay triangulation, our model still achieves a competitive error rate of 1.04%.

1. INTRODUCTION

Convolutional neural networks have performed remarkably well on tasks such as image classification, segmentation, and object detection (Khan et al., 2020). While there have been diverse architectural design innovations leading to improved accuracies across these tasks, all of these tasks share the common property that they operate on structured Euclidean domain inputs. A growing body of research has followed on how to transfer these successes to non-Euclidean domains, such as manifolds and graphs. We focus on unstructured graphs which represent discretizations of an underlying metric space. These data types are ubiquitous in computational physics, faceted surface meshes, and (with superpixel conversion) images. Previous efforts to extend CNNs to this type of data have involved parameterized function approximations on localized neighborhoods, such as MoNet (Monti et al., 2017) and SplineCNN (Fey et al., 2017). These function approximations (Gaussian mixture models in the case of MoNet and B-spline kernels in the case of SplineCNN) are complex and relatively expensive to calculate compared to CNN kernels. Inspired by earlier work in shape correspondence (Boscaini et al., 2016), image segmentation on the unit sphere (Jiang et al., 2019), and low-dimensional embeddings of computational physics data (Tencer & Potter, 2020), we seek to utilize parameterized differential operators (PDOs) to construct convolution kernels. In contrast to MoNet and SplineCNN, parameterized differential operators are cheap to compute and involve only elementary operations. Boscaini et al. (2016) used anisotropic diffusion kernels while Jiang et al. (2019) included gradient operators in addition to an isotropic diffusion operator. Tencer & Potter (2020) performed an ablation study of the differential operators used and demonstrated that including the gradient operators is broadly beneficial, but that little is gained by including additional terms.
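To make the PDO idea concrete, the following minimal sketch (not the paper's implementation; names such as `pdo_conv` are illustrative) builds a convolution as a learned linear combination of fixed sparse differential operators on a toy path graph. Only the identity and combinatorial Laplacian (isotropic diffusion) terms are shown; gradient operators would enter the operator basis in the same way, and in practice the scalar weights are trainable parameters with one set per input/output channel pair.

```python
import numpy as np
import scipy.sparse as sp

# Toy 4-node path graph (0-1-2-3); operators are precomputed once per mesh.
n = 4
A = sp.diags([1.0, 1.0], [-1, 1], shape=(n, n))   # adjacency of the path graph
D = sp.diags(np.asarray(A.sum(axis=1)).ravel())   # degree matrix
L = D - A                                         # combinatorial Laplacian (diffusion term)
I = sp.eye(n)                                     # identity (pointwise term)
ops = [I, L]                                      # operator basis; gradient ops added similarly

def pdo_conv(x, ops, weights):
    """Apply a PDO-style convolution: a weighted sum of sparse
    operator-vector products. Each operator application is a cheap
    sparse mat-vec, in contrast to evaluating Gaussian-mixture or
    B-spline kernels per edge."""
    return sum(w * (op @ x) for op, w in zip(ops, weights))

x = np.array([1.0, 2.0, 3.0, 4.0])  # one feature channel over the nodes
w = np.array([0.5, 0.1])            # weights (trainable in a real layer; fixed here)
y = pdo_conv(x, ops, w)             # -> array([0.4, 1. , 1.5, 2.1])
```

The key efficiency point is that the operators are sparse and shared across channels, so a layer's forward pass reduces to a handful of sparse matrix-vector products plus a dense channel-mixing step.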
Prior work (Jiang et al., 2019; Tencer & Potter, 2020) has used differential operators precomputed for specific meshes. This approach has two drawbacks: (1) precomputing operators is not practical

