EFFICIENT GENERALIZED SPHERICAL CNNS

Abstract

Many problems across computer vision and the natural sciences require the analysis of spherical data, for which representations may be learned efficiently by encoding equivariance to rotational symmetries. We present a generalized spherical CNN framework that encompasses various existing approaches and allows them to be leveraged alongside each other. The only existing non-linear spherical CNN layer that is strictly equivariant has complexity $\mathcal{O}(C^2 L^5)$, where $C$ is a measure of representational capacity and $L$ the spherical harmonic bandlimit. Such a high computational cost often prohibits the use of strictly equivariant spherical CNNs. We develop two new strictly equivariant layers with reduced complexity $\mathcal{O}(C L^4)$ and $\mathcal{O}(C L^3 \log L)$, making larger, more expressive models computationally feasible. Moreover, we adopt efficient sampling theory to achieve further computational savings. We show that these developments allow the construction of more expressive hybrid models that achieve state-of-the-art accuracy and parameter efficiency on spherical benchmark problems.

1. INTRODUCTION

Many fields involve data that live inherently on spherical manifolds, e.g. 360° photo and video content in virtual reality and computer vision, the cosmic microwave background radiation from the Big Bang in cosmology, topographic and gravitational maps in planetary sciences, and molecular shape orientations in molecular chemistry, to name just a few. Convolutional neural networks (CNNs) have been tremendously effective for data defined on Euclidean domains, such as the 1D line, 2D plane, or nD volumes, thanks in part to their translation invariance properties. However, these techniques are not effective for data defined on spherical manifolds, which have a very different geometric structure to Euclidean spaces (see Appendix A). To transfer the remarkable success of deep learning to data defined on spherical domains, deep learning techniques defined inherently on the sphere are required.

Recently, a number of spherical CNN constructions have been proposed. Existing CNN constructions on the sphere fall broadly into three categories: fully real (i.e. pixel) space approaches (e.g. Boomsma & Frellsen, 2017; Jiang et al., 2019; Perraudin et al., 2019; Cohen et al., 2019); combined real and harmonic space approaches (Cohen et al., 2018; Esteves et al., 2018; 2020); and fully harmonic space approaches (Kondor et al., 2018).

Real space approaches can often be computed efficiently but they necessarily provide an approximate representation of spherical signals and the connection to the underlying continuous symmetries of the sphere is lost. Consequently, such approaches cannot fully capture rotational equivariance. Other constructions take a combined real and harmonic space approach (Cohen et al., 2018; Esteves et al., 2018; 2020), where sampling theorems (Driscoll & Healy, 1994; Kostelec & Rockmore, 2008) are exploited to connect with underlying continuous signal representations to capture the continuous symmetries of the sphere.
However, in these approaches non-linear activation functions are computed pointwise in real space, which induces aliasing errors that break strict rotational equivariance. Fully harmonic space spherical CNNs have been constructed by Kondor et al. (2018). A continual connection with underlying continuous signal representations is captured by using harmonic signal representations throughout. Consequently, this is the only approach exhibiting strict rotational equivariance. However, strict equivariance comes at great computational cost, which can often prohibit usage.

*Corresponding author: jason.mcewen@kagenova.com
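
The aliasing effect of pointwise activations described above can be illustrated with a toy example. The sketch below is a one-dimensional analogue on the circle, not the sphere, and is not taken from any of the cited implementations: all function names are hypothetical, the bandlimit convention (harmonics with |m| < bandlimit) is an assumption, and a naive Fourier sum stands in for a spherical harmonic transform. A bandlimited signal is sampled, passed through a ReLU in real space, and projected back to the original bandlimit; the result is compared for "rotate then apply layer" versus "apply layer then rotate".

```python
import cmath
import math


def synthesize(coeffs, n):
    """Sample f(theta) = sum_m c_m exp(i m theta) at n equispaced angles."""
    return [
        sum(c * cmath.exp(1j * m * 2 * math.pi * k / n) for m, c in coeffs.items()).real
        for k in range(n)
    ]


def analyze(samples, bandlimit):
    """Naive DFT that keeps only the harmonics |m| < bandlimit."""
    n = len(samples)
    return {
        m: sum(s * cmath.exp(-1j * m * 2 * math.pi * k / n) for k, s in enumerate(samples)) / n
        for m in range(-bandlimit + 1, bandlimit)
    }


def rotate(coeffs, alpha):
    """Rotate by alpha in harmonic space, i.e. f(theta) -> f(theta - alpha)."""
    return {m: c * cmath.exp(-1j * m * alpha) for m, c in coeffs.items()}


def relu_layer(coeffs, bandlimit, n):
    """Harmonic -> real space, pointwise ReLU, real -> harmonic (truncated)."""
    samples = [max(s, 0.0) for s in synthesize(coeffs, n)]
    return analyze(samples, bandlimit)


def equivariance_error(coeffs, bandlimit, n, alpha):
    """Largest coefficient mismatch between the two orders of operations."""
    rotate_then_layer = relu_layer(rotate(coeffs, alpha), bandlimit, n)
    layer_then_rotate = rotate(relu_layer(coeffs, bandlimit, n), alpha)
    return max(abs(rotate_then_layer[m] - layer_then_rotate[m]) for m in rotate_then_layer)


BANDLIMIT = 2                 # assumed convention: harmonics m = -1, 0, 1
signal = {-1: 0.5, 1: 0.5}    # harmonic coefficients of cos(theta)

# Rotation by exactly one grid step on a 4-point grid.
grid_shift_error = equivariance_error(signal, BANDLIMIT, 4, 2 * math.pi / 4)
# Rotation by an angle that is not a multiple of the grid step.
coarse_error = equivariance_error(signal, BANDLIMIT, 4, 0.3)
# Same rotation, but with the ReLU evaluated on a much denser grid.
fine_error = equivariance_error(signal, BANDLIMIT, 64, 0.3)

print("grid-aligned rotation, 4 samples :", grid_shift_error)
print("off-grid rotation, 4 samples     :", coarse_error)
print("off-grid rotation, 64 samples    :", fine_error)
```

In this toy setting the layer commutes with rotations that map the sampling grid onto itself, but not with a generic rotation: the ReLU output is no longer bandlimited, so its high-frequency content aliases into the retained harmonics. Denser sampling reduces the error here but does not remove it, which is consistent with the distinction drawn above between approximate and strict equivariance.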

