NEURAL EPDOS: SPATIALLY ADAPTIVE EQUIVARIANT PARTIAL DIFFERENTIAL OPERATOR BASED NETWORKS

Abstract

Endowing deep learning models with symmetry priors can lead to considerable performance improvements. As an interesting bridge between physics and deep learning, equivariant partial differential operators (PDOs) have drawn much attention from researchers recently. However, to ensure translation equivariance, previous works require the coefficient matrices of their linear PDOs to be constant and spatially shared, which can lead to sub-optimal feature learning at each position. In this work, we propose a novel nonlinear PDO scheme that is both spatially adaptive and translation equivariant: the coefficient matrices are obtained from local features through a generator rather than being spatially shared. Besides, we establish a new theory on incorporating further equivariance, such as to rotations, for such PDOs. Based on our theoretical results, we efficiently implement the generator with an equivariant multilayer perceptron (EMLP). As these equivariant PDOs are generated by neural networks, we call them Neural ePDOs. In experiments, we show that our method significantly improves on previous works with a smaller model size across various datasets. In particular, we achieve state-of-the-art performance on the MNIST-rot dataset with only a tenth of the parameters of the previous best model.

1. INTRODUCTION

In recent years, convolutional neural networks (CNNs) have achieved superior performance on various vision tasks (Szegedy et al., 2015; He et al., 2016; Chen et al., 2017). It is acknowledged that the success of CNNs is attributed to their ability to exploit the intrinsic translation-invariance symmetry of data to help downstream vision tasks. To incorporate other symmetries like rotation invariance, various CNN-based equivariant networks have been studied to enhance the performance of vision tasks (Cohen & Welling, 2016a;b; Weiler & Cesa, 2019). In another branch, some early works (Osher & Rudin, 1990; Perona & Malik, 1990) adopted partial differential operators (PDOs) to process images. Recently, PDOs with learnable coefficients were adopted by Shen et al. (2020) to design equivariant networks that achieve competitive performance compared to previous equivariant networks. Jenner & Weiler (2021) further generalized this work to a unified framework of equivariant linear PDOs on Euclidean spaces of various representation types. However, the coefficient matrices in current PDO works are spatially shared, i.e., the same PDOs are applied to features at every position (see Figure 1(a)). Such a coefficient-sharing scheme is not the optimal pattern for extracting features from input images (Wu et al., 2018; Su et al., 2019; Zhou et al., 2021; He et al., 2021a). To be specific, the content of an input image varies with position, e.g., some pixels cover the background while others express texture, which makes coefficient-sharing PDOs inefficient at extracting features at each position. In fact, Jenner & Weiler (2021) proved that a linear PDO layer is translation equivariant if and only if its coefficient matrices are spatially shared, so it seems impossible to ensure both spatial adaptivity and translation equivariance for linear PDOs.
In this work, to address the above issue, we move beyond the linear setting and propose novel nonlinear PDOs that are both spatially adaptive and translation equivariant. In contrast to spatially shared PDOs, we construct a coefficient generator that takes local features as input and outputs the coefficient matrices. Since different positions produce different coefficient matrices, the PDOs are essentially position-specific and can extract individual features according to the local content (see Figure 1(b)). In addition, generating the coefficient matrices from local features naturally guarantees translation equivariance for such PDOs. However, such a nonlinear PDO scheme is not intrinsically equivariant to rotations or reflections. To incorporate equivariance to these transformations, we establish a theory on the equivariant formulation of this nonlinear PDO scheme under any given symmetry group. Specifically, the theory reveals that this type of PDO is equivariant if and only if the coefficient generator is an equivariant map with respect to particular transformations. In practice, we choose a two-layer EMLP (Finzi et al., 2021) as the coefficient generator to satisfy the equivariance condition and provide an efficient implementation scheme. We name our model Neural ePDOs and evaluate its performance on the MNIST-rot and ImageNet datasets. Extensive experiments show that our model can significantly improve accuracy with fewer parameters. In particular, we achieve state-of-the-art results on the MNIST-rot dataset with only a tenth of the parameters of previous best models. We summarize the main contributions as follows:

• To our knowledge, we are the first to propose a nonlinear form of PDOs that is both spatially adaptive and translation equivariant. The coefficient matrices of these novel PDOs adapt to local features, which alleviates the sub-optimal feature learning problem at each position.
• We develop a theory for such nonlinear PDOs that precisely characterizes when they are equivariant under any given symmetry group. The theory reveals that the nonlinear PDOs are equivariant if and only if the coefficient generators are equivariant maps with respect to particular transformations.

• We provide an efficient implementation that adopts a two-layer EMLP as the coefficient generator and largely saves parameters and computation.

• Extensive experiments show that our method significantly improves results on the MNIST-rot and ImageNet datasets with considerably fewer parameters. In particular, we achieve state-of-the-art results on the MNIST-rot dataset.
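To make the spatially adaptive scheme concrete, the following is a minimal sketch of a nonlinear PDO layer on a scalar feature map. The 3×3 finite-difference stencil basis, the plain two-layer MLP generator, and all weight shapes are illustrative assumptions for exposition only, not the paper's actual EMLP-based implementation; the point is that coefficients are produced per position from the local patch, so the layer commutes with translations by construction.

```python
import numpy as np

def finite_difference_stencils():
    """3x3 stencils approximating the identity, d/dx and d/dy
    (central differences, unit grid spacing) -- an assumed toy PDO basis."""
    ident = np.zeros((3, 3)); ident[1, 1] = 1.0
    dx = np.zeros((3, 3)); dx[1, 0], dx[1, 2] = -0.5, 0.5
    dy = np.zeros((3, 3)); dy[0, 1], dy[2, 1] = -0.5, 0.5
    return np.stack([ident, dx, dy])              # shape (S, 3, 3), S = 3

def generate_coefficients(patch, W1, W2):
    """Toy coefficient generator: a two-layer MLP on the flattened local
    patch (a stand-in for the equivariant EMLP generator in the paper)."""
    h = np.maximum(patch.reshape(-1) @ W1, 0.0)   # hidden layer with ReLU
    return h @ W2                                 # one coefficient per stencil

def adaptive_pdo_layer(f, W1, W2):
    """Spatially adaptive PDO layer: at each interior position, the PDO
    coefficients are generated from the local 3x3 patch, so shifting the
    input shifts the output identically (translation equivariance)."""
    stencils = finite_difference_stencils()       # (S, 3, 3)
    H, W = f.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = f[i:i + 3, j:j + 3]
            coeffs = generate_coefficients(patch, W1, W2)   # (S,)
            derivs = (stencils * patch).sum(axis=(1, 2))    # (S,)
            out[i, j] = coeffs @ derivs           # position-specific PDO
    return out
```

Because the coefficients at a position depend only on the feature patch at that position, translating the input translates the generated coefficients along with it, which is exactly why this nonlinear scheme keeps translation equivariance without spatially shared weights.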



Figure 1: Illustration of two different designs for PDOs. Here, we use a 2-dimensional vector field to represent the feature map. (a) For linear PDOs, the coefficient matrices are shared to process features across different positions. (b) For the nonlinear PDOs we propose in this paper, the coefficient matrices are generated from the local features through neural networks.

