GROUP EQUIVARIANT CONDITIONAL NEURAL PROCESSES

Abstract

We present the group equivariant conditional neural process (EquivCNP), a meta-learning method that is permutation invariant over the data set, as in conventional conditional neural processes (CNPs), and is additionally equivariant under transformations of the data space. Incorporating group equivariance, such as rotation and scaling equivariance, provides a way to exploit the symmetry of real-world data. We give a decomposition theorem for permutation-invariant and group-equivariant maps, which leads us to construct EquivCNPs with an infinite-dimensional latent space that handles group symmetries. For a practical implementation, we build the architecture from Lie group convolutional layers. We show that EquivCNP with translation equivariance achieves performance comparable to conventional CNPs on a 1D regression task. Moreover, we demonstrate that, by selecting an appropriate Lie group equivariance, EquivCNP is capable of zero-shot generalization on an image-completion task.

1. INTRODUCTION

Data symmetry has played a significant role in deep neural networks. In particular, convolutional neural networks, which underpin many of the recent achievements of deep learning, are translation equivariant: they preserve the symmetry of the translation group (a minimal numerical check of this property is sketched below). From the same point of view, many studies have aimed to incorporate various group symmetries into neural networks, especially into the convolution operation (Cohen et al., 2019; Defferrard et al., 2019; Finzi et al., 2020). As example applications, some works have introduced Hamiltonian dynamics to solve dynamics-modeling problems (Greydanus et al., 2019; Toth et al., 2019; Zhong et al., 2019). Similarly, Quessard et al. (2020) estimated the action of a group by assuming symmetry in the latent space inferred by a neural network. Incorporating data structure (symmetries) into a model as an inductive bias can reduce model complexity and improve generalization.

In terms of inductive bias, meta-learning, or learning to learn, provides a way to select an inductive bias from data. Meta-learning uses past experience to adapt quickly to a new task T ∼ p(T) sampled from some task distribution p(T). In supervised meta-learning in particular, a task is described as predicting a set of unlabeled data (target points) given a set of labeled data (context points). Various works have approached supervised meta-learning from different perspectives (Andrychowicz et al., 2016; Ravi & Larochelle, 2016; Finn et al., 2017; Snell et al., 2017; Santoro et al., 2016; Rusu et al., 2018).

In this study, we are interested in neural processes (NPs) (Garnelo et al., 2018a;b), meta-learning models with an encoder-decoder architecture (Xu et al., 2019). The encoder is a permutation-invariant function on the context points that maps the contexts into a latent representation. The decoder is a function that produces the conditional predictive distribution of the targets given the latent representation (a minimal sketch of this structure follows below). The objective of NPs is to learn the encoder and the decoder so that the predictive model generalizes well to new tasks by observing some points of the tasks. To achieve the
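To make translation equivariance concrete, the following minimal check is an illustrative sketch of ours, not code from the paper. It assumes PyTorch and uses circular padding so that the convolution commutes exactly with cyclic shifts, i.e. conv(shift(x)) = shift(conv(x)):

```python
import torch
import torch.nn as nn

# Illustrative check (not from the paper): a 1D convolution with circular
# padding commutes with cyclic translations of its input.
torch.manual_seed(0)
conv = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False,
                 padding_mode="circular")

x = torch.randn(1, 1, 32)                              # a random 1D signal
shift = lambda t, s: torch.roll(t, shifts=s, dims=-1)  # cyclic translation

lhs = conv(shift(x, 5))   # translate the input, then convolve
rhs = shift(conv(x), 5)   # convolve, then translate the output
print(torch.allclose(lhs, rhs, atol=1e-6))             # True
```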
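Likewise, the encoder-decoder structure of a CNP can be sketched in a few lines. The code below is an illustrative sketch with hypothetical layer sizes, not the EquivCNP architecture of this paper: the encoder embeds each context pair and mean-pools over the set, which makes it permutation invariant, and the decoder maps each target input together with the pooled representation to the parameters of a predictive Gaussian.

```python
import torch
import torch.nn as nn

class MiniCNP(nn.Module):
    """Minimal CNP sketch (hypothetical sizes; not the paper's EquivCNP)."""
    def __init__(self, x_dim=1, y_dim=1, r_dim=64):
        super().__init__()
        # Encoder: applied to each (x, y) context pair independently.
        self.encoder = nn.Sequential(
            nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, r_dim))
        # Decoder: maps (target x, aggregated representation) to mean and log-variance.
        self.decoder = nn.Sequential(
            nn.Linear(x_dim + r_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, 2 * y_dim))

    def forward(self, xc, yc, xt):
        # Mean pooling over the context set gives permutation invariance.
        r = self.encoder(torch.cat([xc, yc], dim=-1)).mean(dim=1, keepdim=True)
        r = r.expand(-1, xt.size(1), -1)               # broadcast to every target
        mu, log_var = self.decoder(torch.cat([xt, r], dim=-1)).chunk(2, dim=-1)
        return mu, log_var                             # predictive Gaussian parameters

# Usage: batch of 2 tasks, 5 context points, 10 target points.
model = MiniCNP()
xc, yc, xt = torch.randn(2, 5, 1), torch.randn(2, 5, 1), torch.randn(2, 10, 1)
mu, log_var = model(xc, yc, xt)
print(mu.shape, log_var.shape)  # torch.Size([2, 10, 1]) twice
```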

