META-LEARNING SYMMETRIES BY REPARAMETERIZATION

Abstract

Many successful deep learning architectures are equivariant to certain transformations in order to conserve parameters and improve generalization: most famously, convolution layers are equivariant to shifts of the input. This approach only works when practitioners know the symmetries of the task and can manually construct an architecture with the corresponding equivariances. Our goal is an approach for learning equivariances from data, without needing to design custom task-specific architectures. We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data. Our method can provably represent equivariance-inducing parameter sharing for any finite group of symmetry transformations. Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks. We provide our experiment code at https://github.com/AllanYangZhou/metalearning-symmetries.

1. INTRODUCTION

In deep learning, the convolutional neural network (CNN) (LeCun et al., 1998) is a prime example of exploiting equivariance to a symmetry transformation to conserve parameters and improve generalization. In image classification (Russakovsky et al., 2015; Krizhevsky et al., 2012) and audio processing (Graves and Jaitly, 2014; Hannun et al., 2014) tasks, we may expect the layers of a deep network to learn feature detectors that are translation equivariant: if we translate the input, the output feature map is also translated. Convolution layers satisfy translation equivariance by definition, and produce remarkable results on these tasks. The success of convolution's "built in" inductive bias suggests that we can similarly exploit other equivariances to solve machine learning problems.

However, there are substantial challenges with building in inductive biases. Identifying the correct biases to build in is challenging, and even if we do know the correct biases, it is often difficult to build them into a neural network. Practitioners commonly avoid this issue by "training in" desired equivariances (usually the special case of invariances) using data augmentation. However, data augmentation can be challenging in many problem settings, and we would prefer to build the equivariance into the network itself. For example, robotics sim2real transfer approaches train agents that are robust to varying conditions by varying the simulated environment dynamics (Song et al., 2020). But this type of augmentation is not possible once the agent leaves the simulator and is trying to learn or adapt to a new task in the real world. Additionally, building in incorrect biases may actually be detrimental to final performance (Liu et al., 2018b). In this work we aim for an approach that can automatically learn and encode equivariances into a neural network.
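The translation equivariance of convolution mentioned above can be checked numerically. The following sketch (an illustration, not part of the paper) uses a 1D circular convolution so that shifts wrap around and the equivariance holds exactly: shifting the input and then convolving gives the same result as convolving and then shifting the output.

```python
import numpy as np

def circ_conv(x, w):
    """1D circular cross-correlation of signal x with filter w."""
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # input signal
w = rng.normal(size=3)   # filter weights
shift = 2

# Equivariance: conv(shift(x)) == shift(conv(x))
lhs = circ_conv(np.roll(x, shift), w)
rhs = np.roll(circ_conv(x, w), shift)
assert np.allclose(lhs, rhs)
```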
This would free practitioners from having to design custom equivariant architectures for each task, and allow them to transfer any learned equivariances to new tasks. Neural network layers can achieve various equivariances through parameter sharing patterns, such as the spatial parameter sharing of standard convolutions. In this paper we reparameterize network layers to learnably represent sharing patterns. We leverage meta-learning to learn the sharing patterns that help a model generalize on new tasks. The primary contribution of this paper is an approach to automatically learn equivariance-inducing parameter sharing, instead of using custom designed equivariant architectures. We show theoretically that reparameterization can represent networks equivariant to any finite symmetry group. Our
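To make the idea of equivariance-inducing parameter sharing concrete, the sketch below (an illustrative assumption, not the paper's implementation) reparameterizes a fully connected layer's weight matrix as a fixed sharing pattern applied to a small vector of filter parameters. Here the pattern matrix `U` and filter vector `v` are hypothetical names; choosing `U` to tile `v` into a circulant matrix recovers a shift-equivariant (convolutional) layer. In the paper's setting, a pattern like `U` would be meta-learned rather than hand-constructed.

```python
import numpy as np

n = 4
# Fixed sharing pattern U: maps n filter parameters v to a flattened
# n x n weight matrix with W[i, j] = v[(j - i) % n], i.e. circulant.
U = np.zeros((n * n, n))
for i in range(n):
    for j in range(n):
        U[i * n + j, (j - i) % n] = 1.0

rng = np.random.default_rng(1)
v = rng.normal(size=n)        # per-task filter parameters
W = (U @ v).reshape(n, n)     # reparameterized layer weights

# The resulting layer is shift-equivariant:
x = rng.normal(size=n)
shift = 1
assert np.allclose(W @ np.roll(x, shift), np.roll(W @ x, shift))
```

With a different (learned) `U`, the same reparameterization can encode sharing patterns for other symmetry groups, which is what makes it a useful substrate for learning equivariances rather than hand-designing them.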

