LIE ALGEBRA CONVOLUTIONAL NETWORKS WITH AUTOMATIC SYMMETRY EXTRACTION

Abstract

Existing methods for incorporating symmetries into neural network architectures require prior knowledge of the symmetry group. We propose instead to learn the symmetries during training of a group-equivariant architecture. Our model, the Lie algebra convolutional network (L-conv), is based on the infinitesimal generators of continuous groups and requires neither discretization nor integration over the group. We show that L-conv can approximate any group convolutional layer by composition of layers. We demonstrate how CNNs, Graph Convolutional Networks, and fully-connected networks can all be expressed as L-conv with appropriate groups. By making the infinitesimal generators learnable, L-conv can discover potential symmetries. We also show how, in linear settings, the symmetries are related to the statistics of the dataset: we find an analytical relationship between the symmetry group and a subgroup of the orthogonal group preserving the covariance of the input. Our experiments show that L-conv with trainable generators performs well on problems with hidden symmetries. Due to parameter sharing, L-conv also uses far fewer parameters than fully-connected layers.
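As a concrete illustration of the infinitesimal-generator viewpoint (a minimal numerical sketch, not code from the paper): the cyclic translation hard-wired into CNNs is the matrix exponential of a Lie algebra generator, and scaling that generator continuously interpolates between the identity and the full shift.

```python
import numpy as np
from scipy.linalg import expm, logm

# Cyclic shift by one position on d = 5 points (a discrete translation).
# d is odd so the shift matrix has no eigenvalue on the negative real
# axis and its principal matrix logarithm is real.
d = 5
S = np.roll(np.eye(d), 1, axis=0)

# The principal logarithm is an infinitesimal generator L of the shift:
# expm(L) recovers the full shift, and expm(t * L) for 0 < t < 1
# traces out a continuous one-parameter subgroup through S.
L = logm(S).real
assert np.allclose(expm(L), S, atol=1e-8)

# Near the identity, the group element is linear in the generator,
# which is the approximation L-conv builds on.
eps = 0.01
small_shift = expm(eps * L)
assert np.allclose(small_shift, np.eye(d) + eps * L, atol=1e-3)
```

This is why working with generators avoids enumerating group elements: a single matrix L encodes the whole one-parameter family of translations.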



Introduction

Many machine learning (ML) tasks involve data from unfamiliar domains, which may or may not have hidden symmetries. While much of the work on equivariant neural networks focuses on equivariant architectures, the ability of an architecture to discover symmetries in a given dataset is less studied. Convolutional Neural Networks (CNN) (LeCun et al., 1989; 1998) incorporate translation symmetry into the architecture. Recently, more general ways to construct equivariant architectures have been introduced (Cohen & Welling, 2016a;b; Cohen et al., 2018; Kondor & Trivedi, 2018). Encoding equivariance into an ML architecture can reduce data requirements and improve generalization, while significantly reducing the number of model parameters via parameter sharing (Cohen et al., 2019; Cohen & Welling, 2016b; Ravanbakhsh et al., 2017; Ravanbakhsh, 2020). As a result, many other symmetries, such as discrete rotations in 2D (Veeling et al., 2018; Marcos et al., 2017) and 3D (Cohen et al., 2018; Cohen & Welling, 2016a) as well as permutations (Zaheer et al., 2017), have been incorporated into the architecture of neural networks. Many existing works on equivariant architectures use finite groups, such as permutations in Hartford et al. (2018) and Ravanbakhsh et al. (2017), or discrete subgroups of continuous groups, such as 90-degree rotations in Cohen et al. (2018) or dihedral groups D_N in Weiler & Cesa (2019). Ravanbakhsh (2020) also proved a universal approximation theorem for single-hidden-layer equivariant neural networks for Abelian and finite groups.

General principles for constructing group convolutional layers were introduced in Cohen & Welling (2016b), Kondor & Trivedi (2018), and Cohen et al. (2019), including for continuous groups. A challenge for implementation is having to integrate over the group manifold. This has been remedied either by generalizing Fast Fourier Transforms (Cohen et al., 2018), or by using irreducible representations (irreps) (Weiler et al., 2018a), either directly as spherical harmonics as in Worrall et al. (2017) or through more general Clebsch-Gordan coefficients (Kondor et al., 2018). Other approaches include discretizing the group, as in Weiler et al. (2018a;b) and Cohen & Welling (2016a), solving constraints for equivariant irreps, as in Weiler & Cesa (2019), or approximating the integral by sampling (Finzi et al., 2020). All of these approaches share two limitations: 1) they rely on knowing the symmetry group a priori, and 2) they require encoding the whole group into the architecture. For a continuous group, it is not possible to encode all elements, so one must resort to discretization or a truncated sum over irreps. Our work attempts to resolve these issues with continuous groups by using the Lie algebra (the linearization of the group near its identity) instead of the group itself. Unlike the Lie group which is
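To make the Lie-algebra approach concrete, here is a minimal NumPy sketch of a single L-conv-style layer built from learnable infinitesimal generators, as described in the abstract: the output is a first-order (Lie algebra) perturbation of the input followed by channel mixing. All names, shapes, and the random initialization are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def l_conv(f, generators, eps, w0):
    """One L-conv-style step: out = (f + sum_i eps[i] * L_i @ f) @ W0.

    f:          (d, m) features over d positions with m channels
    generators: (k, d, d) infinitesimal generators L_i (trainable)
    eps:        (k,) coefficients of the generators (trainable)
    w0:         (m, m_out) channel-mixing weights (trainable)
    """
    # First-order action of the group near the identity on the positions.
    lifted = f + np.einsum("i,ipq,qm->pm", eps, generators, f)
    return lifted @ w0

# Toy example: d = 8 positions, m = 3 input channels, k = 2 generators.
d, m, m_out, k = 8, 3, 4, 2
f = rng.normal(size=(d, m))
L = rng.normal(size=(k, d, d)) / d   # in practice, learned from data
eps = rng.normal(size=k)
W0 = rng.normal(size=(m, m_out))

out = l_conv(f, L, eps, W0)
print(out.shape)  # (8, 4)
```

Note that the layer only ever touches the k generators, never the (possibly infinite) set of group elements, which is what removes the need for discretization or integration over the group manifold.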

