Symmetry Control Neural Networks

Abstract

This paper continues the quest of designing the optimal physics bias for neural networks that predict the dynamics of systems when the underlying dynamics are to be inferred directly from data. The description of physical systems is greatly simplified when the underlying symmetries of the system are taken into account. In classical systems described via Hamiltonian dynamics this is achieved by using appropriate coordinates, so-called cyclic coordinates, which reveal conserved quantities directly. Without changing the Hamiltonian, these coordinates can be obtained via canonical transformations. We show that such coordinates can be searched for automatically with appropriate loss functions which arise naturally from Hamiltonian dynamics. As a proof of principle, we test our method on standard classical physics systems using synthetic and experimental data, where our network identifies the conserved quantities in an unsupervised way, and we find improved performance on predicting the dynamics of the system compared to networks that only bias towards the Hamiltonian. Effectively, these new coordinates guarantee that motion takes place on symmetry orbits in phase space, i.e. appropriate lower-dimensional subspaces of phase space. By fitting analytic formulae we recover that our networks are utilising conserved quantities such as (angular) momentum.

1. Introduction

Building a bias into neural networks has been a key mechanism to achieve extraordinary performance in tasks such as classification. A standard example is the use of translation invariance in convolutional neural networks (Krizhevsky et al., 2012), and by now building in equivariance to other symmetries, such as rotational symmetries, has proven to be very successful (e.g. Cohen & Welling (2016)). The possible motions of a physical system are constrained by its symmetries; in technical terms, motion takes place on a subspace of phase space. Energy conservation, related to invariance under time translation, has been utilised in the context of Hamiltonian Neural Networks (HNNs) (Greydanus et al., 2019), where the energy functional, i.e. the Hamiltonian, is inferred from data. This approach has seen large improvements in predicting the dynamics over baseline neural networks which simply try to predict the change of phase-space coordinates in time. Here we extend this approach by learning and incorporating additional constraints due to further symmetries of the system. Coarsely speaking, finding symmetries corresponds to finding good coordinates. In classical mechanics this is achieved by performing canonical transformations and identifying cyclic coordinates which reveal conserved quantities. The aim of this paper is to demonstrate that multiple conserved quantities can indeed be found automatically in this way, which has not been demonstrated before. Similar in spirit to learning the Hamiltonian, we formulate loss functions which enforce a representation in terms of cyclic coordinates and use them as the input to our Hamiltonian, differing from previous flow-based approaches that search for such coordinates (Bondesan & Lamacraft, 2019; Li et al., 2020).
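The HNN idea referenced above can be sketched in a few lines: a model Hamiltonian is trained so that the phase-space velocities it implies via Hamilton's equations match the observed ones. The sketch below is ours, not the paper's implementation; the name `hnn_loss`, the finite-difference gradient, and the toy oscillator are illustrative choices (an actual HNN would use a neural network $H_\theta$ with automatic differentiation).

```python
import numpy as np

def hnn_loss(H, z, z_dot, n, eps=1e-6):
    """HNN-style loss: compare observed phase-space velocities
    z_dot = (dq/dt, dp/dt) against those implied by a model Hamiltonian H
    through dq/dt = dH/dp, dp/dt = -dH/dq.  z = (q, p) has length 2n."""
    gH = np.zeros(2 * n)
    for i in range(2 * n):
        d = np.zeros(2 * n); d[i] = eps
        gH[i] = (H(z + d) - H(z - d)) / (2 * eps)  # central difference
    pred = np.concatenate([gH[n:], -gH[:n]])       # (dH/dp, -dH/dq)
    return float(np.sum((pred - z_dot) ** 2))

# 1D harmonic oscillator H = (q^2 + p^2)/2 at (q, p) = (1, 0):
# the true velocities are dq/dt = p = 0 and dp/dt = -q = -1.
H = lambda z: 0.5 * (z @ z)
z = np.array([1.0, 0.0])
z_dot = np.array([0.0, -1.0])
```

For the true Hamiltonian and the true velocities the loss vanishes; any mismatch between model and data makes it positive, which is what drives the training.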
As a proof of principle, we experimentally find that this mechanism identifies the underlying conserved quantities, such as momentum and angular momentum, detects the splitting into decoupled subsystems, and determines the number of conserved quantities. We demonstrate significant improvements in the prediction of the underlying Hamiltonian and consequently of the dynamics of the system. From our trained networks, we can further extract analytic expressions for the conserved quantities.

2. Theory

We briefly describe the standard techniques in Hamiltonian mechanics which our network utilises. We consider a classical system with $N$ particles in $d$ spatial dimensions. Such a system can be described by the variables $(q, p)$, where $q = (q_1, \ldots, q_{N\cdot d})$ are typically the positions for each dimension of the objects and $p = (p_1, \ldots, p_{N\cdot d})$ are the corresponding momenta. This is the input to our network, and we are interested in predicting the time evolution of this system, i.e. $(q, p)$ at later time steps. The pair $(q, p)$ is an element of phase space, in which every point corresponds to a state the system can take. The time evolution is governed by the Hamiltonian $H(q, p)$ and the associated Hamiltonian equations

$$\frac{dq}{dt} = \frac{\partial H}{\partial p} = \{q, H\}\,, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q} = \{p, H\}\,, \tag{1}$$

where $\{\cdot, \cdot\}$ denotes the Poisson bracket, defined as

$$\{f, g\} = \sum_{i=1}^{N\cdot d} \left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} \right)\,. \tag{2}$$

The Poisson bracket does not only arise in the time evolution of the canonical coordinates $(q, p)$ but also in that of any function $g(q, p)$ of these coordinates which does not explicitly depend on time:

$$\frac{dg(q, p)}{dt} = \sum_{i=1}^{N\cdot d} \left( \frac{\partial g}{\partial q_i}\frac{dq_i}{dt} + \frac{\partial g}{\partial p_i}\frac{dp_i}{dt} \right) = \{g, H\}\,, \tag{3}$$

where we have used the Hamiltonian equations (1) in the last step. From this expression we see that the Poisson bracket of a conserved quantity with the Hamiltonian $H(q, p)$ vanishes, and that the Hamiltonian itself is a conserved quantity. The physics of this system is invariant under diffeomorphic coordinate transformations of the canonical coordinates, and we will use such transformations to reveal the constants of motion and hence the symmetries of the system. We use a particular type of diffeomorphic transformation $T$, namely canonical transformations, which are transformations that leave the structure of the Hamiltonian equations (1), and in particular the Poisson bracket, unchanged:

$$T: (q, p) \to (Q(q, p), P(q, p))\,, \quad \{f, g\}_{q,p} = \{f, g\}_{P,Q}\,, \quad H(p, q) = H(P(p, q), Q(p, q))\,. \tag{4}$$
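The definitions above translate directly into code. The following is a small numerical sketch (the function name `poisson_bracket`, the finite-difference scheme, and the toy Hamiltonian are our illustrative choices) that evaluates the Poisson bracket (2) and checks that angular momentum, and the Hamiltonian itself, Poisson-commute with a rotationally invariant $H$:

```python
import numpy as np

def poisson_bracket(f, g, q, p, eps=1e-6):
    """Evaluate {f, g} = sum_i (df/dq_i dg/dp_i - df/dp_i dg/dq_i)
    at the phase-space point (q, p) via central finite differences."""
    total = 0.0
    for i in range(len(q)):
        d = np.zeros_like(q); d[i] = eps
        df_dq = (f(q + d, p) - f(q - d, p)) / (2 * eps)
        df_dp = (f(q, p + d) - f(q, p - d)) / (2 * eps)
        dg_dq = (g(q + d, p) - g(q - d, p)) / (2 * eps)
        dg_dp = (g(q, p + d) - g(q, p - d)) / (2 * eps)
        total += df_dq * dg_dp - df_dp * dg_dq
    return total

# One particle in d = 2 with a rotationally invariant toy Hamiltonian:
H = lambda q, p: 0.5 * (p @ p) + 0.5 * (q @ q)
# angular momentum L = q_x p_y - q_y p_x is the associated conserved quantity
L = lambda q, p: q[0] * p[1] - q[1] * p[0]

q0, p0 = np.array([0.3, -1.2]), np.array([0.7, 0.4])
# {L, H} = 0 and {H, H} = 0, while the canonical pair gives {q_x, p_x} = 1
```

The vanishing bracket is exactly the conservation criterion derived from (3), and the canonical relation $\{q_x, p_x\} = 1$ is what the transformations in (4) preserve.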



For more details we refer the reader to standard textbooks such as Landau & Lifshitz (1982). Throughout, we focus on time-independent Hamiltonians for simplicity.



Figure 1: Effect of additional loss components: loss contours for the HNN loss (6) (shown in gray) and the Poisson loss (7) (red) arising from the angular momentum ($\{L, H\}^2$) in the 2-body Hamiltonian (10) with respect to the two model parameters $m_1$ and $m_2$. The data model corresponds to $m_1 = m_2 = g = 1$ (indicated with a star) and we evaluate the loss over our training set. The analytic constraint $m_1 = m_2$, which arises from evaluating the Poisson bracket $\{L, H\} \sim (m_1 - m_2)$, is clearly visible and provides additional constraints on the model parameter space.
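A penalty of this kind, the mean squared Poisson bracket of a candidate conserved quantity with the model Hamiltonian, can be sketched as follows. The names `poisson_loss` and `grad` and the toy two-particle systems are ours, not the paper's implementation; the example uses total linear momentum rather than the angular momentum of Hamiltonian (10), which we do not have in this excerpt.

```python
import numpy as np

def grad(f, z, eps=1e-6):
    """Central-difference gradient of a scalar function f at z."""
    g = np.zeros_like(z)
    for i in range(len(z)):
        d = np.zeros_like(z); d[i] = eps
        g[i] = (f(z + d) - f(z - d)) / (2 * eps)
    return g

def poisson_loss(H, Q, batch, n):
    """Mean squared Poisson bracket {Q, H} over phase-space points z = (q, p),
    with q = z[:n] and p = z[n:].  The loss vanishes exactly when Q is
    conserved under the model Hamiltonian H on the sampled points."""
    total = 0.0
    for z in batch:
        gH, gQ = grad(H, z), grad(Q, z)
        pb = gQ[:n] @ gH[n:] - gQ[n:] @ gH[:n]   # {Q, H}
        total += pb ** 2
    return total / len(batch)

# Two particles on a line (masses 1 and 2) with a translation-invariant
# interaction: total momentum P = p1 + p2 is conserved, so the loss vanishes.
H = lambda z: z[2]**2 / 2 + z[3]**2 / 4 + 0.5 * (z[0] - z[1])**2
P = lambda z: z[2] + z[3]
batch = np.array([[0.5, -0.3, 1.0, 0.2],
                  [1.0, 0.7, -0.4, 0.9],
                  [-0.2, 0.4, 0.3, -1.1]])
# An external potential 0.5*q1^2 breaks translation invariance,
# so {P, H2} = -q1 is nonzero and the penalty becomes positive.
H2 = lambda z: H(z) + 0.5 * z[0]**2
```

As in the figure, the penalty carves out the symmetry-compatible region of model parameter space: parameter choices that break the conservation law incur a nonzero loss on the training points.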
