ON DISCRETE SYMMETRIES OF ROBOTICS SYSTEMS: A GROUP-THEORETIC AND DATA-DRIVEN ANALYSIS

Anonymous

Abstract

In this work, we study the Morphological Symmetries of dynamical systems with one or more planes of symmetry, a predominant feature in animal biology and robotic systems, characterized by the duplication and balanced distribution of body parts. These morphological symmetries imply that the system's dynamics are symmetric (or approximately symmetric), which in turn imprints symmetries in optimal control policies and in all proprioceptive and exteroceptive measurements related to the evolution of the system's dynamics. For data-driven methods, symmetry represents an inductive bias that justifies data augmentation and the construction of symmetric function approximators. To this end, we use Group Theory to present a theoretical and practical framework allowing for (1) the identification of the system's morphological symmetry Group G, (2) the characterization of how the group acts upon the system state variables and any proprioceptive and exteroceptive measurement, and (3) the exploitation of data symmetries through the use of G-equivariant/G-invariant Neural Networks, for which we present experimental results on synthetic and real-world applications, demonstrating how symmetry constraints lead to better sample efficiency and generalization while reducing the number of trainable parameters.

1. INTRODUCTION

Symmetries are a predominant feature in animal biology. The majority of living (and extinct) species are bilaterally or radially symmetric (i.e., having one or more planes of symmetry), a property intuitively recognized by the patterns of balanced distribution and duplication of body parts and shapes (Holló, 2017). Likewise, most robotic systems are symmetric, often featuring more precise symmetries than nature due to the accurate duplication of body parts and the tendency to design mechanisms with symmetric volumes and mass distributions. These morphological symmetries of animals and robots imply that the dynamics and control of body motions are also approximately symmetric. As a result, all proprioceptive and exteroceptive measurements related to the evolution of the system's dynamics (e.g., joint torques, depth images, contact forces) are symmetric as well. This highly relevant inductive bias is frequently left unexploited in data-driven applications in the fields of robotics, computer graphics, computational biology, and control. Recent works in computer graphics (Yeh et al., 2019; Abdolhosseini et al., 2019; Yu et al., 2018) and robotics/dynamical systems (Van der Pol et al., 2020; Ordonez-Apraez et al., 2022; Hamed & Grizzle, 2013; Finzi et al., 2021a) have exploited, through different approaches, the morphological symmetry group associated with bilateral (or sagittal) symmetry (the reflection group $C_2$), obtaining improvements in generalization and sample efficiency of function approximators. Notably, Zinkevich & Balch (2001) proved that Markov Decision Processes with state symmetries have symmetric optimal value and policy functions. Despite these encouraging contributions, exploiting the inductive bias of morphological symmetries is not a widespread technique in the research community.
We attribute the scarce adoption of these techniques to the lack of a unifying theoretical and practical framework that allows identifying different morphological symmetries in arbitrary dynamical systems and exploiting them efficiently and conveniently in data-driven applications. This work takes a step towards this unifying framework by studying morphological symmetries through the lens of dynamical systems and group theory. Our theoretical contributions are:

❈ A group-theoretic formalization of the concept of discrete morphological symmetry.
❈ A characterization of how the morphological symmetry group $G$ affects the system's state variables and any relevant proprioceptive and exteroceptive measurements, which facilitates the identification of $G$ and the augmentation of proprioceptive and exteroceptive measurements.

Once the morphological symmetry group $G$ is identified, our practical contributions focus on the efficient construction and versatile use of $G$-equivariant neural networks for arbitrary discrete morphological symmetry groups $G$, for which we:

✥ Derive an optimal initialization for the trainable parameters of equivariant layers.
✥ Demonstrate that $G$-equivariance reduces the number of trainable parameters to approximately $1/|G|$ of the unconstrained count.
✥ Enable the construction of large-scale $G$-equivariant networks by mitigating the computational complexity of constructing equivariant architectures and the memory complexity of storing them (see footnote 2).

Figure 1: [...] allow it to imitate the effect of reflections ($g_s$, $g_t$) and 180° rotations of space ($g_r$). Transformations affect both proprioceptive (state space, CoM linear $l$ and angular $k$ momentum) and exteroceptive (terrain elevation, external disturbances) quantities. Right: Diagram of a $K_4$-equivariant NN. Each of the layer's linear maps $W$ is constructed as a weighted average of the basis of the space of equivariant linear maps $B$, computed from the $K_4$ symmetries of the input-output spaces (see section 5).

2. BACKGROUND ON SYMMETRY GROUPS

In a nutshell, a symmetry group in Group Theory is an abstraction of the symmetries that different geometric objects might have, where a symmetry is understood as a transformation that, when applied to an object, conserves a relevant property of its structure. For instance, in fig. 1-left the Klein four-group $K_4$ describes the symmetries that vectors, pseudo-vectors, rigid bodies, and a quadruped robot have with respect to 180° rotations ($g_r$) and two perpendicular reflections ($g_s$, $g_t$), transformations that preserve vector magnitudes and energy. In fig. 1-right, the same group describes the symmetries of the vector spaces representing the quadruped robot's state $x$ and legs' contact state $y$. Formally, a symmetry group is a set of invertible symmetry transformations (or actions) $G = \{e, g_1, g_2, \dots\}$, containing the trivial action $e$ (which leaves objects unchanged) and equipped with an associative binary operator $(\cdot) : G \times G \to G$ (i.e., $g_1 \cdot (g_2 \cdot g_3) = (g_1 \cdot g_2) \cdot g_3$), which composes group members into other group members, such as $g_r = g_s \cdot g_t$ for $K_4$ (see fig. 1). Group representations are characterizations of how each action $g$ transforms a specific geometric object, say $x \in \mathbb{R}^k$. A representation $\rho_x : G \to GL(k)$ (GL: General Linear group) is a group homomorphism associating each $g$ with an invertible linear map $\rho_x(g) \in \mathbb{R}^{k \times k}$ specifying how the object $x$ is transformed, that is: $g(x) \equiv g \cdot x \doteq \rho_x(g)\,x$. Since group actions are abstract, it is common to define different object-dependent representations for each action, as we will see throughout this work. A fundamental concept for this work is the notion of function $G$-equivariance and $G$-invariance.
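The following minimal sketch illustrates the definitions above by representing $K_4$ on $\mathbb{R}^3$ with diagonal sign matrices and checking that the representation composes as the group does. The axis conventions (which axis each reflection flips) are assumptions made for illustration and are not taken from fig. 1.

```python
from itertools import product

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(*entries):
    """Build a diagonal matrix from its diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical representation rho_x(g) of K4 acting on 3D vectors.
rho = {
    "e": diag(1, 1, 1),       # trivial action
    "g_s": diag(1, -1, 1),    # reflection flipping the y axis (assumed)
    "g_t": diag(-1, 1, 1),    # reflection flipping the x axis (assumed)
    "g_r": diag(-1, -1, 1),   # 180 degree rotation about the z axis
}

def find_name(M):
    """Return the group element whose representation equals M."""
    return next(name for name, R in rho.items() if R == M)

# Cayley table obtained purely from matrix products: rho(a) rho(b) = rho(a . b)
table = {(a, b): find_name(matmul(rho[a], rho[b]))
         for a, b in product(rho, repeat=2)}

print(table[("g_s", "g_t")])
```

Because every matrix product lands on another matrix of the set, the set is closed under composition; each non-trivial element is its own inverse, which is the defining structure of the Klein four-group.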

Consider the function $f : \mathbb{R}^n \to \mathbb{R}^m$, with $y = f(x)$. $f$ is said to be $G$-equivariant or $G$-invariant if:

$$\underbrace{g \cdot y = f(g \cdot x) \mid \forall\, g \in G}_{\text{Equivariance}} \qquad \underbrace{y = f(g \cdot x) \mid \forall\, g \in G}_{\text{Invariance}} \qquad (1)$$

In words, an equivariant function maps symmetries of the input to symmetries of the output, while an invariant function maps symmetries of the input to an invariant output. Since this short section is necessarily an incomplete introduction to Group Theory, we refer the uninitiated and interested reader to Bronstein et al. (2021) for a remarkable introduction to the field.
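To make the two definitions concrete, the sketch below checks them numerically for the reflection group $C_2$ acting on $\mathbb{R}^2$ by exchanging the two coordinates. The three test functions and their coefficients are hypothetical and were chosen only so that one is equivariant, one invariant, and one neither.

```python
import math

def swap(v):
    """Action of the non-trivial element of C2: exchange the two coordinates."""
    return [v[1], v[0]]

# Representations of {e, g}; the same representation is used on input and output.
GROUP = [lambda v: list(v), swap]

def f_equivariant(x, a=0.7, b=-0.3):
    """Weights shared across the swapped coordinates, so f(g.x) = g.f(x)."""
    return [math.tanh(a * x[0] + b * x[1]), math.tanh(b * x[0] + a * x[1])]

def f_invariant(x):
    """Squared norm: unchanged when the coordinates are exchanged."""
    return x[0] ** 2 + x[1] ** 2

def f_generic(x):
    """No weight sharing: neither equivariant nor invariant."""
    return [x[0], 2.0 * x[1]]

def is_equivariant(f, x, tol=1e-12):
    """Check g.f(x) == f(g.x) for every group element at the point x."""
    return all(max(abs(u - v) for u, v in zip(g(f(x)), f(g(x)))) < tol
               for g in GROUP)

def is_invariant(f, x, tol=1e-12):
    """Check f(x) == f(g.x) for every group element at the point x."""
    return all(abs(f(x) - f(g(x))) < tol for g in GROUP)

x = [0.4, -1.3]
print(is_equivariant(f_equivariant, x), is_invariant(f_invariant, x),
      is_equivariant(f_generic, x))
```

A pointwise check like this can only falsify equivariance; passing at a few sample points is evidence, not proof, that the property holds on the whole input space.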

3. LAGRANGIAN MECHANICS AND SYMMETRIES OF DYNAMICAL SYSTEMS

First, we provide a group-theoretic perspective of symmetries in a dynamical system. To this end, let us consider a dynamical system with generalized coordinates $q \in Q \subseteq \mathbb{R}^n$ and velocities $\dot{q} \in T_qQ \subseteq \mathbb{R}^n$, as well as a Lagrangian function $L : Q \times T_qQ \to \mathbb{R}$, with $L(q, \dot{q}) = T(q, \dot{q}) - U(q, \dot{q})$. Here $Q$ is the constrained configuration space, $T_qQ$ is the configuration tangent space at $q$ (i.e., the space of generalized velocities), and $T(q, \dot{q})$ and $U(q, \dot{q})$ are the kinetic and potential energies of the state, respectively. The symmetries of a dynamical system are defined as transformations in the space of generalized coordinates that keep the energy state of the system unchanged (Ostrowski & Burdick, 1996). In this work, we study time-invariant point-transformations of generalized coordinates $g : Q \to Q$, which are interpreted as actions of a symmetry group, i.e., $g \in G$. Denoting $\rho_Q(g) \in \mathbb{R}^{n \times n}$ as the action representation in $Q$, we define the transformed coordinates as $g(q) \doteq \rho_Q(g)\,q = g \cdot q$. Consequently, the velocity and acceleration of the transformed coordinates are given by $g \cdot \dot{q}$ and $g \cdot \ddot{q}$, respectively, considering that

$$\frac{d^k g(q)}{dt^k} = \frac{\partial g(q)}{\partial q}\frac{d^k q}{dt^k} = \rho_Q(g)\frac{d^k q}{dt^k} \doteq g \cdot \frac{d^k q}{dt^k}.$$

Formally, we say that a dynamical system has a symmetry group $G$ if its Lagrangian is $G$-invariant:

$$L(q, \dot{q}) = L(g \cdot q, g \cdot \dot{q}) \mid \forall\, g \in G,\ q \in Q,\ \dot{q} \in T_qQ. \qquad (2)$$

Because the Lagrangian structure differs between the original $(q, \dot{q})$ and transformed coordinates $(g \cdot q, g \cdot \dot{q}) \mid \forall\, g \in G$, when we derive the Equations of Motion (EoM) of the system in the transformed coordinates, we obtain a set of EoMs describing the true system dynamics in different coordinate systems. Formally, if we derive the EoM through the Euler-Lagrange equation of the second order $\left(\frac{d}{dt}\frac{\partial L(q, \dot{q})}{\partial \dot{q}} - \frac{\partial L(q, \dot{q})}{\partial q} \equiv M(q)\ddot{q} - \tau(q, \dot{q}) = 0\right)$, the distinct EoMs are equivariant to each other (Lanczos, 2020), a property we will refer to as dynamics $G$-equivariance:

$$g \cdot [\underbrace{M(q)\ddot{q}}_{\text{Inertial}} - \underbrace{\tau(q, \dot{q})}_{\text{Moving}}] = \underbrace{M(g \cdot q)\, g \cdot \ddot{q}}_{\text{Inertial}} - \underbrace{\tau(g \cdot q, g \cdot \dot{q})}_{\text{Moving}} = 0 \mid \forall\, g \in G,\ q \in Q,\ \dot{q} \in T_qQ. \qquad (3)$$

Here $M(q) : Q \to \mathbb{R}^{n \times n}$ denotes the generalized mass matrix function and $\tau(q, \dot{q}) : Q \times T_qQ \to \mathbb{R}^n$ the generalized moving forces at $q$ and $\dot{q}$. Note that in eq. (3) the original and transformed dynamics are related linearly by the Jacobian of the coordinate transformation $\partial g(q)/\partial q = \rho_Q(g)$ (Wheeler, 2014), which, to preserve notation, is reduced to $g$. To ensure dynamics $G$-equivariance (eq. (3)), both the generalized inertial and moving forces need to be independently equivariant, meaning:

$$M(g \cdot q) = g\,M(q)\,g^{-1} \;\wedge\; g \cdot \tau(q, \dot{q}) = \tau(g \cdot q, g \cdot \dot{q}) \mid \forall\, g \in G,\ q \in Q,\ \dot{q} \in T_qQ. \qquad (4)$$

The equivariance of the generalized mass matrix provides a pathway for the identification of the symmetry group $G$ and the group action representations $\rho_Q(g) \mid g \in G$ (see section 4.2). The equivariance of the generalized moving forces (which in practice usually incorporate control forces, constraint forces, and external interactions) implies that dynamics $G$-equivariance holds until a symmetry-breaking force violates the equivariance of $\tau$.

Floating-base robotic/dynamical systems: Let us now narrow our focus to floating-base dynamical systems, namely legged/flying/swimming robots, animals, and animated characters evolving in a Euclidean space of $d$ dimensions (with its corresponding Euclidean Lie Group $E_d$), whose generalized coordinates can be decoupled into $q = \begin{bmatrix} X_B \\ q_J \end{bmatrix} \in Q \doteq E_d \times Q_J$. Here $X_B \in E_d$ represents the position and orientation of the system's base (or center of mass (CoM)), $q_J \in Q_J \subseteq \mathbb{R}^{n_J}$ represents the constrained configuration of the internal Degrees of Freedom (DoF), and $Q_J$ is the internal configuration space, commonly referred to as joint space. In this coordinate space, we can differentiate the effect of $g$ on $E_d$ and $Q_J$, noting that:

$$g(q) = g \cdot q = \rho_Q(g)\,q = \begin{bmatrix} \rho_{E_d}(g) & 0 \\ 0 & \rho_{Q_J}(g) \end{bmatrix}\begin{bmatrix} X_B \\ q_J \end{bmatrix} \mid \forall\, g \in G,$$

where $\rho_{E_d}(g) \in E_d$ is a homogeneous matrix transformation affecting the base and $\rho_{Q_J}(g) \in \mathbb{R}^{n_J \times n_J}$ is a transformation on the joint space. This differentiation becomes handy in identifying the system's continuous and discrete (section 4) symmetries.

Continuous symmetries of floating-base systems: The most commonly studied and exploited symmetries of floating-base systems are the continuous symmetries of the Euclidean space in which the system evolves, i.e., symmetry actions $g \in E_d$ involving $d$-dimensional rotations/reflections + translations. The property of these actions $g$ that is of most interest to us is the $E_d$-invariance of the joint-space configuration: $g \cdot q_J = q_J \iff \rho_{Q_J}(g) = I_{n_J} \mid \forall\, g \in E_d$.
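The Lagrangian $G$-invariance condition of eq. (2) can be checked numerically on a toy system. The sketch below uses two coupled oscillators (a hypothetical example that does not appear in this work) whose coordinates are exchanged by the action $g$; the spring constants and sample state are arbitrary illustrative values. With identical masses the Lagrangian is invariant, while unequal masses act as a symmetry-breaking perturbation.

```python
def lagrangian(q, dq, masses=(1.0, 1.0), k=3.0, k_c=0.5):
    """L = T - U for two oscillators coupled by a spring of stiffness k_c."""
    kinetic = 0.5 * sum(m * v * v for m, v in zip(masses, dq))
    potential = 0.5 * k * sum(x * x for x in q) + 0.5 * k_c * (q[0] - q[1]) ** 2
    return kinetic - potential

def g(v):
    """rho_Q(g): permutation exchanging the two coordinates.

    Being a time-invariant point transformation, the same representation
    acts on coordinates and on velocities.
    """
    return (v[1], v[0])

q, dq = (0.3, -1.2), (0.8, 0.1)

# Symmetric morphology: L(q, dq) == L(g.q, g.dq)
gap_symmetric = abs(lagrangian(q, dq) - lagrangian(g(q), g(dq)))

# Unequal masses break the symmetry: the two values differ
unequal = (1.0, 2.0)
gap_broken = abs(lagrangian(q, dq, masses=unequal)
                 - lagrangian(g(q), g(dq), masses=unequal))

print(gap_symmetric, gap_broken)
```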

4. DISCRETE MORPHOLOGICAL SYMMETRIES (DMSS)

A Discrete Morphological Symmetry (DMS) is a mathematical formalization of the intuitive property of floating-base dynamical systems that can imitate the effect of rotations, translations, and infeasible reflections of space with a feasible discrete change in the system configuration. To better introduce the concept of DMS, it is useful to first study the simplest (and most frequent) instance of a DMS: the reflection symmetry, which all humans and most animals approximately possess (Holló, 2017).

Reflection DMS $G = C_2$: Despite most floating-base dynamical systems being symmetric w.r.t. reflections of space ($\bar{g} \in E_d$), in practice it is common to ignore these reflection symmetries since, in general, it is impossible to subject a real-world robotic/dynamical system to a true reflection of space (Selig, 2005). Think of your own body as a floating-base system: you can move and rotate your base (hip) in space, but you are unable to execute a true reflection of space, which would force your heart to switch sides (with certainly fatal consequences). Fortunately, your body is symmetric w.r.t. the sagittal plane (we will assume perfect symmetry for now), which allows you to imitate the effect of a true reflection of space by modifying your internal configuration (your body pose) and rotating and translating your base (i.e., with a feasible discrete change in your configuration, see supp. fig. 6a).

Consider a floating-base system with generalized coordinates $q = \begin{bmatrix} X_B \\ q_J \end{bmatrix} \in Q \doteq E_d \times Q_J$, evolving in a $d$-dimensional Euclidean space. The system is said to have a DMS if, for a given continuous symmetry action $\bar{g} \in E_d$, there exists an action $g \in G$ that is proper ($|\rho_{E_d}(g)| = 1$) and non-trivial in joint space ($\rho_{Q_J}(g) \neq I_{n_J}$), such that:

$$L(q, \dot{q}) = L(\bar{g} \cdot q, \bar{g} \cdot \dot{q}) = L(g \cdot q, g \cdot \dot{q}) \mid \forall\, q \in Q,\ \dot{q} \in T_qQ,\ g \in G,\ \bar{g} \in E_d. \qquad (5)$$

Here $\bar{g}$ represents a rotation/reflection + translation in $E_d$, and $g$ is the action of the finite DMS group $G$, forcing a transformation of the internal joint-space configuration. The difference between $\bar{g}$ and $g$ is highlighted when reformulating eq. (5) for a specific system configuration:

$$L\left(\begin{bmatrix} \rho_{E_d}(\bar{g})X_B \\ q_J \end{bmatrix}, \begin{bmatrix} \rho_{E_d}(\bar{g})\dot{X}_B \\ \dot{q}_J \end{bmatrix}\right) = L\left(\begin{bmatrix} \rho_{E_d}(g)X_B \\ \rho_{Q_J}(g)\,q_J \end{bmatrix}, \begin{bmatrix} \rho_{E_d}(g)\dot{X}_B \\ \rho_{Q_J}(g)\,\dot{q}_J \end{bmatrix}\right), \quad |\rho_{E_d}(\bar{g})| = \pm 1,\ |\rho_{E_d}(g)| = 1,\ \rho_{E_d}(g)X = \rho_{E_d}(\bar{g})\,X\,\rho_{E_d}(\bar{g})^{-1} \qquad (6)$$

What eq. (6) highlights is that, with a DMS, infeasible reflections ($|\rho_{E_d}(\bar{g})| = -1$) and feasible or infeasible rotations/translations of space are imitated by a feasible ($|\rho_{E_d}(g)| = 1$) transformation of the system's base and a change in joint-space configuration. Furthermore, the structure of the proper transformation $\rho_{E_d}(g)X = \rho_{E_d}(\bar{g})\,X\,\rho_{E_d}(\bar{g})^{-1}$, along with the properties of the dynamics of symmetrical dynamical systems (eq. (4)), provides a pathway for the identification of $G$ and the representations $\rho_Q(g) \mid \forall\, g \in G$ for any floating-base dynamical system (section 4.3).

4.1. DATA AUGMENTATION IN SYSTEMS WITH DMS

Recall from section 3 that point-transformations $g \in G$ have the same representation $\rho_Q(g)$ for the configuration space $Q$, its tangent space $T_qQ$, and any higher-order tangent spaces (e.g., the space of generalized accelerations and forces (eq. (3))). Since the configuration space of our floating-base systems has the topology $Q \doteq E_d \times Q_J$, this property passes to the representations on $E_d$ and $Q_J$. This means that the representation $\rho_{E_d}(g)$ can be used to augment members of $E_d$ (i.e., points, vectors, and orientations) and members of the higher-order tangent spaces of $E_d$ (i.e., linear and angular velocities/accelerations). Likewise, the representation $\rho_{Q_J}(g)$ can be used to augment members of $Q_J$ and its higher-order tangent spaces (i.e., joint positions/velocities/accelerations, joint forces/torques). In practice, this means that any proprioceptive (e.g., joint torques, contact forces) and exteroceptive (e.g., point clouds, terrain height maps, RGBD images) measurements relevant to the evolution of the system's dynamics can be augmented solely with combinations of $\rho_{E_d}(g)$ and $\rho_{Q_J}(g)$, since these measurements live in $Q$, $E_d$, and their higher-order tangent spaces (see examples in supp. fig. 5 and supplementary E.3.1 and E.4.1). To achieve this, we need to identify the symmetry group $G$ and its action representations (section 4.2).
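As a minimal sketch of this augmentation, consider a hypothetical bilateral ($C_2$) system with four joints. The joint ordering, the sign flips of the roll joints, the choice of the sagittal plane as the x-z plane, and the sample values are all assumptions made for illustration. The representations are stored as generalized (signed) permutations, and angular velocity is treated as a pseudo-vector, which under a reflection $R$ transforms as $\det(R)\,R$.

```python
def apply_signed_perm(rep, x):
    """Apply a generalized permutation: out[i] = signs[i] * x[perm[i]]."""
    return [s * x[p] for p, s in zip(rep["perm"], rep["signs"])]

# Hypothetical joint order: [left_hip_roll, left_knee, right_hip_roll, right_knee]
RHO_QJ = {"perm": [2, 3, 0, 1], "signs": [-1, 1, -1, 1]}  # assumed sign flips on roll joints
# Reflection across the sagittal plane, assumed to be the x-z plane (y -> -y)
RHO_VECTOR = {"perm": [0, 1, 2], "signs": [1, -1, 1]}
# Pseudo-vectors (e.g., angular velocity) transform as det(R) * R
RHO_PSEUDO = {"perm": [0, 1, 2], "signs": [-1, 1, -1]}

def augment(sample):
    """Return the symmetric counterpart g.sample of a measurement sample."""
    return {
        "q_js": apply_signed_perm(RHO_QJ, sample["q_js"]),      # joint positions
        "dq_js": apply_signed_perm(RHO_QJ, sample["dq_js"]),    # joint velocities
        "tau": apply_signed_perm(RHO_QJ, sample["tau"]),        # joint torques
        "base_lin_vel": apply_signed_perm(RHO_VECTOR, sample["base_lin_vel"]),
        "base_ang_vel": apply_signed_perm(RHO_PSEUDO, sample["base_ang_vel"]),
    }

sample = {
    "q_js": [0.1, 0.5, 0.3, 0.9],
    "dq_js": [0.0, -0.2, 0.4, 0.1],
    "tau": [1.5, -0.7, 0.2, 0.0],
    "base_lin_vel": [1.0, 0.2, 0.0],
    "base_ang_vel": [0.0, 0.0, 0.5],
}
augmented = augment(sample)
print(augmented["q_js"], augmented["base_lin_vel"])
```

Since the reflection is an involution, applying `augment` twice recovers the original sample, which is a convenient sanity check for any hand-written representation.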

4.2. DMS IN THE CASE OF RIGID-BODY DYNAMICS

Until now, we have only assumed our dynamical system is a floating-base system. Now, we assume the system dynamics are governed by rigid-body dynamics. This means that our system is a collection of $n_B$ interconnected rigid bodies, which is the most frequent scenario in robotics, computer graphics, and experimental biology (see supplementary A). In rigid-body dynamics the generalized mass matrix is given by

$$M(q) = \sum_{k}^{n_B} J_{T_k}(q)^\top m_k\, J_{T_k}(q) + J_{R_k}(q)^\top I_k\, J_{R_k}(q),$$

where $J_{T_k}(q) : Q \to \mathbb{R}^{d \times n}$ and $J_{R_k}(q) : Q \to \mathbb{R}^{d \times n}$ are the position and orientation Jacobians used to map generalized velocities to the linear ($\dot{r}_k = J_{T_k}(q)\dot{q}$) and angular ($w_k = J_{R_k}(q)\dot{q}$) velocities of body $k$ (Wieber, 2006). These Jacobians are functions of the kinematic parameters of the system, while $m_k$ and $I_k$, the mass and inertia of body $k$, are the dynamic parameters of the system. A DMS implies symmetries over the kinematic and dynamic parameters of the system that, in practice, become useful for the identification of the DMS group $G$.

Symmetries of kinematic parameters (Kinematic Tree):

Considering only the kinematic parameters and the equivariant nature of $M(q)$ (eq. (4)), we conclude that a rigid-body system with a symmetry group must have positional and rotational Jacobians that satisfy $J_{T_k}(g \cdot q) = J_{T_k}(q)\,g^{-1} \wedge J_{R_k}(g \cdot q) = J_{R_k}(q)\,g^{-1} \mid \forall\, g \in G$, where $g$ is a continuous or discrete symmetry action. In the case of DMS, in which the discrete action $g \in G$ is designed to imitate the effect of a specific continuous symmetry action $\bar{g} \in E_d$, the $i$-th body Jacobians should satisfy:

$$J_{T_i}(g \cdot q)\,g = J_{T_k}(\bar{g} \cdot q) = J_{T_k}(q)\,\bar{g}^{-1} \;\wedge\; J_{R_i}(g \cdot q)\,g = J_{R_k}(\bar{g} \cdot q) = J_{R_k}(q)\,\bar{g}^{-1} \mid \forall\, (g, \bar{g}),\ g \in G,\ \bar{g} \in E_d \qquad (7)$$

where $k$ is the index of the body of the $\bar{g}$-transformed system (see appendix C.3). In practice, the symmetry in kinematic parameters described in eq. (7) is interpreted as a kinematic tree symmetry (see supplementary C), requiring the discrete action $g$ to result in a kinematic tree indistinguishable from the one obtained by applying the rotation/reflection + translation $\bar{g}$.

Symmetries of dynamic parameters (Mass and Inertia of rigid-bodies):

For a rigid-body dynamical system to have a DMS, the bodies of the system must have a symmetric mass distribution or the kinematic tree must be modular (subchains of the tree are symmetric to each other). To understand this morphological constraint, consider the base body configuration $X_B \in E_d$ and the definition $\rho_{E_d}(g)X_B = \rho_{E_d}(\bar{g})\,X_B\,\rho_{E_d}(\bar{g})^{-1}$ in eq. (6), with $\bar{g} \in E_d$ the continuous action imitated by the discrete action $g \in G$. Here the action on the left of $X_B$ is interpreted as a Euclidean transformation of the base in a global reference frame and the action on the right as a transformation in the frame attached to the base. Recall that, for $\bar{g}$ and $g$ to be Lagrangian-equivalent (eq. (5)), the dynamics of the base body at the configuration $\rho_{E_d}(\bar{g})X_B$ should be identical to the dynamics of the body at $\rho_{E_d}(g)X_B$ (eqs. (3) and (5)). Assuming exact kinematic parameter symmetries, both body configurations will have equivalent dynamics if their inertia matrix $I_B$ is identical in both configurations. Because, in general, $\rho_{E_d}(\bar{g}) \neq \rho_{E_d}(g)$, the rigid-body inertia must be invariant to the right transformation $X_B\,\rho_{E_d}(\bar{g})^{-1}$. This inertia invariance implies a symmetric mass distribution of the rigid body (see the geometric proof in supplementary C.2) and becomes a key property for the identification of the DMS group $G$. As an example, consider the robot Solo in fig. 1. This robot is able to imitate two reflections of space ($g_t$, $g_s$) and a 180° rotation of space ($g_r$). This is possible since the base body of the robot has two symmetry planes (see supp. fig. 6b), making the inertia of the base $I_B$, at any arbitrary configuration, invariant under the transformation $X_B\,\rho_{E_d}(\bar{g})^{-1}$ for the Euclidean actions $\bar{g}$ imitated by $\{g_t, g_s, g_r\} \subset K_4$.

Modular Kinematic Trees: Theoretically, the previously described constraint of symmetric mass distribution applies to all rigid bodies in the system, limiting the applicability of DMS to diverse floating-base systems. Conveniently, most systems of interest are modular, i.e., their kinematic trees are composed of subchains with identical or reflected rigid bodies (e.g., see in supp. [...]

[...] These are the candidate actions that the system could imitate.
3. Identify modularity in the kinematic tree, i.e., all pairs of identical/reflected rigid bodies.
4. From base to end-effectors, use eq. (7) to determine, for each $\bar{g}$, whether the action $g$ and $\rho_{Q_J}(g)$ exist.

5. G-EQUIVARIANT AND G-INVARIANT FUNCTION APPROXIMATORS

Once we have identified the DMS group $G$ of our system, we know that any proprioceptive or exteroceptive measurements have the same symmetry group $G$ (section 4.1). Therefore, to improve generalization and sample efficiency, we can exploit the known symmetries of the input $x$ and output $y$ spaces of any mapping we desire to approximate by constructing $G$-equivariant or $G$-invariant (eq. (1)) NNs $f(x; \phi)$, with parameters $\phi$ (Bronstein et al., 2017). This section builds on the framework for the construction of $G$-equivariant NNs of Finzi et al. (2021b). Our main motivation is to address the limitations that prohibit the construction of large-scale $G$-equivariant NNs (see supplementary D), which are ubiquitous in real-life applications. Consider $f(x; \phi)$ to be composed of multiple perceptron (or convolutional) layers of the form ${}^{l}y := \sigma({}^{l}W\,{}^{l}x + {}^{l}b)$, where ${}^{l}x \in \mathbb{R}^n$, ${}^{l}y \in \mathbb{R}^m$, and ${}^{l}W \in \mathbb{R}^{m \times n}$ and ${}^{l}b$ are the linear map and bias of layer $l$, respectively; and $\sigma : \mathbb{R} \to \mathbb{R}$ is a strictly monotonic nonlinearity (Ravanbakhsh et al., 2017). With this parametrization, the equivariance constraints of eq. (1) can be reduced to constraints on the linear map $W$ (dropping the layer index $l$ for notational clarity):

$$\rho_{out}(g)\,W = W\,\rho_{in}(g) \mid \forall\, g \in G \iff (\rho_W(g) - I)\,w = 0 \mid \forall\, g \in G. \qquad (8)$$

The RHS of eq. (8) is a reformulation of the equivariance constraints as a standard set of linear equations, defining $\rho_W(g) = \rho_{out}(g) \otimes \rho_{in}(g^{-1})^\top \in \mathbb{R}^{mn \times mn}$ as the representation of the group action acting on the linear map as a result of a semi-direct product of the input and output group actions ($\otimes$ stands for the Kronecker product) and $w = \mathrm{vec}(W) \in \mathbb{R}^{mn}$ as a vectorized version of $W$ (refer to Finzi et al. (2021b) for details). Since the constraint imposed by each $g$ is linear in $W$, we can stack them into a single system of linear equations $Cw = 0$. The nullspace of this system of equations, $B \in \mathbb{R}^{mn \times r}$, describes the $r$ basis vectors spanning the entire space of equivariant linear maps, which allows parameterizing all $G$-equivariant $W$ as:

$$w = \sum_{k}^{r} c_k\, B_{:,k} \iff W = \sum_{k}^{r} c_k\, \mathrm{unvec}(B_{:,k}) \doteq \sum_{k}^{r} c_k\, B_{:,:,k}, \qquad (9)$$

where the basis coefficients $c \in \mathbb{R}^r$ represent the free variables of the system of equations and the trainable parameters of the equivariant layer (see fig. 1-right).

Dealing with memory complexity of equivariant layers: An equivariant layer needs to store the matrices $\rho_W(g)$ and $B$, in addition to the typical memory complexity of a perceptron or convolutional layer. The memory complexity of these matrices quickly becomes intractable for moderate input-output dimensions (see supp. table 1). Fortunately, finite symmetry groups have sparse action matrix representations, resulting in both of the aforementioned matrices being sparse. Our implementation (footnote 2) extends the PyTorch API from Finzi et al. (2021b) to process finite groups with sparse matrix definitions, limiting the additional memory footprint of equivariant layers to a minimum.

Dealing with the computational complexity of determining the equivariant basis $B$: Computing $B$ for a layer is a process with high computational complexity. Existing approaches run in polynomial time $O(r^2 (mn)^2)$ (prohibiting their use in large dimensional spaces) and approximate the space rank $r$ numerically. Fortunately, DMS groups $G$ are finite and have, in general, generalized permutation matrices as regular representations, which enables the computation of $B$ in linear time (see supplementary B). Note that the constraints imposed by each $\rho_W(g)$ result in parameter sharing constraints (e.g., $w_{10} = -w_2 = \dots = w_0$). In these cases, every vector of the null-space of $C$ (i.e., $B_i$) simply describes the sharing scheme of a free variable of the system of equations (i.e., the trainable parameter $c_i$), and this sharing scheme is nothing else but one of the $r$ unique orbits of the dimensions of $w$ when transformed by all group actions, e.g., $G \cdot w_{10} = \{g \cdot w_{10} : \forall\, g \in G\} = \{w_{10}, -w_2, \dots, w_0\}$ (see the parameter orbits of length 4 in fig. 1-right, for $K_4$). The orbits of all $w_i \in w$ are trivially computed with $[w, \rho_W(g_1)w, \dots, \rho_W(g_{|G|})w]$, while the $r$ unique orbits can be identified in $O(mn)$ time. Our proposed solution can be thought of as a linear-time version of Ravanbakhsh et al. (2017).
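The orbit-based construction described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions and not the released code: representations are given as generalized permutations `(perm, signs)` with `(rho x)[i] = signs[i] * x[perm[i]]`, the full group (including the identity) is listed explicitly, and the $K_4$ representations on four legs are hypothetical. For such representations, eq. (8) ties each entry $W_{ik}$ to the entry at `(p_out[i], p_in[k])` with relative sign `s_out[i] * s_in[k]`.

```python
def equivariant_sharing(rep_in, rep_out, n, m):
    """Return (index, sign, r) with W[i][k] = sign[i][k] * c[index[i][k]].

    An index of -1 marks an entry forced to zero by contradictory signs.
    """
    index = [[None] * n for _ in range(m)]
    sign = [[0] * n for _ in range(m)]
    r = 0
    for i in range(m):
        for k in range(n):
            if index[i][k] is not None:
                continue  # entry already assigned to an earlier orbit
            orbit, consistent = {}, True
            for (p_in, s_in), (p_out, s_out) in zip(rep_in, rep_out):
                target, s = (p_out[i], p_in[k]), s_out[i] * s_in[k]
                if orbit.setdefault(target, s) != s:
                    consistent = False  # w = -w, so the whole orbit is zero
            for (a, b), s in orbit.items():
                index[a][b] = r if consistent else -1
                sign[a][b] = s if consistent else 0
            r += consistent
    return index, sign, r

def matrix_of(perm, signs):
    """Dense matrix of a generalized permutation."""
    return [[signs[i] if j == perm[i] else 0 for j in range(len(perm))]
            for i in range(len(perm))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical K4 action on four legs [FL, FR, HL, HR]: e, g_s, g_t, g_r
PERMS = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
REP_OUT = [(p, [1, 1, 1, 1]) for p in PERMS]
REP_IN = [(p, [s] * 4) for p, s in zip(PERMS, [1, -1, 1, -1])]

index, sign, r = equivariant_sharing(REP_IN, REP_OUT, n=4, m=4)
c = [0.5, -1.0, 2.0, 3.0]  # arbitrary values for the r trainable parameters
W = [[sign[i][k] * c[index[i][k]] if index[i][k] >= 0 else 0.0
      for k in range(4)] for i in range(4)]
is_equivariant = all(matmul(matrix_of(*ro), W) == matmul(W, matrix_of(*ri))
                     for ri, ro in zip(REP_IN, REP_OUT))
print(r, is_equivariant)
```

Each entry of $W$ is visited a constant number of times per group element, so the cost grows as $O(|G|\,mn)$, i.e., linearly in the number of entries for a fixed group, and no dense $mn \times mn$ matrix is ever formed.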
Optimal trainable parameter initialization for equivariant layers: Proper initialization of the equivariant layer's trainable parameters $c_l$ (eq. (9)) is required to prevent activations/gradients from vanishing or exploding (Klambauer et al., 2017). Following the same derivation as the Kaiming initialization (He et al., 2015) (see supplementary D.2), we conclude that the parameters should be initially sampled from a distribution with $\mathrm{Var}(c_l) = m / \lambda_B \gamma_\sigma$ to ensure constant variance of activations throughout the network layers (see supp. [...] 2017)). This initialization depends only on $B$ and is thus applicable to any Lie or finite group.

Reduction of trainable parameters in equivariant layers: Determining analytically the number of trainable parameters (i.e., the rank $r$) of a $G$-equivariant layer is, in general, an unresolved problem. However, for DMS groups, we show in supplementary D.1 that the number of trainable parameters of a $G$-equivariant layer lies in the range $|w|/|G| \le r \le |w|$, depending on the number of dimensions of the input-output spaces left invariant by the symmetry actions. In practice, this implies that for a $G$-equivariant layer without any input-output fixed points (e.g., all intermediate layers of a $G$-equivariant NN), the number of trainable parameters is reduced by a factor of $|G|$, the group order. Therefore, a $G$-equivariant architecture with $G = C_2$ (supp. fig. 6a) (or $G = K_4$, see fig. 1) will have approximately $1/2$ (or $1/4$) of the trainable parameters of an unconstrained NN of the same architectural size. The reduction of parameters is caused by the parameter sharing constraints (eq. (9)) and is visually depicted in fig. 1-right.
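The bounds on $r$ can be illustrated on small cases with a standard character-theory identity, which is not taken from this work: for real orthogonal representations of a finite group, the dimension of the space of equivariant linear maps is $r = \frac{1}{|G|}\sum_{g \in G} \mathrm{tr}(\rho_{out}(g))\,\mathrm{tr}(\rho_{in}(g))$. The sketch below evaluates it for hypothetical generalized-permutation representations, stored as `(perm, signs)` pairs.

```python
def trace(perm, signs):
    """Trace of a generalized permutation matrix: signed count of fixed points."""
    return sum(s for i, (p, s) in enumerate(zip(perm, signs)) if p == i)

def num_free_params(rep_in, rep_out):
    """Dimension r of the space of G-equivariant linear maps.

    Uses r = (1/|G|) * sum_g tr(rho_out(g)) * tr(rho_in(g)), valid for
    real orthogonal representations of a finite group.
    """
    total = sum(trace(*ro) * trace(*ri) for ri, ro in zip(rep_in, rep_out))
    assert total % len(rep_in) == 0
    return total // len(rep_in)

# C2 on 4 dimensions without fixed points: |w| = 16, r = |w| / |G|
C2_FREE = [([0, 1, 2, 3], [1, 1, 1, 1]), ([1, 0, 3, 2], [1, 1, 1, 1])]
# C2 on 3 dimensions with one invariant dimension: |w| = 9, r > |w| / |G|
C2_FIXED = [([0, 1, 2], [1, 1, 1]), ([1, 0, 2], [1, 1, 1])]
# K4 permuting 4 dimensions without fixed points: |w| = 16, r = |w| / |G|
K4_REG = [(p, [1, 1, 1, 1]) for p in
          ([0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0])]

print(num_free_params(C2_FREE, C2_FREE),
      num_free_params(C2_FIXED, C2_FIXED),
      num_free_params(K4_REG, K4_REG))
```

Only the identity contributes to the sum when no dimension is left fixed by a non-trivial action, which recovers the lower bound $r = |w|/|G|$; every fixed input-output dimension adds to the traces and pushes $r$ towards $|w|$.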

6. EXPERIMENTS

We present two supervised learning experiments: a regression application using synthetic data and a classification application using real-world data. Both experiments aim to illustrate the versatility of DMSs for data augmentation and training of equivariant functions, along with the impact on the model's sample efficiency and generalization capacity when exploiting DMSs. While we keep the presentation concise, all the technical aspects are detailed in supplementary E and in our implementation (footnote 2).

CoM Momentum Estimation (Regression):

In this experiment, we train a NN to approximate a robot's center-of-mass momentum given the joint-space positions and velocities: $h = A_G(q)\,\dot{q}$, where $h = [l^\top\ k^\top]^\top$ collects the linear ($l$) and angular ($k$) momentum components and $A_G$ is the Centroidal Momentum Matrix (CMM) of Orin et al. (2013). This analytical function is highly non-linear and $G$-equivariant to the robot's DMS group $G$. Consequently, the function approximator is expected to be equivariant or to approximate equivariance.

[...] Lin et al. (2021) for the estimation of static-friction-regime contacts in each of the four legs of the Mini-Cheetah quadruped robot. The dataset samples, collected in the real world, consist of the history of the past 150 time-frames of proprioceptive data collected from onboard sensors of the robot during locomotion with various gaits and over several terrains, $x \in \mathbb{R}^{54 \times 150}$, and the ground truth contact state of the robot, $y \in \mathbb{R}^{16}$, estimated off-line using a non-causal algorithm (i.e., dependent on the past and future). The objective is to train a causal function approximator $f(x; \phi)$ for estimating the contact state. The real-world Mini-Cheetah robot has an approximate reflection DMS $G \approx C_2$. Hence, both the proprioceptive data $x$ and the contact state $y$ share the symmetry group $G$ (see supplementary E.4). We compare three variants of function approximators: the original Conv-NN architecture of Lin [...] 3). Refer to appendix E.5 for details.

7. CONCLUSIONS & DISCUSSION

In this work, we present the concept of Discrete Morphological Symmetry (DMS). These are discrete symmetries of dynamical systems evolving in Euclidean space that are associated with the capability of the system to imitate Euclidean transformations (rotations/reflections and translations) with discrete changes in the system's internal state configuration. With this formalism, we can describe the bilateral and radial symmetries that are ubiquitous in robotic systems and animals in nature. By studying these symmetries with the language of Group Theory, we propose a mechanism for the identification of the finite DMS group $G$ and of the representations of the symmetry actions in the system's state variables and relevant proprioceptive and exteroceptive measurements. Having made the connection between Dynamical Systems and Group Theory, we show why and how these symmetries should be exploited in data-driven applications to obtain improvements in sample efficiency and generalization capacity, either by using data augmentation or $G$-equivariant Neural Networks. For the latter, we present practical contributions addressing the implementation drawbacks (intractable computational and memory complexity) of using $G$-equivariant architectures for real-world applications. Additionally, we release open-access code enabling the rapid prototyping of $G$-equivariant Neural Networks for the exploitation of DMS in applications processing data from rigid-body dynamics (e.g., robotics, computer graphics, and computational biology). Lastly, we present empirical results supporting our claims on two data-driven applications using synthetic and real-world data from three different robots. In both experiments, improvements in sample efficiency and generalization are obtained by exploiting the morphological symmetry bias, motivating the use of this technique in applications using data from simulation and/or the real world. Limitations: For a detailed account of limitations, see supplementary B.

REPRODUCIBILITY STATEMENT

The experimental setup used in the experiments is described in section 6 and supplementary E. Moreover, our implementation (footnote 2) will be open-access, where any interested party can find: (1) the original scripts used to run the experiments and generate the results, (2) the parameters of the models used for comparison in the experiments, avoiding the need to retrain the models to test the results, (3) the datasets used in each of our experiments, including the custom partitioning of the dataset of Lin et al. (2021), (4) the scripts used to summarize the results into the figures used in this paper, and (5) 3D interactive environments that allow for the visualization of the morphological symmetries; one of these environments was used for capturing fig. 1-left and its 3D animation.

SUPPLEMENTARY

ON DISCRETE SYMMETRIES OF ROBOTICS SYSTEMS: A GROUP-THEORETIC AND DATA-DRIVEN ANALYSIS

A. APPLICATIONS OF DISCRETE MORPHOLOGICAL SYMMETRIES

In this section, we provide our perspective on possible applications where DMS can be of value, and on the fields of knowledge where these applications play an important role. All applications of DMS for data-driven techniques fall into two categories: (1) data augmentation of proprioceptive and exteroceptive measurements, and (2) G-equivariant function approximation. The fields of knowledge that can benefit from these applications are:

• Biology, Biomechanics, and Experimental Veterinary. Studying the biomechanics and dynamics of animals in nature is becoming a fundamental area of the fields of Biology, Biomechanics, and Experimental Veterinary (Wei & Kording, 2018). Considering that around 99% of eumetazoans (most species excluding sponges and other sea species) are Bilaterian (Ferretti et al., 2020) (i.e., having approximate $C_2$ symmetry) or radially symmetric (i.e., having approximate $C_n$, $n \geq 2$, symmetry), DMS becomes a flexible and natural approach to analyzing the data gathered from these systems, especially from vertebrates, whose dynamics are often approximated by rigid-body dynamics (Wu et al., 2022). The study of animal motion dynamics normally relies on motion capture data obtained with marker-based (Prankel et al., 2016) or marker-less (Mathis et al., 2018) sensor pipelines. The marker data is then either processed directly or fitted to kinematic models (which assume exact kinematic symmetries) and then processed for information retrieval. DMS offers a clear way to mitigate the cost of data collection, both through simple data augmentation and through the construction of G-equivariant NNs that process the dynamics of the kinematic chains.

• Computer Graphics & Vision. Computer Graphics is perhaps the de facto application field of exact DMS. In this area, the kinematic structure of animated characters is assumed to be symmetric, as they often model the behavior of living vertebrates with $C_n$, $n \geq 2$, symmetry. The recorded trajectories are obtained through motion capture or expert artist animation and, to the authors' knowledge, the trajectories are seldom augmented with their symmetric equivalents. In this field, NNs are commonly used to learn projection spaces where motion matching and animation interpolation become easier problems than in minimal coordinate space (Holden et al., 2015; Starke et al., 2022), or to learn control policies for physics-based animation (Peng et al., 2018; Ma et al., 2021). However, the exploitation of the DMS inductive bias is not common in the field and has been approached (with specialized and costly algorithms) solely for the $C_2$ symmetry in Yeh et al. (2019).

• Robotics. NN function approximators are becoming a valuable tool in robotics applications of perception, control (Miki et al., 2022) and state estimation (Lin et al., 2021). In all these applications, NNs are used to approximate functions processing proprioceptive or exteroceptive measurements related to the evolution of the dynamics of the robot. Although the majority of legged platforms (Radford et al., 2015; Grimminger et al., 2020; Miki et al., 2022) and manipulators have $C_n$, $n \geq 2$, symmetries, the use of data augmentation (or G-equivariant NNs) to mitigate the high cost and risks of collecting data with robotic systems in the real world (or in simulation) is not a widespread technique. We believe the framework of DMS and our open-access code can contribute to the adoption of G-equivariant function approximators in the field.

• Control. In model-based control, it is common to exploit the symmetries of the Euclidean Lie Group in which the robot evolves (section 3) (Wu & Sreenath, 2015) and to implicitly exploit the equivariance of inertial forces (eq. (4)) by assuming approximate or exact symmetries in the kinematic and dynamic parameters of the dynamics model (Mastalli et al., 2022). However, to the authors' knowledge, no approach exploits DMSs (and especially the discrete symmetries of the joint space $Q_J$) in applications of space exploration, planning, and trajectory optimization, where DMS offers a technique to avoid the computation of symmetric trajectories. In model-free control, specifically in reinforcement learning (RL), exploiting symmetries in the dynamics mitigates the sample inefficiency and the sensitivity to local optima of these learning algorithms. Previous works have shown the impact of symmetry data augmentation (Weissenbacher et al., 2022; Ordonez-Apraez et al., 2022) and of G-equivariant function approximators (Van der Pol et al., 2020; Ordonez-Apraez et al., 2022) on model-free RL.

B LIMITATIONS

Our work makes two main assumptions:

1. Symmetries are exact: By assuming that a dynamical system has exact rather than approximate symmetries, we depart from the real-world nature of DMSs, since for any real robotic system the manufacturing and assembly process introduces errors/tolerances in the kinematic and dynamic parameters of each of the robot's bodies. Likewise, the dynamics of animals in nature are not perfectly equivariant, since morphological symmetries are only approximate. Although exact symmetry seems to be a strong assumption, it is common in robotics and control theory, where idealized models of the dynamics are often used. These models imply exact DMSs through the exact symmetries in kinematic and dynamic parameters, which are responsible for the equivariance of the generalized mass matrix, $M(g \cdot q) = g M(q) g^{-1}$ (eq. (4)), and therefore for the equivariance of inertial, centripetal, gravitational, and Coriolis generalized forces. In section 6 we show that the exact symmetry bias is justifiable and beneficial for learning function approximators that process the dynamics of approximately symmetric systems in the real world. However, the authors highlight the need to properly address the case of approximate equivariance, which we leave to future work. For this case, system identification techniques (Simpkins, 2012) have been widely used to approximate the deviation of the kinematic and dynamic parameters from their assumed values, while for G-equivariant NNs, Wang et al. (2022) and Finzi et al. (2021a) provide clear and valuable approaches to learning approximately G-equivariant NNs. It is relevant to highlight that, in physics-based simulation, the most common practice is to work with the idealized model of the dynamics. Thus, the assumption of exact symmetries is justifiable and encouraged in applications where simulation is a relevant tool (see supplementary A).

2. Linear-time computation of the basis B of equivariant linear maps is restricted to group actions whose matrix representations are generalized permutation matrices: The algorithm for computing B in O(mn) time and determining analytically the rank r of this space (see section 5) is restricted to the scenario where $\rho_w(g) \in \mathbb{R}^{nm \times nm}$ is a generalized permutation matrix (which occurs when both $\rho_{in}(g)$ and $\rho_{out}(g)$ are also generalized permutation matrices). Although this assumption always holds for $G = C_2$ (the most common DMS) and for all action representations described in section 6, supp.fig 5 and fig. 1, in general it might not hold for $|G| > 2$. In cases where either $\rho_{in}(g)$ or $\rho_{out}(g)$ is not a generalized permutation matrix, the computation of B can be approached using the Krylov subspace method proposed by Finzi et al. (2021b), with complexity $O(r^2 (mn)^2)$ and a numerical approximation of $r = \mathrm{rank}(B)$. Although this seems like a strong assumption, consider that in the case of DMS groups G:

• All action representations $\rho_{Q_J}(\cdot)$, acting on the joint-space manifold $Q_J$ and its tangent space, are generalized permutation matrices. This property holds when using the common convention of minimal coordinates for $q$ and $\dot q$, in which each vector of the orthogonal basis of $Q_J$ corresponds to a degree of freedom of the system. With this convention, any DMS action g acting on a single degree of freedom can be defined as a function of a single degree of freedom, $g \cdot q_i = g(q_j)$ s.t. $q_i, q_j \in q$.

• All action representations $(\rho_{in}(g), \rho_{out}(g))$ of the latent vector spaces of internal layers of an equivariant neural network (e.g., ${}^{l}z$ in fig. 1) can be arbitrarily parameterized (while respecting the group axioms). Since G is by definition a finite symmetry group, we can parameterize $\rho_{in}(g)$ and $\rho_{out}(g)$ to be generalized permutation matrices.

C PROPERTIES OF ROBOTIC SYSTEMS WITH MORPHOLOGICAL SYMMETRIES

For clarity of the explanation, let us imagine two Euclidean spaces and two versions of the robot: the original space (with reference frame $o$) containing the robot with coordinates $q$ and $\dot q$, and the virtual rotated/reflected space (with reference frame $\bar o$, with configuration $X^{o}_{\bar o} = \begin{bmatrix} R_{\bar g} & r_o \\ 0 & 1 \end{bmatrix}$) containing the virtual robot with coordinates $\bar g \cdot q$ and $\bar g \cdot \dot q$ referenced to $\bar o$. Note that, in the case of a reflection, the virtual robot has reflected versions of each rigid body. For eqs. (3) and (5) to hold, there must exist an action $g \in G$ transforming the real robot configuration to $g \cdot q$, $g \cdot \dot q$ such that its kinetic energy equals that of the virtual robot:

$$
T(g \cdot q, g \cdot \dot q) = \frac{1}{2} \sum_{i=1}^{n_B} \left( m_i\, \dot r_{g,i}^{\intercal} \dot r_{g,i} + w_{g,i}^{\intercal} I_i\, w_{g,i} \right) \doteq \frac{1}{2} \sum_{k=1}^{n_B} \left( m_k\, \dot{\bar r}_k^{\intercal} \dot{\bar r}_k + \bar w_k^{\intercal} \bar I_k\, \bar w_k \right) = T(\bar g \cdot q, \bar g \cdot \dot q), \tag{10}
$$

where $\dot r_{g,i}$, $w_{g,i}$, $m_i$ and $I_i$ are the linear and angular velocity, mass, and inertia matrix of the transformed body $i$ (referenced to $o$). Likewise, $\dot{\bar r}_k$, $\bar w_k$, $m_k$ and $\bar I_k$ are the equivalent quantities for body $k$ of the virtual robot (referenced to $\bar o$).

C.1 KINEMATIC SYMMETRIES:

Ignore momentarily the influence of the mass and inertia terms of the real and virtual bodies. We can assert that, for supp.eq 10 to hold, the transformed configuration should result in a kinematic tree indistinguishable from the virtual robot's. Thus, for every body $i$ in the real robot kinematic tree, there should exist an equivalent virtual body $k$ (as seen in supp.fig 5, not always $k = i$). By equating the linear and angular velocities of the real and virtual bodies, referenced to $o$, and expressing the velocities as functions of the generalized coordinates, we obtain:

$$
\begin{aligned}
\dot r_{g,i} = \dot{\bar r}_k &\doteq R_{\bar g}\, \dot r_k, &
J_{T_i}(g \cdot q)\, g \cdot \dot q &= R_{\bar g}\, J_{T_k}(q)\, \dot q, &
J_{T_i}(g \cdot q)\, g &= R_{\bar g}\, J_{T_k}(q), \\
w_{g,i} = \bar w_k &\doteq |R_{\bar g}| R_{\bar g}\, w_k, &
J_{R_i}(g \cdot q)\, g \cdot \dot q &= |R_{\bar g}| R_{\bar g}\, J_{R_k}(q)\, \dot q, &
J_{R_i}(g \cdot q)\, g &= |R_{\bar g}| R_{\bar g}\, J_{R_k}(q),
\end{aligned} \tag{11}
$$

where $J_{T_i}(q), J_{R_i}(q) \in \mathbb{R}^{3 \times n}$ are the position and orientation analytical Jacobians (describing the instantaneous velocity vectors contributed by each DoF to body $i$) of the real robot at a configuration $q$ (Wieber, 2006). Formulating supp.eq 11 for each of the $n_B$ bodies of the robot, we obtain at best $n_B \times 3 \times n$ non-linear equations that can be used to assert whether $g$ exists. In practice, if $g$ exists, the action representation $\rho_Q(g)$, and especially its component acting on the joint space $\rho_{Q_J}(g)$, can be trivially determined by solving supp.eq 11 (or equivalently eq. (7)) for each body from top to bottom of the kinematic tree (i.e., base first, end-effectors last).

C.2 REFLECTIONS/ROTATIONS REQUIRE MODULARITY OR SYMMETRIC RIGID BODIES

Let us assume kinematic symmetry and now direct our attention to the influence of the mass and inertia terms on the kinetic energy of a single rigid body when it is transformed by the action $g$, which imitates a true reflection of space $\bar g$. Focus on the first two columns of supp.fig 4. Because of the kinematic symmetry, the CoM of the reflected and transformed bodies coincide, so both bodies have equivalent linear components of kinetic energy. However, for arbitrary rigid bodies, the reflected and transformed bodies will have different angular components of kinetic energy: in the general case their inertias differ, so even if both bodies have the same angular velocities, their kinetic energies will differ. Let $p$, $p_{\bar g}$ and $p_g$ be frames located at the CoM of the original, reflected and transformed bodies, aligned with the principal axes of inertia of each body. Similarly, denote I wg [...] that:

$$
\begin{aligned}
w_{\bar g}^{o\,\intercal} I_{\bar g}^{o}\, w_{\bar g}^{o} &= w_{g}^{o\,\intercal} I_{g}^{o}\, w_{g}^{o}, \\
(R_{\bar g} w^{o})^{\intercal} I_{\bar g}^{o}\, (R_{\bar g} w^{o}) &= w_{g}^{o\,\intercal} I_{g}^{o}\, w_{g}^{o} && |\; w_{\bar g}^{o} = |R_{\bar g}| R_{\bar g} w^{o}, \\
I_{g}^{o} &= R_{\bar g}\, I_{\bar g}^{o}\, R_{\bar g} && |\; w^{o} \equiv w_{g}^{o},\; R_{\bar g} R_{\bar g} = I, \\
R^{o}_{p_g} I^{p_g}_{g} R^{o\,\intercal}_{p_g} &= R_{\bar g} R^{o}_{p_{\bar g}} I^{p_{\bar g}}_{\bar g} R^{o\,\intercal}_{p_{\bar g}} R_{\bar g} && |\; I^{a} = R^{a}_{b} I^{b} R^{a\,\intercal}_{b}, \\
R^{o}_{p_g} I^{p_g}_{g} R^{o\,\intercal}_{p_g} &= R_{\bar g} R^{o}_{p} R^{p}_{p_{\bar g}} I^{p_{\bar g}}_{\bar g} R^{p\,\intercal}_{p_{\bar g}} R^{o\,\intercal}_{p} R_{\bar g}, \\
R^{o}_{p_g} &\doteq R_{\bar g} R^{o}_{p} R^{p}_{p_{\bar g}} && |\; I^{p} \equiv I^{p_g}_{g} \equiv I^{p_{\bar g}}_{\bar g}.
\end{aligned} \tag{12a}
$$

What eq. (12a) states is that, for the reflected and transformed bodies to have equivalent angular kinetic energy, both bodies should have collinear (or aligned) principal axes of inertia. This allows us to describe $R^{o}_{p_g}$ as a function of the original body configuration $R^{o}_{p}$ and two reflection matrices: the true reflection of space $R_{\bar g}$ and a body-specific diagonal reflection matrix $R^{p}_{p_{\bar g}}$, which exists only if the rigid body has a symmetric mass distribution. A visual example for symmetric and asymmetric rigid bodies is presented in the left and middle columns of supp.fig 4.

Supplementary Figure 4: Properties of bodies capable of imitating a true reflection $\bar g$ of space (w.r.t. the yz-plane in this case) with a proper transformation $g$ involving only a rotation and a translation. The first row shows the original bodies with their respective angular velocities $w$, subjected to the trivial symmetry transformation $e$ (dashed lines represent the principal axes of inertia of the bodies), and the second and third rows display the effect of $\bar g$ and $g$ on the bodies and angular velocities, respectively. The first column displays a rigid body with symmetric mass distribution, for which $g$ exists, as the reflected and rotated bodies share an equivalent angular kinetic energy. The second column shows a rigid body with asymmetric mass distribution, for which the rotation $g$ that produces a kinematic symmetry results in the reflected and rotated bodies having different angular kinetic energies (eq. (2)). The third column shows two bodies with asymmetric mass distributions, each a reflected version of the other; in this case, the action $g$ swaps the bodies' configurations to imitate the configuration and energy state of the reflected bodies transformed with $\bar g$.

Angular velocity is a pseudo-vector (or axial vector), for which a reflection transformation is computed as $w_{\bar g} = |R_{\bar g}| R_{\bar g} w$ (see Quigley (1973)).

C.3 MODULARITY IN KINEMATIC TREES

As previously described, body $i$ of the real robot should have a reflected equivalent body $k$ with the same CoM position and aligned principal axes of inertia, up to a reflection of any of these axes. If body $i$ is unique in the kinematic tree, then $k = i$ and the body must have at least a symmetric mass distribution. When body $i$ is not unique (i.e., there exists another body $k \neq i$ that is its true reflected version), no constraint of symmetric mass distribution is imposed on $i$; only the alignment of the principal axes of inertia is required. A didactic example of this scenario is presented in supp.

Although it is implicit in eq. (2) that the constrained position $Q$ and velocity $T_q Q$ configuration vector spaces should also be symmetric or equivariant, this property might be easily overlooked. As mentioned in section 4, the relevance of morphological symmetries relies on the equivariant nature of the system dynamics (eqs. (1) and (3)), which imprints symmetry constraints on optimal control policies and on proprioceptive and exteroceptive measurements. However, with non-symmetric constrained configuration spaces, eq. (2) will not hold for every system state $q \in Q$, $\dot q \in T_q Q$, and an uncontrolled or controlled trajectory of the system dynamics may not have a symmetric equivalent trajectory, as the latter has the potential to violate the constraints of the configuration space.

D EFFICIENT CONSTRUCTION OF G-EQUIVARIANT NNS FOR DISCRETE MORPHOLOGICAL SYMMETRY (DMS) GROUPS G

As mentioned in section 5, our work builds upon the framework for the construction of G-equivariant NNs of Finzi et al. (2021b). The core limitation of this framework is its inability to handle high-dimensional spaces, due to its computational and memory complexity. For an equivariant layer with input dimension $n$ and output dimension $m$, the computational complexity of finding the equivariant linear map basis $B$ (which is $O((mn)^2 r^2)$ with the Krylov subspace method) and the memory complexity of $B \in \mathbb{R}^{mn \times r}$, $r \leq mn$, easily become intractable for moderate $n$ and $m$. This limitation is openly discussed in the EMLP repository README.md, but regretfully not in the original paper. In practice, we ran into these limitations when trying to construct the equivariant version of the Contact CNN (Lin et al., 2021) in our second experiment. The internal layers of this architecture have $n, m > 2000$, for which: (i) the complexity of the Krylov subspace method renders the operation intractable on standard hardware, and (ii) the matrices $B$ of internal layers require more than 1 [Gb] of storage for moderate input-output dimensions ($m, n \approx 250$) and more than 1 [Pb] for $m, n > 2000$. See supp.table 1 for a comparison between dense and sparse matrix representations.

D.1 TRAINABLE PARAMETER REDUCTION OF G-EQUIVARIANT LAYERS (FOR G A DMS GROUP)

Determining analytically the number of trainable parameters (i.e., the rank $r$) of a G-equivariant layer is, in general, an unresolved problem. However, for DMS groups, $r$ can be computed once the input-output action representations are known. The requirements for computing $r$ are that the actions affecting the linear maps are a semi-direct product¹⁰ of the input-output groups, and that the input-output representations are generalized permutation matrices. These conditions are met for most DMS groups (see supplementary B). The equivariance constraints of eq. (9) on the linear maps of perceptron (or convolutional) layers imply a reduction of trainable parameters from $|w| = mn$ to $|c| = r \leq mn$. For DMS groups, $r$ is associated with the number of unique orbits of the elements of $w$. Thus, we can compute this value using the orbit-counting theorem (also known as Burnside's Lemma), which states that the number of orbits is the average number of fixed points of $G$, that is,

$$
r = \frac{1}{|G|} \sum_{g \in G} |w^{g}|, \qquad w^{g} \doteq \{ w \in \mathbf{w} : g \cdot w = w \},
$$

where $w^{g}$ represents the set of elements of $w$ that are invariant to $g$ (i.e., fixed points). Those fixed points can be identified by the elements on the diagonal of $\rho_W(g)$ that are equal to one.
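The counting argument above can be sketched numerically. The snippet below is an illustrative sketch and not the released implementation: the function names and the two example representations are assumptions made for this example. It averages the trace of $\rho_W(g)$ over the group, which coincides with counting diagonal entries equal to one whenever the diagonal of $\rho_W(g)$ contains no $-1$ entries, and cross-checks the result against the null-space dimension of the stacked equivariance constraints.

```python
import numpy as np


def rho_w(r_in, r_out):
    # Action on row-major vec(W) induced by W -> rho_out(g) W rho_in(g)^{-1}.
    return np.kron(r_out, np.linalg.inv(r_in).T)


def equivariant_rank(reps_in, reps_out):
    """Average the trace of rho_w(g) over the group (character formula)."""
    traces = [np.trace(rho_w(r_in, r_out)) for r_in, r_out in zip(reps_in, reps_out)]
    return int(round(sum(traces) / len(traces)))


def equivariant_rank_bruteforce(reps_in, reps_out):
    """Null-space dimension of the stacked constraints (rho_w(g) - I) vec(W) = 0."""
    m, n = reps_out[0].shape[0], reps_in[0].shape[0]
    constraints = np.vstack([rho_w(r_in, r_out) - np.eye(m * n)
                             for r_in, r_out in zip(reps_in, reps_out)])
    return m * n - int(np.linalg.matrix_rank(constraints))


# Hypothetical C2 example: swap two 2-DoF subchains at the input, two outputs.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
c2_in = [np.eye(4), np.kron(swap, np.eye(2))]
c2_out = [np.eye(2), swap]

# Hypothetical C3 example: cyclic permutation of three 3-DoF subchains.
cyc = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
c3 = [np.eye(9), np.kron(cyc, np.eye(3)), np.kron(cyc @ cyc, np.eye(3))]

r_c2 = equivariant_rank(c2_in, c2_out)  # an unconstrained layer has 2 * 4 = 8
r_c3 = equivariant_rank(c3, c3)         # an unconstrained layer has 9 * 9 = 81
```

In both examples no group element other than the identity fixes any weight, so the number of trainable parameters is reduced by a factor of $|G|$.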

Tri-Finger Robot G = C3

This fixed-base robot is symmetric w.r.t. rotations of space by $\theta = 120^{\circ}$ about the vertical axis. Therefore, its symmetry group is the cyclic group of order three ($G = C_3$). To identify this symmetry group we apply the procedure in section 4.3:

1. Identify $X_B$ and $I_B$: As a fixed-base robot, we define $X_B$ to be the mounting structure supporting each finger and the gray disk delimiting the workspace (see image).

2. Identify symmetries of $I_B$: The inertia of this virtual base $I_B$ is invariant to rotations by $120^{\circ}$ about the vertical axis, i.e., $I_B$ is invariant to $X_B \rho_{E_3}(g)^{-1} \;|\; g \in \{e, g_\theta, g_\theta^2\} \equiv C_3$ (eq. (6)).

3. Identify modularity in the kinematic tree: There are three symmetric kinematic subchains. Each finger is composed of replicated versions of the same bodies.

4. Identify the DMS group G: Consider that the transformation $\rho_{E_3}(g) X_B \doteq \rho_{E_3}(g) X_B \rho_{E_3}(g)^{-1}$ (eq. (6)) can be interpreted as a rotation of the virtual base by $\theta$ followed by a rotation by $-\theta$ about the z axis, thus respecting the fixed-base constraint of the system. Let the joint-space coordinates $q = [q_{f_1}^{\intercal}, q_{f_2}^{\intercal}, q_{f_3}^{\intercal}]^{\intercal}$ be composed of each finger's DoF ($q_{f_i} \in \mathbb{R}^3$). Then we can define $\rho_{Q_J}(g) \doteq \rho_{\mathbb{R}^3}(g) \otimes I_3 \;|\; g \in C_3$, where $\rho_{\mathbb{R}^3}(\cdot)$ is the unique representation of the actions of $C_3$ on a 3-dimensional space (3 subchains). For the generator action of the group this is $\rho_{\mathbb{R}^3}(g) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$. Lastly, we verify that $G = C_3$ by testing all tentative group actions against the DMS condition (eq. (5)).

Augmentation of data samples: Say we collect a dataset of robot states $(q, \dot q)$ and cube states $X_C$ at every time step $t$, to train the manipulation policy (Funk et al., 2021). Since we are imitating the effect of a true rotation of space $\bar g$, the symmetric states at every $t$ are obtained as $(g \cdot q, g \cdot \dot q)$ and $(g \cdot X_C \doteq \rho_{E_3}(g) X_C)$.
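As an illustration of step 4, the joint-space representation of the Tri-Finger can be assembled with a Kronecker product. This is a minimal sketch assuming 3 DoF per finger and the finger ordering $[q_{f_1}, q_{f_2}, q_{f_3}]$ used in the text; the function name and the sample configuration are hypothetical.

```python
import numpy as np

# Generator of C3 permuting the three finger subchains (matrix from the text).
rho_R3_g = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [1, 0, 0]])


def c3_joint_space_reps(dof_per_finger=3):
    """Return [rho(e), rho(g), rho(g^2)] acting on the stacked finger DoF."""
    g = np.kron(rho_R3_g, np.eye(dof_per_finger, dtype=int))
    e = np.eye(3 * dof_per_finger, dtype=int)
    return [e, g, g @ g]


reps = c3_joint_space_reps()

# Hypothetical joint configuration: entries 0-2 belong to finger 1, and so on.
q = np.arange(9.0)
orbit = [rho @ q for rho in reps]  # the three symmetric joint configurations
```

Applying the generator moves the coordinates of finger 2 into the slot of finger 1 (and cyclically for the other fingers), and applying it three times returns the original configuration, as required for a representation of $C_3$.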

Bolt Bipedal Robot G = C2

Bolt is a bipedal robot with a sagittal plane reflection symmetry ($G = C_2$). This morphological symmetry allows it to imitate the effect of arbitrary reflections of space ($\bar g \in E_3$) by re-configuring its base and legs. To identify this symmetry group we apply the procedure in section 4.3:

1. Identify $X_B$ and $I_B$: $X_B$ is the robot base (hips) body, with its corresponding inertia $I_B$.

2. Identify symmetries of $I_B$: The base body has a symmetric mass distribution w.r.t. the sagittal plane. Thus, $I_B$ is invariant to the transformation $X_B \rho_{E_3}(g)^{-1} \;|\; g \in \{e, g_s\} \equiv C_2$ (eq. (6)).

3. Identify modularity in the kinematic tree: There are two symmetric kinematic subchains. The left leg subchain and bodies are reflected versions of the right leg subchain and bodies.

4. Identify the DMS group G: Since a reflection w.r.t. the sagittal plane would imply a true reflection of the rigid bodies of the legs, we need to permute each body in the kinematic tree with its reflected version. Denote the joint-space coordinates $q = [q_L^{\intercal}, q_R^{\intercal}]^{\intercal}$ as composed of the left (L) and right (R) legs' DoF ($q_{L/R} \in \mathbb{R}^3$), and denote the sign relation between the DoF of the left and right legs as $s_{L|R} \in \mathbb{R}^3$. Then we can define $\rho_{Q_J}(g) \doteq \rho_{\mathbb{R}^2}(g) \otimes (s_{L|R} I_3) \;|\; g \in C_2$, where $\rho_{\mathbb{R}^2}(\cdot)$ is the unique representation of the actions of $C_2$ on a 2-dimensional space (2 subchains). For the non-trivial action of the group this is $\rho_{\mathbb{R}^2}(g_s) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. Lastly, recalling the definition of $\rho_{E_3}(g)$ in eq. (6), we verify that $G = C_2$ by testing all tentative group actions against the DMS condition (eq. (5)).

Augmentation of data samples: Say we collect, at every time step $t$, a dataset of robot states $(q, \dot q)$ and ground reaction forces $(f_L, f_R)$, which we transform to the space of generalized forces as $(\tau_{f_L}, \tau_{f_R})$, aiming to train a reactive locomotion policy (Ordonez-Apraez et al., 2022). The symmetric states at every $t$ are then defined as $(g \cdot q, g \cdot \dot q)$ and $(g \cdot \tau_{f_L}, g \cdot \tau_{f_R}) \equiv (\rho_Q(g) \tau_{f_L}, \rho_Q(g) \tau_{f_R})$.

Supplementary Figure 5: Example morphological symmetries of the Tri-Finger (Funk et al., 2021) and Bolt robots, allowing them to imitate a rotation of space (left) and a reflection of space (right).

The core difference in the initialization of unconstrained and equivariant layers lies in the way the linear map is parameterized. For equivariant layers we have:

$$
\begin{aligned}
\mathrm{Var}({}^{l}W\, {}^{l}x + {}^{l}b) &= \sum_{i}^{m} \sum_{j}^{n} \mathrm{Var}\!\left({}^{l}W_{i,j}\, {}^{l}x_j\right) && |\; \mathrm{Var}({}^{l}b) = 0 \\
&= \sum_{i}^{m} \sum_{j}^{n} \mathrm{Var}\!\left(\sum_{k}^{r} {}^{l}c_k\, {}^{l}B_{i,j,k}\, {}^{l}x_j\right) \\
&= \mathrm{Var}\!\left({}^{l}c\, {}^{l}x\right) \underbrace{\sum_{i}^{m} \sum_{j}^{n} \sum_{k}^{r} B_{i,j,k}^{2}}_{\lambda_{{}^{l}B}} && |\; \mathrm{Var}\!\Big(\sum_a s_a^{\text{const}}\, p\Big) = \sum_a s_a^{2}\, \mathrm{Var}(p)
\end{aligned} \tag{14}
$$

In the forward-propagation scenario, we are interested in conserving the variance of the activations across layers; that is, we must ensure $\mathrm{Var}({}^{l}z) = \mathrm{Var}({}^{l-1}z)$. Using supp.eq 14 we obtain:

$$
\begin{aligned}
\mathrm{Var}({}^{l}z) &= \mathrm{Var}({}^{l}W\, {}^{l}x + {}^{l}b) \\
m\, \mathrm{Var}({}^{l}z) &= \lambda_{{}^{l}B}\, \mathrm{Var}\!\left({}^{l}c\, {}^{l}x\right) \\
\mathrm{Var}({}^{l}z) &= \frac{\lambda_{{}^{l}B}}{m} \Big( \underbrace{E({}^{l}c^{2})}_{\mathrm{Var}({}^{l}c)} E({}^{l}x^{2}) - \underbrace{E({}^{l}c)^{2}}_{=0} E({}^{l}x)^{2} \Big) \\
\mathrm{Var}({}^{l}z) &= \frac{\lambda_{{}^{l}B}}{m}\, \mathrm{Var}({}^{l}c)\, E\!\left({}^{l-1}y^{2}\right) && |\; {}^{l}x = {}^{l-1}y = \sigma({}^{l-1}z) \\
\mathrm{Var}({}^{l}z) &= \frac{\lambda_{{}^{l}B}\, \lambda_\sigma}{m}\, \mathrm{Var}({}^{l}c)\, \mathrm{Var}({}^{l-1}z) && |\; E\!\left({}^{l-1}y^{2}\right) = \lambda_\sigma\, \mathrm{Var}({}^{l-1}z)
\end{aligned} \tag{15}
$$

$$
\mathrm{Var}({}^{l}z) \equiv \mathrm{Var}({}^{l-1}z) \quad |\; \mathrm{Var}({}^{l}c) \doteq \frac{m}{\lambda_{{}^{l}B}\, \lambda_\sigma} \tag{16}
$$

where $\lambda_\sigma$ in supp.eq 15 is a non-linearity-dependent scalar computed analytically or empirically (see He et al. (2015)). In supp.eq 16 we conclude that if we sample the trainable parameters ${}^{l}c$ of the equivariant layer from a distribution ensuring $\mathrm{Var}({}^{l}c) \doteq \frac{m}{\lambda_{{}^{l}B} \lambda_\sigma}$, the variance of the activations across equivariant layers remains constant. A similar procedure can be applied to the backward-propagation case, concluding that, to maintain a constant variance of the gradients across the network layers, we should sample the trainable parameters ensuring $\mathrm{Var}({}^{l}c) \doteq \frac{n}{\lambda_{{}^{l}B} \lambda_\sigma}$. As remarked in He et al. (2015), both variance values (for the forward and backward propagation cases) lead to a proper flow of information in the network. In supp.fig 8, it can be appreciated that our method achieves equivalent results for equivariant architectures as He et al. (2015) does for standard linear and convolutional architectures.
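As a sanity check of supp.eq 16, consider the special case of an unconstrained dense layer. This worked example is an illustration added under the assumption that each basis element is a one-hot matrix (so that $r = mn$) and that $\lambda_\sigma = 1/2$ for the ReLU non-linearity, as in He et al. (2015):

```latex
\lambda_{B} = \sum_{i}^{m}\sum_{j}^{n}\sum_{k}^{mn} B_{i,j,k}^{2} = mn
\quad\Longrightarrow\quad
\mathrm{Var}(c) = \frac{m}{\lambda_{B}\,\lambda_\sigma}
               = \frac{m}{mn \cdot \tfrac{1}{2}}
               = \frac{2}{n}.
```

Under these assumptions the forward-propagation rule reduces to the familiar variance $2/n$ of He et al. (2015) for a layer with $n$ inputs, which suggests the equivariant rule behaves as a generalization of that scheme rather than a separate heuristic.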

E IMPLEMENTATION DETAILS & CODE

In addition to this section, we provide open-access code with the scripts for reproducing the experiments of this work and the parameters of the models used for comparison, along with additional interactive examples visualizing the morphological symmetries of both robotic systems and data.

E.1 EFFICIENT DATA AUGMENTATION

Since any input x and output y spaces of equivariant architectures have matrix symmetry action representations, ρ x (g), ρ y (g), it is possible to perform batched data augmentation, reducing the computational complexity of augmenting a batch of N b samples from N b matrix-vector multiplications to a single matrix-matrix multiplication, preferably performed after data is loaded to GPU for optimal performance.
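The batched augmentation described above can be sketched as follows. This is a minimal NumPy illustration, not the released code: the function names, the batch layout (one sample per row) and the example $C_2$ representations are assumptions. The same expression would apply with a GPU array library once the batch is on the device.

```python
import numpy as np


def augment_batch(x, y, rho_x, rho_y):
    """Apply one group action g to a whole batch with two matrix products.

    x: (N, n) inputs, y: (N, m) targets; rho_x: (n, n), rho_y: (m, m).
    Row i of each result equals rho(g) applied to row i of the input.
    """
    return x @ rho_x.T, y @ rho_y.T


def augment_with_orbit(x, y, reps_x, reps_y):
    """Stack the batch transformed by every group element (|G| * N samples)."""
    pairs = [augment_batch(x, y, rx, ry) for rx, ry in zip(reps_x, reps_y)]
    xs, ys = zip(*pairs)
    return np.concatenate(xs), np.concatenate(ys)


# Hypothetical C2 example: swap two 2-DoF subchains in x, flip one axis of y.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
rho_x = np.kron(swap, np.eye(2))
rho_y = np.diag([1.0, -1.0, 1.0])

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = rng.normal(size=(8, 3))

gx, gy = augment_batch(x, y, rho_x, rho_y)
x_aug, y_aug = augment_with_orbit(x, y, [np.eye(4), rho_x], [np.eye(3), rho_y])
```

Storing samples as rows means the action is applied as a right-multiplication by the transposed representation, which avoids a Python loop over the batch.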

E.3.2 PRACTICAL DETAILS OF THE DATASET GENERATION

The URDF files of the robots Solo and Atlas are generated using XACRO scripts, which replicate the structure of limbs to their symmetric counterparts, making the dynamics of the robots in simulation exactly G-equivariant. However, the algorithm for computing the CoM momentum from Pinocchio is numerically sensitive, resulting in the orbits of the momentum $G \cdot h$ deviating slightly from the theoretical orbits. Therefore, to reduce numerical errors and ensure the theoretical equivariance of the data, we replace every target variable by the average of its orbit:

$$
y = h \doteq \frac{1}{|G|} \sum_{g \in G} g^{-1} \cdot \big( A_G(\rho_Q(g)\, q)\; \rho_Q(g)\, \dot q \big).
$$
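The orbit averaging of the targets can be illustrated with a generic sketch. Here a hypothetical function `f` stands in for the momentum computation $A_G(q)\dot q$, and the $C_2$ representations are assumptions made for the example; none of these names come from the released code or from Pinocchio.

```python
import numpy as np


def orbit_average(f, x, reps_x, reps_y):
    """Return (1/|G|) * sum_g rho_y(g)^{-1} f(rho_x(g) x).

    reps_x and reps_y must list the representations of every element of G
    in the same order.
    """
    terms = [np.linalg.solve(r_y, f(r_x @ x)) for r_x, r_y in zip(reps_x, reps_y)]
    return np.mean(terms, axis=0)


# Hypothetical C2 representations on a 4-dim input and a 3-dim target.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
reps_x = [np.eye(4), np.kron(swap, np.eye(2))]
reps_y = [np.eye(3), np.diag([1.0, -1.0, 1.0])]

# Stand-in for a numerically imperfect (not exactly equivariant) target map.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))


def f(x):
    return A @ x


x = rng.normal(size=4)
y_sym = orbit_average(f, x, reps_x, reps_y)
y_sym_at_gx = orbit_average(f, reps_x[1] @ x, reps_x, reps_y)
```

Because the sum runs over the whole group, the averaged target is exactly equivariant up to floating-point error, even when `f` itself is not.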

E.4 EXPERIMENT: STATIC-FRICTION-REGIME CONTACT DETECTION

The dataset presented in Lin et al. (2021) is composed of output samples $y \in \mathbb{R}^{16}$, where each dimension of $y$ represents the logit of a specific contact state, one of the 16 combinations of the binary contact states of the 4 legs. The input samples $x = \{z_i\}_{i=0}^{150} \in \mathbb{R}^{54 \times 150}$ are a history of 150 samples $z = [q, \dot q, a, w, p, v] \in \mathbb{R}^{54}$, where $q \in \mathbb{R}^{n_J}$, $\dot q \in \mathbb{R}^{n_J}$, $a \in \mathbb{R}^{3}$, $w \in \mathbb{R}^{3}$, $p \in \mathbb{R}^{12}$, $v \in \mathbb{R}^{12}$ are the MIT Mini-Cheetah robot joint-space positions, joint-space velocities, base linear acceleration, base angular velocity, and the positions and velocities of each of the four feet, respectively, referenced to the robot's base frame $B$. The function approximator to learn is expected to be approximately equivariant to the reflection group $C_2$, considering the sagittal symmetry of the robot morphology. Therefore:

$$
g \cdot y = f(g \cdot x; \phi) \quad |\; g \in G = C_2. \tag{20}
$$

The MiniCheetah robot evolves in 3-dimensional Euclidean space. Therefore, its configuration space can be decoupled into $Q \doteq E_3 \times Q_J$. After identifying the symmetry group and its corresponding $E_3$ and $Q_J$ representations ($\rho_{E_3}(g), \rho_Q(g) \;|\; \forall g \in G$), we can identify the representations of the input and output spaces of the NN function approximator (supp.eq 20). The representation $\rho_p(g)$ acting on $p \in \mathbb{R}^{12}$ and $v \in \mathbb{R}^{12}$ is determined by noting that each of the feet positions $(p_{RF}, p_{LF}, p_{RH}, p_{LH})$ and velocities $(v_{RF}, v_{LF}, v_{RH}, v_{LH})$ are simply vectors living in $E_3$. Thus, we must apply the Euclidean action $\rho_{E_3}(\bar g)$ while at the same time permuting the feet (similar to the permutation of the kinematic subchains described by $\rho_{Q_J}(g)$):

$$
\begin{aligned}
g \cdot p &= \underbrace{\big(\rho_{\mathbb{R}^4}(g) \otimes \rho_{E_3}(\bar g)\big)}_{\rho_p(g)}\, p, &
g \cdot v &= \underbrace{\big(\rho_{\mathbb{R}^4}(g) \otimes \rho_{E_3}(\bar g)\big)}_{\rho_p(g)}\, v && |\; \forall (g, \bar g),\; g \in G,\; \bar g \in E_3 \\
&= \left( \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \otimes \rho_{E_3}(\bar g) \right) \begin{bmatrix} p_{RF} \\ p_{LF} \\ p_{RH} \\ p_{LH} \end{bmatrix}, &
&= \left( \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \otimes \rho_{E_3}(\bar g) \right) \begin{bmatrix} v_{RF} \\ v_{LF} \\ v_{RH} \\ v_{LH} \end{bmatrix} && |\; (g, \bar g),\; C_2 = \{e, g\}
\end{aligned} \tag{22}
$$

Here $\rho_{\mathbb{R}^4}(g) \;|\; \forall g \in G$ are the unique representations of the finite group $G$ in a 4-dimensional space, representing the 4 subchains of the kinematic tree. This representation, for the non-trivial action of the MiniCheetah DMS group $G \approx C_2$, is expressed in supp.eq 22. The nature of this representation might be better understood by considering that $\rho_{Q_J}(g) \doteq \rho_{\mathbb{R}^4}(g) \otimes I_{n_s}$, where $n_s$ is the number of degrees of freedom of each kinematic subchain (3 DoF in the case of MiniCheetah). See simpler examples in supp.fig 5. Lastly, the representation for the contact state $\rho_y(g)$ is given by the permutation matrix relating $y$ and $g \cdot y$ described in supp.table 3.
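The feet representation $\rho_p(g)$ can be assembled directly from the two factors. The sketch below is illustrative only: it assumes the feet are ordered (RF, LF, RH, LH) as in the text, that the base frame has its y axis pointing laterally (so the sagittal reflection is `diag(1, -1, 1)`), and it uses made-up foot positions.

```python
import numpy as np

# Permutation of the four leg subchains: RF <-> LF and RH <-> LH.
rho_R4_g = np.array([[0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]])

# ASSUMPTION: y is the lateral axis of the base frame, so a reflection
# w.r.t. the sagittal plane flips the y component of a vector.
rho_E3_g = np.diag([1.0, -1.0, 1.0])

# 12 x 12 representation acting on the stacked feet positions or velocities.
rho_p_g = np.kron(rho_R4_g, rho_E3_g)

# Hypothetical symmetric stance, stacked as [RF, LF, RH, LH].
p_sym = np.array([0.2, -0.1, -0.3,
                  0.2, 0.1, -0.3,
                  -0.2, -0.1, -0.3,
                  -0.2, 0.1, -0.3])

# Same stance with the right-front foot lifted by 0.1.
p_lift = p_sym.copy()
p_lift[2] = -0.2

g_p_sym = rho_p_g @ p_sym
g_p_lift = rho_p_g @ p_lift
```

A symmetric stance is left unchanged by the action, while lifting the right-front foot maps to the configuration in which the left-front foot is lifted, which is the behavior expected from a sagittal reflection.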



1. The field of mathematics that studies symmetries, which is broadly used in Machine Learning (ML)
2. The link to an anonymous repository is available in the official comments on OpenReview, accessible to reviewers and area chairs. The repository will be made public upon paper acceptance.
3. A point-transformation g is a finite, invertible, continuous, and differentiable function of q (Lanczos, 2020).
4. Some authors refer to this property as covariance of the EoMs (Wheeler, 2014; Lanczos, 2020).
5. Technically, the topology of $Q \doteq E_d \times Q_J$ is referred to as a trivial principal fiber bundle (Ostrowski & Burdick, 1996), with $E_d$ as the fiber Lie group and $Q_J$ as the base space. Note that this topology applies to a larger range of dynamical systems than merely floating-base ones.
6. We deliberately abuse notation to keep the homogeneous matrix representation $X_B$ of position and orientation, instead of the vector-quaternion representation common in robotics and computer graphics.
7. These symmetries are commonly studied since, in conservative systems, translational and rotational symmetries imply the conservation of linear and angular momentum, while time symmetries imply the conservation of energy (Noether, 1918).
8. This covers most flying/swimming/legged robots, animals and animated characters (supplementary A).
9. A similar analysis can be made for the bias vector b.
10. Note that Finzi et al. (2021b) always considers a direct product of the input-output symmetry actions. However, for DMS, as the input and output groups are isomorphic, a direct product over-constrains the models to symmetries not present in the data.
11. See Pierre Ouannes' blog: https://pouannes.github.io/blog/initialization/#xavier-and-kaiming-initialization



Figure 1: Left: Cayley diagram and top view (see 3D animation) of symmetric configurations of the quadruped robot Solo, whose morphological symmetries (described by the Klein four-group $K_4$) allow it to imitate the effect of reflections ($g_s$, $g_t$) and $180^{\circ}$ rotations of space ($g_r$). Transformations affect both proprioceptive (state space, CoM linear $l$ and angular $k$ momentum) and exteroceptive (terrain elevation, external disturbances) quantities. Right: Diagram of a $K_4$-equivariant NN. Each of the layer's linear maps $W$ is constructed as a weighted average of the basis of the space of equivariant linear maps $B$, computed from the $K_4$ symmetries of the input-output spaces (see section 5).

fig 5 the identical replication of fingers in the TriFinger robot, or the reflected arms and legs of the humanoid Atlas supp.fig 6a). In such architectures, swapping identical/reflected bodies (and thus subchains of the tree) can satisfy the inertia invariance required for $\rho_{E_d}(g) X = \rho_{E_d}(g) X \rho_{E_d}(g)^{-1}$ without requiring symmetric mass distributions. Refer to appendix C.3 and supp.fig 5 for details and examples.

4.3 IDENTIFICATION OF DMS GROUP G IN RIGID-BODY DYNAMICS

The identification of the DMS group G of a floating-base dynamical system composed of rigid bodies can be outlined in four steps (see simple examples in supp.fig 5):

1. Identify the configuration $X_B$ and its associated inertia $I_B$. Usually the base body or the CoM.

2. Identify the symmetries in mass distribution as invariances to Euclidean transformations $g \in E_d$ of the reflected $I_B$.

fig 8). Where λ B = :k and γ_σ is a nonlinearity-dependent scalar (e.g., γ_ReLU = 1/2, γ_SELU = 1 following Klambauer et al. (

Figure 2: CoM-estimation results comparing MLP, MLP-aug, and E-MLP models. Left and Middle: Test-set sample efficiency, for robots Solo and Atlas, of model variants with different capacities (number of hidden-layer neurons hc). Right: Sample efficiency for robot Solo (fig. 1) of models with hc = 512, when exploiting G = K_4 (sagittal and transversal symmetries) and G = C_2 = {e, g_s} ⊂ K_4 (only sagittal symmetry). Reported values represent the average and standard deviation across 10 different seeds.
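The MLP-aug baseline in fig. 2 relies on symmetry-based data augmentation: each training pair (x, y) is replaced by (ρ_x(g) x, ρ_y(g) y) for a randomly drawn g ∈ G. A minimal sketch of this idea follows. The function name, the toy C_2 representations, and the input/output dimensions are assumptions for illustration and do not reproduce the paper's actual representations.

```python
import numpy as np


def augment_batch(x, y, reps_x, reps_y, rng):
    """Apply one randomly chosen group element to every sample.

    x: (batch, n) inputs, y: (batch, m) targets.
    reps_x: (|G|, n, n) and reps_y: (|G|, m, m) stacked representations,
    ordered so that reps_x[i] and reps_y[i] belong to the same g.
    """
    idx = rng.integers(len(reps_x), size=x.shape[0])
    x_aug = np.einsum("bij,bj->bi", reps_x[idx], x)
    y_aug = np.einsum("bij,bj->bi", reps_y[idx], y)
    return x_aug, y_aug


# Toy C_2 = {e, g_s}: the input swaps left/right coordinates,
# the output flips the sign of its second component (hypothetical choice).
swap4 = np.eye(4)[[1, 0, 3, 2]]
flip2 = np.diag([1.0, -1.0])
reps_x = np.stack([np.eye(4), swap4])
reps_y = np.stack([np.eye(2), flip2])

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
y = rng.standard_normal((8, 2))
x_aug, y_aug = augment_batch(x, y, reps_x, reps_y, rng)
```

Augmentation leaves the model unconstrained, so equivariance is only encouraged through the data, whereas the E-MLP enforces it by construction.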

We test two robots: (1) Atlas, an n_J = 30 DoF humanoid robot with a reflection DMS group G = C_2 (see supp.fig 6a). (2) Solo, an n_J = 12 DoF quadruped robot with the Klein four-group as DMS group G = K_4 (see fig. 1-left). We compare three variants of the function approximator: a standard Multi-Layer Perceptron (MLP), an MLP trained using data augmentation (MLP-aug), and an MLP with hard equivariance constraints (E-MLP). In fig. 2-left and fig. 2-middle, we compare the model variants. For both robots and all model capacities, E-MLP and MLP-aug outperform MLP in sample efficiency (better generalization with less data) and in robustness to overfitting when training data is scarce. Comparing E-MLP and MLP-aug, the lower-capacity versions behave similarly, but as capacity increases, E-MLP shows better sample efficiency and generalization. Lastly, in fig. 2-right we compare, for the robot Solo, the performance of the model variants when exploiting the robot's entire symmetry group (K_4) and a subgroup of it (C_2 ⊂ K_4). The results indicate that sample efficiency and generalization capacity increase with the number of true symmetries of the data exploited.

Static-Friction-Regime Contact Detection (Classification): This experiment uses the dataset presented in Lin et al. (

Figure 3: Static-Friction-Regime contact detection results comparing CNN, CNN-aug, and ECNN. Left: Sample efficiency in log-log scale. Middle: Average legs F1-score vs. training samples. Right: Test-set classification metrics of models trained with the entire training set. Selected metrics are contact-state (y ∈ R^16) accuracy (Acc) and F1-score (F1) for each leg's binary contact state. Due to the sagittal symmetry of the robot, the left front (LF) and right front (RF) legs are expected to be symmetric, as are the left hind (LH) and right hind (RH) legs. F1-score is presented considering the dataset class imbalance (see supplementary E.4 and supp.fig 7). Reported values represent the average and standard deviation across 8 different seeds.
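The 16-dimensional contact state in fig. 3 corresponds to the 2^4 combinations of four binary leg contacts. The sketch below shows one way such a state can be encoded and how the sagittal reflection g_s acts on it by swapping LF with RF and LH with RH. The bit ordering (LF, RF, LH, RH) and all function names are assumptions, and the dataset's actual class ordering may differ.

```python
import numpy as np

LEGS = ("LF", "RF", "LH", "RH")  # assumed bit order, least significant first


def encode(contacts):
    """Map four binary leg contacts (in LEGS order) to a class index 0..15."""
    return sum(int(c) << i for i, c in enumerate(contacts))


def decode(state):
    """Inverse of encode: class index -> tuple of four booleans."""
    return tuple(bool((state >> i) & 1) for i in range(4))


def reflect_state(state):
    """Action of the sagittal reflection: swap LF<->RF and LH<->RH."""
    lf, rf, lh, rh = decode(state)
    return encode((rf, lf, rh, lh))


# Permutation matrix acting on one-hot (or probability) vectors in R^16.
rho_y_gs = np.zeros((16, 16))
for s in range(16):
    rho_y_gs[reflect_state(s), s] = 1.0

only_lf = encode((True, False, False, False))
only_rf = reflect_state(only_lf)
```

A matrix like rho_y_gs plays the role of the output representation ρ_y(g_s) for the classifier: an equivariant model that sees reflected inputs should produce class probabilities permuted by it.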

); Ordonez-Apraez et al. (2022); Abdolhosseini et al. (2019); Wu et al. (2022).

o and I g o as the original and transformed bodies inertias referenced to o, and I g o as the reflected body inertia referenced to the reflected Euclidean space o. In order to comply with supp.eq 11, we must ensure g

fig 4-right, for Bolt's legs in supp.fig 5, and for Solo's legs in fig. 1

C.4 SYMMETRIC POSITION AND VELOCITY CONSTRAINT CONFIGURATION SPACES

.1 DETERMINATION OF THE INPUT AND OUTPUT REPRESENTATIONS ρ x (g), ρ y (g) | g ∈ G

g, g)|g ∈ G, g ∈ E 3

Supplementary Table 1: Comparison of memory complexity of individual layers of the equivariant version of Contact-CNN (Lin et al., 2021) (ECNN). This example compares the sparse and dense representations of the matrices B ∈ R^{mn×r} and the |G| group action representations ρ_w(g) ∈ R^{mn×mn}, for the symmetry group G = C_2 of the Mini-Cheetah robot, with r = mn/2 (see supp.eq 13). Here, n and m denote the input and output dimensions of each layer. The dense memory complexity of all action representations increases with the group order |G|, while the memory complexity of B decreases with larger group orders (since r ≤ mn becomes smaller). We assume 32-bit floating-point representations. Memory is reported in bytes.

For a G-equivariant layer, the number of trainable parameters r is determined by the fix-points of the action representations, denoting χ^1_ρ(g) : G → N as the number of fix-points of the action representation ρ(g). Therefore, the number of trainable parameters can range from |w|/|G| ≤ r ≤ |w|, depending on the fix-points of the layer's input and output spaces.
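To make the parameter count concrete, the sketch below computes a basis of equivariant linear maps numerically as the null space of the constraints ρ_y(g) W = W ρ_x(g), and compares its dimension r with the standard character-theory count (1/|G|) Σ_g tr ρ_x(g) tr ρ_y(g). This is a generic construction under stated assumptions, not necessarily the procedure used in the paper. The toy C_2 permutation representations and the function name are hypothetical.

```python
import numpy as np


def equivariant_basis(reps_x, reps_y, tol=1e-10):
    """Basis B of shape (m, n, r) for maps W with rho_y(g) W = W rho_x(g)."""
    n = reps_x[0].shape[0]
    m = reps_y[0].shape[0]
    # Row-major vectorisation: vec(A W C) = (A kron C^T) vec(W).
    constraints = [
        np.kron(ry, np.eye(n)) - np.kron(np.eye(m), rx.T)
        for rx, ry in zip(reps_x, reps_y)
    ]
    C = np.concatenate(constraints, axis=0)
    _, s, vh = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    null_space = vh[rank:]  # (r, m * n), orthonormal rows
    return null_space.reshape(-1, m, n).transpose(1, 2, 0)


# Toy C_2 example: both spaces are acted on by left/right swaps.
swap4 = np.eye(4)[[1, 0, 3, 2]]
swap2 = np.eye(2)[[1, 0]]
reps_x = [np.eye(4), swap4]
reps_y = [np.eye(2), swap2]

B = equivariant_basis(reps_x, reps_y)
r = B.shape[-1]

# Character count of the same dimension (traces equal fix-point counts
# for plain permutation representations).
r_characters = round(
    sum(np.trace(rx) * np.trace(ry) for rx, ry in zip(reps_x, reps_y))
    / len(reps_x)
)

# Any weighted combination of the basis is an equivariant linear map.
c = np.random.default_rng(0).standard_normal(r)
W = np.einsum("mnk,k->mn", B, c)
```

In this toy case the swaps have no fix-points, so r = mn/2 = 4, which is the lower bound |w|/|G| for |G| = 2. A dense float32 copy of B would then take m·n·r·4 bytes.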

D.2 PARAMETER INITIALIZATION OF EQUIVARIANT LAYERS FOR DMS

Consider an equivariant neural network architecture composed of multiple equivariant linear (or convolutional) layers of the form ^l y := σ(^l W ^l x + ^l b), where l is the layer index, ^l x ∈ R^n and ^l y ∈ R^m are the layer's input and output vector spaces, ^l W := Σ_k^r ^l c_k ^l B_{:,:,k} ∈ R^{m×n} is the layer's linear map, ^l B ∈ R^{m×n×r} holds the layer's r basis vectors spanning the space of equivariant linear maps, ^l c ∈ R^r are the layer's trainable parameters, and ^l b ∈ R^m is the layer's bias vector. For the optimal flow of information throughout the network, it is relevant to initialize the trainable parameters such that the variance of activations (during inference/forward-propagation) and gradients (during back-propagation) is kept constant, preventing activations/gradients from vanishing or exploding (Glorot & Bengio, 2010; footnote 11). The derivation is based on the equivalent process for unconstrained layers presented in He et al. (2015). Let the layer's activations before the non-linearity be denoted by ^l z = ^l W ^l x + ^l b, such that ^l y = σ(^l z), and note that ^l x = ^{l-1} y. Furthermore, we assume the elements of ^l c and ^l x are mutually independent and sampled from two independent distributions, denoting the random variables of the two distributions as ^l c and ^l x.

2021), which we retrained using the same hyperparameters reported by the authors) we ran a grid search in log-scale among 20 different learning rates. In this scenario, we always used the entire training dataset and optimized w.r.t. the loss computed on the entire validation partition. The learning rate values used for each model are listed in supp.table 2.

E.3 EXPERIMENT: COM MOMENTUM ESTIMATION

The dataset for the CoM estimation experiment was generated using Pinocchio (Carpentier et al., 2019), which in turn uses the URDF models of the robots Solo and Atlas to extract the kinematic and dynamic parameters required to compute the Centroidal Momentum Matrix A_G(q) (Orin et al., 2013), with which computing the CoM momentum reduces to supp.eq 17, the analytical G-equivariant function to compute the CoM momentum. In turn, supp.eq 18 is the approximation of this function by a G-equivariant NN with parameters ϕ.
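For orientation, the relation between the Centroidal Momentum Matrix and the CoM momentum can be written as below. This is a sketch in standard centroidal-dynamics notation (Orin et al., 2013) and is not a verbatim copy of supp.eq 17 and supp.eq 18. The symbols h, x, y, and f_ϕ are assumed names, with l and k denoting the linear and angular momentum as in fig. 1.

```latex
% CoM momentum from the Centroidal Momentum Matrix (Orin et al., 2013)
h := \begin{bmatrix} l \\ k \end{bmatrix} = A_G(q)\,\dot{q}

% G-equivariance of the analytical map, with x = (q, \dot{q}) and y = h:
\rho_y(g)\, h = A_G(g \cdot q)\,(g \cdot \dot{q}) \qquad \forall\, g \in G

% Learned approximation by a G-equivariant network with parameters \phi:
h \approx f_\phi(q, \dot{q}), \qquad
\rho_y(g)\, f_\phi(x) = f_\phi(\rho_x(g)\, x) \qquad \forall\, g \in G
```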

E.3.1 DETERMINATION OF THE INPUT AND OUTPUT

Both robots Solo and Atlas evolve in the Euclidean space of 3 dimensions. Therefore their configuration space can be decoupled into Q := E_3 × Q_J. After identifying their symmetry groups and their corresponding E 3 and, identifying the representations of the input and output spaces of the NN function approximator (supp.eq 18) becomes a trivial task considering that:

Because we are interested in studying the generalization capacity of the models and the out-of-training-distribution performance, we modified this partitioning such that among the 15 different recordings we randomly selected 5 recordings for testing, and the remaining 10 recordings were used for training, splitting these recordings into (85%, 15%) training and validation splits as in Lin et al. (2021). That is, for each recording, the first 85% of the data samples are used for training and the remainder for validation. The selected training-validation recordings were: air walking gait, concrete difficult slippery, concrete left circle, middle pebble, rock road, asphalt road, concrete galloping, grass, old asphalt road, sidewalk. The selected testing recordings were: air jumping gait, concrete pronking, concrete right circle, forest, small pebble.
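The partitioning described above can be sketched as follows. The recording names are copied from the text. The function name and the use of integer percentages are illustrative assumptions rather than the authors' code.

```python
TRAIN_VAL_RECORDINGS = [
    "air walking gait", "concrete difficult slippery", "concrete left circle",
    "middle pebble", "rock road", "asphalt road", "concrete galloping",
    "grass", "old asphalt road", "sidewalk",
]
TEST_RECORDINGS = [
    "air jumping gait", "concrete pronking", "concrete right circle",
    "forest", "small pebble",
]


def chronological_split(n_samples, train_percent=85):
    """Index ranges for one recording: first part trains, the rest validates."""
    cut = n_samples * train_percent // 100
    return range(0, cut), range(cut, n_samples)


# Example with a hypothetical recording of 1000 samples.
train_idx, val_idx = chronological_split(1000)
print(len(train_idx), len(val_idx))
```

Holding out whole recordings for testing, rather than a slice of every recording, is what makes the test set probe out-of-training-distribution terrains and gaits.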

E.5 MITIGATION OF SUBOPTIMAL ASYMMETRIES IN MODEL PERFORMANCE

When comparing individual leg classification we see that the equivariant model converges to having a similar performance for each symmetric pair of legs, while the unconstrained models converge to an asymmetrical suboptimal state favoring the contact detection of one leg at the expense of reduced performance for the symmetric leg (see LF and RF f1-scores). This asymmetrical performance is attributed to the CNN and CNN-aug models learning to extract temporal features for both symmetric legs separately, increasing the likelihood of converging to asymmetrical local minima. On the contrary, the equivariant model E-CNN can be thought of as learning to extract a single set of symmetric temporal features for each symmetric pair of states (a consequence of the model's equivariance and parameter sharing). This implies that the temporal features used for determining the contact state of, say the left frontal leg, would also be used to determine the contact state of the symmetric leg, the right frontal leg, and vice-versa.

E.6 EQUIVARIANT CONV1D LAYERS

For details on the construction of the equivariant 1D convolutional layers refer to 2. Note that the symmetry of a single time-sample z_i is shared across all time-samples y = {z_i}_{i=0}^{150}.

Supplementary = n /(λ l B λσ), respectively. In these cases, the variance of activations through the network depth remains nearly constant, as desired. The last two rows show the initialization of layer parameters with a constant variance of 0.05^2 and 0.8^2, illustrating scenarios of vanishing and exploding activations. All architectures are composed of 10 layers with 256 neurons in the intermediate layers. In the equivariant case, the network is K_4-equivariant.
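The vanishing and exploding regimes mentioned above can be reproduced for the unconstrained baseline with a few lines of NumPy. The sketch below assumes a plain ReLU MLP with zero biases, 10 layers, and 256 neurons per layer, and compares the He et al. (2015) standard deviation sqrt(2/n) against the constant values 0.05 and 0.8. It does not implement the equivariant initialization derived in D.2, and the function name and batch size are illustrative.

```python
import numpy as np


def final_activation_scale(weight_std, depth=10, width=256, batch=512, seed=0):
    """Mean squared activation after `depth` ReLU layers with i.i.d. weights."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))  # unit-variance inputs
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * weight_std
        x = np.maximum(x @ W.T, 0.0)  # ReLU, zero bias
    return float(np.mean(x ** 2))


he_scale = final_activation_scale(np.sqrt(2.0 / 256))  # He et al. (2015)
small_scale = final_activation_scale(0.05)             # vanishing regime
large_scale = final_activation_scale(0.8)              # exploding regime
print(he_scale, small_scale, large_scale)
```

Each ReLU layer multiplies the second moment of the activations by roughly n·Var(w)/2, which is 1 for the He choice, about 0.32 for a standard deviation of 0.05, and about 82 for 0.8, so ten layers are enough to separate the three regimes by many orders of magnitude.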

