EIGENVALUE INITIALISATION AND REGULARISATION FOR KOOPMAN AUTOENCODERS

Abstract

Regularising the parameter matrices of neural networks is ubiquitous in the training of deep models. Typical regularisation approaches initialise weights with small random values and penalise weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model, which comprises an encoder, a Koopman operator layer, and a decoder. These models are designed and dedicated to tackling physics-related problems, offering interpretable dynamics and an ability to incorporate physics-related constraints. However, the majority of existing work employs standard regularisation practices. In our work, we take a step toward augmenting Koopman autoencoders with initialisation and penalty schemes tailored for physics-related settings. Specifically, we propose the "eigeninit" initialisation scheme, which samples initial Koopman operators from specific eigenvalue distributions. In addition, we suggest the "eigenloss" penalty scheme, which penalises the eigenvalues of the Koopman operator during training. We demonstrate the utility of these schemes on two synthetic data sets, a driven pendulum and flow past a cylinder, and two real-world problems, ocean surface temperatures and cyclone wind fields. On these datasets, we find that eigenloss and eigeninit improve the convergence rate by up to a factor of five, and that they reduce the cumulative long-term prediction error by up to a factor of three. These findings point to the utility of incorporating similar schemes as an inductive bias in other physics-related deep learning approaches.

1. INTRODUCTION

Modern neural networks are often overparameterised, i.e., their number of learnable parameters is significantly larger than the number of available training samples (Allen-Zhu et al., 2019a;b). To guide optimisation through this immense parameter space, and to potentially improve performance by avoiding overfitting, neural networks are trained with regularisation techniques (Goodfellow et al., 2016). The importance of regularisation has been shown in both the theory and practice of deep learning. Prominent examples include the initialisation of parameter matrices (He et al., 2015; Hanin & Rolnick, 2018) and constraining the parameters' norm via loss penalties (Hinton, 1987; Krogh & Hertz, 1991). Initialising weights with small random values and penalising them with weight decay are arguably the most common regularisation techniques employed in training deep models with stochastic gradient descent algorithms. However, specific neural architectures, data domains, and learning problems may require different initialisation and penalty schemes. In this paper, we empirically study the effect of regularisation on physics-aware architectures.

The ground-breaking success of deep learning in solving complex tasks in vision and other domains has inspired the physics community to develop deep models suited to real-world problems arising in the field (Willard et al., 2020; Karniadakis et al., 2021). In this context, we focus on dynamical systems analysed and processed using Koopman-based approaches (Takeishi et al., 2017; Lusch et al., 2018). Koopman theory (Koopman, 1931) shows that, under certain assumptions, nonlinear finite-dimensional systems can be transformed to a linear (albeit infinite-dimensional) representation via the Koopman operator. Finite-dimensional approximations of this Koopman operator are advantageous because they allow dynamical systems to be analysed and understood with linear tools.
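To make the two schemes concrete, the following is a minimal NumPy sketch, not the paper's implementation; the eigenvalue distribution (uniform on the unit disk), the block-diagonal construction, and the unit-circle penalty are all illustrative assumptions, and `eigeninit`/`eigenloss` are here only hypothetical function names borrowed from the scheme names:

```python
import numpy as np

def eigeninit(n, rng, radius=1.0):
    """Sketch of an eigeninit-style scheme (assumed, not the paper's):
    build a real n x n operator whose eigenvalues are sampled from a
    chosen distribution -- here, uniform on a disk of the given radius,
    so the initial dynamics are at most marginally stable."""
    assert n % 2 == 0, "conjugate eigenvalue pairs require even n"
    # Sample n/2 complex eigenvalues; their conjugates come for free.
    r = radius * np.sqrt(rng.uniform(size=n // 2))   # area-uniform moduli
    theta = rng.uniform(0.0, np.pi, size=n // 2)
    # Each conjugate pair a +/- bi is realised as a real 2x2
    # rotation-scaling block with eigenvalue modulus r.
    K = np.zeros((n, n))
    for i, (ri, ti) in enumerate(zip(r, theta)):
        K[2 * i:2 * i + 2, 2 * i:2 * i + 2] = ri * np.array(
            [[np.cos(ti), -np.sin(ti)],
             [np.sin(ti),  np.cos(ti)]])
    # A random orthogonal change of basis hides the block structure
    # without moving the eigenvalues (similarity transform).
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q @ K @ Q.T

def eigenloss(K, weight=1e-2):
    """Sketch of an eigenloss-style penalty (assumed form): penalise
    eigenvalue moduli that exceed 1, discouraging unstable dynamics."""
    moduli = np.abs(np.linalg.eigvals(K))
    return weight * np.sum(np.maximum(moduli - 1.0, 0.0) ** 2)

K = eigeninit(4, np.random.default_rng(0))
print(np.abs(np.linalg.eigvals(K)))  # all moduli lie within the unit disk
print(eigenloss(K))                  # zero penalty for a stable operator
```

Using conjugate pairs realised as rotation-scaling blocks keeps the sampled operator real-valued while placing its spectrum exactly where the chosen distribution dictates; in a trainable model, the `eigenloss` term would be computed with a differentiable eigenvalue routine and added to the reconstruction loss.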
Despite the theoretical and practical advances that have significantly improved Koopman-based learning methods, the majority of existing models still apply regularisation practices designed for general neural networks. Our investigation aims to answer the research

