FACTORS INFLUENCING GENERALIZATION IN CHAOTIC DYNAMICAL SYSTEMS

Abstract

Many real-world systems exhibit chaotic behaviour, for example weather, fluid dynamics, stock markets, natural ecosystems, and disease transmission. While chaotic systems are often thought to be completely unpredictable, there are in fact patterns within and across such systems that experts frequently describe and contrast qualitatively. We hypothesise that, given the right supervision and task definition, representation learning systems will be able to pick up on these patterns and successfully generalize both in- and out-of-distribution (OOD). This work therefore explores and identifies key factors that lead to good generalization. We observe a variety of interesting phenomena, including: learned representations transfer much better when fine-tuned rather than frozen; forecasting appears to be the best pre-training task; OOD robustness falls off very quickly outside the training distribution; and recurrent architectures generally outperform others on OOD generalization. Our findings are of interest to any domain of prediction where chaotic dynamics play a role.

1. INTRODUCTION

There are many reasons to be interested in understanding and predicting the behaviour of chaotic systems. For example, the current climate crisis is arguably the most important issue of our time. From atmospheric circulation and weather prediction to economic and social patterns, chaotic dynamics appear in much of the data relevant to mitigating the impacts of, and adapting to, climate change. Most natural ecosystems exhibit chaos; a better understanding of the mechanisms of our impact on our environment is essential to ensuring a sustainable future on our planet. The spread of information in social networks, many aspects of market economies, and the spread of diseases all have chaotic dynamics too, and of course these are not isolated systems; they all interact in complex ways, and the interaction dynamics can also exhibit chaos.

This makes chaotic systems a compelling challenge for machine learning, particularly representation learning: Can models learn representations that capture high-level patterns and are useful across other tasks? Which losses, architectures, and other design choices lead to better representations? These are some of the questions which we aim to answer. Our main contributions are:

• The development of a lightweight evaluation framework, ValiDyna, to evaluate representations learned by deep-learning models in new tasks, new scenarios, and on new data.
• The design of experiments using this framework, showcasing its usefulness and flexibility.
• A comparative analysis of 4 popular deep-learning architectures using these experiments.

Table 1:

model        S   C   F   S↛C   F↛C   S↛F   C↛F   F→S   F→C   C→S   C→F
GRU          ✓   ✓   -   ✓     ✓     -     -     ✓     ✓     ✓     -
LSTM         ✓   ✓   -   ✓     ✓     -     -     ✓     ✓     ✓     -
Transformer  ✓   ✓   -   ✓     ✓     -     -     ✓     -     -     -
N-BEATS      -   -   -   -     -     -     -     -     -     -     -



Table 1: Summary of the generalisation results. S, C, and F stand for the tasks of Supervised featurisation, Classification, and Forecasting. A ↛ B and A → B indicate strict (see section 5.2) and loose (see section 5.3) feature transfer from task A to task B. All runs generalise in-distribution. ✓ and - indicate whether or not the model-run pair achieves OOD generalisation in the final task.
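As a concrete illustration of the sensitivity to initial conditions that makes chaotic systems hard to predict, the following sketch integrates the Lorenz system, a standard chaotic benchmark (not one of this paper's experimental systems), and tracks how a tiny perturbation of the initial condition grows to the scale of the attractor itself. The explicit-Euler integrator, step size, and parameter values are illustrative assumptions, not choices taken from the paper.

```python
import math

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # One explicit-Euler step of the Lorenz system (illustrative only;
    # a careful study would use a higher-order integrator such as RK4).
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)

def simulate(state, steps):
    # Return the full trajectory so separations can be inspected over time.
    traj = [state]
    for _ in range(steps):
        state = lorenz_step(state)
        traj.append(state)
    return traj

# Two trajectories whose initial conditions differ by only 1e-8 in x.
ta = simulate((1.0, 1.0, 1.0), 3000)
tb = simulate((1.0 + 1e-8, 1.0, 1.0), 3000)

# The separation grows roughly exponentially until it saturates at the
# size of the attractor, despite the near-identical starting points.
max_sep = max(math.dist(a, b) for a, b in zip(ta, tb))
```

A forecasting model trained on such a system therefore cannot hope for long-horizon point accuracy; what it can learn, as argued above, are the qualitative patterns shared within and across systems.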

