LEARNING TO ACT THROUGH ACTIVATION FUNCTION OPTIMIZATION IN RANDOM NETWORKS

Abstract

Biological neural networks are characterised by a high degree of neural diversity, a trait that artificial neural networks (ANNs) generally lack. Additionally, learning in ANNs is typically synonymous with modifying only the strengths of connection weights. However, there is much evidence from neuroscience that different classes of neurons each have crucial roles in the information processing done by the network. In nature, each neuron is a dynamical system that is a powerful information processor in its own right. In this paper we ask the question: how well can ANNs learn to perform reinforcement learning tasks through the optimization of neural activation functions alone, without any weight optimization? We demonstrate the viability of the method and show that the neural parameters are expressive enough to allow learning three different continuous control tasks without weight optimization. These results open up more possibilities for future synergies between synaptic and neural optimization in ANNs. Code is available from [anonymised].

1. INTRODUCTION

Artificial neural networks (ANNs) have been shown to be able to learn a wide variety of different tasks (Schmidhuber, 2015). With inspiration from their biological counterparts (Hassabis et al., 2017), ANNs have dramatically pushed the boundaries of what is achievable for artificial intelligence technologies. ANNs are trained by tuning a large number of parameters, each of which provides a small contribution to the final output of the network. Likewise, most, but not all (Titley et al., 2017), learning and behavioral changes are manifested in the biological brain as long- or short-term potentiation or depression of synapses between neurons (Stiles, 2000). Neurons of the human brain are characterised by a high degree of diversity (Lillien, 1997; Soltesz et al., 2006), and different classes of neurons respond differently to incoming signals (Izhikevich, 2003). A single biological neuron is a sophisticated processor in its own right (Izhikevich, 2007; Poirazi et al., 2003), with information processing occurring at several stages in its dendrites (Beaulieu-Laroche et al., 2018; Magee, 2000), cell body, and axon terminals (Kamiya & Debanne, 2020; Rama et al., 2018). Neurons of various classes interconnect in intricate circuits (Breedlove & Watson, 2013; Kandel et al., 2000). This suggests that at least part of the explanation behind the impressive ability of biological networks to learn and retain knowledge must be found in the interplay between the abundance of different neuron types (Nusser, 2018). While the diversity of biological neurons is well documented, in ANNs it is common for all hidden neurons to share a single activation function.
Intrigued by the interesting properties of randomly-initialised networks in both machine learning (Gaier & Ha, 2019; Najarro & Risi, 2020; Ulyanov et al., 2018) and neuroscience (Lindsay et al., 2017), we are interested in the computational expressivity of optimizing only parameterized neural activation functions, without any weight optimization. As described below, our approach allows every neuron in our ANNs to be a unique dynamical system. We apply our method to three diverse continuous control tasks: the simpler CartPoleSwingUp task (Gaier & Ha, 2019), the locomotion of a bipedal robot (Brockman et al., 2016), and a vision-based car racing task with procedurally generated tracks (Brockman et al., 2016). The results show that the method performs well on all three tasks, outperforming weight-optimized networks that have a
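To make the setting concrete, the core idea of optimizing activation functions in a fixed random network can be sketched as follows. This is a minimal illustration, not the paper's implementation: the network sizes, the three-parameter activation family a·tanh(b·x + c), and the forward-pass details are all assumptions chosen for clarity; the paper's actual per-neuron parameterization is described later.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random weights: sampled once and never trained.
# (Sizes are illustrative: 4 observations -> 32 hidden -> 2 actions.)
W1 = rng.standard_normal((32, 4))
W2 = rng.standard_normal((2, 32))

# Per-neuron activation parameters: the ONLY trainable quantities.
# Each hidden neuron i computes a_i * tanh(b_i * x + c_i), a
# hypothetical parameterized family used here for illustration.
theta = rng.standard_normal((32, 3)) * 0.1

def forward(obs, theta):
    pre = W1 @ obs                       # fixed random projection
    a, b, c = theta[:, 0], theta[:, 1], theta[:, 2]
    hidden = a * np.tanh(b * pre + c)    # unique activation per neuron
    return W2 @ hidden                   # fixed random readout

# A black-box optimizer (e.g., an evolution strategy) would perturb
# only `theta`, evaluate episode returns, and keep the best candidates.
obs = rng.standard_normal(4)
action = forward(obs, theta)
print(action.shape)  # (2,)
```

Note that the search space is tiny compared to weight optimization: here 32 × 3 = 96 activation parameters versus 192 weights, and the gap grows with network size, which is part of what makes the question of expressivity interesting.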

