ACCELERATING SPIKING NEURAL NETWORK TRAINING USING THE d-BLOCK MODEL

Abstract

There is growing interest in using spiking neural networks (SNNs) to study the brain in silico and in emulating them on neuromorphic computers due to their lower energy consumption compared to artificial neural networks (ANNs). Significant progress has been made in directly training SNNs to perform on a par with ANNs in terms of accuracy. However, these methods are slow due to their sequential nature and require careful network regularisation to avoid overfitting. We propose a new SNN model, the d-block model, with stochastic absolute refractory periods and recurrent conductance latencies, which reduces the number of sequential computations using fast vectorised operations. Our model achieves accelerated training speeds and state-of-the-art performance across various neuromorphic datasets, without the need for any regularisation and using fewer spikes than standard SNNs.

1. INTRODUCTION

Artificial neural networks (ANNs) are ubiquitous in achieving state-of-the-art performance across various domains, such as image recognition (He et al., 2016), natural language processing (NLP) (Brown et al., 2020) and computer games (Silver et al., 2017; Vinyals et al., 2019). These networks have also proven useful for studying the brain due to their architectural similarities (Richards et al., 2019) and have further advanced our understanding of the computational processes underlying the visual and auditory systems (Harper et al., 2016; Singer et al., 2018; Cadena et al., 2019; Francl & McDermott, 2022; Yamins & DiCarlo, 2016). However, ANNs have been criticised for their substantial energy demands resulting from their continued exponential growth in size (Strubell et al., 2019; Schwartz et al., 2020), as exemplified by the GPT language models scaling from 110 million to 1.5 billion to 175 billion parameters to deliver ever-improving advances across various NLP tasks (Radford et al., 2018; 2019; Brown et al., 2020). Furthermore, the applicability of ANNs to neuroscience is confined by their activation function, as the brain employs spikes rather than the continuous-valued outputs used by ANN units. Spiking neural networks (SNNs) are a type of binary neural network (Figure 1a) which overcome these challenges: they consume drastically less energy than ANNs when deployed on neuromorphic hardware (Wunderlich et al., 2019), and their biological realism makes them a favourable model for studying the brain in silico (Vogels et al., 2011; Denève & Machens, 2016). However, SNN training remains a challenging problem due to the non-differentiable binary activation function employed by spiking neurons.
This has historically resulted in solutions that impose constraints on the neurons, such as rate codes (O'Connor et al., 2013; Esser et al., 2015; Rueckauer et al., 2016), or only allowing neurons to spike at most once (Bohte et al., 2002; Mostafa, 2017; Comsa et al., 2020). A recent proposal known as surrogate gradient training can overcome these limitations and has been shown to improve training on challenging datasets using models of increasing biological realism (Eshraghian et al., 2021). Surrogate gradient training replaces the undefined derivative of the neuron's activation function with a surrogate function and uses backpropagation through time (BPTT) for training, since SNNs are a particular form of recurrent neural network (RNN) (Neftci et al., 2019). SNNs thus inherit many shortcomings associated with RNNs, such as their notably slow training times resulting from their sequential nature (Kuchaiev & Ginsburg, 2017; Vaswani et al., 2017). Furthermore, SNNs require multiple regularisation terms to avoid

