CBP-QSNN: SPIKING NEURAL NETWORKS QUANTIZED USING CONSTRAINED BACKPROPAGATION

Abstract

Spiking Neural Networks (SNNs) support sparse event-based data processing with high power efficiency when implemented on event-based neuromorphic processors. However, the limited on-chip memory capacity of neuromorphic processors strictly limits the depth and width of the SNNs that can be implemented. A direct solution is to use quantized SNNs (QSNNs) in place of SNNs with FP32 weights. To this end, we propose a method that quantizes weights using constrained backpropagation (CBP) with a Lagrangian function (the conventional loss function plus well-defined weight-constraint functions) as the objective function. This work uses CBP as a post-training algorithm for deep SNNs pre-trained with various state-of-the-art methods, including direct training (TSSL-BP, STBP, and surrogate gradient) and DNN-to-SNN conversion (SNN-Calibration), validating CBP as a general framework for QSNNs. CBP-QSNNs achieve high accuracy: the worst-case accuracy degradation on CIFAR-10, DVS128 Gesture, and CIFAR10-DVS is less than 1%. Notably, CBP-QSNNs for SNN-Calibration-pretrained SNNs on CIFAR-100 show an unexpectedly large accuracy increase of 3.72% while using little weight-memory (3.5% of the FP32 case).

1. INTRODUCTION

Spiking Neural Networks (SNNs) are time-dependent models with spiking neurons whose dynamics, in conjunction with synaptic current dynamics, constitute the rich dynamics of SNNs (Jeong, 2018). Deep SNNs are clearly distinguished from deep neural networks (DNNs) in that (i) presynaptic spiking neurons send out 1-bit data (spikes, a.k.a. events) to their postsynaptic neurons, unlike DNN nodes sending real-valued activation values to the nodes in the next layer, and (ii) SNN operations are based on asynchronous sparse spikes, unlike DNN operations based on layerwise synchronous activation calculations (Jeong, 2018; Pfeiffer & Pfeil, 2018). These distinct features endow SNNs with high power efficiency given minimal data movement and high sparsity in operations. Yet, SNNs leverage this efficiency only when implemented on neuromorphic processors that support event-based operations. Neuromorphic processor design technologies are diverse, e.g., mixed analog/digital circuits (Merolla et al., 2014a; Moradi et al., 2018; Neckar et al., 2019) and fully digital circuits (Merolla et al., 2014b; Davies et al., 2018; Frenkel et al., 2018; Kornijcuk et al., 2019). Albeit diverse, all designs commonly suffer from limited on-chip memory (SRAM) capacity. The on-chip memory is mainly assigned to neurons (state variables and hyperparameters), synapses (weights, state variables, and hyperparameters), and the event router (lookup tables). The largest portion of on-chip memory is dedicated to synaptic weights given the significant number of synapses in a deep SNN. Additionally, most neuromorphic processors hardly allow weight-reuse for convolutional SNNs because they are designed for dense SNNs. Although some compilers enable weight-reuse, e.g., NXTF for Loihi (Rueckauer et al., 2021), the weight-reuse rate is still far below the ideal rate. Consequently, the limited on-chip memory capacity strictly limits the size (depth and width) of SNNs implementable on neuromorphic processors.
Considering the limited on-chip memory capacity, attempts have been made to reduce the use of synaptic weight-memory, including unstructured SNN pruning (Neftci et al., 2016; Rathi et al., 2019; Martinelli et al., 2020; Chen et al., 2021; Deng et al., 2021; Kim et al., 2022; Chen et al., 2022) and weight-quantization (Rueckauer et al., 2017; Yousefzadeh et al., 2018; Srini-).

Generally, deep SNNs learn optimal weights using three distinct methods: direct training using backprop based on (i) rate code (Wu et al., 2018; Shrestha & Orchard, 2018; Wu et al., 2019; Fang et al., 2021a;b; Zheng et al., 2021) and (ii) temporal code (Zhang & Li, 2020; Yang et al., 2021; Zhou et al., 2021), and (iii) DNN-to-SNN conversion (Rueckauer et al., 2017; Sengupta et al., 2019; Han et al., 2020; Deng & Gu, 2020; Li et al., 2021). Note that the last method covers SNNs on image (static rather than event-based) datasets only. Given this diversity in learning methods and datasets, a general weight-quantization framework can resolve the difficulty of weight-quantization. To this end, we propose a weight-quantization method based on constrained backpropagation (CBP) that uses Lagrangian functions (the conventional loss function plus weight-constraint functions, each given a Lagrange multiplier) as objective functions (Kim & Jeong, 2021). CBP offers a general weight-quantization framework given that any constraint on weight quantization (e.g., binary, ternary) can easily be encoded in the weight-constraint functions of the Lagrangian function. CBP is a post-training method, so SNNs that learned real-valued weights using different pre-training methods can be post-trained using Lagrangian functions with the same loss functions as used for pre-training.
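To make the objective concrete, the sketch below illustrates the CBP idea on a toy problem: a Lagrangian that combines a task loss with per-weight constraint functions, minimized over the weights while the Lagrange multipliers are updated by gradient ascent. The quadratic surrogate loss, the specific binary constraint c(w) = (w² − 1)², and the learning rates are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

# Toy sketch of constrained backpropagation (CBP).
# Lagrangian: L(w, lam) = task_loss(w) + sum_i lam_i * c(w_i),
# where c(w) = (w^2 - 1)^2 vanishes only at the binary levels w = +/-1.
# (Quadratic loss, constraint form, and learning rates are assumptions.)

rng = np.random.default_rng(0)
w = rng.normal(size=4)            # real-valued "weights" to be binarized
lam = np.zeros_like(w)            # one Lagrange multiplier per weight
target = np.array([0.8, -0.3, 1.4, -1.1])  # stand-in for pre-trained weights

def task_loss_grad(w):
    return w - target             # grad of 0.5 * ||w - target||^2

def constraint(w):
    return (w**2 - 1.0)**2        # zero iff w is exactly binary

def constraint_grad(w):
    return 4.0 * w * (w**2 - 1.0)

lr_w, lr_lam = 0.05, 0.05
for _ in range(2000):
    # descend the Lagrangian in w, ascend it in the multipliers:
    w -= lr_w * (task_loss_grad(w) + lam * constraint_grad(w))
    lam += lr_lam * constraint(w)

print(np.round(w, 3))             # weights driven toward {-1, +1}
```

Because the multipliers keep growing while the constraints are violated, the penalty tightens gradually, letting the weights first track the task loss and then settle onto the quantization levels; a ternary constraint could be substituted by adding a zero at w = 0.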
The main contributions of our work are as follows:
• We validate CBP as a general framework for QSNNs by successfully binary- and ternary-quantizing deep SNNs (with various topologies) pre-trained on various datasets (CIFAR-10/100, ImageNet, DVS128 Gesture, CIFAR10-DVS) using representative (i) rate code-based backprop algorithms with surrogate gradients (Wu et al., 2018; Fang et al., 2021a), (ii) a temporal code-based backprop algorithm (Zhang & Li, 2020), and (iii) DNN-to-SNN conversion (Li et al., 2021).
• We propose a surrogate loss function to quantize SNNs that learned real-valued weights using the DNN-to-SNN conversion method and report surprisingly high accuracy, particularly on CIFAR-100, exceeding the accuracy of the real-valued SNNs by more than 3%.
• We analyse the accuracy and weight-memory efficiency of CBP-QSNNs on various datasets in comparison with previous methods, highlighting state-of-the-art (SOTA) accuracy with weight-memory usage similar to or lower than that of previous methods, as shown in Figure 1.



Figure 1: Accuracy and weight-memory usage of CBP-QSNNs on CIFAR-10/100 and ImageNet. The diamond, square, and star symbols denote binary, ternary, and int2 weight precision, respectively.

