HARDWARE-RESTRICTION-AWARE TRAINING (HRAT) FOR MEMRISTOR NEURAL NETWORKS

Abstract

Memristor neural networks (MNNs), which use memristor crossbars for vector-matrix multiplication, offer substantial advantages in scalability and energy efficiency for neuromorphic computing. MNN weights are usually trained offline and then deployed as memristor conductances through a sequence of programming voltage pulses. Although weight uncertainty caused by process variation has been addressed by variation-aware training algorithms, efficient design and training of MNNs have not been systematically explored to date. In this work, we propose Hardware-Restriction-Aware Training (HRAT), which accounts for various non-negligible limitations and non-idealities of memristor devices, circuits, and systems. HRAT models the MNN's realistic behavior and circuit restrictions during offline training, thereby bridging the gap between offline training and hardware deployment. HRAT uses a new batch normalization (BN) fusing strategy to align the distortion caused by hardware restrictions between offline training and hardware inference. This not only improves inference accuracy but also eliminates the need for dedicated BN circuitry. Furthermore, input signal amplitudes must be limited to respect the non-destructive threshold voltage of memristors. To avoid distorting the inputs of memristor crossbars, HRAT dynamically adjusts the input signal magnitude during training using a learned scale factor. These scale factors can be incorporated, together with the fused BN, into the parameters of the linear operation, so no additional signal-scaling circuits are required. To evaluate the proposed HRAT methodology, FC-4 and LeNet-5 on MNIST are first trained by HRAT and then deployed in hardware. Hardware simulations match the offline HRAT results well. We also carried out various experiments using VGG-16 on the CIFAR datasets.
The study shows that HRAT leads to high-performance MNNs without device calibration or on-chip training, thus greatly facilitating commercial MNN deployment.
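To make the fusing idea concrete: after training, a BN layer and a learned input scale factor can both be absorbed into the preceding linear layer's weights and bias, so inference needs only a single affine operation. The sketch below illustrates this folding for a fully connected layer in NumPy; the function and variable names are ours for illustration, not from the HRAT implementation.

```python
import numpy as np

def fold_bn_and_scale(W, b, gamma, beta, mean, var, s, eps=1e-5):
    """Fold trained BN parameters (gamma, beta, running mean/var) and a
    learned input scale factor s into the preceding linear layer.
    Illustrative sketch only; names are assumptions, not the paper's API."""
    std_inv = gamma / np.sqrt(var + eps)       # per-output-channel BN scale
    W_fused = (std_inv[:, None] * W) * s       # absorb BN scale and input scale
    b_fused = std_inv * (b - mean) + beta      # absorb BN shift
    return W_fused, b_fused

# Sanity check: fused layer matches scale -> linear -> BN applied in sequence.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)); b = rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)
s = 0.25
x = rng.normal(size=8)

ref = gamma * ((W @ (s * x) + b - mean) / np.sqrt(var + 1e-5)) + beta
Wf, bf = fold_bn_and_scale(W, b, gamma, beta, mean, var, s)
assert np.allclose(Wf @ x + bf, ref)
```

Because the fused weights `Wf` are what get mapped to memristor conductances, neither the BN operation nor the input scaling requires any dedicated on-chip circuitry.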

1. INTRODUCTION

Memristor neural networks (MNNs) have emerged as an increasingly feasible option to alleviate the scalability and energy-efficiency challenges in neuromorphic computing. While several small-scale MNNs have been prototyped Li et al. (2018); Yao et al. (2020); Wan et al. (2022), efficient design and training of MNNs require an in-depth understanding of various restrictions from device, circuit, and system perspectives. These hardware restrictions include weight-uncertainty noise caused by memristor variability and the limited number of programming pulse cycles available to tune memristor conductance (e.g., 500 in Yao et al. (2020)), weight quantization noise due to the limited number of memristor conductance states (e.g., 5- and 4-bit in Yao et al. (2020); Wan et al. (2022)), the non-destructive threshold voltage of memristors Jo et al. (2010), the limited output swing of operational amplifiers Karki (2021), and bias quantization noise from finite-resolution digital-to-analog converters (DACs). These hardware restrictions collectively reduce the accuracy of MNN inference; ignoring them during software offline training may result in poor inference or even functional failure. As a critical step in network training, batch normalization (BN) accelerates training convergence Ioffe & Szegedy (2015). The scale and shift operations of BN can be merged into the preceding linear operation (e.g., a fully connected or convolutional layer) after training. In this way, the hardware complexity and cost of MNNs are alleviated, as BN does not require explicit memristor crossbars in

