Hebbian Deep Learning Without Feedback

Abstract

Recent approximations to backpropagation (BP) have mitigated many of BP's computational inefficiencies and incompatibilities with biology, but important limitations still remain. Moreover, the approximations significantly decrease accuracy in benchmarks, suggesting that an entirely different approach may be more fruitful. Here, grounded on recent theory for Hebbian learning in soft winner-take-all networks, we present multilayer SoftHebb, i.e., an algorithm that trains deep neural networks without any feedback, target, or error signals. As a result, it achieves efficiency by avoiding weight transport, non-local plasticity, time-locking of layer updates, iterative equilibria, and (self-) supervisory or other feedback signals, which were necessary in other approaches. Its increased efficiency and biological compatibility do not trade off accuracy compared to state-of-the-art bio-plausible learning, but rather improve it. With up to five hidden layers and an added linear classifier, accuracies on MNIST, CIFAR-10, STL-10, and ImageNet reach 99.4%, 80.3%, 76.2%, and 27.3%, respectively. In conclusion, SoftHebb shows, with a radically different approach from BP, that deep learning over a few layers may be plausible in the brain, and it increases the accuracy of bio-plausible machine learning.
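To make the abstract's central idea concrete, the following is a minimal sketch of the kind of plasticity the paper builds on: a soft winner-take-all layer trained with a purely local Hebbian rule, using no feedback, target, or error signal. The softmax competition, the Oja-like decay term, and all dimensions and hyperparameters here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_wta_hebbian_step(W, x, lr=0.01, temp=1.0):
    """One local plasticity step for a soft winner-take-all layer.

    A sketch in the spirit of SoftHebb (details of the actual rule and
    its normalization may differ): neurons compete through a softmax
    over their pre-activations, and each weight w_ik is updated using
    only local quantities: its input x_k, its neuron's soft-WTA output
    y_i, and its own current value.
    """
    u = W @ x                            # pre-activations, one per neuron
    y = np.exp((u - u.max()) / temp)     # numerically stable softmax
    y /= y.sum()                         # soft winner-take-all competition
    # Hebbian term y_i * x_k, with an Oja-like decay y_i * u_i * w_ik
    # that keeps the weights bounded (an assumption of this sketch).
    dW = lr * (np.outer(y, x) - (y * u)[:, None] * W)
    return W + dW

# Toy usage: 10 competing neurons learning from 5-dimensional inputs.
W = 0.1 * rng.normal(size=(10, 5))
for _ in range(100):
    x = rng.normal(size=5)
    W = soft_wta_hebbian_step(W, x)
```

Note that the update touches each weight using only pre- and post-synaptic quantities, which is what allows layers to learn independently, without any backward pathway.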

1. Introduction: Backpropagation and its limitations

The core algorithm in deep learning (DL) is backpropagation (BP), which operates by first defining an error or loss function between the neural network's output and the desired output. Despite its enormous practical utility (Sejnowski, 2020), BP requires operations that make training computationally expensive, setting limits to its applicability in resource-constrained scenarios. In addition, the same operations are largely incompatible with biological learning, demanding alternatives. In the following, we describe these limitations and their significance for DL, neuromorphic computing hardware, and neuroscience. Notably, after the preprint of this paper, Hinton (2022) presented an algorithm with similar considerations.

Weight transport. Backpropagating errors involves the transpose matrix of the forward connection weights. This is not possible in biology, as synapses are unidirectional. Synaptic conductances cannot be transported to separate backward synapses either. This is the weight transport problem (Grossberg, 1987; Crick, 1989), and it also prevents learning on energy-efficient hardware substrates (Crafton et al., 2019) such as neuromorphic computing hardware, which is one of the most researched approaches to overcoming the efficiency and scaling bottlenecks of the von Neumann architecture that underlies today's computer chips (Indiveri, 2021). A key element of the neuromorphic strategy is to place computation within the same devices that also store memories, akin to biological synapses, which perform computations but also store weights (Sebastian et al., 2020; Sarwat et al., 2022a;b). In BP, however, weight transport requires the transposition of stored memories, and this implies significant circuitry and energy expenses by preventing the full in-memory implementation of neuromorphic technology (Crafton et al., 2019).

Non-local plasticity. BP cannot update each weight based only on the immediate activations of the two neurons that the weight connects, i.e., the pre- and post-synaptic neurons, as
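Both limitations above can be seen in a few lines of code. The sketch below, for a hypothetical two-layer network with arbitrary dimensions, shows that the backward pass needs the transpose of the forward weights (W2.T, the weight transport problem) and that the hidden layer's update depends on quantities computed far from the synapse itself (non-local plasticity). This is an illustration of textbook BP, not of any method from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-layer network; all dimensions are arbitrary choices.
x = rng.normal(size=4)         # input
W1 = rng.normal(size=(8, 4))   # forward weights, layer 1
W2 = rng.normal(size=(3, 8))   # forward weights, layer 2
t = np.zeros(3)                # target output

# Forward pass with a ReLU hidden layer.
z = W1 @ x
h = np.maximum(z, 0.0)
y = W2 @ h

# Backward pass for a squared-error loss.
e = y - t                      # output error
# The hidden-layer error requires W2.T: the *same* forward weights,
# transposed, must be available on the backward path. Biological
# synapses are unidirectional and cannot supply these values to a
# separate backward pathway: the weight transport problem.
delta_h = (W2.T @ e) * (z > 0)
# Each weight's gradient depends on an error signal propagated from
# the output layer, not just on its own pre- and post-synaptic
# activations: non-local plasticity.
grad_W2 = np.outer(e, h)
grad_W1 = np.outer(delta_h, x)
```

In hardware terms, storing W2 in an in-memory crossbar does not help the backward pass, because reading out its transpose requires extra circuitry or a duplicated, mirrored copy of the array.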



Huawei Zurich Research Center, Switzerland; Huawei ACS Lab, Shenzhen, China. * Corresponding author.

