LEARNING THE CONNECTIONS IN DIRECT FEEDBACK ALIGNMENT

Abstract

Feedback alignment was proposed to address the biological implausibility of the backpropagation algorithm, which requires transporting the transpose of the forward weights during the backward pass. The idea was later built upon with the proposal of direct feedback alignment (DFA), which propagates the error directly from the output layer to each hidden layer using a fixed random weight matrix. This contribution was significant because these direct feedback connections allow the backward pass to be parallelized. However, like feedback alignment, DFA performs poorly in deep convolutional networks. We propose to learn the backward weight matrices in DFA, adopting the methodology of Kolen-Pollack learning, to improve training and inference accuracy in deep convolutional neural networks by updating the direct feedback connections such that they come to estimate the forward path. The proposed method improves the accuracy of learning with direct feedback connections and narrows the gap between parallel training with feedback connections and serial training with backpropagation.

1. INTRODUCTION

When feedback alignment was proposed by Lillicrap et al. (2016), it was presented as a biologically plausible alternative to the backpropagation algorithm. Not long after, Nøkland (2016) showed that variants of this approach may offer tangible benefits during training, such as mitigating the vanishing gradients issue or enabling parallelization of the backward pass at the cost of additional memory requirements. Recently, interest in the latter has grown as the memory capacity and compute capability of modern GPUs have continued to make significant leaps. While many of these recently proposed alternatives have been shown to match the backpropagation algorithm in inference accuracy on deep convolutional networks, many have not yet been shown to perform well outside of the image classification task. Direct feedback alignment (DFA), an earlier approach proposed by Nøkland (2016), was shown by Launay et al. (2020) to perform reasonably well on a number of natural language processing tasks with recurrent neural networks and transformers. However, DFA still performs poorly on the image classification task due to its inability to effectively train convolutional layers. We propose a modification to the DFA algorithm to improve its ability to train deep convolutional neural networks. Due to its relationship with another approach (Akrout et al., 2019), we call our method Direct Kolen-Pollack learning, or DKP. We empirically show the mechanisms behind the improvement of our approach over DFA by measuring DKP's ability to better estimate the backpropagation algorithm. We also show this improvement directly by training two deep convolutional neural network architectures on the Fashion-MNIST, CIFAR10, CIFAR100, and TinyImageNet200 (Le & Yang, 2015) datasets.
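To make the mechanism concrete, the following is a minimal NumPy sketch of a DFA training step on a toy two-hidden-layer network, extended so that the feedback matrices are learned rather than fixed. The specific feedback update shown (treating each feedback matrix's transpose as a direct readout and training it with the same outer-product-plus-decay form as the forward weights, in the spirit of Kolen-Pollack) is an illustrative assumption, not the exact rule from our method; all names and hyperparameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP (x -> h1 -> h2 -> y) with tanh hidden units and a linear output.
n_in, n_h, n_out = 8, 16, 4
W1 = rng.normal(0, 0.2, (n_h, n_in))
W2 = rng.normal(0, 0.2, (n_h, n_h))
W3 = rng.normal(0, 0.2, (n_out, n_h))

# Direct feedback matrices: they project the output error straight to each
# hidden layer. In plain DFA they stay fixed; here they are updated.
B1 = rng.normal(0, 0.2, (n_h, n_out))
B2 = rng.normal(0, 0.2, (n_h, n_out))

x = rng.normal(size=n_in)
target = rng.normal(size=n_out)
lr, decay = 0.05, 1e-3  # Kolen-Pollack relies on shared weight decay

losses = []
for _ in range(200):
    # Forward pass.
    h1 = np.tanh(W1 @ x)
    h2 = np.tanh(W2 @ h1)
    y = W3 @ h2
    e = y - target
    losses.append(0.5 * float(e @ e))

    # DFA backward pass: the output error skips directly to every hidden
    # layer through B2 and B1, so these products can run in parallel.
    d2 = (B2 @ e) * (1.0 - h2 ** 2)  # tanh'(a) = 1 - tanh(a)^2
    d1 = (B1 @ e) * (1.0 - h1 ** 2)

    # Forward updates: outer product of local error and presynaptic activity.
    W3 -= lr * (np.outer(e, h2) + decay * W3)
    W2 -= lr * (np.outer(d2, h1) + decay * W2)
    W1 -= lr * (np.outer(d1, x) + decay * W1)

    # Illustrative learned-feedback update (assumption): mirror the forward
    # rule's outer-product-plus-decay form so the feedback matrices adapt
    # toward useful error projections instead of staying fixed random.
    B2 -= lr * (np.outer(h2, e) + decay * B2)
    B1 -= lr * (np.outer(h1, e) + decay * B1)
```

Even on this toy regression the loss drops steadily, since the output layer receives its exact gradient and the hidden layers receive error projections through the (now adapting) feedback matrices.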
Moreover, we recommend training procedures for training with DFA, pointing out the important role batch normalization plays in our experiments. While a couple of works have shown that direct feedback connections can be viable when connecting only to the output of a block of layers in a network (Ororbia et al., 2020; Han & Yoo, 2019), we show advances in the case of feedback connections to all layers in deep convolutional neural networks. Although direct feedback connections to all layers may not be practical on current PC hardware, or from a software perspective, in the future they may be useful for edge devices, IoT, SoC design, etc. (Frenkel et al., 2019; Han & Yoo, 2019), especially those that involve learning vision tasks. Thus, making advances in the training scenario of direct feedback connections to all layers remains worthwhile.

