BRAIN-LIKE APPROACHES TO UNSUPERVISED LEARNING OF HIDDEN REPRESENTATIONS - A COMPARATIVE STUDY

Abstract

Unsupervised learning of hidden representations has been one of the most vibrant research directions in machine learning in recent years. In this work we study the brain-like Bayesian Confidence Propagating Neural Network (BCPNN) model, recently extended to extract sparse distributed high-dimensional representations. The saliency and separability of the hidden representations when trained on the MNIST dataset are studied using an external linear classifier and compared with those of other unsupervised learning methods, including restricted Boltzmann machines and autoencoders.

1. INTRODUCTION

Artificial neural networks have made remarkable progress in supervised pattern recognition in recent years. In particular, deep neural networks have dominated the field, largely due to their capability to discover hierarchies of salient data representations. However, most recent deep learning methods rely extensively on supervised learning from labelled samples for extracting and tuning data representations. Given the abundance of unlabeled data, there is an urgent demand for unsupervised or semi-supervised approaches to learning hidden representations (Bengio et al., 2013). Although early concepts of greedy layer-wise pretraining allow for exploiting unlabeled data, ultimately the application of deep pre-trained networks to pattern recognition problems rests on label-dependent end-to-end weight fine-tuning (Erhan et al., 2009).

At the same time, we observe a surge of interest in more brain-plausible networks for unsupervised and semi-supervised learning problems that build on some fundamental principles of neural information processing in the brain (Pehlevan & Chklovskii, 2019; Illing et al., 2019). Most importantly, these brain-like computing approaches rely on local learning rules and label-independent, biologically compatible mechanisms to build data representations, whereas deep learning methods predominantly make use of error back-propagation (backprop) for learning the weights. Although efficient, backprop has several issues that make it an unlikely candidate model for synaptic plasticity in the brain. The most apparent issue is that the synaptic connection strength between two biological neurons is expected to comply with Hebb's postulate, i.e. to depend only on the locally available information provided by the activities of the pre- and postsynaptic neurons. This is violated in backprop, since synaptic weight updates need gradient signals to be communicated from distant output layers.
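The locality argument can be made concrete with a minimal sketch. The following Python fragment is illustrative only (it is not the learning rule of BCPNN, the KH model, or any other model studied here): a plain Hebbian update changes each weight using nothing but the activities of the two neurons it connects, so no signal from a distant output layer is required.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch only -- not the update rule of any model in this study.
# A plain Hebbian update: the change of weight W[i, j] depends only on the
# locally available activities post[i] and pre[j].
def hebbian_update(W, pre, post, lr=0.01):
    return W + lr * np.outer(post, pre)

pre = rng.random(784)                        # presynaptic activity (one input pattern)
W = rng.normal(scale=0.01, size=(100, 784))  # hypothetical layer of 100 hidden units
post = W @ pre                               # postsynaptic activity
W_new = hebbian_update(W, pre, post)

# By contrast, a backprop update of these same weights would additionally
# require an error gradient propagated back from the output layer --
# information that is non-local to the synapse.
```

Each entry of the weight change here is a product of one presynaptic and one postsynaptic activity, which is exactly the locality property Hebb's postulate demands.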
Please refer to (Whittington & Bogacz, 2019; Lillicrap et al., 2020) for a detailed review of possible biologically plausible implementations of, and alternatives to, backprop. In this work we use the MNIST dataset to compare two classical learning systems, the autoencoder (AE) and the restricted Boltzmann machine (RBM), with two brain-like approaches to unsupervised learning of hidden representations: the recently proposed model by Krotov and Hopfield (referred to as the KH model) (Krotov & Hopfield, 2019), and the BCPNN model (Ravichandran et al., 2020), both of which rely on biologically plausible learning strategies. In particular, we qualitatively examine the extracted hidden representations and quantify their label-dependent separability using a simple linear classifier on top of all the networks under investigation. This classification step is not part of the learning strategy; we use it merely to evaluate the resulting representations. Special emphasis is placed on the feedforward BCPNN model with a single hidden layer, which frames the update and learning steps of the neural network as probabilistic computations. Probabilistic ap-

