LARGE ASSOCIATIVE MEMORY PROBLEM IN NEUROBIOLOGY AND MACHINE LEARNING

Abstract

Dense Associative Memories, or modern Hopfield networks, permit storage and reliable retrieval of an exponentially large (in the dimension of feature space) number of memories. At the same time, their naive implementation is non-biological, since it seemingly requires the existence of many-body synaptic junctions between the neurons. We show that these models are effective descriptions of a more microscopic theory (written in terms of biological degrees of freedom) that has additional (hidden) neurons and requires only two-body interactions between them. For this reason our proposed microscopic theory is a valid model of large associative memory with a degree of biological plausibility. The dynamics of our network and of its reduced-dimensional equivalent both minimize energy (Lyapunov) functions. When certain dynamical variables (hidden neurons) are integrated out from our microscopic theory, one can recover many of the models that were previously discussed in the literature, e.g. the model presented in the "Hopfield Networks is All You Need" paper. We also provide an alternative derivation of the energy function and the update rule proposed in the aforementioned paper and clarify the relationships between various models of this class.

1. INTRODUCTION

Associative memory is defined in psychology as the ability to remember (link) many sets of unrelated items, called memories. Prompted by a large enough subset of items taken from one memory, an animal or computer with an associative memory can retrieve the rest of the items belonging to that memory. The diverse human cognitive abilities that involve making appropriate responses to stimulus patterns can often be understood as the operation of an associative memory, with the "memories" often being distillations and consolidations of multiple experiences rather than merely corresponding to a single event. The intuitive idea of associative memory can be described using a "feature space". In a mathematical model abstracted from neurobiology, the presence (or absence) of each particular feature i is denoted by the activity (or lack of activity) of a model neuron v_i, which is directly driven by a feature signal. If there are N_f possible features, there can be at most N_f^2 distinct connections (synapses) in a neural circuit involving only these neurons. Typical cortical synapses are not highly reliable and can store only a few bits of information^1. The description of a particular memory requires roughly N_f bits of information. Such a system can therefore store at most ∼ N_f unrelated memories. Artificial neural network models of associative memory (based on attractor dynamics of feature neurons and understood through an energy function) exhibit this limitation even with precise synapses: memory storage is limited to fewer than ∼ 0.14 N_f memories (Hopfield, 1982).

^1 For instance, a recent study (Bromer et al., 2018) reports the information content of individual synapses ranging between 2.7 and 4.7 bits, based on electron microscopy imaging; see also (Bartol Jr et al., 2015). These numbers refer to the structural accuracy of synapses. There is also electrical and chemical noise in synaptic currents induced by the biophysical details of vesicle release and neurotransmitter binding. The unreliability of the fusion of pre-synaptic vesicles (containing neurotransmitter) with the pre-synaptic neuron membrane is the dominant source of trial-to-trial synaptic current variation (Allen & Stevens, 1994). This noise decreases the electrical information capacity of individual synapses below the maximal value that the synaptic structure would otherwise provide.
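To make the capacity argument above concrete, the following minimal sketch (not taken from the paper) implements a classical binary Hopfield network with Hebbian synapses, the setting of (Hopfield, 1982) referenced in the previous paragraph. The network size, number of stored memories, and corruption level are illustrative choices. Stored below the ∼0.14 N_f limit, a corrupted cue is cleaned up toward the nearest memory; raising the number of memories K well above that limit makes the recovered overlap degrade, which is exactly the limitation that the Dense Associative Memories discussed in this paper are designed to overcome.

# Minimal illustrative sketch (not the paper's model): a classical binary Hopfield
# network with Hebbian storage, showing why capacity scales like ~0.14*N_f rather
# than exponentially in N_f. All sizes and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_f = 200                      # number of feature neurons
K = 20                         # number of stored memories (~0.1*N_f, below the limit)
memories = rng.choice([-1, 1], size=(K, N_f))

# Hebbian synapses: T_ij = (1/N_f) * sum_mu xi^mu_i xi^mu_j, with zero diagonal
T = memories.T @ memories / N_f
np.fill_diagonal(T, 0.0)

def retrieve(state, steps=50):
    """Asynchronous updates that descend the energy E = -1/2 v^T T v."""
    v = state.copy()
    for _ in range(steps):
        for i in rng.permutation(N_f):
            v[i] = 1 if T[i] @ v >= 0 else -1
    return v

# Cue the network with a corrupted version of memory 0 (15% of bits flipped)
cue = memories[0].copy()
flip = rng.choice(N_f, size=int(0.15 * N_f), replace=False)
cue[flip] *= -1

recalled = retrieve(cue)
overlap = recalled @ memories[0] / N_f
print(f"overlap with stored memory: {overlap:.2f}")   # close to 1.0 below capacity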

