UNDERSTANDING NEURAL CODING ON LATENT MANIFOLDS BY SHARING FEATURES AND DIVIDING ENSEMBLES

Abstract

Systems neuroscience relies on two complementary views of neural data: single-neuron tuning curves and analysis of population activity. These two perspectives combine elegantly in neural latent variable models that constrain the relationship between latent variables and neural activity, modeled by simple tuning curve functions. This has recently been demonstrated using Gaussian processes, with applications to realistic and topologically relevant latent manifolds. Those and previous models, however, missed crucial shared coding properties of neural populations. We propose sharing features across neural tuning curves, which significantly improves performance and helps optimization. We also propose a solution to the ensemble detection problem, where different groups of neurons, i.e., ensembles, can be modulated by different latent manifolds. Achieved through a soft clustering of neurons during training, this allows for the separation of mixed neural populations in an unsupervised manner. These innovations lead to more interpretable models of neural population activity that train well and perform better, even on mixtures of complex latent manifolds. Finally, we apply our method to a recently published grid cell dataset, recovering distinct ensembles, inferring toroidal latents, and predicting neural tuning curves in a single integrated modeling framework.

1. INTRODUCTION

Neural population activity can appear high-dimensional (Stringer et al., 2019), yet much recent work has reported that neural populations in higher brain areas are often confined to low-dimensional subspaces (Yu et al., 2008; Harvey et al., 2012; Mante et al., 2013; Stokes et al., 2013; Shenoy et al., 2013; Kaufman et al., 2014; Sadtler et al., 2014; Gallego et al., 2017; Elsayed & Cunningham, 2017; Gao et al., 2017). The bread and butter of classic systems neuroscience is linking neural activity to experimentally controlled or observable covariates such as orientation (Hubel & Wiesel, 1979), pitch (Lewicki, 2002), movement (Churchland et al., 2012; Kao et al., 2015), posture (Mimica et al., 2018) and orientation in space (Taube et al., 1990). These two parallel streams of neuroscientific research might at first seem to be at odds with each other (Kriegeskorte & Wei, 2021); tuning studies of individual neurons give a very different picture of neural coding than distributed representations over high-dimensional neural populations. However, they combine elegantly in the form of (neural) latent variable models (LVMs, see Lawrence, 2003; Yu et al., 2008; Pandarinath et al., 2018).

In their basic form, neural LVMs find the low-dimensional structure of neural population activity, for instance, when a large network of neurons codes mostly along a few linear subspaces (Mante et al., 2013; Gao et al., 2017). One advantage is that these models can help us discover latent variables which may not be tracked as classical covariates in systems neuroscience. However, when the mapping from latent variables to predicted spike rate (decoding) is fully unconstrained, e.g., by using a multi-layer neural network, we lose the simple biological interpretation of tuning curves. In an effort to maintain a biologically interpretable relationship between the latent variables and the neural activity, recent work has proposed more constrained decoders approximating simple tuning curves. These tuning curves have been parameterized as Gaussian processes in the framework of Gaussian process latent variable models (GPLVM, Wu et al., 2017; 2018). Through the tuning curve approach, we limit ourselves to biologically plausible solutions that reveal the actual algorithmic structure of the neural system. Thus, LVMs with simple tuning curve decoders bring together the view of neural populations as distributed representations of low-dimensional latent variables with the biologically meaningful perspective on individual neural tuning properties.

Some neural populations exhibit topologically interesting latent manifolds (Singh et al., 2008; Peyrache et al., 2015a; Gardner et al., 2022). For instance, grid cells represent navigational space in toroidal coordinates of spatially repeating two-dimensional hexagonal grids (Hafting et al., 2005). They appear in different ensembles, commonly referred to as modules, each coding for space at a different resolution (Fyhn et al., 2007; Stensola et al., 2012); a toy example of such a toroidal tuning curve is sketched below.
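As a concrete illustration (our own toy example, not taken from the cited studies), a grid-cell-like tuning curve can be written as a smooth, periodic bump over two angular latent coordinates on the torus T^2; the function `torus_tuning_curve` and all its parameters below are hypothetical:

```python
# Toy grid-cell-style tuning curve on the torus T^2: the firing rate is a
# smooth bump, periodic in both angular latent coordinates. Illustrative
# only; real grid cells have hexagonal (not square) periodicity, which is
# what gives rise to the toroidal topology in the first place.
import numpy as np

def torus_tuning_curve(theta1, theta2, pref1=0.0, pref2=0.0,
                       concentration=2.0, max_rate=20.0):
    """Firing rate (Hz) at torus coordinates (theta1, theta2) in [0, 2*pi)."""
    bump = np.exp(concentration * (np.cos(theta1 - pref1) +
                                   np.cos(theta2 - pref2) - 2.0))
    return max_rate * bump

# Evaluate the curve on a grid of latent positions and draw Poisson counts.
t1, t2 = np.meshgrid(np.linspace(0, 2 * np.pi, 60),
                     np.linspace(0, 2 * np.pi, 60))
rates = torus_tuning_curve(t1, t2)
spikes = np.random.default_rng(0).poisson(rates * 0.1)   # 100 ms bins
```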
The complete population of grid cells is thus best described as a collection of ensembles of neurons, where neurons in each ensemble have tuning curves of specific shapes on their respective toroidal latent representations of space (Curto, 2017). By contrast, a two-dimensional Euclidean representation might account for the complete population of grid cells, but it would obscure their efficient and theoretically interesting coding scheme for representing space (Solstad et al., 2006; Sreenivasan & Fiete, 2011; Mathis et al., 2012; Wei et al., 2015; Klukas et al., 2020). A driving motivation behind this work was to model this beautiful neural structure with an LVM that separates the algorithmic and biological parts, while uniting shared tuning properties to be more accurate and trainable than previous approaches.

We propose to train neural LVMs that not only have simple tuning curve decoders, but are also fully differentiable. Thus, we use a flexible encoder, i.e., a neural network as in variational autoencoders (Kingma & Welling, 2014; Rezende et al., 2014), and a simple tuning-curve-based decoder, akin to GPLVMs. The encoder can readily be made convolutional to allow for better latent estimation from adjacent time points. Additionally, we implement a feature basis for the tuning curve shapes in the decoder which can be shared across neurons. We demonstrate that this neural feature sharing, along with the variational end-to-end training, vastly improves both the training stability and the final performance of neural LVMs. Moreover, we propose hybrid inference at test time and show that this, again, brings a considerable improvement in performance. Finally, we integrate the problem of separating distinct ensembles of neurons into our approach, a crucial task for the discovery of different biological structures and the precise mathematical understanding of their topological tuning properties. An illustration of our approach is provided in Fig. 1.

To summarize, our full model performs the tasks of finding latent variables, separating distinct ensembles of neurons, and fitting the prototypical tuning curves on each ensemble's latent space in a single efficient framework; a minimal toy sketch of this forward pass is given below. In the following, we therefore refer to our model as the feature sharing and ensemble detection Latent Variable Model, or faeLVM.
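The following is a minimal numpy sketch of such a forward pass for a single time bin, written under our own toy assumptions: all names (`W1`, `C_lin`, etc.), dimensions, and basis choices (Gaussian bumps on R^1, von-Mises-style bumps on a ring) are illustrative and not taken from the paper, which may differ in architecture and parameterization:

```python
# Toy forward pass: an encoder maps population activity to latents on two
# candidate spaces, neurons share a feature basis in the decoder, and
# per-neuron ensemble weights softly select which latent space drives
# each neuron. Not the authors' code; a re-implementation of the idea.
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_basis, hidden = 50, 8, 64

x = rng.poisson(2.0, size=n_neurons).astype(float)   # one bin of spike counts

# (1) Encoder: small MLP producing one latent per candidate space.
W1 = rng.normal(0, 0.1, (hidden, n_neurons)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (2, hidden));         b2 = np.zeros(2)
h = np.tanh(W1 @ x + b1)
z_lin, z_ang = W2 @ h + b2            # latent on R^1; angle on a ring (S^1)

# (2) Shared feature bases: every neuron's tuning curve is a linear
# combination of the same basis functions evaluated at the latent.
centers = np.linspace(-2, 2, n_basis)
phi_lin = np.exp(-0.5 * (z_lin - centers) ** 2)          # bumps on R^1
phases = np.linspace(0, 2 * np.pi, n_basis, endpoint=False)
phi_ang = np.exp(np.cos(z_ang - phases) - 1.0)           # periodic bumps

C_lin = rng.normal(0, 0.3, (n_neurons, n_basis))  # per-neuron coefficients
C_ang = rng.normal(0, 0.3, (n_neurons, n_basis))

# (3) Ensemble detection: soft (ideally one-hot) weights per neuron
# select which latent space each neuron is modulated by.
logits = rng.normal(0, 1, (n_neurons, 2))
w = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)   # softmax

log_rate = w[:, 0] * (C_lin @ phi_lin) + w[:, 1] * (C_ang @ phi_ang)
rate = np.exp(log_rate)               # Poisson firing rate for each neuron
```

In a trained model, the softmax weights would ideally sharpen toward one-hot assignments so that each neuron is attributed to exactly one latent space; here they are random for illustration.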

2. BACKGROUND

Let λ_i be the instantaneous firing rate of neuron i. To relate this to the spiking activity x_i, we assume a Poisson noise model x_i ∼ P(λ_i). We define latent variables z := {z_1, . . . , z_k} (in distinct



Figure 1: Model outline. Our main contributions are outlined in purple. The input is a matrix of neural spiking activity. The encoder (1) is a multilayered neural network. The latent spaces are separate with, potentially, different topologies (e.g., R^1 and T^2). The decoder is a parametric tuning curve model with feature sharing (2). The ensemble detection is a weighted (ideally one-hot) selection of latent spaces for each neuron (3). For decoding of the activity, we assume Poisson spiking.
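To make the Poisson noise model x_i ∼ P(λ_i) concrete, here is a minimal sketch (our own illustration, not the authors' code; `poisson_nll` is a hypothetical helper) of the corresponding negative log-likelihood, which serves as the reconstruction term when such a model is trained end to end:

```python
# Negative log-likelihood of spike counts under x_i ~ Poisson(lambda_i).
# The log(x!) term is constant in the model parameters and is often
# dropped during optimization; it is kept here for completeness.
import numpy as np
from scipy.special import gammaln

def poisson_nll(spikes, rates):
    # -log P(x | lambda) = lambda - x * log(lambda) + log(x!)
    return np.sum(rates - spikes * np.log(rates) + gammaln(spikes + 1.0))

counts = np.array([0, 2, 1, 4])          # observed spikes per bin
rates = np.array([0.5, 1.8, 1.2, 3.5])   # model-predicted firing rates
print(poisson_nll(counts, rates))
```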

