EFFICIENT APPROXIMATION OF NEURAL POPULATION STRUCTURE AND CORRELATIONS WITH PROBABILISTIC CIRCUITS

Abstract

We present a computationally efficient framework to model a wide range of population structures with high order correlations and a large number of neurons. Our method is based on a special type of Bayesian network that has linear inference time and is founded upon the concept of contextual independence. Our framework is both fast and accurate in approximating neural population structures. Furthermore, our approach enables us to reliably quantify higher order neural correlations. We test our method on simulated neural populations commonly used to generate higher order correlations, as well as on publicly available large-scale neural recordings from the Allen Brain Observatory. Our approach significantly outperforms other models both in terms of statistical measures and alignment with experimental evidence.

1. INTRODUCTION

With the rise and rapid growth of simultaneous neural population recording, modeling population structures and measuring correlations have become a focus of computational neuroscience (Abbott & Dayan, 1999; Averbeck et al., 2006; Azeredo da Silveira & Rieke, 2021; Urai et al., 2022). Theoretical and experimental work has demonstrated the necessity of measuring population correlations to investigate information coding (Moreno-Bote et al., 2014; Averbeck et al., 2006), functional connectivity (Dunn et al., 2015), learning (Ganmor et al., 2011), and arousal (Vinck et al., 2015; Doiron et al., 2016). Despite significant progress in recent years, the measurement and analysis of population correlations still face significant challenges (Kohn et al., 2016). Exact measurement of population correlations is NP-hard in the general case, since it requires computing every form of dependency among spiking neurons. As a result, researchers have sought computationally efficient ways to approximate or indirectly measure neural correlations. Existing approaches are energy-based models rooted in statistical mechanics, in which the energy function incorporates couplings between subsets of variables (here, neurons) (Roudi et al., 2009c; Tkačik et al., 2006; Sohl-Dickstein et al., 2011; Aurell & Ekeberg, 2012). However, these methods often carry auxiliary (and even unrealistic) assumptions about the neural dynamics and do not scale to large populations (Roudi et al., 2009b). Notably, generative models commonly used in other domains, such as latent variable methods, are often not applicable to neural populations because spiking neural data is discrete and sparse (Zhao et al., 2020). Furthermore, various factors such as the behavioural and emotional state of the animal affect the firing patterns of neurons, even in sensory cortex (Urai et al., 2022).
As a result, a recording long enough to train these models contains many external variable changes and confounding factors that make drawing scientific conclusions difficult. It is important to distinguish whether modeling population correlations matters for predicting joint activity (called encoding) versus whether the correlations are a channel for downstream information flow (called decoding) (Pillow et al., 2008). In the encoding paradigm, one needs to know whether it is important to include certain correlations in a generative model of the spiking activity. In the decoding paradigm, one can, e.g., compare a classifier's stimulus prediction accuracy on original and shuffled neural data to assess population correlations (Averbeck et al., 2006; Pillow et al., 2008; Christensen & Pillow, 2022; Runyan et al., 2017). Importantly, the type of classifier used significantly affects the result, with no confirmation that the brain utilizes a similar strategy (Averbeck et al., 2006). Here we work in the encoding paradigm and take a probabilistic approach by modeling the joint probability distribution of neural activity with Bayesian networks. Inference is NP-hard in general Bayesian networks (Cooper, 1990), making them impractical for modeling population structure. Therefore, we utilize a special family of Bayesian networks with linear inference time, first introduced as "arithmetic circuits" (Darwiche, 2003; Shen et al., 2016). This family of networks was designed to exploit "context-specific independence" among variables, mainly for computational efficiency, which also makes it suitable for extracting local structures in the data (Boutilier et al., 1996; Shen et al., 2020). We use a modification (and equivalent formulation (Rooshenas & Lowd, 2014)) of arithmetic circuits, Sum-Product Networks (SPNs), which are more widely known and used by the community (Poon & Domingos, 2011; Sanchez-Cauce et al., 2021).
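To make the decoding-paradigm shuffle test above concrete, here is a minimal NumPy sketch. The data, population size, and shared latent-input construction are invented for illustration; the point is only that independently permuting each neuron's responses across trials destroys across-neuron (noise) correlations while leaving every single-neuron firing rate exactly intact, which is what the classifier comparison exploits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 time bins x 4 neurons of binary spikes, with
# correlated activity induced by a shared binary latent input.
n_bins, n_neurons = 200, 4
latent = rng.random(n_bins) < 0.5
rates = np.where(latent[:, None], 0.7, 0.2)          # per-bin firing probability
spikes = (rng.random((n_bins, n_neurons)) < rates).astype(int)

def shuffle_neurons(data, rng):
    """Permute each neuron's responses independently across bins:
    pairwise correlations are destroyed, per-neuron rates preserved."""
    shuffled = data.copy()
    for j in range(data.shape[1]):
        shuffled[:, j] = rng.permutation(data[:, j])
    return shuffled

def mean_abs_corr(data):
    """Mean absolute pairwise (off-diagonal) correlation."""
    c = np.corrcoef(data.T)
    iu = np.triu_indices_from(c, k=1)
    return np.abs(c[iu]).mean()

shuffled = shuffle_neurons(spikes, rng)
print(mean_abs_corr(spikes), mean_abs_corr(shuffled))
```

Running this shows the mean absolute pairwise correlation dropping sharply after shuffling, while `spikes.mean(axis=0)` and `shuffled.mean(axis=0)` match exactly.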
In particular, we adapt sum-product networks to fit spiking neural data in order to capture a wide range of population correlations and structures, from local to global, in polynomial time. Due to the efficiency of architecture learning and inference in SPNs, population structure estimation is polynomial in the size of the population. In addition, we propose a measure of higher order population correlations based on our framework. Our results include fits to simulated neural populations constructed with higher order correlations, as well as large-scale neural recordings from different brain regions in more than 20 mice. Our framework outperforms both energy-based and latent variable models for neural population structure estimation.
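To give a feel for how an SPN represents a joint distribution over binary neurons, the following sketch hand-builds a tiny SPN over two neurons. The structure, mixture weights, and leaf probabilities are invented for illustration and are not the architectures learned in this work. The root sum node mixes two product nodes, each encoding a context in which the neurons are independent (context-specific independence), and any marginal is computed in a single bottom-up pass by setting marginalized leaves to 1.

```python
import itertools

def bernoulli(p, value):
    # Leaf node: P(S = value) under Bernoulli(p).
    # value=None marginalizes the variable (the leaf evaluates to 1).
    if value is None:
        return 1.0
    return p if value == 1 else 1.0 - p

def spn(s1, s2):
    # Product node for an illustrative "up" context: both neurons likely fire.
    up = bernoulli(0.8, s1) * bernoulli(0.7, s2)
    # Product node for a "down" context: both neurons rarely fire.
    down = bernoulli(0.1, s1) * bernoulli(0.2, s2)
    # Root sum node: a mixture over the two contexts.
    return 0.6 * up + 0.4 * down

# Valid distribution: probabilities over all joint states sum to 1.
total = sum(spn(a, b) for a, b in itertools.product([0, 1], repeat=2))

# Marginal P(S1 = 1) in one linear-time pass: marginalize the S2 leaf.
marginal_s1 = spn(1, None)   # 0.6*0.8 + 0.4*0.1 = 0.52
print(total, marginal_s1)
```

Note that the mixture couples the two neurons (their joint probability does not factorize), even though each product node treats them as independent within its context; this is the local-structure property the text refers to.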

2. PROBLEM DEFINITION AND RELATED WORK

One of the critical problems in computational neuroscience is providing an accurate statistical description of spike trains in a population of neurons. As the full representation of the data, i.e. raw spike times, is high dimensional, spike trains are binned into small time windows. The time bin should be short enough that each neuron spikes at most once per bin (with some tolerance for potentially losing a few spikes). In addition, the bin should be large enough that the assumption of temporal independence of spikes holds. With this binning strategy, each neuron's activity is a binary variable ($S_i$ for neuron $i$ equals 1 if there is a spike in the corresponding bin, and 0 otherwise) and each time bin represents an i.i.d. sample/instance. Therefore, the spike trains of $N$ neurons over a duration $T$ are represented as a binary matrix $D_{K \times N}$, where $K = T/\Delta t$ and $\Delta t$ is the bin length (Figure 1, left plot). Consequently, the population activity has a probabilistic representation $P(S_1, \ldots, S_N)$, and the problem turns into modelling this joint distribution given the data. More specifically, the problem is to find a model $m^*$ from a family of models $M$ and optimize its free parameters $\Theta_m$ so as to satisfy:
$$m^*, \theta^*_m = \arg\max_{m \in M,\, \theta \in \Theta_m} \frac{1}{K} \sum_{k=1}^{K} \log P(d^k_{1 \times N} \mid m, \theta)$$
In the existing approaches, $M$ is set to maximum entropy (Ising) models, which are energy-based methods rooted in statistical mechanics (Roudi et al., 2009c; Schneidman et al., 2006). Since learning maximum entropy models is computationally very expensive, these models are restricted to estimating the statistical properties of the population up to a constant order; going beyond second order is not computationally feasible. In fact, building the exact generative model is intractable in the general case even for second order (pairwise) correlations.
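The binning procedure and average log-likelihood objective above can be sketched as follows. The spike times, bin width, and the choice of model family are illustrative assumptions (an independent-Bernoulli model stands in for the family $M$, not the models compared in this work); the sketch only shows how raw spike times become the binary matrix $D_{K \times N}$ and how the objective is evaluated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical spike times (in seconds) for N = 3 neurons over T = 1 s.
spike_times = [rng.uniform(0, 1, size=rng.integers(5, 15)) for _ in range(3)]

dt, T = 0.02, 1.0        # 20 ms bins (illustrative choice)
K = round(T / dt)        # number of i.i.d. samples (time bins)
N = len(spike_times)

# Bin into the binary matrix D (K x N): D[k, i] = 1 if neuron i spiked
# in bin k. Multiple spikes in one bin collapse to a single 1.
D = np.zeros((K, N), dtype=int)
for i, times in enumerate(spike_times):
    bins = np.minimum((times / dt).astype(int), K - 1)
    D[bins, i] = 1

# Average log-likelihood under the simplest model family: independent
# Bernoulli neurons, with per-neuron rates fit by maximum likelihood.
p = D.mean(axis=0).clip(1e-9, 1 - 1e-9)
avg_ll = np.mean(np.sum(D * np.log(p) + (1 - D) * np.log(1 - p), axis=1))
print(D.shape, avg_ll)
```

A richer family (e.g. a pairwise maximum entropy model or an SPN) would replace the independent-Bernoulli likelihood in the last step; the argmax over models and parameters in the objective compares exactly these average log-likelihoods.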
Therefore, even a Pairwise Maximum Entropy (PME) model requires further approximation, where the more accurate approximation algorithms require thousands of samples for each pair, making them impractical for large populations of neurons (Roudi et al., 2009c; Tkačik et al., 2006; Sohl-Dickstein et al., 2011; Aurell & Ekeberg, 2012). Moreover, there exist plausible scenarios, such as a dichotomized common input to loosely coupled neurons, in which pairwise correlations are

