PARAMETRIC COPULA-GP MODEL FOR ANALYZING MULTIDIMENSIONAL NEURONAL AND BEHAVIORAL RELATIONSHIPS

Abstract

One of the main challenges in current systems neuroscience is the analysis of high-dimensional neuronal and behavioral data that are characterized by different statistics and timescales of the recorded variables. We propose a parametric copula model which separates the statistics of the individual variables from their dependence structure, and escapes the curse of dimensionality by using vine copula constructions. We use a Bayesian framework with Gaussian Process (GP) priors over copula parameters, conditioned on a continuous task-related variable. We improve the flexibility of this method by 1) using non-parametric conditional (rather than unconditional) marginals; 2) linearly mixing copula elements with qualitatively different tail dependencies. We validate the model on synthetic data and compare its performance in estimating mutual information against commonly used non-parametric algorithms. Our model provides accurate information estimates when the dependencies in the data match the parametric copulas used in our framework. Moreover, when exact density estimation with a parametric model is not possible, our Copula-GP model is still able to provide reasonable information estimates, close to the ground truth and comparable to those obtained with a neural network estimator. Finally, we apply our framework to real neuronal and behavioral recordings obtained in awake mice. We demonstrate the ability of our framework to 1) produce accurate and interpretable bivariate models for the analysis of inter-neuronal noise correlations or behavioral modulations; 2) expand to more than 100 dimensions and measure information content in the whole-population statistics. These results demonstrate that the Copula-GP framework is particularly useful for the analysis of complex multidimensional relationships between neuronal, sensory and behavioral data.

1. INTRODUCTION

Recent advances in imaging and recording techniques have enabled monitoring the activity of hundreds to several thousands of neurons simultaneously (Jun et al., 2017; Helmchen, 2009; Dombeck et al., 2007). These recordings can be made in awake animals engaged in specifically designed tasks or natural behavior (Stringer et al., 2019; Pakan et al., 2018a;b), which further augments these already large datasets with a variety of behavioral variables. These complex high-dimensional datasets necessitate the development of novel analytical approaches (Saxena & Cunningham, 2019; Stevenson & Kording, 2011; Staude et al., 2010) to address two central questions of systems and behavioral neuroscience: how do populations of neurons encode information, and how does this neuronal activity correspond to the observed behavior? In machine learning terms, both of these questions translate into understanding the high-dimensional multivariate dependencies between the recorded variables (Kohn et al., 2016; Shimazaki et al., 2012; Ince et al., 2010; Shamir & Sompolinsky, 2004).

There are two major methods suitable for recording the activity of large populations of neurons in behaving animals: multi-electrode probes (Jun et al., 2017), and calcium imaging methods (Grienberger et al., 2015; Helmchen, 2009; Dombeck et al., 2007), which use changes in intracellular calcium concentration as a proxy for neuronal spiking activity at a lower temporal precision. While neuronal spiking occurs on a temporal scale of milliseconds, behavior spans timescales from milliseconds to hours and even days (Mathis et al., 2018). As a result, the recorded neuronal and behavioral variables may operate at different timescales and exhibit different statistics, which further complicates the statistical analysis of these datasets.

The natural approach to modeling statistical dependencies between variables with drastically different statistics is based on copulas, which separate marginal (i.e. single-variable) statistics from the dependence structure (Joe, 2014). For this reason, copula models are particularly effective for mutual information estimation (Jenison & Reale, 2004; Calsaverini & Vicente, 2009b), which quantifies how much knowing one variable reduces the uncertainty about another variable (Quiroga & Panzeri, 2009). Copula models can also escape the 'curse of dimensionality' by factorising the multi-dimensional dependence into pair-copula constructions called vines (Aas et al., 2009; Czado, 2010). Copula models have been successfully applied to spiking activity (Onken et al., 2009; Hu et al., 2015; Shahbaba et al., 2014; Berkes et al., 2009), 2-photon calcium recordings (Safaai, 2019) and multi-modal neuronal datasets (Onken & Panzeri, 2016). However, these models assumed that the dependence between variables was static, whereas in neuronal recordings it may be dynamic or modulated by behavioral context (Doiron et al., 2016; Shimazaki et al., 2012). It might therefore be helpful to explicitly model the continuous time- or context-dependent changes in the relationships between variables, which reflect changes in an underlying computation.

Here, we extend a copula-based approach by adding explicit conditional dependence to the parameters of the copula model, approximating these latent dependencies with Gaussian Processes (GP). It was previously shown that such a combination of parametric copula models with GP priors outperforms static copula models (Lopez-Paz et al., 2013) and even dynamic copula models on many real-world datasets, including weather forecasts, geological data and stock market data (Hernández-Lobato et al., 2013). Yet, this method has never before been applied to neuronal recordings. In this work, we increase the complexity of both the marginal and copula models in order to adequately describe the complex dependencies commonly observed in neuronal data.
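To illustrate why the copula decomposition is convenient for mutual information estimation: the marginals drop out entirely, so the MI between two variables equals the expected log copula density. For a bivariate Gaussian copula with parameter ρ this expectation even has a closed form, MI = -½ ln(1 - ρ²). The sketch below (function names are ours, not part of any library) checks the closed form against a Monte-Carlo average of the log copula density:

```python
import numpy as np

def gaussian_copula_mi(rho):
    """Closed-form MI (in nats) carried by a bivariate Gaussian copula.
    The marginals contribute nothing, so MI depends only on rho."""
    return -0.5 * np.log(1.0 - rho**2)

def mc_copula_mi(rho, n=200_000, seed=0):
    """Monte-Carlo estimate: MI = E[log c(u, v)] under the joint density.
    For a Gaussian copula, log c can be written directly in the normal
    scores z = Phi^{-1}(u), so no CDF transforms are needed here."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z1, z2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    log_c = (-0.5 * np.log(1.0 - rho**2)
             - (rho**2 * (z1**2 + z2**2) - 2.0 * rho * z1 * z2)
             / (2.0 * (1.0 - rho**2)))
    return log_c.mean()
```

For ρ = 0.8 the closed form gives ≈ 0.511 nats, and the Monte-Carlo estimate agrees to within sampling error; any strictly monotone transformation of the margins leaves this value unchanged.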
In particular, we use conditional marginal models to account for changes in single-neuron statistics, and mixtures of parametric copula models to account for changes in tail dependencies. We also improve the scalability of the method by using stochastic variational inference, and we develop model selection algorithms based on the fully Bayesian Watanabe-Akaike information criterion (WAIC). Finally, and most importantly, we demonstrate that our model is suitable for estimating mutual information. It performs especially well when the parametric model can closely approximate the target distribution; when this is not the case, our copula mixture model is still sufficiently flexible to provide information estimates close to the ground truth and comparable to the best state-of-the-art non-parametric information estimators.

The goal of this paper is to propose and validate the statistical Copula-GP method, and to show that it combines multiple properties desirable for neuroscience applications: the interpretability of parametric copula models, accuracy in density and information estimation, and scalability to large datasets. We first introduce the copula mixture models and propose model selection algorithms (Sec. 2). We then validate our model on synthetic data and compare its performance against other commonly used information estimators (Sec. 3). Next, we demonstrate the utility of the method on real neuronal and behavioral data (Sec. 4). We show that our Copula-GP method can produce bivariate models that emphasize qualitative changes in tail dependencies, and that it estimates mutual information that exposes the structure of the task without any explicit cues being provided to the model. Finally, we measure the information content in the whole dataset, comprising 5 behavioral variables and more than 100 neurons.
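The idea of linearly mixing copula elements with different tail behavior can be sketched with standard textbook densities: a Gaussian copula (no tail dependence) combined with a Clayton copula (lower-tail dependence). This is an illustration of the mixture construction only; the function names, parameter values, and the particular elements below are ours, not the exact family used in our framework:

```python
import numpy as np
from statistics import NormalDist

_nd = NormalDist()  # standard normal, used for the probit transform

def gaussian_copula_pdf(u, v, rho):
    """Density of the bivariate Gaussian copula (no tail dependence)."""
    z1, z2 = _nd.inv_cdf(u), _nd.inv_cdf(v)
    return (np.exp(-(rho**2 * (z1**2 + z2**2) - 2.0 * rho * z1 * z2)
                   / (2.0 * (1.0 - rho**2)))
            / np.sqrt(1.0 - rho**2))

def clayton_copula_pdf(u, v, theta):
    """Density of the Clayton copula (lower-tail dependence), theta > 0."""
    return ((1.0 + theta) * (u * v) ** (-1.0 - theta)
            * (u ** -theta + v ** -theta - 1.0) ** (-2.0 - 1.0 / theta))

def mixture_pdf(u, v, w, rho, theta):
    """Convex combination of the two elements; mixing weight w in [0, 1]."""
    return (w * gaussian_copula_pdf(u, v, rho)
            + (1.0 - w) * clayton_copula_pdf(u, v, theta))
```

A convex combination of copula densities is itself a valid copula density, so the mixture stays a proper dependence model while interpolating between qualitatively different tail behaviors; both elements reduce to the independence copula (density 1) as ρ → 0 and θ → 0, respectively.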

2. PARAMETRIC COPULA MIXTURES WITH GAUSSIAN PROCESS PRIORS

Our model is based on copulas: multivariate distributions with uniform marginals. Sklar's theorem (Sklar, 1959) states that any multivariate joint distribution can be written in terms of univariate marginal distribution functions $p(y_i)$ and a unique copula which characterizes the dependence structure:

$$p(y_1, \dots, y_N) = c\big(F_1(y_1), \dots, F_N(y_N)\big) \times \prod_{i=1}^{N} p(y_i).$$

Here, $F_i(\cdot)$ are the marginal cumulative distribution functions (CDFs); thus, for each $i$, $F_i(y_i)$ is uniformly distributed on $[0,1]$. For high-dimensional datasets (large $\dim \mathbf{y}$), maximum likelihood estimation of the copula parameters may become computationally challenging. The two-stage inference for margins (IFM) training scheme is typically used in this case (Joe, 2005): first, the univariate marginals are estimated and used to map the data onto a multidimensional unit cube; second, the parameters of the copula model are inferred.
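The two-stage scheme can be sketched as follows. This minimal illustration (function names are ours) uses empirical CDFs in stage one, which strictly speaking makes it the semiparametric pseudo-likelihood variant of IFM, and a Gaussian copula in stage two, for which the parameter estimate reduces to the correlation of the normal scores:

```python
import numpy as np
from statistics import NormalDist

def empirical_cdf_transform(x):
    """Stage 1: map one margin onto (0, 1) via its empirical CDF.
    Ranks are shifted by 0.5 so the output never hits exactly 0 or 1."""
    ranks = np.argsort(np.argsort(x))
    return (ranks + 0.5) / len(x)

def fit_gaussian_copula(u, v):
    """Stage 2: infer the copula parameter on the unit square.
    For a Gaussian copula, the estimate is the correlation of the
    normal scores z = Phi^{-1}(u)."""
    nd = NormalDist()
    z1 = np.array([nd.inv_cdf(ui) for ui in u])
    z2 = np.array([nd.inv_cdf(vi) for vi in v])
    return np.corrcoef(z1, z2)[0, 1]
```

Because stage one only uses ranks, any strictly monotone distortion of the margins (e.g. lognormal or cubed-Gaussian observations) leaves the recovered dependence parameter unchanged, which is exactly the separation of marginal statistics from dependence structure that motivates the copula approach.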

