SYNTHESISING REALISTIC CALCIUM TRACES OF NEURONAL POPULATIONS USING GAN

Abstract

Calcium imaging has become a powerful and popular technique to monitor the activity of large populations of neurons in vivo. However, owing to ethical considerations and despite recent technical developments, recordings are still constrained to a limited number of trials and animals. This limits the amount of data available from individual experiments and hinders the development of analysis techniques and models for more realistic sizes of neuronal populations. The ability to artificially synthesize realistic neuronal calcium signals could greatly alleviate this problem by scaling up the number of trials. Here, we propose a Generative Adversarial Network (GAN) model to generate realistic calcium signals as seen in neuronal somata with calcium imaging. To this end, we introduce CalciumGAN, a model based on the WaveGAN architecture, and train it on calcium fluorescence signals with the Wasserstein distance. We test the model on artificial data with known ground-truth and show that the distribution of the generated signals closely resembles the underlying data distribution. Then, we train the model on real calcium traces recorded from the primary visual cortex of behaving mice and confirm that the deconvolved spike trains match the statistics of the recorded data. Together, these results demonstrate that our model can successfully generate realistic calcium traces, thereby providing the means to augment existing datasets of neuronal activity for enhanced data exploration and modelling.

1. INTRODUCTION

The ability to record accurate neuronal activity from behaving animals is essential for the study of information processing in the brain. Electrophysiological recording, which measures the rate of change in voltage using microelectrodes inserted into the cell membrane of a neuron, has high temporal resolution and is considered the most accurate method to measure spike activity (Dayan & Abbott, 2001). However, this method is not without shortcomings (Harris et al., 2016). For instance, a single microelectrode can only detect activity from a few neurons in close proximity, and extensive pre-processing is required to infer single-unit activity from a multi-unit signal. Disentangling circuit computations in large-scale neuronal populations therefore remains a difficult task (Rey et al., 2015). Calcium imaging, on the other hand, monitors the calcium influx into the cell as a proxy for an action potential (Berridge et al., 2000). In contrast to electrophysiological recordings, this technique yields data with high spatial resolution and low temporal resolution (Grienberger & Konnerth, 2012), and has become a powerful technique to monitor large neuronal populations. With the advancements in these recording technologies, it has become increasingly easy to obtain high-quality neuronal activity data in vivo from live animals. However, due to ethical considerations, the acquired datasets are often limited in the number of trials or the duration of each trial on a live animal. This poses a problem for assessing analysis techniques that take into account higher-order correlations (Brown et al., 2004; Staude et al., 2010; Stevenson & Kording, 2011; Saxena & Cunningham, 2019). Even for linear decoders, the number of trials can be more important for determining coding accuracy than the number of neurons (Stringer et al., 2019).
Generative models of neuronal activity hold the promise of alleviating the above problem by enabling the synthesis of an unlimited number of realistic samples for assessing advanced analysis methods. Popular modelling approaches such as the maximum entropy framework (Schneidman et al., 2006; Tkačik et al., 2014) and the latent variable model (Macke et al., 2009; Lyamzin et al., 2010) have shown ample success in modelling spiking activities, though many of these models require strong assumptions on the data and cannot generalize to different cortical areas. GANs, by contrast, have shown tremendous success in synthesizing data across a vast variety of domains and data types (Karras et al., 2017; Gomez et al., 2018; Donahue et al., 2019), and are good candidates for modelling neuronal activities. Spike-GAN (Molano-Mazon et al., 2018) demonstrated that GANs can model neural spikes that accurately match the statistics of real recorded spiking behaviour from a small number of neurons. Moreover, the discriminator in Spike-GAN is able to learn to detect which population activity pattern is the relevant feature, and this can provide insights into how a population of neurons encodes information. Ramesh et al. (2019) trained a conditional GAN (Mirza & Osindero, 2014), conditioned on the stimulus, to generate multivariate binary spike trains. They fitted the generative model to data recorded in the V1 area of macaque visual cortex, and the GAN-generated spike trains captured the firing rate and pairwise correlation statistics better than the dichotomized Gaussian model (Macke et al., 2009) and a deep supervised convolutional model. Nevertheless, the aforementioned deep generative models operate on spike trains, which are discrete in nature, and back-propagation on discrete data remains a difficult task (Caccia et al., 2018). For instance, Ramesh et al. (2019) used the REINFORCE gradient estimate (Williams, 1992) to train the generator in order to perform back-propagation on discrete data. Still, gradient estimation with the REINFORCE approach yields large variance, which is known to be challenging for optimization (Maddison et al., 2016; Zhang et al., 2017). In addition, generating and training on binary spike trains directly introduces uncertainty, as the generator has to learn the deconvolution process as well, making it an even more difficult task.
In this work, we investigate the possibility of synthesising continuous calcium fluorescence signals using the GAN framework, as a method to scale up or augment the amount of population activity data. Modelling the calcium signals directly has several advantages: (a) a generator that synthesises binary spike trains directly must also learn the deconvolution process, which adds uncertainty that is not present for calcium signals; (b) calcium imaging signals inherently carry more information about neuronal activity than binary spike trains; and (c) calcium deconvolution algorithms can be evaluated on calcium signals with known ground-truth. Hence, we devised a workflow to synthesize and evaluate calcium imaging signals, and validated the method both on artificial data with known ground-truth and by mimicking real two-photon calcium (Ca²⁺) imaging data recorded from the primary visual cortex of a behaving mouse (Pakan et al., 2018; Henschke et al., 2020).

2.1. NETWORK ARCHITECTURE

The original GAN framework, introduced in Goodfellow et al. (2014), is formulated as a min-max game in which the generator G attempts to generate convincing samples from the latent space Z, and the discriminator D learns to distinguish between generated samples and real samples X. In this work, we use the WGAN-GP (Gulrajani et al., 2017) formulation of the loss function, which does not require incorporating any information about the neural activities into the training objective:
$$L_D = \mathbb{E}_{z \sim Z}\left[D(G(z))\right] - \mathbb{E}_{x \sim X}\left[D(x)\right] + \lambda\, \mathbb{E}_{\hat{x} \sim \hat{X}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right] \qquad (1)$$
where $\lambda$ denotes the gradient penalty coefficient and $\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}$, with $\tilde{x} = G(z)$, are samples taken between the real and generated data distributions. For learning calcium signal generation, we adapted the WaveGAN architecture (Donahue et al., 2019), which has shown promising results in audio signal generation. In the generator, we used 1-dimensional transposed convolution layers to up-sample the input noise. We added Layer Normalization (Ioffe & Szegedy, 2015) between each convolution and activation layer, in order to stabilize training as well as to make the operation compatible with the WGAN-GP framework. To improve the learning performance and stability of the model, the calcium signals were scaled to the range between 0 and 1 by normalizing with the maximum value of the calcium signal in the data. Correspondingly, we chose a sigmoid activation in the output layer of the generator and then re-scaled the generated signals to their original range before inferring their spike trains.
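As an illustration of how the three terms of Eq. (1) combine, the following is a minimal sketch in plain Python, not the implementation used in this work. It assumes a toy linear critic D(x) = w·x + b, whose gradient with respect to its input is simply w, so that no automatic differentiation is needed. The function names, the default penalty coefficient `lam=10.0` (the value suggested by Gulrajani et al., 2017), and the uniform sampling of the interpolation weight are assumptions made for the example.

```python
import math
import random


def critic(x, w, b):
    """Toy linear critic D(x) = w . x + b (assumption: stands in for the real network)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def critic_input_grad(x, w):
    """Gradient of the linear critic with respect to its input; for a linear map it is w."""
    return list(w)


def wgan_gp_critic_loss(real, fake, w, b, lam=10.0, rng=random):
    """Monte Carlo estimate of Eq. (1) from paired real and generated samples.

    real, fake: equally long lists of equally long traces (lists of floats).
    lam: gradient penalty coefficient (lambda in Eq. (1)); 10.0 is an assumed default.
    """
    if len(real) != len(fake):
        raise ValueError("real and fake batches must have the same size")
    n = len(real)
    mean_fake = sum(critic(x, w, b) for x in fake) / n  # E[D(G(z))]
    mean_real = sum(critic(x, w, b) for x in real) / n  # E[D(x)]
    penalty = 0.0
    for x, x_tilde in zip(real, fake):
        eps = rng.random()  # interpolation weight in [0, 1)
        # x_hat = eps * x + (1 - eps) * x_tilde, a point between real and generated data
        x_hat = [eps * a + (1.0 - eps) * c for a, c in zip(x, x_tilde)]
        grad = critic_input_grad(x_hat, w)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalty += (grad_norm - 1.0) ** 2
    return mean_fake - mean_real + lam * penalty / n


if __name__ == "__main__":
    real_batch = [[1.0, 0.0], [0.0, 1.0]]
    fake_batch = [[0.0, 0.0], [0.0, 0.0]]
    # A critic whose input gradient has unit norm incurs (numerically) no penalty.
    print(wgan_gp_critic_loss(real_batch, fake_batch, w=[0.6, 0.8], b=0.0))
```

With a neural-network critic, the closed-form gradient would be replaced by a gradient obtained through automatic differentiation; the structure of the loss stays the same. The penalty term vanishes only when the critic's input gradient has unit norm at the interpolated points, which is how the formulation encourages the critic to be 1-Lipschitz.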

