SYNTHESISING REALISTIC CALCIUM TRACES OF NEURONAL POPULATIONS USING GAN

Abstract

Calcium imaging has become a powerful and popular technique to monitor the activity of large populations of neurons in vivo. However, owing to ethical considerations and despite recent technical developments, recordings are still constrained to a limited number of trials and animals. This limits the amount of data available from individual experiments and hinders the development of analysis techniques and models for more realistic sizes of neuronal populations. The ability to artificially synthesize realistic neuronal calcium signals could greatly alleviate this problem by scaling up the number of trials. To this end, we propose CalciumGAN, a Generative Adversarial Network (GAN) model based on the WaveGAN architecture that generates realistic calcium signals as seen in neuronal somata with calcium imaging, and we train it on calcium fluorescence signals with the Wasserstein distance. We first test the model on artificial data with known ground truth and show that the distribution of the generated signals closely resembles the underlying data distribution. We then train the model on real calcium traces recorded from the primary visual cortex of behaving mice and confirm that the spike trains deconvolved from the generated traces match the statistics of the recorded data. Together, these results demonstrate that our model can successfully generate realistic calcium traces, thereby providing the means to augment existing datasets of neuronal activity for enhanced data exploration and modelling.

1. INTRODUCTION

The ability to record neuronal activity accurately from behaving animals is essential for the study of information processing in the brain. Electrophysiological recording, which measures the rate of change in voltage by microelectrodes inserted in the cell membrane of a neuron, has high temporal resolution and is considered the most accurate method to measure spiking activity (Dayan & Abbott, 2001). However, this method is not without shortcomings (Harris et al., 2016). For instance, a single microelectrode can only detect activity from a few neurons in close proximity, and extensive pre-processing is required to infer single-unit activity from a multi-unit signal. Disentangling circuit computations in large-scale neuronal populations therefore remains a difficult task (Rey et al., 2015). Calcium imaging, on the other hand, monitors the calcium influx in the cell as a proxy for an action potential (Berridge et al., 2000). In contrast to electrophysiological recordings, this technique yields data with high spatial resolution and low temporal resolution (Grienberger & Konnerth, 2012), and has become a powerful imaging technique to monitor large neuronal populations. With the advancements in these recording technologies, it has become increasingly easy to obtain high-quality neuronal activity data in vivo. However, due to ethical considerations, the acquired datasets are often limited in the number of trials or the duration of each trial on a live animal. This poses a problem for assessing analysis techniques that take into account higher-order correlations (Brown et al., 2004; Staude et al., 2010; Stevenson & Kording, 2011; Saxena & Cunningham, 2019). Even for linear decoders, the number of trials can be more important for determining coding accuracy than the number of neurons (Stringer et al., 2019).
Generative models of neuronal activity hold the promise of alleviating the above problem by enabling the synthesis of an unlimited number of realistic samples for assessing advanced analysis methods. Popular modelling approaches such as the maximum entropy framework (Schneidman et al., 2006; Tkačik et al., 2014) and the latent variable model (Macke et al., 2009; Lyamzin et al., 2010) have shown ample success in modelling spiking activities, though many of these models re-

