SIGNAL CODING AND RECONSTRUCTION USING SPIKE TRAINS

Abstract

In many animal sensory pathways, the transformation from external stimuli to spike trains is essentially deterministic. In this context, a new mathematical framework for coding and reconstruction, based on a biologically plausible model of the spiking neuron, is presented. The framework considers encoding of a signal through spike trains generated by an ensemble of neurons via a standard convolve-then-threshold mechanism, albeit with a wide variety of convolution kernels. Neurons are distinguished by their convolution kernels and threshold values. Reconstruction is posited as a convex optimization that minimizes the energy of the reconstructed signal. Formal conditions under which perfect and approximate reconstruction of the signal from the spike trains is possible are then identified. Coding experiments on a large audio dataset are presented to demonstrate the strength of the framework.

1. INTRODUCTION

In biological systems, sensory stimuli are communicated to the brain primarily via ensembles of discrete events known as spikes: spatiotemporally compact electrical disturbances generated by neurons. Spike train representations of signals, when sparse, are not only intrinsically energy efficient, but can also facilitate downstream computation (6; 10). In their seminal work, Olshausen and Field (13) showed how efficient codes can arise from learning sparse representations of natural stimulus statistics, resulting in striking similarities with observed biological receptive fields. (19) developed a biophysically motivated spiking neural network which, for the first time, predicted the full diversity of V1 simple cell receptive field shapes when trained on natural images. Although these results signify substantial progress, an effective end-to-end signal processing framework that deterministically represents signals via spike train ensembles is yet to be laid out. Here we present a new framework for coding and reconstruction leveraging a biologically plausible coding mechanism that is a superset of the standard leaky integrate-and-fire neuron model (5). Our proposed framework identifies reconstruction guarantees for a very general class of signals, namely those with finite rate of innovation (18), as shown in our perfect and approximate reconstruction theorems. Most other classes, e.g., bandlimited signals, are subsets of this class. The proposed technique first formulates reconstruction as an optimization problem that minimizes the energy of the reconstructed signal subject to consistency with the spike trains, and then solves it in closed form. We then identify a general class of signals for which reconstruction is provably perfect under certain ideal conditions. Subsequently, we present a mathematical bound on the error of an approximate reconstruction when the model deviates from those ideal conditions.
Finally, we present simulation experiments coding a large dataset of audio signals that demonstrate the efficacy of the framework. In a separate set of experiments on a smaller subset of audio signals, we compare our framework with existing sparse coding algorithms, viz. matching pursuit and orthogonal matching pursuit, establishing the strength of our technique. The remainder of the paper is structured as follows. In Sections 2 and 3 we introduce the coding and decoding frameworks. Section 4 identifies the class of signals for which perfect reconstruction is achievable if certain ideal conditions are met. In Section 5 we discuss how those ideal conditions can be approached in practice and provide a mathematical bound for approximate reconstruction. Simulation results are presented in Section 6. We conclude in Section 8.

2. CODING

The general class of deterministic mappings (i.e., the set of all nonlinear operators) from continuous time signals to spike trains is difficult to characterize because the space of all spike trains does not lend itself to a natural topology that is universally embraced. The result is that simple characterizations, such as the set of all continuous operators, cannot be posited in a manner that has general consensus. To resolve this issue, we take a cue from biological systems. In most animal sensory pathways, the external stimulus passes through a series of transformations before being turned into spike trains (17). For example, the visual signal in the retina is processed by multiple layers of non-spiking horizontal, amacrine and bipolar cells before being converted into spike trains by the retinal ganglion cells. Accordingly, we consider the set of transformations that pass via an intermediate continuous time signal, which is then transformed into a spike train through a stereotyped mapping in which spikes mark threshold crossings. The complexity of the operator now lies in the mapping from the continuous time input signal to the continuous time intermediate signal. Since any time invariant, continuous, nonlinear operator with fading memory can be approximated by a finite Volterra series operator (2), this general class of nonlinear operators from continuous time signals to spike trains can be modeled as the composition of a finite Volterra series operator and a neuronal thresholding operation that generates a spike train. Here, the simplest subclass of these transformations is considered: the case where the Volterra series operator has a single causal, bounded-time, linear term, the output of which is composed with a thresholding operation with a potentially time-varying threshold. The overall operator from the input signal to the spike train remains nonlinear due to the thresholding operation.
The code generated by an ensemble of such transformations, corresponding to an ensemble of spike trains, is explored. Formally, we assume the input signal X(t) to be a bounded square integrable function over the compact interval [0, T] for some T ∈ ℝ⁺, i.e., we are interested in the class of input signals F = {X(t) | X(t) ∈ L²[0, T]}. Since the framework involves signal snippets of arbitrary length, this choice of T is without loss of generality. We assume an ensemble of convolution kernels K = {K_j | j ∈ ℤ⁺, j ≤ n}, consisting of n kernels K_j, j = 1, ..., n. We assume that K_j(t) is a continuous function on a bounded time interval [0, T], i.e., ∀j ∈ {1, ..., n}, K_j(t) ∈ C[0, T], T ∈ ℝ⁺. Finally, we assume that K_j has a time-varying threshold denoted by T_j(t). The ensemble of convolution kernels K encodes a given input signal X(t) into a sequence of spikes {(t_i, K_{j_i})}, where the i-th spike is produced by the j_i-th kernel K_{j_i} at time t_i if and only if:

∫ X(τ) K_{j_i}(t_i − τ) dτ = T_{j_i}(t_i)

In our experiments a specific threshold function is assumed in which the time-varying threshold T_j(t) of the j-th kernel remains constant at C_j until that kernel produces a spike, at which time an after-hyperpolarization potential (ahp) raises the threshold to a high value M_j ≫ C_j, which then drops back linearly to its original value within a refractory period δ_j. Stated formally,

T_j(t) = C_j,                                    if t − δ_j > t_l^j(t)
T_j(t) = M_j − (t − t_l^j(t))(M_j − C_j)/δ_j,    if t − δ_j ≤ t_l^j(t)

where t_l^j(t) denotes the time of the last spike generated by K_j prior to time t.
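In discrete time, the encoding rule above can be sketched as follows. This is a minimal illustration, not the paper's implementation: kernels are sampled arrays, the convolution integral becomes a Riemann sum, and a ≥ test stands in for the exact equality crossing; all variable names are our own.

```python
import numpy as np

def encode(x, kernels, C, M, delta, dt=1e-3):
    """Convolve-then-threshold encoder sketch.

    x       : sampled input signal X(t)
    kernels : list of sampled kernels K_j (1-D arrays)
    C, M    : baseline threshold C_j and post-spike peak M_j per kernel
    delta   : refractory period delta_j per kernel
    Returns a list of spikes (t_i, j_i).
    """
    n_steps = len(x)
    last = np.full(len(kernels), -np.inf)          # time of last spike per kernel
    # causal convolutions (X * K_j)(t), Riemann-sum approximation of the integral
    conv = [np.convolve(x, k)[:n_steps] * dt for k in kernels]
    spikes = []
    for i in range(n_steps):
        t = i * dt
        for j in range(len(kernels)):
            s = t - last[j]
            if s <= delta[j]:                      # inside refractory window: ahp ramp
                thr = M[j] - s * (M[j] - C[j]) / delta[j]
            else:
                thr = C[j]                         # baseline threshold C_j
            if conv[j][i] >= thr:                  # discrete surrogate for the crossing
                spikes.append((t, j))
                last[j] = t
    return spikes
```

Running this on a constant signal with a box kernel produces a regular spike train whose rate is set by C_j and the refractory ramp, mirroring the qualitative behavior described above.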

3. DECODING

How rich is the coding mechanism just described? We can investigate this question formally by positing a decoding module whose objective is to reconstruct the original signal from the encoded ensemble of spike trains. Note that for the proposed framework to communicate signals properly, the decoding module must operate solely on the spike train data handed over by the encoding module, without explicit access to the input signal itself. To examine the invertibility of the coding scheme, we seek a signal that satisfies the same set of constraints as the original signal when generating all spikes with respect to the set of kernels in the ensemble K. Recognizing that such a signal might not be unique, we choose the reconstructed signal to be the one with minimum L2-norm. Formally, the reconstruction (denoted by X*(t)) of the input signal X(t) is formulated as the solution to the optimization problem:


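A standard way to realize this minimum-L2-norm formulation can be sketched as follows. Each spike (t_i, j_i) fixes the inner product of X with the time-reversed, shifted kernel φ_i(t) = K_{j_i}(t_i − t) at the threshold value; the minimum-norm signal satisfying such linear equality constraints lies in the span of the φ_i, with coefficients obtained from the Gram matrix. This is a generic sketch under those assumptions, not necessarily the paper's closed form; the helper `kernel_val` and all names are our own.

```python
import numpy as np

def kernel_val(k, s, dt):
    """Evaluate a sampled kernel at lag s, zero outside its support (hypothetical helper)."""
    i = int(round(s / dt))
    return k[i] if 0 <= i < len(k) else 0.0

def reconstruct(spikes, kernels, thresholds, grid, dt):
    """Minimum-L2-norm reconstruction sketch.

    Each spike (t_i, j_i) imposes the linear constraint
        <X, phi_i> = T_i,   phi_i(t) = K_{j_i}(t_i - t).
    The minimum-norm solution lies in span{phi_i}:
        X* = sum_i a_i phi_i,  with Gram system  G a = T.
    """
    Phi = np.array([[kernel_val(kernels[j], t_i - t, dt) for t in grid]
                    for (t_i, j) in spikes])       # constraint functions on the grid
    G = Phi @ Phi.T * dt                           # Gram matrix of inner products
    a = np.linalg.lstsq(G, thresholds, rcond=None)[0]
    return a @ Phi                                 # X*(t) sampled on the grid
```

By construction the returned signal reproduces every spike-time constraint: convolving it with the corresponding kernel recovers the threshold value at each spike time, while `lstsq` keeps the solution well defined even when the Gram matrix is singular (e.g., duplicated constraints).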