ACTIVE DEEP PROBABILISTIC SUBSAMPLING

Abstract

Subsampling a signal of interest can reduce costly data transfer, battery drain, radiation exposure and acquisition time in a wide range of problems. The recently proposed Deep Probabilistic Subsampling (DPS) method effectively integrates subsampling in an end-to-end deep learning model, but learns a static pattern for all datapoints. We generalize DPS to a sequential method that actively picks the next sample based on the information acquired so far; dubbed Active-DPS (A-DPS). We validate that A-DPS improves over DPS for MNIST classification at high subsampling rates. We observe that A-DPS learns to actively adapt based on the previously sampled elements, yielding different sampling sequences across the dataset. Moreover, we demonstrate strong performance in active acquisition Magnetic Resonance Image (MRI) reconstruction, outperforming DPS and other deep learning methods.

1. INTRODUCTION

Present-day technologies produce and consume vast amounts of data, typically acquired using an analog-to-digital converter (ADC). The amount of data digitized by an ADC is determined not only by the temporal sampling rate, but also by the manner in which spatial acquisitions are taken, e.g. through a specific design of sensor arrays. Reducing the number of acquired samples can meaningfully reduce scanning time, e.g. in Magnetic Resonance Imaging (MRI), radiation exposure, e.g. in Computed Tomography (CT), battery drain, and bandwidth requirements. While the Nyquist theorem traditionally provides theoretical bounds on the sampling rate, in recent years signal reconstruction from sub-Nyquist sampled data has been achieved through a framework called Compressive Sensing (CS). First proposed by Donoho (2006), and later applied to MRI by Lustig et al. (2007), CS leverages structural signal priors, specifically sparsity under some known transform. By taking compressive measurements followed by iterative optimization of a linear system under said sparsity prior, the original signal can be reconstructed despite sampling at sub-Nyquist rates. Researchers have employed CS with great success in a wide variety of applications, such as radar (Baraniuk & Steeghs, 2007; Ender, 2010), seismic surveying (Herrmann et al., 2012), spectroscopy (Sanders et al., 2012), and medical imaging (Han et al., 2016; Lai et al., 2016).

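The CS recipe above (compressive measurements followed by iterative optimization under a sparsity prior) can be made concrete with a minimal NumPy sketch of the Iterative Shrinkage-Thresholding Algorithm (ISTA). This is an illustrative toy, not the setup of any cited work: it assumes sparsity in the canonical basis and a random Gaussian measurement matrix, and all names are hypothetical.

```python
import numpy as np

def ista(A, y, lam=0.01, n_iter=2000):
    """ISTA for the lasso: min_x 0.5 * ||A @ x - y||^2 + lam * ||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))   # gradient step on the data-fidelity term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return x

# Recover a k-sparse signal of length n from m < n random measurements.
rng = np.random.default_rng(0)
n, m, k = 100, 40, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)  # compressive measurement matrix
y = A @ x_true                                # sub-Nyquist measurements
x_hat = ista(A, y)
```

Note the two ingredients the main text highlights: the measurement matrix `A` is random (agnostic to the data distribution), and recovery is iterative, which is exactly what the learned-sampling methods discussed next aim to improve upon.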
However, both the need to know the sparsifying basis of the data and the iterative nature of the reconstruction algorithms still hamper the practical applicability of CS in many situations. These limitations can be overcome by deep learning reconstruction models, which make the sparsity assumption implicit and enable non-iterative inference once trained. Moreover, the (typically random) measurement matrix in CS, despite adhering to the given assumptions, does not necessarily yield optimal measurements given the underlying data statistics and the downstream system task. This has recently been tackled by algorithms that learn the sampling scheme from a data distribution. In general, these data-driven sampling algorithms can be divided into two categories: algorithms that learn sampling schemes which are fixed once learned (Huijben et al., 2020a;b;c; Ravishankar & Bresler, 2011; Sanchez et al., 2020; Bahadir et al., 2019; 2020; Weiss et al., 2019), and algorithms that learn to actively sample (Ji et al., 2008; Zhang et al., 2019; Jin et al., 2019; Pineda et al., 2020; Bakker et al., 2020), selecting each new sample based on sequentially acquired information. The former class of algorithms learns a sampling scheme that, on average, selects informative samples across all instances originating from the training distribution. However, when this distribution is multi-modal, a single globally optimized sampling scheme can easily be sub-optimal at the instance level. Active acquisition algorithms deal with such shifts in underlying data statistics by conditioning the sampling behavior on previously acquired information from the instance (e.g. the image to be sampled). This results in a sampling sequence that varies across test instances, i.e. sampling adapts to the new data.
This adaptation through conditioning promises lower achievable sampling rates, or better downstream task performance at the same rate, compared to sampling schemes that operate identically on all data. In this work, we extend the Deep Probabilistic Subsampling (DPS) framework (Huijben et al., 2020a) to an active acquisition framework by making the sampling procedure iterative and conditional on the samples already acquired, see Fig. 1. We refer to our method as Active Deep Probabilistic Subsampling (A-DPS). We show how A-DPS clearly exploits the ten different modalities (i.e. the digits) present in the MNIST dataset to adopt instance-adaptive sampling sequences. Moreover, we demonstrate both on MNIST (LeCun et al., 1998) and on the real-world fastMRI knee dataset (Zbontar et al., 2018) that A-DPS outperforms other state-of-the-art models for learned sub-Nyquist sampling. We will make all code publicly available upon publication, to facilitate benchmarking against A-DPS and all provided baselines in future research.
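The active acquisition loop can be sketched as follows. This is a heavily simplified, untrained toy in NumPy: a fixed random linear map stands in for the learned conditioning network, categorical draws use the Gumbel-max trick (DPS trains through a relaxed version of this), and the names `context_net` and `active_sampling` are hypothetical.

```python
import numpy as np

def gumbel_max_sample(logits, rng):
    """Draw one index from a categorical distribution via the Gumbel-max trick."""
    g = -np.log(-np.log(rng.random(logits.shape)))
    return int(np.argmax(logits + g))

def active_sampling(x, n_select, context_net, rng):
    """Sequentially pick samples of x, conditioning each new choice on the
    values acquired so far (the core idea behind active acquisition)."""
    n = x.shape[0]
    acquired = np.zeros(n)           # observed values (zero where unsampled)
    mask = np.zeros(n, dtype=bool)   # which indices have been sampled
    for _ in range(n_select):
        logits = context_net(acquired)   # condition on previously acquired samples
        logits[mask] = -np.inf           # never re-sample the same index
        idx = gumbel_max_sample(logits, rng)
        mask[idx] = True
        acquired[idx] = x[idx]
    return mask, acquired

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))    # stand-in for a trained conditioning network
x = rng.standard_normal(16)
mask, acquired = active_sampling(x, n_select=4, context_net=lambda a: W @ a, rng=rng)
```

Because the logits depend on `acquired`, two different inputs generally induce two different sampling sequences, which is exactly the instance-adaptive behavior that distinguishes active methods from fixed learned patterns.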

2. RELATED WORK

Recently, several techniques for learning a fixed sampling pattern have been proposed, especially in the field of MR imaging, in which Ravishankar & Bresler (2011) were among the first. The authors make use of non-overlapping cells in k-space and move samples between these cells. During training, Ravishankar & Bresler (2011) alternate between reconstruction and relocation of sampling positions. After a reconstruction step, they sort the cells in terms of reconstruction error and an infinite-p norm. Selected samples from lower-scoring cells are relocated to higher-scoring cells in a greedy fashion. Sanchez et al. (2020) also propose a greedy approach, in which samples are not relocated between cells, but greedily chosen to optimize a reconstruction loss on a batch of examples. Neither type of greedy optimization, however, allows for joint learning of the sampling scheme together with a downstream reconstruction/task model, as the reconstruction has to either be parameter-free or pretrained to work well with a variety of sampling schemes. One of the first active sampling schemes was proposed by Ji et al. (2008), who leverage CS reconstruction techniques that, through Bayesian modeling, also provide a measure of uncertainty in the reconstruction. Ji et al. (2008) leveraged this uncertainty to adaptively select the next measurement that reduces it by the largest amount. However, this method, like other similar works (Carson et al., 2012; Li et al., 2013), relies on linearly combined measurements, rather than the discrete sampling with which we concern ourselves here.
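The greedy strategy of choosing samples to optimize a batch reconstruction loss can be sketched as below. This is an illustrative toy, not the cited method: `mean_fill` is a hypothetical parameter-free reconstructor (it imputes unsampled entries with the batch mean), standing in for the fixed or pretrained reconstructors the text mentions.

```python
import numpy as np

def greedy_mask(X, n_select, reconstruct):
    """Greedily grow a sampling mask: at each step, add the index whose
    inclusion most reduces reconstruction error over a batch X (rows = signals)."""
    n = X.shape[1]
    mask = np.zeros(n, dtype=bool)
    for _ in range(n_select):
        best_idx, best_err = -1, np.inf
        for i in np.flatnonzero(~mask):          # try every unsampled index
            trial = mask.copy()
            trial[i] = True
            err = np.mean((reconstruct(X * trial, trial) - X) ** 2)
            if err < best_err:
                best_idx, best_err = i, err
        mask[best_idx] = True                    # keep the best candidate
    return mask

def mean_fill(X_obs, mask, mu):
    """Parameter-free 'reconstructor': impute unsampled entries with the batch mean."""
    out = X_obs.copy()
    out[:, ~mask] = mu[~mask]
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 10))
X[:, [2, 5, 8]] *= 10.0                          # three high-variance coordinates
mu = X.mean(axis=0)
mask = greedy_mask(X, 3, lambda Xo, m: mean_fill(Xo, m, mu))
# Greedy selection picks the three high-variance (most informative) coordinates.
```

The sketch also makes the limitation in the text tangible: `reconstruct` is evaluated, never trained, so the sampler and a learnable reconstruction model cannot be optimized jointly in this scheme.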



Figure 1: Architectural overview for 3 acquisition steps of A-DPS, with extensions of DPS shown in red.

Bahadir et al. (2019), on the other hand, propose to learn the sampling pattern by thresholding pixel-based i.i.d. samples drawn from a uniform distribution, dubbed Learning-based Optimization of the Under-sampling PattErn (LOUPE). The sampling rate of LOUPE is controlled indirectly by promoting sparsity through an ℓ1 penalty on the thresholds.
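A minimal sketch of this thresholding idea, under simplifying assumptions: learnable weights define per-pixel sampling probabilities, and the binary mask is obtained by comparing i.i.d. uniform noise against them. The relaxed (sigmoid) mask shown alongside is one common way to keep such a comparison differentiable during training; variable names and the 8x8 layout are hypothetical, and the real method operates on k-space.

```python
import numpy as np

def loupe_style_mask(weights, rng, slope=5.0):
    """Stochastic sampling mask from learnable per-pixel weights:
    binarize i.i.d. uniform noise against sigmoid(weights)."""
    probs = 1.0 / (1.0 + np.exp(-weights))             # per-pixel sampling probability
    u = rng.random(weights.shape)                      # i.i.d. uniform samples
    hard = (u < probs).astype(float)                   # binary mask (forward pass)
    soft = 1.0 / (1.0 + np.exp(-slope * (probs - u)))  # relaxed mask (for gradients)
    return hard, soft, probs

rng = np.random.default_rng(0)
weights = rng.standard_normal((8, 8))                  # learnable parameters
hard, soft, probs = loupe_style_mask(weights, rng)
l1_penalty = probs.sum()   # sparsity regularizer pushing the sampling rate down
```

Minimizing `l1_penalty` alongside the reconstruction loss drives the probabilities toward zero, which is how the sampling rate is controlled indirectly rather than being fixed in advance.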

