ONE-STEP ESTIMATOR FOR PERMUTED SPARSE RECOVERY

Abstract

This paper considers unlabeled sparse recovery under multiple measurements, i.e., Y = ΠXB + W, where Y, Π, X, B, and W represent the observations, the missing (or incomplete) correspondence information, the sensing matrix, the sparse signals, and the additive sensing noise, respectively. Different from the previous works on multiple measurements (m > 1), which all focus on the sufficient-samples regime, namely n > p, we consider a sparse signal matrix B and investigate the insufficient-samples regime (i.e., n ≪ p) for the first time. To begin with, we establish lower bounds on the sample number n and the signal-to-noise ratio (SNR) required for correct permutation recovery. Moreover, we present a simple yet effective estimator. Under mild conditions, we show that our estimator restores the correct correspondence information with high probability. Numerical experiments are presented to corroborate our theoretical claims.

1. INTRODUCTION

In recent years, linear regression with permuted correspondence has received increasing attention due to its wide applications in machine learning, signal processing, and statistics. Among these applications, the two most prominent examples are (i) record linkage, which merges two datasets pertaining to the same objects into one comprehensive dataset; and (ii) data de-anonymization, which infers the hidden labels of private data from public datasets. Apart from these two, other applications include pose and correspondence estimation in graphics, time-domain sampling in the presence of clock jitter, multi-target tracking, and unsupervised data alignment (Pananjady et al., 2018; Slawski & Ben-David, 2019; Slawski et al., 2020; Zhang et al., 2018).

In this paper, we consider the canonical model, i.e., a linear sensing relation with permuted labels: Y = ΠXB + W, where Y ∈ R^{n×m} is the sensing result, Π ∈ R^{n×n} is an unknown permutation matrix, X ∈ R^{n×p} is the design (sensing) matrix, B ∈ R^{p×m} represents the sparse signals of interest, and W ∈ R^{n×m} denotes the additive noise. Assuming that B is sparse, or more specifically, that each column of B is k-sparse, we would like to (i) study the statistical limits of permutation recovery in this scenario, e.g., the minimum sample number n and signal-to-noise ratio (SNR); and (ii) propose a practical estimator that efficiently recovers the permutation once these minimum requirements are met. To begin with, we briefly review the previous works.

Related Works. The study of permuted linear regression has a long history, dating back at least to DeGroot & Goel (1976; 1980); Goel (1975); Bai & Hsing (2005). Recent interest in this area starts from Unnikrishnan et al. (2015). Focusing on the noiseless case W = 0 with a single measurement (m = 1), Unnikrishnan et al. (2015) establish the necessary condition n ≥ 2p for permutation recovery when B is an arbitrary vector residing within the linear space R^p. Later, Pananjady et al. (2018) extend the analysis to the noisy scenario. They show that the minimum SNR must be at least of order Ω(n^c), where c > 0 is some positive constant; numerical experiments suggest c lies within the region [4, 5]. Other works, such as Hsu et al. (2017); Abid et al. (2017); Slawski & Ben-David (2019); Tsakiris et al. (2020); Haghighatshoar & Caire (2018), also focus on this regime and obtain the same answer. In Emiya et al. (2014), the setting with a sparse signal B is first studied; however, only an empirical investigation is conducted, without rigorous theoretical analysis. In the first work with theoretical analysis (Zhang & Li, 2021), both the statistical limits and practical estimators with almost optimal performance are presented for permutation recovery. Peng et al. (2021) study the problem from the viewpoint of algebraic geometry. All existing works suggest that SNR = Ω(n^c) is inevitable for permutation reconstruction when only one measurement is conducted, namely m = 1.

On the other hand, numerous works suggest that multiple measurements, i.e., m > 1, can greatly reduce the SNR requirement, even down to some positive constant. This line of research starts from Zhang et al. (2022), where the information-theoretic lower bounds and the maximum-likelihood (ML) estimator are investigated. Later, Zhang & Li (2020) study this problem from the viewpoint of non-convex optimization and propose an optimal estimator for permutation recovery. Independently, Slawski et al. (2020) investigate this problem from the viewpoint of denoising: putting parsimonious constraints on the number of permuted rows, they view (I - Π)XB as sparse outliers and design the permutation-recovery algorithm accordingly. These works all focus on the sufficient-samples regime, namely n = Ω(p). In this paper, we focus instead on the insufficient-samples regime: assuming B to be sparse, we show that the correct permutation can be obtained with n ≪ p and SNR = O(1).
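As a concrete illustration of the sensing model, the following sketch simulates Y = ΠXB + W with k-sparse columns in B. All dimensions, the sparsity level, and the noise scale are illustrative choices (not values from the paper); note n ≪ p, i.e., the insufficient-samples regime.

```python
import numpy as np

# Simulate the permuted sparse sensing model Y = Pi X B + W.
rng = np.random.default_rng(0)
n, p, m, k = 50, 200, 10, 3            # illustrative: n << p

X = rng.standard_normal((n, p))        # design (sensing) matrix

# B has k-sparse columns: exactly k non-zero entries per column.
B = np.zeros((p, m))
for j in range(m):
    support = rng.choice(p, size=k, replace=False)
    B[support, j] = rng.standard_normal(k)

pi = rng.permutation(n)                # unknown permutation pi(i)
Pi = np.eye(n)[pi]                     # permutation matrix: (Pi Z)[i] = Z[pi(i)]

sigma = 0.1                            # noise level
W = sigma * rng.standard_normal((n, m))
Y = Pi @ X @ B + W                     # observations with shuffled rows

# SNR as defined in this paper: ||B||_F^2 / (m * sigma^2).
snr = np.linalg.norm(B, "fro") ** 2 / (m * sigma ** 2)
```

The recovery task is then to estimate both Π and B from (Y, X) alone.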
Our contributions are summarized as follows.

• We study the lower bounds w.r.t. the sample number n and the signal-to-noise ratio (SNR) for the correct reconstruction of both the permutation matrix Π and the signal B, assuming each column of B is k-sparse.

• We propose a one-step estimator for the correspondence recovery, which consists of two sub-parts: one for Π and another for B. By formulating the correspondence recovery as a linear assignment problem (LAP) (Kuhn, 1955; Bertsekas & Castañón, 1992; Burkard et al., 2012), the correct permutation matrix can be obtained once the SNR exceeds a certain positive constant.

On top of the above contributions, we would like to briefly mention our proof strategy, which is based on a tailored version of the leave-one-out technique. Compared with the previous works that adopt the leave-one-out technique (Chen et al., 2020; Sur et al., 2019; El Karoui, 2013; 2018; Cai et al., 2021), our construction method has the following characteristics.

• We not only leave out the rows, but also modify the thresholding operator applied to the perturbed samples B^{(·)} from thres(·) to (·)_{i_max} (its definition is deferred to Subsection 4.2). This step is essential for controlling the approximation error, since otherwise the non-zero elements in the matrices B (∝ X⊤Y) and B^{(·)} may not share the same positions, and the approximation error can be considerably large. A thorough understanding is deferred to the proof of Theorem 3.

• Our construction method is adaptive: it replaces multiple rows, ranging from 2 to 4, depending on each permuted row. Meanwhile, previous works such as Chen et al. (2020); Sur et al. (2019); El Karoui (2013; 2018); Cai et al. (2021) replace a fixed number of rows (or columns).

Notations. Denote by c, c′, c_i some positive constants, whose values are not necessarily the same even under the same notation. We write a ≲ b if there exists a positive constant c₀ > 0 such that a ≤ c₀b. Similarly, we write a ≳ b provided a ≥ c₀b for some positive constant c₀. We write a ≍ b when a ≲ b and a ≳ b hold simultaneously. For an arbitrary matrix M, we denote by M_{i,:} its ith row, by M_{:,i} its ith column, and by M_{ij} its (i, j)th element. Its Frobenius norm is denoted by ‖M‖_F and its operator norm by ‖M‖_OP; their definitions can be found in Section 2.3 of Golub & Van Loan (2013).
In addition, for a matrix M we define its stable rank as srank(M) = ‖M‖²_F / ‖M‖²_OP (Section 2.1.15 in Tropp (2015)) and its support set as supp(M) = {(i, j) : M_{ij} ≠ 0}. The inner product between two matrices, as well as between two vectors, is denoted by ⟨·, ·⟩. We define the set of all possible permutation matrices as P_n = {Π ∈ {0, 1}^{n×n} : Σ_i Π_{ij} = 1 for all j, and Σ_j Π_{ij} = 1 for all i}. Associated with each permutation matrix Π, we define the operator π(·) that transforms index i to π(i) under Π. The Hamming distance d_H(Π₁, Π₂) between two permutation matrices Π₁ and Π₂ is defined as d_H(Π₁, Π₂) = Σ_{i=1}^n 1(π₁(i) ≠ π₂(i)). The SNR is defined as ‖B‖²_F / (m·σ²).
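To make the pipeline concrete, the following is a simplified sketch of the one-step idea (not the paper's exact estimator or thresholding rule): (i) form the proxy B ∝ X⊤Y, (ii) keep the k largest-magnitude entries per column, and (iii) recover the permutation by solving an LAP with SciPy's linear_sum_assignment. For the demo to work with this crude proxy, only a few rows are shuffled and the signal is strong; all constants are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
n, p, m, k = 100, 300, 12, 2
X = rng.standard_normal((n, p))

# Strong k-sparse columns (illustrative magnitudes, not from the paper).
B = np.zeros((p, m))
for j in range(m):
    idx = rng.choice(p, size=k, replace=False)
    B[idx, j] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(8.0, 10.0, size=k)

pi_true = np.arange(n)                   # shuffle only 10 of the n rows
sub = rng.choice(n, size=10, replace=False)
pi_true[sub] = rng.permutation(sub)
Y = (X @ B)[pi_true] + 0.1 * rng.standard_normal((n, m))

B_hat = X.T @ Y / n                      # step 1: one-step proxy for B
for j in range(m):                       # step 2: column-wise hard thresholding
    small = np.argsort(np.abs(B_hat[:, j]))[:-k]
    B_hat[small, j] = 0.0

Z = X @ B_hat                            # step 3: LAP between rows of Y and Z
cost = -(Y @ Z.T)                        # maximize row correlations
_, pi_hat = linear_sum_assignment(cost)  # estimated permutation

# Hamming distance d_H between the estimate and the truth.
d_H = int(np.sum(pi_hat != pi_true))
```

The LAP step is exactly the linear assignment formulation cited above (Kuhn, 1955); linear_sum_assignment solves it in polynomial time, so the overall procedure stays practical.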






