EFFECTIVE SUBSPACE INDEXING VIA INTERPOLATION ON STIEFEL AND GRASSMANN MANIFOLDS

Abstract

We propose a novel local Subspace Indexing Model with Interpolation (SIM-I) for low-dimensional embedding of image data sets. SIM-I is constructed in two steps: in the first step we build a piece-wise linear, affinity-aware subspace model under a given partition of the data set; in the second step we interpolate between several adjacent linear subspace models from the first step via a "center of mass" calculation on Stiefel and Grassmann manifolds. The resulting subspace indexing model is a globally non-linear low-dimensional embedding of the original data set. Furthermore, the interpolation step produces a "smoothed" version of the piece-wise linear embedding constructed in the first step, and can be viewed as a regularization procedure. We provide experimental results validating the effectiveness of SIM-I: it improves PCA recovery on the SIFT data set and nearest-neighbor classification success rates on the MNIST and CIFAR-10 data sets.



Subspace models have been successful in many application problems related to dimension reduction (Zhou et al. (2010), Bian & Tao (2011), Si et al. (2010), Zhang et al. (2009)), with applications including, e.g., human face recognition (Fu & Huang (2008)) and speech and gait recognition (Tao et al. (2007)). Classical approaches to subspace selection for dimension reduction include Principal Component Analysis (PCA, see Jolliffe (2002)) and Linear Discriminant Analysis (LDA, see Belhumeur et al. (1997), Tao et al. (2009)). These methods seek globally linear subspace models. Therefore, they fail to capture the nonlinearity of the intrinsic data manifold and ignore the local variation of the data (Saul & Roweis (2003), Strassen (1969)). Consequently, these globally linear models are often ineffective for search problems on large-scale image data sets. To resolve this difficulty, nonlinear approaches such as kernel algorithms (Ham et al. (2004)) and manifold learning algorithms (Belkin et al. (2006), Guan et al. (2011)) have been proposed. However, even though these nonlinear methods significantly improve recognition performance, they face a serious computational challenge on large-scale data sets due to the complexity of matrix decompositions whose size grows with the number of training samples.
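To make the first step of the construction concrete, the following is a minimal sketch of a piece-wise linear subspace model: the data set is partitioned (here by nearest anchor point, a crude stand-in for whatever partition scheme is used) and a local PCA basis is fitted in each cell. All function names and the anchor-based partition are illustrative choices, not taken from the paper.

```python
import numpy as np

def fit_local_pca(X, n_parts, dim, seed=0):
    """Partition X (N x D) by nearest of n_parts anchor points drawn
    from the data, then fit a rank-`dim` local PCA basis per cell."""
    rng = np.random.default_rng(seed)
    anchors = X[rng.choice(len(X), n_parts, replace=False)]
    labels = np.argmin(((X[:, None] - anchors[None]) ** 2).sum(-1), axis=1)
    models = []
    for k in range(n_parts):
        Xk = X[labels == k]
        mu = Xk.mean(0)
        # top-`dim` right singular vectors = local principal directions
        _, _, Vt = np.linalg.svd(Xk - mu, full_matrices=False)
        models.append((anchors[k], mu, Vt[:dim].T))  # basis: D x dim
    return models

def embed(x, models):
    """Embed a query point with the PCA basis of its nearest cell."""
    a, mu, B = min(models, key=lambda m: np.sum((x - m[0]) ** 2))
    return B.T @ (x - mu)
```

The embedding map above is piece-wise linear and generally discontinuous across cell boundaries, which is precisely what the second (interpolation) step is meant to smooth out.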

Figure 1: The idea of "smoothing" a piece-wise linear low-dimensional embedding model: (a) the piece-wise linear low-dimensional embedding model built from LPP; (b) the regularized low-dimensional embedding obtained by taking the Stiefel/Grassmann manifold center of mass among adjacent linear pieces.
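The center-of-mass step in (b) can be sketched numerically. The code below computes a weighted Karcher mean of subspaces on the Grassmann manifold using the standard log/exp maps for orthonormal bases; this is one common way to realize a manifold center of mass, offered here as an illustrative sketch rather than the paper's exact procedure, and the function names are mine.

```python
import numpy as np

def grassmann_log(U, Y):
    """Log map at base point U of the subspace spanned by Y
    (both n x p with orthonormal columns)."""
    UtY = U.T @ Y
    M = (Y - U @ UtY) @ np.linalg.inv(UtY)
    Q, S, Vt = np.linalg.svd(M, full_matrices=False)
    return Q @ np.diag(np.arctan(S)) @ Vt

def grassmann_exp(U, Delta):
    """Exp map at U along the tangent vector Delta (n x p)."""
    Q, S, Vt = np.linalg.svd(Delta, full_matrices=False)
    return U @ Vt.T @ np.diag(np.cos(S)) @ Vt + Q @ np.diag(np.sin(S)) @ Vt

def grassmann_mean(subspaces, weights, iters=20):
    """Weighted center of mass (Karcher mean) of a list of subspaces:
    repeatedly average the log maps and step along the exp map."""
    U = subspaces[0]
    for _ in range(iters):
        T = sum(w * grassmann_log(U, Y) for w, Y in zip(weights, subspaces))
        U, _ = np.linalg.qr(grassmann_exp(U, T))  # re-orthonormalize
    return U
```

For two subspaces with equal weights the iteration returns the geodesic midpoint, i.e. the local linear models are blended rather than switched abruptly at the partition boundary.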

