EFFICIENT COVARIANCE ESTIMATION FOR SPARSIFIED FUNCTIONAL DATA

Abstract

To avoid the prohibitive computational cost of transmitting the entire data, we propose four sparsification schemes, RANDOM-KNOTS, RANDOM-KNOTS-SPATIAL, B-SPLINE, and BSPLINE-SPATIAL, and present the corresponding nonparametric estimators of the covariance function. The covariance estimators are asymptotically equivalent to the sample covariance computed directly from the original data, and the estimated functional principal components effectively approximate the infeasible principal components under regularity conditions. The convergence rates indicate that leveraging spatial correlation and B-spline interpolation helps to reduce information loss. A data-driven selection method is further applied to determine the number of eigenfunctions in the model. Extensive numerical experiments are conducted to illustrate the theoretical results.¹

1. INTRODUCTION

Dimension reduction has received increasing attention to avoid expensive and slow computation. Stich et al. (2018) investigated the convergence rate of Stochastic Gradient Descent after sparsification. Jhunjhunwala et al. (2021) focused on the mean function of a vector containing only a subset of the original vector. The goal of this paper is to estimate the covariance function of sparsified functional data, which is a set of sparsified vectors collected from a distributed system of nodes.

Functional data analysis (FDA) has become an important research area due to its wide applications. Classical FDA requires a large number of regularly spaced measurements per subject. The data takes the form $\{(x_{ij}, j/d),\ 1 \le i \le n,\ 1 \le j \le d\}$, in which $x_i(\cdot)$ is a latent smooth trajectory,

$$x_i(\cdot) = m(\cdot) + Z_i(\cdot). \tag{1}$$

The deterministic function $m(\cdot)$ denotes the common population mean, and the random $Z_i(\cdot)$ are subject-specific small variations with $\mathbb{E} Z_i(\cdot) = 0$. Both $m(\cdot)$ and $Z_i(\cdot)$ are smooth functions of time $t = j/d$, which is rescaled to the domain $\mathcal{D} = [0, 1]$. Trajectories $x_i(\cdot)$ are identically distributed realizations of the continuous stochastic process $\{x(t), t \in \mathcal{D}\}$, $\mathbb{E} \sup_{t \in \mathcal{D}} |x(t)|^2 < +\infty$, which can be decomposed as $x(\cdot) = m(\cdot) + Z(\cdot)$, $\mathbb{E} Z(t) = 0$.

The true covariance function is $G(t, t') = \mathrm{Cov}\{Z(t), Z(t')\}$. Let sequences $\{\lambda_k\}_{k=1}^{\infty}$ and $\{\psi_k\}_{k=1}^{\infty}$ be the eigenvalues and eigenfunctions of $G(t, t')$, respectively, in which $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$, $\sum_{k=1}^{\infty} \lambda_k < \infty$, and $\{\psi_k\}_{k=1}^{\infty}$ form an orthonormal basis of $L^2[0, 1]$; see Hsing & Eubank (2015). Mercer Lemma entails that the $\psi_k$'s are continuous and

$$G(t, t') = \sum_{k=1}^{\infty} \lambda_k \psi_k(t) \psi_k(t'), \qquad \int G(t, t') \psi_k(t')\, dt' = \lambda_k \psi_k(t).$$

The standard process $x(\cdot)$ allows the Karhunen-Loève $L^2$ representation $x(\cdot) = m(\cdot) + \sum_{k=1}^{\infty} \xi_k \phi_k(\cdot)$, in which the random coefficients $\xi_k$, called functional principal component (FPC) scores, are uncorrelated with mean 0 and variance 1. The rescaled eigenfunctions $\phi_k$, called FPC, satisfy $\phi_k = \sqrt{\lambda_k}\, \psi_k$ and $\int \{x(t) - m(t)\} \phi_k(t)\, dt = \lambda_k \xi_k$ for $k \ge 1$. Although the sequences $\{\lambda_k\}_{k=1}^{\infty}$, $\{\phi_k\}_{k=1}^{\infty}$ and $\{\xi_{ik}\}_{i=1,k=1}^{n,\infty}$ exist mathematically, they are either unknown or unobservable.
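To make the notation above concrete, the following minimal Python sketch simulates trajectories from a truncated Karhunen-Loève expansion on the regular grid t = j/d, forms the ordinary sample covariance from the full (unsparsified) data, and recovers eigenvalues and eigenfunctions by a Riemann-sum discretization of the integral eigen-equation. It is only an illustration under stated assumptions: the mean function sin(2πt), the cosine eigenfunctions √2 cos(kπt), the eigenvalues (1, 0.25), the Gaussian scores, and all function names are hypothetical choices made here, and the code does not implement any of the four sparsification schemes.

```python
import numpy as np


def simulate_trajectories(n, d, lambdas, rng):
    """Draw x_ij = m(j/d) + sum_k xi_ik * phi_k(j/d) for i = 1..n, j = 1..d.

    Assumed (hypothetical) ingredients: m(t) = sin(2*pi*t),
    psi_k(t) = sqrt(2) * cos(k*pi*t), and i.i.d. N(0, 1) scores xi_ik.
    """
    lambdas = np.asarray(lambdas, dtype=float)
    t = np.arange(1, d + 1) / d                       # grid t_j = j/d in (0, 1]
    m = np.sin(2.0 * np.pi * t)                       # population mean m(t)
    k = np.arange(1, lambdas.size + 1)
    psi = np.sqrt(2.0) * np.cos(np.pi * np.outer(k, t))   # (K, d), orthonormal in L2[0, 1]
    phi = np.sqrt(lambdas)[:, None] * psi             # rescaled FPCs phi_k = sqrt(lambda_k) psi_k
    xi = rng.standard_normal((n, lambdas.size))       # FPC scores: mean 0, variance 1
    x = m + xi @ phi                                  # (n, d) matrix of trajectories
    return t, x, psi


def sample_covariance(x):
    """Sample covariance matrix with entries G_hat(j/d, j'/d)."""
    xc = x - x.mean(axis=0)                           # remove the estimated mean
    return xc.T @ xc / x.shape[0]


def fpca(G_hat, n_components):
    """Approximate eigenvalues/eigenfunctions of the covariance operator.

    The integral of G(t, t') psi(t') dt' is replaced by a Riemann sum
    (1/d) * sum_j G(t, j/d) psi(j/d), so matrix eigenvalues are divided by d
    and unit-norm eigenvectors are multiplied by sqrt(d).
    """
    d = G_hat.shape[0]
    evals, evecs = np.linalg.eigh(G_hat)              # ascending order for symmetric input
    order = np.argsort(evals)[::-1][:n_components]    # keep the largest eigenvalues
    lam_hat = evals[order] / d
    psi_hat = evecs[:, order].T * np.sqrt(d)          # (1/d) * sum_j psi_hat(j/d)^2 = 1
    return lam_hat, psi_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t, x, psi = simulate_trajectories(n=2000, d=100, lambdas=[1.0, 0.25], rng=rng)
    lam_hat, psi_hat = fpca(sample_covariance(x), n_components=2)
    print("estimated eigenvalues:", lam_hat)
```

The estimated eigenfunctions are determined only up to sign, so any comparison with the true ψ_k should use the absolute value of their inner product.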

1.1. MAIN CONTRIBUTION

In FDA, covariance estimation plays a critical role in FPC analysis (Ramsay & Silverman (2005), Li & Hsing (2010)), functional generalized linear models and other nonlinear models (Yao et al.

¹ The code is attached to the supplementary material and will be publicly available once accepted.

