APPROXIMATION AND NON-PARAMETRIC ESTIMATION OF FUNCTIONS OVER HIGH-DIMENSIONAL SPHERES VIA DEEP RELU NETWORKS

Abstract

We develop a new approximation and statistical estimation analysis of deep feedforward neural networks (FNNs) with the Rectified Linear Unit (ReLU) activation. The functions of interest for approximation and estimation are assumed to belong to Sobolev spaces defined over the d-dimensional unit sphere with smoothness index $r > 0$. In the regime where $r$ is of constant order (i.e., $r = O(1)$), we show that at most $d^d$ active parameters are required to achieve an approximation rate of $d^{-C}$ for some constant $C > 0$. In the regime where the index $r$ grows in the order of $d$ (i.e., $r = O(d)$) asymptotically, we prove that the approximation error decays at the rate $d^{-d\beta}$ with $0 < \beta < 1$, up to a constant factor independent of $d$. The number of active parameters required in the networks for this approximation increases only polynomially in $d$ as $d \to \infty$. We also show that the bound on the excess risk carries a $d^d$ factor when $r = O(1)$, whereas it carries only a $d^{O(1)}$ factor when $r = O(d)$. We highlight our findings by comparing them with results on the approximation and estimation errors of deep ReLU FNNs when the functions belong to Sobolev spaces defined over the d-dimensional cube. In that case, we show that, under the current state-of-the-art results, a $d^d$ factor remains in both the approximation and estimation errors, regardless of the order of $r$.

1. INTRODUCTION

Neural networks have demonstrated tremendous success in the tasks of image classification (Krizhevsky et al., 2012; Long et al., 2015), pattern recognition (Silver et al., 2016), natural language processing (Graves et al., 2013; Bahdanau et al., 2015; Young et al., 2018), etc. The datasets used in these real-world applications frequently lie in high-dimensional spaces (Wainwright, 2019). In this paper, we seek to understand the fundamental limits of neural networks in the high-dimensional regime through the lens of their approximation power and their generalization error. Both the approximation power and the generalization error of a neural network can be analyzed by specifying the target function's properties, such as its smoothness index $r > 0$ and its input space $\mathcal{X}$. In particular, deep feedforward neural networks (FNNs) with Rectified Linear Units (ReLU) have been extensively studied when they are used to approximate and estimate functions from general function classes such as the Sobolev class defined on the d-dimensional cube (i.e., $\mathcal{X} := \mathcal{C}^d$), denoted as $W^r_p(\mathcal{C}^d)$ for $1 \le p \le \infty$. In practice, however, signals on a spherical surface (i.e., $\mathcal{X} := \mathbb{S}^{d-1} = \{x \in \mathbb{R}^d : \|x\|_2 = 1\}$) rather than on Euclidean spaces often arise in various fields, such as astrophysics (Starck et al., 2006; Wiaux et al., 2005), computer vision (Brechbühler et al., 1995), and medical imaging (Yu et al., 2007). Motivated by this, we focus our attention on the case where deep ReLU FNNs are used as function approximators and estimators for functions assumed to be from the Sobolev spaces defined over $\mathbb{S}^{d-1}$; that is, $f \in W^r_\infty(\mathbb{S}^{d-1})$. Under this setting, our analysis focuses on how the input dimension $d$ explicitly affects the approximation and estimation rates for $f \in W^r_\infty(\mathbb{S}^{d-1})$. At the same time, we show how the scalability of deep ReLU FNNs grows in the high-dimensional regime (a minimal illustrative sketch of this setting is given below). Here, scalability is mainly measured through three metrics: (1) the width, denoted as W,
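To make the setting concrete, the following minimal Python sketch builds a deep ReLU FNN of a prescribed width and depth and evaluates it on points drawn from the unit sphere $\mathbb{S}^{d-1}$. It only illustrates the objects discussed above and is not the construction analyzed in this paper; the function and variable names (relu_fnn, sample_sphere, width, depth) and the random Gaussian weights are our own illustrative choices.

import numpy as np

# Illustrative sketch only: a deep ReLU FNN
#   f(x) = A_L relu(A_{L-1} relu(... relu(A_1 x + b_1) ...) + b_{L-1}) + b_L
# evaluated on inputs drawn from the unit sphere S^{d-1}.

def relu(z):
    return np.maximum(z, 0.0)

def sample_sphere(n, d, rng):
    """Draw n points uniformly from S^{d-1} by normalizing Gaussian samples."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def relu_fnn(x, weights, biases):
    """Evaluate a deep ReLU FNN; the final layer is affine (no activation)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return h @ weights[-1] + biases[-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, width, depth = 16, 32, 4          # input dimension, width W, number of hidden layers
    dims = [d] + [width] * depth + [1]   # layer sizes from input to scalar output
    weights = [rng.standard_normal((m, n)) / np.sqrt(m) for m, n in zip(dims[:-1], dims[1:])]
    biases = [np.zeros(n) for n in dims[1:]]
    x = sample_sphere(5, d, rng)         # inputs on the unit sphere S^{d-1}
    print(relu_fnn(x, weights, biases).ravel())

In this sketch, the scalability of the network is controlled exactly by the quantities discussed above: the per-layer width, the depth, and the resulting number of (active) parameters in the weight matrices and bias vectors.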

