PROJECTIVE PROXIMAL GRADIENT DESCENT FOR A CLASS OF NONCONVEX NONSMOOTH OPTIMIZATION PROBLEMS: FAST CONVERGENCE WITHOUT KURDYKA-ŁOJASIEWICZ (KŁ) PROPERTY

Abstract

Nonconvex and nonsmooth optimization problems are important and challenging for statistics and machine learning. In this paper, we propose Projected Proximal Gradient Descent (PPGD), which solves a class of nonconvex and nonsmooth optimization problems where the nonconvexity and nonsmoothness come from a nonsmooth regularization term that is nonconvex but piecewise convex. In contrast with existing convergence analysis of accelerated PGD methods for nonconvex and nonsmooth problems based on the Kurdyka-Łojasiewicz (KŁ) property, we provide a new theoretical analysis showing local fast convergence of PPGD. It is proved that PPGD achieves a fast convergence rate of $O(1/k^2)$ when the iteration number $k \ge k_0$ for a finite $k_0$ on a class of nonconvex and nonsmooth problems under mild assumptions, which is locally Nesterov's optimal convergence rate of first-order methods on smooth and convex objective functions with Lipschitz continuous gradient. Experimental results demonstrate the effectiveness of PPGD.

1. INTRODUCTION

Nonconvex and nonsmooth optimization problems are challenging and have received a lot of attention in statistics and machine learning (Bolte et al., 2014; Ochs et al., 2015). In this paper, we consider fast optimization algorithms for a class of nonconvex and nonsmooth problems of the form $\min_{x \in \mathbb{R}^d} F(x) = g(x) + h(x)$, where $g$ is convex, $h(x) = \sum_{j=1}^{d} h_j(x_j)$ is a separable regularizer, and each $h_j$ is piecewise convex. A piecewise convex function is defined in Definition 1.1. For simplicity of analysis we let $h_j = f$ for all $j \in [d]$, where $f$ is a piecewise convex function and $[d]$ is the set of natural numbers between 1 and $d$ inclusive. $f$ can be either nonconvex or convex, and all the results in this paper can be straightforwardly extended to the case when the $\{h_j\}$ are different.

Definition 1.1. A univariate function $f : \mathbb{R} \to \mathbb{R}$ is piecewise convex if $f$ is lower semicontinuous and there exist intervals $\{R_m\}_{m=1}^{M}$ such that $\mathbb{R} = \bigcup_{m=1}^{M} R_m$, and $f$ restricted to $R_m$ is convex for each $m \in [M]$. The left and right endpoints of $R_m$ are denoted by $q_{m-1}$ and $q_m$ for all $m \in [M]$, where $\{q_m\}_{m=0}^{M}$ are the endpoints such that $q_0 = -\infty \le q_1 < q_2 < \ldots < q_M = +\infty$. Furthermore, $f$ is either left continuous or right continuous at each endpoint $q_m$ for $m \in [M-1]$. $\{R_m\}_{m=1}^{M}$ are also referred to as convex pieces throughout this paper.

It is important to note that for all $m \in [M-1]$, when $f$ is continuous at the endpoint $q_m$ or $f$ is only left continuous at $q_m$, we have $q_m \in R_m$ and $q_m \notin R_{m+1}$. If $f$ is only right continuous at $q_m$, then $q_m \notin R_m$ and $q_m \in R_{m+1}$. This ensures that every point in $\mathbb{R}$ lies in exactly one convex piece.
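As a concrete illustration of Definition 1.1, the sketch below uses the capped-$\ell_1$ penalty $f(x) = \lambda \min(|x|, b)$, which is nonconvex on $\mathbb{R}$ but convex on each of three pieces. The choice of $f$, the values of $\lambda$ and $b$, and all helper names (`piece_index`, `is_midpoint_convex`, `h`) are assumptions made for illustration only; they are not prescribed by the text above. The convexity test is a numerical sanity check on a finite grid, not a proof.

```python
import bisect
import math

# Illustrative (assumed) penalty: capped-l1, f(x) = LAM * min(|x|, B).
LAM, B = 1.0, 2.0


def f(x: float) -> float:
    """Capped-l1 penalty: nonconvex on R, convex on each piece below."""
    return LAM * min(abs(x), B)


def h(x) -> float:
    """Separable regularizer h(x) = sum_j f(x_j), with h_j = f for all j."""
    return sum(f(xj) for xj in x)


# Endpoints q_0 = -inf < q_1 < q_2 < q_3 = +inf, so M = 3 convex pieces.
ENDPOINTS = [-math.inf, -B, B, math.inf]


def piece_index(x: float) -> int:
    """Return m in {1, ..., M} such that x lies in R_m.

    f is continuous at q_1 and q_2, so following the convention of
    Definition 1.1 each finite endpoint q_m belongs to R_m = (q_{m-1}, q_m]
    and not to R_{m+1}.
    """
    # bisect_left finds the first finite endpoint q_m with q_m >= x;
    # if none exists, it returns M (the last piece).
    return bisect.bisect_left(ENDPOINTS, x, 1, len(ENDPOINTS) - 1)


def is_midpoint_convex(func, lo: float, hi: float, n: int = 50,
                       tol: float = 1e-12) -> bool:
    """Grid check of func((u+v)/2) <= (func(u)+func(v))/2 on [lo, hi]."""
    pts = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return all(
        func((u + v) / 2) <= (func(u) + func(v)) / 2 + tol
        for u in pts
        for v in pts
    )


if __name__ == "__main__":
    for x in (-3.0, -2.0, 0.0, 2.0, 2.5):
        print(f"x = {x:+.1f} lies in piece R_{piece_index(x)}")
    # Convex on a window inside each piece, but not across pieces.
    print(is_midpoint_convex(f, -4.0, -2.0),
          is_midpoint_convex(f, -2.0, 2.0),
          is_midpoint_convex(f, 2.0, 4.0),
          is_midpoint_convex(f, -4.0, 4.0))
```

The unbounded pieces are checked only on finite windows such as $[-4, -2]$ and $[2, 4]$. The check on $[-4, 4]$ fails because, for example, $f(2) = 2 > (f(0) + f(4))/2 = 1$, which shows that this $f$ is not convex on $\mathbb{R}$ even though it is convex on every piece.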

