A GENERAL COMPUTATIONAL FRAMEWORK TO MEASURE THE EXPRESSIVENESS OF COMPLEX NETWORKS USING A TIGHT UPPER BOUND OF LINEAR REGIONS

Anonymous

Abstract

The expressiveness of deep neural networks (DNNs) is one perspective for understanding their surprising performance. The number of linear regions, i.e., the number of pieces of the piece-wise linear function represented by a DNN, is commonly used to measure this expressiveness. In practice, an upper bound on the number of regions partitioned by a rectifier network, rather than the exact number, serves as a more tractable measurement of the expressiveness of a rectifier DNN. In this work, we propose a new and tighter upper bound on the number of regions. Inspired by the proof of this upper bound and the matrix-computation framework of Hinz & Van de Geer (2019), we propose a general computational approach to compute a tight upper bound on the number of regions for theoretically any network structure (e.g., DNNs with all kinds of skip connections and residual structures). Our experiments show that our upper bound is tighter than existing ones, and explain why skip connections and residual structures can improve network performance.

1. INTRODUCTION

Deep neural networks (DNNs) (LeCun et al., 2015) have obtained great success in many fields such as computer vision, speech recognition, and natural language processing (Krizhevsky et al., 2012; Hinton et al., 2012; Devlin et al., 2018; Goodfellow et al., 2014). However, it is not yet completely understood why DNNs perform well with satisfying generalization on different tasks. Expressiveness is one perspective used to address this open question. More specifically, one can theoretically study the expressiveness of DNNs using approximation theory (Cybenko, 1989; Hornik et al., 1989; Hanin, 2019; Mhaskar & Poggio, 2016; Arora et al., 2016), or measure the expressiveness of a given DNN. While sigmoid or tanh functions were employed as activation functions in early work on DNNs, rectified linear units (ReLU) and other piece-wise linear functions are more popular nowadays. Yarotsky (2017) proved that any DNN with piece-wise linear activation functions can be transformed into a DNN with ReLU activations. Thus, the study of expressiveness usually focuses on ReLU DNNs. It is known that a ReLU DNN represents a piece-wise linear (PWL) function, which can be regarded as applying a different linear transform on each region; with more regions, the PWL function is more complex and has stronger expressive ability. Therefore, the number of linear regions is intuitively a meaningful measurement of expressiveness (Pascanu et al., 2013; Montufar et al., 2014; Raghu et al., 2017; Serra et al., 2018; Hinz & Van de Geer, 2019). A direct measurement of the number of linear regions is difficult, if not impossible, and thus an upper bound on the number of linear regions is used in practice as a figure of merit to characterize expressiveness. Inspired by the computational framework in Hinz & Van de Geer (2019), we improve the upper bound of Serra et al. (2018) for multilayer perceptrons (MLPs) and extend the framework to more complex networks.
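The correspondence between linear regions and ReLU activation patterns described above can be illustrated empirically. The following sketch (our own illustration, not code from the paper; the architecture and grid resolution are arbitrary choices) samples a dense grid of 2-D inputs to a small random ReLU MLP and counts the distinct activation patterns encountered. Since the network is affine on any set of inputs sharing one pattern, the count is a grid-based estimate of how many linear regions the network carves out:

```python
# Illustrative sketch: estimate the number of linear regions of a small
# random ReLU MLP by enumerating the distinct activation patterns (which
# units are active) over a dense grid of 2-D inputs. Each pattern fixes
# one affine piece of the represented PWL function.
import numpy as np

rng = np.random.default_rng(0)

def relu_mlp_patterns(x, weights, biases):
    """Return the concatenated 0/1 activation pattern for an input batch x."""
    parts = []
    h = x
    for W, b in zip(weights, biases):
        pre = h @ W + b
        parts.append(pre > 0)          # activation pattern of this layer
        h = np.maximum(pre, 0.0)       # ReLU
    return np.concatenate(parts, axis=1)

# A 2 -> 8 -> 8 network with random Gaussian weights (hypothetical example).
sizes = [2, 8, 8]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n) for n in sizes[1:]]

# Sample a fine grid in [-1, 1]^2 and count distinct activation patterns.
g = np.linspace(-1.0, 1.0, 200)
xx, yy = np.meshgrid(g, g)
grid = np.stack([xx.ravel(), yy.ravel()], axis=1)
patterns = relu_mlp_patterns(grid, weights, biases)
n_patterns = len(np.unique(patterns, axis=0))
print("distinct activation patterns hit by the grid:", n_patterns)
```

Because the grid can miss thin regions, the count only reflects regions the samples actually land in; exact counting requires the arrangement-based or mixed-integer methods cited above, which is precisely why upper bounds are the practical measurement.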
More importantly, we propose a general approach to construct a more accurate upper bound for almost any type of network. The contributions of this paper are listed as follows.

• Through a geometric analysis, we derive a recursive formula for γ, a key parameter in the construction of a tight upper bound. Employing a better initial value, we propose a tighter upper bound for deep fully-connected ReLU networks. In addition, the recursive formula provides the potential to further improve the upper bound, given an improved initial value.
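As a concrete reference point for such initial values, the classical hyperplane-arrangement count (Zaslavsky's theorem) gives the exact maximum number of regions a single ReLU layer can create: n neurons acting on a d-dimensional input partition it into at most Σ_{i=0}^{d} C(n, i) regions. A minimal sketch of this well-known single-layer count (our own illustration; it is not the paper's γ recursion, only the standard base case such recursive bounds start from):

```python
# Classical single-layer region count: n hyperplanes in R^d create at
# most sum_{i=0}^{d} C(n, i) regions (tight for hyperplanes in general
# position). One ReLU neuron contributes one hyperplane {w.x + b = 0}.
from math import comb

def max_regions_one_layer(n, d):
    """Maximum number of linear regions of a single layer of n ReLU
    neurons on a d-dimensional input."""
    return sum(comb(n, i) for i in range(d + 1))

# 8 ReLU neurons on a 2-D input: at most 1 + 8 + 28 = 37 linear regions,
# far fewer than the 2^8 = 256 conceivable activation patterns.
print(max_regions_one_layer(8, 2))  # -> 37
```

The gap between 37 regions and 256 patterns shows why naive pattern counting grossly over-estimates expressiveness in low input dimension, and why bounds that track the effective dimension through depth can be much tighter.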

