ON THE DECISION BOUNDARIES OF NEURAL NETWORKS: A TROPICAL GEOMETRY PERSPECTIVE

Abstract

This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear activations. We use tropical geometry, a recent development in algebraic geometry, to characterize the decision boundaries of a simple network of the form (Affine, ReLU, Affine). Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes. The generators of these zonotopes are functions of the network parameters. This geometric characterization provides new perspectives on three tasks. (i) We propose a new tropical perspective on the lottery ticket hypothesis, where we view the effect of different initializations on the tropical geometric representation of a network's decision boundaries. (ii) We propose new tropical-based optimization reformulations that directly influence the decision boundaries of the network for the task of network pruning. (iii) Finally, we briefly discuss reformulating the generation of adversarial attacks in a tropical sense; we elaborate on this in detail in the supplementary material.¹

1. INTRODUCTION

Deep Neural Networks (DNNs) have demonstrated outstanding performance across a variety of research domains, including computer vision (Krizhevsky et al., 2012), speech recognition (Hinton et al., 2012), natural language processing (Bahdanau et al., 2015; Devlin et al., 2018), quantum chemistry (Schütt et al., 2017), and healthcare (Ardila et al., 2019; Zhou et al., 2019), to name a few (LeCun et al., 2015). Nevertheless, a rigorous interpretation of their success remains elusive (Shalev-Shwartz & Ben-David, 2014). For instance, in an attempt to uncover the expressive power of DNNs, the work of Montufar et al. (2014) studied the complexity of functions computable by DNNs that have piecewise linear activations. They derived a lower bound on the maximum number of linear regions. Several other works have followed to improve such estimates under certain assumptions (Arora et al., 2018). In addition, in an attempt to understand some of the subtle behaviours DNNs exhibit, e.g., their sensitivity to small input perturbations, several works directly investigated the decision boundaries induced by a DNN for classification. The work of Moosavi-Dezfooli et al. (2019) showed that the smoothness of these decision boundaries and their curvature can play a vital role in network robustness. Moreover, the expressiveness of these decision boundaries at perturbed inputs was studied in He et al. (2018), where it was shown that these boundaries do not resemble the boundaries around benign inputs. The work of Li et al. (2018) showed that, under certain assumptions, the decision boundaries of the last fully connected layer of DNNs will converge to a linear SVM. Also, Beise et al. (2018) showed that the decision regions of DNNs with width smaller than the input dimension are unbounded.

More recently, due to the popularity of the piecewise linear ReLU as an activation function, there has been a surge in the number of works that study this class of DNNs in particular.
As a result, this has incited significant interest in new mathematical tools that help analyze piecewise linear functions, such as tropical geometry. While tropical geometry has shown its potential in many applications such as dynamic programming (Joswig & Schröter, 2019), linear programming (Allamigeon et al., 2015), multi-objective discrete optimization (Joswig & Loho, 2019), enumerative geometry (Mikhalkin, 2004), and economics (Akian et al., 2009; Mai Tran & Yu, 2015), it has only recently been used to analyze DNNs. For instance, the work of Zhang et al. (2018) showed an equivalence between the family of DNNs with piecewise linear activations and integer weight matrices and the family of tropical rational maps, i.e., ratios of two multivariate polynomials in tropical algebra. This study was mostly concerned with characterizing the complexity of a DNN by counting the number of linear regions into which the function represented by the DNN can divide the input space. This was done by counting the number of vertices of a polytope representation, recovering the results of Montufar et al. (2014) with a simpler analysis. More recently, Smyrnis & Maragos (2019) leveraged this equivalence to propose a heuristic for neural network minimization through approximating the tropical rational map.

Contributions. In this paper, we take the results of Zhang et al. (2018) several steps further and present a novel perspective on the decision boundaries of DNNs using tropical geometry. To that end, our contributions are three-fold. (i) We derive a geometric representation (the convex hull of two zonotopes) for a superset of the decision boundaries of a DNN of the form (Affine, ReLU, Affine). (ii) We provide support for the lottery ticket hypothesis (Frankle & Carbin, 2019) from a geometric perspective. (iii) We leverage the geometric representation of the decision boundaries, referred to as the decision boundaries polytope, in two applications: network pruning and adversarial attacks. For tropical pruning, we design a geometrically inspired optimization to prune the parameters of a given network such that the decision boundaries polytope of the pruned network does not deviate too much from that of the original network. We conduct extensive experiments with AlexNet (Krizhevsky et al., 2012) and VGG16 (Simonyan & Zisserman, 2014) on the SVHN (Netzer et al., 2011), CIFAR10, and CIFAR100 (Krizhevsky & Hinton, 2009) datasets, in which a 90% pruning rate is achieved with a marginal drop in testing accuracy. For tropical adversarial attacks, we show that one can construct input adversaries that change network predictions by perturbing the decision boundaries polytope.
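
To make the equivalence with tropical rational maps mentioned above (Zhang et al., 2018) more concrete, the following is a minimal Python sketch for a single ReLU neuron with integer weights. It is an illustration only and not code from this paper: the function names `relu_neuron`, `split_weights`, and `relu_as_difference` are hypothetical. The sketch assumes the standard splitting of the weight vector into non-negative parts, w = w_pos - w_neg, so that the neuron output equals a maximum of affine functions with non-negative integer slopes minus another such function.

```python
def relu_neuron(w, b, x):
    """Standard ReLU neuron: max(<w, x> + b, 0)."""
    return max(sum(wi * xi for wi, xi in zip(w, x)) + b, 0.0)


def split_weights(w):
    """Split w into non-negative parts such that w = w_pos - w_neg."""
    w_pos = [max(wi, 0) for wi in w]
    w_neg = [max(-wi, 0) for wi in w]
    return w_pos, w_neg


def relu_as_difference(w, b, x):
    """Same neuron written as (max of two affine terms) - (affine term).

    Both terms only use non-negative integer slopes (when w is integer),
    which is the shape of a difference of two max-of-affine functions.
    """
    w_pos, w_neg = split_weights(w)
    pos = sum(p * xi for p, xi in zip(w_pos, x))
    neg = sum(n * xi for n, xi in zip(w_neg, x))
    numerator = max(pos + b, neg)  # max{<w_pos, x> + b, <w_neg, x>}
    denominator = neg              # <w_neg, x>
    return numerator - denominator
```

Since max(p + b, n) - n = max(p - n + b, 0), the two functions agree (up to floating-point rounding) for any input; for example, with the illustrative choice w = [2, -3, 0] and b = 1, both give 4 at x = [3, 1, 0] and 0 at x = [1, 2, 5].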

2. PRELIMINARIES TO TROPICAL GEOMETRY

For completeness, we first provide preliminaries to tropical geometry (Itenberg et al., 2009; Maclagan & Sturmfels, 2015).

Definition 1. (Tropical Semiring²) The tropical semiring $\mathbb{T}$ is the triplet $\{\mathbb{R} \cup \{-\infty\}, \oplus, \odot\}$, where $\oplus$ and $\odot$ define tropical addition and tropical multiplication, respectively. They are defined as $x \oplus y = \max\{x, y\}$ and $x \odot y = x + y$, $\forall x, y \in \mathbb{T}$. It can be readily shown that $-\infty$ is the additive identity and $0$ is the multiplicative identity.

Given the previous definition, a tropical power can be formulated as $x^{\odot a} = x \odot x \odot \cdots \odot x = a \cdot x$, for $x \in \mathbb{T}$, $a \in \mathbb{N}$, where $a \cdot x$ is standard multiplication. Moreover, a tropical quotient can be defined as $x \oslash y = x - y$, where $x - y$ is standard subtraction. For ease of notation, we write $x^{\odot a}$ as $x^{a}$.

Definition 2. (Tropical Polynomials) For $x \in \mathbb{T}^d$, $c_i \in \mathbb{R}$ and $a_i \in \mathbb{N}^d$, a $d$-variable tropical polynomial with $n$ monomials $f : \mathbb{T}^d \to \mathbb{T}$ can be expressed as $f(x) = (c_1 \odot x^{a_1}) \oplus (c_2 \odot x^{a_2}) \oplus \cdots \oplus (c_n \odot x^{a_n})$, with $a_i \neq a_j$ when $i \neq j$. We use the more compact vector notation $x^{a} = x_1^{a_1} \odot x_2^{a_2} \odot \cdots \odot x_d^{a_d}$. Moreover, for ease of notation, we denote $c_i \odot x^{a_i}$ as $c_i x^{a_i}$ throughout the paper.

Definition 3. (Tropical Rational Functions) A tropical rational is a standard difference, or equivalently a tropical quotient, of two tropical polynomials: $f(x) - g(x) = f(x) \oslash g(x)$.

Algebraic curves or hypersurfaces in algebraic geometry, which are the solution sets to polynomials, can be analogously extended to tropical polynomials.

Definition 4. (Tropical Hypersurfaces) A tropical hypersurface of a tropical polynomial $f(x) = c_1 x^{a_1} \oplus \cdots \oplus c_n x^{a_n}$ is the set of points $x$ where $f$ is attained by two or more monomials in $f$, i.e., $\mathcal{T}(f) := \{x \in \mathbb{R}^d : c_i x^{a_i} = c_j x^{a_j} = f(x), \text{ for some } a_i \neq a_j\}$.

Tropical hypersurfaces divide the domain of $f$ into convex regions, where $f$ is linear in each region. Also, every tropical polynomial can be associated with a Newton polytope.
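
As a minimal sketch of Definitions 1, 2 and 4 (an illustration, not code from this paper), the following Python snippet evaluates a tropical polynomial and tests whether a point lies on its tropical hypersurface. The function names (`t_add`, `t_mul`, `monomial_values`, `t_poly`, `on_hypersurface`), the numerical tolerance `tol`, and the example polynomial are all hypothetical choices made for illustration; inputs are assumed to be real-valued points in $\mathbb{R}^d$.

```python
import math

NEG_INF = -math.inf  # additive identity of the tropical semiring


def t_add(x, y):
    """Tropical addition: max{x, y}."""
    return max(x, y)


def t_mul(x, y):
    """Tropical multiplication: standard addition x + y."""
    return x + y


def monomial_values(coeffs, exponents, x):
    """Value of each monomial c_i x^{a_i}, i.e. c_i + <a_i, x> in standard algebra."""
    return [c + sum(a_k * x_k for a_k, x_k in zip(a, x))
            for c, a in zip(coeffs, exponents)]


def t_poly(coeffs, exponents, x):
    """Tropical polynomial: tropical sum (maximum) of all monomial values."""
    value = NEG_INF
    for m in monomial_values(coeffs, exponents, x):
        value = t_add(value, m)
    return value


def on_hypersurface(coeffs, exponents, x, tol=1e-9):
    """True if the maximum is attained by two or more monomials at x."""
    vals = monomial_values(coeffs, exponents, x)
    top = max(vals)
    return sum(1 for v in vals if top - v <= tol) >= 2


# Hypothetical example in two variables:
# f(x1, x2) = 1 x1  (+)  0 x2  (+)  0  =  max{1 + x1, x2, 0}
coeffs = [1.0, 0.0, 0.0]
exponents = [(1, 0), (0, 1), (0, 0)]
```

For this example polynomial, `t_poly(coeffs, exponents, (0.0, 0.0))` evaluates to 1.0 and the point (0, 0) is not on the hypersurface, because only the first monomial attains the maximum. At (-1, 0) all three monomials equal 0, so `on_hypersurface` returns True there.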



¹ Code regenerating all our experiments is attached in the supplementary material.
² A semiring is a ring that lacks an additive inverse.

