ALGORITHMIC DETERMINATION OF THE COMBINATORIAL STRUCTURE OF THE LINEAR REGIONS OF RELU NEURAL NETWORKS

Abstract

We algorithmically determine the regions and facets of the canonical polyhedral complex, the universal object into which a ReLU network decomposes its input space. We show that the locations of the vertices of the canonical polyhedral complex, along with their signs with respect to layer maps, determine the full facet structure across all dimensions. Our algorithm, which implements this approach, makes use of our theorems that the dual complex to the canonical polyhedral complex is cubical and that it possesses a multiplication compatible with its facet structure. The resulting algorithm is numerically stable, polynomial time in the number of intermediate neurons, and obtains accurate information across all dimensions. This permits us to obtain, for example, the true topology of the decision boundaries of networks with low-dimensional inputs for binary classification tasks. We run experiments on such networks at initialization, finding that width alone does not increase observed topology, but width in the presence of depth does.

1. INTRODUCTION

For fully-connected ReLU networks (Nair & Hinton, 2010), the canonical polyhedral complex of the network, as defined by Grigsby & Lindsey (2020), encodes its decomposition of input space into linear regions and determines key structures such as the decision boundary for a binary classification task. Investigation of the properties and characterizations of this decomposition of input space is ongoing, in particular with respect to counting the top-dimensional linear regions (Serra et al., 2018; Hanin & Rolnick, 2019a; Montufar et al., 2014; Serra & Ramalingam, 2020; Xiong et al., 2020), since these bounds give one measure of the expressivity of the associated network architecture. However, a theoretical understanding of adjacency between regions, and more generally of the connectivity of lower-dimensional facets, is to our knowledge largely undocumented. Understanding the face relations in the canonical polyhedral complex (for tiled surfaces, for example, understanding all inclusions between vertices (0-faces), edges (1-faces), and polygons (2-faces)) is necessary to relate combinatorial properties of the polyhedral complex of a network to the topology of the regions into which the decision boundary partitions input space, to geometric measurements such as the presence of critical points (Grigsby et al., 2022), and to other notions of topological expressivity, as explored by Guss & Salakhutdinov (2018) and Bianchini & Scarselli (2014). It is common to describe linear regions of the input space, R^{n_0}, using "activation patterns" or "neural codes" recorded as vectors in {0, 1}^N (Itskov et al., 2020). Unfortunately, having a list of which activation patterns are present in the interiors of the linear regions does not determine their pairwise intersection properties (Theorem 15), and computing the intersections of these regions directly is not numerically stable. Furthermore, the polyhedra comprising the linear regions appear, at first glance, arbitrarily complicated.
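To make the activation-pattern bookkeeping concrete, the following sketch computes the {0, 1}^N pattern of a point under a small fully-connected ReLU network. The network, its weights, and the function name are our own illustrative choices, not the paper's code:

```python
import numpy as np

def activation_pattern(x, layers):
    """Activation pattern in {0,1}^N of input x: one entry per
    intermediate neuron of a fully-connected ReLU network.
    (A minimal sketch, not the paper's implementation.)"""
    pattern = []
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        z = W @ a + b                            # pre-activations of this layer
        pattern.extend(int(zi > 0) for zi in z)  # 1 = neuron active, 0 = inactive
        a = np.maximum(z, 0.0)                   # ReLU output feeds the next layer
    return pattern

# A hypothetical layer map R^2 -> R^3 with arbitrary weights.
layers = [(np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]]),
           np.array([0.0, 0.0, -1.0]))]
print(activation_pattern([0.5, 0.25], layers))  # -> [1, 1, 0]
```

Every point in the interior of a fixed linear region yields the same pattern, which is why such patterns are a natural, but as the paper notes, incomplete, label for regions.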
In this work, we establish a simpler representation by proving that the geometric dual of any network's canonical polyhedral complex is a union of n_0-dimensional cubes (see Figure 1). Inspired by the theory of oriented matroids and hyperplane arrangements (Anders et al., 2000; Aguiar & Mahajan, 2017), we demonstrate how the notion of "sign vectors" from oriented matroids, which are vectors with entries in {-1, 0, 1}, can serve as a labeling scheme for vertices, edges, and higher-dimensional regions. These sign vectors have properties that track face relations and intersections between all linear regions in input space. We show that, with full probability, computing only the vertices present in the polyhedral complex and recording their sign vectors gives enough information to determine all face relations in the polyhedral complex. The ability to compute the explicit decision boundary of a network provides a new means to evaluate the topological expressivity of network functions.

Figure 1: An illustration of a canonical polyhedral complex and its geometric dual. Left: C(F) for a specific neural network function F: R^2 → R^3 → R. The three straight lines and the solid colored regions together form R^(1), which is also C(F_1). Middle: F_1: R^2 → R^3 is piecewise linear on cells of C(F_1). The hyperplane in R^3 is the hyperplane associated with A_2: R^3 → R, and this hyperplane together with the two halfspaces on either side of it form R^(2). The cells of C(F) on the left are determined by taking one region R of R^(2), considering its preimage F_1^{-1}(R), and taking the intersection of this preimage with a cell of C(F_1). Right: The geometric dual sign sequence complex S(F) is superimposed in white over C(F), with one vertex for each region of C(F). As we prove in general, S(F) is cubical, with each two-cube (quadrilateral) containing a unique vertex of C(F).
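The sign-vector labeling can be sketched directly: for each neuron, record the sign of its pre-activation at a point, with 0 meaning the point lies on that neuron's bent hyperplane. The helper below, its network weights, and the tolerance are hypothetical illustrative choices, not the paper's algorithm:

```python
import numpy as np

def sign_sequence(x, layers, tol=1e-9):
    """Sign vector in {-1,0,1}^N of a point x: for each neuron, the sign
    of its pre-activation (0 = x lies on that neuron's bent hyperplane).
    Illustrative sketch; the tolerance `tol` is an arbitrary choice."""
    signs = []
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        z = W @ a + b
        signs.extend(0 if abs(zi) < tol else (1 if zi > 0 else -1) for zi in z)
        a = np.maximum(z, 0.0)
    return signs

# A hypothetical network F: R^2 -> R^3 -> R, echoing the shape in Figure 1.
layers = [(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
           np.array([0.0, 0.0, -1.0])),
          (np.array([[1.0, 1.0, 1.0]]), np.array([0.5]))]
print(sign_sequence([0.0, 0.0], layers))  # -> [0, 0, -1, 1]
```

In this toy network the origin sits on two bent hyperplanes at once, so its sign sequence has exactly two zero entries, consistent with it being a vertex of a two-dimensional input's polyhedral complex.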
Our main contributions are:

• Mathematically, we prove the existence of a combinatorial description of the geometric dual of the canonical polyhedral complex of a ReLU neural network, which consists of a generalization of activation patterns we call the sign sequence complex. Furthermore, using a product structure adapted from the theory of oriented matroids, which we prove to be well-defined, we show that the only information needed to determine the face poset structure of the sign sequence complex is the set of sign sequences corresponding to the vertices of the polyhedral complex.

• We implement an algorithm for obtaining these sign sequences which is numerically stable and, on average at initialization, runs in time polynomial in the total number of neurons and exponential in the input dimension.

• We show that the sign sequence complex can be naturally restricted to particular substructures of the polyhedral complex of a ReLU network. In particular, a chain complex computing the homology of the decision boundary can be obtained by simple operations on a subset of the vertices of the polyhedral complex, together with their sign sequences.

• We demonstrate the usefulness of this characterization of a network by obtaining statistics of the topological properties of ReLU networks' binary decision boundaries at initialization, as a function of architecture. These experiments provide empirics suggesting that the depth of a network plays a stronger role in topological expressivity, at least at initialization, than the number of intermediate neurons.

In this work, our primary aim is to understand C(F) more completely. We do this by demonstrating how to use sign sequences to record all information about the adjacency of linear regions without needing to separately list all edges, faces, and higher-dimensional cells. Sign sequences have a geometric interpretation as a cubical complex, and an algebraic structure which is useful in proofs.
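The product structure adapted from oriented matroids can be illustrated with the textbook composition of sign vectors, in which the first argument's nonzero entries win. This is the standard operation from oriented matroid theory; the paper's version may differ in details:

```python
def compose(s, t):
    """Composition of sign vectors, as in oriented matroid theory:
    (s ∘ t)_i = s_i if s_i != 0, else t_i."""
    assert len(s) == len(t)
    return [si if si != 0 else ti for si, ti in zip(s, t)]

# Composing a vertex's sign sequence with ±1 vectors enumerates the
# sign sequences of cells incident to that vertex.
v = [0, 0, -1, 1]                   # a vertex: two zero entries
print(compose(v, [1, -1, 1, 1]))    # -> [1, -1, -1, 1], a top-dimensional cell
print(compose(v, [1, 0, 0, 0]))     # -> [1, 0, -1, 1], an edge through the vertex
```

This is the sense in which vertex sign sequences alone suffice: every higher-dimensional cell's label arises by composing a vertex's label with candidate sign patterns on its zero set.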
It is also computationally tractable to treat these sign sequences as a data structure which provides a key for translating between local geometric and analytic information about C(F) (vertex locations and function behavior at those locations) and global topological and combinatorial information about the linear regions, including the previously inaccessible topology of binary decision boundaries.
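As a sketch of how sign sequences serve as such a key, the face relation for sign vectors of a hyperplane arrangement can be tested entrywise: t labels a face of the cell labeled s exactly when t agrees with s at every coordinate where t is nonzero. We use the standard arrangement convention here; the bent-hyperplane setting of the paper may impose additional conditions:

```python
def is_face(t, s):
    """True if the cell with sign vector t is a face of the cell with
    sign vector s: t must agree with s wherever t is nonzero.
    (Standard hyperplane-arrangement convention; illustrative only.)"""
    return all(ti == 0 or ti == si for ti, si in zip(t, s))

vertex = [0, 0, -1, 1]
edge   = [1, 0, -1, 1]
region = [1, -1, -1, 1]
print(is_face(vertex, edge))    # True:  the vertex bounds the edge
print(is_face(edge, region))    # True:  the edge bounds the region
print(is_face(region, vertex))  # False: a 2-cell is not a face of a vertex
```

Iterating this test over the stored sign sequences recovers the full face poset, which is the global combinatorial information referred to above.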



(Recall, for example, that the dual of the icosahedron is the dodecahedron.)

