ISOMETRIC REPRESENTATIONS IN NEURAL NETWORKS IMPROVE ROBUSTNESS

Abstract

Artificial and biological agents are unable to learn from completely random and unstructured data. The structure of data is encoded in the distance or similarity relationships between data points. In the context of neural networks, the neuronal activity within a layer forms a representation reflecting the transformation that the layer implements on its inputs. In order to utilize the structure in the data in a truthful manner, such representations should reflect the input distances and thus be continuous and isometric. Supporting this statement, recent findings in neuroscience suggest that generalization and robustness are tied to neural representations being continuously differentiable. In machine learning, however, most algorithms lack robustness and are generally thought to rely on aspects of the data that differ from those that humans use, as is commonly seen in adversarial attacks. During cross-entropy classification, the metric and structural properties of network representations are usually broken both between and within classes. This side effect of training can lead to instabilities under perturbations near locations where such structure is not preserved. One of the standard routes to robustness is to train with perturbations introduced into the training data. This leads to networks that are particularly robust to the specific training perturbations but not necessarily to general perturbations. While adding ad hoc regularization terms to improve robustness has become common practice, to our knowledge, forcing representations to preserve the metric structure of the input data as a stabilising mechanism has not yet been introduced. In this work, we train neural networks to perform classification while simultaneously maintaining the metric structure within each class, leading to continuous and isometric within-class representations.
We show that such network representations are a beneficial component for making accurate and robust inferences about the world. By stacking layers with this property, we provide the community with a network architecture that facilitates hierarchical manipulation of internal neural representations. Finally, we verify that our isometric regularization term improves robustness to adversarial attacks on MNIST.

1. INTRODUCTION

Using neuroscience as an inspiration to enforce properties in machine learning has roots dating back to the birth of artificial neural networks (McCulloch & Pitts, 1943; Rosenblatt, 1958). One way to study natural and artificial neural networks is to look at how they transform specific structural properties of input data. The output of such a transformation is typically called a neural, or latent, representation, and it carries information about the computational role of a brain region or network layer (Kriegeskorte, 2008; Kriegeskorte & Diedrichsen, 2019; Bengio et al., 2013). Different properties of representations are helpful in different ways for both organisms and artificial agents. Examples include efficient coding (Barlow et al., 1961), mixed selectivity (Rigotti et al., 2013), sparse coding (Olshausen & Field, 2004), response normalization (Carandini & Heeger, 2012), efficiency and smoothness (Stringer et al., 2019) and expressivity (Poole et al., 2016; Raghu et al., 2017), among others. For example, one family of theories related to efficient coding proposes that neural circuits should generate discontinuous and high-dimensional representations to pack as much information as possible into a network (Barlow et al., 1961; Simoncelli & Olshausen, 2001). On the other hand, empirical results indicate that neural circuits generate low-dimensional, smooth representations of the data (Gao & Ganguli, 2015; Gao et al., 2017). This apparent contradiction has been rigorously discussed by Stringer et al. (2019), who argue that neural circuits try to be as efficient as possible while smoothly mapping inputs. Without this smoothness constraint, infinitesimal perturbations of input stimuli could drastically change the output, thereby making such circuits non-robust to some perturbations.
Given the empirical support, it seems likely that these properties hold in early sensory systems and are thus important for a broad class of machine learning algorithms. Organisms seem particularly robust to random input perturbations. Artificial models, however, suffer from a lack of robustness to adversarial attacks (Goodfellow et al., 2014). One argument for why this happens can be deduced from Naitzat et al. (2020), in which the authors use a topological approach based on persistent homology (Carlsson, 2009; Edelsbrunner & Harer, 2022) to study the mappings realized by neural networks performing a classification task. They claim that, in classification problems, neural networks implement structure-breaking (non-homeomorphic) mappings, and as argued above, models implementing such mappings are unlikely to be robust. There are many ways to improve the robustness of a network (Madry et al., 2017; Silva & Najafirad, 2020; Xu et al., 2020). Nevertheless, of particular interest to us are strategies that try to solve this problem by restricting the properties of the mapping realized by a network. Examples of this are Jacobian regularization (Hoffman et al., 2019), spectral regularization (Miyato et al., 2018; Nassar et al., 2020), Lipschitz continuity (Virmaux & Scaman, 2018; Liu et al., 2022), topological regularization (Chen et al., 2019) and manifold regularization (Jin & Rinard, 2020), among others. While regularities in neural representations help with robustness, they do not necessarily guarantee that the input and output representations will have the same metric relationships, thereby reflecting the actual structure of the data. To our knowledge, there are still no methods that preserve the class metric structure while allowing for robust classification. To achieve this behavior, we create a neural network model with, what we call, Locally Isometric Layers (LILs) and study the representations generated by training such networks. Furthermore, we extend these networks to generate representations in a hierarchical manner, which makes them helpful in performing classification at different resolutions. Finally, we train LILs on MNIST and show that the isometry condition improves network robustness to both the Fast Gradient Sign Method (FGSM) and the Projected Gradient Descent (PGD) adversarial attacks.

2. BACKGROUND AND METHODS

In this section, we summarise the mathematical background of the different mappings that a neural network can implement and introduce LILs. We treat both training and test data as being sampled from a manifold $\mathcal{M}$, and the neural network $\mathcal{N}$ as a set of maps $\mathcal{N} = \{ f_i^l \mid f_i^l : \mathcal{M} \to \mathbb{R} \}$, with $l$ denoting the layer and $i$ the index of a neuron. Another way to interpret the action of a neural network, which will become useful later on, is to define its mapping layer-wise.

Definition 1: A neural network $\mathcal{N}$ with $L$ layers acting on a manifold $\mathcal{M}$ is a set of functions $\{ F^l : \mathcal{M} \to \mathbb{R}^{n_l} \}_{l=1}^{L}$, where $n_l$ is the number of neurons in layer $l$.

To perform classification, one usually tries to train a network to realize a particular function $\Phi : \mathcal{M} \to \mathbb{R}^{n_L}$ that holds easily separable representations of the data. After this, a simple linear layer can be used for the final classification. This procedure requires the specification of a cost function. Here we will consider the following example:

$$\mathcal{L}(X, T) = T \log \sigma[\Phi(X)] + \frac{1}{N^2} \left\lVert G \odot D_{\mathcal{M}} - G \odot D_{\Phi} \right\rVert_F^2 = \mathcal{L}_{\mathrm{CSE}} + \mathcal{L}_{\mathrm{ISO}},$$

where $X = \{x_1, \ldots, x_N\}$ are the inputs, $T = \{t_1, \ldots, t_N\}$ are the desired labels, $\sigma$ is the softmax function, $\lVert \cdot \rVert_F$ is the Frobenius norm, $D_{\mathcal{M}}$ and $D_{\Phi}$ are the distance matrices in the input and output spaces generated by $d(x_i, x_j)$ and $d(\Phi(x_i), \Phi(x_j))$ respectively, and $G$ is an indexing matrix. The distance function can be any metric, but in this work we stick to the Euclidean distance. Given a partition of the training set $V = \bigcup_k V_k$, $V_k \subset \mathcal{M}$, the indexing matrix $G$ is defined in the following way:

$$G_{ij} = \begin{cases} 1 & \text{if } x_i \text{ and } x_j \text{ belong to the same } V_k, \\ 0 & \text{otherwise.} \end{cases}$$
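The combined objective $\mathcal{L}_{\mathrm{CSE}} + \mathcal{L}_{\mathrm{ISO}}$ lends itself to a direct implementation. Below is a minimal NumPy sketch, assuming $G$ marks pairs of samples that share a partition block $V_k$ (the within-class reading of the text); the cross-entropy term is written with the conventional negative sign, and all function and argument names are illustrative rather than the authors' code.

```python
import numpy as np

def pairwise_dist(X):
    # Euclidean distance matrix for the rows of X.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def lil_loss(X, phi_X, logits, labels, block_ids):
    """Cross-entropy plus within-class isometry penalty.

    X         : (N, d_in)  inputs
    phi_X     : (N, d_out) representations Phi(X)
    logits    : (N, C)     classifier outputs
    labels    : (N,)       integer targets
    block_ids : (N,)       index of the partition block V_k per sample
    """
    N = X.shape[0]
    p = softmax(logits)
    # Standard negative log-likelihood; the L_CSE term.
    l_cse = -np.mean(np.log(p[np.arange(N), labels] + 1e-12))
    # G_ij = 1 iff samples i and j fall in the same block V_k.
    G = (block_ids[:, None] == block_ids[None, :]).astype(float)
    D_in = pairwise_dist(X)
    D_out = pairwise_dist(phi_X)
    # (1 / N^2) * ||G . D_M - G . D_Phi||_F^2; the L_ISO term.
    l_iso = np.sum((G * (D_in - D_out))**2) / N**2
    return l_cse + l_iso
```

Note that $\mathcal{L}_{\mathrm{ISO}}$ vanishes exactly when $\Phi$ preserves all within-block distances, so a perfectly isometric map incurs no penalty regardless of how it arranges the classes relative to each other.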

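The robustness evaluation above references the FGSM and PGD attacks. For concreteness, here is a generic NumPy sketch of both update rules; this is not the authors' implementation, `grad_fn` is assumed to return the gradient of the loss with respect to the input, and clipping to the valid pixel range is omitted.

```python
import numpy as np

def fgsm_perturb(x, grad_x, eps):
    """Fast Gradient Sign Method: a single step of size eps along
    the sign of the loss gradient with respect to the input."""
    return x + eps * np.sign(grad_x)

def pgd_perturb(x, grad_fn, eps, alpha, steps):
    """Projected Gradient Descent: iterated signed gradient steps of
    size alpha, projected back into the L-inf ball of radius eps."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        # Project onto the eps-ball around the clean input.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

In practice the perturbed inputs would also be clipped to the valid input range, e.g. [0, 1] for MNIST pixels.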
