LIPSCHITZ-BOUNDED EQUILIBRIUM NETWORKS

Abstract

This paper introduces new parameterizations of equilibrium neural networks, i.e. networks defined by implicit equations. This model class includes standard multilayer and residual networks as special cases. The new parameterization admits a Lipschitz bound during training via unconstrained optimization: no projections or barrier functions are required. Lipschitz bounds are a common proxy for robustness and appear in many generalization bounds. Furthermore, compared to previous works we show well-posedness (existence of solutions) under less restrictive conditions on the network weights and more natural assumptions on the activation functions: that they are monotone and slope restricted. These results are proved by establishing novel connections with convex optimization, operator splitting on non-Euclidean spaces, and contracting neural ODEs. In image classification experiments we show that the Lipschitz bounds are very accurate and improve robustness to adversarial attacks.

1. INTRODUCTION

Deep neural network models have revolutionized the field of machine learning: their accuracy on practical tasks such as image classification and their scalability have led to an enormous volume of research on different model structures and their properties (LeCun et al., 2015). In particular, deep residual networks with skip connections (He et al., 2016) have had a major impact, and neural ODEs have been proposed as an analog with "implicit depth" (Chen et al., 2018). Recently, a new structure has gained interest: equilibrium networks (Bai et al., 2019; Winston & Kolter, 2020), a.k.a. implicit deep learning models (El Ghaoui et al., 2019), in which model outputs are defined by implicit equations incorporating neural networks. This model class is very flexible: it is easy to show that it includes many previous structures as special cases, including standard multilayer networks, residual networks, and (in a certain sense) neural ODEs. However, model flexibility in machine learning is always in tension with model regularity or robustness. While deep learning models have exhibited impressive generalization performance in many contexts, it has also been observed that they can be very brittle, especially when targeted with adversarial attacks (Szegedy et al., 2014). In response, there has been a major research effort to understand and certify robustness properties of deep neural networks, e.g. Raghunathan et al. (2018a); Tjeng et al. (2018); Liu et al. (2019); Cohen et al. (2019), and many others. Global Lipschitz bounds (a.k.a. incremental gain bounds) provide a somewhat crude but nevertheless highly useful proxy for robustness (Tsuzuku et al., 2018; Fazlyab et al., 2019), and appear in several analyses of generalization (e.g. Bartlett et al., 2017; Zhou & Schoellig, 2019). Inspired by both of these lines of research, in this paper we propose new parameterizations of equilibrium networks with guaranteed Lipschitz bounds.

We build directly on the monotone operator framework of Winston & Kolter (2020) and the work of Fazlyab et al. (2019) on Lipschitz bounds. The main contribution of our paper is the ability to enforce tight bounds on the Lipschitz constant of an equilibrium network during training with essentially no extra computational effort. In addition, we prove existence of solutions under less restrictive conditions on the weight matrix and more natural assumptions on the activation functions via novel connections to convex optimization and contracting dynamical systems. Finally, we show via small-scale image classification experiments that the proposed parameterizations can provide significant improvements in robustness to adversarial attacks with little degradation in nominal accuracy. Furthermore, we observe small gaps between certified Lipschitz upper bounds and observed lower bounds computed via adversarial attack.
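As a minimal illustration of this setup (not the parameterization developed in this paper), the sketch below solves a small ReLU equilibrium network z = σ(Wz + Ux + b) by fixed-point iteration and compares an empirical Lipschitz lower bound, obtained from ratios over random input pairs, against a crude norm-based certified upper bound. The weights are random and the contraction condition ‖W‖₂ < 1 is an illustrative sufficient condition for a unique equilibrium; the paper's results hold under less restrictive conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 16, 4  # equilibrium (hidden) dimension, input dimension

# Illustrative weights only: scaling W so that ||W||_2 < 1 makes the ReLU
# fixed-point map a contraction (ReLU is 1-Lipschitz), which is a simple
# sufficient condition for a unique equilibrium -- NOT the paper's setting.
W = rng.standard_normal((n, n))
W *= 0.9 / np.linalg.norm(W, 2)
U = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def equilibrium(x, iters=500, tol=1e-12):
    """Solve z = relu(W z + U x + b) by fixed-point iteration."""
    z = np.zeros(n)
    for _ in range(iters):
        z_new = np.maximum(W @ z + U @ x + b, 0.0)
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

# Empirical Lipschitz LOWER bound of x -> z*(x): largest observed ratio
# ||z*(x1) - z*(x2)|| / ||x1 - x2|| over random input pairs.
lip_lower = max(
    np.linalg.norm(equilibrium(x1) - equilibrium(x2)) / np.linalg.norm(x1 - x2)
    for x1, x2 in (rng.standard_normal((2, d)) for _ in range(200))
)

# Crude certified UPPER bound in the contraction case: from
# ||z1 - z2|| <= ||W|| ||z1 - z2|| + ||U|| ||x1 - x2||
# it follows that the Lipschitz constant is at most ||U|| / (1 - ||W||).
lip_upper = np.linalg.norm(U, 2) / (1.0 - np.linalg.norm(W, 2))
```

For this toy contraction the gap between `lip_lower` and `lip_upper` is typically large; tightening exactly this kind of gap with a trainable certified bound is what the proposed parameterizations address.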

