ROBUST NEURAL ODES VIA CONTRACTIVITY-PROMOTING REGULARIZATION

Abstract

Neural networks can be fragile to input noise and adversarial attacks. In this work, we consider Neural Ordinary Differential Equations (NODEs), a family of continuous-depth neural networks represented by dynamical systems, and propose to use contraction theory to improve their robustness. A dynamical system is contractive if two trajectories starting from different initial conditions converge to each other exponentially fast. Contractive NODEs can enjoy increased robustness, as slight perturbations of the features do not cause a significant change in the output. Contractivity can be induced during training by using a regularization term involving the Jacobian of the system dynamics. To reduce the computational burden, we show that it can also be promoted using carefully selected weight regularization terms for a class of NODEs with slope-restricted activation functions, including convolutional networks commonly used in image classification. The performance of the proposed regularizers is illustrated through benchmark image classification tasks on the MNIST and FashionMNIST datasets, where images are corrupted by different kinds of noise and attacks.

1. INTRODUCTION

Neural networks (NNs) have demonstrated outstanding performance in image classification, natural language processing, and speech recognition tasks. However, they can be sensitive to input noise or meticulously crafted adversarial attacks (Xu et al., 2020; Carlini & Wagner, 2017; Athalye et al., 2018; Szegedy et al., 2013). The customary remedies are either heuristic, such as feature obfuscation (Miller et al., 2020), adversarial training (Goodfellow et al., 2014; Allen-Zhu & Li, 2022), and defensive distillation (Papernot et al., 2016), or certificate-based, such as Lipschitz regularization (Xu et al., 2020; Fazlyab et al., 2019; Pauli et al., 2021; Aquino et al., 2022; Virmaux & Scaman, 2018; Combettes & Pesquet, 2020). The overall intent of certificate-based approaches is to penalize the input-to-output sensitivity of NNs to improve robustness.

Recently, the connections between NNs and dynamical systems have been extensively explored. Representative results include classes of NNs stemming from the discretization of dynamical systems (Haber & Ruthotto, 2017) and NODEs (Chen et al., 2018), which transform the input through a continuous-time ODE embedding the training parameters. The continuous-time nature of NODEs makes them particularly suitable for learning complex dynamical systems (Rubanova et al., 2019; Greydanus et al., 2019) and allows borrowing tools from dynamical system theory to analyze their properties (Fazlyab et al., 2022; Galimberti et al., 2021).

In this paper, we employ contraction theory to improve the robustness of NODEs. A dynamical system is contractive if all trajectories converge exponentially fast to each other (Lohmiller & Slotine, 1998; Tsukamoto et al., 2021). Through the lens of contraction, slight perturbations of the initial condition have a diminishing impact on the NODE state over time. With the above considerations, we propose a class of regularizers that promote the contractivity of NODEs during training.
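To make the contraction property concrete, the following sketch integrates a toy contractive ODE from two different initial conditions and checks that the trajectories approach each other exponentially. The dynamics f, the step size, and the horizon are illustrative choices and are not taken from the paper; here f has a Jacobian whose symmetric part has eigenvalues at most -0.5, so trajectory distances shrink at least as fast as exp(-0.5 t).

```python
import numpy as np

def f(x):
    # Toy contractive dynamics (illustrative, not one of the paper's models):
    # Jacobian = -I + 0.5*diag(1 - tanh^2(x)), with eigenvalues in [-1, -0.5].
    return -x + 0.5 * np.tanh(x)

def integrate(x0, dt=0.01, T=5.0):
    # Forward-Euler rollout of dx/dt = f(x).
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * f(x)
    return x

x0_a = np.array([1.0, -2.0])
x0_b = np.array([1.5, -1.0])   # perturbed initial condition
d0 = np.linalg.norm(x0_a - x0_b)
dT = np.linalg.norm(integrate(x0_a) - integrate(x0_b))
# Contraction rate >= 0.5, so dT should be at most roughly exp(-0.5*5) * d0.
```

The same qualitative behavior, a shrinking gap between the nominal and perturbed state, is what the proposed regularizers aim to induce in trained NODEs.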
In the most general case, the regularizers require the Jacobian matrix of the NODE dynamics, which might be computationally challenging to obtain for deep networks. Nevertheless, for a wide class of NODEs with slope-restricted activation functions, we show that contractivity can be promoted by directly penalizing the weights during training. Moreover, by leveraging the linearity of convolution operations, we demonstrate that contractivity can be promoted for convolutional NODEs by regularizing the convolution filters only.
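One standard way to certify contractivity, which a Jacobian-based regularizer can target, is through the logarithmic 2-norm: a system is contractive if the largest eigenvalue of the symmetric part of its Jacobian is negative at all states. The sketch below is a minimal numpy illustration of this idea for dynamics of the form f(x) = W2 tanh(W1 x), with a hinge penalty evaluated at sampled states; the dynamics, the sampling scheme, and the margin are assumptions for illustration and do not reproduce the paper's exact regularizer.

```python
import numpy as np

def log_norm_2(J):
    # Logarithmic 2-norm: largest eigenvalue of the symmetric part of J.
    return np.linalg.eigvalsh(0.5 * (J + J.T)).max()

def jacobian(x, W1, W2):
    # For f(x) = W2 @ tanh(W1 @ x):  J(x) = W2 @ diag(1 - tanh^2(W1 x)) @ W1.
    d = 1.0 - np.tanh(W1 @ x) ** 2
    return W2 @ (d[:, None] * W1)

def contraction_penalty(xs, W1, W2, margin=0.1):
    # Hinge penalty: positive whenever mu_2(J(x)) > -margin at a sampled x.
    return sum(max(0.0, log_norm_2(jacobian(x, W1, W2)) + margin)
               for x in xs) / len(xs)

rng = np.random.default_rng(0)
W1 = 0.5 * rng.standard_normal((8, 4))
xs = [rng.standard_normal(4) for _ in range(5)]
# W2 = -W1.T makes J negative semidefinite (small penalty);
# W2 = +W1.T makes J positive semidefinite (penalty at least the margin).
p_contract = contraction_penalty(xs, W1, -W1.T)
p_expand = contraction_penalty(xs, W1, W1.T)
```

Because tanh is slope-restricted to [0, 1], the term diag(1 - tanh^2) is bounded, which is the structural property that lets a contraction certificate of this kind be translated into conditions on the weights alone, avoiding the Jacobian computation during training.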

