EXPLOITING VERIFIED NEURAL NETWORKS VIA FLOATING POINT NUMERICAL ERROR

Abstract

Motivated by the need to reliably characterize the robustness of deep neural networks, researchers have developed verification algorithms for deep neural networks. Given a neural network, the verifiers aim to answer whether certain properties are guaranteed with respect to all inputs in a space. However, little attention has been paid to floating point numerical error in neural network verification. We exploit floating point errors in the inference and verification implementations to construct adversarial examples for neural networks that a verifier claims to be robust with respect to certain inputs. We argue that, to produce sound verification results, any verification system must accurately (or conservatively) model the effects of any floating point computations in the network inference or verification system.

1. INTRODUCTION

Deep neural networks (DNNs) are known to be vulnerable to adversarial inputs (Szegedy et al., 2014): images, audio, or texts that are indistinguishable from natural inputs to human perception but cause a DNN to produce substantially different results. This situation has motivated the development of network verification algorithms that claim to prove the robustness of a network (Bunel et al., 2020; Tjeng et al., 2019; Salman et al., 2019), specifically that the network produces identical classifications for all inputs in a perturbation space around a given input. Verification algorithms typically reason about the behavior of the network assuming real-valued arithmetic. In practice, however, both the verifier and the neural network run on physical computers that use floating point numbers and floating point arithmetic to approximate the underlying real-valued computations. This use of floating point introduces numerical error that can invalidate the guarantees that the verifiers claim to provide. Moreover, the existence of multiple software and hardware systems for DNN inference further complicates the situation, because different implementations exhibit different numerical error characteristics. We present concrete instances where numerical error leads to unsound verification of real-valued networks. Specifically, we train robust networks on the MNIST and CIFAR10 datasets. We work with the MIPVerify complete verifier (Tjeng et al., 2019) and several inference implementations included in the PyTorch (Paszke et al., 2019) framework. For each implementation, we construct image pairs (x_0, x_adv) where x_0 is a brightness-modified natural image, such that the implementation classifies x_adv differently from x_0, x_adv falls in an ℓ∞-bounded perturbation space around x_0, and the verifier incorrectly claims that no such adversarial image x_adv exists for x_0 within the perturbation space.
Moreover, we show that the incomplete verifier CROWN is also vulnerable to floating point error. Our method of constructing adversarial images is not specific to this setting; it applies to any verifier that does not soundly model floating point arithmetic.
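The property such a pair violates can be stated concretely. A minimal sketch in NumPy (the function name and the toy linear "model" below are illustrative, not the paper's actual networks) of checking that (x_0, x_adv) is a genuine counterexample under a given implementation:

```python
import numpy as np

def is_counterexample(model, x0, x_adv, eps):
    """Check that x_adv refutes the robustness claim for x0.

    model: any callable mapping an input to class logits (hypothetical).
    x0, x_adv: inputs as float arrays; eps: the l-infinity radius.
    """
    # x_adv must lie inside the l-infinity ball of radius eps around x0 ...
    inside_ball = np.max(np.abs(x_adv - x0)) <= eps
    # ... and the implementation must classify the two inputs differently.
    label_changed = np.argmax(model(x0)) != np.argmax(model(x_adv))
    return bool(inside_ball and label_changed)

# Toy demonstration with a linear "model" whose decision flips inside the ball.
w = np.array([1.0, -1.0])
model = lambda x: np.array([w @ x, -(w @ x)])
x0 = np.array([0.1, 0.0])
x_adv = np.array([0.0, 0.1])                         # max|x_adv - x0| = 0.1
print(is_counterexample(model, x0, x_adv, eps=0.1))  # True
```

The point of the attack in this paper is that the verifier claims no such x_adv exists, while a check like this one, run against the concrete floating point implementation, succeeds.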

2. BACKGROUND AND RELATED WORK

Training robust networks: Researchers have developed various techniques to train robust networks (Madry et al., 2018; Mirman et al., 2018; Tramer & Boneh, 2019; Wong et al., 2020). Madry et al. formulate robust training as minimizing the worst-case loss within the input perturbation space and propose training robust networks on data generated by the Projected Gradient Descent (PGD) adversary (Madry et al., 2018). In this work we consider robust networks trained with the PGD adversary.
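For illustration, one ℓ∞ PGD step ascends the loss along the gradient sign and projects back onto the perturbation ball. A generic NumPy sketch (the gradient would come from autodiff in practice; this is not the paper's training code):

```python
import numpy as np

def pgd_step(x, x0, grad, alpha, eps):
    """One Projected Gradient Descent step for an l-infinity adversary.

    x: current adversarial candidate; x0: the original input;
    grad: gradient of the loss w.r.t. x (supplied by autodiff in practice);
    alpha: step size; eps: perturbation radius.
    """
    x = x + alpha * np.sign(grad)        # ascend the loss
    x = np.clip(x, x0 - eps, x0 + eps)   # project back into the l-inf ball
    return np.clip(x, 0.0, 1.0)          # keep a valid input range

x0 = np.array([0.5, 0.2])
x = pgd_step(x0, x0, grad=np.array([1.0, -1.0]), alpha=0.05, eps=0.03)
print(x)  # the 0.05 step is clipped to the eps=0.03 ball: [0.53, 0.17]
```

Adversarial training runs several such steps per example and trains on the resulting worst-case inputs.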

Complete verification:

The goal of complete verification (a.k.a. exact verification) methods is to either prove the property being verified or provide a counterexample that disproves it. Complete verification approaches have formulated the verification problem as a Satisfiability Modulo Theories (SMT) problem (Scheibler et al., 2015; Huang et al., 2017; Katz et al., 2017; Ehlers, 2017; Bunel et al., 2020) or as a Mixed Integer Linear Programming (MILP) problem (Lomuscio & Maganti, 2017; Cheng et al., 2017; Fischetti & Jo, 2018; Dutta et al., 2018; Tjeng et al., 2019). While SMT solvers are able to model exact floating point arithmetic (Rümmer & Wahl, 2010) or exact real arithmetic (Corzilius et al., 2012), deployed SMT solvers for verifying neural networks all use inexact floating point arithmetic to reason about neural network inference, for efficiency reasons. MILP solvers work directly with floating point, do not attempt to exactly model real arithmetic, and therefore exhibit numerical error. Since floating point arithmetic is not associative, different neural network implementations may produce different results for the same network, implying that any sound verifier for this class of networks must reason about the specific floating point error characteristics of the implementation at hand. To the best of our knowledge, no prior work formally recognizes the problem of floating point error in complete neural network verification or exploits floating point error to invalidate verification results.

Incomplete verification:

On the spectrum of tradeoffs between completeness and scalability, incomplete methods (a.k.a. certification methods) aspire to deliver more scalable verification by adopting over-approximation, while admitting the inability to either prove or disprove the property in certain cases. There is a large body of related research (Wong & Kolter, 2017; Weng et al., 2018; Gehr et al., 2018; Zhang et al., 2018; Raghunathan et al., 2018; Dvijotham et al., 2018; Mirman et al., 2018; Singh et al., 2019). Salman et al. (2019) unified most of the relaxation methods under a common convex relaxation framework. Their results suggest that there is an inherent barrier to tight verification via the layer-wise convex relaxations captured by their framework. We highlight that the floating point error of implementations that use a direct dot-product formulation has been accounted for in some certification frameworks (Singh et al., 2018; 2019) by maintaining upper and lower rounding bounds for sound floating point arithmetic (Miné, 2004). Such frameworks should be extensible to model numerical error in more sophisticated implementations such as the Winograd convolution (Lavin & Gray, 2016), but the effectiveness of this extension remains to be studied. Most certification algorithms, however, have not considered floating point error and may be vulnerable to attacks that exploit this deficiency.

Floating point arithmetic:

Floating point is widely adopted as an approximate representation of real numbers in digital computers. After each calculation, the result is rounded to the nearest representable value, which induces roundoff error. In the field of neural networks, the SMT-based verifier Reluplex (Katz et al., 2017) has been observed to produce false adversarial examples due to floating point error (Wang et al., 2018), and the MILP-based verifier MIPVerify (Tjeng et al., 2019) has been observed to give NaN results when verifying pruned neural networks (Guidotti et al., 2020). Such floating point unsoundness has so far only surfaced unexpectedly while running large-scale benchmarks; no prior work systematically invalidates neural network verification results by exploiting floating point error.
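The rounding-bound idea behind such sound frameworks can be sketched in a few lines: round every lower bound downward and every upper bound upward after each operation, so the true real-valued result is always enclosed. A minimal pure-Python sketch (one-ulp outward widening via `math.nextafter`; production frameworks implement this far more carefully and tightly):

```python
import math

def interval_dot(ws, xs):
    """Soundly enclose the real-valued dot product sum(w * x), for w and x
    known exactly as floats, by widening each rounded intermediate result
    outward to the next representable float in each direction."""
    lo = hi = 0.0
    for w, x in zip(ws, xs):
        p = w * x  # rounded to nearest; true product is within one ulp
        lo = math.nextafter(lo + math.nextafter(p, -math.inf), -math.inf)
        hi = math.nextafter(hi + math.nextafter(p, math.inf), math.inf)
    return lo, hi

lo, hi = interval_dot([0.1, 0.2, 0.3], [1.0, 1.0, 1.0])
print(lo <= 0.1 + 0.2 + 0.3 <= hi)  # True: the rounded float sum is enclosed
```

Because round-to-nearest error is strictly less than one ulp per operation, widening each intermediate result by one ulp in each direction yields a (loose but sound) enclosure.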
The IEEE-754 standard (IEEE, 2008) defines the semantics of floating point operations and their correct rounding behavior. Even on an IEEE-754 compliant implementation, computing multi-step floating point expressions that are equivalent in the real domain may produce different final roundoff error, because rounding is performed after each step; this complicates error analysis. Research on estimating floating point roundoff error and on verifying floating point programs has a long history and is still actively growing (Boldo & Melquiond, 2017), but we are unaware of any attempt to apply these tools to obtain a sound verifier for a neural network inference implementation. Any such verifier must reason soundly about floating point error in both the verifier and the neural network inference algorithm. The failure to account for floating point error in software systems has caused real-world disasters. For example, in 1991, a Patriot missile missed its target, leading to casualties, due to floating point roundoff error in a time calculation (Skeel, 1992).
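This order dependence is easy to demonstrate: summing the same values in different orders, as two implementations of the same reduction might, can yield different rounded results.

```python
# Floating point addition is not associative: reordering a reduction,
# as different inference implementations do, changes the rounded result.
vals = [1e16, 1.0, -1e16]
left_to_right = (vals[0] + vals[1]) + vals[2]  # 1.0 is absorbed: 0.0
reordered     = (vals[0] + vals[2]) + vals[1]  # cancellation first: 1.0
print(left_to_right, reordered)  # 0.0 1.0
```

Here 1e16 + 1.0 rounds back to 1e16 because the ulp of 1e16 exceeds 1.0, so the two real-equivalent expressions differ by exactly 1.0 in floating point.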

