SOUND RANDOMIZED SMOOTHING IN FLOATING-POINT ARITHMETIC

Abstract

Randomized smoothing is sound when computations are carried out with infinite precision. However, we show that it is no longer sound under limited floating-point precision. We present a simple example where randomized smoothing certifies a radius of 1.26 around a point even though an adversarial example exists at distance 0.8, and we show how this can be abused to produce false certificates for CIFAR10. We discuss the implicit assumptions of randomized smoothing and show that they do not hold for the generic image classification models whose smoothed versions are commonly certified. To overcome this problem, we propose a sound approach to randomized smoothing under floating-point precision that runs at essentially the same speed for quantized input. It yields sound certificates for image classifiers which, for the classifiers tested so far, are very similar to those of the unsound standard practice. Our only assumption is that we have access to a fair coin.

1. INTRODUCTION

Shortly after the advent of deep learning, Szegedy et al. (2014) observed that there exist adversarial examples, i.e., small imperceptible modifications of the input which change the decision of the classifier. This property is of major concern in application areas where safety and security are critical, such as medical diagnosis or autonomous driving. To overcome this issue, many defenses have appeared over the years, but new attacks were proposed that could break these defenses; see, e.g., Athalye et al. (2018); Croce and Hein (2020); Tramer et al. (2020); Carlini et al. (2019). The only empirical (i.e., without guarantees) method which seems to work is adversarial training (Goodfellow et al., 2015; Madry et al., 2018), but there too, many defenses turned out to be substantially weaker than originally thought (Croce and Hein, 2020). Hence, the focus has shifted to certified robustness, where the aim is to produce certificates assuring that no adversarial example exists in a small neighborhood of the original image. For the neighborhood, typically called the threat model, one often uses ℓp-balls centered at the original image, although other choices exist, such as Wasserstein balls (Wong et al., 2019; Levine and Feizi, 2020) or balls induced by perceptual metrics (Laidlaw et al., 2021; Voráček and Hein, 2022). The common certification techniques include (1) bounding the Lipschitz constant of the network, see Hein and Andriushchenko (2017); Li et al. (2019); Trockman and Kolter (2021); Leino et al. (2021); Singla et al. (2022) for the ℓ2 threat model and Zhang et al. (2022) for ℓ∞; (2) overapproximating the threat model by its convex relaxation (admittedly, bounding the Lipschitz constant can also be interpreted this way), possibly combined with mixed-integer linear programs or SMT, see, e.g., Katz et al. (2017); Gowal et al. (2018); Wong et al. (2018); Balunovic and Vechev (2020); (3) randomized smoothing (Lecuyer et al., 2019; Cohen et al., 2019; Salman et al., 2019), which is hitherto the only method scaling to ImageNet. Note that randomized smoothing may also be interpreted as a special case of (1); see Salman et al. (2019). All of these certificates assume that calculations can be done with unlimited precision and do not take into account how finite-precision arithmetic affects the certificates. For Lipschitz networks (1), the round-off error is of the order of the least significant bits of the mantissa, which we can estimate to be on the order of ∼10⁻⁸ for single-precision floating-point numbers.
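As a concrete illustration of this order of magnitude, the following short numpy sketch (our own illustration, not part of the certification procedure) shows the single-precision unit roundoff and two of its immediate consequences:

```python
import numpy as np

# Unit roundoff of IEEE 754 single precision: eps/2 = 2**-24 ≈ 5.96e-08,
# matching the ~1e-8 order-of-magnitude estimate for per-operation round-off.
print(np.finfo(np.float32).eps / 2)       # ≈ 5.96e-08

# A perturbation of 1e-8 is absorbed by a single addition near 1.0:
a = np.float32(1.0)
print(a + np.float32(1e-8) == a)          # True: the perturbation vanishes

# Floating-point addition is not even associative:
left = (np.float32(1.0) + np.float32(-1.0)) + np.float32(1e-8)
right = np.float32(1.0) + (np.float32(-1.0) + np.float32(1e-8))
print(left == right)                      # False
```

This is exactly the regime in which per-layer errors of the order ∼10⁻⁸ arise.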
Thus, we should assume that the adversary can also inject an ℓ∞-perturbation bounded by ∼10⁻⁸ in every layer. However, since these networks have small Lipschitz constants by construction, such errors will not be significantly magnified. Although we cannot universally quantify the numerical errors of Lipschitz networks, they will likely be very small and, in particular, can be efficiently traced during the forward pass so that the certificates can be made sound. For the verification methods from category (2), previous works have shown that numerical errors may lead to false certificates for methods based on SMT or mixed-integer linear programming (Jia and Rinard, 2021; Zombori et al., 2021). However, it is possible (and often done in practice) to adapt the verification procedure to be sound w.r.t. floating-point inaccuracies (Singh et al., 2019); thus, the problem is not fundamental, and these verification techniques can be made sound. For randomized smoothing certificates (3), Jin et al. (2022) perform floating-point attacks on certifiably robust networks and indicate the existence of false certificates; see Appendix F for a discussion. The recent work of Lin et al. (2021) focuses on randomized smoothing when using only integer arithmetic in neural networks for embedded devices, which by definition avoids floating-point errors. On the other hand, it does not cover some modern architectures, such as transformers. Furthermore, the way the certificates are computed is derived from the continuous normal distribution; thus, the certificates are approximate, see Appendix G. Another direction is so-called derandomized smoothing: methods that resemble randomized smoothing but are deterministic; see, e.g., Levine and Feizi (2021).

In this paper, we make the following contributions:

1. We perform a novel analysis of numerical errors in randomized smoothing approaches when using floating-point arithmetic and identify qualitatively new problems.

2. Building on these observations, we present a simple approach for constructing classifiers whose smoothed versions provide fundamentally wrong certificates for chosen points, and we discuss how this could be exploited in practice.

3. We propose a sound randomized smoothing procedure for floating-point arithmetic with negligible computational overhead for image classification compared to the unsound practice.

While we could not find substantial differences between our sound certificates and the unsound practice for the classifiers we tested, the lack of a counterexample is not a proof of correctness, and the past has shown that such gaps will be exploited by malicious actors. It is to be expected that certificates of adversarial robustness will be required for classifiers used in safety-critical systems (see the European AI Act, European Commission (2021)) and thus will be controlled by regulatory bodies. A malicious company could then use the problems of randomized smoothing in floating-point arithmetic to provide fake certificates on a known or leaked test set. Since our sound randomized smoothing procedure for floating-point arithmetic comes at essentially no additional cost for quantized input, e.g., images, we believe it should always be used for such domains.
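Since the only assumption of the proposed sound procedure is access to a fair coin, it may help to see why this primitive suffices for exact sampling without any floating-point arithmetic. The sketch below (our own illustration; the paper's actual procedure is given later and in Appendix K) draws an exactly uniform integer from fair coin flips via rejection sampling over fixed-width bit blocks:

```python
import random

def fair_coin():
    # Stand-in for a source of unbiased random bits (the paper's only assumption).
    return random.getrandbits(1)

def uniform_int(n):
    """Draw an exactly uniform integer in [0, n) using only fair coin flips.

    Rejection sampling: assemble k = ceil(log2(n)) coin flips into an integer
    and retry if it falls outside [0, n). No floating point is involved, so
    the resulting distribution is exact, not approximate.
    """
    k = (n - 1).bit_length() if n > 1 else 1
    while True:
        x = 0
        for _ in range(k):
            x = (x << 1) | fair_coin()
        if x < n:
            return x
```

Each accepted block is uniform on [0, n) by symmetry of the rejected region, which is the kind of exactness that floating-point samplers cannot provide.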

Manuscript organization:

We start with the definition of randomized smoothing in Section 2 and continue with an introduction to floating-point arithmetic following the IEEE 754 standard (IEEE, 2008) in Section 3. In Section 4, we exploit the properties of floating-point arithmetic to present a simple classifier producing wrong certificates, and we follow with the identification of the implicit assumptions of randomized smoothing. In Section 5, we conclude with the main result: a method for sound randomized smoothing in floating-point arithmetic, together with an experimental comparison of the old unsound and the new sound certificates.

2. RANDOMIZED SMOOTHING

Throughout the paper, we consider for clarity the problem of binary classification, but every phenomenon we discuss transfers easily to the multiclass setting. We note that the proposed algorithmic fix (see Appendix K), as well as the experiments in Appendix A, are done for the multiclass setting. We first introduce randomized smoothing and define certificates with respect to a norm ball.

Definition 2.1. A classifier F ∶ R^d → {0, 1} is said to be certifiably robust at point x ∈ R^d with radius r w.r.t. norm ∥⋅∥ if the correct label at x is y ∈ {0, 1} and ∥x − x′∥ ≤ r ⇒ F(x′) = y.

One way to obtain such a certificate is randomized smoothing (Lecuyer et al., 2019; Cohen et al., 2019; Salman et al., 2019), which we introduce in the following. We are given a base classifier F ∶ R^d → {0, 1}.
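As a reference point for the idealized (infinite-precision) certificate, the standard Monte Carlo procedure of Cohen et al. (2019) can be sketched as follows. This is our own simplified illustration: the confidence bound is a simple Hoeffding bound rather than the Clopper–Pearson bound used in practice, and all names and parameters are illustrative.

```python
import math
import random
from statistics import NormalDist

def certify(base_classifier, x, sigma=0.5, n=2000, alpha=0.001):
    """Monte Carlo certification sketch in the style of Cohen et al. (2019).

    base_classifier maps a list of floats to a label in {0, 1}. We smooth it
    with Gaussian noise of scale sigma and certify the l2 radius
    sigma * Phi^{-1}(p_lower), where p_lower lower-bounds the top-class
    probability with confidence 1 - alpha.
    """
    counts = [0, 0]
    for _ in range(n):
        noisy = [xi + random.gauss(0.0, sigma) for xi in x]
        counts[base_classifier(noisy)] += 1
    label = 0 if counts[0] >= counts[1] else 1
    # Hoeffding one-sided lower confidence bound on the top-class probability.
    p_lower = counts[label] / n - math.sqrt(math.log(1 / alpha) / (2 * n))
    if p_lower > 0.5:
        return label, sigma * NormalDist().inv_cdf(p_lower)  # certified radius
    return label, 0.0  # abstain: majority class not confidently above 1/2
```

Every quantity here (the Gaussian samples, the division, Φ⁻¹) is computed in floating point, which is precisely where the soundness issues analyzed in Section 4 enter.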



Code is available at https://github.com/vvoracek/Sound-Randomized-Smoothing



