LOSS LANDSCAPE MATTERS: TRAINING CERTIFIABLY ROBUST MODELS WITH FAVORABLE LOSS LANDSCAPE

Abstract

In this paper, we study the problem of training certifiably robust models. Certifiable training minimizes an upper bound on the worst-case loss over the allowed perturbation, so the tightness of this upper bound is an important factor in building certifiably robust models. However, many studies have shown that Interval Bound Propagation (IBP) training, despite using much looser bounds, outperforms methods that use tighter bounds. We identify another key factor that influences the performance of certifiable training: the smoothness of the loss landscape. We examine linear relaxation-based methods and find significant differences in the loss landscape across these methods. Based on this analysis, we propose a certifiable training method that utilizes a tighter upper bound and has a landscape with favorable properties. The proposed method achieves performance comparable to state-of-the-art methods under a wide range of perturbations.

1. INTRODUCTION

Despite the success of deep learning in many applications, the existence of adversarial examples, imperceptibly modified inputs designed to fool the neural network (Szegedy et al., 2013; Biggio et al., 2013), hinders the application of deep learning to safety-critical domains. There has been increasing interest in building models that are robust to adversarial attacks (Goodfellow et al., 2014; Papernot et al., 2016; Kurakin et al., 2016; Madry et al., 2018; Tramèr et al., 2017; Zhang et al., 2019a; Xie et al., 2019). However, most defense methods evaluate their robustness with adversarial accuracy against predefined attacks such as the PGD attack (Madry et al., 2018) or the C&W attack (Carlini & Wagner, 2017), and can therefore be broken by new attacks (Athalye et al., 2018). To this end, many training methods have been proposed to build a certifiably robust model that is guaranteed to be robust to adversarial perturbations (Hein & Andriushchenko, 2017; Raghunathan et al., 2018b; Wong & Kolter, 2018; Dvijotham et al., 2018; Mirman et al., 2018; Gowal et al., 2018; Zhang et al., 2019b). These methods develop an upper bound on the worst-case loss over valid adversarial perturbations and minimize it to train a certifiably robust model. Certifiable training methods fall mainly into two categories: linear relaxation-based methods and bound propagation methods. Linear relaxation-based methods use relatively tighter bounds, but are slow, hard to scale to large models, and memory-inefficient (Wong & Kolter, 2018; Wong et al., 2018; Dvijotham et al., 2018). On the other hand, bound propagation methods, represented by Interval Bound Propagation (IBP), are fast and scalable due to the use of simple but much looser bounds (Mirman et al., 2018; Gowal et al., 2018). One would expect training with tighter bounds to yield better performance, yet IBP outperforms linear relaxation-based methods in many cases despite its much looser bounds.
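To make the contrast concrete, the interval bounds that IBP propagates can be sketched with plain interval arithmetic. The following is a minimal NumPy illustration (the helper names and the toy network are ours, not taken from any of the cited implementations): an affine layer maps a box `[l, u]` to a new box by splitting the weights into positive and negative parts, and the monotone ReLU maps bounds elementwise.

```python
import numpy as np

def interval_affine(l, u, W, b):
    """Propagate the box [l, u] through an affine layer W x + b.

    Interval arithmetic: positive weights pair lower-with-lower and
    upper-with-upper; negative weights swap the roles.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    new_l = W_pos @ l + W_neg @ u + b
    new_u = W_pos @ u + W_neg @ l + b
    return new_l, new_u

def interval_relu(l, u):
    """ReLU is elementwise monotone, so it maps bounds directly."""
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

# Toy example: one hidden layer, l-infinity perturbation of radius eps.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
eps = 0.1
l, u = x - eps, x + eps

W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
l, u = interval_relu(*interval_affine(l, u, W1, b1))
```

By construction the resulting box contains every reachable activation, but each layer ignores correlations between neurons, which is why the interval bound is loose compared with linear relaxations.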

These observations on the performance of certifiable training methods raise the following questions:

Why does training with tighter bounds not result in better performance? What other factors may influence the performance of certifiable training? How can we improve the performance of certifiable training methods with tighter bounds?

In this paper, we provide empirical and theoretical analysis to answer these questions. First, we demonstrate that IBP (Gowal et al., 2018) has a more favorable loss landscape than linear relaxation-based methods, which is why it often performs better even with much looser bounds. To account for this difference, we present a unified view of IBP and linear relaxation-based methods and find that the relaxed gradient approximation (which will be defined in Definition 1) of each method plays a crucial role in its optimization behavior. Based on this analysis of the loss landscape and the optimization behavior, we propose a new certifiable training method that has a favorable landscape with tighter bounds. The performance of the proposed method is comparable to that of state-of-the-art methods under a wide range of perturbations. We summarize the contributions of this study as follows:

• We provide empirical and theoretical analysis of the loss landscape of certifiable training methods and find that smoothness of the loss landscape is important for building certifiably robust models.

• We propose a certifiable training method with tighter bounds and a favorable loss landscape, obtaining performance comparable to state-of-the-art methods under a wide range of perturbations.

Although beyond our focus here, there is another line of work on randomized smoothing (Li et al., 2018; Lecuyer et al., 2019; Cohen et al., 2019; Salman et al., 2019), which can certify robustness with arbitrarily high probability by using a smoothed classifier. However, it requires a large number of samples at inference time.

2. RELATED WORK

There are many other works on certified verification (Weng et al., 2018; Singh et al., 2018a;b; 2019; Zhang et al., 2018; Boopathy et al., 2019; Lyu et al., 2020). However, our work focuses on certifiable training.



We also encourage our model to avoid unstable ReLUs, but we train the model with an upper bound on the worst-case loss and investigate ReLU stability from the loss landscape perspective.

Mirman et al. (2018) proposed propagating a geometric bound (called a domain) through the network to yield an outer approximation in logit space. This can be done with an efficient layerwise computation that exploits interval arithmetic. Over the outer domain, one can compute the worst-case loss to be minimized during training. Gowal et al. (2018) used a special case of domain propagation called Interval Bound Propagation (IBP), which uses the simplest domain, the interval domain (or interval bound). In IBP, the authors introduced a different objective function, heuristic scheduling of the hyperparameters, and elision of the last layer to stabilize training and improve performance.

Both approaches, linear relaxation-based methods and bound propagation methods, minimize an upper bound on the worst-case loss. Bound propagation methods exploit much looser upper bounds, but they enjoy an unexpected benefit in many cases: better robustness than linear relaxation-based methods. Balunovic & Vechev (2019) hypothesized that the complexity of the loss computation makes the optimization more difficult, which could be a reason why IBP outperforms linear relaxation-based methods, and proposed a new optimization procedure with the existing linear relaxation. In this paper, we further investigate the causes of these optimization difficulties.

Recently, Zhang et al. (2019b) proposed CROWN-IBP, which uses the linear relaxation from a verification method called CROWN (Zhang et al., 2018) in conjunction with IBP to train a certifiably robust model.
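The "elision of the last layer" mentioned above can be illustrated with a short sketch. Rather than bounding the logits and then differencing them, the margin z_j − z_y for true class y is folded into a single affine map whose bound is computed directly, which is tighter than bounding each logit separately. This is a minimal NumPy sketch under our own simplifying assumptions (a single affine last layer acting on a box of pre-logit activations), not the authors' implementation:

```python
import numpy as np

def worst_case_margins(l, u, W, b, y):
    """Upper-bound the logit margins z_j - z_y over the box [l, u].

    Elision trick: the margin is itself an affine function with weights
    W_j - W_y, so one interval-arithmetic pass bounds it directly.
    """
    W_diff = W - W[y]                     # row j holds W_j - W_y
    b_diff = b - b[y]
    W_pos, W_neg = np.maximum(W_diff, 0.0), np.minimum(W_diff, 0.0)
    m = W_pos @ u + W_neg @ l + b_diff    # worst case of each margin
    m[y] = -np.inf                        # the true class is excluded
    return m                              # certified robust if all m < 0
```

When the box degenerates to a point (l = u), the bound is exact and reduces to the ordinary logit differences, which makes the routine easy to sanity-check.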

