Scaling provable adversarial defenses
Advances in Neural Information Processing Systems (NIPS) - Dec 2018
Recent work has developed methods for learning deep network classifiers that
are provably robust to norm-bounded adversarial perturbation; however, these
methods are currently only possible for relatively small feedforward
networks. In this paper, in an effort to scale these approaches to
substantially larger models, we extend previous work in three main directions.
First, we present a technique for extending these training procedures to much
more general networks, with skip connections (such as ResNets) and general
nonlinearities; the approach is fully modular, and can be implemented
automatically (analogous to automatic differentiation). Second, in the
specific case of l∞ adversarial perturbations and networks with ReLU
nonlinearities, we adopt a nonlinear random projection for training, which
scales linearly in the number of hidden units (previous approaches scaled
quadratically). Third, we show how to further improve robust error through
cascade models. On both MNIST and CIFAR data sets, we train classifiers that
improve substantially on the state of the art in provable robust adversarial
error bounds: from 5.8% to 3.1% on MNIST (with l∞ perturbations of
ε=0.1), and from 80% to 36.4% on CIFAR (with l∞ perturbations
of ε=2/255).
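The random projection in the second contribution rests on a standard trick for estimating l1 norms: by the 1-stability of the Cauchy distribution, if r has i.i.d. standard Cauchy entries then a·r is Cauchy-distributed with scale ||a||_1, so the median of |a·r| over a modest number of random vectors r concentrates around ||a||_1. The sketch below is a minimal illustration of that estimator, not the authors' implementation; the function name and parameters are invented for this example. Roughly speaking, the paper applies the same idea to the l1 terms of the robust bound by pushing Cauchy random vectors through the network, so the cost grows with the number of projections rather than the number of hidden units.

import numpy as np

def estimate_row_l1_norms(matvec, d, k=200, seed=None):
    """Estimate the l1 norm of every row of an implicit matrix A (shape m x d),
    accessed only via matvec(x) = A @ x, from k Cauchy random projections.

    1-stability: for r ~ Cauchy(0,1)^d, the dot product a_i . r is distributed
    as Cauchy(0, ||a_i||_1), whose absolute value has median ||a_i||_1.
    Cost: k matrix-vector products instead of one per hidden unit.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_cauchy(size=(k, d))                 # k random probe vectors
    samples = np.stack([np.abs(matvec(r)) for r in R])   # shape (k, m)
    return np.median(samples, axis=0)                    # per-row estimates, shape (m,)

# Quick check against the exact row-wise l1 norms of an explicit matrix.
A = np.random.default_rng(0).standard_normal((16, 500))
est = estimate_row_l1_norms(lambda x: A @ x, d=500, k=400, seed=1)
exact = np.abs(A).sum(axis=1)
print(np.mean(np.abs(est - exact) / exact))  # coarse estimate; error shrinks like 1/sqrt(k)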
This paper is also stored on arXiv.
BibTeX reference
@InProceedings{WSMK18,
  author    = "Wong, Eric and Schmidt, Frank R. and Metzen, Jan Hendrik and Kolter, J. Zico",
  title     = "Scaling provable adversarial defenses",
  booktitle = "Advances in Neural Information Processing Systems (NIPS)",
  month     = "Dec",
  year      = "2018",
  url       = "http://frank-r-schmidt.de/Publications/2018/WSMK18"
}