Transferability with a Bayesian approach

Efficient and Transferable Adversarial Examples from Bayesian Neural Networks

Martin Gubri, Maxime Cordy, Mike Papadakis, Yves Le Traon, Koushik Sen

Abstract

An established way to improve the transferability of black-box evasion attacks is to craft the adversarial examples on an ensemble-based surrogate to increase diversity. We argue that transferability is fundamentally related to uncertainty. Based on a state-of-the-art Bayesian Deep Learning technique, we propose a new method to efficiently build a surrogate by sampling approximately from the posterior distribution of neural network weights, which represents the belief about the value of each parameter. Our extensive experiments on ImageNet, CIFAR-10 and MNIST show that our approach significantly improves the success rates of four state-of-the-art attacks (by up to 83.2 percentage points), in both intra-architecture and inter-architecture transferability. On ImageNet, our approach can reach a 94% success rate while reducing training computations from 11.6 to 2.4 exaflops, compared to an ensemble of independently trained DNNs. Our vanilla surrogate achieves higher transferability than three test-time techniques designed for this purpose 87.5% of the time. Our work demonstrates that how the surrogate is trained has been overlooked, although it is an important element of transfer-based attacks. We are, therefore, the first to review the effectiveness of several training methods in increasing transferability. We provide new directions to better understand the transferability phenomenon and offer a simple but strong baseline for future work.
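
To make the idea in the abstract concrete, here is a minimal sketch of crafting an adversarial example against several weight samples drawn approximately from a posterior, rather than against a single trained model. It is not the paper's implementation. The abstract does not name the sampler, so this sketch assumes plain full-batch Langevin dynamics on a toy two-dimensional logistic regression, and uses a single sign-gradient step (FGSM-style) as the attack. All function names, the data, and the hyperparameters are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def langevin_samples(data, n_steps=2000, step=0.01, burn_in=1000,
                     thin=100, prior_var=10.0, seed=0):
    """Approximate posterior samples of logistic-regression weights.

    Assumption: full-batch Langevin dynamics with a Gaussian prior,
    standing in for the (unnamed) Bayesian deep learning technique.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]
    samples = []
    for t in range(n_steps):
        # Gradient of the negative log posterior (prior term + data term).
        grad = [wi / prior_var for wi in w]
        for x, y in data:
            p = sigmoid(w[0] * x[0] + w[1] * x[1])
            for i in range(2):
                grad[i] += (p - y) * x[i]
        # Langevin update: half gradient step plus Gaussian noise.
        w = [wi - 0.5 * step * gi + rng.gauss(0.0, math.sqrt(step))
             for wi, gi in zip(w, grad)]
        if t >= burn_in and (t - burn_in) % thin == 0:
            samples.append(list(w))
    return samples


def ensemble_loss(samples, x, y):
    # Cross-entropy loss averaged over all weight samples (softplus form).
    total = 0.0
    for w in samples:
        z = w[0] * x[0] + w[1] * x[1]
        a = -z if y == 1 else z
        total += max(a, 0.0) + math.log1p(math.exp(-abs(a)))
    return total / len(samples)


def ensemble_fgsm(samples, x, y, eps):
    # One sign-gradient step on the loss averaged over the weight samples.
    grad = [0.0, 0.0]
    for w in samples:
        p = sigmoid(w[0] * x[0] + w[1] * x[1])
        for i in range(2):
            grad[i] += (p - y) * w[i] / len(samples)
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, grad)]


# Hypothetical toy data: two points per class.
data = [([1.0, 2.0], 1), ([2.0, 1.5], 1),
        ([-1.0, -1.5], 0), ([-2.0, -0.5], 0)]
samples = langevin_samples(data)

x, y = [1.0, 1.0], 1
x_adv = ensemble_fgsm(samples, x, y, eps=0.5)
loss_clean = ensemble_loss(samples, x, y)
loss_adv = ensemble_loss(samples, x_adv, y)
print(len(samples), loss_clean, loss_adv)
```

In this toy setting the averaged loss is convex in the input, so the sign-gradient step cannot decrease it. The sketch says nothing about transfer to an unseen target model, which is what the paper evaluates.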

Type

Paper

Publication

Accepted at UAI 2022

Date

July, 2022

Links

PDF Code Poster

This paper was accepted at UAI 2022.

Paper

The manuscript can be downloaded from arXiv or OpenReview.

Poster

The poster presented at UAI 2022 can be downloaded here.