Paper Title

Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution

Authors

Yan Feng, Baoyuan Wu, Yanbo Fan, Li Liu, Zhifeng Li, Shutao Xia

Abstract

This work studies black-box adversarial attacks against deep neural networks (DNNs), where the attacker can only access the query feedback returned by the attacked DNN model, while other information, such as model parameters or the training dataset, is unknown. One promising approach to improving attack performance is to exploit the adversarial transferability between white-box surrogate models and the target model (i.e., the attacked model). However, due to possible differences in model architecture and training data between the surrogate and target models, dubbed "surrogate biases", the contribution of adversarial transferability to attack performance may be weakened. To tackle this issue, we propose a black-box attack method built on a novel mechanism of adversarial transferability that is robust to surrogate biases. The general idea is to transfer part of the parameters of the conditional adversarial distribution (CAD) of surrogate models, while learning the untransferred parameters from queries to the target model; this preserves the flexibility to adjust the CAD of the target model for any new benign sample. Extensive experiments on benchmark datasets, as well as attacks against a real-world API, demonstrate the superior attack performance of the proposed method.
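
To make the partial-transfer idea concrete, below is a minimal, runnable Python sketch. It is an illustrative simplification under stated assumptions, not the authors' implementation: the CAD is reduced to an isotropic Gaussian N(mu(x), sigma^2 I), where the mean mu(x) plays the role of the transferred surrogate parameters and the scale sigma is the untransferred parameter adapted from query feedback. The functions `surrogate_direction` and `target_loss` are hypothetical stand-ins for a white-box surrogate and the black-box target model.

```python
# Minimal sketch of transferring part of a conditional adversarial
# distribution (CAD) while tuning the rest via black-box queries.
# NOT the paper's actual CAD parameterization; a toy simplification.
import numpy as np

rng = np.random.default_rng(0)
DIM = 32  # toy input dimensionality


def surrogate_direction(x):
    """Stand-in for the transferred part of the CAD: a perturbation
    direction a white-box surrogate would suggest for x (hypothetical;
    in practice derived from the surrogate's gradients or a generator
    trained on the surrogate)."""
    return np.sign(np.sin(x))  # arbitrary deterministic toy direction


def target_loss(x_adv):
    """Stand-in black-box query: returns a scalar loss from the target
    model (higher = closer to a successful attack). Hypothetical."""
    w = np.linspace(-1.0, 1.0, DIM)  # hidden "target model" weights
    return float(w @ x_adv)


def partial_transfer_attack(x, queries=50, eps=0.3):
    mu = eps * surrogate_direction(x)  # transferred: kept fixed
    sigma = 0.1                        # untransferred: tuned by queries
    best_adv = np.clip(x + mu, 0.0, 1.0)
    best_loss = target_loss(best_adv)
    for _ in range(queries):
        # Sample a candidate perturbation from the current CAD.
        delta = mu + sigma * rng.standard_normal(DIM)
        cand = np.clip(x + np.clip(delta, -eps, eps), 0.0, 1.0)
        loss = target_loss(cand)
        if loss > best_loss:
            best_adv, best_loss = cand, loss
            sigma *= 1.1   # success: widen exploration
        else:
            sigma *= 0.95  # failure: narrow around the transferred mean
    return best_adv, best_loss


x = rng.random(DIM)  # a benign sample in [0, 1]^DIM
adv, loss = partial_transfer_attack(x)
print(f"loss before: {target_loss(x):.3f}, after: {loss:.3f}")
```

Because the surrogate-derived mean is kept fixed while only the low-dimensional scale is learned per sample, the query budget is spent on correcting for surrogate bias rather than rediscovering the attack direction from scratch, which is the intuition behind the partial transfer.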
