Paper Title

GreedyFool: Multi-Factor Imperceptibility and Its Application to Designing a Black-box Adversarial Attack

Authors

Hui Liu, Bo Zhao, Minzhi Ji, Peng Liu

Abstract

Adversarial examples are carefully crafted input samples whose perturbations are imperceptible to the human eye yet easily mislead the output of deep neural networks (DNNs). Existing works synthesize adversarial examples by leveraging simple metrics to penalize perturbations; these metrics lack sufficient consideration of the human visual system (HVS) and thus produce noticeable artifacts. To explore why such perturbations are visible, this paper summarizes four primary factors affecting their perceptibility to the human eye. Based on this investigation, we design a multi-factor metric, MulFactorLoss, for measuring the perceptual loss between benign examples and adversarial ones. To test the imperceptibility of the multi-factor metric, we propose a novel black-box adversarial attack referred to as GreedyFool. GreedyFool applies differential evolution to evaluate the effect of perturbed pixels on the confidence of a target DNN, and introduces greedy approximation to automatically generate adversarial perturbations. We conduct extensive experiments on the ImageNet and CIFAR-10 datasets and a comprehensive user study with 60 participants. The experimental results demonstrate that MulFactorLoss is a less perceptible metric than existing pixelwise metrics, and that GreedyFool achieves a 100% success rate in a black-box manner.
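The abstract describes two ingredients of GreedyFool: differential evolution to score how much a perturbed pixel lowers the target model's confidence, and a greedy loop that accumulates the best-scoring pixels until the prediction flips. The sketch below, written against a generic black-box `predict(image)` interface, is only an illustration of that idea under assumed details (single-pixel genes encoded as `(x, y, r, g, b)`, a simplified mutation-only DE step, images with values in `[0, 1]`); it is not the paper's implementation and omits MulFactorLoss.

```python
import numpy as np

def greedy_pixel_attack(image, predict, true_label, max_pixels=10,
                        pop_size=20, de_iters=30, rng=None):
    """Illustrative GreedyFool-style black-box attack (not the paper's code).

    Each greedy step runs a small differential evolution over one candidate
    pixel (x, y, r, g, b), scoring candidates by the model's confidence in
    the true label (lower is better), then permanently keeps the best pixel.
    Stops early once the model's top prediction changes.
    """
    rng = np.random.default_rng(rng)
    h, w, c = image.shape

    def confidence(img):
        # Black-box query: confidence assigned to the true class.
        return float(predict(img)[true_label])

    def apply(img, genes):
        # Paint each (x, y, r, g, b) gene onto a copy of the image.
        out = img.copy()
        for x, y, *rgb in genes:
            out[int(y) % h, int(x) % w] = np.clip(rgb, 0.0, 1.0)
        return out

    adv = image.copy()
    chosen = []
    for _ in range(max_pixels):
        # Initialize a population of candidate single-pixel perturbations.
        pop = rng.random((pop_size, 2 + c))
        pop[:, 0] *= w
        pop[:, 1] *= h
        fitness = np.array([confidence(apply(adv, [ind])) for ind in pop])
        for _ in range(de_iters):
            for i in range(pop_size):
                # Simplified DE mutation: a + F * (b - c), no crossover.
                a, b_, c_ = pop[rng.choice(pop_size, 3, replace=False)]
                trial = a + 0.5 * (b_ - c_)
                trial[0] %= w
                trial[1] %= h
                trial[2:] = np.clip(trial[2:], 0.0, 1.0)
                f = confidence(apply(adv, [trial]))
                if f < fitness[i]:  # lower true-class confidence is better
                    pop[i], fitness[i] = trial, f
        best = pop[int(np.argmin(fitness))]
        chosen.append(best)
        adv = apply(adv, [best])  # greedily commit the best pixel
        if int(np.argmax(predict(adv))) != true_label:
            break
    return adv, chosen
```

As a usage sketch, `predict` can be any function returning per-class confidences for one image, so the attack needs only query access to the model, matching the black-box setting described in the abstract.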
