Paper Title
Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains
Paper Authors
Paper Abstract
Adversarial examples pose a severe threat to deep neural networks due to their transferable nature. Currently, various works have devoted great effort to enhancing cross-model transferability, mostly assuming that the substitute model is trained in the same domain as the target model. However, in reality, relevant information about the deployed model is unlikely to be leaked. Hence, it is vital to build a more practical black-box threat model that overcomes this limitation and evaluates the vulnerability of deployed models. In this paper, with only the knowledge of the ImageNet domain, we propose a Beyond ImageNet Attack (BIA) to investigate transferability towards black-box domains (unknown classification tasks). Specifically, we leverage a generative model to learn the adversarial function for disrupting low-level features of input images. Based on this framework, we further propose two variants to narrow the gap between the source and target domains from the data and model perspectives, respectively. Extensive experiments on coarse-grained and fine-grained domains demonstrate the effectiveness of our proposed methods. Notably, our methods outperform state-of-the-art approaches by up to 7.71\% (towards coarse-grained domains) and 25.91\% (towards fine-grained domains) on average. Our code is available at \url{https://github.com/qilong-zhang/Beyond-ImageNet-Attack}.
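The core idea named in the abstract, training a generator to disrupt low-level features of a surrogate ImageNet model, can be illustrated with a minimal sketch. This is a hedged approximation, not the authors' implementation (see the linked repository for that): the VGG-16 surrogate, the truncation layer, the epsilon budget, and the `generator` argument are all illustrative assumptions, and the loss shown here simply minimizes the cosine similarity between clean and adversarial mid-layer features.

```python
# Minimal sketch of a feature-disruption objective in the spirit of BIA.
# Assumptions (not from the paper text): VGG-16 surrogate, truncation at an
# early layer, L_inf budget eps, and a user-supplied `generator` network.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Pretrained ImageNet surrogate, truncated to expose "low-level" features.
# The cut at index 16 (through the conv3 block) is an illustrative choice.
vgg = models.vgg16(pretrained=True).eval()
feature_extractor = torch.nn.Sequential(*list(vgg.features)[:16])
for p in feature_extractor.parameters():
    p.requires_grad_(False)  # only the generator is trained

def feature_disruption_loss(generator, x, eps=10.0 / 255.0):
    """Push the adversarial image's low-level features away from the
    clean image's features, within an L_inf ball of radius eps."""
    x_adv = generator(x)
    # Project the generated image into the L_inf ball around x, then clip
    # back to the valid pixel range [0, 1].
    x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)

    f_clean = feature_extractor(x).flatten(1)
    f_adv = feature_extractor(x_adv).flatten(1)
    # Minimizing cosine similarity drives the features to be anti-correlated,
    # disrupting representations shared across downstream domains.
    return F.cosine_similarity(f_adv, f_clean, dim=1).mean()
```

Under these assumptions, training amounts to taking gradient steps on the generator's parameters to minimize this loss over ImageNet batches; at test time the frozen generator would be applied directly to images from unseen (black-box) domains.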