Paper Title

Protecting Against Image Translation Deepfakes by Leaking Universal Perturbations from Black-Box Neural Networks

Paper Authors

Nataniel Ruiz, Sarah Adel Bargal, Stan Sclaroff

Paper Abstract

In this work, we develop efficient disruptions of black-box image translation deepfake generation systems. We are the first to demonstrate black-box deepfake generation disruption by presenting image translation formulations of attacks initially proposed for classification models. Nevertheless, a naive adaptation of classification black-box attacks results in a prohibitive number of queries for image translation systems in the real world. We present a frustratingly simple yet highly effective algorithm, Leaking Universal Perturbations (LUP), that significantly reduces the number of queries needed to attack an image. LUP consists of two phases: (1) a short leaking phase where we attack the network using traditional black-box attacks and gather information on successful attacks on a small dataset, and (2) an exploitation phase where we leverage said information to subsequently attack the network with improved efficiency. Our attack reduces the total number of queries necessary to attack GANimation and StarGAN by 30%.
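
The abstract outlines a two-phase procedure. Below is a minimal sketch of that idea, not the paper's actual implementation: the random-search attack, the disruption threshold, and the choice of averaging leaked perturbations into a single warm-start prior are all assumptions made for illustration, and `model` is a hypothetical callable wrapping the black-box image-translation system.

```python
import numpy as np

DISRUPTION_THRESHOLD = 0.1  # hypothetical success criterion for a "disrupted" output


def black_box_attack(model, image, init=None, max_queries=1000, eps=0.05):
    """Generic query-based black-box disruption attack (random-search sketch).

    `model(x)` is assumed to return the image-translation output for input x.
    Returns (perturbation, queries_used, success).
    """
    clean_out = model(image)                                   # 1 query
    delta = np.zeros_like(image) if init is None else np.clip(init, -eps, eps)
    best = np.linalg.norm(model(np.clip(image + delta, 0.0, 1.0)) - clean_out)  # 1 query
    queries = 2
    while queries < max_queries and best < DISRUPTION_THRESHOLD:
        step = np.random.uniform(-0.01, 0.01, size=image.shape)
        cand = np.clip(delta + step, -eps, eps)
        score = np.linalg.norm(model(np.clip(image + cand, 0.0, 1.0)) - clean_out)
        queries += 1
        if score > best:                                       # keep the more disruptive perturbation
            delta, best = cand, score
    return delta, queries, best >= DISRUPTION_THRESHOLD


def leaking_phase(model, small_dataset, **attack_kwargs):
    """Phase 1: attack a small set of images and collect the successful perturbations."""
    leaked = []
    for image in small_dataset:
        delta, _, ok = black_box_attack(model, image, **attack_kwargs)
        if ok:
            leaked.append(delta)
    # Aggregate leaked perturbations into a single "universal" prior (one plausible choice).
    return np.mean(leaked, axis=0) if leaked else None


def exploitation_phase(model, images, universal_prior, **attack_kwargs):
    """Phase 2: warm-start each new attack from the leaked universal perturbation,
    so far fewer queries are needed per image."""
    results = []
    for image in images:
        delta, queries, ok = black_box_attack(model, image, init=universal_prior, **attack_kwargs)
        results.append((delta, queries, ok))
    return results
```

The intuition the sketch tries to capture: successful perturbations found on a few images "leak" shared structure of the black-box generator, so reusing them as an initialization lowers the query budget for subsequent images, consistent with the reported 30% reduction on GANimation and StarGAN.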
