Paper Title
Adversarial collision attacks on image hashing functions
Paper Authors
Paper Abstract
Hashing images with a perceptual algorithm is a common approach to solving duplicate image detection problems. However, perceptual image hashing algorithms are differentiable, and are thus vulnerable to gradient-based adversarial attacks. We demonstrate that not only is it possible to modify an image to produce an unrelated hash, but an exact image hash collision between a source and target image can be produced via minuscule adversarial perturbations. In a white box setting, these collisions can be replicated across nearly every image pair and hash type (including both deep and non-learned hashes). Furthermore, by attacking points other than the output of a hashing function, an attacker can avoid having to know the details of a particular algorithm, resulting in collisions that transfer across different hash sizes or model architectures. Using these techniques, an adversary can poison the image lookup table of a duplicate image detection service, resulting in undefined or unwanted behavior. Finally, we offer several potential mitigations to gradient-based image hash attacks.
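The sketch below illustrates the general idea of a gradient-based collision attack on a differentiable perceptual hash, not the specific algorithms or attack procedure evaluated in the paper. It assumes PyTorch is available; toy_hash is a hypothetical stand-in hash (mean-thresholded downsampling with a soft sign), and the perturbation budget eps is an assumed value.

# Minimal sketch of a gradient-based hash collision attack (assumptions noted above).
import torch
import torch.nn.functional as F

def toy_hash(img, hash_size=16):
    # Toy differentiable perceptual hash: grayscale, downsample to
    # hash_size x hash_size, then softly threshold each cell against the mean.
    small = F.adaptive_avg_pool2d(img.mean(dim=1, keepdim=True), hash_size)
    return torch.tanh(10 * (small - small.mean(dim=(2, 3), keepdim=True)))

def collide(source, target, steps=500, lr=1e-2, eps=8 / 255):
    # Find a small perturbation delta so that hash(source + delta) matches
    # hash(target), while keeping ||delta||_inf <= eps.
    target_hash = toy_hash(target).detach().sign()
    delta = torch.zeros_like(source, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = (source + delta).clamp(0, 1)
        loss = F.mse_loss(toy_hash(adv), target_hash)  # drive the hashes together
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation small
    return (source + delta).clamp(0, 1).detach()

# Usage with random stand-ins for a real source/target image pair.
source = torch.rand(1, 3, 64, 64)
target = torch.rand(1, 3, 64, 64)
adv = collide(source, target)
match = (toy_hash(adv).sign() == toy_hash(target).sign()).float().mean()
print(f"hash bit agreement after attack: {match.item():.2%}")

The same optimization loop applies to any hash that admits gradients; for non-differentiable steps one would substitute a smooth surrogate, which is why the soft tanh threshold is used here.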