Paper Title
Backdoor Attacks on Vision Transformers
Paper Authors
Paper Abstract
Vision Transformers (ViTs) have recently demonstrated exemplary performance on a variety of vision tasks and are being used as an alternative to CNNs. Their design is based on a self-attention mechanism that processes an image as a sequence of patches, which is quite different from the convolutional design of CNNs. Hence, it is interesting to study whether ViTs are vulnerable to backdoor attacks. A backdoor attack occurs when an attacker poisons a small part of the training data for malicious purposes. The resulting model performs well on clean test images, but the attacker can manipulate its decisions by presenting the trigger at test time. To the best of our knowledge, we are the first to show that ViTs are vulnerable to backdoor attacks. We also find an intriguing difference between ViTs and CNNs: interpretation algorithms effectively highlight the trigger on test images for ViTs but not for CNNs. Based on this observation, we propose a test-time image-blocking defense for ViTs that reduces the attack success rate by a large margin. Code is available here: https://github.com/UCDvision/backdoor_transformer.git
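The poisoning setup the abstract describes, pasting a small trigger patch onto a fraction of the training images and relabeling them to the attacker's target class, can be illustrated with a minimal PyTorch sketch. The function and parameter names (poison_batch, poison_rate, target_class) are ours for illustration and are not taken from the paper's repository.

import torch

def poison_batch(images, labels, trigger, target_class, poison_rate=0.05):
    # images: (B, C, H, W) training images; labels: (B,) class indices
    # trigger: (C, h, w) backdoor patch pasted onto the poisoned images
    images, labels = images.clone(), labels.clone()
    n_poison = int(poison_rate * images.size(0))
    H, W = images.shape[2], images.shape[3]
    h, w = trigger.shape[1], trigger.shape[2]
    for i in range(n_poison):
        # Paste the trigger in the bottom-right corner (random placement
        # is another common choice) and relabel to the target class.
        images[i, :, H - h:, W - w:] = trigger
        labels[i] = target_class
    return images, labels

A model trained on batches processed this way behaves normally on clean inputs but predicts target_class whenever the trigger appears at test time.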
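The test-time blocking defense can be sketched in the same spirit: run an interpretation algorithm on the test image, mask the most salient window (where, per the abstract, the trigger tends to be highlighted for ViTs), and classify the blocked image. How the saliency map is computed and the fixed block size are assumptions made here for illustration, not the paper's exact procedure.

import torch
import torch.nn.functional as F

def blocked_predict(model, image, saliency, block=32):
    # image: (C, H, W) test image; saliency: (H, W) interpretation heat map
    # Average-pool the saliency map to score every block x block window,
    # then zero out the highest-scoring window before classifying.
    pooled = F.avg_pool2d(saliency[None, None], kernel_size=block, stride=1)
    out_w = pooled.shape[-1]
    top, left = divmod(pooled.view(-1).argmax().item(), out_w)
    blocked = image.clone()
    blocked[:, top:top + block, left:left + block] = 0.0
    with torch.no_grad():
        return model(blocked.unsqueeze(0)).argmax(dim=1)

If the masked window covers the trigger, the backdoor is not activated and the model falls back to its clean-image behavior, which is the effect the abstract credits for the large drop in attack success rate.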