Paper Title
On Certifying Robustness against Backdoor Attacks via Randomized Smoothing
Paper Authors
Paper Abstract
Backdoor attacks are a severe security threat to deep neural networks (DNNs). We envision that, as with adversarial examples, there will be a cat-and-mouse game for backdoor attacks, i.e., new empirical defenses are developed to defend against backdoor attacks but are soon broken by strong adaptive backdoor attacks. To prevent such a cat-and-mouse game, we take the first step towards certified defenses against backdoor attacks. Specifically, in this work, we study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing. Randomized smoothing was originally developed to certify robustness against adversarial examples. We generalize randomized smoothing to defend against backdoor attacks. Our results show the theoretical feasibility of using randomized smoothing to certify robustness against backdoor attacks. However, we also find that existing randomized smoothing methods have limited effectiveness at defending against backdoor attacks, which highlights the need for new theory and methods to certify robustness against backdoor attacks.
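To make the core idea concrete, here is a minimal sketch of the randomized smoothing primitive the abstract refers to: a smoothed classifier predicts the majority vote of a base classifier over many noise-perturbed copies of the input. This is the standard formulation for adversarial examples (not the paper's specific generalization to backdoor attacks); `base_classifier`, `smoothed_predict`, and the toy threshold model are illustrative names, not from the paper.

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, seed=0):
    """Smoothed classifier: majority vote of the base classifier's
    predictions over Gaussian-perturbed copies of the input x."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    votes = np.array([base_classifier(x + n) for n in noise])
    counts = np.bincount(votes)          # votes per class label
    return int(np.argmax(counts))        # most frequent class wins

# Toy base classifier (hypothetical): thresholds the mean pixel value.
def base_classifier(x):
    return int(x.mean() > 0.5)

x = np.full((4, 4), 0.8)                 # input well inside class 1
pred = smoothed_predict(base_classifier, x)
print(pred)
```

In the full method, the margin between the top two vote counts is converted into a certified radius; the paper studies whether an analogous certificate can bound how many poisoned training examples (rather than input perturbations) a backdoor attacker may inject.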