Paper Title
Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models
Paper Authors
Paper Abstract
The vulnerability of deep networks to adversarial attacks is a central problem for deep learning from the perspective of both cognition and security. The current most successful defense method is to train a classifier using adversarial images created during learning. Another defense approach involves transformation or purification of the original input to remove adversarial signals before the image is classified. We focus on defending naturally-trained classifiers using Markov Chain Monte Carlo (MCMC) sampling with an Energy-Based Model (EBM) for adversarial purification. In contrast to adversarial training, our approach is intended to secure pre-existing and highly vulnerable classifiers. The memoryless behavior of long-run MCMC sampling will eventually remove adversarial signals, while metastable behavior preserves the consistent appearance of MCMC samples after many steps to allow accurate long-run prediction. Balancing these factors can lead to effective purification and robust classification. We evaluate adversarial defense with an EBM using the strongest known attacks against purification. Our contributions are 1) an improved method for training EBMs with realistic long-run MCMC samples, 2) an Expectation-Over-Transformation (EOT) defense that resolves theoretical ambiguities for stochastic defenses and from which the EOT attack naturally follows, and 3) state-of-the-art adversarial defense for naturally-trained classifiers and competitive defense compared to adversarially-trained classifiers on CIFAR-10, SVHN, and CIFAR-100. Code and pre-trained models are available at https://github.com/point0bar1/ebm-defense.
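The pipeline described in the abstract, purifying an input with long-run MCMC sampling under an EBM and then averaging classifier predictions over independent stochastic purifications (the EOT defense), can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `grad_energy` (the gradient of a hypothetical EBM energy function), the step size, and the number of steps are all stand-in assumptions.

```python
import numpy as np

def langevin_purify(x, grad_energy, n_steps=1000, step_size=0.1, rng=None):
    """Purify an input x with (unadjusted) Langevin dynamics on an EBM.

    grad_energy(x) is assumed to return dU/dx for the EBM energy U.
    Each step follows x <- x - (eps^2 / 2) * dU/dx + eps * noise,
    so long runs forget the starting point (and any adversarial signal)
    while drifting toward low-energy, realistic-looking states.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size**2 * grad_energy(x) + step_size * noise
    return x

def eot_predict(x, purify, classify, n_replicates=8):
    """EOT defense: average classifier probabilities over several
    independent stochastic purifications of the same input."""
    probs = [classify(purify(x)) for _ in range(n_replicates)]
    return np.mean(probs, axis=0)
```

As a toy check, with the quadratic energy U(x) = ||x||^2 / 2 (so `grad_energy` is the identity), long-run sampling pulls an out-of-distribution input back toward the model's stationary distribution regardless of where it started, which mirrors the memoryless behavior the abstract relies on.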