Paper Title


Efficient and Robust Classification for Sparse Attacks

Authors

Beliaev, Mark, Delgosha, Payam, Hassani, Hamed, Pedarsani, Ramtin

Abstract


In the past two decades we have seen the popularity of neural networks increase in conjunction with their classification accuracy. Parallel to this, we have also witnessed how fragile the very same prediction models are: tiny perturbations to the inputs can cause misclassification errors throughout entire datasets. In this paper, we consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection. To this end, we propose a novel defense method that consists of "truncation" and "adversarial training". We then theoretically study the Gaussian mixture setting and prove the asymptotic optimality of our proposed classifier. Motivated by the insights we obtain, we extend these components to neural network classifiers. We conduct numerical experiments in the domain of computer vision using the MNIST and CIFAR datasets, demonstrating significant improvement for the robust classification error of neural networks.
