Paper Title
A Unified Wasserstein Distributional Robustness Framework for Adversarial Training
Paper Authors
Paper Abstract
It is well known that deep neural networks (DNNs) are susceptible to adversarial attacks, exposing a severe fragility of deep learning systems. As a result, adversarial training (AT), which incorporates adversarial examples during training, represents a natural and effective approach to strengthening the robustness of DNN-based classifiers. However, most AT-based methods, notably PGD-AT and TRADES, typically seek a pointwise adversary that generates worst-case adversarial examples by independently perturbing each data sample, as a way to "probe" the vulnerability of the classifier. Arguably, there are unexplored benefits in considering such adversarial effects over an entire distribution. To this end, this paper presents a unified framework that connects Wasserstein distributional robustness with current state-of-the-art AT methods. We introduce a new Wasserstein cost function and a new series of risk functions, with which we show that standard AT methods are special cases of their counterparts in our framework. This connection leads to an intuitive relaxation and generalization of existing AT methods and facilitates the development of a new family of distributional-robustness AT-based algorithms. Extensive experiments show that our distributional-robustness AT algorithms further robustify their standard AT counterparts in various settings.
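The "pointwise adversary" that the abstract contrasts with the distributional view can be illustrated with a standard projected gradient descent (PGD) inner loop, which perturbs each sample independently within an L-infinity ball to maximize the classifier's loss. The sketch below is a minimal, self-contained illustration on a toy logistic-regression classifier; the model, data, and hyperparameters (`eps`, `alpha`, `steps`) are hypothetical choices for demonstration, not taken from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_attack(x, y, w, b, eps=0.3, alpha=0.05, steps=10):
    """Pointwise PGD adversary: independently perturb one sample x
    within an L-inf ball of radius eps to increase the binary
    cross-entropy loss of a logistic-regression classifier (w, b)."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        # gradient of the binary cross-entropy loss w.r.t. the input
        grad = (p - y) * w
        # ascend the loss along the gradient sign (L-inf PGD step)
        x_adv = x_adv + alpha * np.sign(grad)
        # project back into the eps-ball around the clean sample
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# usage: perturb a sample the classifier currently handles well
w, b = np.array([1.0, -1.0]), 0.0
x, y = np.array([0.5, -0.5]), 1.0
x_adv = pgd_attack(x, y, w, b)
```

In PGD-AT, examples produced this way replace (or augment) clean samples during training. The distributional-robustness view described in the abstract instead considers an adversary that shifts the whole data distribution within a Wasserstein ball, rather than perturbing each sample in isolation.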