Paper Title
RobustBench: a standardized adversarial robustness benchmark
Paper Authors
Paper Abstract
As a research community, we still lack a systematic understanding of the progress on adversarial robustness, which often makes it hard to identify the most promising ideas for training robust models. A key challenge in benchmarking robustness is that its evaluation is often error-prone, leading to robustness overestimation. Our goal is to establish a standardized benchmark of adversarial robustness that reflects, as accurately as possible, the robustness of the considered models within a reasonable computational budget. To this end, we start by considering the image classification task and introduce restrictions (possibly loosened in the future) on the allowed models. We evaluate adversarial robustness with AutoAttack, an ensemble of white- and black-box attacks, which was recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications. To prevent over-adaptation of new defenses to AutoAttack, we welcome external evaluations based on adaptive attacks, especially where AutoAttack flags a potential overestimation of robustness. Our leaderboard, hosted at https://robustbench.github.io/, contains evaluations of 120+ models and aims to reflect the current state of the art in image classification on a set of well-defined tasks under the $\ell_\infty$- and $\ell_2$-threat models and on common corruptions, with possible extensions in the future. Additionally, we open-source the library https://github.com/RobustBench/robustbench, which provides unified access to 80+ robust models to facilitate their downstream applications. Finally, based on the collected models, we analyze the impact of robustness on performance under distribution shifts, calibration, out-of-distribution detection, fairness, privacy leakage, smoothness, and transferability.
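The abstract points to the open-source library as the intended way to reuse the benchmarked models. Below is a minimal sketch of how a leaderboard model could be loaded and re-evaluated with AutoAttack; it assumes the pip-installable robustbench and autoattack packages, their load_model / load_cifar10 / AutoAttack helpers, and the example model name 'Carmon2019Unlabeled', all of which should be checked against the current library documentation rather than read off the abstract itself.

```python
# Minimal sketch (assumptions noted above): load a robust model from the
# RobustBench model zoo and measure its robust accuracy with AutoAttack.
import torch

from robustbench.data import load_cifar10
from robustbench.utils import load_model
from autoattack import AutoAttack

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Download a benchmarked model for the CIFAR-10 L-inf threat model.
model = load_model(model_name='Carmon2019Unlabeled',
                   dataset='cifar10',
                   threat_model='Linf').to(device).eval()

# A small batch of clean test examples (the full benchmark uses the whole test set).
x_test, y_test = load_cifar10(n_examples=100)
x_test, y_test = x_test.to(device), y_test.to(device)

# The 'standard' version runs the ensemble of white- and black-box attacks
# at the benchmark's L-inf budget of eps = 8/255.
adversary = AutoAttack(model, norm='Linf', eps=8 / 255,
                       version='standard', device=device)
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=100)

# Robust accuracy = fraction of points still classified correctly after the attack.
with torch.no_grad():
    robust_acc = (model(x_adv).argmax(dim=1) == y_test).float().mean().item()
print(f'Robust accuracy on {len(y_test)} examples: {robust_acc:.1%}')
```

On a small subset like this the estimate is noisy; the leaderboard numbers are computed on the full test set, and submissions that AutoAttack may overestimate are additionally checked with adaptive attacks, as described in the abstract.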