Paper Title


Massif: Interactive Interpretation of Adversarial Attacks on Deep Learning

Authors

Nilaksh Das, Haekyu Park, Zijie J. Wang, Fred Hohman, Robert Firstman, Emily Rogers, Duen Horng Chau

Abstract


Deep neural networks (DNNs) are increasingly powering high-stakes applications such as autonomous cars and healthcare; however, DNNs are often treated as "black boxes" in such applications. Recent research has also revealed that DNNs are highly vulnerable to adversarial attacks, raising serious concerns over deploying DNNs in the real world. To overcome these deficiencies, we are developing Massif, an interactive tool for deciphering adversarial attacks. Massif identifies and interactively visualizes neurons and their connections inside a DNN that are strongly activated or suppressed by an adversarial attack. Massif provides both a high-level, interpretable overview of the effect of an attack on a DNN, and a low-level, detailed description of the affected neurons. These tightly coupled views in Massif help people better understand which input features are most vulnerable or important for correct predictions.
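The abstract describes Massif's core operation: finding neurons whose activations are strongly increased or suppressed by an adversarial input relative to a benign one. As a rough illustration of that idea only (not Massif's actual algorithm, which also traces connections between neurons across layers), one could rank a layer's neurons by the magnitude of their activation change; the function name and arrays below are hypothetical:

```python
import numpy as np

def attack_affected_neurons(benign_acts, adv_acts, top_k=5):
    """Rank neurons by how strongly an adversarial input changes
    their activations relative to a benign input.

    benign_acts, adv_acts: 1-D arrays of per-neuron activations
    for one layer. A positive delta means the neuron is activated
    by the attack; a negative delta means it is suppressed.
    Returns the top_k (neuron_index, delta) pairs by |delta|.
    """
    delta = adv_acts - benign_acts
    order = np.argsort(-np.abs(delta))[:top_k]
    return [(int(i), float(delta[i])) for i in order]

# Toy example: neuron 2 is suppressed, neuron 0 is activated.
benign = np.array([0.1, 0.5, 0.9, 0.3])
adv = np.array([0.8, 0.5, 0.1, 0.4])
print(attack_affected_neurons(benign, adv, top_k=2))
```

A real implementation would collect these activations with forward hooks on the network for a benign/adversarial image pair, then aggregate the deltas over many such pairs before visualizing them.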
