Live Trojan Attacks on Deep Neural Networks

Paper Title

Live Trojan Attacks on Deep Neural Networks

Authors

Robby Costales, Chengzhi Mao, Raphael Norwitz, Bryan Kim, Junfeng Yang

Abstract

Like all software systems, the execution of deep learning models is dictated in part by logic represented as data in memory. For decades, attackers have exploited traditional software programs by manipulating this data. We propose a live attack on deep learning systems that patches model parameters in memory to achieve predefined malicious behavior on a certain set of inputs. By minimizing the size and number of these patches, the attacker can reduce the amount of network communication and memory overwrites, with minimal risk of system malfunctions or other detectable side effects. We demonstrate the feasibility of this attack by computing efficient patches on multiple deep learning models. We show that the desired trojan behavior can be induced with a few small patches and with limited access to training data. We describe the details of how this attack is carried out on real systems and provide sample code for patching TensorFlow model parameters in Windows and in Linux. Lastly, we present a technique for effectively manipulating entropy on perturbed inputs to bypass STRIP, a state-of-the-art run-time trojan detection technique.
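
The idea of keeping patches small can be illustrated with a minimal sketch. It assumes a patch is simply the k weight entries that differ most between an original and a modified parameter array; this selection rule, the function names `make_sparse_patch` and `apply_patch`, and the toy data are all hypothetical and are not taken from the paper. The sketch works on NumPy arrays only and does not touch process memory.

```python
import numpy as np

def make_sparse_patch(original, modified, k):
    """Keep only the k largest-magnitude weight changes.

    Returns (flat_indices, new_values). Assumption: "largest change"
    is used here as a stand-in selection rule, not the paper's method.
    """
    delta = (modified - original).ravel()
    # Sort entries by absolute change, largest first, and keep k of them.
    idx = np.argsort(np.abs(delta))[::-1][:k]
    return idx, modified.ravel()[idx]

def apply_patch(weights, idx, values):
    """Return a copy of weights with only the patched entries overwritten."""
    patched = weights.copy().ravel()
    patched[idx] = values
    return patched.reshape(weights.shape)

# Toy data: two large changes plus small noise on every entry.
rng = np.random.default_rng(0)
original = rng.normal(size=(4, 5))
modified = original.copy()
modified[1, 2] += 3.0
modified[3, 0] -= 2.0
modified += 1e-3 * rng.normal(size=original.shape)

idx, values = make_sparse_patch(original, modified, k=2)
patched = apply_patch(original, idx, values)

print(sorted(idx.tolist()))                        # flat indices of the patched entries
print(int(np.count_nonzero(patched != original)))  # number of overwritten entries
```

With k = 2 the patch overwrites only the two entries that changed substantially and leaves the remaining 18 untouched, which is the trade-off the abstract describes: fewer and smaller overwrites at the cost of dropping minor changes.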
