Paper Title
Enhancing Security Patch Identification by Capturing Structures in Commits
Paper Authors
Paper Abstract
As the number of open source software (OSS) projects grows rapidly, most vulnerabilities in open source components are fixed silently, so deployed software that integrates these components cannot be updated in time. It is therefore critical to design a security patch identification system that ensures the security of the software in use. However, most existing work on security patch identification treats the changed code and the commit message as flat token sequences and learns their semantics with simple neural networks, ignoring structural information. To address this limitation, we propose E-SPI, an approach that extracts the structural information hidden in a commit for effective identification. Specifically, E-SPI consists of a code change encoder, which extracts the syntactic structure of the changed code and learns the code representation with a BiLSTM, and a message encoder, which constructs a dependency graph for the commit message and learns the message representation with a graph neural network (GNN). We further enhance the code change encoder by embedding contextual information related to the changed code. To demonstrate the effectiveness of our approach, we conduct extensive experiments against six state-of-the-art approaches on an existing dataset and on data from a real deployment environment. The experimental results confirm that our approach significantly outperforms the current state-of-the-art baselines.
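To make the message-encoder idea concrete, the following is a minimal sketch of how a commit message could be treated as a dependency graph and passed through one round of neighbourhood aggregation. It is not the E-SPI implementation: the dependency edges are written by hand instead of coming from a parser, the token features are toy vectors, the aggregation has no learned weights, and every name in the code (`build_adjacency`, `propagate`, `readout`) is hypothetical.

```python
from typing import Dict, List, Tuple


def build_adjacency(num_tokens: int, edges: List[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Build an undirected adjacency list with self-loops from (head, dependent) pairs."""
    adj = {i: [i] for i in range(num_tokens)}
    for head, dep in edges:
        adj[head].append(dep)
        adj[dep].append(head)
    return adj


def propagate(features: List[List[float]], adj: Dict[int, List[int]]) -> List[List[float]]:
    """One round of mean aggregation over each node's neighbourhood (no learned weights)."""
    dim = len(features[0])
    out = []
    for node in range(len(features)):
        neigh = adj[node]
        out.append([sum(features[n][d] for n in neigh) / len(neigh) for d in range(dim)])
    return out


def readout(features: List[List[float]]) -> List[float]:
    """Mean-pool all node vectors into a single message vector."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


# Toy commit message "fix buffer overflow".
tokens = ["fix", "buffer", "overflow"]
# Assumed dependency edges (head, dependent): fix -> overflow, overflow -> buffer.
edges = [(0, 2), (2, 1)]
# Assumed 2-dimensional token features, chosen only for illustration.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

adj = build_adjacency(len(tokens), edges)
hidden = propagate(features, adj)
message_vector = readout(hidden)
```

In contrast to a flat token sequence, each token's updated vector here depends on its syntactic neighbours rather than on its position in the sentence, which is the kind of structural signal the abstract says flat-sequence models ignore.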