A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

Paper title

A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

Authors

Herbold, Steffen; Trautsch, Alexander; Ledel, Benjamin; Aghamohammadi, Alireza; Ghaleb, Taher Ahmed; Chahal, Kuljit Kaur; Bossenmaier, Tim; Nagaria, Bhaveet; Makedonski, Philip; Ahmadabadi, Matin Nili; Szabados, Kristof; Spieker, Helge; Madeja, Matej; Hoy, Nathaniel; Lenarduzzi, Valentina; Wang, Shangwen; Rodríguez-Pérez, Gema; Colomo-Palacios, Ricardo; Verdecchia, Roberto; Singh, Paramvir; Qin, Yihao; Chakroborti, Debasish; Davis, Willard; Walunj, Vijay; Wu, Hongjun; Marcilio, Diego; Alam, Omar; Aldaeej, Abdullah; Amit, Idan; Turhan, Burak; Eismann, Simon; Wickert, Anna-Katharina; Malavolta, Ivano; Sulir, Matus; Fard, Fatemeh; Henley, Austin Z.; Kourtzanidis, Stratos; Tuzun, Eray; Treude, Christoph; Shamasbi, Simin Maleki; Pashchenko, Ivan; Wyrich, Marvin; Davis, James; Serebrenik, Alexander; Albrecht, Ella; Aktas, Ethem Utku; Strüber, Daniel; Erbel, Johannes

Abstract

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowdsourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files, this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label, leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
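
The consensus rule stated in the abstract (four labels per line, consensus when at least three agree) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the function names and the label strings such as `"bugfix"` and `"test"` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agree: int = 3) -> Optional[str]:
    """Return the label shared by at least `min_agree` participants, else None.

    With four labels per line and min_agree=3, at most one label can
    reach the threshold, so the result is unambiguous.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


def share_without_consensus(lines: Sequence[Sequence[str]]) -> float:
    """Fraction of lines for which no label reaches consensus."""
    if not lines:
        return 0.0
    missing = sum(1 for labels in lines if consensus_label(labels) is None)
    return missing / len(lines)


# Hypothetical labels from four participants for three changed lines.
example_lines = [
    ["bugfix", "bugfix", "bugfix", "test"],      # three agree: consensus
    ["bugfix", "bugfix", "test", "test"],        # two vs. two: no consensus
    ["whitespace", "whitespace", "whitespace", "whitespace"],
]
```

Here `consensus_label(example_lines[0])` yields `"bugfix"`, while the second line has no consensus and would count toward the lines on which participants disagree.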
