BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

Paper title

BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

Authors

Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu

Abstract

Detecting which nodes in graphs are outliers is a relatively new machine learning task with numerous applications. Despite the proliferation of algorithms developed in recent years for this task, there has been no standard comprehensive setting for performance evaluation. Consequently, it has been difficult to understand which methods work well and when under a broad range of settings. To bridge this gap, we present--to the best of our knowledge--the first comprehensive benchmark for unsupervised outlier node detection on static attributed graphs called BOND, with the following highlights. (1) We benchmark the outlier detection performance of 14 methods ranging from classical matrix factorization to the latest graph neural networks. (2) Using nine real datasets, our benchmark assesses how the different detection methods respond to two major types of synthetic outliers and separately to "organic" (real non-synthetic) outliers. (3) Using an existing random graph generation technique, we produce a family of synthetically generated datasets of different graph sizes that enable us to compare the running time and memory usage of the different outlier detection algorithms. Based on our experimental results, we discuss the pros and cons of existing graph outlier detection algorithms, and we highlight opportunities for future research. Importantly, our code is freely available and meant to be easily extendable: https://github.com/pygod-team/pygod/tree/main/benchmark
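Highlight (3) of the abstract, comparing running time and memory usage across graph sizes, can be illustrated with a minimal sketch. It is not the BOND code. The Erdos-Renyi-style generator, the degree-based toy detector, and the names `random_graph`, `degree_scores`, and `profile` are all assumptions made for illustration, since the abstract does not name the generator or describe the measurement procedure.

```python
import random
import time
import tracemalloc


def random_graph(n, p, seed=0):
    """Erdos-Renyi-style random graph as an adjacency list (assumed stand-in generator)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def degree_scores(adj):
    """Toy detector (hypothetical): score a node by its degree's distance from the mean."""
    degrees = {v: len(nbrs) for v, nbrs in adj.items()}
    mean = sum(degrees.values()) / len(degrees)
    return {v: abs(d - mean) for v, d in degrees.items()}


def profile(detector, adj):
    """Return (scores, elapsed seconds, peak traced bytes) for one detector run."""
    tracemalloc.start()
    start = time.perf_counter()
    scores = detector(adj)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return scores, elapsed, peak


# Run the same detector on graphs of increasing size and record the cost.
results = []
for n in (50, 100, 200):
    graph = random_graph(n, p=0.05, seed=n)
    scores, elapsed, peak = profile(degree_scores, graph)
    results.append((n, len(scores), elapsed, peak))
    print(f"n={n:4d}  time={elapsed:.6f}s  peak_mem={peak} bytes")
```

Swapping `degree_scores` for any callable that maps a graph to per-node scores would let the same loop compare several detectors. Note that `tracemalloc` only tracks Python-level allocations, so a benchmark of GPU-based models would need a different memory probe.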
