Hub-aware Random Walk Graph Embedding Methods for Classification

Paper title

Hub-aware Random Walk Graph Embedding Methods for Classification

Authors

Aleksandar Tomčić, Miloš Savić, Miloš Radovanović

Abstract

Over the last two decades, we have witnessed a huge increase in valuable big data structured in the form of graphs or networks. To apply traditional machine learning and data analytics techniques to such data, it is necessary to transform graphs into vector-based representations that preserve the most essential structural properties of graphs. For this purpose, a large number of graph embedding methods have been proposed in the literature. Most of them produce general-purpose embeddings suitable for a variety of applications such as node clustering, node classification, graph visualisation and link prediction. In this paper, we propose two novel graph embedding algorithms based on random walks that are specifically designed for the node classification problem. The random walk sampling strategies of the proposed algorithms are designed to pay special attention to hubs, that is, high-degree nodes that have the most critical role for the overall connectedness of large-scale graphs. The proposed methods are experimentally evaluated by analyzing the classification performance of three classification algorithms trained on embeddings of real-world networks. The obtained results indicate that our methods considerably improve the predictive power of the examined classifiers compared to node2vec, currently the most popular random walk method for generating general-purpose graph embeddings.
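The abstract does not spell out the two sampling strategies, so the following is only a minimal sketch of the general idea of a degree-biased random walk, not the authors' algorithms. The function names, the `alpha` exponent, and the rule "pick the next node with probability proportional to degree raised to `alpha`" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def hub_biased_walk(adj, start, length, alpha=1.0, rng=None):
    """Sample one walk of up to `length` nodes starting at `start`.

    Hypothetical bias rule (not taken from the paper): the next node is
    drawn with probability proportional to degree(neighbor) ** alpha.
    alpha > 0 favors hubs, alpha = 0 is a uniform walk, alpha < 0 avoids hubs.
    """
    if rng is None:
        rng = random.Random()
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop early
            break
        weights = [len(adj[n]) ** alpha for n in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def sample_walks(adj, walks_per_node, length, alpha=1.0, seed=0):
    """Sample `walks_per_node` walks from every node, reproducibly."""
    rng = random.Random(seed)
    return [
        hub_biased_walk(adj, node, length, alpha, rng)
        for node in adj
        for _ in range(walks_per_node)
    ]


# Toy graph: node 0 is a hub connected to nodes 1..5, plus one extra edge.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2)]
adj = build_adjacency(edges)
walks = sample_walks(adj, walks_per_node=2, length=5, alpha=1.0, seed=0)
```

In a node2vec-style pipeline, walks like these would typically be passed to a skip-gram model to learn node vectors, and the vectors would then be used as features for a node classifier.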
