论文标题

矮人图:具有模式分解的高性能图挖掘系统

DwarvesGraph: A High-Performance Graph Mining System with Pattern Decomposition

论文作者

Chen, Jingji, Qian, Xuehai

论文摘要

图模式挖掘(GPM)是一个重要的应用程序,可以标识图形的结构。尽管取得了最近的进展,但最先进的GPM系统与有效算法 - 模式分解之间的性能差距至少仍然是一个数量级。本文清除了对GPM系统采用模式分解的基本障碍。 首先,模式分解算法的性能取决于如何将整个模式分解为子图案。原始方法对不同选择进行算法进行复杂性分析,并选择具有最低复杂性上限的算法。显然,对于普通用户甚至专家使用者,这种方法是不可行的。为了解决此问题,我们开发了一个具有常规和GPM特异性优化的GPM编译器,以生成针对不同分解选择的算法,该算法是根据准确的成本模型对其进行评估的。 GPM任务的可执行文件是从具有最佳性能的算法中获得的。其次,我们提出了一种新型的部分装饰API,足以构建高级GPM应用,同时保留模式分解算法的优势。与最先进的系统相比,我们的新GPM系统基于思想开发的DwarvesGraph,将GPM在大图和模式上的执行时间从几天到几个小时,而编程工作较低。

Graph pattern mining (GPM) is an important application that identifies structures from graphs. Despite the recent progress, the performance gap between the state-of-the-art GPM systems and an efficient algorithm--pattern decomposition--is still at least an order of magnitude. This paper clears the fundamental obstacles of adopting pattern decomposition to a GPM system. First, the performance of pattern decomposition algorithms depends on how to decompose the whole pattern into subpatterns. The original method performs complexity analysis of algorithms for different choices, and selects the one with the lowest complexity upper bound. Clearly, this approach is not feasible for average or even expert users. To solve this problem, we develop a GPM compiler with conventional and GPM-specific optimizations to generate algorithms for different decomposition choices, which are evaluated based on an accurate cost model. The executable of the GPM task is obtained from the algorithm with the best performance. Second, we propose a novel partial-embedding API that is sufficient to construct advanced GPM applications while preserving pattern decomposition algorithm advantages. Compared to state-of-the-art systems, our new GPM system, DwarvesGraph, developed based on the ideas, reduces the execution time of GPM on large graphs and patterns from days to a few hours with low programming effort.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源