Paper title
On The Design of a Light-weight FPGA Programming Framework for Graph Applications
Paper authors
Paper abstract
FPGA accelerators designed for graph processing are gaining popularity. Domain-Specific Language (DSL) frameworks for graph processing can reduce the programming complexity and development cost of algorithm design. However, accelerator-specific development requires technical expertise and significant effort to devise, implement, and validate the system. For most algorithm designers, the high cost of acquiring hardware programming experience makes FPGA accelerators either unavailable or uneconomical. Although general-purpose High-Level Synthesis (HLS) tools help map high-level languages to Hardware Description Languages (HDLs), the generated code is often inefficient and lengthy compared with highly optimized graph accelerators. As a result, one cannot make full use of an FPGA accelerator's capacity at low development cost. To make graph algorithms easy to program while keeping performance degradation acceptable, we propose a graph programming system named JGraph, which contains two main parts: 1) a DSL for graph atomic operations with a graph library for high-level abstractions, including user-defined functions with parameters, and 2) a light-weight HLS translator that generates high-performance HDL code and cooperates with a communication manager and a runtime scheduler. To the best of our knowledge, our work is the first graph programming system with a DSL and translator on the FPGA platform. Within tens of seconds, our system can generate a BFS traversal implementation that reaches up to 300 MTEPS (million traversed edges per second).
