Paper Title

AD for an Array Language with Nested Parallelism

Authors

Robert Schenck, Ola Rønning, Troels Henriksen, Cosmin E. Oancea

Abstract

We present a technique for applying (forward and) reverse-mode automatic differentiation (AD) on a non-recursive second-order functional array language that supports nested parallelism and is primarily aimed at efficient GPU execution. The key idea is to eliminate the need for a "tape" by relying on redundant execution to bring into each new scope all program variables that may be needed by the differentiated code. Efficient execution is enabled by the observation that perfectly-nested scopes do not introduce re-execution, and such perfect nests are produced by known compiler transformations, e.g., flattening. Our technique differentiates loops and bulk-parallel operators, such as map, reduce, histogram, scan, scatter, by specific rewrite rules, and aggressively optimizes the resulting nested-parallel code. We report an experimental evaluation that compares with established AD solutions and demonstrates competitive performance on nine common benchmarks from recent applied AD literature.
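
The tape-free idea in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`body`, `body_vjp`, `map_primal`, `map_vjp`) and the example body `sin(x) * x` are hypothetical. The reverse pass of a `map` does not read stored intermediates; it re-executes the forward sweep of the body inside the new scope and then runs the return sweep.

```python
import math

def body(x):
    # Primal body of the map (hypothetical example): y = sin(x) * x
    t = math.sin(x)
    return t * x

def body_vjp(x, y_bar):
    # Redundant execution: recompute the intermediate t here instead of
    # reading it from a tape recorded during the primal run.
    t = math.sin(x)
    # Return sweep: propagate the adjoint y_bar back to x.
    t_bar = y_bar * x                         # from y = t * x, w.r.t. t
    x_bar = y_bar * t + t_bar * math.cos(x)   # direct use of x plus path via t
    return x_bar

def map_primal(xs):
    # Primal map: apply the body to every element.
    return [body(x) for x in xs]

def map_vjp(xs, ys_bar):
    # Reverse-mode rule for map in this sketch: another map over the
    # inputs and the output adjoints, with no state kept between passes.
    return [body_vjp(x, y_bar) for x, y_bar in zip(xs, ys_bar)]

xs = [0.5, 1.0, 2.0]
grads = map_vjp(xs, [1.0] * len(xs))
```

Because the body here is a single, perfectly nested scope, the re-execution costs only one extra evaluation of `sin` per element, which is the situation the abstract describes as not introducing re-execution overhead beyond the scope itself.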
