Paper Title

Mapping Matters: Application Process Mapping on 3-D Processor Topologies

Authors

Korndörfer, Jonas H. Müller; Bielert, Mario; Pilla, Laércio L.; Ciorba, Florina M.

Abstract

Applications' performance is influenced by the mapping of processes to computing nodes, the frequency and volume of exchanges among processing elements, the network capacity, and the routing protocol. A poor mapping of application processes degrades performance and wastes resources. Process mapping is frequently ignored as an explicit optimization step since the system typically offers a default mapping, users may lack awareness of their applications' communication behavior, and the opportunities for improving performance through mapping are often unclear. This work studies the impact of application process mapping on several processor topologies. We propose a workflow that renders mapping as an explicit optimization step for parallel applications. We apply the workflow to a set of four applications, twelve mapping algorithms, and three direct network topologies. We assess the mappings' quality in terms of volume, frequency, and distance of exchanges using metrics such as dilation (measured in hop$\cdot$Byte). With a parallel trace-based simulator, we predict the applications' execution on the three topologies using the twelve mappings. We evaluate the impact of process mapping on the applications' simulated performance in terms of execution and communication times and identify the mappings that achieve the highest performance in both cases. To ensure the correctness of the simulations, we compare the pre- and post-simulation results. This work emphasizes the importance of process mapping as an explicit optimization step and offers a solution for parallel applications to exploit the full potential of the allocated resources on a given system.
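To make the hop$\cdot$Byte idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a 3-D torus (the abstract does not name the three topologies), a hypothetical ring communication pattern, and a simple definition of the metric as the sum over communicating pairs of bytes exchanged times hops travelled. The names `torus_hops` and `hop_bytes` are illustrative only.

```python
from itertools import product

def torus_hops(a, b, dims):
    """Minimal hop count between coordinates a and b on a torus (wraparound links)."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def hop_bytes(comm, mapping, dims):
    """Sum of bytes * hops over all communicating process pairs (assumed metric)."""
    return sum(nbytes * torus_hops(mapping[src], mapping[dst], dims)
               for (src, dst), nbytes in comm.items())

# Assumed topology: a 2 x 2 x 2 torus with 8 nodes.
dims = (2, 2, 2)
nodes = list(product(range(2), repeat=3))  # lexicographic node order

# Hypothetical application: 8 processes in a ring, 100 bytes per neighbor pair.
comm = {(p, (p + 1) % 8): 100 for p in range(8)}

# Default mapping: process p goes to the p-th node in lexicographic order.
default_mapping = {p: nodes[p] for p in range(8)}

# Alternative mapping: place processes along a Gray-code walk so that
# consecutive ring neighbors always sit on adjacent nodes.
gray_order = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0),
              (1, 1, 0), (1, 1, 1), (1, 0, 1), (1, 0, 0)]
gray_mapping = {p: gray_order[p] for p in range(8)}

print(hop_bytes(comm, default_mapping, dims))
print(hop_bytes(comm, gray_mapping, dims))
```

Under these assumptions, the Gray-code placement yields a lower hop$\cdot$Byte total than the default placement for the same communication pattern, which is the kind of difference between mappings that the abstract describes evaluating.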
