论文标题

关于改进现场合并算法并行化

On the improvement of the in-place merge algorithm parallelization

论文作者

Bramas, Berenger, Bramas, Quentin

论文摘要

在本文中,我们介绍了现场合并算法的并行化的几个改进,该算法将两个连续排序的数组合并为一个具有O(t)空间复杂性的阵列(其中T是螺纹的数量)。该方法将两个阵列分为有线程的多对分区。这样每个线程都可以稍后与其他线程合并一对分区。我们通过提出一种新算法来查找两个分区的中位数来扩展现有方法。此外,我们还提供了一种新策略,以将输入阵列划分,以最大程度地减少数据移动,但以制作此阶段的成本为代价。最后,我们提供所谓的线性移位算法,该算法将两个就位的两个分区与连续的数据访问交换。我们强调,我们的方法很容易实施,并且也可以用于外部(不合适)合并。结果表明,与顺序执行相比,当数组的大小大于一千个元素时,它提供了显着的加速。

In this paper, we present several improvements in the parallelization of the in-place merge algorithm, which merges two contiguous sorted arrays into one with an O(T) space complexity (where T is the number of threads). The approach divides the two arrays into as many pairs of partitions as there are threads available; such that each thread can later merge a pair of partitions independently of the others. We extend the existing method by proposing a new algorithm to find the median of two partitions. Additionally, we provide a new strategy to divide the input arrays where we minimize the data movement, but at the cost of making this stage sequential. Finally, we provide the so-called linear shifting algorithm that swaps two partitions in-place with contiguous data access. We emphasize that our approach is straightforward to implement and that it can also be used for external (out of place) merging. The results demonstrate that it provides a significant speedup compared to sequential executions, when the size of the arrays is greater than a thousand elements.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源