论文标题
最佳运输性能的研究
A Study of Performance of Optimal Transport
论文作者
论文摘要
我们研究了有效计算最佳传输(OT)距离的问题,这等同于两部分图中的节点载有最低成本最大流量问题。我们比较了从多个域的数据计算距离的运行时间,例如几何形状的合成数据,文档中令牌的嵌入以及图像中的像素。我们表明,在实践中,诸如网络单纯形和基于路径的算法之类的组合方法可以始终如一地超过基于数值矩阵量表的方法,例如sindhorn [cuturi'13]和greenkhorn [altschuler等人[Altschuler等人],即使在准确的机制下,也可以达到较低的速度。最后,我们提出了一种新的组合算法,该算法可改善古典库恩·蒙克雷斯算法。
We investigate the problem of efficiently computing optimal transport (OT) distances, which is equivalent to the node-capacitated minimum cost maximum flow problem in a bipartite graph. We compare runtimes in computing OT distances on data from several domains, such as synthetic data of geometric shapes, embeddings of tokens in documents, and pixels in images. We show that in practice, combinatorial methods such as network simplex and augmenting path based algorithms can consistently outperform numerical matrix-scaling based methods such as Sinkhorn [Cuturi'13] and Greenkhorn [Altschuler et al'17], even in low accuracy regimes, with up to orders of magnitude speedups. Lastly, we present a new combinatorial algorithm that improves upon the classical Kuhn-Munkres algorithm.