Paper Title

Scalable Linear Time Dense Direct Solver for 3-D Problems Without Trailing Sub-Matrix Dependencies

Authors

Qianxiang Ma, Sameer Deshmukh, Rio Yokota

Abstract

The factorization of large dense matrices is ubiquitous in engineering and data science applications, e.g., preconditioners for iterative boundary integral solvers, frontal matrices in sparse multifrontal solvers, and computing the determinant of covariance matrices. HSS and $\mathcal{H}^2$-matrices are hierarchical low-rank matrix formats that can reduce the complexity of factorizing such dense matrices from $\mathcal{O}(N^3)$ to $\mathcal{O}(N)$. For HSS matrices, it is possible to remove the dependency on the trailing matrices during Cholesky/LU factorization, which results in a highly parallel algorithm. However, the weak admissibility of HSS causes the rank of off-diagonal blocks to grow for 3-D problems, and the method is no longer $\mathcal{O}(N)$. On the other hand, the strong admissibility of $\mathcal{H}^2$-matrices allows them to handle 3-D problems in $\mathcal{O}(N)$, but introduces a dependency on the trailing matrices. In the present work, we pre-compute the fill-ins and integrate them into the shared basis, which allows us to remove the dependency on trailing matrices even for $\mathcal{H}^2$-matrices. Comparisons with the block low-rank factorization code LORAPO showed a maximum speedup of 4,700x for a 3-D problem with complex geometry.
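
The contrast between weak and strong admissibility can be made concrete with a toy experiment. The sketch below is not the paper's algorithm. It assumes a Laplace kernel $1/|x-y|$, two small clusters of 3-D grid points, and a fully pivoted cross approximation as a stand-in rank estimator. All function names, cluster sizes, gaps, and the tolerance are hypothetical choices made for illustration. It compares the numerical rank of the interaction block between adjacent clusters (which weak admissibility would compress) with that of a well-separated pair (the only kind strong admissibility compresses).

```python
import math

def cluster(shift, m=4):
    """m*m*m grid points in the unit cube, translated by `shift` along x."""
    h = 1.0 / (m - 1)
    return [(shift + i * h, j * h, k * h)
            for i in range(m) for j in range(m) for k in range(m)]

def kernel_block(xs, ys):
    """Dense interaction block with entries 1 / |x - y| (assumed kernel)."""
    return [[1.0 / math.dist(x, y) for y in ys] for x in xs]

def numerical_rank(block, rel_tol=1e-3):
    """Number of rank-1 updates a fully pivoted cross approximation needs
    before the largest residual entry falls below rel_tol * largest entry."""
    r = [row[:] for row in block]  # residual matrix, updated in place
    scale = max(abs(v) for row in r for v in row)
    rank = 0
    while rank < min(len(r), len(r[0])):
        # Pivot on the residual entry of largest magnitude.
        pv, pi, pj = max((abs(v), i, j)
                         for i, row in enumerate(r)
                         for j, v in enumerate(row))
        if pv <= rel_tol * scale:
            break
        pivot = r[pi][pj]
        prow = r[pi][:]
        pcol = [row[pj] for row in r]
        # Subtract the rank-1 cross built from the pivot row and column.
        for i, row in enumerate(r):
            f = pcol[i] / pivot
            for j in range(len(row)):
                row[j] -= f * prow[j]
        rank += 1
    return rank

a = cluster(0.0)
rank_near = numerical_rank(kernel_block(a, cluster(1.1)))  # gap of 0.1
rank_far = numerical_rank(kernel_block(a, cluster(5.0)))   # gap of 4.0
print("adjacent clusters:", rank_near)
print("well-separated clusters:", rank_far)
```

Under these assumptions the well-separated block needs noticeably fewer rank-1 terms than the adjacent one, which is the effect the abstract attributes to strong admissibility in 3-D.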
