Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and Group Convolution

Paper title

Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and Group Convolution

Authors

Adrien Poulenard, Maks Ovsjanikov, Leonidas J. Guibas

Abstract

A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input. Most approaches for equivariance under the Euclidean group $\mathrm{SE}(3)$ of rotations and translations fall into one of two major categories. The first category consists of methods that use $\mathrm{SE}(3)$-convolution, which generalizes classical $\mathbb{R}^3$-convolution to signals over $\mathrm{SE}(3)$. Alternatively, it is possible to use steerable convolution, which achieves $\mathrm{SE}(3)$-equivariance by imposing constraints on $\mathbb{R}^3$-convolution of tensor fields. It is known by specialists in the field that the two approaches are equivalent, with steerable convolution being the Fourier transform of $\mathrm{SE}(3)$ convolution. Unfortunately, these results are not widely known; moreover, the exact relations between deep learning architectures built upon these two approaches have not been precisely described in the literature on equivariant deep learning. In this work we provide an in-depth analysis of both methods and their equivalence, and relate the two constructions to multiview convolutional networks. Furthermore, we provide theoretical justifications of the separability of $\mathrm{SE}(3)$ group convolution, which explain the applicability and success of some recent approaches. Finally, we express different methods using a single coherent formalism and provide explicit formulas that relate the kernels learned by different methods. In this way, our work helps to unify different previously proposed techniques for achieving roto-translational equivariance, and helps to shed light on both the utility of and the precise differences between the various alternatives. We also derive new non-linearities for Tensor Field Networks (TFN) from our equivalence principle and test them on practical benchmark datasets.
