Paper Title
Neural Contourlet Network for Monocular 360 Depth Estimation
Authors
Abstract
Depth estimation from a monocular 360 image is challenging because the distortion increases with latitude. To perceive this distortion, existing methods focus on designing deep and complex network architectures. In this paper, we provide a new perspective that constructs an interpretable and sparse representation for a 360 image. Considering the importance of geometric structure in depth estimation, we utilize the contourlet transform to capture an explicit geometric cue in the spectral domain and integrate it with an implicit cue in the spatial domain. Specifically, we propose a neural contourlet network consisting of a convolutional neural network and a contourlet transform branch. In the encoder stage, we design a spatial-spectral fusion module to effectively fuse the two types of cues. In the decoder, mirroring the encoder, we employ the inverse contourlet transform with learned low-pass subbands and band-pass directional subbands to compose the depth. Experiments on three popular panoramic image datasets demonstrate that the proposed approach outperforms state-of-the-art schemes and converges faster. Code is available at https://github.com/zhijieshen-bjtu/Neural-Contourlet-Network-for-MODE.
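The claim that distortion increases with latitude can be illustrated with a small sketch. It assumes the 360 image is stored in equirectangular projection, which the abstract does not state explicitly, and the function name `horizontal_stretch` is hypothetical and not taken from the released code. Under that assumption, every pixel row covers the full 360 degrees of longitude, so a row at latitude φ is stretched horizontally by roughly 1/cos(φ) relative to the equator.

```python
import math


def horizontal_stretch(row, height):
    """Approximate horizontal stretch of one pixel row in an
    equirectangular image, relative to the equator (assumed projection)."""
    # Latitude of the row centre: +pi/2 at the top edge, -pi/2 at the bottom.
    lat = (0.5 - (row + 0.5) / height) * math.pi
    # Each row spans 360 degrees of longitude, but the circle of latitude
    # it represents shrinks by cos(lat), so content is stretched by 1/cos(lat).
    return 1.0 / math.cos(lat)


if __name__ == "__main__":
    height = 512  # illustrative image height, not a value from the paper
    for row in (255, 128, 32, 0):
        print(f"row {row:3d}: stretch ~ {horizontal_stretch(row, height):.2f}")
```

Rows near the equator have a factor close to 1, while rows near the poles are stretched by a large factor. This is the kind of latitude-dependent distortion a 360 depth estimator has to cope with.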