Paper title
Self-supervised Depth Estimation to Regularise Semantic Segmentation in Knee Arthroscopy
Paper authors
Paper abstract
Intra-operative automatic semantic segmentation of knee joint structures can improve surgeons' situational awareness during knee arthroscopy. However, poor imaging conditions (e.g., low texture, overexposure) make automatic semantic segmentation challenging in this setting, which explains the scarcity of literature on the topic. In this paper, we propose a novel self-supervised monocular depth estimation method to regularise the training of semantic segmentation in knee arthroscopy. To further regularise the depth estimation, we propose pre-training the model on clean images of routine objects captured with the stereo arthroscope; these images have rich texture information and none of the poor imaging conditions. We then fine-tune this model on stereo arthroscopic images taken from inside the knee to produce both the semantic segmentation and the self-supervised monocular depth. We use a data set containing 3868 arthroscopic images with semantic segmentation annotations captured during cadaveric knee arthroscopy, 2000 stereo image pairs from cadaveric knee arthroscopy, and 2150 stereo image pairs of routine objects. On this data, we show that semantic segmentation regularised by self-supervised depth estimation is more accurate than a state-of-the-art semantic segmentation approach trained exclusively with semantic segmentation annotations.
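The abstract does not give the training objective, so the following is only a minimal sketch of one common way to combine the two tasks, not the authors' formulation. It assumes rectified stereo pairs, a predicted left-view disparity map, an L1 photometric reconstruction loss as the self-supervised depth signal, and a pixel-wise cross-entropy loss for segmentation. All function names and the `depth_weight` parameter are hypothetical.

```python
import math


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d.

    Assumes rectified stereo, so matching pixels lie on the same row.
    Uses linear interpolation and clamps samples to the image border.
    """
    h, w = len(right), len(right[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xs = min(max(x - disparity[y][x], 0.0), w - 1.0)
            x0 = int(math.floor(xs))
            x1 = min(x0 + 1, w - 1)
            a = xs - x0
            out[y][x] = (1.0 - a) * right[y][x0] + a * right[y][x1]
    return out


def photometric_l1(left, left_hat):
    """Mean absolute difference between the left image and its reconstruction."""
    h, w = len(left), len(left[0])
    total = sum(abs(left[y][x] - left_hat[y][x])
                for y in range(h) for x in range(w))
    return total / (h * w)


def cross_entropy(probs, labels):
    """Mean pixel-wise cross-entropy; probs[y][x] is a list of class probabilities."""
    h, w = len(labels), len(labels[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total -= math.log(max(probs[y][x][labels[y][x]], 1e-12))
    return total / (h * w)


def joint_loss(probs, labels, left, right, disparity, depth_weight=1.0):
    """Segmentation loss plus a weighted self-supervised depth term (assumed form)."""
    seg = cross_entropy(probs, labels)
    photo = photometric_l1(left, warp_right_to_left(right, disparity))
    return seg + depth_weight * photo
```

In this sketch the depth term needs no depth labels: a good disparity map is one that lets the right image reproduce the left. Since the abstract describes the annotated images and the stereo pairs as separate sets, a real training loop would likely compute each term on whichever batch carries that supervision.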