Paper Title

Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation

Authors

Hao Tang, Xingwei Liu, Kun Han, Shanlin Sun, Narisu Bai, Xuming Chen, Huang Qian, Yong Liu, Xiaohui Xie

Abstract

Multi-organ segmentation is one of the most successful applications of deep learning in medical image analysis. Deep convolutional neural nets (CNNs) have shown great promise in achieving clinically applicable image segmentation performance on CT or MRI images. State-of-the-art CNN segmentation models apply either 2D or 3D convolutions on input images, with pros and cons associated with each method: 2D convolution is fast and less memory-intensive but inadequate for extracting 3D contextual information from volumetric images, while the opposite is true for 3D convolution. To fit a 3D CNN model on CT or MRI images on commodity GPUs, one usually has to either downsample input images or use cropped local regions as inputs, which limits the utility of 3D models for multi-organ segmentation. In this work, we propose a new framework for combining 3D and 2D models, in which the segmentation is realized through high-resolution 2D convolutions, but guided by spatial contextual information extracted from a low-resolution 3D model. We implement a self-attention mechanism to control which 3D features should be used to guide 2D segmentation. Our model is light on memory usage but fully equipped to take 3D contextual information into account. Experiments on multiple organ segmentation datasets demonstrate that, by taking advantage of both 2D and 3D models, our method consistently outperforms existing 2D and 3D models in organ segmentation accuracy, while being able to directly take raw whole-volume image data as inputs.
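
To make the framework concrete, below is a minimal sketch in PyTorch of the core idea described in the abstract: an attention gate that decides, per pixel, how much of the upsampled low-resolution 3D context should flow into the high-resolution 2D slice features. The module name ContextGate2D, the channel sizes, and the sigmoid-gated fusion are illustrative assumptions, not the paper's published architecture.

# Sketch only: the 3D branch is assumed to produce per-slice context features
# at reduced resolution; this gate fuses them into the full-resolution 2D branch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextGate2D(nn.Module):
    """Fuse upsampled 3D context features into 2D slice features via an attention gate."""
    def __init__(self, ch2d: int, ch3d: int):
        super().__init__()
        self.proj = nn.Conv2d(ch3d, ch2d, kernel_size=1)      # align channel counts
        self.attn = nn.Conv2d(ch2d * 2, ch2d, kernel_size=1)  # per-pixel gate logits

    def forward(self, feat2d: torch.Tensor, feat3d_slice: torch.Tensor) -> torch.Tensor:
        # feat2d:       (B, ch2d, H, W)  high-resolution 2D slice features
        # feat3d_slice: (B, ch3d, h, w)  matching slice from the low-res 3D volume
        ctx = self.proj(feat3d_slice)
        ctx = F.interpolate(ctx, size=feat2d.shape[-2:],
                            mode="bilinear", align_corners=False)
        # attention decides which 3D context features guide the 2D segmentation
        gate = torch.sigmoid(self.attn(torch.cat([feat2d, ctx], dim=1)))
        return feat2d + gate * ctx

if __name__ == "__main__":
    gate = ContextGate2D(ch2d=64, ch3d=32)
    feat2d = torch.randn(1, 64, 256, 256)      # full-resolution 2D features
    feat3d_slice = torch.randn(1, 32, 64, 64)  # one slice of downsampled 3D features
    print(gate(feat2d, feat3d_slice).shape)    # torch.Size([1, 64, 256, 256])

Because the gate is computed per pixel, memory usage stays close to that of a plain 2D model: the expensive 3D convolutions run only once over the downsampled volume, which is consistent with the abstract's claim of being light on memory while still using 3D context.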
