RadLex Normalization in Radiology Reports

Paper Title

RadLex Normalization in Radiology Reports

Authors

Surabhi Datta, Jordan Godfrey-Stovall, Kirk Roberts

Abstract

Radiology reports have been widely used to extract clinically significant information about patients' imaging studies. However, little research has focused on standardizing the extracted entities to a common radiology-specific vocabulary, and no study to date has attempted to leverage RadLex for this standardization. In this paper, we aim to normalize a diverse set of radiological entities to RadLex terms. We manually construct a normalization corpus by annotating entities from three types of reports; the corpus contains 1706 entity mentions. We propose two deep learning-based NLP methods, both built on a pre-trained language model (BERT), for automatic normalization. In both, we first employ BM25 to retrieve candidate concepts, and a BERT-based model (a re-ranker or a span detector) then predicts the normalized concept. The results are promising, with the best accuracy (78.44%) obtained by the span detector. Additionally, we discuss the challenges involved in corpus construction and propose new RadLex terms.
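
The candidate-retrieval step mentioned in the abstract can be illustrated with a minimal BM25 sketch. This is not the authors' implementation: the abstract does not state tokenization or BM25 parameters, so the whitespace tokenizer, the values k1=1.5 and b=0.75, the class and function names, and the toy term list (plain strings, not real RadLex entries or identifiers) are all assumptions made for illustration. The BERT-based re-ranker and span detector that would consume these candidates are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper may differ.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer over a small list of term strings."""

    def __init__(self, docs, k1=1.5, b=0.75):
        # k1 and b are commonly used defaults, not values from the paper.
        self.raw = list(docs)
        self.docs = [tokenize(d) for d in self.raw]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of terms containing each token.
        df = Counter(t for d in self.docs for t in set(d))
        n = len(self.docs)
        # The "+1" inside the log keeps every IDF value positive.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, i):
        dl = len(self.docs[i])
        total = 0.0
        for tok in tokenize(query):
            f = self.tf[i].get(tok, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[tok] * f * (self.k1 + 1) / norm
        return total

    def retrieve(self, query, k=3):
        # Keep only terms that share at least one token with the mention.
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))
        return [self.raw[i] for _, i in scored[:k]]


# Hypothetical vocabulary standing in for RadLex preferred names.
toy_terms = [
    "pleural effusion",
    "pneumothorax",
    "lung nodule",
    "pulmonary edema",
    "pericardial effusion",
]

index = BM25(toy_terms)
candidates = index.retrieve("small pleural effusion", k=3)
print(candidates)
```

In a full pipeline of the kind the abstract describes, the returned candidate list would be passed together with the entity mention to a BERT-based model that selects the final normalized concept.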
