Paper Title

Radiology Text Analysis System (RadText): Architecture and Evaluation

Authors

Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan Peng

Abstract

Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workload of radiologists and encourage precise diagnosis. In this work, we present RadText, an open-source radiology text analysis system developed in Python. RadText offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence splitting and word tokenization, named entity recognition, parsing, and negation detection. RadText features a flexible modular design, provides a hybrid text processing schema, and supports raw text processing and local processing, which enables better usability and improved data privacy. RadText adopts BioC as the unified interface, and also standardizes the input/output into a structured representation compatible with the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). This allows for a more systematic approach to observational research across multiple, disparate data sources. We evaluated RadText on the MIMIC-CXR dataset, with five new disease labels we annotated for this work. RadText demonstrates highly accurate classification performance, with an average precision of, a recall of 0.94, and an F-1 score of 0.92. We have made our code, documentation, examples, and the test set available at https://github.com/bionlplab/radtext.
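
To make the idea of a modular pipeline concrete, the following is a minimal sketch in plain Python. It is not RadText's API: every name here (`Document`, `Sentence`, `split_sentences`, `tokenize`, `detect_negation`, `run_pipeline`, `NEGATION_CUES`) is hypothetical, and the regex splitter and keyword-based negation check are simplistic stand-ins for the components the abstract lists. It only illustrates how independent stages can share one document object and be chained or swapped.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Sentence:
    text: str
    tokens: List[str] = field(default_factory=list)
    negated: bool = False


@dataclass
class Document:
    text: str
    sentences: List[Sentence] = field(default_factory=list)


def split_sentences(doc: Document) -> Document:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", doc.text.strip())
    doc.sentences = [Sentence(p) for p in parts if p]
    return doc


def tokenize(doc: Document) -> Document:
    # Lowercased word tokens; punctuation is dropped.
    for sent in doc.sentences:
        sent.tokens = re.findall(r"\w+", sent.text.lower())
    return doc


# Illustrative cue list only; real negation detection is rule- or parse-based.
NEGATION_CUES = {"no", "without", "negative"}


def detect_negation(doc: Document) -> Document:
    # Flag a sentence as negated if it contains any cue token.
    for sent in doc.sentences:
        sent.negated = any(tok in NEGATION_CUES for tok in sent.tokens)
    return doc


Stage = Callable[[Document], Document]
DEFAULT_STAGES: List[Stage] = [split_sentences, tokenize, detect_negation]


def run_pipeline(text: str, stages: List[Stage]) -> Document:
    # Each stage receives and returns the same document object,
    # so stages can be reordered, removed, or replaced independently.
    doc = Document(text)
    for stage in stages:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    report = "No pleural effusion. Mild cardiomegaly is present."
    for sent in run_pipeline(report, DEFAULT_STAGES).sentences:
        print(sent.negated, sent.tokens)
```

Because each stage has the same signature, a different splitter or negation detector can be substituted by editing the stage list, which is the kind of flexibility a modular design is meant to provide.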
