Paper Title
Anonymization of Whole Slide Images in Histopathology for Research and Education
Authors
Abstract
Objective: The exchange of health-related data is subject to regional laws and regulations, such as the General Data Protection Regulation (GDPR) in the EU or the Health Insurance Portability and Accountability Act (HIPAA) in the United States, which creates non-trivial challenges for researchers and educators working with these data. In pathology, the digitization of diagnostic tissue samples inevitably generates identifying data, comprising both sensitive and acquisition-related information, that is stored in vendor-specific file formats. These Whole Slide Images (WSIs) are usually distributed and used outside the clinic in these formats, because industry-wide standards such as DICOM have so far been adopted only tentatively and slide scanner vendors currently do not provide anonymization functionality. Methods: We developed a guideline for the proper handling of histopathological image data, particularly for research and education, with regard to the GDPR. In this context, we evaluated existing anonymization methods and examined proprietary format specifications to identify all sensitive information in the most common WSI formats. This work resulted in a software library that enables GDPR-compliant anonymization of WSIs while preserving the native formats. Results: Based on the analysis of proprietary formats, all occurrences of sensitive information were identified for file formats frequently used in clinical routine. Finally, an open-source programming library with an executable CLI tool and wrappers for different programming languages was developed. Conclusions: Our analysis showed that no straightforward software solution existed for anonymizing WSIs in a GDPR-compliant way while maintaining the data format. We closed this gap with our extensible open-source library, which works instantaneously and offline.