Paper title
An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases
Authors
Abstract
Dermatological diseases are among the most common disorders worldwide. This paper presents the first study of interpretability and imbalanced semi-supervised learning in a multiclass intelligent skin diagnosis framework (ISDL), using 58,457 skin images with 10,857 unlabelled samples. At each iteration of class-rebalancing self-training, pseudo-labelled samples from minority classes have a higher probability of being selected, which promotes the use of unlabelled samples to address the class imbalance problem. ISDL achieved an accuracy of 0.979, sensitivity of 0.975, specificity of 0.973, macro-F1 score of 0.974 and area under the receiver operating characteristic curve (AUC) of 0.999 for multi-label skin disease classification. The SHapley Additive exPlanations (SHAP) method is combined with ISDL to explain how the deep learning model makes its predictions, and the resulting explanations are consistent with clinical diagnosis. We also propose a sampling distribution optimisation strategy that selects pseudo-labelled samples more effectively using ISDLplus. Furthermore, the framework has the potential to relieve the pressure placed on professional doctors and to help with practical issues associated with the shortage of such doctors in rural areas.
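
The class-rebalancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact sampling rule, so the code below assumes a common formulation from the class-rebalancing self-training literature: classes are ranked by labelled-set frequency, and the keep-probability of a class's pseudo-labelled samples mirrors that ranking, so the rarest class is always kept. The function names, the `alpha` exponent and the toy class counts are hypothetical and are not taken from the paper.

```python
import random


def class_sampling_rates(labelled_counts, alpha=1.0):
    """Map each class to a keep-probability for its pseudo-labelled samples.

    Assumed rule (not confirmed by the abstract): rank classes from most to
    least frequent; the class at rank r gets the rate
    (count of the class at the mirrored rank / largest count) ** alpha.
    The rarest class therefore gets 1.0 and the most frequent class gets
    (smallest count / largest count) ** alpha.
    """
    ranked = sorted(labelled_counts, key=labelled_counts.get, reverse=True)
    n_max = labelled_counts[ranked[0]]
    rates = {}
    for rank, cls in enumerate(ranked):
        mirrored_cls = ranked[len(ranked) - 1 - rank]
        rates[cls] = (labelled_counts[mirrored_cls] / n_max) ** alpha
    return rates


def select_pseudo_labels(pseudo_labels, rates, seed=0):
    """Return indices of pseudo-labelled samples kept for the next iteration.

    Each sample is kept independently with the rate of its predicted class.
    """
    rng = random.Random(seed)
    return [i for i, cls in enumerate(pseudo_labels) if rng.random() < rates[cls]]


if __name__ == "__main__":
    # Hypothetical labelled-set class counts, for illustration only.
    counts = {"common": 1000, "medium": 100, "rare": 10}
    rates = class_sampling_rates(counts)
    predictions = ["common"] * 50 + ["medium"] * 20 + ["rare"] * 5
    kept = select_pseudo_labels(predictions, rates)
    print(rates)
    print(len(kept), "of", len(predictions), "pseudo-labelled samples kept")
```

Under this assumed rule, a larger `alpha` suppresses majority-class pseudo-labels more strongly, while `alpha = 0` keeps every pseudo-labelled sample regardless of class.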
